nohup: ignoring input
Global seed set to 231
/usr/local/lib/python3.9/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /data/jenkins_workspace/workspace/pytorch_23.04_abi@4/aten/src/ATen/native/TensorShape.cpp:3191.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
/usr/local/lib/python3.9/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
  warnings.warn(
/usr/local/lib/python3.9/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=AlexNet_Weights.IMAGENET1K_V1`. You can also use `weights=AlexNet_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Global seed set to 231
initializing ddp: GLOBAL_RANK: 0, MEMBER: 1/3
Global seed set to 231
Global seed set to 231
Global seed set to 231
initializing ddp: GLOBAL_RANK: 1, MEMBER: 2/3
WARNING: Logging before InitGoogleLogging() is written to STDERR
I1103 04:22:32.982725 4113395 ProcessGroupNCCL.cpp:669] [Rank 1] ProcessGroupNCCL initialized with following options:
NCCL_ASYNC_ERROR_HANDLING: 0
NCCL_DESYNC_DEBUG: 0
NCCL_BLOCKING_WAIT: 0
TIMEOUT(ms): 1800000
USE_HIGH_PRIORITY_STREAM: 0
I1103 04:22:32.982746 4119599 ProcessGroupNCCL.cpp:835] [Rank 1] NCCL watchdog thread started!
Global seed set to 231
initializing ddp: GLOBAL_RANK: 2, MEMBER: 3/3
WARNING: Logging before InitGoogleLogging() is written to STDERR
I1103 04:22:42.253474 4113753 ProcessGroupNCCL.cpp:669] [Rank 2] ProcessGroupNCCL initialized with following options:
NCCL_ASYNC_ERROR_HANDLING: 0
NCCL_DESYNC_DEBUG: 0
NCCL_BLOCKING_WAIT: 0
TIMEOUT(ms): 1800000
USE_HIGH_PRIORITY_STREAM: 0
I1103 04:22:42.253731 4120039 ProcessGroupNCCL.cpp:835] [Rank 2] NCCL watchdog thread started!
WARNING: Logging before InitGoogleLogging() is written to STDERR
I1103 04:22:42.256446 4120041 ProcessGroupNCCL.cpp:835] [Rank 0] NCCL watchdog thread started!
I1103 04:22:42.256433 4109126 ProcessGroupNCCL.cpp:669] [Rank 0] ProcessGroupNCCL initialized with following options:
NCCL_ASYNC_ERROR_HANDLING: 0
NCCL_DESYNC_DEBUG: 0
NCCL_BLOCKING_WAIT: 0
TIMEOUT(ms): 1800000
USE_HIGH_PRIORITY_STREAM: 0
----------------------------------------------------------------------------------------------------
distributed_backend=nccl
All DDP processes registered. Starting ddp with 3 processes
----------------------------------------------------------------------------------------------------

I1103 04:22:43.273196 4109126 ProcessGroupNCCL.cpp:1274] NCCL_DEBUG: N/A
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2,3,4,5,6,7]
LOCAL_RANK: 2 - CUDA_VISIBLE_DEVICES: [0,1,2,3,4,5,6,7]
LOCAL_RANK: 1 - CUDA_VISIBLE_DEVICES: [0,1,2,3,4,5,6,7]

  | Name              | Type                   | Params
-------------------------------------------------------------
0 | model             | DiffusionWrapper       | 865 M 
1 | first_stage_model | AutoencoderKL          | 83.7 M
2 | cond_stage_model  | FrozenOpenCLIPEmbedder | 354 M 
3 | control_model     | ControlNet             | 363 M 
4 | preprocess_model  | SwinIR                 | 15.8 M
5 | cond_encoder      | Sequential             | 34.2 M
-------------------------------------------------------------
1.2 B     Trainable params
487 M     Non-trainable params
1.7 B     Total params
6,866.827 Total estimated model params size (MB)
No module 'xformers'. Proceeding without it.
ControlLDM: Running in eps-prediction mode
DiffusionWrapper has 865.91 M params.
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla' with 512 in_channels
/home/modelzoo/DiffBIR/weights/open_clip_pytorch_model.bin
Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off]
Loading model from: /usr/local/lib/python3.9/site-packages/lpips/weights/v0.1/alex.pth

Validation sanity check: 0it [00:00, ?it/s]
/usr/local/lib/python3.9/site-packages/pytorch_lightning/trainer/data_loading.py:105: UserWarning: The dataloader, val dataloader 0, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 128 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
  rank_zero_warn(

Validation sanity check:   0%|          | 0/2 [00:00<?, ?it/s]

Validation sanity check:  50%|█████     | 1/2 [00:11<00:11, 11.00s/it]
Validation sanity check: 100%|██████████| 2/2 [00:14<00:00,  6.61s/it]
Global seed set to 231
Global seed set to 231
Global seed set to 231
/usr/local/lib/python3.9/site-packages/pytorch_lightning/trainer/data_loading.py:105: UserWarning: The dataloader, train dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 128 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
  rank_zero_warn(


Training: -1it [00:00, ?it/s]
Training:   0%|          | 0/5971 [00:00<00:00, 18808.54it/s]
Epoch 0:   0%|          | 0/5971 [00:00<00:01, 3472.11it/s]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.33it/s]

Epoch 0:   0%|          | 1/5971 [00:13<11:22:18,  6.86s/it]
Epoch 0:   0%|          | 1/5971 [00:13<11:22:23,  6.86s/it, loss=0.00302, v_num=0, train/loss_simple_step=0.00302, train/loss_vlb_step=1.76e-5, train/loss_step=0.00302, global_step=0.000]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.06it/s]

Epoch 0:   0%|          | 2/5971 [00:26<14:24:49,  8.69s/it, loss=0.00302, v_num=0, train/loss_simple_step=0.00302, train/loss_vlb_step=1.76e-5, train/loss_step=0.00302, global_step=0.000]
Epoch 0:   0%|          | 2/5971 [00:26<14:24:52,  8.69s/it, loss=0.0365, v_num=0, train/loss_simple_step=0.070, train/loss_vlb_step=0.00024, train/loss_step=0.070, global_step=0.000]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.20it/s]

Epoch 0:   0%|          | 3/5971 [00:38<15:47:10,  9.52s/it, loss=0.0365, v_num=0, train/loss_simple_step=0.070, train/loss_vlb_step=0.00024, train/loss_step=0.070, global_step=0.000]
Epoch 0:   0%|          | 3/5971 [00:38<15:47:11,  9.52s/it, loss=0.0827, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000617, train/loss_step=0.175, global_step=0.000]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.37it/s]

Epoch 0:   0%|          | 4/5971 [00:51<17:01:58, 10.28s/it, loss=0.0827, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000617, train/loss_step=0.175, global_step=0.000]
Epoch 0:   0%|          | 4/5971 [00:51<17:01:59, 10.28s/it, loss=0.252, v_num=0, train/loss_simple_step=0.761, train/loss_vlb_step=0.0306, train/loss_step=0.761, global_step=0.000]   
Epoch 0:   0%|          | 5/5971 [00:52<14:26:39,  8.72s/it, loss=0.252, v_num=0, train/loss_simple_step=0.761, train/loss_vlb_step=0.0306, train/loss_step=0.761, global_step=0.000]
Epoch 0:   0%|          | 5/5971 [00:52<14:26:40,  8.72s/it, loss=0.203, v_num=0, train/loss_simple_step=0.00467, train/loss_vlb_step=2.67e-5, train/loss_step=0.00467, global_step=1.000]
Epoch 0:   0%|          | 6/5971 [00:53<12:35:41,  7.60s/it, loss=0.203, v_num=0, train/loss_simple_step=0.00467, train/loss_vlb_step=2.67e-5, train/loss_step=0.00467, global_step=1.000]
Epoch 0:   0%|          | 6/5971 [00:53<12:35:42,  7.60s/it, loss=0.17, v_num=0, train/loss_simple_step=0.00419, train/loss_vlb_step=2.4e-5, train/loss_step=0.00419, global_step=1.000]  
Epoch 0:   0%|          | 7/5971 [00:54<11:12:07,  6.76s/it, loss=0.17, v_num=0, train/loss_simple_step=0.00419, train/loss_vlb_step=2.4e-5, train/loss_step=0.00419, global_step=1.000]
Epoch 0:   0%|          | 7/5971 [00:54<11:12:08,  6.76s/it, loss=0.203, v_num=0, train/loss_simple_step=0.404, train/loss_vlb_step=0.00276, train/loss_step=0.404, global_step=1.000]  
Epoch 0:   0%|          | 8/5971 [00:56<10:23:11,  6.27s/it, loss=0.203, v_num=0, train/loss_simple_step=0.404, train/loss_vlb_step=0.00276, train/loss_step=0.404, global_step=1.000]
Epoch 0:   0%|          | 8/5971 [00:56<10:23:11,  6.27s/it, loss=0.19, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000353, train/loss_step=0.102, global_step=1.000]
Epoch 0:   0%|          | 9/5971 [00:57<9:31:54,  5.76s/it, loss=0.19, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000353, train/loss_step=0.102, global_step=1.000] 
Epoch 0:   0%|          | 9/5971 [00:57<9:31:54,  5.76s/it, loss=0.185, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000486, train/loss_step=0.143, global_step=2.000]
Epoch 0:   0%|          | 10/5971 [00:58<8:47:47,  5.31s/it, loss=0.185, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000486, train/loss_step=0.143, global_step=2.000]
Epoch 0:   0%|          | 10/5971 [00:58<8:47:48,  5.31s/it, loss=0.174, v_num=0, train/loss_simple_step=0.0753, train/loss_vlb_step=0.000265, train/loss_step=0.0753, global_step=2.000]
Epoch 0:   0%|          | 11/5971 [00:59<8:11:07,  4.94s/it, loss=0.174, v_num=0, train/loss_simple_step=0.0753, train/loss_vlb_step=0.000265, train/loss_step=0.0753, global_step=2.000]
Epoch 0:   0%|          | 11/5971 [00:59<8:11:08,  4.94s/it, loss=0.168, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000371, train/loss_step=0.112, global_step=2.000]  
Epoch 0:   0%|          | 12/5971 [01:01<7:50:59,  4.74s/it, loss=0.168, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000371, train/loss_step=0.112, global_step=2.000]
Epoch 0:   0%|          | 12/5971 [01:01<7:50:59,  4.74s/it, loss=0.181, v_num=0, train/loss_simple_step=0.314, train/loss_vlb_step=0.00126, train/loss_step=0.314, global_step=2.000] 
Epoch 0:   0%|          | 13/5971 [01:02<7:23:37,  4.47s/it, loss=0.181, v_num=0, train/loss_simple_step=0.314, train/loss_vlb_step=0.00126, train/loss_step=0.314, global_step=2.000]
Epoch 0:   0%|          | 13/5971 [01:02<7:23:37,  4.47s/it, loss=0.168, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.92e-5, train/loss_step=0.0157, global_step=3.000]
Epoch 0:   0%|          | 14/5971 [01:03<6:59:42,  4.23s/it, loss=0.168, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.92e-5, train/loss_step=0.0157, global_step=3.000]
Epoch 0:   0%|          | 14/5971 [01:03<6:59:43,  4.23s/it, loss=0.197, v_num=0, train/loss_simple_step=0.575, train/loss_vlb_step=0.00706, train/loss_step=0.575, global_step=3.000]  
Epoch 0:   0%|          | 15/5971 [01:04<6:38:53,  4.02s/it, loss=0.197, v_num=0, train/loss_simple_step=0.575, train/loss_vlb_step=0.00706, train/loss_step=0.575, global_step=3.000]
Epoch 0:   0%|          | 15/5971 [01:04<6:38:53,  4.02s/it, loss=0.198, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.00075, train/loss_step=0.211, global_step=3.000]
Epoch 0:   0%|          | 16/5971 [01:06<6:30:09,  3.93s/it, loss=0.198, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.00075, train/loss_step=0.211, global_step=3.000]
Epoch 0:   0%|          | 16/5971 [01:06<6:30:09,  3.93s/it, loss=0.189, v_num=0, train/loss_simple_step=0.0579, train/loss_vlb_step=0.000215, train/loss_step=0.0579, global_step=3.000]
Epoch 0:   0%|          | 17/5971 [01:07<6:13:21,  3.76s/it, loss=0.189, v_num=0, train/loss_simple_step=0.0579, train/loss_vlb_step=0.000215, train/loss_step=0.0579, global_step=3.000]
Epoch 0:   0%|          | 17/5971 [01:07<6:13:21,  3.76s/it, loss=0.212, v_num=0, train/loss_simple_step=0.572, train/loss_vlb_step=0.00508, train/loss_step=0.572, global_step=4.000]   
Epoch 0:   0%|          | 18/5971 [01:08<5:58:12,  3.61s/it, loss=0.212, v_num=0, train/loss_simple_step=0.572, train/loss_vlb_step=0.00508, train/loss_step=0.572, global_step=4.000]
Epoch 0:   0%|          | 18/5971 [01:08<5:58:12,  3.61s/it, loss=0.211, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000637, train/loss_step=0.192, global_step=4.000]
Epoch 0:   0%|          | 19/5971 [01:09<5:44:35,  3.47s/it, loss=0.211, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000637, train/loss_step=0.192, global_step=4.000]
Epoch 0:   0%|          | 19/5971 [01:09<5:44:35,  3.47s/it, loss=0.203, v_num=0, train/loss_simple_step=0.0661, train/loss_vlb_step=0.000233, train/loss_step=0.0661, global_step=4.000]
Epoch 0:   0%|          | 20/5971 [01:11<5:38:39,  3.41s/it, loss=0.203, v_num=0, train/loss_simple_step=0.0661, train/loss_vlb_step=0.000233, train/loss_step=0.0661, global_step=4.000]
Epoch 0:   0%|          | 20/5971 [01:11<5:38:39,  3.41s/it, loss=0.206, v_num=0, train/loss_simple_step=0.254, train/loss_vlb_step=0.000994, train/loss_step=0.254, global_step=4.000]  
Epoch 0:   0%|          | 21/5971 [01:12<5:27:18,  3.30s/it, loss=0.206, v_num=0, train/loss_simple_step=0.254, train/loss_vlb_step=0.000994, train/loss_step=0.254, global_step=4.000]
Epoch 0:   0%|          | 21/5971 [01:12<5:27:18,  3.30s/it, loss=0.239, v_num=0, train/loss_simple_step=0.672, train/loss_vlb_step=0.0209, train/loss_step=0.672, global_step=5.000]  
Epoch 0:   0%|          | 22/5971 [01:13<5:16:44,  3.19s/it, loss=0.239, v_num=0, train/loss_simple_step=0.672, train/loss_vlb_step=0.0209, train/loss_step=0.672, global_step=5.000]
Epoch 0:   0%|          | 22/5971 [01:13<5:16:44,  3.19s/it, loss=0.236, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=5.38e-5, train/loss_step=0.0113, global_step=5.000]
Epoch 0:   0%|          | 23/5971 [01:14<5:07:28,  3.10s/it, loss=0.236, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=5.38e-5, train/loss_step=0.0113, global_step=5.000]
Epoch 0:   0%|          | 23/5971 [01:14<5:07:28,  3.10s/it, loss=0.228, v_num=0, train/loss_simple_step=0.0212, train/loss_vlb_step=9.21e-5, train/loss_step=0.0212, global_step=5.000]
Epoch 0:   0%|          | 24/5971 [01:16<5:03:55,  3.07s/it, loss=0.228, v_num=0, train/loss_simple_step=0.0212, train/loss_vlb_step=9.21e-5, train/loss_step=0.0212, global_step=5.000]
Epoch 0:   0%|          | 24/5971 [01:16<5:03:55,  3.07s/it, loss=0.191, v_num=0, train/loss_simple_step=0.0145, train/loss_vlb_step=7.03e-5, train/loss_step=0.0145, global_step=5.000]
Epoch 0:   0%|          | 25/5971 [01:17<4:55:43,  2.98s/it, loss=0.191, v_num=0, train/loss_simple_step=0.0145, train/loss_vlb_step=7.03e-5, train/loss_step=0.0145, global_step=5.000]
Epoch 0:   0%|          | 25/5971 [01:17<4:55:43,  2.98s/it, loss=0.207, v_num=0, train/loss_simple_step=0.317, train/loss_vlb_step=0.0011, train/loss_step=0.317, global_step=6.000]   
Epoch 0:   0%|          | 26/5971 [01:18<4:47:56,  2.91s/it, loss=0.207, v_num=0, train/loss_simple_step=0.317, train/loss_vlb_step=0.0011, train/loss_step=0.317, global_step=6.000]
Epoch 0:   0%|          | 26/5971 [01:18<4:47:56,  2.91s/it, loss=0.207, v_num=0, train/loss_simple_step=0.00793, train/loss_vlb_step=4.02e-5, train/loss_step=0.00793, global_step=6.000]
Epoch 0:   0%|          | 27/5971 [01:19<4:40:43,  2.83s/it, loss=0.207, v_num=0, train/loss_simple_step=0.00793, train/loss_vlb_step=4.02e-5, train/loss_step=0.00793, global_step=6.000]
Epoch 0:   0%|          | 27/5971 [01:19<4:40:43,  2.83s/it, loss=0.187, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=6.05e-5, train/loss_step=0.0125, global_step=6.000]  
Epoch 0:   0%|          | 28/5971 [01:21<4:38:27,  2.81s/it, loss=0.187, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=6.05e-5, train/loss_step=0.0125, global_step=6.000]
Epoch 0:   0%|          | 28/5971 [01:21<4:38:27,  2.81s/it, loss=0.203, v_num=0, train/loss_simple_step=0.417, train/loss_vlb_step=0.00302, train/loss_step=0.417, global_step=6.000]  
Epoch 0:   0%|          | 29/5971 [01:22<4:32:11,  2.75s/it, loss=0.203, v_num=0, train/loss_simple_step=0.417, train/loss_vlb_step=0.00302, train/loss_step=0.417, global_step=6.000]
Epoch 0:   0%|          | 29/5971 [01:22<4:32:11,  2.75s/it, loss=0.206, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000655, train/loss_step=0.194, global_step=7.000]
Epoch 0:   1%|          | 30/5971 [01:23<4:26:13,  2.69s/it, loss=0.206, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000655, train/loss_step=0.194, global_step=7.000]
Epoch 0:   1%|          | 30/5971 [01:23<4:26:13,  2.69s/it, loss=0.204, v_num=0, train/loss_simple_step=0.031, train/loss_vlb_step=0.00012, train/loss_step=0.031, global_step=7.000] 
Epoch 0:   1%|          | 31/5971 [01:24<4:20:45,  2.63s/it, loss=0.204, v_num=0, train/loss_simple_step=0.031, train/loss_vlb_step=0.00012, train/loss_step=0.031, global_step=7.000]
Epoch 0:   1%|          | 31/5971 [01:24<4:20:45,  2.63s/it, loss=0.2, v_num=0, train/loss_simple_step=0.0376, train/loss_vlb_step=0.000138, train/loss_step=0.0376, global_step=7.000]
Epoch 0:   1%|          | 32/5971 [01:26<4:19:05,  2.62s/it, loss=0.2, v_num=0, train/loss_simple_step=0.0376, train/loss_vlb_step=0.000138, train/loss_step=0.0376, global_step=7.000]
Epoch 0:   1%|          | 32/5971 [01:26<4:19:05,  2.62s/it, loss=0.2, v_num=0, train/loss_simple_step=0.325, train/loss_vlb_step=0.00165, train/loss_step=0.325, global_step=7.000]   
Epoch 0:   1%|          | 33/5971 [01:27<4:14:03,  2.57s/it, loss=0.2, v_num=0, train/loss_simple_step=0.325, train/loss_vlb_step=0.00165, train/loss_step=0.325, global_step=7.000]
Epoch 0:   1%|          | 33/5971 [01:27<4:14:03,  2.57s/it, loss=0.202, v_num=0, train/loss_simple_step=0.0501, train/loss_vlb_step=0.000186, train/loss_step=0.0501, global_step=8.000]
Epoch 0:   1%|          | 34/5971 [01:28<4:09:13,  2.52s/it, loss=0.202, v_num=0, train/loss_simple_step=0.0501, train/loss_vlb_step=0.000186, train/loss_step=0.0501, global_step=8.000]
Epoch 0:   1%|          | 34/5971 [01:28<4:09:13,  2.52s/it, loss=0.178, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000286, train/loss_step=0.0837, global_step=8.000]
Epoch 0:   1%|          | 35/5971 [01:29<4:04:40,  2.47s/it, loss=0.178, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000286, train/loss_step=0.0837, global_step=8.000]
Epoch 0:   1%|          | 35/5971 [01:29<4:04:40,  2.47s/it, loss=0.176, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000612, train/loss_step=0.185, global_step=8.000]  
Epoch 0:   1%|          | 36/5971 [01:31<4:04:57,  2.48s/it, loss=0.176, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000612, train/loss_step=0.185, global_step=8.000]
Epoch 0:   1%|          | 36/5971 [01:31<4:04:57,  2.48s/it, loss=0.181, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000519, train/loss_step=0.156, global_step=8.000]
Epoch 0:   1%|          | 37/5971 [01:32<4:00:51,  2.44s/it, loss=0.181, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000519, train/loss_step=0.156, global_step=8.000]
Epoch 0:   1%|          | 37/5971 [01:32<4:00:51,  2.44s/it, loss=0.176, v_num=0, train/loss_simple_step=0.469, train/loss_vlb_step=0.00381, train/loss_step=0.469, global_step=9.000] 
Epoch 0:   1%|          | 38/5971 [01:33<3:56:51,  2.40s/it, loss=0.176, v_num=0, train/loss_simple_step=0.469, train/loss_vlb_step=0.00381, train/loss_step=0.469, global_step=9.000]
Epoch 0:   1%|          | 38/5971 [01:33<3:56:51,  2.40s/it, loss=0.182, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00135, train/loss_step=0.308, global_step=9.000]
Epoch 0:   1%|          | 39/5971 [01:34<3:53:02,  2.36s/it, loss=0.182, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00135, train/loss_step=0.308, global_step=9.000]
Epoch 0:   1%|          | 39/5971 [01:34<3:53:02,  2.36s/it, loss=0.194, v_num=0, train/loss_simple_step=0.302, train/loss_vlb_step=0.00113, train/loss_step=0.302, global_step=9.000]
Epoch 0:   1%|          | 40/5971 [01:36<3:53:25,  2.36s/it, loss=0.194, v_num=0, train/loss_simple_step=0.302, train/loss_vlb_step=0.00113, train/loss_step=0.302, global_step=9.000]
Epoch 0:   1%|          | 40/5971 [01:36<3:53:25,  2.36s/it, loss=0.196, v_num=0, train/loss_simple_step=0.299, train/loss_vlb_step=0.0013, train/loss_step=0.299, global_step=9.000] 
Epoch 0:   1%|          | 41/5971 [01:37<3:50:08,  2.33s/it, loss=0.196, v_num=0, train/loss_simple_step=0.299, train/loss_vlb_step=0.0013, train/loss_step=0.299, global_step=9.000]
Epoch 0:   1%|          | 41/5971 [01:37<3:50:08,  2.33s/it, loss=0.178, v_num=0, train/loss_simple_step=0.324, train/loss_vlb_step=0.00148, train/loss_step=0.324, global_step=10.00]
Epoch 0:   1%|          | 42/5971 [01:38<3:46:46,  2.29s/it, loss=0.178, v_num=0, train/loss_simple_step=0.324, train/loss_vlb_step=0.00148, train/loss_step=0.324, global_step=10.00]
Epoch 0:   1%|          | 42/5971 [01:38<3:46:46,  2.29s/it, loss=0.184, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=10.00]
Epoch 0:   1%|          | 43/5971 [01:39<3:43:34,  2.26s/it, loss=0.184, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=10.00]
Epoch 0:   1%|          | 43/5971 [01:39<3:43:34,  2.26s/it, loss=0.206, v_num=0, train/loss_simple_step=0.467, train/loss_vlb_step=0.00338, train/loss_step=0.467, global_step=10.00] 
Epoch 0:   1%|          | 44/5971 [01:41<3:43:48,  2.27s/it, loss=0.206, v_num=0, train/loss_simple_step=0.467, train/loss_vlb_step=0.00338, train/loss_step=0.467, global_step=10.00]
Epoch 0:   1%|          | 44/5971 [01:41<3:43:48,  2.27s/it, loss=0.206, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.63e-5, train/loss_step=0.0121, global_step=10.00]
Epoch 0:   1%|          | 45/5971 [01:42<3:40:49,  2.24s/it, loss=0.206, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.63e-5, train/loss_step=0.0121, global_step=10.00]
Epoch 0:   1%|          | 45/5971 [01:42<3:40:49,  2.24s/it, loss=0.222, v_num=0, train/loss_simple_step=0.634, train/loss_vlb_step=0.00938, train/loss_step=0.634, global_step=11.00]  
Epoch 0:   1%|          | 46/5971 [01:43<3:37:55,  2.21s/it, loss=0.222, v_num=0, train/loss_simple_step=0.634, train/loss_vlb_step=0.00938, train/loss_step=0.634, global_step=11.00]
Epoch 0:   1%|          | 46/5971 [01:43<3:37:55,  2.21s/it, loss=0.242, v_num=0, train/loss_simple_step=0.412, train/loss_vlb_step=0.00258, train/loss_step=0.412, global_step=11.00]
Epoch 0:   1%|          | 47/5971 [01:44<3:35:10,  2.18s/it, loss=0.242, v_num=0, train/loss_simple_step=0.412, train/loss_vlb_step=0.00258, train/loss_step=0.412, global_step=11.00]
Epoch 0:   1%|          | 47/5971 [01:44<3:35:11,  2.18s/it, loss=0.264, v_num=0, train/loss_simple_step=0.450, train/loss_vlb_step=0.00259, train/loss_step=0.450, global_step=11.00]
Epoch 0:   1%|          | 48/5971 [01:47<3:36:18,  2.19s/it, loss=0.264, v_num=0, train/loss_simple_step=0.450, train/loss_vlb_step=0.00259, train/loss_step=0.450, global_step=11.00]
Epoch 0:   1%|          | 48/5971 [01:47<3:36:18,  2.19s/it, loss=0.244, v_num=0, train/loss_simple_step=0.0205, train/loss_vlb_step=8.18e-5, train/loss_step=0.0205, global_step=11.00]
Epoch 0:   1%|          | 49/5971 [01:48<3:34:10,  2.17s/it, loss=0.244, v_num=0, train/loss_simple_step=0.0205, train/loss_vlb_step=8.18e-5, train/loss_step=0.0205, global_step=11.00]
Epoch 0:   1%|          | 49/5971 [01:48<3:34:10,  2.17s/it, loss=0.256, v_num=0, train/loss_simple_step=0.423, train/loss_vlb_step=0.00256, train/loss_step=0.423, global_step=12.00]  
Epoch 0:   1%|          | 50/5971 [01:49<3:31:40,  2.15s/it, loss=0.256, v_num=0, train/loss_simple_step=0.423, train/loss_vlb_step=0.00256, train/loss_step=0.423, global_step=12.00]
Epoch 0:   1%|          | 50/5971 [01:49<3:31:41,  2.15s/it, loss=0.278, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00333, train/loss_step=0.480, global_step=12.00]
Epoch 0:   1%|          | 51/5971 [01:50<3:29:15,  2.12s/it, loss=0.278, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00333, train/loss_step=0.480, global_step=12.00]
Epoch 0:   1%|          | 51/5971 [01:50<3:29:15,  2.12s/it, loss=0.278, v_num=0, train/loss_simple_step=0.034, train/loss_vlb_step=0.000131, train/loss_step=0.034, global_step=12.00]
Epoch 0:   1%|          | 52/5971 [01:52<3:30:03,  2.13s/it, loss=0.278, v_num=0, train/loss_simple_step=0.034, train/loss_vlb_step=0.000131, train/loss_step=0.034, global_step=12.00]
Epoch 0:   1%|          | 52/5971 [01:52<3:30:03,  2.13s/it, loss=0.262, v_num=0, train/loss_simple_step=0.00387, train/loss_vlb_step=2.25e-5, train/loss_step=0.00387, global_step=12.00]
Epoch 0:   1%|          | 53/5971 [01:54<3:28:14,  2.11s/it, loss=0.262, v_num=0, train/loss_simple_step=0.00387, train/loss_vlb_step=2.25e-5, train/loss_step=0.00387, global_step=12.00]
Epoch 0:   1%|          | 53/5971 [01:54<3:28:14,  2.11s/it, loss=0.262, v_num=0, train/loss_simple_step=0.0538, train/loss_vlb_step=0.000194, train/loss_step=0.0538, global_step=13.00] 
Epoch 0:   1%|          | 54/5971 [01:54<3:25:59,  2.09s/it, loss=0.262, v_num=0, train/loss_simple_step=0.0538, train/loss_vlb_step=0.000194, train/loss_step=0.0538, global_step=13.00]
Epoch 0:   1%|          | 54/5971 [01:54<3:26:00,  2.09s/it, loss=0.258, v_num=0, train/loss_simple_step=0.006, train/loss_vlb_step=3.35e-5, train/loss_step=0.006, global_step=13.00]   
Epoch 0:   1%|          | 55/5971 [01:55<3:23:48,  2.07s/it, loss=0.258, v_num=0, train/loss_simple_step=0.006, train/loss_vlb_step=3.35e-5, train/loss_step=0.006, global_step=13.00]
Epoch 0:   1%|          | 55/5971 [01:55<3:23:48,  2.07s/it, loss=0.249, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=5.25e-5, train/loss_step=0.011, global_step=13.00]
Epoch 0:   1%|          | 56/5971 [01:58<3:25:37,  2.09s/it, loss=0.249, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=5.25e-5, train/loss_step=0.011, global_step=13.00]
Epoch 0:   1%|          | 56/5971 [01:58<3:25:37,  2.09s/it, loss=0.254, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.00101, train/loss_step=0.255, global_step=13.00]
Epoch 0:   1%|          | 57/5971 [01:59<3:23:35,  2.07s/it, loss=0.254, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.00101, train/loss_step=0.255, global_step=13.00]
Epoch 0:   1%|          | 57/5971 [01:59<3:23:35,  2.07s/it, loss=0.235, v_num=0, train/loss_simple_step=0.0763, train/loss_vlb_step=0.000262, train/loss_step=0.0763, global_step=14.00]
Epoch 0:   1%|          | 58/5971 [02:00<3:21:36,  2.05s/it, loss=0.235, v_num=0, train/loss_simple_step=0.0763, train/loss_vlb_step=0.000262, train/loss_step=0.0763, global_step=14.00]
Epoch 0:   1%|          | 58/5971 [02:00<3:21:36,  2.05s/it, loss=0.221, v_num=0, train/loss_simple_step=0.0417, train/loss_vlb_step=0.000161, train/loss_step=0.0417, global_step=14.00]
Epoch 0:   1%|          | 59/5971 [02:01<3:19:40,  2.03s/it, loss=0.221, v_num=0, train/loss_simple_step=0.0417, train/loss_vlb_step=0.000161, train/loss_step=0.0417, global_step=14.00]
Epoch 0:   1%|          | 59/5971 [02:01<3:19:40,  2.03s/it, loss=0.207, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.97e-5, train/loss_step=0.00966, global_step=14.00]
Epoch 0:   1%|          | 60/5971 [02:05<3:22:23,  2.05s/it, loss=0.207, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.97e-5, train/loss_step=0.00966, global_step=14.00]
Epoch 0:   1%|          | 60/5971 [02:05<3:22:23,  2.05s/it, loss=0.192, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=6.47e-5, train/loss_step=0.0132, global_step=14.00]  
Epoch 0:   1%|          | 61/5971 [02:06<3:20:32,  2.04s/it, loss=0.192, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=6.47e-5, train/loss_step=0.0132, global_step=14.00]
Epoch 0:   1%|          | 61/5971 [02:06<3:20:32,  2.04s/it, loss=0.186, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.000701, train/loss_step=0.187, global_step=15.00] 
Epoch 0:   1%|          | 62/5971 [02:07<3:18:43,  2.02s/it, loss=0.186, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.000701, train/loss_step=0.187, global_step=15.00]
Epoch 0:   1%|          | 62/5971 [02:07<3:18:43,  2.02s/it, loss=0.19, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000775, train/loss_step=0.209, global_step=15.00] 
Epoch 0:   1%|          | 63/5971 [02:08<3:16:59,  2.00s/it, loss=0.19, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000775, train/loss_step=0.209, global_step=15.00]
Epoch 0:   1%|          | 63/5971 [02:08<3:16:59,  2.00s/it, loss=0.189, v_num=0, train/loss_simple_step=0.441, train/loss_vlb_step=0.00265, train/loss_step=0.441, global_step=15.00]
Epoch 0:   1%|          | 64/5971 [02:10<3:17:34,  2.01s/it, loss=0.189, v_num=0, train/loss_simple_step=0.441, train/loss_vlb_step=0.00265, train/loss_step=0.441, global_step=15.00]
Epoch 0:   1%|          | 64/5971 [02:10<3:17:34,  2.01s/it, loss=0.189, v_num=0, train/loss_simple_step=0.0168, train/loss_vlb_step=7.67e-5, train/loss_step=0.0168, global_step=15.00]
Epoch 0:   1%|          | 65/5971 [02:11<3:16:06,  1.99s/it, loss=0.189, v_num=0, train/loss_simple_step=0.0168, train/loss_vlb_step=7.67e-5, train/loss_step=0.0168, global_step=15.00]
Epoch 0:   1%|          | 65/5971 [02:11<3:16:06,  1.99s/it, loss=0.16, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.00021, train/loss_step=0.0559, global_step=16.00] 
Epoch 0:   1%|          | 66/5971 [02:12<3:14:26,  1.98s/it, loss=0.16, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.00021, train/loss_step=0.0559, global_step=16.00]
Epoch 0:   1%|          | 66/5971 [02:12<3:14:26,  1.98s/it, loss=0.149, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000686, train/loss_step=0.201, global_step=16.00]
Epoch 0:   1%|          | 67/5971 [02:13<3:12:50,  1.96s/it, loss=0.149, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000686, train/loss_step=0.201, global_step=16.00]
Epoch 0:   1%|          | 67/5971 [02:13<3:12:50,  1.96s/it, loss=0.14, v_num=0, train/loss_simple_step=0.263, train/loss_vlb_step=0.000989, train/loss_step=0.263, global_step=16.00] 
Epoch 0:   1%|          | 68/5971 [02:15<3:13:00,  1.96s/it, loss=0.14, v_num=0, train/loss_simple_step=0.263, train/loss_vlb_step=0.000989, train/loss_step=0.263, global_step=16.00]
Epoch 0:   1%|          | 68/5971 [02:15<3:13:00,  1.96s/it, loss=0.148, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000639, train/loss_step=0.189, global_step=16.00]
Epoch 0:   1%|          | 69/5971 [02:16<3:11:29,  1.95s/it, loss=0.148, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000639, train/loss_step=0.189, global_step=16.00]
Epoch 0:   1%|          | 69/5971 [02:16<3:11:29,  1.95s/it, loss=0.129, v_num=0, train/loss_simple_step=0.0297, train/loss_vlb_step=0.000118, train/loss_step=0.0297, global_step=17.00]
Epoch 0:   1%|          | 70/5971 [02:17<3:09:58,  1.93s/it, loss=0.129, v_num=0, train/loss_simple_step=0.0297, train/loss_vlb_step=0.000118, train/loss_step=0.0297, global_step=17.00]
Epoch 0:   1%|          | 70/5971 [02:17<3:09:58,  1.93s/it, loss=0.106, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=6.81e-5, train/loss_step=0.0143, global_step=17.00] 
Epoch 0:   1%|          | 71/5971 [02:18<3:08:30,  1.92s/it, loss=0.106, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=6.81e-5, train/loss_step=0.0143, global_step=17.00]
Epoch 0:   1%|          | 71/5971 [02:18<3:08:30,  1.92s/it, loss=0.105, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.75e-5, train/loss_step=0.017, global_step=17.00]  
Epoch 0:   1%|          | 72/5971 [02:20<3:09:37,  1.93s/it, loss=0.105, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.75e-5, train/loss_step=0.017, global_step=17.00]
Epoch 0:   1%|          | 72/5971 [02:20<3:09:37,  1.93s/it, loss=0.134, v_num=0, train/loss_simple_step=0.594, train/loss_vlb_step=0.00821, train/loss_step=0.594, global_step=17.00]
Epoch 0:   1%|          | 73/5971 [02:21<3:08:17,  1.92s/it, loss=0.134, v_num=0, train/loss_simple_step=0.594, train/loss_vlb_step=0.00821, train/loss_step=0.594, global_step=17.00]
Epoch 0:   1%|          | 73/5971 [02:21<3:08:17,  1.92s/it, loss=0.132, v_num=0, train/loss_simple_step=0.00589, train/loss_vlb_step=3.22e-5, train/loss_step=0.00589, global_step=18.00]
Epoch 0:   1%|          | 74/5971 [02:22<3:06:53,  1.90s/it, loss=0.132, v_num=0, train/loss_simple_step=0.00589, train/loss_vlb_step=3.22e-5, train/loss_step=0.00589, global_step=18.00]
Epoch 0:   1%|          | 74/5971 [02:22<3:06:53,  1.90s/it, loss=0.132, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=2.11e-5, train/loss_step=0.0036, global_step=18.00]  
Epoch 0:   1%|▏         | 75/5971 [02:23<3:05:31,  1.89s/it, loss=0.132, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=2.11e-5, train/loss_step=0.0036, global_step=18.00]
Epoch 0:   1%|▏         | 75/5971 [02:23<3:05:31,  1.89s/it, loss=0.132, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=6.08e-5, train/loss_step=0.0125, global_step=18.00]
Epoch 0:   1%|▏         | 76/5971 [02:25<3:06:05,  1.89s/it, loss=0.132, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=6.08e-5, train/loss_step=0.0125, global_step=18.00]
Epoch 0:   1%|▏         | 76/5971 [02:25<3:06:05,  1.89s/it, loss=0.122, v_num=0, train/loss_simple_step=0.0571, train/loss_vlb_step=0.000199, train/loss_step=0.0571, global_step=18.00]
Epoch 0:   1%|▏         | 77/5971 [02:26<3:04:49,  1.88s/it, loss=0.122, v_num=0, train/loss_simple_step=0.0571, train/loss_vlb_step=0.000199, train/loss_step=0.0571, global_step=18.00]
Epoch 0:   1%|▏         | 77/5971 [02:26<3:04:49,  1.88s/it, loss=0.157, v_num=0, train/loss_simple_step=0.788, train/loss_vlb_step=0.021, train/loss_step=0.788, global_step=19.00]     
Epoch 0:   1%|▏         | 78/5971 [02:27<3:03:33,  1.87s/it, loss=0.157, v_num=0, train/loss_simple_step=0.788, train/loss_vlb_step=0.021, train/loss_step=0.788, global_step=19.00]
Epoch 0:   1%|▏         | 78/5971 [02:27<3:03:33,  1.87s/it, loss=0.161, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=19.00]
Epoch 0:   1%|▏         | 79/5971 [02:28<3:02:19,  1.86s/it, loss=0.161, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=19.00]
Epoch 0:   1%|▏         | 79/5971 [02:28<3:02:19,  1.86s/it, loss=0.161, v_num=0, train/loss_simple_step=0.00337, train/loss_vlb_step=1.91e-5, train/loss_step=0.00337, global_step=19.00]
Epoch 0:   1%|▏         | 80/5971 [02:31<3:04:04,  1.87s/it, loss=0.161, v_num=0, train/loss_simple_step=0.00337, train/loss_vlb_step=1.91e-5, train/loss_step=0.00337, global_step=19.00]
Epoch 0:   1%|▏         | 80/5971 [02:31<3:04:04,  1.87s/it, loss=0.184, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00431, train/loss_step=0.480, global_step=19.00]    
Epoch 0:   1%|▏         | 81/5971 [02:32<3:02:54,  1.86s/it, loss=0.184, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00431, train/loss_step=0.480, global_step=19.00]
Epoch 0:   1%|▏         | 81/5971 [02:32<3:02:54,  1.86s/it, loss=0.181, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000362, train/loss_step=0.109, global_step=20.00]
Epoch 0:   1%|▏         | 82/5971 [02:33<3:01:42,  1.85s/it, loss=0.181, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000362, train/loss_step=0.109, global_step=20.00]
Epoch 0:   1%|▏         | 82/5971 [02:33<3:01:42,  1.85s/it, loss=0.173, v_num=0, train/loss_simple_step=0.0516, train/loss_vlb_step=0.000191, train/loss_step=0.0516, global_step=20.00]
Epoch 0:   1%|▏         | 83/5971 [02:34<3:00:32,  1.84s/it, loss=0.173, v_num=0, train/loss_simple_step=0.0516, train/loss_vlb_step=0.000191, train/loss_step=0.0516, global_step=20.00]
Epoch 0:   1%|▏         | 83/5971 [02:34<3:00:32,  1.84s/it, loss=0.161, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000683, train/loss_step=0.201, global_step=20.00]  
Epoch 0:   1%|▏         | 84/5971 [02:37<3:01:19,  1.85s/it, loss=0.161, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000683, train/loss_step=0.201, global_step=20.00]
Epoch 0:   1%|▏         | 84/5971 [02:37<3:01:19,  1.85s/it, loss=0.162, v_num=0, train/loss_simple_step=0.0487, train/loss_vlb_step=0.000181, train/loss_step=0.0487, global_step=20.00]
Epoch 0:   1%|▏         | 85/5971 [02:38<3:00:33,  1.84s/it, loss=0.162, v_num=0, train/loss_simple_step=0.0487, train/loss_vlb_step=0.000181, train/loss_step=0.0487, global_step=20.00]
Epoch 0:   1%|▏         | 85/5971 [02:38<3:00:33,  1.84s/it, loss=0.161, v_num=0, train/loss_simple_step=0.0284, train/loss_vlb_step=0.000118, train/loss_step=0.0284, global_step=21.00]
Epoch 0:   1%|▏         | 86/5971 [02:39<2:59:27,  1.83s/it, loss=0.161, v_num=0, train/loss_simple_step=0.0284, train/loss_vlb_step=0.000118, train/loss_step=0.0284, global_step=21.00]
Epoch 0:   1%|▏         | 86/5971 [02:39<2:59:27,  1.83s/it, loss=0.151, v_num=0, train/loss_simple_step=0.00734, train/loss_vlb_step=3.66e-5, train/loss_step=0.00734, global_step=21.00]
Epoch 0:   1%|▏         | 87/5971 [02:40<2:58:21,  1.82s/it, loss=0.151, v_num=0, train/loss_simple_step=0.00734, train/loss_vlb_step=3.66e-5, train/loss_step=0.00734, global_step=21.00]
Epoch 0:   1%|▏         | 87/5971 [02:40<2:58:21,  1.82s/it, loss=0.145, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000446, train/loss_step=0.135, global_step=21.00]   
Epoch 0:   1%|▏         | 88/5971 [02:42<2:58:37,  1.82s/it, loss=0.145, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000446, train/loss_step=0.135, global_step=21.00]
Epoch 0:   1%|▏         | 88/5971 [02:42<2:58:37,  1.82s/it, loss=0.136, v_num=0, train/loss_simple_step=0.00274, train/loss_vlb_step=1.62e-5, train/loss_step=0.00274, global_step=21.00]
Epoch 0:   1%|▏         | 89/5971 [02:43<2:57:34,  1.81s/it, loss=0.136, v_num=0, train/loss_simple_step=0.00274, train/loss_vlb_step=1.62e-5, train/loss_step=0.00274, global_step=21.00]
Epoch 0:   1%|▏         | 89/5971 [02:43<2:57:34,  1.81s/it, loss=0.134, v_num=0, train/loss_simple_step=0.0056, train/loss_vlb_step=3.1e-5, train/loss_step=0.0056, global_step=22.00]   
Epoch 0:   2%|▏         | 90/5971 [02:43<2:56:32,  1.80s/it, loss=0.134, v_num=0, train/loss_simple_step=0.0056, train/loss_vlb_step=3.1e-5, train/loss_step=0.0056, global_step=22.00]
Epoch 0:   2%|▏         | 90/5971 [02:43<2:56:32,  1.80s/it, loss=0.134, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.38e-5, train/loss_step=0.004, global_step=22.00] 
Epoch 0:   2%|▏         | 91/5971 [02:44<2:55:31,  1.79s/it, loss=0.134, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.38e-5, train/loss_step=0.004, global_step=22.00]
Epoch 0:   2%|▏         | 91/5971 [02:44<2:55:32,  1.79s/it, loss=0.141, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000603, train/loss_step=0.170, global_step=22.00]
Epoch 0:   2%|▏         | 92/5971 [02:47<2:56:18,  1.80s/it, loss=0.141, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000603, train/loss_step=0.170, global_step=22.00]
Epoch 0:   2%|▏         | 92/5971 [02:47<2:56:18,  1.80s/it, loss=0.123, v_num=0, train/loss_simple_step=0.223, train/loss_vlb_step=0.000799, train/loss_step=0.223, global_step=22.00]
Epoch 0:   2%|▏         | 93/5971 [02:48<2:55:19,  1.79s/it, loss=0.123, v_num=0, train/loss_simple_step=0.223, train/loss_vlb_step=0.000799, train/loss_step=0.223, global_step=22.00]
Epoch 0:   2%|▏         | 93/5971 [02:48<2:55:19,  1.79s/it, loss=0.125, v_num=0, train/loss_simple_step=0.0565, train/loss_vlb_step=0.000199, train/loss_step=0.0565, global_step=23.00]
Epoch 0:   2%|▏         | 94/5971 [02:49<2:54:20,  1.78s/it, loss=0.125, v_num=0, train/loss_simple_step=0.0565, train/loss_vlb_step=0.000199, train/loss_step=0.0565, global_step=23.00]
Epoch 0:   2%|▏         | 94/5971 [02:49<2:54:20,  1.78s/it, loss=0.127, v_num=0, train/loss_simple_step=0.0436, train/loss_vlb_step=0.000161, train/loss_step=0.0436, global_step=23.00]
Epoch 0:   2%|▏         | 95/5971 [02:50<2:53:31,  1.77s/it, loss=0.127, v_num=0, train/loss_simple_step=0.0436, train/loss_vlb_step=0.000161, train/loss_step=0.0436, global_step=23.00]
Epoch 0:   2%|▏         | 95/5971 [02:50<2:53:31,  1.77s/it, loss=0.128, v_num=0, train/loss_simple_step=0.029, train/loss_vlb_step=0.000119, train/loss_step=0.029, global_step=23.00]  
Epoch 0:   2%|▏         | 96/5971 [02:52<2:54:31,  1.78s/it, loss=0.128, v_num=0, train/loss_simple_step=0.029, train/loss_vlb_step=0.000119, train/loss_step=0.029, global_step=23.00]
Epoch 0:   2%|▏         | 96/5971 [02:52<2:54:31,  1.78s/it, loss=0.126, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=6.18e-5, train/loss_step=0.0125, global_step=23.00]
Epoch 0:   2%|▏         | 97/5971 [02:53<2:53:37,  1.77s/it, loss=0.126, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=6.18e-5, train/loss_step=0.0125, global_step=23.00]
Epoch 0:   2%|▏         | 97/5971 [02:53<2:53:37,  1.77s/it, loss=0.0923, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000373, train/loss_step=0.113, global_step=24.00]
Epoch 0:   2%|▏         | 98/5971 [02:54<2:52:42,  1.76s/it, loss=0.0923, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000373, train/loss_step=0.113, global_step=24.00]
Epoch 0:   2%|▏         | 98/5971 [02:54<2:52:42,  1.76s/it, loss=0.0866, v_num=0, train/loss_simple_step=0.00843, train/loss_vlb_step=4.15e-5, train/loss_step=0.00843, global_step=24.00]
Epoch 0:   2%|▏         | 99/5971 [02:55<2:51:48,  1.76s/it, loss=0.0866, v_num=0, train/loss_simple_step=0.00843, train/loss_vlb_step=4.15e-5, train/loss_step=0.00843, global_step=24.00]
Epoch 0:   2%|▏         | 99/5971 [02:55<2:51:48,  1.76s/it, loss=0.0872, v_num=0, train/loss_simple_step=0.0152, train/loss_vlb_step=7.16e-5, train/loss_step=0.0152, global_step=24.00]  
Epoch 0:   2%|▏         | 100/5971 [02:57<2:52:23,  1.76s/it, loss=0.0872, v_num=0, train/loss_simple_step=0.0152, train/loss_vlb_step=7.16e-5, train/loss_step=0.0152, global_step=24.00]
Epoch 0:   2%|▏         | 100/5971 [02:57<2:52:23,  1.76s/it, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:08,  2.43it/s]
Epoch 0:   2%|▏         | 102/5971 [02:58<2:49:25,  1.73s/it, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:   1%|          | 2/167 [00:00<00:53,  3.06it/s]
Epoch 0:   2%|▏         | 104/5971 [02:58<2:46:24,  1.70s/it, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:   3%|▎         | 5/167 [00:00<00:19,  8.47it/s]
Epoch 0:   2%|▏         | 107/5971 [02:58<2:41:48,  1.66s/it, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:   5%|▍         | 8/167 [00:00<00:12, 12.95it/s]
Epoch 0:   2%|▏         | 110/5971 [02:58<2:37:27,  1.61s/it, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:   7%|▋         | 11/167 [00:01<00:09, 16.19it/s]
Epoch 0:   2%|▏         | 113/5971 [02:59<2:33:19,  1.57s/it, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.41it/s]
Epoch 0:   2%|▏         | 116/5971 [02:59<2:29:25,  1.53s/it, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  10%|█         | 17/167 [00:01<00:07, 20.68it/s]
Epoch 0:   2%|▏         | 119/5971 [02:59<2:25:42,  1.49s/it, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 24.07it/s]
Epoch 0:   2%|▏         | 123/5971 [02:59<2:21:00,  1.45s/it, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  15%|█▍        | 25/167 [00:01<00:05, 26.84it/s]
Epoch 0:   2%|▏         | 127/5971 [02:59<2:16:35,  1.40s/it, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  17%|█▋        | 28/167 [00:01<00:05, 27.51it/s]
Epoch 0:   2%|▏         | 131/5971 [02:59<2:12:28,  1.36s/it, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  19%|█▊        | 31/167 [00:01<00:04, 27.71it/s]

Validating:  20%|██        | 34/167 [00:01<00:04, 27.87it/s]
Epoch 0:   2%|▏         | 135/5971 [02:59<2:08:35,  1.32s/it, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  23%|██▎       | 38/167 [00:01<00:04, 28.98it/s]
Epoch 0:   2%|▏         | 139/5971 [02:59<2:04:55,  1.29s/it, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 28.84it/s]
Epoch 0:   2%|▏         | 143/5971 [03:00<2:01:27,  1.25s/it, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  28%|██▊       | 46/167 [00:02<00:04, 29.93it/s]
Epoch 0:   2%|▏         | 147/5971 [03:00<1:58:10,  1.22s/it, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  30%|██▉       | 50/167 [00:02<00:03, 30.98it/s]
Epoch 0:   3%|▎         | 151/5971 [03:00<1:55:03,  1.19s/it, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  32%|███▏      | 54/167 [00:02<00:03, 29.71it/s]
Epoch 0:   3%|▎         | 155/5971 [03:00<1:52:07,  1.16s/it, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  35%|███▍      | 58/167 [00:02<00:03, 30.81it/s]
Epoch 0:   3%|▎         | 159/5971 [03:00<1:49:19,  1.13s/it, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  37%|███▋      | 62/167 [00:02<00:03, 28.49it/s]
Epoch 0:   3%|▎         | 163/5971 [03:00<1:46:40,  1.10s/it, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  39%|███▉      | 65/167 [00:02<00:03, 28.37it/s]
Epoch 0:   3%|▎         | 167/5971 [03:00<1:44:08,  1.08s/it, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  41%|████      | 68/167 [00:02<00:03, 28.64it/s]
Epoch 0:   3%|▎         | 171/5971 [03:01<1:41:44,  1.05s/it, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 27.64it/s]
Epoch 0:   3%|▎         | 175/5971 [03:01<1:39:25,  1.03s/it, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 28.48it/s]
Epoch 0:   3%|▎         | 179/5971 [03:01<1:37:13,  1.01s/it, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  47%|████▋     | 79/167 [00:03<00:02, 29.44it/s]

Validating:  49%|████▉     | 82/167 [00:03<00:02, 29.41it/s]
Epoch 0:   3%|▎         | 183/5971 [03:01<1:35:06,  1.01it/s, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  51%|█████     | 85/167 [00:03<00:02, 27.77it/s]
Epoch 0:   3%|▎         | 187/5971 [03:01<1:33:07,  1.04it/s, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  53%|█████▎    | 88/167 [00:03<00:02, 26.76it/s]
Epoch 0:   3%|▎         | 191/5971 [03:01<1:31:11,  1.06it/s, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  54%|█████▍    | 91/167 [00:03<00:02, 27.21it/s]

Validating:  56%|█████▋    | 94/167 [00:03<00:02, 27.00it/s]
Epoch 0:   3%|▎         | 195/5971 [03:01<1:29:20,  1.08it/s, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 26.84it/s]
Epoch 0:   3%|▎         | 199/5971 [03:02<1:27:33,  1.10it/s, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 27.23it/s]
Epoch 0:   3%|▎         | 203/5971 [03:02<1:25:50,  1.12it/s, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 27.21it/s]
Epoch 0:   3%|▎         | 207/5971 [03:02<1:24:12,  1.14it/s, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 28.23it/s]

Validating:  66%|██████▌   | 110/167 [00:04<00:01, 28.53it/s]
Epoch 0:   4%|▎         | 211/5971 [03:02<1:22:37,  1.16it/s, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  68%|██████▊   | 113/167 [00:04<00:01, 28.52it/s]
Epoch 0:   4%|▎         | 215/5971 [03:02<1:21:05,  1.18it/s, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  70%|███████   | 117/167 [00:04<00:01, 29.78it/s]
Epoch 0:   4%|▎         | 219/5971 [03:02<1:19:37,  1.20it/s, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  72%|███████▏  | 120/167 [00:04<00:01, 28.60it/s]
Epoch 0:   4%|▎         | 223/5971 [03:02<1:18:12,  1.22it/s, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  74%|███████▍  | 124/167 [00:04<00:01, 29.48it/s]
Epoch 0:   4%|▍         | 227/5971 [03:02<1:16:49,  1.25it/s, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 30.19it/s]
Epoch 0:   4%|▍         | 231/5971 [03:03<1:15:30,  1.27it/s, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 28.26it/s]
Epoch 0:   4%|▍         | 235/5971 [03:03<1:14:14,  1.29it/s, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  81%|████████  | 135/167 [00:05<00:01, 28.50it/s]
Epoch 0:   4%|▍         | 239/5971 [03:03<1:13:00,  1.31it/s, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  83%|████████▎ | 139/167 [00:05<00:00, 28.90it/s]

Validating:  85%|████████▌ | 142/167 [00:05<00:00, 27.71it/s]
Epoch 0:   4%|▍         | 243/5971 [03:03<1:11:49,  1.33it/s, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  87%|████████▋ | 145/167 [00:05<00:00, 26.48it/s]
Epoch 0:   4%|▍         | 247/5971 [03:03<1:10:40,  1.35it/s, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  89%|████████▊ | 148/167 [00:05<00:00, 26.16it/s]
Epoch 0:   4%|▍         | 251/5971 [03:03<1:09:34,  1.37it/s, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  90%|█████████ | 151/167 [00:05<00:00, 26.04it/s]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 26.07it/s]
Epoch 0:   4%|▍         | 255/5971 [03:04<1:08:29,  1.39it/s, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 24.49it/s]
Epoch 0:   4%|▍         | 259/5971 [03:04<1:07:27,  1.41it/s, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 24.32it/s]
Epoch 0:   4%|▍         | 263/5971 [03:04<1:06:26,  1.43it/s, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 25.67it/s]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 25.63it/s]
Epoch 0:   4%|▍         | 267/5971 [03:04<1:05:27,  1.45it/s, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]
Epoch 0:   4%|▍         | 268/5971 [03:04<1:05:21,  1.45it/s, loss=0.0636, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.8e-5, train/loss_step=0.00731, global_step=24.00]

Epoch 0:   5%|▍         | 269/5971 [03:05<1:05:27,  1.45it/s, loss=0.0626, v_num=0, train/loss_simple_step=0.0896, train/loss_vlb_step=0.000298, train/loss_step=0.0896, global_step=25.00]
Epoch 0:   5%|▍         | 270/5971 [03:06<1:05:31,  1.45it/s, loss=0.0784, v_num=0, train/loss_simple_step=0.367, train/loss_vlb_step=0.00167, train/loss_step=0.367, global_step=25.00]   
Epoch 0:   5%|▍         | 271/5971 [03:07<1:05:34,  1.45it/s, loss=0.0784, v_num=0, train/loss_simple_step=0.367, train/loss_vlb_step=0.00167, train/loss_step=0.367, global_step=25.00]
Epoch 0:   5%|▍         | 271/5971 [03:07<1:05:34,  1.45it/s, loss=0.069, v_num=0, train/loss_simple_step=0.012, train/loss_vlb_step=5.87e-5, train/loss_step=0.012, global_step=25.00] 
Epoch 0:   5%|▍         | 272/5971 [03:09<1:06:04,  1.44it/s, loss=0.0712, v_num=0, train/loss_simple_step=0.0937, train/loss_vlb_step=0.000317, train/loss_step=0.0937, global_step=25.00]
Epoch 0:   5%|▍         | 273/5971 [03:10<1:06:09,  1.44it/s, loss=0.0789, v_num=0, train/loss_simple_step=0.183, train/loss_vlb_step=0.000638, train/loss_step=0.183, global_step=26.00]  
Epoch 0:   5%|▍         | 274/5971 [03:11<1:06:12,  1.43it/s, loss=0.116, v_num=0, train/loss_simple_step=0.758, train/loss_vlb_step=0.0158, train/loss_step=0.758, global_step=26.00]   
Epoch 0:   5%|▍         | 275/5971 [03:12<1:06:15,  1.43it/s, loss=0.116, v_num=0, train/loss_simple_step=0.758, train/loss_vlb_step=0.0158, train/loss_step=0.758, global_step=26.00]
Epoch 0:   5%|▍         | 275/5971 [03:12<1:06:15,  1.43it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0529, train/loss_vlb_step=0.000197, train/loss_step=0.0529, global_step=26.00]
Epoch 0:   5%|▍         | 276/5971 [03:14<1:06:45,  1.42it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0641, train/loss_vlb_step=0.000231, train/loss_step=0.0641, global_step=26.00]
Epoch 0:   5%|▍         | 277/5971 [03:15<1:06:52,  1.42it/s, loss=0.136, v_num=0, train/loss_simple_step=0.420, train/loss_vlb_step=0.0048, train/loss_step=0.420, global_step=27.00]    
Epoch 0:   5%|▍         | 278/5971 [03:16<1:06:55,  1.42it/s, loss=0.142, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.00042, train/loss_step=0.128, global_step=27.00]
Epoch 0:   5%|▍         | 279/5971 [03:18<1:07:07,  1.41it/s, loss=0.142, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.00042, train/loss_step=0.128, global_step=27.00]
Epoch 0:   5%|▍         | 279/5971 [03:18<1:07:07,  1.41it/s, loss=0.162, v_num=0, train/loss_simple_step=0.556, train/loss_vlb_step=0.00403, train/loss_step=0.556, global_step=27.00]
Epoch 0:   5%|▍         | 280/5971 [03:20<1:07:42,  1.40it/s, loss=0.156, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000399, train/loss_step=0.119, global_step=27.00]
Epoch 0:   5%|▍         | 281/5971 [03:21<1:07:51,  1.40it/s, loss=0.16, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000398, train/loss_step=0.119, global_step=28.00] 
Epoch 0:   5%|▍         | 282/5971 [03:22<1:07:56,  1.40it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0165, train/loss_vlb_step=7.56e-5, train/loss_step=0.0165, global_step=28.00]
Epoch 0:   5%|▍         | 283/5971 [03:23<1:07:59,  1.39it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0165, train/loss_vlb_step=7.56e-5, train/loss_step=0.0165, global_step=28.00]
Epoch 0:   5%|▍         | 283/5971 [03:23<1:07:59,  1.39it/s, loss=0.197, v_num=0, train/loss_simple_step=0.799, train/loss_vlb_step=0.0321, train/loss_step=0.799, global_step=28.00]   
Epoch 0:   5%|▍         | 284/5971 [03:25<1:08:26,  1.39it/s, loss=0.2, v_num=0, train/loss_simple_step=0.069, train/loss_vlb_step=0.000252, train/loss_step=0.069, global_step=28.00]
Epoch 0:   5%|▍         | 285/5971 [03:26<1:08:28,  1.38it/s, loss=0.194, v_num=0, train/loss_simple_step=0.00649, train/loss_vlb_step=3.46e-5, train/loss_step=0.00649, global_step=29.00]
Epoch 0:   5%|▍         | 286/5971 [03:27<1:08:31,  1.38it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0405, train/loss_vlb_step=0.000152, train/loss_step=0.0405, global_step=29.00] 
Epoch 0:   5%|▍         | 287/5971 [03:28<1:08:33,  1.38it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0405, train/loss_vlb_step=0.000152, train/loss_step=0.0405, global_step=29.00]
Epoch 0:   5%|▍         | 287/5971 [03:28<1:08:33,  1.38it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0103, train/loss_vlb_step=4.88e-5, train/loss_step=0.0103, global_step=29.00] 
Epoch 0:   5%|▍         | 288/5971 [03:30<1:09:00,  1.37it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0307, train/loss_vlb_step=0.000121, train/loss_step=0.0307, global_step=29.00]
Epoch 0:   5%|▍         | 289/5971 [03:31<1:09:03,  1.37it/s, loss=0.203, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000711, train/loss_step=0.211, global_step=30.00]  
Epoch 0:   5%|▍         | 290/5971 [03:32<1:09:05,  1.37it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.8e-5, train/loss_step=0.0149, global_step=30.00]
Epoch 0:   5%|▍         | 291/5971 [03:33<1:09:06,  1.37it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.8e-5, train/loss_step=0.0149, global_step=30.00]
Epoch 0:   5%|▍         | 291/5971 [03:33<1:09:06,  1.37it/s, loss=0.194, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000632, train/loss_step=0.184, global_step=30.00]
Epoch 0:   5%|▍         | 292/5971 [03:36<1:10:00,  1.35it/s, loss=0.195, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000415, train/loss_step=0.126, global_step=30.00]
Epoch 0:   5%|▍         | 293/5971 [03:37<1:10:02,  1.35it/s, loss=0.2, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.000933, train/loss_step=0.269, global_step=31.00]  
Epoch 0:   5%|▍         | 294/5971 [03:38<1:10:04,  1.35it/s, loss=0.19, v_num=0, train/loss_simple_step=0.555, train/loss_vlb_step=0.0056, train/loss_step=0.555, global_step=31.00] 
Epoch 0:   5%|▍         | 295/5971 [03:39<1:10:05,  1.35it/s, loss=0.19, v_num=0, train/loss_simple_step=0.555, train/loss_vlb_step=0.0056, train/loss_step=0.555, global_step=31.00]
Epoch 0:   5%|▍         | 295/5971 [03:39<1:10:05,  1.35it/s, loss=0.211, v_num=0, train/loss_simple_step=0.490, train/loss_vlb_step=0.00318, train/loss_step=0.490, global_step=31.00]
Epoch 0:   5%|▍         | 296/5971 [03:42<1:10:44,  1.34it/s, loss=0.21, v_num=0, train/loss_simple_step=0.0379, train/loss_vlb_step=0.00015, train/loss_step=0.0379, global_step=31.00]
Epoch 0:   5%|▍         | 297/5971 [03:43<1:10:46,  1.34it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0448, train/loss_vlb_step=0.000156, train/loss_step=0.0448, global_step=32.00]
Epoch 0:   5%|▍         | 298/5971 [03:43<1:10:48,  1.34it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=5.09e-5, train/loss_step=0.0108, global_step=32.00] 
Epoch 0:   5%|▌         | 299/5971 [03:44<1:10:50,  1.33it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=5.09e-5, train/loss_step=0.0108, global_step=32.00]
Epoch 0:   5%|▌         | 299/5971 [03:44<1:10:50,  1.33it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0615, train/loss_vlb_step=0.000215, train/loss_step=0.0615, global_step=32.00]
Epoch 0:   5%|▌         | 300/5971 [03:46<1:11:15,  1.33it/s, loss=0.165, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.000775, train/loss_step=0.206, global_step=32.00]  
Epoch 0:   5%|▌         | 301/5971 [03:47<1:11:17,  1.33it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00659, train/loss_vlb_step=3.42e-5, train/loss_step=0.00659, global_step=33.00]
Epoch 0:   5%|▌         | 302/5971 [03:48<1:11:19,  1.32it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0372, train/loss_vlb_step=0.000151, train/loss_step=0.0372, global_step=33.00]
Epoch 0:   5%|▌         | 303/5971 [03:49<1:11:20,  1.32it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0372, train/loss_vlb_step=0.000151, train/loss_step=0.0372, global_step=33.00]
Epoch 0:   5%|▌         | 303/5971 [03:49<1:11:20,  1.32it/s, loss=0.122, v_num=0, train/loss_simple_step=0.031, train/loss_vlb_step=0.000119, train/loss_step=0.031, global_step=33.00]  
Epoch 0:   5%|▌         | 304/5971 [03:51<1:11:49,  1.31it/s, loss=0.127, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000523, train/loss_step=0.158, global_step=33.00]
Epoch 0:   5%|▌         | 305/5971 [03:52<1:11:51,  1.31it/s, loss=0.142, v_num=0, train/loss_simple_step=0.316, train/loss_vlb_step=0.00134, train/loss_step=0.316, global_step=34.00] 
Epoch 0:   5%|▌         | 306/5971 [03:53<1:11:53,  1.31it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0244, train/loss_vlb_step=0.0001, train/loss_step=0.0244, global_step=34.00]
Epoch 0:   5%|▌         | 307/5971 [03:54<1:11:54,  1.31it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0244, train/loss_vlb_step=0.0001, train/loss_step=0.0244, global_step=34.00]
Epoch 0:   5%|▌         | 307/5971 [03:54<1:11:54,  1.31it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000327, train/loss_step=0.0991, global_step=34.00]
Epoch 0:   5%|▌         | 308/5971 [03:56<1:12:19,  1.31it/s, loss=0.15, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000415, train/loss_step=0.125, global_step=34.00]   
Epoch 0:   5%|▌         | 309/5971 [03:57<1:12:21,  1.30it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00276, train/loss_vlb_step=1.65e-5, train/loss_step=0.00276, global_step=35.00]
Epoch 0:   5%|▌         | 310/5971 [03:58<1:12:22,  1.30it/s, loss=0.164, v_num=0, train/loss_simple_step=0.502, train/loss_vlb_step=0.00462, train/loss_step=0.502, global_step=35.00]   
Epoch 0:   5%|▌         | 311/5971 [03:59<1:12:23,  1.30it/s, loss=0.164, v_num=0, train/loss_simple_step=0.502, train/loss_vlb_step=0.00462, train/loss_step=0.502, global_step=35.00]
Epoch 0:   5%|▌         | 311/5971 [03:59<1:12:23,  1.30it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00273, train/loss_vlb_step=1.66e-5, train/loss_step=0.00273, global_step=35.00]
Epoch 0:   5%|▌         | 312/5971 [04:01<1:12:55,  1.29it/s, loss=0.167, v_num=0, train/loss_simple_step=0.366, train/loss_vlb_step=0.00172, train/loss_step=0.366, global_step=35.00]    
Epoch 0:   5%|▌         | 313/5971 [04:02<1:12:56,  1.29it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00507, train/loss_vlb_step=2.71e-5, train/loss_step=0.00507, global_step=36.00]
Epoch 0:   5%|▌         | 314/5971 [04:03<1:12:58,  1.29it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0759, train/loss_vlb_step=0.000253, train/loss_step=0.0759, global_step=36.00]  
Epoch 0:   5%|▌         | 315/5971 [04:04<1:12:59,  1.29it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0759, train/loss_vlb_step=0.000253, train/loss_step=0.0759, global_step=36.00]
Epoch 0:   5%|▌         | 315/5971 [04:04<1:12:59,  1.29it/s, loss=0.115, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.00064, train/loss_step=0.190, global_step=36.00]  
Epoch 0:   5%|▌         | 316/5971 [04:06<1:13:22,  1.28it/s, loss=0.125, v_num=0, train/loss_simple_step=0.238, train/loss_vlb_step=0.000865, train/loss_step=0.238, global_step=36.00]
Epoch 0:   5%|▌         | 317/5971 [04:07<1:13:24,  1.28it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00319, train/loss_vlb_step=1.86e-5, train/loss_step=0.00319, global_step=37.00]
Epoch 0:   5%|▌         | 318/5971 [04:08<1:13:25,  1.28it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0621, train/loss_vlb_step=0.000213, train/loss_step=0.0621, global_step=37.00] 
Epoch 0:   5%|▌         | 319/5971 [04:09<1:13:26,  1.28it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0621, train/loss_vlb_step=0.000213, train/loss_step=0.0621, global_step=37.00]
Epoch 0:   5%|▌         | 319/5971 [04:09<1:13:26,  1.28it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0259, train/loss_vlb_step=0.000102, train/loss_step=0.0259, global_step=37.00]
Epoch 0:   5%|▌         | 320/5971 [04:11<1:13:49,  1.28it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0612, train/loss_vlb_step=0.000217, train/loss_step=0.0612, global_step=37.00]
Epoch 0:   5%|▌         | 321/5971 [04:12<1:13:51,  1.27it/s, loss=0.127, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000763, train/loss_step=0.211, global_step=38.00]  
Epoch 0:   5%|▌         | 322/5971 [04:13<1:13:52,  1.27it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00244, train/loss_vlb_step=1.46e-5, train/loss_step=0.00244, global_step=38.00]
Epoch 0:   5%|▌         | 323/5971 [04:14<1:13:53,  1.27it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00244, train/loss_vlb_step=1.46e-5, train/loss_step=0.00244, global_step=38.00]
Epoch 0:   5%|▌         | 323/5971 [04:14<1:13:53,  1.27it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0481, train/loss_vlb_step=0.000176, train/loss_step=0.0481, global_step=38.00] 
Epoch 0:   5%|▌         | 324/5971 [04:16<1:14:19,  1.27it/s, loss=0.127, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000587, train/loss_step=0.173, global_step=38.00]  
Epoch 0:   5%|▌         | 325/5971 [04:17<1:14:21,  1.27it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0773, train/loss_vlb_step=0.000268, train/loss_step=0.0773, global_step=39.00]
Epoch 0:   5%|▌         | 326/5971 [04:18<1:14:22,  1.27it/s, loss=0.12, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000456, train/loss_step=0.138, global_step=39.00]   
Epoch 0:   5%|▌         | 327/5971 [04:19<1:14:22,  1.26it/s, loss=0.12, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000456, train/loss_step=0.138, global_step=39.00]
Epoch 0:   5%|▌         | 327/5971 [04:19<1:14:22,  1.26it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=6.34e-5, train/loss_step=0.0136, global_step=39.00]
Epoch 0:   5%|▌         | 328/5971 [04:21<1:14:47,  1.26it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00402, train/loss_vlb_step=2.27e-5, train/loss_step=0.00402, global_step=39.00]
Epoch 0:   6%|▌         | 329/5971 [04:22<1:14:48,  1.26it/s, loss=0.129, v_num=0, train/loss_simple_step=0.385, train/loss_vlb_step=0.00197, train/loss_step=0.385, global_step=40.00]   
Epoch 0:   6%|▌         | 330/5971 [04:23<1:14:49,  1.26it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0395, train/loss_vlb_step=0.000149, train/loss_step=0.0395, global_step=40.00]
Epoch 0:   6%|▌         | 331/5971 [04:24<1:14:50,  1.26it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0395, train/loss_vlb_step=0.000149, train/loss_step=0.0395, global_step=40.00]
Epoch 0:   6%|▌         | 331/5971 [04:24<1:14:50,  1.26it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0536, train/loss_vlb_step=0.000192, train/loss_step=0.0536, global_step=40.00]
Epoch 0:   6%|▌         | 332/5971 [04:26<1:15:16,  1.25it/s, loss=0.0905, v_num=0, train/loss_simple_step=0.00397, train/loss_vlb_step=2.3e-5, train/loss_step=0.00397, global_step=40.00]
Epoch 0:   6%|▌         | 333/5971 [04:27<1:15:18,  1.25it/s, loss=0.0905, v_num=0, train/loss_simple_step=0.00512, train/loss_vlb_step=2.82e-5, train/loss_step=0.00512, global_step=41.00]
Epoch 0:   6%|▌         | 334/5971 [04:28<1:15:18,  1.25it/s, loss=0.0872, v_num=0, train/loss_simple_step=0.00956, train/loss_vlb_step=4.58e-5, train/loss_step=0.00956, global_step=41.00]
Epoch 0:   6%|▌         | 335/5971 [04:29<1:15:21,  1.25it/s, loss=0.0872, v_num=0, train/loss_simple_step=0.00956, train/loss_vlb_step=4.58e-5, train/loss_step=0.00956, global_step=41.00]
Epoch 0:   6%|▌         | 335/5971 [04:29<1:15:21,  1.25it/s, loss=0.0805, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000196, train/loss_step=0.0552, global_step=41.00] 
Epoch 0:   6%|▌         | 336/5971 [04:31<1:15:42,  1.24it/s, loss=0.0704, v_num=0, train/loss_simple_step=0.0362, train/loss_vlb_step=0.000137, train/loss_step=0.0362, global_step=41.00]
Epoch 0:   6%|▌         | 337/5971 [04:32<1:15:43,  1.24it/s, loss=0.09, v_num=0, train/loss_simple_step=0.394, train/loss_vlb_step=0.00242, train/loss_step=0.394, global_step=42.00]     
Epoch 0:   6%|▌         | 338/5971 [04:33<1:15:43,  1.24it/s, loss=0.0928, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000393, train/loss_step=0.119, global_step=42.00]
Epoch 0:   6%|▌         | 339/5971 [04:34<1:15:44,  1.24it/s, loss=0.0928, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000393, train/loss_step=0.119, global_step=42.00]
Epoch 0:   6%|▌         | 339/5971 [04:34<1:15:44,  1.24it/s, loss=0.0966, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000337, train/loss_step=0.102, global_step=42.00]
Epoch 0:   6%|▌         | 340/5971 [04:36<1:16:08,  1.23it/s, loss=0.136, v_num=0, train/loss_simple_step=0.853, train/loss_vlb_step=0.144, train/loss_step=0.853, global_step=42.00]    
Epoch 0:   6%|▌         | 341/5971 [04:37<1:16:09,  1.23it/s, loss=0.143, v_num=0, train/loss_simple_step=0.352, train/loss_vlb_step=0.00176, train/loss_step=0.352, global_step=43.00]
Epoch 0:   6%|▌         | 342/5971 [04:38<1:16:09,  1.23it/s, loss=0.166, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00243, train/loss_step=0.452, global_step=43.00]
Epoch 0:   6%|▌         | 343/5971 [04:39<1:16:09,  1.23it/s, loss=0.166, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00243, train/loss_step=0.452, global_step=43.00]
Epoch 0:   6%|▌         | 343/5971 [04:39<1:16:09,  1.23it/s, loss=0.183, v_num=0, train/loss_simple_step=0.392, train/loss_vlb_step=0.00192, train/loss_step=0.392, global_step=43.00]
Epoch 0:   6%|▌         | 344/5971 [04:41<1:16:30,  1.23it/s, loss=0.176, v_num=0, train/loss_simple_step=0.025, train/loss_vlb_step=0.000101, train/loss_step=0.025, global_step=43.00]
Epoch 0:   6%|▌         | 345/5971 [04:42<1:16:31,  1.23it/s, loss=0.18, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.000567, train/loss_step=0.171, global_step=44.00] 
Epoch 0:   6%|▌         | 346/5971 [04:43<1:16:31,  1.23it/s, loss=0.194, v_num=0, train/loss_simple_step=0.422, train/loss_vlb_step=0.00226, train/loss_step=0.422, global_step=44.00]
Epoch 0:   6%|▌         | 347/5971 [04:44<1:16:31,  1.22it/s, loss=0.194, v_num=0, train/loss_simple_step=0.422, train/loss_vlb_step=0.00226, train/loss_step=0.422, global_step=44.00]
Epoch 0:   6%|▌         | 347/5971 [04:44<1:16:31,  1.22it/s, loss=0.202, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000571, train/loss_step=0.161, global_step=44.00]
Epoch 0:   6%|▌         | 348/5971 [04:46<1:16:51,  1.22it/s, loss=0.213, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000902, train/loss_step=0.233, global_step=44.00]
Epoch 0:   6%|▌         | 349/5971 [04:47<1:16:52,  1.22it/s, loss=0.204, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000649, train/loss_step=0.196, global_step=45.00]
Epoch 0:   6%|▌         | 350/5971 [04:48<1:16:52,  1.22it/s, loss=0.219, v_num=0, train/loss_simple_step=0.342, train/loss_vlb_step=0.00146, train/loss_step=0.342, global_step=45.00] 
Epoch 0:   6%|▌         | 351/5971 [04:48<1:16:52,  1.22it/s, loss=0.219, v_num=0, train/loss_simple_step=0.342, train/loss_vlb_step=0.00146, train/loss_step=0.342, global_step=45.00]
Epoch 0:   6%|▌         | 351/5971 [04:48<1:16:52,  1.22it/s, loss=0.22, v_num=0, train/loss_simple_step=0.0753, train/loss_vlb_step=0.000256, train/loss_step=0.0753, global_step=45.00]
Epoch 0:   6%|▌         | 352/5971 [04:51<1:17:13,  1.21it/s, loss=0.222, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000172, train/loss_step=0.0453, global_step=45.00]
Epoch 0:   6%|▌         | 353/5971 [04:51<1:17:13,  1.21it/s, loss=0.225, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000204, train/loss_step=0.0596, global_step=46.00]
Epoch 0:   6%|▌         | 354/5971 [04:52<1:17:13,  1.21it/s, loss=0.224, v_num=0, train/loss_simple_step=0.00261, train/loss_vlb_step=1.56e-5, train/loss_step=0.00261, global_step=46.00]
Epoch 0:   6%|▌         | 355/5971 [04:53<1:17:13,  1.21it/s, loss=0.224, v_num=0, train/loss_simple_step=0.00261, train/loss_vlb_step=1.56e-5, train/loss_step=0.00261, global_step=46.00]
Epoch 0:   6%|▌         | 355/5971 [04:53<1:17:13,  1.21it/s, loss=0.238, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00145, train/loss_step=0.327, global_step=46.00]    
Epoch 0:   6%|▌         | 356/5971 [04:55<1:17:33,  1.21it/s, loss=0.237, v_num=0, train/loss_simple_step=0.0214, train/loss_vlb_step=8.81e-5, train/loss_step=0.0214, global_step=46.00]
Epoch 0:   6%|▌         | 357/5971 [04:56<1:17:33,  1.21it/s, loss=0.226, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000559, train/loss_step=0.170, global_step=47.00] 
Epoch 0:   6%|▌         | 358/5971 [04:57<1:17:33,  1.21it/s, loss=0.224, v_num=0, train/loss_simple_step=0.0682, train/loss_vlb_step=0.000234, train/loss_step=0.0682, global_step=47.00]
Epoch 0:   6%|▌         | 359/5971 [04:58<1:17:33,  1.21it/s, loss=0.224, v_num=0, train/loss_simple_step=0.0682, train/loss_vlb_step=0.000234, train/loss_step=0.0682, global_step=47.00]
Epoch 0:   6%|▌         | 359/5971 [04:58<1:17:33,  1.21it/s, loss=0.225, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=47.00]  
Epoch 0:   6%|▌         | 360/5971 [05:00<1:17:56,  1.20it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0229, train/loss_vlb_step=9.6e-5, train/loss_step=0.0229, global_step=47.00]
Epoch 0:   6%|▌         | 361/5971 [05:01<1:17:56,  1.20it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.75e-5, train/loss_step=0.0149, global_step=48.00]
Epoch 0:   6%|▌         | 362/5971 [05:02<1:17:57,  1.20it/s, loss=0.164, v_num=0, train/loss_simple_step=0.416, train/loss_vlb_step=0.00216, train/loss_step=0.416, global_step=48.00]  
Epoch 0:   6%|▌         | 363/5971 [05:03<1:17:57,  1.20it/s, loss=0.164, v_num=0, train/loss_simple_step=0.416, train/loss_vlb_step=0.00216, train/loss_step=0.416, global_step=48.00]
Epoch 0:   6%|▌         | 363/5971 [05:03<1:17:57,  1.20it/s, loss=0.184, v_num=0, train/loss_simple_step=0.792, train/loss_vlb_step=0.016, train/loss_step=0.792, global_step=48.00]  
Epoch 0:   6%|▌         | 364/5971 [05:05<1:18:16,  1.19it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0728, train/loss_vlb_step=0.000255, train/loss_step=0.0728, global_step=48.00]
Epoch 0:   6%|▌         | 365/5971 [05:06<1:18:16,  1.19it/s, loss=0.18, v_num=0, train/loss_simple_step=0.041, train/loss_vlb_step=0.000154, train/loss_step=0.041, global_step=49.00]   
Epoch 0:   6%|▌         | 366/5971 [05:07<1:18:16,  1.19it/s, loss=0.165, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000395, train/loss_step=0.120, global_step=49.00]
Epoch 0:   6%|▌         | 367/5971 [05:08<1:18:16,  1.19it/s, loss=0.165, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000395, train/loss_step=0.120, global_step=49.00]
Epoch 0:   6%|▌         | 367/5971 [05:08<1:18:16,  1.19it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000213, train/loss_step=0.0619, global_step=49.00]
Epoch 0:   6%|▌         | 368/5971 [05:10<1:18:35,  1.19it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00] 

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:20,  2.06it/s]

Validating:   1%|          | 2/167 [00:01<01:53,  1.45it/s]
Epoch 0:   6%|▌         | 371/5971 [05:11<1:18:14,  1.19it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:   3%|▎         | 5/167 [00:01<00:36,  4.50it/s]
Epoch 0:   6%|▋         | 375/5971 [05:12<1:17:23,  1.21it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:   5%|▍         | 8/167 [00:01<00:20,  7.69it/s][A
Epoch 0:   6%|▋         | 379/5971 [05:12<1:16:34,  1.22it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:   7%|▋         | 11/167 [00:01<00:14, 10.64it/s][A

Validating:   8%|▊         | 14/167 [00:01<00:11, 13.64it/s][A
Epoch 0:   6%|▋         | 383/5971 [05:12<1:15:45,  1.23it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  10%|█         | 17/167 [00:01<00:09, 16.28it/s][A
Epoch 0:   6%|▋         | 387/5971 [05:12<1:14:57,  1.24it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  12%|█▏        | 20/167 [00:02<00:07, 19.13it/s][A
Epoch 0:   7%|▋         | 391/5971 [05:12<1:14:10,  1.25it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  14%|█▍        | 23/167 [00:02<00:06, 21.26it/s][A

Validating:  16%|█▌        | 26/167 [00:02<00:06, 22.92it/s][A
Epoch 0:   7%|▋         | 395/5971 [05:12<1:13:24,  1.27it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  17%|█▋        | 29/167 [00:02<00:05, 23.25it/s][A
Epoch 0:   7%|▋         | 399/5971 [05:12<1:12:39,  1.28it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  19%|█▉        | 32/167 [00:02<00:05, 24.84it/s][A
Epoch 0:   7%|▋         | 403/5971 [05:13<1:11:55,  1.29it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  21%|██        | 35/167 [00:02<00:05, 24.16it/s][A

Validating:  23%|██▎       | 38/167 [00:02<00:05, 25.50it/s][A
Epoch 0:   7%|▋         | 407/5971 [05:13<1:11:11,  1.30it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 25.79it/s][A
Epoch 0:   7%|▋         | 411/5971 [05:13<1:10:29,  1.31it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 25.64it/s][A
Epoch 0:   7%|▋         | 415/5971 [05:13<1:09:47,  1.33it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  28%|██▊       | 47/167 [00:03<00:04, 24.99it/s][A

Validating:  30%|██▉       | 50/167 [00:03<00:04, 24.48it/s][A
Epoch 0:   7%|▋         | 419/5971 [05:13<1:09:07,  1.34it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  32%|███▏      | 53/167 [00:03<00:04, 24.66it/s][A
Epoch 0:   7%|▋         | 423/5971 [05:13<1:08:27,  1.35it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  34%|███▍      | 57/167 [00:03<00:04, 26.23it/s][A
Epoch 0:   7%|▋         | 427/5971 [05:14<1:07:47,  1.36it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  36%|███▌      | 60/167 [00:03<00:03, 26.89it/s][A
Epoch 0:   7%|▋         | 431/5971 [05:14<1:07:08,  1.38it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  38%|███▊      | 63/167 [00:03<00:03, 27.24it/s][A

Validating:  40%|███▉      | 66/167 [00:03<00:03, 26.86it/s][A
Epoch 0:   7%|▋         | 435/5971 [05:14<1:06:31,  1.39it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 25.71it/s][A
Epoch 0:   7%|▋         | 439/5971 [05:14<1:05:53,  1.40it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 26.40it/s][A
Epoch 0:   7%|▋         | 443/5971 [05:14<1:05:17,  1.41it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  45%|████▍     | 75/167 [00:04<00:03, 26.84it/s][A

Validating:  47%|████▋     | 78/167 [00:04<00:03, 26.38it/s][A
Epoch 0:   7%|▋         | 447/5971 [05:14<1:04:41,  1.42it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  49%|████▊     | 81/167 [00:04<00:03, 26.34it/s][A
Epoch 0:   8%|▊         | 451/5971 [05:14<1:04:06,  1.44it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  50%|█████     | 84/167 [00:04<00:03, 26.65it/s][A
Epoch 0:   8%|▊         | 455/5971 [05:15<1:03:31,  1.45it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  52%|█████▏    | 87/167 [00:04<00:03, 26.48it/s][A

Validating:  54%|█████▍    | 90/167 [00:04<00:02, 26.56it/s][A
Epoch 0:   8%|▊         | 459/5971 [05:15<1:02:57,  1.46it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 26.73it/s][A
Epoch 0:   8%|▊         | 463/5971 [05:15<1:02:23,  1.47it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 27.39it/s][A
Epoch 0:   8%|▊         | 467/5971 [05:15<1:01:50,  1.48it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 27.73it/s][A

Validating:  61%|██████    | 102/167 [00:05<00:02, 28.33it/s][A
Epoch 0:   8%|▊         | 471/5971 [05:15<1:01:18,  1.50it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  63%|██████▎   | 105/167 [00:05<00:02, 27.83it/s][A
Epoch 0:   8%|▊         | 475/5971 [05:15<1:00:46,  1.51it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  65%|██████▍   | 108/167 [00:05<00:02, 27.97it/s][A
Epoch 0:   8%|▊         | 479/5971 [05:15<1:00:14,  1.52it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  67%|██████▋   | 112/167 [00:05<00:01, 29.12it/s][A
Epoch 0:   8%|▊         | 483/5971 [05:16<59:43,  1.53it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]  

Validating:  69%|██████▉   | 115/167 [00:05<00:01, 27.62it/s][A

Validating:  71%|███████   | 118/167 [00:05<00:01, 27.49it/s][A
Epoch 0:   8%|▊         | 487/5971 [05:16<59:13,  1.54it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 26.19it/s][A
Epoch 0:   8%|▊         | 491/5971 [05:16<58:44,  1.55it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 26.24it/s][A
Epoch 0:   8%|▊         | 495/5971 [05:16<58:14,  1.57it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  76%|███████▌  | 127/167 [00:06<00:01, 25.47it/s][A

Validating:  78%|███████▊  | 130/167 [00:06<00:01, 26.08it/s][A
Epoch 0:   8%|▊         | 499/5971 [05:16<57:46,  1.58it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  80%|███████▉  | 133/167 [00:06<00:01, 25.78it/s][A
Epoch 0:   8%|▊         | 503/5971 [05:16<57:17,  1.59it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  81%|████████▏ | 136/167 [00:06<00:01, 25.33it/s][A
Epoch 0:   8%|▊         | 507/5971 [05:17<56:49,  1.60it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  83%|████████▎ | 139/167 [00:06<00:01, 26.11it/s][A

Validating:  85%|████████▌ | 142/167 [00:06<00:00, 26.33it/s][A
Epoch 0:   9%|▊         | 511/5971 [05:17<56:22,  1.61it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 25.78it/s][A
Epoch 0:   9%|▊         | 515/5971 [05:17<55:55,  1.63it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 24.81it/s][A
Epoch 0:   9%|▊         | 519/5971 [05:17<55:28,  1.64it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 24.74it/s][A

Validating:  92%|█████████▏| 154/167 [00:07<00:00, 25.00it/s][A
Epoch 0:   9%|▉         | 523/5971 [05:17<55:02,  1.65it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  94%|█████████▍| 157/167 [00:07<00:00, 25.17it/s][A
Epoch 0:   9%|▉         | 527/5971 [05:17<54:36,  1.66it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  96%|█████████▌| 160/167 [00:07<00:00, 25.05it/s][A
Epoch 0:   9%|▉         | 531/5971 [05:17<54:11,  1.67it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Validating:  98%|█████████▊| 163/167 [00:07<00:00, 24.92it/s][A

Validating:  99%|█████████▉| 166/167 [00:07<00:00, 25.56it/s][A
Epoch 0:   9%|▉         | 535/5971 [05:18<53:46,  1.68it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]
Epoch 0:   9%|▉         | 536/5971 [05:18<53:42,  1.69it/s, loss=0.155, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=49.00]

Epoch 0:   9%|▉         | 537/5971 [05:19<53:45,  1.68it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0743, train/loss_vlb_step=0.000261, train/loss_step=0.0743, global_step=50.00]
Epoch 0:   9%|▉         | 538/5971 [05:20<53:47,  1.68it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00793, train/loss_vlb_step=3.94e-5, train/loss_step=0.00793, global_step=50.00]
Epoch 0:   9%|▉         | 539/5971 [05:21<53:49,  1.68it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00793, train/loss_vlb_step=3.94e-5, train/loss_step=0.00793, global_step=50.00]
Epoch 0:   9%|▉         | 539/5971 [05:21<53:49,  1.68it/s, loss=0.164, v_num=0, train/loss_simple_step=0.705, train/loss_vlb_step=0.0208, train/loss_step=0.705, global_step=50.00]     
Epoch 0:   9%|▉         | 540/5971 [05:23<54:10,  1.67it/s, loss=0.174, v_num=0, train/loss_simple_step=0.241, train/loss_vlb_step=0.000881, train/loss_step=0.241, global_step=50.00]
Epoch 0:   9%|▉         | 541/5971 [05:24<54:13,  1.67it/s, loss=0.193, v_num=0, train/loss_simple_step=0.440, train/loss_vlb_step=0.00328, train/loss_step=0.440, global_step=51.00] 
Epoch 0:   9%|▉         | 542/5971 [05:25<54:15,  1.67it/s, loss=0.193, v_num=0, train/loss_simple_step=0.00408, train/loss_vlb_step=2.36e-5, train/loss_step=0.00408, global_step=51.00]
Epoch 0:   9%|▉         | 543/5971 [05:26<54:17,  1.67it/s, loss=0.193, v_num=0, train/loss_simple_step=0.00408, train/loss_vlb_step=2.36e-5, train/loss_step=0.00408, global_step=51.00]
Epoch 0:   9%|▉         | 543/5971 [05:26<54:17,  1.67it/s, loss=0.177, v_num=0, train/loss_simple_step=0.013, train/loss_vlb_step=6.18e-5, train/loss_step=0.013, global_step=51.00]    
Epoch 0:   9%|▉         | 544/5971 [05:28<54:35,  1.66it/s, loss=0.187, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.000763, train/loss_step=0.219, global_step=51.00]
Epoch 0:   9%|▉         | 545/5971 [05:29<54:37,  1.66it/s, loss=0.205, v_num=0, train/loss_simple_step=0.525, train/loss_vlb_step=0.0049, train/loss_step=0.525, global_step=52.00]  
Epoch 0:   9%|▉         | 546/5971 [05:30<54:39,  1.65it/s, loss=0.211, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000609, train/loss_step=0.184, global_step=52.00]
Epoch 0:   9%|▉         | 547/5971 [05:31<54:41,  1.65it/s, loss=0.211, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000609, train/loss_step=0.184, global_step=52.00]
Epoch 0:   9%|▉         | 547/5971 [05:31<54:41,  1.65it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=7.08e-5, train/loss_step=0.0154, global_step=52.00]
Epoch 0:   9%|▉         | 548/5971 [05:34<55:03,  1.64it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0236, train/loss_vlb_step=9.49e-5, train/loss_step=0.0236, global_step=52.00]
Epoch 0:   9%|▉         | 549/5971 [05:35<55:05,  1.64it/s, loss=0.205, v_num=0, train/loss_simple_step=0.00694, train/loss_vlb_step=3.41e-5, train/loss_step=0.00694, global_step=53.00]
Epoch 0:   9%|▉         | 550/5971 [05:36<55:07,  1.64it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0106, train/loss_vlb_step=5.2e-5, train/loss_step=0.0106, global_step=53.00]   
Epoch 0:   9%|▉         | 551/5971 [05:37<55:09,  1.64it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0106, train/loss_vlb_step=5.2e-5, train/loss_step=0.0106, global_step=53.00]
Epoch 0:   9%|▉         | 551/5971 [05:37<55:09,  1.64it/s, loss=0.153, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000532, train/loss_step=0.162, global_step=53.00]
Epoch 0:   9%|▉         | 552/5971 [05:39<55:30,  1.63it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0178, train/loss_vlb_step=7.4e-5, train/loss_step=0.0178, global_step=53.00] 
Epoch 0:   9%|▉         | 553/5971 [05:40<55:32,  1.63it/s, loss=0.168, v_num=0, train/loss_simple_step=0.397, train/loss_vlb_step=0.00191, train/loss_step=0.397, global_step=54.00]
Epoch 0:   9%|▉         | 554/5971 [05:41<55:35,  1.62it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00388, train/loss_vlb_step=2.12e-5, train/loss_step=0.00388, global_step=54.00]
Epoch 0:   9%|▉         | 555/5971 [05:42<55:37,  1.62it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00388, train/loss_vlb_step=2.12e-5, train/loss_step=0.00388, global_step=54.00]
Epoch 0:   9%|▉         | 555/5971 [05:42<55:37,  1.62it/s, loss=0.169, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.00063, train/loss_step=0.189, global_step=54.00]    
Epoch 0:   9%|▉         | 556/5971 [05:44<55:51,  1.62it/s, loss=0.163, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000109, train/loss_step=0.026, global_step=54.00]
Epoch 0:   9%|▉         | 557/5971 [05:45<55:53,  1.61it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0993, train/loss_vlb_step=0.00033, train/loss_step=0.0993, global_step=55.00]
Epoch 0:   9%|▉         | 558/5971 [05:46<55:54,  1.61it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0756, train/loss_vlb_step=0.000257, train/loss_step=0.0756, global_step=55.00]
Epoch 0:   9%|▉         | 559/5971 [05:47<55:56,  1.61it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0756, train/loss_vlb_step=0.000257, train/loss_step=0.0756, global_step=55.00]
Epoch 0:   9%|▉         | 559/5971 [05:47<55:56,  1.61it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0178, train/loss_vlb_step=7.83e-5, train/loss_step=0.0178, global_step=55.00] 
Epoch 0:   9%|▉         | 560/5971 [05:49<56:13,  1.60it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00698, train/loss_vlb_step=3.52e-5, train/loss_step=0.00698, global_step=55.00]
Epoch 0:   9%|▉         | 561/5971 [05:50<56:15,  1.60it/s, loss=0.1, v_num=0, train/loss_simple_step=0.00285, train/loss_vlb_step=1.7e-5, train/loss_step=0.00285, global_step=56.00]   
Epoch 0:   9%|▉         | 562/5971 [05:51<56:17,  1.60it/s, loss=0.11, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000682, train/loss_step=0.204, global_step=56.00] 
Epoch 0:   9%|▉         | 563/5971 [05:52<56:19,  1.60it/s, loss=0.11, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000682, train/loss_step=0.204, global_step=56.00]
Epoch 0:   9%|▉         | 563/5971 [05:52<56:19,  1.60it/s, loss=0.135, v_num=0, train/loss_simple_step=0.519, train/loss_vlb_step=0.00409, train/loss_step=0.519, global_step=56.00]
Epoch 0:   9%|▉         | 564/5971 [05:55<56:37,  1.59it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00228, train/loss_vlb_step=1.36e-5, train/loss_step=0.00228, global_step=56.00]
Epoch 0:   9%|▉         | 565/5971 [05:55<56:39,  1.59it/s, loss=0.12, v_num=0, train/loss_simple_step=0.441, train/loss_vlb_step=0.00337, train/loss_step=0.441, global_step=57.00]     
Epoch 0:   9%|▉         | 566/5971 [05:56<56:41,  1.59it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00292, train/loss_vlb_step=1.66e-5, train/loss_step=0.00292, global_step=57.00]
Epoch 0:   9%|▉         | 567/5971 [05:57<56:43,  1.59it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00292, train/loss_vlb_step=1.66e-5, train/loss_step=0.00292, global_step=57.00]
Epoch 0:   9%|▉         | 567/5971 [05:57<56:43,  1.59it/s, loss=0.121, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000752, train/loss_step=0.205, global_step=57.00]   
Epoch 0:  10%|▉         | 568/5971 [06:00<56:59,  1.58it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00803, train/loss_vlb_step=3.92e-5, train/loss_step=0.00803, global_step=57.00]
Epoch 0:  10%|▉         | 569/5971 [06:00<57:00,  1.58it/s, loss=0.144, v_num=0, train/loss_simple_step=0.498, train/loss_vlb_step=0.00413, train/loss_step=0.498, global_step=58.00]   
Epoch 0:  10%|▉         | 570/5971 [06:01<57:02,  1.58it/s, loss=0.164, v_num=0, train/loss_simple_step=0.396, train/loss_vlb_step=0.00369, train/loss_step=0.396, global_step=58.00]
Epoch 0:  10%|▉         | 571/5971 [06:02<57:04,  1.58it/s, loss=0.164, v_num=0, train/loss_simple_step=0.396, train/loss_vlb_step=0.00369, train/loss_step=0.396, global_step=58.00]
Epoch 0:  10%|▉         | 571/5971 [06:02<57:04,  1.58it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00211, train/loss_vlb_step=1.28e-5, train/loss_step=0.00211, global_step=58.00]
Epoch 0:  10%|▉         | 572/5971 [06:04<57:18,  1.57it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0484, train/loss_vlb_step=0.000171, train/loss_step=0.0484, global_step=58.00] 
Epoch 0:  10%|▉         | 573/5971 [06:05<57:19,  1.57it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0412, train/loss_vlb_step=0.000148, train/loss_step=0.0412, global_step=59.00]
Epoch 0:  10%|▉         | 574/5971 [06:06<57:21,  1.57it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00421, train/loss_vlb_step=2.32e-5, train/loss_step=0.00421, global_step=59.00]
Epoch 0:  10%|▉         | 575/5971 [06:07<57:24,  1.57it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00421, train/loss_vlb_step=2.32e-5, train/loss_step=0.00421, global_step=59.00]
Epoch 0:  10%|▉         | 575/5971 [06:07<57:24,  1.57it/s, loss=0.135, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000339, train/loss_step=0.102, global_step=59.00]   
Epoch 0:  10%|▉         | 576/5971 [06:09<57:38,  1.56it/s, loss=0.141, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000454, train/loss_step=0.138, global_step=59.00]
Epoch 0:  10%|▉         | 577/5971 [06:10<57:40,  1.56it/s, loss=0.154, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00182, train/loss_step=0.363, global_step=60.00] 
Epoch 0:  10%|▉         | 578/5971 [06:11<57:41,  1.56it/s, loss=0.158, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000534, train/loss_step=0.156, global_step=60.00]
Epoch 0:  10%|▉         | 579/5971 [06:12<57:43,  1.56it/s, loss=0.158, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000534, train/loss_step=0.156, global_step=60.00]
Epoch 0:  10%|▉         | 579/5971 [06:12<57:43,  1.56it/s, loss=0.162, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.00035, train/loss_step=0.107, global_step=60.00] 
Epoch 0:  10%|▉         | 580/5971 [06:14<57:58,  1.55it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00713, train/loss_vlb_step=3.43e-5, train/loss_step=0.00713, global_step=60.00]
Epoch 0:  10%|▉         | 581/5971 [06:15<57:59,  1.55it/s, loss=0.176, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.000969, train/loss_step=0.269, global_step=61.00]   
Epoch 0:  10%|▉         | 582/5971 [06:16<58:01,  1.55it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00334, train/loss_vlb_step=1.89e-5, train/loss_step=0.00334, global_step=61.00]
Epoch 0:  10%|▉         | 583/5971 [06:17<58:02,  1.55it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00334, train/loss_vlb_step=1.89e-5, train/loss_step=0.00334, global_step=61.00]
Epoch 0:  10%|▉         | 583/5971 [06:17<58:02,  1.55it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=5.36e-5, train/loss_step=0.0118, global_step=61.00]   
Epoch 0:  10%|▉         | 584/5971 [06:19<58:18,  1.54it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.75e-5, train/loss_step=0.0132, global_step=61.00]
Epoch 0:  10%|▉         | 585/5971 [06:20<58:19,  1.54it/s, loss=0.132, v_num=0, train/loss_simple_step=0.263, train/loss_vlb_step=0.00111, train/loss_step=0.263, global_step=62.00]  
Epoch 0:  10%|▉         | 586/5971 [06:21<58:21,  1.54it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0864, train/loss_vlb_step=0.000284, train/loss_step=0.0864, global_step=62.00]
Epoch 0:  10%|▉         | 587/5971 [06:22<58:22,  1.54it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0864, train/loss_vlb_step=0.000284, train/loss_step=0.0864, global_step=62.00]
Epoch 0:  10%|▉         | 587/5971 [06:22<58:22,  1.54it/s, loss=0.141, v_num=0, train/loss_simple_step=0.294, train/loss_vlb_step=0.00123, train/loss_step=0.294, global_step=62.00]   
Epoch 0:  10%|▉         | 588/5971 [06:25<58:45,  1.53it/s, loss=0.146, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000377, train/loss_step=0.115, global_step=62.00]
Epoch 0:  10%|▉         | 589/5971 [06:26<58:47,  1.53it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0195, train/loss_vlb_step=8.43e-5, train/loss_step=0.0195, global_step=63.00]
Epoch 0:  10%|▉         | 590/5971 [06:27<58:48,  1.53it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0419, train/loss_vlb_step=0.000151, train/loss_step=0.0419, global_step=63.00]
Epoch 0:  10%|▉         | 591/5971 [06:28<58:49,  1.52it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0419, train/loss_vlb_step=0.000151, train/loss_step=0.0419, global_step=63.00]
Epoch 0:  10%|▉         | 591/5971 [06:28<58:49,  1.52it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00206, train/loss_vlb_step=1.21e-5, train/loss_step=0.00206, global_step=63.00]
Epoch 0:  10%|▉         | 592/5971 [06:30<59:02,  1.52it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00831, train/loss_vlb_step=4.21e-5, train/loss_step=0.00831, global_step=63.00]
Epoch 0:  10%|▉         | 593/5971 [06:31<59:03,  1.52it/s, loss=0.133, v_num=0, train/loss_simple_step=0.661, train/loss_vlb_step=0.00935, train/loss_step=0.661, global_step=64.00]    
Epoch 0:  10%|▉         | 594/5971 [06:32<59:05,  1.52it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00356, train/loss_vlb_step=1.97e-5, train/loss_step=0.00356, global_step=64.00]
Epoch 0:  10%|▉         | 595/5971 [06:33<59:06,  1.52it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00356, train/loss_vlb_step=1.97e-5, train/loss_step=0.00356, global_step=64.00]
Epoch 0:  10%|▉         | 595/5971 [06:33<59:06,  1.52it/s, loss=0.133, v_num=0, train/loss_simple_step=0.089, train/loss_vlb_step=0.000296, train/loss_step=0.089, global_step=64.00]   
Epoch 0:  10%|▉         | 596/5971 [06:35<59:20,  1.51it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00514, train/loss_vlb_step=2.75e-5, train/loss_step=0.00514, global_step=64.00]
Epoch 0:  10%|▉         | 597/5971 [06:36<59:22,  1.51it/s, loss=0.115, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000503, train/loss_step=0.153, global_step=65.00]   
Epoch 0:  10%|█         | 598/5971 [06:37<59:23,  1.51it/s, loss=0.12, v_num=0, train/loss_simple_step=0.239, train/loss_vlb_step=0.000863, train/loss_step=0.239, global_step=65.00] 
Epoch 0:  10%|█         | 599/5971 [06:38<59:24,  1.51it/s, loss=0.12, v_num=0, train/loss_simple_step=0.239, train/loss_vlb_step=0.000863, train/loss_step=0.239, global_step=65.00]
Epoch 0:  10%|█         | 599/5971 [06:38<59:24,  1.51it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00394, train/loss_vlb_step=2.17e-5, train/loss_step=0.00394, global_step=65.00]
Epoch 0:  10%|█         | 600/5971 [06:41<59:44,  1.50it/s, loss=0.128, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.000936, train/loss_step=0.269, global_step=65.00]   
Epoch 0:  10%|█         | 601/5971 [06:41<59:45,  1.50it/s, loss=0.133, v_num=0, train/loss_simple_step=0.386, train/loss_vlb_step=0.00235, train/loss_step=0.386, global_step=66.00] 
Epoch 0:  10%|█         | 602/5971 [06:42<59:46,  1.50it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0409, train/loss_vlb_step=0.000148, train/loss_step=0.0409, global_step=66.00]
Epoch 0:  10%|█         | 603/5971 [06:43<59:47,  1.50it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0409, train/loss_vlb_step=0.000148, train/loss_step=0.0409, global_step=66.00]
Epoch 0:  10%|█         | 603/5971 [06:43<59:47,  1.50it/s, loss=0.142, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.00048, train/loss_step=0.145, global_step=66.00]   
Epoch 0:  10%|█         | 604/5971 [06:45<1:00:00,  1.49it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0176, train/loss_vlb_step=7.67e-5, train/loss_step=0.0176, global_step=66.00]
Epoch 0:  10%|█         | 605/5971 [06:46<1:00:01,  1.49it/s, loss=0.147, v_num=0, train/loss_simple_step=0.370, train/loss_vlb_step=0.00158, train/loss_step=0.370, global_step=67.00]  
Epoch 0:  10%|█         | 606/5971 [06:47<1:00:03,  1.49it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0376, train/loss_vlb_step=0.000147, train/loss_step=0.0376, global_step=67.00]
Epoch 0:  10%|█         | 607/5971 [06:48<1:00:04,  1.49it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0376, train/loss_vlb_step=0.000147, train/loss_step=0.0376, global_step=67.00]
Epoch 0:  10%|█         | 607/5971 [06:48<1:00:04,  1.49it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0646, train/loss_vlb_step=0.000218, train/loss_step=0.0646, global_step=67.00]
Epoch 0:  10%|█         | 608/5971 [06:50<1:00:18,  1.48it/s, loss=0.135, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000446, train/loss_step=0.135, global_step=67.00]  
Epoch 0:  10%|█         | 609/5971 [06:51<1:00:19,  1.48it/s, loss=0.15, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00145, train/loss_step=0.319, global_step=68.00]  
Epoch 0:  10%|█         | 610/5971 [06:52<1:00:20,  1.48it/s, loss=0.154, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000426, train/loss_step=0.130, global_step=68.00]
Epoch 0:  10%|█         | 611/5971 [06:53<1:00:22,  1.48it/s, loss=0.154, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000426, train/loss_step=0.130, global_step=68.00]
Epoch 0:  10%|█         | 611/5971 [06:53<1:00:22,  1.48it/s, loss=0.162, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000507, train/loss_step=0.153, global_step=68.00]
Epoch 0:  10%|█         | 612/5971 [06:55<1:00:36,  1.47it/s, loss=0.179, v_num=0, train/loss_simple_step=0.367, train/loss_vlb_step=0.00163, train/loss_step=0.367, global_step=68.00] 
Epoch 0:  10%|█         | 613/5971 [06:56<1:00:37,  1.47it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00297, train/loss_vlb_step=1.67e-5, train/loss_step=0.00297, global_step=69.00]
Epoch 0:  10%|█         | 614/5971 [06:57<1:00:39,  1.47it/s, loss=0.171, v_num=0, train/loss_simple_step=0.487, train/loss_vlb_step=0.00413, train/loss_step=0.487, global_step=69.00]    
Epoch 0:  10%|█         | 615/5971 [06:58<1:00:40,  1.47it/s, loss=0.171, v_num=0, train/loss_simple_step=0.487, train/loss_vlb_step=0.00413, train/loss_step=0.487, global_step=69.00]
Epoch 0:  10%|█         | 615/5971 [06:58<1:00:40,  1.47it/s, loss=0.193, v_num=0, train/loss_simple_step=0.528, train/loss_vlb_step=0.00533, train/loss_step=0.528, global_step=69.00]
Epoch 0:  10%|█         | 616/5971 [07:01<1:00:59,  1.46it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=6.12e-5, train/loss_step=0.0143, global_step=69.00]
Epoch 0:  10%|█         | 617/5971 [07:02<1:01:00,  1.46it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00235, train/loss_vlb_step=1.4e-5, train/loss_step=0.00235, global_step=70.00]
Epoch 0:  10%|█         | 618/5971 [07:03<1:01:01,  1.46it/s, loss=0.181, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000467, train/loss_step=0.138, global_step=70.00]  
Epoch 0:  10%|█         | 619/5971 [07:04<1:01:02,  1.46it/s, loss=0.181, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000467, train/loss_step=0.138, global_step=70.00]
Epoch 0:  10%|█         | 619/5971 [07:04<1:01:02,  1.46it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=70.00]
Epoch 0:  10%|█         | 620/5971 [07:06<1:01:17,  1.45it/s, loss=0.176, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000591, train/loss_step=0.170, global_step=70.00] 
Epoch 0:  10%|█         | 621/5971 [07:07<1:01:19,  1.45it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00264, train/loss_vlb_step=1.53e-5, train/loss_step=0.00264, global_step=71.00]
Epoch 0:  10%|█         | 622/5971 [07:08<1:01:19,  1.45it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0232, train/loss_vlb_step=9.91e-5, train/loss_step=0.0232, global_step=71.00]  
Epoch 0:  10%|█         | 623/5971 [07:09<1:01:20,  1.45it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0232, train/loss_vlb_step=9.91e-5, train/loss_step=0.0232, global_step=71.00]
Epoch 0:  10%|█         | 623/5971 [07:09<1:01:20,  1.45it/s, loss=0.155, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000455, train/loss_step=0.133, global_step=71.00] 
Epoch 0:  10%|█         | 624/5971 [07:11<1:01:34,  1.45it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0301, train/loss_vlb_step=0.000113, train/loss_step=0.0301, global_step=71.00]
Epoch 0:  10%|█         | 625/5971 [07:12<1:01:35,  1.45it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0032, train/loss_vlb_step=1.83e-5, train/loss_step=0.0032, global_step=72.00] 
Epoch 0:  10%|█         | 626/5971 [07:13<1:01:36,  1.45it/s, loss=0.164, v_num=0, train/loss_simple_step=0.560, train/loss_vlb_step=0.00558, train/loss_step=0.560, global_step=72.00]  
Epoch 0:  11%|█         | 627/5971 [07:14<1:01:37,  1.45it/s, loss=0.164, v_num=0, train/loss_simple_step=0.560, train/loss_vlb_step=0.00558, train/loss_step=0.560, global_step=72.00]
Epoch 0:  11%|█         | 627/5971 [07:14<1:01:37,  1.45it/s, loss=0.193, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0164, train/loss_step=0.643, global_step=72.00] 
Epoch 0:  11%|█         | 628/5971 [07:16<1:01:48,  1.44it/s, loss=0.197, v_num=0, train/loss_simple_step=0.223, train/loss_vlb_step=0.000782, train/loss_step=0.223, global_step=72.00]
Epoch 0:  11%|█         | 629/5971 [07:17<1:01:50,  1.44it/s, loss=0.193, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.000859, train/loss_step=0.237, global_step=73.00]
Epoch 0:  11%|█         | 630/5971 [07:18<1:01:51,  1.44it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0161, train/loss_vlb_step=6.99e-5, train/loss_step=0.0161, global_step=73.00]
Epoch 0:  11%|█         | 631/5971 [07:19<1:01:51,  1.44it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0161, train/loss_vlb_step=6.99e-5, train/loss_step=0.0161, global_step=73.00]
Epoch 0:  11%|█         | 631/5971 [07:19<1:01:51,  1.44it/s, loss=0.208, v_num=0, train/loss_simple_step=0.570, train/loss_vlb_step=0.00957, train/loss_step=0.570, global_step=73.00]  
Epoch 0:  11%|█         | 632/5971 [07:21<1:02:04,  1.43it/s, loss=0.2, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.000666, train/loss_step=0.197, global_step=73.00] 
Epoch 0:  11%|█         | 633/5971 [07:22<1:02:05,  1.43it/s, loss=0.212, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.00121, train/loss_step=0.255, global_step=74.00]
Epoch 0:  11%|█         | 634/5971 [07:23<1:02:05,  1.43it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0432, train/loss_vlb_step=0.000154, train/loss_step=0.0432, global_step=74.00]
Epoch 0:  11%|█         | 635/5971 [07:24<1:02:06,  1.43it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0432, train/loss_vlb_step=0.000154, train/loss_step=0.0432, global_step=74.00]
Epoch 0:  11%|█         | 635/5971 [07:24<1:02:06,  1.43it/s, loss=0.178, v_num=0, train/loss_simple_step=0.279, train/loss_vlb_step=0.00105, train/loss_step=0.279, global_step=74.00]  
Epoch 0:  11%|█         | 636/5971 [07:26<1:02:18,  1.43it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:46,  1.55it/s][A
Epoch 0:  11%|█         | 639/5971 [07:27<1:02:04,  1.43it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:   2%|▏         | 4/167 [00:00<00:26,  6.15it/s][A
Epoch 0:  11%|█         | 643/5971 [07:27<1:01:40,  1.44it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:   4%|▍         | 7/167 [00:00<00:15, 10.52it/s][A

Validating:   6%|▌         | 10/167 [00:01<00:10, 14.32it/s][A
Epoch 0:  11%|█         | 647/5971 [07:27<1:01:15,  1.45it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:   8%|▊         | 13/167 [00:01<00:08, 17.36it/s][A
Epoch 0:  11%|█         | 651/5971 [07:27<1:00:51,  1.46it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  10%|▉         | 16/167 [00:01<00:07, 19.72it/s][A
Epoch 0:  11%|█         | 655/5971 [07:27<1:00:28,  1.47it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  11%|█▏        | 19/167 [00:01<00:07, 20.22it/s][A

Validating:  13%|█▎        | 22/167 [00:01<00:06, 22.32it/s][A
Epoch 0:  11%|█         | 659/5971 [07:27<1:00:04,  1.47it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  15%|█▍        | 25/167 [00:01<00:05, 23.81it/s][A
Epoch 0:  11%|█         | 663/5971 [07:28<59:41,  1.48it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]  

Validating:  17%|█▋        | 28/167 [00:01<00:05, 24.74it/s][A
Epoch 0:  11%|█         | 667/5971 [07:28<59:18,  1.49it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  19%|█▊        | 31/167 [00:01<00:05, 25.10it/s][A

Validating:  20%|██        | 34/167 [00:01<00:05, 26.16it/s][A
Epoch 0:  11%|█         | 671/5971 [07:28<58:55,  1.50it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  22%|██▏       | 37/167 [00:02<00:05, 25.81it/s][A
Epoch 0:  11%|█▏        | 675/5971 [07:28<58:33,  1.51it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  24%|██▍       | 40/167 [00:02<00:04, 25.77it/s][A
Epoch 0:  11%|█▏        | 679/5971 [07:28<58:11,  1.52it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  26%|██▌       | 43/167 [00:02<00:05, 24.60it/s][A

Validating:  28%|██▊       | 46/167 [00:02<00:05, 23.97it/s][A
Epoch 0:  11%|█▏        | 683/5971 [07:28<57:49,  1.52it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  29%|██▉       | 49/167 [00:02<00:04, 24.57it/s][A
Epoch 0:  12%|█▏        | 687/5971 [07:28<57:28,  1.53it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  31%|███       | 52/167 [00:02<00:04, 24.11it/s][A
Epoch 0:  12%|█▏        | 691/5971 [07:29<57:06,  1.54it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  33%|███▎      | 55/167 [00:02<00:04, 24.60it/s][A

Validating:  35%|███▍      | 58/167 [00:02<00:04, 25.63it/s][A
Epoch 0:  12%|█▏        | 695/5971 [07:29<56:45,  1.55it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  37%|███▋      | 61/167 [00:03<00:04, 25.79it/s][A
Epoch 0:  12%|█▏        | 699/5971 [07:29<56:24,  1.56it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  38%|███▊      | 64/167 [00:03<00:03, 26.43it/s][A
Epoch 0:  12%|█▏        | 703/5971 [07:29<56:04,  1.57it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  40%|████      | 67/167 [00:03<00:04, 24.95it/s][A

Validating:  42%|████▏     | 70/167 [00:03<00:03, 25.21it/s][A
Epoch 0:  12%|█▏        | 707/5971 [07:29<55:43,  1.57it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  44%|████▎     | 73/167 [00:03<00:04, 23.11it/s][A
Epoch 0:  12%|█▏        | 711/5971 [07:29<55:24,  1.58it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 24.37it/s][A
Epoch 0:  12%|█▏        | 715/5971 [07:30<55:04,  1.59it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 24.93it/s][A

Validating:  49%|████▉     | 82/167 [00:03<00:03, 24.49it/s][A
Epoch 0:  12%|█▏        | 719/5971 [07:30<54:44,  1.60it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  51%|█████     | 85/167 [00:04<00:03, 23.64it/s][A
Epoch 0:  12%|█▏        | 723/5971 [07:30<54:25,  1.61it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  53%|█████▎    | 88/167 [00:04<00:03, 23.13it/s][A
Epoch 0:  12%|█▏        | 727/5971 [07:30<54:05,  1.62it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  54%|█████▍    | 91/167 [00:04<00:03, 24.06it/s][A

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 24.46it/s][A
Epoch 0:  12%|█▏        | 731/5971 [07:30<53:46,  1.62it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 24.55it/s][A
Epoch 0:  12%|█▏        | 735/5971 [07:30<53:27,  1.63it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 25.63it/s][A
Epoch 0:  12%|█▏        | 739/5971 [07:31<53:09,  1.64it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 25.72it/s][A

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 25.91it/s][A
Epoch 0:  12%|█▏        | 743/5971 [07:31<52:50,  1.65it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 24.75it/s][A
Epoch 0:  13%|█▎        | 747/5971 [07:31<52:32,  1.66it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  67%|██████▋   | 112/167 [00:05<00:02, 24.41it/s][A
Epoch 0:  13%|█▎        | 751/5971 [07:31<52:14,  1.67it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  69%|██████▉   | 115/167 [00:05<00:02, 24.65it/s][A

Validating:  71%|███████   | 118/167 [00:05<00:02, 24.37it/s][A
Epoch 0:  13%|█▎        | 755/5971 [07:31<51:56,  1.67it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 24.91it/s][A
Epoch 0:  13%|█▎        | 759/5971 [07:31<51:38,  1.68it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 24.91it/s][A
Epoch 0:  13%|█▎        | 763/5971 [07:32<51:21,  1.69it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 25.67it/s][A

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 26.11it/s][A
Epoch 0:  13%|█▎        | 767/5971 [07:32<51:03,  1.70it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 25.69it/s][A
Epoch 0:  13%|█▎        | 771/5971 [07:32<50:46,  1.71it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  81%|████████▏ | 136/167 [00:06<00:01, 25.97it/s][A
Epoch 0:  13%|█▎        | 775/5971 [07:32<50:29,  1.71it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  83%|████████▎ | 139/167 [00:06<00:01, 26.05it/s][A

Validating:  85%|████████▌ | 142/167 [00:06<00:00, 26.75it/s][A
Epoch 0:  13%|█▎        | 779/5971 [07:32<50:12,  1.72it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 27.36it/s][A
Epoch 0:  13%|█▎        | 783/5971 [07:32<49:56,  1.73it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 26.84it/s][A
Epoch 0:  13%|█▎        | 787/5971 [07:32<49:39,  1.74it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 27.58it/s][A

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 27.46it/s][A
Epoch 0:  13%|█▎        | 791/5971 [07:33<49:23,  1.75it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 27.58it/s][A
Epoch 0:  13%|█▎        | 795/5971 [07:33<49:06,  1.76it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 28.43it/s][A
Epoch 0:  13%|█▎        | 799/5971 [07:33<48:50,  1.76it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating:  98%|█████████▊| 164/167 [00:07<00:00, 28.10it/s][A
Epoch 0:  13%|█▎        | 803/5971 [07:33<48:34,  1.77it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Validating: 100%|██████████| 167/167 [00:07<00:00, 27.84it/s][A
Epoch 0:  13%|█▎        | 804/5971 [07:33<48:33,  1.77it/s, loss=0.185, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000552, train/loss_step=0.156, global_step=74.00]

Epoch 0:  13%|█▎        | 805/5971 [07:34<48:35,  1.77it/s, loss=0.195, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.00068, train/loss_step=0.201, global_step=75.00] 
Epoch 0:  13%|█▎        | 806/5971 [07:35<48:36,  1.77it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.61e-5, train/loss_step=0.0104, global_step=75.00]
Epoch 0:  14%|█▎        | 807/5971 [07:36<48:38,  1.77it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.61e-5, train/loss_step=0.0104, global_step=75.00]
Epoch 0:  14%|█▎        | 807/5971 [07:36<48:38,  1.77it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0709, train/loss_vlb_step=0.000237, train/loss_step=0.0709, global_step=75.00]
Epoch 0:  14%|█▎        | 808/5971 [07:38<48:47,  1.76it/s, loss=0.198, v_num=0, train/loss_simple_step=0.297, train/loss_vlb_step=0.0012, train/loss_step=0.297, global_step=75.00]    
Epoch 0:  14%|█▎        | 809/5971 [07:39<48:49,  1.76it/s, loss=0.199, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000121, train/loss_step=0.0305, global_step=76.00]
Epoch 0:  14%|█▎        | 810/5971 [07:40<48:50,  1.76it/s, loss=0.214, v_num=0, train/loss_simple_step=0.325, train/loss_vlb_step=0.0014, train/loss_step=0.325, global_step=76.00]    
Epoch 0:  14%|█▎        | 811/5971 [07:41<48:51,  1.76it/s, loss=0.214, v_num=0, train/loss_simple_step=0.325, train/loss_vlb_step=0.0014, train/loss_step=0.325, global_step=76.00]
Epoch 0:  14%|█▎        | 811/5971 [07:41<48:51,  1.76it/s, loss=0.218, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.00098, train/loss_step=0.220, global_step=76.00]
Epoch 0:  14%|█▎        | 812/5971 [07:43<49:03,  1.75it/s, loss=0.23, v_num=0, train/loss_simple_step=0.258, train/loss_vlb_step=0.00107, train/loss_step=0.258, global_step=76.00] 
Epoch 0:  14%|█▎        | 813/5971 [07:44<49:05,  1.75it/s, loss=0.236, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000454, train/loss_step=0.138, global_step=77.00]
Epoch 0:  14%|█▎        | 814/5971 [07:45<49:06,  1.75it/s, loss=0.23, v_num=0, train/loss_simple_step=0.433, train/loss_vlb_step=0.00241, train/loss_step=0.433, global_step=77.00]  
Epoch 0:  14%|█▎        | 815/5971 [07:46<49:07,  1.75it/s, loss=0.23, v_num=0, train/loss_simple_step=0.433, train/loss_vlb_step=0.00241, train/loss_step=0.433, global_step=77.00]
Epoch 0:  14%|█▎        | 815/5971 [07:46<49:07,  1.75it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.84e-5, train/loss_step=0.0101, global_step=77.00]
Epoch 0:  14%|█▎        | 816/5971 [07:49<49:21,  1.74it/s, loss=0.19, v_num=0, train/loss_simple_step=0.058, train/loss_vlb_step=0.000206, train/loss_step=0.058, global_step=77.00]  
Epoch 0:  14%|█▎        | 817/5971 [07:50<49:22,  1.74it/s, loss=0.186, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000661, train/loss_step=0.160, global_step=78.00]
Epoch 0:  14%|█▎        | 818/5971 [07:51<49:24,  1.74it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00435, train/loss_vlb_step=2.45e-5, train/loss_step=0.00435, global_step=78.00]
Epoch 0:  14%|█▎        | 819/5971 [07:51<49:25,  1.74it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00435, train/loss_vlb_step=2.45e-5, train/loss_step=0.00435, global_step=78.00]
Epoch 0:  14%|█▎        | 819/5971 [07:51<49:25,  1.74it/s, loss=0.175, v_num=0, train/loss_simple_step=0.350, train/loss_vlb_step=0.00174, train/loss_step=0.350, global_step=78.00]    
Epoch 0:  14%|█▎        | 820/5971 [07:54<49:34,  1.73it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=0.000107, train/loss_step=0.0272, global_step=78.00]
Epoch 0:  14%|█▎        | 821/5971 [07:55<49:36,  1.73it/s, loss=0.159, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000339, train/loss_step=0.102, global_step=79.00]  
Epoch 0:  14%|█▍        | 822/5971 [07:55<49:37,  1.73it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0556, train/loss_vlb_step=0.000197, train/loss_step=0.0556, global_step=79.00]
Epoch 0:  14%|█▍        | 823/5971 [07:56<49:38,  1.73it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0556, train/loss_vlb_step=0.000197, train/loss_step=0.0556, global_step=79.00]
Epoch 0:  14%|█▍        | 823/5971 [07:56<49:38,  1.73it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0575, train/loss_vlb_step=0.000208, train/loss_step=0.0575, global_step=79.00]
Epoch 0:  14%|█▍        | 824/5971 [07:59<49:48,  1.72it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0209, train/loss_vlb_step=8.57e-5, train/loss_step=0.0209, global_step=79.00] 
Epoch 0:  14%|█▍        | 825/5971 [07:59<49:49,  1.72it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0608, train/loss_vlb_step=0.0002, train/loss_step=0.0608, global_step=80.00] 
Epoch 0:  14%|█▍        | 826/5971 [08:00<49:51,  1.72it/s, loss=0.158, v_num=0, train/loss_simple_step=0.478, train/loss_vlb_step=0.00301, train/loss_step=0.478, global_step=80.00] 
Epoch 0:  14%|█▍        | 827/5971 [08:01<49:52,  1.72it/s, loss=0.158, v_num=0, train/loss_simple_step=0.478, train/loss_vlb_step=0.00301, train/loss_step=0.478, global_step=80.00]
Epoch 0:  14%|█▍        | 827/5971 [08:01<49:52,  1.72it/s, loss=0.165, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.000883, train/loss_step=0.219, global_step=80.00]
Epoch 0:  14%|█▍        | 828/5971 [08:04<50:02,  1.71it/s, loss=0.159, v_num=0, train/loss_simple_step=0.183, train/loss_vlb_step=0.000612, train/loss_step=0.183, global_step=80.00]
Epoch 0:  14%|█▍        | 829/5971 [08:04<50:04,  1.71it/s, loss=0.165, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000469, train/loss_step=0.142, global_step=81.00]
Epoch 0:  14%|█▍        | 830/5971 [08:05<50:05,  1.71it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0218, train/loss_vlb_step=9.26e-5, train/loss_step=0.0218, global_step=81.00]
Epoch 0:  14%|█▍        | 831/5971 [08:06<50:06,  1.71it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0218, train/loss_vlb_step=9.26e-5, train/loss_step=0.0218, global_step=81.00]
Epoch 0:  14%|█▍        | 831/5971 [08:06<50:06,  1.71it/s, loss=0.159, v_num=0, train/loss_simple_step=0.400, train/loss_vlb_step=0.00203, train/loss_step=0.400, global_step=81.00] 
Epoch 0:  14%|█▍        | 832/5971 [08:08<50:16,  1.70it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00268, train/loss_vlb_step=1.55e-5, train/loss_step=0.00268, global_step=81.00]
Epoch 0:  14%|█▍        | 833/5971 [08:09<50:17,  1.70it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0282, train/loss_vlb_step=0.000106, train/loss_step=0.0282, global_step=82.00] 
Epoch 0:  14%|█▍        | 834/5971 [08:10<50:19,  1.70it/s, loss=0.129, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000751, train/loss_step=0.205, global_step=82.00]  
Epoch 0:  14%|█▍        | 835/5971 [08:11<50:20,  1.70it/s, loss=0.129, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000751, train/loss_step=0.205, global_step=82.00]
Epoch 0:  14%|█▍        | 835/5971 [08:11<50:20,  1.70it/s, loss=0.141, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000841, train/loss_step=0.240, global_step=82.00]
Epoch 0:  14%|█▍        | 836/5971 [08:14<50:30,  1.69it/s, loss=0.155, v_num=0, train/loss_simple_step=0.341, train/loss_vlb_step=0.00138, train/loss_step=0.341, global_step=82.00] 
Epoch 0:  14%|█▍        | 837/5971 [08:14<50:32,  1.69it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0966, train/loss_vlb_step=0.000322, train/loss_step=0.0966, global_step=83.00]
Epoch 0:  14%|█▍        | 838/5971 [08:15<50:33,  1.69it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00246, train/loss_vlb_step=1.45e-5, train/loss_step=0.00246, global_step=83.00]
Epoch 0:  14%|█▍        | 839/5971 [08:16<50:34,  1.69it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00246, train/loss_vlb_step=1.45e-5, train/loss_step=0.00246, global_step=83.00]
Epoch 0:  14%|█▍        | 839/5971 [08:16<50:34,  1.69it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0827, train/loss_vlb_step=0.000286, train/loss_step=0.0827, global_step=83.00] 
Epoch 0:  14%|█▍        | 840/5971 [08:18<50:43,  1.69it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00495, train/loss_vlb_step=2.6e-5, train/loss_step=0.00495, global_step=83.00]
Epoch 0:  14%|█▍        | 841/5971 [08:19<50:44,  1.69it/s, loss=0.14, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000564, train/loss_step=0.165, global_step=84.00]   
Epoch 0:  14%|█▍        | 842/5971 [08:20<50:45,  1.68it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0295, train/loss_vlb_step=0.000119, train/loss_step=0.0295, global_step=84.00]
Epoch 0:  14%|█▍        | 843/5971 [08:21<50:46,  1.68it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0295, train/loss_vlb_step=0.000119, train/loss_step=0.0295, global_step=84.00]
Epoch 0:  14%|█▍        | 843/5971 [08:21<50:46,  1.68it/s, loss=0.159, v_num=0, train/loss_simple_step=0.449, train/loss_vlb_step=0.00317, train/loss_step=0.449, global_step=84.00]   
Epoch 0:  14%|█▍        | 844/5971 [08:23<50:56,  1.68it/s, loss=0.173, v_num=0, train/loss_simple_step=0.316, train/loss_vlb_step=0.00158, train/loss_step=0.316, global_step=84.00]
Epoch 0:  14%|█▍        | 845/5971 [08:24<50:57,  1.68it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0425, train/loss_vlb_step=0.00016, train/loss_step=0.0425, global_step=85.00]
Epoch 0:  14%|█▍        | 846/5971 [08:25<50:58,  1.68it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0021, train/loss_vlb_step=1.28e-5, train/loss_step=0.0021, global_step=85.00]
Epoch 0:  14%|█▍        | 847/5971 [08:26<50:59,  1.67it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0021, train/loss_vlb_step=1.28e-5, train/loss_step=0.0021, global_step=85.00]
Epoch 0:  14%|█▍        | 847/5971 [08:26<50:59,  1.67it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00408, train/loss_vlb_step=2.28e-5, train/loss_step=0.00408, global_step=85.00]
Epoch 0:  14%|█▍        | 848/5971 [08:29<51:12,  1.67it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0313, train/loss_vlb_step=0.000124, train/loss_step=0.0313, global_step=85.00]  
Epoch 0:  14%|█▍        | 849/5971 [08:30<51:13,  1.67it/s, loss=0.133, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000719, train/loss_step=0.201, global_step=86.00] 
Epoch 0:  14%|█▍        | 850/5971 [08:30<51:14,  1.67it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=6.72e-5, train/loss_step=0.0156, global_step=86.00]
Epoch 0:  14%|█▍        | 851/5971 [08:31<51:15,  1.66it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=6.72e-5, train/loss_step=0.0156, global_step=86.00]
Epoch 0:  14%|█▍        | 851/5971 [08:31<51:15,  1.66it/s, loss=0.122, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.000619, train/loss_step=0.186, global_step=86.00] 
Epoch 0:  14%|█▍        | 852/5971 [08:33<51:23,  1.66it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0631, train/loss_vlb_step=0.000217, train/loss_step=0.0631, global_step=86.00]
Epoch 0:  14%|█▍        | 853/5971 [08:34<51:25,  1.66it/s, loss=0.134, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.0007, train/loss_step=0.204, global_step=87.00]    
Epoch 0:  14%|█▍        | 854/5971 [08:35<51:26,  1.66it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00277, train/loss_vlb_step=1.55e-5, train/loss_step=0.00277, global_step=87.00]
Epoch 0:  14%|█▍        | 855/5971 [08:36<51:27,  1.66it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00277, train/loss_vlb_step=1.55e-5, train/loss_step=0.00277, global_step=87.00]
Epoch 0:  14%|█▍        | 855/5971 [08:36<51:27,  1.66it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0117, train/loss_vlb_step=5.34e-5, train/loss_step=0.0117, global_step=87.00]  
Epoch 0:  14%|█▍        | 856/5971 [08:38<51:37,  1.65it/s, loss=0.12, v_num=0, train/loss_simple_step=0.481, train/loss_vlb_step=0.0031, train/loss_step=0.481, global_step=87.00]    
Epoch 0:  14%|█▍        | 857/5971 [08:39<51:38,  1.65it/s, loss=0.122, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000474, train/loss_step=0.142, global_step=88.00]
Epoch 0:  14%|█▍        | 858/5971 [08:40<51:39,  1.65it/s, loss=0.131, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000598, train/loss_step=0.181, global_step=88.00]
Epoch 0:  14%|█▍        | 859/5971 [08:41<51:40,  1.65it/s, loss=0.131, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000598, train/loss_step=0.181, global_step=88.00]
Epoch 0:  14%|█▍        | 859/5971 [08:41<51:40,  1.65it/s, loss=0.14, v_num=0, train/loss_simple_step=0.273, train/loss_vlb_step=0.00119, train/loss_step=0.273, global_step=88.00]  
Epoch 0:  14%|█▍        | 860/5971 [08:43<51:48,  1.64it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00715, train/loss_vlb_step=3.57e-5, train/loss_step=0.00715, global_step=88.00]
Epoch 0:  14%|█▍        | 861/5971 [08:44<51:50,  1.64it/s, loss=0.149, v_num=0, train/loss_simple_step=0.335, train/loss_vlb_step=0.00172, train/loss_step=0.335, global_step=89.00]   
Epoch 0:  14%|█▍        | 862/5971 [08:45<51:51,  1.64it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0891, train/loss_vlb_step=0.000298, train/loss_step=0.0891, global_step=89.00]
Epoch 0:  14%|█▍        | 863/5971 [08:46<51:52,  1.64it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0891, train/loss_vlb_step=0.000298, train/loss_step=0.0891, global_step=89.00]
Epoch 0:  14%|█▍        | 863/5971 [08:46<51:52,  1.64it/s, loss=0.136, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000467, train/loss_step=0.136, global_step=89.00]  
Epoch 0:  14%|█▍        | 864/5971 [08:49<52:04,  1.63it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0672, train/loss_vlb_step=0.000227, train/loss_step=0.0672, global_step=89.00]
Epoch 0:  14%|█▍        | 865/5971 [08:50<52:06,  1.63it/s, loss=0.129, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000495, train/loss_step=0.150, global_step=90.00]  
Epoch 0:  15%|█▍        | 866/5971 [08:51<52:07,  1.63it/s, loss=0.138, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.000625, train/loss_step=0.187, global_step=90.00]
Epoch 0:  15%|█▍        | 867/5971 [08:51<52:08,  1.63it/s, loss=0.138, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.000625, train/loss_step=0.187, global_step=90.00]
Epoch 0:  15%|█▍        | 867/5971 [08:51<52:08,  1.63it/s, loss=0.145, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000452, train/loss_step=0.138, global_step=90.00]
Epoch 0:  15%|█▍        | 868/5971 [08:54<52:18,  1.63it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0927, train/loss_vlb_step=0.000306, train/loss_step=0.0927, global_step=90.00]
Epoch 0:  15%|█▍        | 869/5971 [08:55<52:19,  1.63it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00194, train/loss_vlb_step=1.16e-5, train/loss_step=0.00194, global_step=91.00]
Epoch 0:  15%|█▍        | 870/5971 [08:56<52:20,  1.62it/s, loss=0.151, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00128, train/loss_step=0.267, global_step=91.00]    
Epoch 0:  15%|█▍        | 871/5971 [08:57<52:21,  1.62it/s, loss=0.151, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00128, train/loss_step=0.267, global_step=91.00]
Epoch 0:  15%|█▍        | 871/5971 [08:57<52:21,  1.62it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00429, train/loss_vlb_step=2.33e-5, train/loss_step=0.00429, global_step=91.00]
Epoch 0:  15%|█▍        | 872/5971 [08:59<52:29,  1.62it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00454, train/loss_vlb_step=2.51e-5, train/loss_step=0.00454, global_step=91.00]
Epoch 0:  15%|█▍        | 873/5971 [09:00<52:30,  1.62it/s, loss=0.142, v_num=0, train/loss_simple_step=0.263, train/loss_vlb_step=0.00095, train/loss_step=0.263, global_step=92.00]    
Epoch 0:  15%|█▍        | 874/5971 [09:01<52:31,  1.62it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=5.09e-5, train/loss_step=0.0102, global_step=92.00]
Epoch 0:  15%|█▍        | 875/5971 [09:01<52:32,  1.62it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=5.09e-5, train/loss_step=0.0102, global_step=92.00]
Epoch 0:  15%|█▍        | 875/5971 [09:01<52:32,  1.62it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00231, train/loss_vlb_step=1.41e-5, train/loss_step=0.00231, global_step=92.00]
Epoch 0:  15%|█▍        | 876/5971 [09:04<52:41,  1.61it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.87e-5, train/loss_step=0.0125, global_step=92.00]  
Epoch 0:  15%|█▍        | 877/5971 [09:05<52:42,  1.61it/s, loss=0.133, v_num=0, train/loss_simple_step=0.447, train/loss_vlb_step=0.00299, train/loss_step=0.447, global_step=93.00]  
Epoch 0:  15%|█▍        | 878/5971 [09:06<52:43,  1.61it/s, loss=0.135, v_num=0, train/loss_simple_step=0.221, train/loss_vlb_step=0.000768, train/loss_step=0.221, global_step=93.00]
Epoch 0:  15%|█▍        | 879/5971 [09:06<52:44,  1.61it/s, loss=0.135, v_num=0, train/loss_simple_step=0.221, train/loss_vlb_step=0.000768, train/loss_step=0.221, global_step=93.00]
Epoch 0:  15%|█▍        | 879/5971 [09:06<52:44,  1.61it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0774, train/loss_vlb_step=0.000268, train/loss_step=0.0774, global_step=93.00]
Epoch 0:  15%|█▍        | 880/5971 [09:09<52:52,  1.60it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0777, train/loss_vlb_step=0.00026, train/loss_step=0.0777, global_step=93.00] 
Epoch 0:  15%|█▍        | 881/5971 [09:09<52:53,  1.60it/s, loss=0.137, v_num=0, train/loss_simple_step=0.495, train/loss_vlb_step=0.00429, train/loss_step=0.495, global_step=94.00]  
Epoch 0:  15%|█▍        | 882/5971 [09:10<52:54,  1.60it/s, loss=0.138, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000377, train/loss_step=0.114, global_step=94.00]
Epoch 0:  15%|█▍        | 883/5971 [09:11<52:55,  1.60it/s, loss=0.138, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000377, train/loss_step=0.114, global_step=94.00]
Epoch 0:  15%|█▍        | 883/5971 [09:11<52:55,  1.60it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0307, train/loss_vlb_step=0.000118, train/loss_step=0.0307, global_step=94.00]
Epoch 0:  15%|█▍        | 884/5971 [09:13<53:03,  1.60it/s, loss=0.132, v_num=0, train/loss_simple_step=0.049, train/loss_vlb_step=0.00018, train/loss_step=0.049, global_step=94.00]   
Epoch 0:  15%|█▍        | 885/5971 [09:14<53:04,  1.60it/s, loss=0.147, v_num=0, train/loss_simple_step=0.443, train/loss_vlb_step=0.0024, train/loss_step=0.443, global_step=95.00] 
Epoch 0:  15%|█▍        | 886/5971 [09:15<53:05,  1.60it/s, loss=0.146, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000632, train/loss_step=0.172, global_step=95.00]
Epoch 0:  15%|█▍        | 887/5971 [09:16<53:05,  1.60it/s, loss=0.146, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000632, train/loss_step=0.172, global_step=95.00]
Epoch 0:  15%|█▍        | 887/5971 [09:16<53:05,  1.60it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.65e-5, train/loss_step=0.0124, global_step=95.00]
Epoch 0:  15%|█▍        | 888/5971 [09:19<53:17,  1.59it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00314, train/loss_vlb_step=1.79e-5, train/loss_step=0.00314, global_step=95.00]
Epoch 0:  15%|█▍        | 889/5971 [09:20<53:18,  1.59it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00768, train/loss_vlb_step=3.81e-5, train/loss_step=0.00768, global_step=96.00]
Epoch 0:  15%|█▍        | 890/5971 [09:21<53:20,  1.59it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.02e-5, train/loss_step=0.0115, global_step=96.00]  
Epoch 0:  15%|█▍        | 891/5971 [09:22<53:20,  1.59it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.02e-5, train/loss_step=0.0115, global_step=96.00]
Epoch 0:  15%|█▍        | 891/5971 [09:22<53:20,  1.59it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=7.81e-5, train/loss_step=0.0192, global_step=96.00]
Epoch 0:  15%|█▍        | 892/5971 [09:24<53:28,  1.58it/s, loss=0.134, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000739, train/loss_step=0.209, global_step=96.00] 
Epoch 0:  15%|█▍        | 893/5971 [09:25<53:29,  1.58it/s, loss=0.126, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000339, train/loss_step=0.101, global_step=97.00]
Epoch 0:  15%|█▍        | 894/5971 [09:25<53:30,  1.58it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00501, train/loss_vlb_step=2.71e-5, train/loss_step=0.00501, global_step=97.00]
Epoch 0:  15%|█▍        | 895/5971 [09:26<53:31,  1.58it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00501, train/loss_vlb_step=2.71e-5, train/loss_step=0.00501, global_step=97.00]
Epoch 0:  15%|█▍        | 895/5971 [09:26<53:31,  1.58it/s, loss=0.143, v_num=0, train/loss_simple_step=0.356, train/loss_vlb_step=0.00184, train/loss_step=0.356, global_step=97.00]    
Epoch 0:  15%|█▌        | 896/5971 [09:29<53:39,  1.58it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0271, train/loss_vlb_step=0.00011, train/loss_step=0.0271, global_step=97.00]
Epoch 0:  15%|█▌        | 897/5971 [09:30<53:40,  1.58it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0839, train/loss_vlb_step=0.000277, train/loss_step=0.0839, global_step=98.00]
Epoch 0:  15%|█▌        | 898/5971 [09:30<53:41,  1.57it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.69e-5, train/loss_step=0.0104, global_step=98.00] 
Epoch 0:  15%|█▌        | 899/5971 [09:31<53:42,  1.57it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.69e-5, train/loss_step=0.0104, global_step=98.00]
Epoch 0:  15%|█▌        | 899/5971 [09:31<53:42,  1.57it/s, loss=0.119, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000538, train/loss_step=0.155, global_step=98.00] 
Epoch 0:  15%|█▌        | 900/5971 [09:34<53:51,  1.57it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00318, train/loss_vlb_step=1.75e-5, train/loss_step=0.00318, global_step=98.00]
Epoch 0:  15%|█▌        | 901/5971 [09:35<53:52,  1.57it/s, loss=0.106, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.0013, train/loss_step=0.308, global_step=99.00]     
Epoch 0:  15%|█▌        | 902/5971 [09:36<53:53,  1.57it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0914, train/loss_vlb_step=0.000312, train/loss_step=0.0914, global_step=99.00]
Epoch 0:  15%|█▌        | 903/5971 [09:36<53:54,  1.57it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0914, train/loss_vlb_step=0.000312, train/loss_step=0.0914, global_step=99.00]
Epoch 0:  15%|█▌        | 903/5971 [09:36<53:54,  1.57it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0442, train/loss_vlb_step=0.000159, train/loss_step=0.0442, global_step=99.00]
Epoch 0:  15%|█▌        | 904/5971 [09:39<54:01,  1.56it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]   

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:21,  2.03it/s][A

Validating:   1%|          | 2/167 [00:00<00:46,  3.56it/s][A
Epoch 0:  15%|█▌        | 907/5971 [09:39<53:53,  1.57it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.47it/s][A
Epoch 0:  15%|█▌        | 911/5971 [09:39<53:37,  1.57it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.38it/s][A
Epoch 0:  15%|█▌        | 915/5971 [09:40<53:21,  1.58it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.29it/s][A

Validating:   8%|▊         | 14/167 [00:01<00:08, 18.21it/s][A
Epoch 0:  15%|█▌        | 919/5971 [09:40<53:06,  1.59it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  10%|█         | 17/167 [00:01<00:07, 20.34it/s][A
Epoch 0:  15%|█▌        | 923/5971 [09:40<52:50,  1.59it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 21.70it/s][A
Epoch 0:  16%|█▌        | 927/5971 [09:40<52:35,  1.60it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 23.15it/s][A

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.55it/s][A
Epoch 0:  16%|█▌        | 931/5971 [09:40<52:20,  1.61it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 25.51it/s][A
Epoch 0:  16%|█▌        | 935/5971 [09:40<52:04,  1.61it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 26.29it/s][A
Epoch 0:  16%|█▌        | 939/5971 [09:40<51:49,  1.62it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  22%|██▏       | 36/167 [00:01<00:04, 27.80it/s][A
Epoch 0:  16%|█▌        | 943/5971 [09:41<51:34,  1.62it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  23%|██▎       | 39/167 [00:02<00:04, 27.69it/s][A

Validating:  25%|██▌       | 42/167 [00:02<00:04, 26.79it/s][A
Epoch 0:  16%|█▌        | 947/5971 [09:41<51:20,  1.63it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 26.53it/s][A
Epoch 0:  16%|█▌        | 951/5971 [09:41<51:05,  1.64it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 26.65it/s][A
Epoch 0:  16%|█▌        | 955/5971 [09:41<50:51,  1.64it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  31%|███       | 51/167 [00:02<00:04, 26.16it/s][A

Validating:  32%|███▏      | 54/167 [00:02<00:04, 26.59it/s][A
Epoch 0:  16%|█▌        | 959/5971 [09:41<50:36,  1.65it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 26.66it/s][A
Epoch 0:  16%|█▌        | 963/5971 [09:41<50:22,  1.66it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  36%|███▌      | 60/167 [00:02<00:04, 26.36it/s][A
Epoch 0:  16%|█▌        | 967/5971 [09:41<50:08,  1.66it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  38%|███▊      | 63/167 [00:02<00:03, 26.99it/s][A

Validating:  40%|███▉      | 66/167 [00:03<00:03, 25.54it/s][A
Epoch 0:  16%|█▋        | 971/5971 [09:42<49:54,  1.67it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 27.04it/s][A
Epoch 0:  16%|█▋        | 975/5971 [09:42<49:40,  1.68it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 26.51it/s][A
Epoch 0:  16%|█▋        | 979/5971 [09:42<49:26,  1.68it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 26.66it/s][A
Epoch 0:  16%|█▋        | 983/5971 [09:42<49:13,  1.69it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 26.17it/s][A

Validating:  49%|████▉     | 82/167 [00:03<00:03, 26.23it/s][A
Epoch 0:  17%|█▋        | 987/5971 [09:42<48:59,  1.70it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  51%|█████     | 85/167 [00:03<00:03, 25.88it/s][A
Epoch 0:  17%|█▋        | 991/5971 [09:42<48:46,  1.70it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  53%|█████▎    | 88/167 [00:03<00:03, 25.68it/s][A
Epoch 0:  17%|█▋        | 995/5971 [09:43<48:32,  1.71it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  54%|█████▍    | 91/167 [00:04<00:02, 26.07it/s][A

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 25.68it/s][A
Epoch 0:  17%|█▋        | 999/5971 [09:43<48:19,  1.71it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 26.18it/s][A
Epoch 0:  17%|█▋        | 1003/5971 [09:43<48:06,  1.72it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 25.90it/s][A
Epoch 0:  17%|█▋        | 1007/5971 [09:43<47:53,  1.73it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 27.50it/s][A
Epoch 0:  17%|█▋        | 1011/5971 [09:43<47:40,  1.73it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 27.16it/s][A

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 25.72it/s][A
Epoch 0:  17%|█▋        | 1015/5971 [09:43<47:27,  1.74it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  68%|██████▊   | 113/167 [00:04<00:02, 25.58it/s][A
Epoch 0:  17%|█▋        | 1019/5971 [09:43<47:15,  1.75it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  69%|██████▉   | 116/167 [00:04<00:01, 26.16it/s][A
Epoch 0:  17%|█▋        | 1023/5971 [09:44<47:02,  1.75it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 27.06it/s][A

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 25.91it/s][A
Epoch 0:  17%|█▋        | 1027/5971 [09:44<46:50,  1.76it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 26.44it/s][A
Epoch 0:  17%|█▋        | 1031/5971 [09:44<46:37,  1.77it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 25.83it/s][A
Epoch 0:  17%|█▋        | 1035/5971 [09:44<46:25,  1.77it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 25.92it/s][A

Validating:  80%|████████  | 134/167 [00:05<00:01, 25.80it/s][A
Epoch 0:  17%|█▋        | 1039/5971 [09:44<46:13,  1.78it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 25.33it/s][A
Epoch 0:  17%|█▋        | 1043/5971 [09:44<46:00,  1.78it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  84%|████████▍ | 140/167 [00:05<00:01, 25.47it/s][A
Epoch 0:  18%|█▊        | 1047/5971 [09:45<45:48,  1.79it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 25.46it/s][A

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 26.39it/s][A
Epoch 0:  18%|█▊        | 1051/5971 [09:45<45:36,  1.80it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 26.00it/s][A
Epoch 0:  18%|█▊        | 1055/5971 [09:45<45:25,  1.80it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 26.75it/s][A
Epoch 0:  18%|█▊        | 1059/5971 [09:45<45:13,  1.81it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 27.41it/s][A

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 27.42it/s][A
Epoch 0:  18%|█▊        | 1063/5971 [09:45<45:01,  1.82it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 27.55it/s][A
Epoch 0:  18%|█▊        | 1067/5971 [09:45<44:49,  1.82it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

Validating:  99%|█████████▉| 165/167 [00:06<00:00, 28.37it/s][A
Epoch 0:  18%|█▊        | 1071/5971 [09:45<44:38,  1.83it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]
Epoch 0:  18%|█▊        | 1072/5971 [09:46<44:36,  1.83it/s, loss=0.104, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.39e-5, train/loss_step=0.017, global_step=99.00]

timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.41it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.27it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.92it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.36it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.52it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.72it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.95it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.12it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.17it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.29it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.41it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.49it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.46it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.45it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.39it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.34it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.40it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.41it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.39it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.34it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.33it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.17it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.31it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.36it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.35it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.38it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.39it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.45it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.41it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.47it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.52it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.55it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.56it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.57it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.60it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.63it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.53it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.44it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.37it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.30it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.28it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.27it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.32it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.29it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.31it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.42it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.48it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.53it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.58it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.12it/s]

Epoch 0:  18%|█▊        | 1073/5971 [09:58<45:28,  1.79it/s, loss=0.089, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000472, train/loss_step=0.143, global_step=100.0]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.33it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.43it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.29it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.94it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.41it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.77it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.01it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.20it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.34it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.36it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.37it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.46it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.52it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.57it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.57it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.60it/s][A
Epoch 0:  18%|█▊        | 1073/5971 [10:03<45:50,  1.78it/s, loss=0.089, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000472, train/loss_step=0.143, global_step=100.0]

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.63it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.64it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.67it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.68it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.68it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.66it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.67it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.67it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.68it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.65it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.49it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.36it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.28it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.32it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.43it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.51it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.55it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.59it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.61it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.63it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.65it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.65it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.66it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.66it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.67it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.43it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.37it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.37it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.32it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.25it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.25it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.32it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.43it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.50it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.22it/s]

Epoch 0:  18%|█▊        | 1074/5971 [10:10<46:20,  1.76it/s, loss=0.089, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000472, train/loss_step=0.143, global_step=100.0]
Epoch 0:  18%|█▊        | 1074/5971 [10:10<46:20,  1.76it/s, loss=0.0819, v_num=0, train/loss_simple_step=0.031, train/loss_vlb_step=0.000121, train/loss_step=0.031, global_step=100.0]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.30it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.35it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.18it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.84it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.36it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.58it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.81it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.01it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.20it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.35it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.42it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.47it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.53it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.57it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.61it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.63it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.65it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.65it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.61it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.63it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.62it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.53it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.56it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.58it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.58it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.61it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.63it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.65it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.67it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.65it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.65it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.64it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.65it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.65it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.65it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.63it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.55it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.51it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.55it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.58it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.57it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.59it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.60it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.56it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.54it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.57it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.57it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.61it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.63it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.64it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.24it/s]

Epoch 0:  18%|█▊        | 1075/5971 [10:22<47:10,  1.73it/s, loss=0.0819, v_num=0, train/loss_simple_step=0.031, train/loss_vlb_step=0.000121, train/loss_step=0.031, global_step=100.0]
Epoch 0:  18%|█▊        | 1075/5971 [10:22<47:10,  1.73it/s, loss=0.0973, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00174, train/loss_step=0.319, global_step=100.0]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.32it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.40it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.25it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.91it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.37it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.70it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.97it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.17it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.32it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.43it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.26it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.31it/s]
Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.42it/s]
Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.51it/s]
Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.56it/s]
Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.59it/s]
Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.52it/s]
Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.56it/s]
Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.59it/s]
Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.62it/s]
Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.63it/s]
Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.64it/s]
Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.63it/s]
Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.65it/s]
Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.66it/s]
Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.67it/s]
Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.46it/s]
Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.52it/s]
Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.54it/s]
Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.58it/s]
Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.61it/s]
Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.59it/s]
Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.60it/s]
Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.60it/s]
Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.62it/s]
Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.64it/s]
Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.66it/s]
Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.67it/s]
Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.62it/s]
Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.49it/s]
Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.43it/s]
Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.40it/s]
Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.49it/s]
Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.54it/s]
Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.57it/s]
Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.52it/s]
Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.49it/s]
Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.53it/s]
Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.46it/s]
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.51it/s]
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.23it/s]

Epoch 0:  18%|█▊        | 1076/5971 [10:35<48:08,  1.69it/s, loss=0.0973, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00174, train/loss_step=0.319, global_step=100.0]
Epoch 0:  18%|█▊        | 1076/5971 [10:35<48:08,  1.69it/s, loss=0.117, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.00202, train/loss_step=0.391, global_step=100.0] 
Epoch 0:  18%|█▊        | 1077/5971 [10:36<48:09,  1.69it/s, loss=0.117, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.00202, train/loss_step=0.391, global_step=100.0]
Epoch 0:  18%|█▊        | 1077/5971 [10:36<48:09,  1.69it/s, loss=0.123, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000438, train/loss_step=0.132, global_step=101.0]
Epoch 0:  18%|█▊        | 1078/5971 [10:37<48:10,  1.69it/s, loss=0.123, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000438, train/loss_step=0.132, global_step=101.0]
Epoch 0:  18%|█▊        | 1078/5971 [10:37<48:10,  1.69it/s, loss=0.135, v_num=0, train/loss_simple_step=0.246, train/loss_vlb_step=0.000897, train/loss_step=0.246, global_step=101.0]
Epoch 0:  18%|█▊        | 1079/5971 [10:38<48:10,  1.69it/s, loss=0.135, v_num=0, train/loss_simple_step=0.246, train/loss_vlb_step=0.000897, train/loss_step=0.246, global_step=101.0]
Epoch 0:  18%|█▊        | 1079/5971 [10:38<48:11,  1.69it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0295, train/loss_vlb_step=0.000114, train/loss_step=0.0295, global_step=101.0]
Epoch 0:  18%|█▊        | 1080/5971 [10:40<48:18,  1.69it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0295, train/loss_vlb_step=0.000114, train/loss_step=0.0295, global_step=101.0]
Epoch 0:  18%|█▊        | 1080/5971 [10:40<48:18,  1.69it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00309, train/loss_vlb_step=1.78e-5, train/loss_step=0.00309, global_step=101.0]
Epoch 0:  18%|█▊        | 1081/5971 [10:41<48:19,  1.69it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00309, train/loss_vlb_step=1.78e-5, train/loss_step=0.00309, global_step=101.0]
Epoch 0:  18%|█▊        | 1081/5971 [10:41<48:19,  1.69it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0901, train/loss_vlb_step=0.000303, train/loss_step=0.0901, global_step=102.0] 
Epoch 0:  18%|█▊        | 1082/5971 [10:42<48:20,  1.69it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0901, train/loss_vlb_step=0.000303, train/loss_step=0.0901, global_step=102.0]
Epoch 0:  18%|█▊        | 1082/5971 [10:42<48:20,  1.69it/s, loss=0.137, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00105, train/loss_step=0.269, global_step=102.0]   
Epoch 0:  18%|█▊        | 1083/5971 [10:43<48:20,  1.69it/s, loss=0.137, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00105, train/loss_step=0.269, global_step=102.0]
Epoch 0:  18%|█▊        | 1083/5971 [10:43<48:20,  1.69it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0032, train/loss_vlb_step=1.79e-5, train/loss_step=0.0032, global_step=102.0]
Epoch 0:  18%|█▊        | 1084/5971 [10:46<48:30,  1.68it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0032, train/loss_vlb_step=1.79e-5, train/loss_step=0.0032, global_step=102.0]
Epoch 0:  18%|█▊        | 1084/5971 [10:46<48:30,  1.68it/s, loss=0.14, v_num=0, train/loss_simple_step=0.428, train/loss_vlb_step=0.00266, train/loss_step=0.428, global_step=102.0]  
Epoch 0:  18%|█▊        | 1085/5971 [10:47<48:31,  1.68it/s, loss=0.14, v_num=0, train/loss_simple_step=0.428, train/loss_vlb_step=0.00266, train/loss_step=0.428, global_step=102.0]
Epoch 0:  18%|█▊        | 1085/5971 [10:47<48:31,  1.68it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00675, train/loss_vlb_step=3.46e-5, train/loss_step=0.00675, global_step=103.0]
Epoch 0:  18%|█▊        | 1086/5971 [10:48<48:32,  1.68it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00675, train/loss_vlb_step=3.46e-5, train/loss_step=0.00675, global_step=103.0]
Epoch 0:  18%|█▊        | 1086/5971 [10:48<48:32,  1.68it/s, loss=0.147, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.000986, train/loss_step=0.229, global_step=103.0]   
Epoch 0:  18%|█▊        | 1087/5971 [10:48<48:32,  1.68it/s, loss=0.147, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.000986, train/loss_step=0.229, global_step=103.0]
Epoch 0:  18%|█▊        | 1087/5971 [10:48<48:32,  1.68it/s, loss=0.157, v_num=0, train/loss_simple_step=0.350, train/loss_vlb_step=0.00215, train/loss_step=0.350, global_step=103.0] 
Epoch 0:  18%|█▊        | 1088/5971 [10:51<48:42,  1.67it/s, loss=0.157, v_num=0, train/loss_simple_step=0.350, train/loss_vlb_step=0.00215, train/loss_step=0.350, global_step=103.0]
Epoch 0:  18%|█▊        | 1088/5971 [10:51<48:42,  1.67it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00583, train/loss_vlb_step=2.97e-5, train/loss_step=0.00583, global_step=103.0]
Epoch 0:  18%|█▊        | 1089/5971 [10:52<48:43,  1.67it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00583, train/loss_vlb_step=2.97e-5, train/loss_step=0.00583, global_step=103.0]
Epoch 0:  18%|█▊        | 1089/5971 [10:52<48:43,  1.67it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0387, train/loss_vlb_step=0.000144, train/loss_step=0.0387, global_step=104.0] 
Epoch 0:  18%|█▊        | 1090/5971 [10:53<48:44,  1.67it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0387, train/loss_vlb_step=0.000144, train/loss_step=0.0387, global_step=104.0]
Epoch 0:  18%|█▊        | 1090/5971 [10:53<48:44,  1.67it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0864, train/loss_vlb_step=0.000297, train/loss_step=0.0864, global_step=104.0]
Epoch 0:  18%|█▊        | 1091/5971 [10:54<48:44,  1.67it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0864, train/loss_vlb_step=0.000297, train/loss_step=0.0864, global_step=104.0]
Epoch 0:  18%|█▊        | 1091/5971 [10:54<48:44,  1.67it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0217, train/loss_vlb_step=9.04e-5, train/loss_step=0.0217, global_step=104.0] 
Epoch 0:  18%|█▊        | 1092/5971 [10:56<48:51,  1.66it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0217, train/loss_vlb_step=9.04e-5, train/loss_step=0.0217, global_step=104.0]
Epoch 0:  18%|█▊        | 1092/5971 [10:56<48:51,  1.66it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00895, train/loss_vlb_step=4.3e-5, train/loss_step=0.00895, global_step=104.0]
Epoch 0:  18%|█▊        | 1093/5971 [10:57<48:52,  1.66it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00895, train/loss_vlb_step=4.3e-5, train/loss_step=0.00895, global_step=104.0]
Epoch 0:  18%|█▊        | 1093/5971 [10:57<48:52,  1.66it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00968, train/loss_vlb_step=4.46e-5, train/loss_step=0.00968, global_step=105.0]
Epoch 0:  18%|█▊        | 1094/5971 [10:58<48:53,  1.66it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00968, train/loss_vlb_step=4.46e-5, train/loss_step=0.00968, global_step=105.0]
Epoch 0:  18%|█▊        | 1094/5971 [10:58<48:53,  1.66it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0246, train/loss_vlb_step=9.82e-5, train/loss_step=0.0246, global_step=105.0]  
Epoch 0:  18%|█▊        | 1095/5971 [10:59<48:53,  1.66it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0246, train/loss_vlb_step=9.82e-5, train/loss_step=0.0246, global_step=105.0]
Epoch 0:  18%|█▊        | 1095/5971 [10:59<48:53,  1.66it/s, loss=0.129, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000894, train/loss_step=0.216, global_step=105.0] 
Epoch 0:  18%|█▊        | 1096/5971 [11:01<49:00,  1.66it/s, loss=0.129, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000894, train/loss_step=0.216, global_step=105.0]
Epoch 0:  18%|█▊        | 1096/5971 [11:01<49:00,  1.66it/s, loss=0.13, v_num=0, train/loss_simple_step=0.412, train/loss_vlb_step=0.00321, train/loss_step=0.412, global_step=105.0]  
Epoch 0:  18%|█▊        | 1097/5971 [11:02<49:01,  1.66it/s, loss=0.13, v_num=0, train/loss_simple_step=0.412, train/loss_vlb_step=0.00321, train/loss_step=0.412, global_step=105.0]
Epoch 0:  18%|█▊        | 1097/5971 [11:02<49:01,  1.66it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00652, train/loss_vlb_step=3.16e-5, train/loss_step=0.00652, global_step=106.0]
Epoch 0:  18%|█▊        | 1098/5971 [11:03<49:02,  1.66it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00652, train/loss_vlb_step=3.16e-5, train/loss_step=0.00652, global_step=106.0]
Epoch 0:  18%|█▊        | 1098/5971 [11:03<49:02,  1.66it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00302, train/loss_vlb_step=1.74e-5, train/loss_step=0.00302, global_step=106.0]
Epoch 0:  18%|█▊        | 1099/5971 [11:04<49:03,  1.66it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00302, train/loss_vlb_step=1.74e-5, train/loss_step=0.00302, global_step=106.0]
Epoch 0:  18%|█▊        | 1099/5971 [11:04<49:03,  1.66it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00439, train/loss_vlb_step=2.37e-5, train/loss_step=0.00439, global_step=106.0]
Epoch 0:  18%|█▊        | 1100/5971 [11:07<49:12,  1.65it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00439, train/loss_vlb_step=2.37e-5, train/loss_step=0.00439, global_step=106.0]
Epoch 0:  18%|█▊        | 1100/5971 [11:07<49:12,  1.65it/s, loss=0.117, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000437, train/loss_step=0.133, global_step=106.0]   
Epoch 0:  18%|█▊        | 1101/5971 [11:08<49:13,  1.65it/s, loss=0.117, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000437, train/loss_step=0.133, global_step=106.0]
Epoch 0:  18%|█▊        | 1101/5971 [11:08<49:13,  1.65it/s, loss=0.119, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000437, train/loss_step=0.131, global_step=107.0]
Epoch 0:  18%|█▊        | 1102/5971 [11:09<49:13,  1.65it/s, loss=0.119, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000437, train/loss_step=0.131, global_step=107.0]
Epoch 0:  18%|█▊        | 1102/5971 [11:09<49:13,  1.65it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0724, train/loss_vlb_step=0.00024, train/loss_step=0.0724, global_step=107.0]
Epoch 0:  18%|█▊        | 1103/5971 [11:10<49:14,  1.65it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0724, train/loss_vlb_step=0.00024, train/loss_step=0.0724, global_step=107.0]
Epoch 0:  18%|█▊        | 1103/5971 [11:10<49:14,  1.65it/s, loss=0.111, v_num=0, train/loss_simple_step=0.028, train/loss_vlb_step=0.00011, train/loss_step=0.028, global_step=107.0] 
Epoch 0:  18%|█▊        | 1104/5971 [11:12<49:21,  1.64it/s, loss=0.111, v_num=0, train/loss_simple_step=0.028, train/loss_vlb_step=0.00011, train/loss_step=0.028, global_step=107.0]
Epoch 0:  18%|█▊        | 1104/5971 [11:12<49:21,  1.64it/s, loss=0.101, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000781, train/loss_step=0.227, global_step=107.0]
Epoch 0:  19%|█▊        | 1105/5971 [11:13<49:22,  1.64it/s, loss=0.101, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000781, train/loss_step=0.227, global_step=107.0]
Epoch 0:  19%|█▊        | 1105/5971 [11:13<49:22,  1.64it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0365, train/loss_vlb_step=0.000127, train/loss_step=0.0365, global_step=108.0]
Epoch 0:  19%|█▊        | 1106/5971 [11:14<49:22,  1.64it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0365, train/loss_vlb_step=0.000127, train/loss_step=0.0365, global_step=108.0]
Epoch 0:  19%|█▊        | 1106/5971 [11:14<49:22,  1.64it/s, loss=0.11, v_num=0, train/loss_simple_step=0.392, train/loss_vlb_step=0.00165, train/loss_step=0.392, global_step=108.0]    
Epoch 0:  19%|█▊        | 1107/5971 [11:15<49:23,  1.64it/s, loss=0.11, v_num=0, train/loss_simple_step=0.392, train/loss_vlb_step=0.00165, train/loss_step=0.392, global_step=108.0]
Epoch 0:  19%|█▊        | 1107/5971 [11:15<49:23,  1.64it/s, loss=0.102, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000669, train/loss_step=0.185, global_step=108.0]
Epoch 0:  19%|█▊        | 1108/5971 [11:17<49:32,  1.64it/s, loss=0.102, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000669, train/loss_step=0.185, global_step=108.0]
Epoch 0:  19%|█▊        | 1108/5971 [11:17<49:32,  1.64it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00411, train/loss_vlb_step=2.24e-5, train/loss_step=0.00411, global_step=108.0]
Epoch 0:  19%|█▊        | 1109/5971 [11:18<49:32,  1.64it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00411, train/loss_vlb_step=2.24e-5, train/loss_step=0.00411, global_step=108.0]
Epoch 0:  19%|█▊        | 1109/5971 [11:18<49:32,  1.64it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0176, train/loss_vlb_step=7.7e-5, train/loss_step=0.0176, global_step=109.0]   
Epoch 0:  19%|█▊        | 1110/5971 [11:19<49:33,  1.63it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0176, train/loss_vlb_step=7.7e-5, train/loss_step=0.0176, global_step=109.0]
Epoch 0:  19%|█▊        | 1110/5971 [11:19<49:33,  1.63it/s, loss=0.0969, v_num=0, train/loss_simple_step=0.00503, train/loss_vlb_step=2.62e-5, train/loss_step=0.00503, global_step=109.0]
Epoch 0:  19%|█▊        | 1111/5971 [11:20<49:34,  1.63it/s, loss=0.0969, v_num=0, train/loss_simple_step=0.00503, train/loss_vlb_step=2.62e-5, train/loss_step=0.00503, global_step=109.0]
Epoch 0:  19%|█▊        | 1111/5971 [11:20<49:34,  1.63it/s, loss=0.11, v_num=0, train/loss_simple_step=0.280, train/loss_vlb_step=0.00106, train/loss_step=0.280, global_step=109.0]      
Epoch 0:  19%|█▊        | 1112/5971 [11:22<49:40,  1.63it/s, loss=0.11, v_num=0, train/loss_simple_step=0.280, train/loss_vlb_step=0.00106, train/loss_step=0.280, global_step=109.0]
Epoch 0:  19%|█▊        | 1112/5971 [11:22<49:40,  1.63it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0579, train/loss_vlb_step=0.000194, train/loss_step=0.0579, global_step=109.0]
Epoch 0:  19%|█▊        | 1113/5971 [11:23<49:40,  1.63it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0579, train/loss_vlb_step=0.000194, train/loss_step=0.0579, global_step=109.0]
Epoch 0:  19%|█▊        | 1113/5971 [11:23<49:40,  1.63it/s, loss=0.113, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.29e-5, train/loss_step=0.017, global_step=110.0]   
Epoch 0:  19%|█▊        | 1114/5971 [11:24<49:41,  1.63it/s, loss=0.113, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.29e-5, train/loss_step=0.017, global_step=110.0]
Epoch 0:  19%|█▊        | 1114/5971 [11:24<49:41,  1.63it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0688, train/loss_vlb_step=0.000236, train/loss_step=0.0688, global_step=110.0]
Epoch 0:  19%|█▊        | 1115/5971 [11:25<49:41,  1.63it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0688, train/loss_vlb_step=0.000236, train/loss_step=0.0688, global_step=110.0]
Epoch 0:  19%|█▊        | 1115/5971 [11:25<49:41,  1.63it/s, loss=0.119, v_num=0, train/loss_simple_step=0.294, train/loss_vlb_step=0.00115, train/loss_step=0.294, global_step=110.0]   
Epoch 0:  19%|█▊        | 1116/5971 [11:27<49:47,  1.62it/s, loss=0.119, v_num=0, train/loss_simple_step=0.294, train/loss_vlb_step=0.00115, train/loss_step=0.294, global_step=110.0]
Epoch 0:  19%|█▊        | 1116/5971 [11:27<49:47,  1.62it/s, loss=0.109, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000942, train/loss_step=0.216, global_step=110.0]
Epoch 0:  19%|█▊        | 1117/5971 [11:28<49:48,  1.62it/s, loss=0.109, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000942, train/loss_step=0.216, global_step=110.0]
Epoch 0:  19%|█▊        | 1117/5971 [11:28<49:48,  1.62it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00601, train/loss_vlb_step=3.25e-5, train/loss_step=0.00601, global_step=111.0]
Epoch 0:  19%|█▊        | 1118/5971 [11:29<49:49,  1.62it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00601, train/loss_vlb_step=3.25e-5, train/loss_step=0.00601, global_step=111.0]
Epoch 0:  19%|█▊        | 1118/5971 [11:29<49:49,  1.62it/s, loss=0.142, v_num=0, train/loss_simple_step=0.656, train/loss_vlb_step=0.00672, train/loss_step=0.656, global_step=111.0]    
Epoch 0:  19%|█▊        | 1119/5971 [11:30<49:49,  1.62it/s, loss=0.142, v_num=0, train/loss_simple_step=0.656, train/loss_vlb_step=0.00672, train/loss_step=0.656, global_step=111.0]
Epoch 0:  19%|█▊        | 1119/5971 [11:30<49:49,  1.62it/s, loss=0.15, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000597, train/loss_step=0.180, global_step=111.0]
Epoch 0:  19%|█▉        | 1120/5971 [11:32<49:56,  1.62it/s, loss=0.15, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000597, train/loss_step=0.180, global_step=111.0]
Epoch 0:  19%|█▉        | 1120/5971 [11:32<49:56,  1.62it/s, loss=0.154, v_num=0, train/loss_simple_step=0.198, train/loss_vlb_step=0.000672, train/loss_step=0.198, global_step=111.0]
Epoch 0:  19%|█▉        | 1121/5971 [11:33<49:57,  1.62it/s, loss=0.154, v_num=0, train/loss_simple_step=0.198, train/loss_vlb_step=0.000672, train/loss_step=0.198, global_step=111.0]
Epoch 0:  19%|█▉        | 1121/5971 [11:33<49:57,  1.62it/s, loss=0.173, v_num=0, train/loss_simple_step=0.516, train/loss_vlb_step=0.00447, train/loss_step=0.516, global_step=112.0] 
Epoch 0:  19%|█▉        | 1122/5971 [11:34<49:57,  1.62it/s, loss=0.173, v_num=0, train/loss_simple_step=0.516, train/loss_vlb_step=0.00447, train/loss_step=0.516, global_step=112.0]
Epoch 0:  19%|█▉        | 1122/5971 [11:34<49:57,  1.62it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0358, train/loss_vlb_step=0.000136, train/loss_step=0.0358, global_step=112.0]
Epoch 0:  19%|█▉        | 1123/5971 [11:35<49:57,  1.62it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0358, train/loss_vlb_step=0.000136, train/loss_step=0.0358, global_step=112.0]
Epoch 0:  19%|█▉        | 1123/5971 [11:35<49:57,  1.62it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00459, train/loss_vlb_step=2.43e-5, train/loss_step=0.00459, global_step=112.0]
Epoch 0:  19%|█▉        | 1124/5971 [11:37<50:06,  1.61it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00459, train/loss_vlb_step=2.43e-5, train/loss_step=0.00459, global_step=112.0]
Epoch 0:  19%|█▉        | 1124/5971 [11:37<50:06,  1.61it/s, loss=0.164, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000345, train/loss_step=0.105, global_step=112.0]  
Epoch 0:  19%|█▉        | 1125/5971 [11:38<50:07,  1.61it/s, loss=0.164, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000345, train/loss_step=0.105, global_step=112.0]
Epoch 0:  19%|█▉        | 1125/5971 [11:38<50:07,  1.61it/s, loss=0.171, v_num=0, train/loss_simple_step=0.191, train/loss_vlb_step=0.000654, train/loss_step=0.191, global_step=113.0]
Epoch 0:  19%|█▉        | 1126/5971 [11:39<50:07,  1.61it/s, loss=0.171, v_num=0, train/loss_simple_step=0.191, train/loss_vlb_step=0.000654, train/loss_step=0.191, global_step=113.0]
Epoch 0:  19%|█▉        | 1126/5971 [11:39<50:07,  1.61it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00175, train/loss_vlb_step=1.06e-5, train/loss_step=0.00175, global_step=113.0]
Epoch 0:  19%|█▉        | 1127/5971 [11:40<50:08,  1.61it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00175, train/loss_vlb_step=1.06e-5, train/loss_step=0.00175, global_step=113.0]
Epoch 0:  19%|█▉        | 1127/5971 [11:40<50:08,  1.61it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0671, train/loss_vlb_step=0.000224, train/loss_step=0.0671, global_step=113.0] 
Epoch 0:  19%|█▉        | 1128/5971 [11:42<50:14,  1.61it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0671, train/loss_vlb_step=0.000224, train/loss_step=0.0671, global_step=113.0]
Epoch 0:  19%|█▉        | 1128/5971 [11:42<50:14,  1.61it/s, loss=0.163, v_num=0, train/loss_simple_step=0.347, train/loss_vlb_step=0.00181, train/loss_step=0.347, global_step=113.0]   
Epoch 0:  19%|█▉        | 1129/5971 [11:43<50:15,  1.61it/s, loss=0.163, v_num=0, train/loss_simple_step=0.347, train/loss_vlb_step=0.00181, train/loss_step=0.347, global_step=113.0]
Epoch 0:  19%|█▉        | 1129/5971 [11:43<50:15,  1.61it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00525, train/loss_vlb_step=2.67e-5, train/loss_step=0.00525, global_step=114.0]
Epoch 0:  19%|█▉        | 1130/5971 [11:44<50:15,  1.61it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00525, train/loss_vlb_step=2.67e-5, train/loss_step=0.00525, global_step=114.0]
Epoch 0:  19%|█▉        | 1130/5971 [11:44<50:15,  1.61it/s, loss=0.18, v_num=0, train/loss_simple_step=0.350, train/loss_vlb_step=0.0019, train/loss_step=0.350, global_step=114.0]      
Epoch 0:  19%|█▉        | 1131/5971 [11:45<50:16,  1.60it/s, loss=0.18, v_num=0, train/loss_simple_step=0.350, train/loss_vlb_step=0.0019, train/loss_step=0.350, global_step=114.0]
Epoch 0:  19%|█▉        | 1131/5971 [11:45<50:16,  1.60it/s, loss=0.177, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000895, train/loss_step=0.233, global_step=114.0]
Epoch 0:  19%|█▉        | 1132/5971 [11:47<50:21,  1.60it/s, loss=0.177, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000895, train/loss_step=0.233, global_step=114.0]
Epoch 0:  19%|█▉        | 1132/5971 [11:47<50:21,  1.60it/s, loss=0.222, v_num=0, train/loss_simple_step=0.953, train/loss_vlb_step=0.480, train/loss_step=0.953, global_step=114.0]   
Epoch 0:  19%|█▉        | 1133/5971 [11:48<50:22,  1.60it/s, loss=0.222, v_num=0, train/loss_simple_step=0.953, train/loss_vlb_step=0.480, train/loss_step=0.953, global_step=114.0]
Epoch 0:  19%|█▉        | 1133/5971 [11:48<50:22,  1.60it/s, loss=0.227, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.00035, train/loss_step=0.106, global_step=115.0]
Epoch 0:  19%|█▉        | 1134/5971 [11:49<50:22,  1.60it/s, loss=0.227, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.00035, train/loss_step=0.106, global_step=115.0]
Epoch 0:  19%|█▉        | 1134/5971 [11:49<50:22,  1.60it/s, loss=0.224, v_num=0, train/loss_simple_step=0.00533, train/loss_vlb_step=2.76e-5, train/loss_step=0.00533, global_step=115.0]
Epoch 0:  19%|█▉        | 1135/5971 [11:50<50:23,  1.60it/s, loss=0.224, v_num=0, train/loss_simple_step=0.00533, train/loss_vlb_step=2.76e-5, train/loss_step=0.00533, global_step=115.0]
Epoch 0:  19%|█▉        | 1135/5971 [11:50<50:23,  1.60it/s, loss=0.223, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.00169, train/loss_step=0.289, global_step=115.0]    
Epoch 0:  19%|█▉        | 1136/5971 [11:52<50:29,  1.60it/s, loss=0.223, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.00169, train/loss_step=0.289, global_step=115.0]
Epoch 0:  19%|█▉        | 1136/5971 [11:52<50:29,  1.60it/s, loss=0.223, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.000751, train/loss_step=0.220, global_step=115.0]
Epoch 0:  19%|█▉        | 1137/5971 [11:53<50:29,  1.60it/s, loss=0.223, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.000751, train/loss_step=0.220, global_step=115.0]
Epoch 0:  19%|█▉        | 1137/5971 [11:53<50:29,  1.60it/s, loss=0.226, v_num=0, train/loss_simple_step=0.0517, train/loss_vlb_step=0.000178, train/loss_step=0.0517, global_step=116.0]
Epoch 0:  19%|█▉        | 1138/5971 [11:54<50:30,  1.60it/s, loss=0.226, v_num=0, train/loss_simple_step=0.0517, train/loss_vlb_step=0.000178, train/loss_step=0.0517, global_step=116.0]
Epoch 0:  19%|█▉        | 1138/5971 [11:54<50:30,  1.60it/s, loss=0.202, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000584, train/loss_step=0.177, global_step=116.0]  
Epoch 0:  19%|█▉        | 1139/5971 [11:54<50:30,  1.59it/s, loss=0.202, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000584, train/loss_step=0.177, global_step=116.0]
Epoch 0:  19%|█▉        | 1139/5971 [11:54<50:30,  1.59it/s, loss=0.206, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.0011, train/loss_step=0.268, global_step=116.0]  
Epoch 0:  19%|█▉        | 1140/5971 [11:58<50:40,  1.59it/s, loss=0.206, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.0011, train/loss_step=0.268, global_step=116.0]
Epoch 0:  19%|█▉        | 1140/5971 [11:58<50:40,  1.59it/s, loss=0.205, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000599, train/loss_step=0.179, global_step=116.0]
Epoch 0:  19%|█▉        | 1141/5971 [11:58<50:40,  1.59it/s, loss=0.205, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000599, train/loss_step=0.179, global_step=116.0]
Epoch 0:  19%|█▉        | 1141/5971 [11:58<50:40,  1.59it/s, loss=0.208, v_num=0, train/loss_simple_step=0.567, train/loss_vlb_step=0.00573, train/loss_step=0.567, global_step=117.0] 
Epoch 0:  19%|█▉        | 1142/5971 [11:59<50:41,  1.59it/s, loss=0.208, v_num=0, train/loss_simple_step=0.567, train/loss_vlb_step=0.00573, train/loss_step=0.567, global_step=117.0]
Epoch 0:  19%|█▉        | 1142/5971 [11:59<50:41,  1.59it/s, loss=0.208, v_num=0, train/loss_simple_step=0.044, train/loss_vlb_step=0.000151, train/loss_step=0.044, global_step=117.0]
Epoch 0:  19%|█▉        | 1143/5971 [12:00<50:41,  1.59it/s, loss=0.208, v_num=0, train/loss_simple_step=0.044, train/loss_vlb_step=0.000151, train/loss_step=0.044, global_step=117.0]
Epoch 0:  19%|█▉        | 1143/5971 [12:00<50:41,  1.59it/s, loss=0.212, v_num=0, train/loss_simple_step=0.0862, train/loss_vlb_step=0.000286, train/loss_step=0.0862, global_step=117.0]
Epoch 0:  19%|█▉        | 1144/5971 [12:03<50:49,  1.58it/s, loss=0.212, v_num=0, train/loss_simple_step=0.0862, train/loss_vlb_step=0.000286, train/loss_step=0.0862, global_step=117.0]
Epoch 0:  19%|█▉        | 1144/5971 [12:03<50:49,  1.58it/s, loss=0.22, v_num=0, train/loss_simple_step=0.261, train/loss_vlb_step=0.00119, train/loss_step=0.261, global_step=117.0]    
Epoch 0:  19%|█▉        | 1145/5971 [12:04<50:50,  1.58it/s, loss=0.22, v_num=0, train/loss_simple_step=0.261, train/loss_vlb_step=0.00119, train/loss_step=0.261, global_step=117.0]
Epoch 0:  19%|█▉        | 1145/5971 [12:04<50:50,  1.58it/s, loss=0.211, v_num=0, train/loss_simple_step=0.00334, train/loss_vlb_step=1.89e-5, train/loss_step=0.00334, global_step=118.0]
Epoch 0:  19%|█▉        | 1146/5971 [12:05<50:51,  1.58it/s, loss=0.211, v_num=0, train/loss_simple_step=0.00334, train/loss_vlb_step=1.89e-5, train/loss_step=0.00334, global_step=118.0]
Epoch 0:  19%|█▉        | 1146/5971 [12:05<50:51,  1.58it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=6.07e-5, train/loss_step=0.0135, global_step=118.0]  
Epoch 0:  19%|█▉        | 1147/5971 [12:06<50:51,  1.58it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=6.07e-5, train/loss_step=0.0135, global_step=118.0]
Epoch 0:  19%|█▉        | 1147/5971 [12:06<50:51,  1.58it/s, loss=0.216, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000521, train/loss_step=0.158, global_step=118.0] 
Epoch 0:  19%|█▉        | 1148/5971 [12:08<50:57,  1.58it/s, loss=0.216, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000521, train/loss_step=0.158, global_step=118.0]
Epoch 0:  19%|█▉        | 1148/5971 [12:08<50:57,  1.58it/s, loss=0.214, v_num=0, train/loss_simple_step=0.306, train/loss_vlb_step=0.00115, train/loss_step=0.306, global_step=118.0] 
Epoch 0:  19%|█▉        | 1149/5971 [12:09<50:58,  1.58it/s, loss=0.214, v_num=0, train/loss_simple_step=0.306, train/loss_vlb_step=0.00115, train/loss_step=0.306, global_step=118.0]
Epoch 0:  19%|█▉        | 1149/5971 [12:09<50:58,  1.58it/s, loss=0.245, v_num=0, train/loss_simple_step=0.635, train/loss_vlb_step=0.006, train/loss_step=0.635, global_step=119.0]  
Epoch 0:  19%|█▉        | 1150/5971 [12:10<50:58,  1.58it/s, loss=0.245, v_num=0, train/loss_simple_step=0.635, train/loss_vlb_step=0.006, train/loss_step=0.635, global_step=119.0]
Epoch 0:  19%|█▉        | 1150/5971 [12:10<50:58,  1.58it/s, loss=0.23, v_num=0, train/loss_simple_step=0.036, train/loss_vlb_step=0.000135, train/loss_step=0.036, global_step=119.0]
Epoch 0:  19%|█▉        | 1151/5971 [12:11<50:59,  1.58it/s, loss=0.23, v_num=0, train/loss_simple_step=0.036, train/loss_vlb_step=0.000135, train/loss_step=0.036, global_step=119.0]
Epoch 0:  19%|█▉        | 1151/5971 [12:11<50:59,  1.58it/s, loss=0.234, v_num=0, train/loss_simple_step=0.311, train/loss_vlb_step=0.00186, train/loss_step=0.311, global_step=119.0]
Epoch 0:  19%|█▉        | 1152/5971 [12:13<51:04,  1.57it/s, loss=0.234, v_num=0, train/loss_simple_step=0.311, train/loss_vlb_step=0.00186, train/loss_step=0.311, global_step=119.0]
Epoch 0:  19%|█▉        | 1152/5971 [12:13<51:04,  1.57it/s, loss=0.197, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000879, train/loss_step=0.218, global_step=119.0]
Epoch 0:  19%|█▉        | 1153/5971 [12:14<51:05,  1.57it/s, loss=0.197, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000879, train/loss_step=0.218, global_step=119.0]
Epoch 0:  19%|█▉        | 1153/5971 [12:14<51:05,  1.57it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00504, train/loss_vlb_step=2.66e-5, train/loss_step=0.00504, global_step=120.0]
Epoch 0:  19%|█▉        | 1154/5971 [12:15<51:05,  1.57it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00504, train/loss_vlb_step=2.66e-5, train/loss_step=0.00504, global_step=120.0]
Epoch 0:  19%|█▉        | 1154/5971 [12:15<51:05,  1.57it/s, loss=0.2, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000594, train/loss_step=0.174, global_step=120.0]     
Epoch 0:  19%|█▉        | 1155/5971 [12:16<51:06,  1.57it/s, loss=0.2, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000594, train/loss_step=0.174, global_step=120.0]
Epoch 0:  19%|█▉        | 1155/5971 [12:16<51:06,  1.57it/s, loss=0.202, v_num=0, train/loss_simple_step=0.328, train/loss_vlb_step=0.00137, train/loss_step=0.328, global_step=120.0]
Epoch 0:  19%|█▉        | 1156/5971 [12:18<51:11,  1.57it/s, loss=0.202, v_num=0, train/loss_simple_step=0.328, train/loss_vlb_step=0.00137, train/loss_step=0.328, global_step=120.0]
Epoch 0:  19%|█▉        | 1156/5971 [12:18<51:11,  1.57it/s, loss=0.209, v_num=0, train/loss_simple_step=0.351, train/loss_vlb_step=0.00141, train/loss_step=0.351, global_step=120.0]
Epoch 0:  19%|█▉        | 1157/5971 [12:19<51:12,  1.57it/s, loss=0.209, v_num=0, train/loss_simple_step=0.351, train/loss_vlb_step=0.00141, train/loss_step=0.351, global_step=120.0]
Epoch 0:  19%|█▉        | 1157/5971 [12:19<51:12,  1.57it/s, loss=0.206, v_num=0, train/loss_simple_step=0.00423, train/loss_vlb_step=2.25e-5, train/loss_step=0.00423, global_step=121.0]
Epoch 0:  19%|█▉        | 1158/5971 [12:19<51:12,  1.57it/s, loss=0.206, v_num=0, train/loss_simple_step=0.00423, train/loss_vlb_step=2.25e-5, train/loss_step=0.00423, global_step=121.0]
Epoch 0:  19%|█▉        | 1158/5971 [12:19<51:12,  1.57it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=6.97e-5, train/loss_step=0.0156, global_step=121.0]  
Epoch 0:  19%|█▉        | 1159/5971 [12:20<51:12,  1.57it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=6.97e-5, train/loss_step=0.0156, global_step=121.0]
Epoch 0:  19%|█▉        | 1159/5971 [12:20<51:12,  1.57it/s, loss=0.196, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000871, train/loss_step=0.233, global_step=121.0] 
Epoch 0:  19%|█▉        | 1160/5971 [12:23<51:22,  1.56it/s, loss=0.196, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000871, train/loss_step=0.233, global_step=121.0]
Epoch 0:  19%|█▉        | 1160/5971 [12:23<51:22,  1.56it/s, loss=0.193, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000354, train/loss_step=0.108, global_step=121.0]
Epoch 0:  19%|█▉        | 1161/5971 [12:24<51:23,  1.56it/s, loss=0.193, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000354, train/loss_step=0.108, global_step=121.0]
Epoch 0:  19%|█▉        | 1161/5971 [12:24<51:23,  1.56it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00823, train/loss_vlb_step=3.81e-5, train/loss_step=0.00823, global_step=122.0]
Epoch 0:  19%|█▉        | 1162/5971 [12:25<51:23,  1.56it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00823, train/loss_vlb_step=3.81e-5, train/loss_step=0.00823, global_step=122.0]
Epoch 0:  19%|█▉        | 1162/5971 [12:25<51:23,  1.56it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00297, train/loss_vlb_step=1.66e-5, train/loss_step=0.00297, global_step=122.0]
Epoch 0:  19%|█▉        | 1163/5971 [12:26<51:24,  1.56it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00297, train/loss_vlb_step=1.66e-5, train/loss_step=0.00297, global_step=122.0]
Epoch 0:  19%|█▉        | 1163/5971 [12:26<51:24,  1.56it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0539, train/loss_vlb_step=0.000189, train/loss_step=0.0539, global_step=122.0] 
Epoch 0:  19%|█▉        | 1164/5971 [12:28<51:30,  1.56it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0539, train/loss_vlb_step=0.000189, train/loss_step=0.0539, global_step=122.0]
Epoch 0:  19%|█▉        | 1164/5971 [12:28<51:30,  1.56it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=5.25e-5, train/loss_step=0.0116, global_step=122.0] 
Epoch 0:  20%|█▉        | 1165/5971 [12:29<51:30,  1.56it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=5.25e-5, train/loss_step=0.0116, global_step=122.0]
Epoch 0:  20%|█▉        | 1165/5971 [12:29<51:30,  1.56it/s, loss=0.184, v_num=0, train/loss_simple_step=0.716, train/loss_vlb_step=0.0161, train/loss_step=0.716, global_step=123.0]   
Epoch 0:  20%|█▉        | 1166/5971 [12:30<51:30,  1.55it/s, loss=0.184, v_num=0, train/loss_simple_step=0.716, train/loss_vlb_step=0.0161, train/loss_step=0.716, global_step=123.0]
Epoch 0:  20%|█▉        | 1166/5971 [12:30<51:30,  1.55it/s, loss=0.194, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000729, train/loss_step=0.211, global_step=123.0]
Epoch 0:  20%|█▉        | 1167/5971 [12:31<51:32,  1.55it/s, loss=0.194, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000729, train/loss_step=0.211, global_step=123.0]
Epoch 0:  20%|█▉        | 1167/5971 [12:31<51:32,  1.55it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0472, train/loss_vlb_step=0.000174, train/loss_step=0.0472, global_step=123.0]
Epoch 0:  20%|█▉        | 1168/5971 [12:34<51:38,  1.55it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0472, train/loss_vlb_step=0.000174, train/loss_step=0.0472, global_step=123.0]
Epoch 0:  20%|█▉        | 1168/5971 [12:34<51:38,  1.55it/s, loss=0.186, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.000925, train/loss_step=0.243, global_step=123.0]  
Epoch 0:  20%|█▉        | 1169/5971 [12:35<51:38,  1.55it/s, loss=0.186, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.000925, train/loss_step=0.243, global_step=123.0]
Epoch 0:  20%|█▉        | 1169/5971 [12:35<51:38,  1.55it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00273, train/loss_vlb_step=1.55e-5, train/loss_step=0.00273, global_step=124.0]
Epoch 0:  20%|█▉        | 1170/5971 [12:35<51:39,  1.55it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00273, train/loss_vlb_step=1.55e-5, train/loss_step=0.00273, global_step=124.0]
Epoch 0:  20%|█▉        | 1170/5971 [12:35<51:39,  1.55it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00271, train/loss_vlb_step=1.54e-5, train/loss_step=0.00271, global_step=124.0]
Epoch 0:  20%|█▉        | 1171/5971 [12:36<51:39,  1.55it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00271, train/loss_vlb_step=1.54e-5, train/loss_step=0.00271, global_step=124.0]
Epoch 0:  20%|█▉        | 1171/5971 [12:36<51:39,  1.55it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00615, train/loss_vlb_step=3.17e-5, train/loss_step=0.00615, global_step=124.0]
Epoch 0:  20%|█▉        | 1172/5971 [12:39<51:47,  1.54it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00615, train/loss_vlb_step=3.17e-5, train/loss_step=0.00615, global_step=124.0]
Epoch 0:  20%|█▉        | 1172/5971 [12:39<51:47,  1.54it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]    

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:13,  2.27it/s][A
Epoch 0:  20%|█▉        | 1174/5971 [12:40<51:43,  1.55it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:   1%|          | 2/167 [00:00<00:50,  3.28it/s][A
Epoch 0:  20%|█▉        | 1176/5971 [12:40<51:37,  1.55it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:   3%|▎         | 5/167 [00:00<00:18,  8.87it/s][A
Epoch 0:  20%|█▉        | 1180/5971 [12:40<51:25,  1.55it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:   5%|▌         | 9/167 [00:00<00:10, 15.05it/s][A
Epoch 0:  20%|█▉        | 1184/5971 [12:40<51:12,  1.56it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:   7%|▋         | 12/167 [00:00<00:08, 18.41it/s][A

Validating:   9%|▉         | 15/167 [00:01<00:07, 19.92it/s][A
Epoch 0:  20%|█▉        | 1188/5971 [12:40<51:00,  1.56it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  11%|█▏        | 19/167 [00:01<00:06, 23.22it/s][A
Epoch 0:  20%|█▉        | 1192/5971 [12:40<50:48,  1.57it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  13%|█▎        | 22/167 [00:01<00:06, 24.12it/s][A
Epoch 0:  20%|██        | 1196/5971 [12:41<50:36,  1.57it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  15%|█▍        | 25/167 [00:01<00:05, 24.54it/s][A
Epoch 0:  20%|██        | 1200/5971 [12:41<50:23,  1.58it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  17%|█▋        | 28/167 [00:01<00:05, 25.20it/s][A

Validating:  19%|█▊        | 31/167 [00:01<00:05, 25.72it/s][A
Epoch 0:  20%|██        | 1204/5971 [12:41<50:11,  1.58it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  20%|██        | 34/167 [00:01<00:05, 26.15it/s][A
Epoch 0:  20%|██        | 1208/5971 [12:41<50:00,  1.59it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  22%|██▏       | 37/167 [00:01<00:05, 25.91it/s][A
Epoch 0:  20%|██        | 1212/5971 [12:41<49:48,  1.59it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  24%|██▍       | 40/167 [00:02<00:05, 24.76it/s][A

Validating:  26%|██▌       | 43/167 [00:02<00:05, 24.24it/s][A
Epoch 0:  20%|██        | 1216/5971 [12:41<49:36,  1.60it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  28%|██▊       | 46/167 [00:02<00:04, 25.53it/s][A
Epoch 0:  20%|██        | 1220/5971 [12:42<49:25,  1.60it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  29%|██▉       | 49/167 [00:02<00:04, 24.88it/s][A
Epoch 0:  20%|██        | 1224/5971 [12:42<49:13,  1.61it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  31%|███       | 52/167 [00:02<00:04, 24.90it/s][A

Validating:  33%|███▎      | 55/167 [00:02<00:07, 14.03it/s][A
Epoch 0:  21%|██        | 1228/5971 [12:42<49:03,  1.61it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  35%|███▍      | 58/167 [00:03<00:06, 16.17it/s][A
Epoch 0:  21%|██        | 1232/5971 [12:42<48:51,  1.62it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  37%|███▋      | 61/167 [00:03<00:05, 18.60it/s][A
Epoch 0:  21%|██        | 1236/5971 [12:42<48:40,  1.62it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  38%|███▊      | 64/167 [00:03<00:05, 20.21it/s][A

Validating:  40%|████      | 67/167 [00:03<00:04, 21.77it/s][A
Epoch 0:  21%|██        | 1240/5971 [12:43<48:29,  1.63it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  42%|████▏     | 70/167 [00:03<00:04, 23.69it/s][A
Epoch 0:  21%|██        | 1244/5971 [12:43<48:17,  1.63it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 24.69it/s][A
Epoch 0:  21%|██        | 1248/5971 [12:43<48:06,  1.64it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 25.55it/s][A

Validating:  47%|████▋     | 79/167 [00:03<00:03, 26.50it/s][A
Epoch 0:  21%|██        | 1252/5971 [12:43<47:55,  1.64it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  49%|████▉     | 82/167 [00:03<00:03, 26.66it/s][A
Epoch 0:  21%|██        | 1256/5971 [12:43<47:44,  1.65it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  51%|█████     | 85/167 [00:04<00:03, 25.84it/s][A
Epoch 0:  21%|██        | 1260/5971 [12:43<47:33,  1.65it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  53%|█████▎    | 88/167 [00:04<00:03, 25.11it/s][A

Validating:  54%|█████▍    | 91/167 [00:04<00:02, 25.37it/s][A
Epoch 0:  21%|██        | 1264/5971 [12:44<47:22,  1.66it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 26.29it/s][A
Epoch 0:  21%|██        | 1268/5971 [12:44<47:11,  1.66it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 26.01it/s][A
Epoch 0:  21%|██▏       | 1272/5971 [12:44<47:01,  1.67it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 26.49it/s][A

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 27.24it/s][A
Epoch 0:  21%|██▏       | 1276/5971 [12:44<46:50,  1.67it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 27.30it/s][A
Epoch 0:  21%|██▏       | 1280/5971 [12:44<46:39,  1.68it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 27.06it/s][A
Epoch 0:  22%|██▏       | 1284/5971 [12:44<46:29,  1.68it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  67%|██████▋   | 112/167 [00:05<00:02, 26.81it/s][A

Validating:  69%|██████▉   | 115/167 [00:05<00:01, 26.10it/s][A
Epoch 0:  22%|██▏       | 1288/5971 [12:44<46:18,  1.69it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  71%|███████   | 118/167 [00:05<00:01, 26.62it/s][A
Epoch 0:  22%|██▏       | 1292/5971 [12:45<46:08,  1.69it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 26.96it/s][A
Epoch 0:  22%|██▏       | 1296/5971 [12:45<45:58,  1.70it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 27.66it/s][A

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 27.75it/s][A
Epoch 0:  22%|██▏       | 1300/5971 [12:45<45:47,  1.70it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 26.23it/s][A
Epoch 0:  22%|██▏       | 1304/5971 [12:45<45:37,  1.70it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 26.94it/s][A
Epoch 0:  22%|██▏       | 1308/5971 [12:45<45:27,  1.71it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  81%|████████▏ | 136/167 [00:06<00:01, 25.88it/s][A

Validating:  83%|████████▎ | 139/167 [00:06<00:01, 26.65it/s][A
Epoch 0:  22%|██▏       | 1312/5971 [12:45<45:17,  1.71it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  85%|████████▌ | 142/167 [00:06<00:00, 27.01it/s][A
Epoch 0:  22%|██▏       | 1316/5971 [12:45<45:07,  1.72it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 28.38it/s][A
Epoch 0:  22%|██▏       | 1320/5971 [12:46<44:57,  1.72it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 28.61it/s][A
Epoch 0:  22%|██▏       | 1324/5971 [12:46<44:47,  1.73it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 28.55it/s][A

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 28.39it/s][A
Epoch 0:  22%|██▏       | 1328/5971 [12:46<44:37,  1.73it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 27.79it/s][A
Epoch 0:  22%|██▏       | 1332/5971 [12:46<44:27,  1.74it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 28.37it/s][A
Epoch 0:  22%|██▏       | 1336/5971 [12:46<44:17,  1.74it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Validating:  99%|█████████▉| 165/167 [00:07<00:00, 27.89it/s][A
Epoch 0:  22%|██▏       | 1340/5971 [12:46<44:08,  1.75it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]
Epoch 0:  22%|██▏       | 1340/5971 [12:47<44:08,  1.75it/s, loss=0.139, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00087, train/loss_step=0.253, global_step=124.0]

Epoch 0:  22%|██▏       | 1341/5971 [12:47<44:09,  1.75it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00946, train/loss_vlb_step=4.38e-5, train/loss_step=0.00946, global_step=125.0]
Epoch 0:  22%|██▏       | 1342/5971 [12:48<44:10,  1.75it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00701, train/loss_vlb_step=3.38e-5, train/loss_step=0.00701, global_step=125.0]
Epoch 0:  22%|██▏       | 1343/5971 [12:49<44:10,  1.75it/s, loss=0.131, v_num=0, train/loss_simple_step=0.341, train/loss_vlb_step=0.00175, train/loss_step=0.341, global_step=125.0]    
Epoch 0:  23%|██▎       | 1344/5971 [12:52<44:16,  1.74it/s, loss=0.131, v_num=0, train/loss_simple_step=0.341, train/loss_vlb_step=0.00175, train/loss_step=0.341, global_step=125.0]
Epoch 0:  23%|██▎       | 1344/5971 [12:52<44:16,  1.74it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0546, train/loss_vlb_step=0.000191, train/loss_step=0.0546, global_step=125.0]
Epoch 0:  23%|██▎       | 1345/5971 [12:53<44:17,  1.74it/s, loss=0.131, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.00123, train/loss_step=0.288, global_step=126.0]   
Epoch 0:  23%|██▎       | 1346/5971 [12:54<44:17,  1.74it/s, loss=0.156, v_num=0, train/loss_simple_step=0.525, train/loss_vlb_step=0.00477, train/loss_step=0.525, global_step=126.0]
Epoch 0:  23%|██▎       | 1347/5971 [12:54<44:18,  1.74it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00843, train/loss_vlb_step=4.09e-5, train/loss_step=0.00843, global_step=126.0]
Epoch 0:  23%|██▎       | 1348/5971 [12:58<44:26,  1.73it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00843, train/loss_vlb_step=4.09e-5, train/loss_step=0.00843, global_step=126.0]
Epoch 0:  23%|██▎       | 1348/5971 [12:58<44:26,  1.73it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0529, train/loss_vlb_step=0.000184, train/loss_step=0.0529, global_step=126.0] 
Epoch 0:  23%|██▎       | 1349/5971 [12:58<44:26,  1.73it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0536, train/loss_vlb_step=0.000185, train/loss_step=0.0536, global_step=127.0]
Epoch 0:  23%|██▎       | 1350/5971 [12:59<44:27,  1.73it/s, loss=0.16, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00177, train/loss_step=0.305, global_step=127.0]    
Epoch 0:  23%|██▎       | 1351/5971 [13:00<44:27,  1.73it/s, loss=0.184, v_num=0, train/loss_simple_step=0.540, train/loss_vlb_step=0.00688, train/loss_step=0.540, global_step=127.0]
Epoch 0:  23%|██▎       | 1352/5971 [13:03<44:33,  1.73it/s, loss=0.184, v_num=0, train/loss_simple_step=0.540, train/loss_vlb_step=0.00688, train/loss_step=0.540, global_step=127.0]
Epoch 0:  23%|██▎       | 1352/5971 [13:03<44:33,  1.73it/s, loss=0.194, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000882, train/loss_step=0.211, global_step=127.0]
Epoch 0:  23%|██▎       | 1353/5971 [13:03<44:33,  1.73it/s, loss=0.176, v_num=0, train/loss_simple_step=0.366, train/loss_vlb_step=0.00187, train/loss_step=0.366, global_step=128.0] 
Epoch 0:  23%|██▎       | 1354/5971 [13:04<44:34,  1.73it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00241, train/loss_vlb_step=1.41e-5, train/loss_step=0.00241, global_step=128.0]
Epoch 0:  23%|██▎       | 1355/5971 [13:05<44:34,  1.73it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0845, train/loss_vlb_step=0.000279, train/loss_step=0.0845, global_step=128.0] 
Epoch 0:  23%|██▎       | 1356/5971 [13:07<44:39,  1.72it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0845, train/loss_vlb_step=0.000279, train/loss_step=0.0845, global_step=128.0]
Epoch 0:  23%|██▎       | 1356/5971 [13:07<44:39,  1.72it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0652, train/loss_vlb_step=0.000223, train/loss_step=0.0652, global_step=128.0]
Epoch 0:  23%|██▎       | 1357/5971 [13:08<44:40,  1.72it/s, loss=0.166, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000476, train/loss_step=0.140, global_step=129.0]  
Epoch 0:  23%|██▎       | 1358/5971 [13:09<44:40,  1.72it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0389, train/loss_vlb_step=0.000141, train/loss_step=0.0389, global_step=129.0]
Epoch 0:  23%|██▎       | 1359/5971 [13:10<44:41,  1.72it/s, loss=0.199, v_num=0, train/loss_simple_step=0.629, train/loss_vlb_step=0.00852, train/loss_step=0.629, global_step=129.0]   
Epoch 0:  23%|██▎       | 1360/5971 [13:12<44:46,  1.72it/s, loss=0.199, v_num=0, train/loss_simple_step=0.629, train/loss_vlb_step=0.00852, train/loss_step=0.629, global_step=129.0]
Epoch 0:  23%|██▎       | 1360/5971 [13:12<44:46,  1.72it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0174, train/loss_vlb_step=7.38e-5, train/loss_step=0.0174, global_step=129.0]
Epoch 0:  23%|██▎       | 1361/5971 [13:13<44:46,  1.72it/s, loss=0.187, v_num=0, train/loss_simple_step=0.00945, train/loss_vlb_step=4.67e-5, train/loss_step=0.00945, global_step=130.0]
Epoch 0:  23%|██▎       | 1362/5971 [13:14<44:47,  1.72it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0755, train/loss_vlb_step=0.000257, train/loss_step=0.0755, global_step=130.0]  
Epoch 0:  23%|██▎       | 1363/5971 [13:15<44:47,  1.71it/s, loss=0.199, v_num=0, train/loss_simple_step=0.519, train/loss_vlb_step=0.00309, train/loss_step=0.519, global_step=130.0]  
Epoch 0:  23%|██▎       | 1364/5971 [13:18<44:53,  1.71it/s, loss=0.199, v_num=0, train/loss_simple_step=0.519, train/loss_vlb_step=0.00309, train/loss_step=0.519, global_step=130.0]
Epoch 0:  23%|██▎       | 1364/5971 [13:18<44:53,  1.71it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0889, train/loss_vlb_step=0.000294, train/loss_step=0.0889, global_step=130.0]
Epoch 0:  23%|██▎       | 1365/5971 [13:19<44:54,  1.71it/s, loss=0.189, v_num=0, train/loss_simple_step=0.056, train/loss_vlb_step=0.00019, train/loss_step=0.056, global_step=131.0]   
Epoch 0:  23%|██▎       | 1366/5971 [13:20<44:55,  1.71it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00254, train/loss_vlb_step=1.48e-5, train/loss_step=0.00254, global_step=131.0]
Epoch 0:  23%|██▎       | 1367/5971 [13:20<44:55,  1.71it/s, loss=0.191, v_num=0, train/loss_simple_step=0.566, train/loss_vlb_step=0.00766, train/loss_step=0.566, global_step=131.0]    
Epoch 0:  23%|██▎       | 1368/5971 [13:23<45:02,  1.70it/s, loss=0.191, v_num=0, train/loss_simple_step=0.566, train/loss_vlb_step=0.00766, train/loss_step=0.566, global_step=131.0]
Epoch 0:  23%|██▎       | 1368/5971 [13:23<45:02,  1.70it/s, loss=0.197, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000525, train/loss_step=0.160, global_step=131.0]
Epoch 0:  23%|██▎       | 1369/5971 [13:24<45:03,  1.70it/s, loss=0.194, v_num=0, train/loss_simple_step=0.00812, train/loss_vlb_step=3.92e-5, train/loss_step=0.00812, global_step=132.0]
Epoch 0:  23%|██▎       | 1370/5971 [13:25<45:03,  1.70it/s, loss=0.193, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.00122, train/loss_step=0.288, global_step=132.0]    
Epoch 0:  23%|██▎       | 1371/5971 [13:26<45:04,  1.70it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0257, train/loss_vlb_step=0.0001, train/loss_step=0.0257, global_step=132.0]
Epoch 0:  23%|██▎       | 1372/5971 [13:29<45:10,  1.70it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0257, train/loss_vlb_step=0.0001, train/loss_step=0.0257, global_step=132.0]
Epoch 0:  23%|██▎       | 1372/5971 [13:29<45:10,  1.70it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00588, train/loss_vlb_step=2.93e-5, train/loss_step=0.00588, global_step=132.0]
Epoch 0:  23%|██▎       | 1373/5971 [13:30<45:10,  1.70it/s, loss=0.15, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000789, train/loss_step=0.214, global_step=133.0]    
Epoch 0:  23%|██▎       | 1374/5971 [13:30<45:11,  1.70it/s, loss=0.164, v_num=0, train/loss_simple_step=0.277, train/loss_vlb_step=0.00113, train/loss_step=0.277, global_step=133.0]
Epoch 0:  23%|██▎       | 1375/5971 [13:31<45:11,  1.70it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0291, train/loss_vlb_step=0.000111, train/loss_step=0.0291, global_step=133.0]
Epoch 0:  23%|██▎       | 1376/5971 [13:34<45:16,  1.69it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0291, train/loss_vlb_step=0.000111, train/loss_step=0.0291, global_step=133.0]
Epoch 0:  23%|██▎       | 1376/5971 [13:34<45:16,  1.69it/s, loss=0.179, v_num=0, train/loss_simple_step=0.433, train/loss_vlb_step=0.00239, train/loss_step=0.433, global_step=133.0]   
Epoch 0:  23%|██▎       | 1377/5971 [13:34<45:16,  1.69it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0083, train/loss_vlb_step=3.81e-5, train/loss_step=0.0083, global_step=134.0]
Epoch 0:  23%|██▎       | 1378/5971 [13:35<45:17,  1.69it/s, loss=0.173, v_num=0, train/loss_simple_step=0.039, train/loss_vlb_step=0.000136, train/loss_step=0.039, global_step=134.0] 
Epoch 0:  23%|██▎       | 1379/5971 [13:36<45:17,  1.69it/s, loss=0.151, v_num=0, train/loss_simple_step=0.199, train/loss_vlb_step=0.000717, train/loss_step=0.199, global_step=134.0]
Epoch 0:  23%|██▎       | 1380/5971 [13:39<45:23,  1.69it/s, loss=0.151, v_num=0, train/loss_simple_step=0.199, train/loss_vlb_step=0.000717, train/loss_step=0.199, global_step=134.0]
Epoch 0:  23%|██▎       | 1380/5971 [13:39<45:23,  1.69it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0333, train/loss_vlb_step=0.000128, train/loss_step=0.0333, global_step=134.0]
Epoch 0:  23%|██▎       | 1381/5971 [13:40<45:24,  1.68it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00599, train/loss_vlb_step=3.07e-5, train/loss_step=0.00599, global_step=135.0]
Epoch 0:  23%|██▎       | 1382/5971 [13:41<45:24,  1.68it/s, loss=0.164, v_num=0, train/loss_simple_step=0.312, train/loss_vlb_step=0.00136, train/loss_step=0.312, global_step=135.0]    
Epoch 0:  23%|██▎       | 1383/5971 [13:42<45:24,  1.68it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0207, train/loss_vlb_step=8.42e-5, train/loss_step=0.0207, global_step=135.0]
Epoch 0:  23%|██▎       | 1384/5971 [13:44<45:29,  1.68it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0207, train/loss_vlb_step=8.42e-5, train/loss_step=0.0207, global_step=135.0]
Epoch 0:  23%|██▎       | 1384/5971 [13:44<45:29,  1.68it/s, loss=0.135, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.53e-5, train/loss_step=0.018, global_step=135.0]  
Epoch 0:  23%|██▎       | 1385/5971 [13:45<45:29,  1.68it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0244, train/loss_vlb_step=9.96e-5, train/loss_step=0.0244, global_step=136.0]
Epoch 0:  23%|██▎       | 1386/5971 [13:45<45:30,  1.68it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00713, train/loss_vlb_step=3.64e-5, train/loss_step=0.00713, global_step=136.0]
Epoch 0:  23%|██▎       | 1387/5971 [13:46<45:30,  1.68it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0866, train/loss_vlb_step=0.000293, train/loss_step=0.0866, global_step=136.0]  
Epoch 0:  23%|██▎       | 1388/5971 [13:49<45:35,  1.68it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0866, train/loss_vlb_step=0.000293, train/loss_step=0.0866, global_step=136.0]
Epoch 0:  23%|██▎       | 1388/5971 [13:49<45:35,  1.68it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=6.15e-5, train/loss_step=0.0142, global_step=136.0]
Epoch 0:  23%|██▎       | 1389/5971 [13:49<45:35,  1.67it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00305, train/loss_vlb_step=1.83e-5, train/loss_step=0.00305, global_step=137.0]
Epoch 0:  23%|██▎       | 1390/5971 [13:50<45:36,  1.67it/s, loss=0.0903, v_num=0, train/loss_simple_step=0.0514, train/loss_vlb_step=0.000175, train/loss_step=0.0514, global_step=137.0]
Epoch 0:  23%|██▎       | 1391/5971 [13:51<45:36,  1.67it/s, loss=0.101, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.000857, train/loss_step=0.229, global_step=137.0]   
Epoch 0:  23%|██▎       | 1392/5971 [13:53<45:41,  1.67it/s, loss=0.101, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.000857, train/loss_step=0.229, global_step=137.0]
Epoch 0:  23%|██▎       | 1392/5971 [13:53<45:41,  1.67it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0206, train/loss_vlb_step=8.66e-5, train/loss_step=0.0206, global_step=137.0]
Epoch 0:  23%|██▎       | 1393/5971 [13:54<45:41,  1.67it/s, loss=0.0919, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=0.000111, train/loss_step=0.0272, global_step=138.0]
Epoch 0:  23%|██▎       | 1394/5971 [13:55<45:42,  1.67it/s, loss=0.0842, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000403, train/loss_step=0.123, global_step=138.0]  
Epoch 0:  23%|██▎       | 1395/5971 [13:56<45:42,  1.67it/s, loss=0.0831, v_num=0, train/loss_simple_step=0.00697, train/loss_vlb_step=3.51e-5, train/loss_step=0.00697, global_step=138.0]
Epoch 0:  23%|██▎       | 1396/5971 [13:58<45:46,  1.67it/s, loss=0.0831, v_num=0, train/loss_simple_step=0.00697, train/loss_vlb_step=3.51e-5, train/loss_step=0.00697, global_step=138.0]
Epoch 0:  23%|██▎       | 1396/5971 [13:58<45:46,  1.67it/s, loss=0.0622, v_num=0, train/loss_simple_step=0.0134, train/loss_vlb_step=5.98e-5, train/loss_step=0.0134, global_step=138.0]  
Epoch 0:  23%|██▎       | 1397/5971 [13:59<45:47,  1.66it/s, loss=0.0622, v_num=0, train/loss_simple_step=0.00936, train/loss_vlb_step=4.36e-5, train/loss_step=0.00936, global_step=139.0]
Epoch 0:  23%|██▎       | 1398/5971 [14:00<45:47,  1.66it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.0583, train/loss_vlb_step=0.000199, train/loss_step=0.0583, global_step=139.0] 
Epoch 0:  23%|██▎       | 1399/5971 [14:01<45:47,  1.66it/s, loss=0.0721, v_num=0, train/loss_simple_step=0.378, train/loss_vlb_step=0.00481, train/loss_step=0.378, global_step=139.0]   
Epoch 0:  23%|██▎       | 1400/5971 [14:04<45:56,  1.66it/s, loss=0.0721, v_num=0, train/loss_simple_step=0.378, train/loss_vlb_step=0.00481, train/loss_step=0.378, global_step=139.0]
Epoch 0:  23%|██▎       | 1400/5971 [14:04<45:56,  1.66it/s, loss=0.0707, v_num=0, train/loss_simple_step=0.00539, train/loss_vlb_step=2.95e-5, train/loss_step=0.00539, global_step=139.0]
Epoch 0:  23%|██▎       | 1401/5971 [14:05<45:57,  1.66it/s, loss=0.0718, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=0.000107, train/loss_step=0.0272, global_step=140.0] 
Epoch 0:  23%|██▎       | 1402/5971 [14:06<45:57,  1.66it/s, loss=0.0679, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.000897, train/loss_step=0.234, global_step=140.0]  
Epoch 0:  23%|██▎       | 1403/5971 [14:07<45:57,  1.66it/s, loss=0.0789, v_num=0, train/loss_simple_step=0.241, train/loss_vlb_step=0.000884, train/loss_step=0.241, global_step=140.0]
Epoch 0:  24%|██▎       | 1404/5971 [14:09<46:02,  1.65it/s, loss=0.0789, v_num=0, train/loss_simple_step=0.241, train/loss_vlb_step=0.000884, train/loss_step=0.241, global_step=140.0]
Epoch 0:  24%|██▎       | 1404/5971 [14:09<46:02,  1.65it/s, loss=0.0857, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000512, train/loss_step=0.154, global_step=140.0]
Epoch 0:  24%|██▎       | 1405/5971 [14:10<46:02,  1.65it/s, loss=0.101, v_num=0, train/loss_simple_step=0.331, train/loss_vlb_step=0.0013, train/loss_step=0.331, global_step=141.0]   
Epoch 0:  24%|██▎       | 1406/5971 [14:11<46:02,  1.65it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0187, train/loss_vlb_step=8.13e-5, train/loss_step=0.0187, global_step=141.0]
Epoch 0:  24%|██▎       | 1407/5971 [14:12<46:03,  1.65it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0926, train/loss_vlb_step=0.000307, train/loss_step=0.0926, global_step=141.0]
Epoch 0:  24%|██▎       | 1408/5971 [14:14<46:07,  1.65it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0926, train/loss_vlb_step=0.000307, train/loss_step=0.0926, global_step=141.0]
Epoch 0:  24%|██▎       | 1408/5971 [14:14<46:07,  1.65it/s, loss=0.125, v_num=0, train/loss_simple_step=0.475, train/loss_vlb_step=0.00336, train/loss_step=0.475, global_step=141.0]   
Epoch 0:  24%|██▎       | 1409/5971 [14:15<46:08,  1.65it/s, loss=0.149, v_num=0, train/loss_simple_step=0.491, train/loss_vlb_step=0.0047, train/loss_step=0.491, global_step=142.0] 
Epoch 0:  24%|██▎       | 1410/5971 [14:16<46:08,  1.65it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00601, train/loss_vlb_step=2.97e-5, train/loss_step=0.00601, global_step=142.0]
Epoch 0:  24%|██▎       | 1411/5971 [14:17<46:09,  1.65it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00918, train/loss_vlb_step=4.27e-5, train/loss_step=0.00918, global_step=142.0]
Epoch 0:  24%|██▎       | 1412/5971 [14:19<46:13,  1.64it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00918, train/loss_vlb_step=4.27e-5, train/loss_step=0.00918, global_step=142.0]
Epoch 0:  24%|██▎       | 1412/5971 [14:19<46:13,  1.64it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.000272, train/loss_step=0.0771, global_step=142.0] 
Epoch 0:  24%|██▎       | 1413/5971 [14:20<46:13,  1.64it/s, loss=0.143, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000359, train/loss_step=0.109, global_step=143.0]  
Epoch 0:  24%|██▎       | 1414/5971 [14:21<46:14,  1.64it/s, loss=0.155, v_num=0, train/loss_simple_step=0.369, train/loss_vlb_step=0.00173, train/loss_step=0.369, global_step=143.0] 
Epoch 0:  24%|██▎       | 1415/5971 [14:22<46:14,  1.64it/s, loss=0.17, v_num=0, train/loss_simple_step=0.303, train/loss_vlb_step=0.00131, train/loss_step=0.303, global_step=143.0] 
Epoch 0:  24%|██▎       | 1416/5971 [14:24<46:19,  1.64it/s, loss=0.17, v_num=0, train/loss_simple_step=0.303, train/loss_vlb_step=0.00131, train/loss_step=0.303, global_step=143.0]
Epoch 0:  24%|██▎       | 1416/5971 [14:24<46:19,  1.64it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0597, train/loss_vlb_step=0.000207, train/loss_step=0.0597, global_step=143.0]
Epoch 0:  24%|██▎       | 1417/5971 [14:25<46:19,  1.64it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0247, train/loss_vlb_step=9.74e-5, train/loss_step=0.0247, global_step=144.0] 
Epoch 0:  24%|██▎       | 1418/5971 [14:26<46:19,  1.64it/s, loss=0.2, v_num=0, train/loss_simple_step=0.604, train/loss_vlb_step=0.00628, train/loss_step=0.604, global_step=144.0]    
Epoch 0:  24%|██▍       | 1419/5971 [14:27<46:20,  1.64it/s, loss=0.217, v_num=0, train/loss_simple_step=0.703, train/loss_vlb_step=0.0246, train/loss_step=0.703, global_step=144.0]
Epoch 0:  24%|██▍       | 1420/5971 [14:29<46:25,  1.63it/s, loss=0.217, v_num=0, train/loss_simple_step=0.703, train/loss_vlb_step=0.0246, train/loss_step=0.703, global_step=144.0]
Epoch 0:  24%|██▍       | 1420/5971 [14:29<46:25,  1.63it/s, loss=0.224, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000498, train/loss_step=0.151, global_step=144.0]
Epoch 0:  24%|██▍       | 1421/5971 [14:30<46:25,  1.63it/s, loss=0.223, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.52e-5, train/loss_step=0.0124, global_step=145.0]
Epoch 0:  24%|██▍       | 1422/5971 [14:31<46:25,  1.63it/s, loss=0.212, v_num=0, train/loss_simple_step=0.0182, train/loss_vlb_step=7.93e-5, train/loss_step=0.0182, global_step=145.0]
Epoch 0:  24%|██▍       | 1423/5971 [14:32<46:25,  1.63it/s, loss=0.201, v_num=0, train/loss_simple_step=0.015, train/loss_vlb_step=6.75e-5, train/loss_step=0.015, global_step=145.0]  
Epoch 0:  24%|██▍       | 1424/5971 [14:34<46:30,  1.63it/s, loss=0.201, v_num=0, train/loss_simple_step=0.015, train/loss_vlb_step=6.75e-5, train/loss_step=0.015, global_step=145.0]
Epoch 0:  24%|██▍       | 1424/5971 [14:34<46:30,  1.63it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0213, train/loss_vlb_step=8.71e-5, train/loss_step=0.0213, global_step=145.0]
Epoch 0:  24%|██▍       | 1425/5971 [14:35<46:30,  1.63it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0334, train/loss_vlb_step=0.000135, train/loss_step=0.0334, global_step=146.0]
Epoch 0:  24%|██▍       | 1426/5971 [14:36<46:31,  1.63it/s, loss=0.189, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.00075, train/loss_step=0.202, global_step=146.0]  
Epoch 0:  24%|██▍       | 1427/5971 [14:37<46:31,  1.63it/s, loss=0.192, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000556, train/loss_step=0.162, global_step=146.0]
Epoch 0:  24%|██▍       | 1428/5971 [14:40<46:37,  1.62it/s, loss=0.192, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000556, train/loss_step=0.162, global_step=146.0]
Epoch 0:  24%|██▍       | 1428/5971 [14:40<46:37,  1.62it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0888, train/loss_vlb_step=0.000293, train/loss_step=0.0888, global_step=146.0]
Epoch 0:  24%|██▍       | 1429/5971 [14:41<46:38,  1.62it/s, loss=0.172, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00431, train/loss_step=0.480, global_step=147.0]   
Epoch 0:  24%|██▍       | 1430/5971 [14:42<46:39,  1.62it/s, loss=0.178, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=147.0]
Epoch 0:  24%|██▍       | 1431/5971 [14:42<46:39,  1.62it/s, loss=0.191, v_num=0, train/loss_simple_step=0.265, train/loss_vlb_step=0.00104, train/loss_step=0.265, global_step=147.0] 
Epoch 0:  24%|██▍       | 1432/5971 [14:45<46:44,  1.62it/s, loss=0.191, v_num=0, train/loss_simple_step=0.265, train/loss_vlb_step=0.00104, train/loss_step=0.265, global_step=147.0]
Epoch 0:  24%|██▍       | 1432/5971 [14:45<46:44,  1.62it/s, loss=0.203, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.00123, train/loss_step=0.322, global_step=147.0]
Epoch 0:  24%|██▍       | 1433/5971 [14:46<46:44,  1.62it/s, loss=0.198, v_num=0, train/loss_simple_step=0.00511, train/loss_vlb_step=2.59e-5, train/loss_step=0.00511, global_step=148.0]
Epoch 0:  24%|██▍       | 1434/5971 [14:47<46:44,  1.62it/s, loss=0.188, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000558, train/loss_step=0.161, global_step=148.0]   
Epoch 0:  24%|██▍       | 1435/5971 [14:47<46:44,  1.62it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00297, train/loss_vlb_step=1.63e-5, train/loss_step=0.00297, global_step=148.0]
Epoch 0:  24%|██▍       | 1436/5971 [14:50<46:49,  1.61it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00297, train/loss_vlb_step=1.63e-5, train/loss_step=0.00297, global_step=148.0]
Epoch 0:  24%|██▍       | 1436/5971 [14:50<46:49,  1.61it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0185, train/loss_vlb_step=8.02e-5, train/loss_step=0.0185, global_step=148.0]  
Epoch 0:  24%|██▍       | 1437/5971 [14:51<46:49,  1.61it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00993, train/loss_vlb_step=4.31e-5, train/loss_step=0.00993, global_step=149.0]
Epoch 0:  24%|██▍       | 1438/5971 [14:52<46:50,  1.61it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0332, train/loss_vlb_step=0.000118, train/loss_step=0.0332, global_step=149.0]
Epoch 0:  24%|██▍       | 1439/5971 [14:52<46:50,  1.61it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0029, train/loss_vlb_step=1.68e-5, train/loss_step=0.0029, global_step=149.0] 
Epoch 0:  24%|██▍       | 1440/5971 [14:55<46:55,  1.61it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0029, train/loss_vlb_step=1.68e-5, train/loss_step=0.0029, global_step=149.0]
Epoch 0:  24%|██▍       | 1440/5971 [14:55<46:55,  1.61it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0] 

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:20,  2.06it/s][A

Validating:   2%|▏         | 3/167 [00:00<00:28,  5.78it/s][A
Epoch 0:  24%|██▍       | 1444/5971 [14:56<46:47,  1.61it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:   4%|▎         | 6/167 [00:00<00:15, 10.64it/s][A
Epoch 0:  24%|██▍       | 1448/5971 [14:56<46:37,  1.62it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:   5%|▌         | 9/167 [00:00<00:10, 14.95it/s][A
Epoch 0:  24%|██▍       | 1452/5971 [14:56<46:28,  1.62it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:   7%|▋         | 12/167 [00:01<00:11, 12.97it/s][A

Validating:   9%|▉         | 15/167 [00:01<00:09, 16.32it/s][A
Epoch 0:  24%|██▍       | 1456/5971 [14:56<46:18,  1.62it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  11%|█         | 18/167 [00:01<00:07, 18.78it/s][A
Epoch 0:  24%|██▍       | 1460/5971 [14:56<46:08,  1.63it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  13%|█▎        | 22/167 [00:01<00:06, 22.08it/s][A
Epoch 0:  25%|██▍       | 1464/5971 [14:56<45:59,  1.63it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  15%|█▍        | 25/167 [00:01<00:06, 21.98it/s][A
Epoch 0:  25%|██▍       | 1468/5971 [14:57<45:49,  1.64it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  17%|█▋        | 28/167 [00:01<00:06, 23.03it/s][A

Validating:  19%|█▊        | 31/167 [00:01<00:05, 23.17it/s][A
Epoch 0:  25%|██▍       | 1472/5971 [14:57<45:40,  1.64it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  20%|██        | 34/167 [00:01<00:05, 24.42it/s][A
Epoch 0:  25%|██▍       | 1476/5971 [14:57<45:31,  1.65it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  22%|██▏       | 37/167 [00:02<00:05, 24.70it/s][A
Epoch 0:  25%|██▍       | 1480/5971 [14:57<45:21,  1.65it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  24%|██▍       | 40/167 [00:02<00:05, 24.99it/s][A

Validating:  26%|██▌       | 43/167 [00:02<00:04, 25.67it/s][A
Epoch 0:  25%|██▍       | 1484/5971 [14:57<45:12,  1.65it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 27.33it/s][A
Epoch 0:  25%|██▍       | 1488/5971 [14:57<45:03,  1.66it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  30%|██▉       | 50/167 [00:02<00:06, 19.21it/s][A
Epoch 0:  25%|██▍       | 1492/5971 [14:58<44:54,  1.66it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  32%|███▏      | 53/167 [00:02<00:05, 21.24it/s][A
Epoch 0:  25%|██▌       | 1496/5971 [14:58<44:45,  1.67it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 22.63it/s][A

Validating:  35%|███▌      | 59/167 [00:03<00:04, 23.19it/s][A
Epoch 0:  25%|██▌       | 1500/5971 [14:58<44:36,  1.67it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  37%|███▋      | 62/167 [00:03<00:04, 23.15it/s][A
Epoch 0:  25%|██▌       | 1504/5971 [14:58<44:27,  1.67it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  39%|███▉      | 65/167 [00:03<00:04, 23.60it/s][A
Epoch 0:  25%|██▌       | 1508/5971 [14:58<44:18,  1.68it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  41%|████      | 68/167 [00:03<00:03, 24.99it/s][A

Validating:  43%|████▎     | 71/167 [00:03<00:03, 25.41it/s][A
Epoch 0:  25%|██▌       | 1512/5971 [14:58<44:09,  1.68it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 25.50it/s][A
Epoch 0:  25%|██▌       | 1516/5971 [14:59<44:00,  1.69it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 25.99it/s][A
Epoch 0:  25%|██▌       | 1520/5971 [14:59<43:51,  1.69it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 25.23it/s][A

Validating:  50%|████▉     | 83/167 [00:04<00:03, 24.53it/s][A
Epoch 0:  26%|██▌       | 1524/5971 [14:59<43:42,  1.70it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  51%|█████▏    | 86/167 [00:04<00:03, 23.57it/s][A
Epoch 0:  26%|██▌       | 1528/5971 [14:59<43:34,  1.70it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  53%|█████▎    | 89/167 [00:04<00:03, 23.60it/s][A
Epoch 0:  26%|██▌       | 1532/5971 [14:59<43:25,  1.70it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  55%|█████▌    | 92/167 [00:04<00:03, 23.96it/s][A

Validating:  57%|█████▋    | 95/167 [00:04<00:03, 23.88it/s][A
Epoch 0:  26%|██▌       | 1536/5971 [14:59<43:16,  1.71it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 24.23it/s][A
Epoch 0:  26%|██▌       | 1540/5971 [15:00<43:08,  1.71it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  60%|██████    | 101/167 [00:04<00:02, 23.68it/s][A
Epoch 0:  26%|██▌       | 1544/5971 [15:00<42:59,  1.72it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 22.90it/s][A

Validating:  64%|██████▍   | 107/167 [00:05<00:02, 23.18it/s][A
Epoch 0:  26%|██▌       | 1548/5971 [15:00<42:51,  1.72it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  66%|██████▌   | 110/167 [00:05<00:02, 23.63it/s][A
Epoch 0:  26%|██▌       | 1552/5971 [15:00<42:42,  1.72it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  68%|██████▊   | 113/167 [00:05<00:02, 22.44it/s][A
Epoch 0:  26%|██▌       | 1556/5971 [15:00<42:34,  1.73it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  69%|██████▉   | 116/167 [00:05<00:02, 23.85it/s][A

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 25.01it/s][A
Epoch 0:  26%|██▌       | 1560/5971 [15:00<42:25,  1.73it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 23.94it/s][A
Epoch 0:  26%|██▌       | 1564/5971 [15:01<42:17,  1.74it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 24.79it/s][A
Epoch 0:  26%|██▋       | 1568/5971 [15:01<42:09,  1.74it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 25.99it/s][A

Validating:  78%|███████▊  | 131/167 [00:06<00:01, 24.83it/s][A
Epoch 0:  26%|██▋       | 1572/5971 [15:01<42:00,  1.75it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  80%|████████  | 134/167 [00:06<00:01, 25.03it/s][A
Epoch 0:  26%|██▋       | 1576/5971 [15:01<41:52,  1.75it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  82%|████████▏ | 137/167 [00:06<00:01, 24.06it/s][A
Epoch 0:  26%|██▋       | 1580/5971 [15:01<41:44,  1.75it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  84%|████████▍ | 140/167 [00:06<00:01, 24.42it/s][A

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 25.40it/s][A
Epoch 0:  27%|██▋       | 1584/5971 [15:01<41:36,  1.76it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 25.75it/s][A
Epoch 0:  27%|██▋       | 1588/5971 [15:02<41:28,  1.76it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 26.37it/s][A
Epoch 0:  27%|██▋       | 1592/5971 [15:02<41:20,  1.77it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 27.28it/s][A

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 26.55it/s][A
Epoch 0:  27%|██▋       | 1596/5971 [15:02<41:11,  1.77it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  95%|█████████▍| 158/167 [00:07<00:00, 25.85it/s][A
Epoch 0:  27%|██▋       | 1600/5971 [15:02<41:03,  1.77it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  96%|█████████▋| 161/167 [00:07<00:00, 26.41it/s][A
Epoch 0:  27%|██▋       | 1604/5971 [15:02<40:55,  1.78it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Validating:  98%|█████████▊| 164/167 [00:07<00:00, 27.04it/s][A

Validating: 100%|██████████| 167/167 [00:07<00:00, 27.17it/s][A
Epoch 0:  27%|██▋       | 1608/5971 [15:02<40:48,  1.78it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]
Epoch 0:  27%|██▋       | 1608/5971 [15:03<40:48,  1.78it/s, loss=0.106, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=149.0]

Epoch 0:  27%|██▋       | 1609/5971 [15:04<40:49,  1.78it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0873, train/loss_vlb_step=0.000296, train/loss_step=0.0873, global_step=150.0]
Epoch 0:  27%|██▋       | 1610/5971 [15:04<40:49,  1.78it/s, loss=0.125, v_num=0, train/loss_simple_step=0.316, train/loss_vlb_step=0.00158, train/loss_step=0.316, global_step=150.0]  
Epoch 0:  27%|██▋       | 1611/5971 [15:05<40:50,  1.78it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00674, train/loss_vlb_step=3.41e-5, train/loss_step=0.00674, global_step=150.0]
Epoch 0:  27%|██▋       | 1612/5971 [15:08<40:55,  1.78it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00674, train/loss_vlb_step=3.41e-5, train/loss_step=0.00674, global_step=150.0]
Epoch 0:  27%|██▋       | 1612/5971 [15:08<40:55,  1.78it/s, loss=0.132, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000556, train/loss_step=0.168, global_step=150.0]   
Epoch 0:  27%|██▋       | 1613/5971 [15:09<40:55,  1.77it/s, loss=0.167, v_num=0, train/loss_simple_step=0.731, train/loss_vlb_step=0.0171, train/loss_step=0.731, global_step=151.0]  
Epoch 0:  27%|██▋       | 1614/5971 [15:10<40:56,  1.77it/s, loss=0.181, v_num=0, train/loss_simple_step=0.485, train/loss_vlb_step=0.00286, train/loss_step=0.485, global_step=151.0]
Epoch 0:  27%|██▋       | 1615/5971 [15:11<40:56,  1.77it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00479, train/loss_vlb_step=2.59e-5, train/loss_step=0.00479, global_step=151.0]
Epoch 0:  27%|██▋       | 1616/5971 [15:13<41:00,  1.77it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00479, train/loss_vlb_step=2.59e-5, train/loss_step=0.00479, global_step=151.0]
Epoch 0:  27%|██▋       | 1616/5971 [15:13<41:00,  1.77it/s, loss=0.175, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.00044, train/loss_step=0.134, global_step=151.0]    
Epoch 0:  27%|██▋       | 1617/5971 [15:14<41:00,  1.77it/s, loss=0.176, v_num=0, train/loss_simple_step=0.502, train/loss_vlb_step=0.00456, train/loss_step=0.502, global_step=152.0]
Epoch 0:  27%|██▋       | 1618/5971 [15:15<41:00,  1.77it/s, loss=0.179, v_num=0, train/loss_simple_step=0.183, train/loss_vlb_step=0.000612, train/loss_step=0.183, global_step=152.0]
Epoch 0:  27%|██▋       | 1619/5971 [15:16<41:01,  1.77it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0616, train/loss_vlb_step=0.000215, train/loss_step=0.0616, global_step=152.0]
Epoch 0:  27%|██▋       | 1620/5971 [15:18<41:05,  1.76it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0616, train/loss_vlb_step=0.000215, train/loss_step=0.0616, global_step=152.0]
Epoch 0:  27%|██▋       | 1620/5971 [15:18<41:05,  1.76it/s, loss=0.163, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000699, train/loss_step=0.205, global_step=152.0]  
Epoch 0:  27%|██▋       | 1621/5971 [15:19<41:05,  1.76it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0017, train/loss_vlb_step=1.02e-5, train/loss_step=0.0017, global_step=153.0]
Epoch 0:  27%|██▋       | 1622/5971 [15:20<41:05,  1.76it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.57e-5, train/loss_step=0.0102, global_step=153.0]
Epoch 0:  27%|██▋       | 1623/5971 [15:21<41:06,  1.76it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00304, train/loss_vlb_step=1.76e-5, train/loss_step=0.00304, global_step=153.0]
Epoch 0:  27%|██▋       | 1624/5971 [15:23<41:09,  1.76it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00304, train/loss_vlb_step=1.76e-5, train/loss_step=0.00304, global_step=153.0]
Epoch 0:  27%|██▋       | 1624/5971 [15:23<41:09,  1.76it/s, loss=0.175, v_num=0, train/loss_simple_step=0.405, train/loss_vlb_step=0.00238, train/loss_step=0.405, global_step=153.0]    
Epoch 0:  27%|██▋       | 1625/5971 [15:24<41:10,  1.76it/s, loss=0.175, v_num=0, train/loss_simple_step=0.00189, train/loss_vlb_step=1.14e-5, train/loss_step=0.00189, global_step=154.0]
Epoch 0:  27%|██▋       | 1626/5971 [15:25<41:10,  1.76it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00877, train/loss_vlb_step=4.21e-5, train/loss_step=0.00877, global_step=154.0]
Epoch 0:  27%|██▋       | 1627/5971 [15:25<41:10,  1.76it/s, loss=0.174, v_num=0, train/loss_simple_step=0.022, train/loss_vlb_step=9.06e-5, train/loss_step=0.022, global_step=154.0]    
Epoch 0:  27%|██▋       | 1628/5971 [15:28<41:14,  1.76it/s, loss=0.174, v_num=0, train/loss_simple_step=0.022, train/loss_vlb_step=9.06e-5, train/loss_step=0.022, global_step=154.0]
Epoch 0:  27%|██▋       | 1628/5971 [15:28<41:14,  1.76it/s, loss=0.185, v_num=0, train/loss_simple_step=0.366, train/loss_vlb_step=0.0017, train/loss_step=0.366, global_step=154.0] 
Epoch 0:  27%|██▋       | 1629/5971 [15:29<41:14,  1.75it/s, loss=0.208, v_num=0, train/loss_simple_step=0.546, train/loss_vlb_step=0.00756, train/loss_step=0.546, global_step=155.0]
Epoch 0:  27%|██▋       | 1630/5971 [15:29<41:15,  1.75it/s, loss=0.201, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.000565, train/loss_step=0.171, global_step=155.0]
Epoch 0:  27%|██▋       | 1631/5971 [15:30<41:15,  1.75it/s, loss=0.21, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000606, train/loss_step=0.180, global_step=155.0] 
Epoch 0:  27%|██▋       | 1632/5971 [15:32<41:18,  1.75it/s, loss=0.21, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000606, train/loss_step=0.180, global_step=155.0]
Epoch 0:  27%|██▋       | 1632/5971 [15:32<41:18,  1.75it/s, loss=0.201, v_num=0, train/loss_simple_step=0.00264, train/loss_vlb_step=1.53e-5, train/loss_step=0.00264, global_step=155.0]
Epoch 0:  27%|██▋       | 1633/5971 [15:33<41:19,  1.75it/s, loss=0.17, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000369, train/loss_step=0.112, global_step=156.0]    
Epoch 0:  27%|██▋       | 1634/5971 [15:34<41:19,  1.75it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0426, train/loss_vlb_step=0.00016, train/loss_step=0.0426, global_step=156.0]
Epoch 0:  27%|██▋       | 1635/5971 [15:35<41:19,  1.75it/s, loss=0.162, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00115, train/loss_step=0.287, global_step=156.0]  
Epoch 0:  27%|██▋       | 1636/5971 [15:37<41:23,  1.75it/s, loss=0.162, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00115, train/loss_step=0.287, global_step=156.0]
Epoch 0:  27%|██▋       | 1636/5971 [15:37<41:23,  1.75it/s, loss=0.164, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000562, train/loss_step=0.168, global_step=156.0]
Epoch 0:  27%|██▋       | 1637/5971 [15:38<41:24,  1.74it/s, loss=0.151, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000923, train/loss_step=0.242, global_step=157.0]
Epoch 0:  27%|██▋       | 1638/5971 [15:39<41:24,  1.74it/s, loss=0.143, v_num=0, train/loss_simple_step=0.013, train/loss_vlb_step=5.63e-5, train/loss_step=0.013, global_step=157.0] 
Epoch 0:  27%|██▋       | 1639/5971 [15:40<41:24,  1.74it/s, loss=0.143, v_num=0, train/loss_simple_step=0.062, train/loss_vlb_step=0.000213, train/loss_step=0.062, global_step=157.0]
Epoch 0:  27%|██▋       | 1640/5971 [15:43<41:30,  1.74it/s, loss=0.143, v_num=0, train/loss_simple_step=0.062, train/loss_vlb_step=0.000213, train/loss_step=0.062, global_step=157.0]
Epoch 0:  27%|██▋       | 1640/5971 [15:43<41:30,  1.74it/s, loss=0.157, v_num=0, train/loss_simple_step=0.487, train/loss_vlb_step=0.00547, train/loss_step=0.487, global_step=157.0] 
Epoch 0:  27%|██▋       | 1641/5971 [15:44<41:30,  1.74it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0861, train/loss_vlb_step=0.000285, train/loss_step=0.0861, global_step=158.0]
Epoch 0:  27%|██▋       | 1642/5971 [15:45<41:31,  1.74it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0303, train/loss_vlb_step=0.000116, train/loss_step=0.0303, global_step=158.0]
Epoch 0:  28%|██▊       | 1643/5971 [15:46<41:31,  1.74it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00393, train/loss_vlb_step=2.11e-5, train/loss_step=0.00393, global_step=158.0]
Epoch 0:  28%|██▊       | 1644/5971 [15:48<41:35,  1.73it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00393, train/loss_vlb_step=2.11e-5, train/loss_step=0.00393, global_step=158.0]
Epoch 0:  28%|██▊       | 1644/5971 [15:48<41:35,  1.73it/s, loss=0.15, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.00059, train/loss_step=0.165, global_step=158.0]     
Epoch 0:  28%|██▊       | 1645/5971 [15:49<41:35,  1.73it/s, loss=0.166, v_num=0, train/loss_simple_step=0.334, train/loss_vlb_step=0.00148, train/loss_step=0.334, global_step=159.0]
Epoch 0:  28%|██▊       | 1646/5971 [15:50<41:36,  1.73it/s, loss=0.179, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000954, train/loss_step=0.255, global_step=159.0]
Epoch 0:  28%|██▊       | 1647/5971 [15:51<41:36,  1.73it/s, loss=0.193, v_num=0, train/loss_simple_step=0.297, train/loss_vlb_step=0.00124, train/loss_step=0.297, global_step=159.0] 
Epoch 0:  28%|██▊       | 1648/5971 [15:53<41:39,  1.73it/s, loss=0.193, v_num=0, train/loss_simple_step=0.297, train/loss_vlb_step=0.00124, train/loss_step=0.297, global_step=159.0]
Epoch 0:  28%|██▊       | 1648/5971 [15:53<41:39,  1.73it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0247, train/loss_vlb_step=0.000101, train/loss_step=0.0247, global_step=159.0]
Epoch 0:  28%|██▊       | 1649/5971 [15:54<41:39,  1.73it/s, loss=0.174, v_num=0, train/loss_simple_step=0.509, train/loss_vlb_step=0.00335, train/loss_step=0.509, global_step=160.0]   
Epoch 0:  28%|██▊       | 1650/5971 [15:55<41:40,  1.73it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00256, train/loss_vlb_step=1.51e-5, train/loss_step=0.00256, global_step=160.0]
Epoch 0:  28%|██▊       | 1651/5971 [15:56<41:40,  1.73it/s, loss=0.163, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000473, train/loss_step=0.144, global_step=160.0]   
Epoch 0:  28%|██▊       | 1652/5971 [15:58<41:43,  1.72it/s, loss=0.163, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000473, train/loss_step=0.144, global_step=160.0]
Epoch 0:  28%|██▊       | 1652/5971 [15:58<41:43,  1.72it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=160.0]
Epoch 0:  28%|██▊       | 1653/5971 [15:59<41:44,  1.72it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00296, train/loss_vlb_step=1.66e-5, train/loss_step=0.00296, global_step=161.0]
Epoch 0:  28%|██▊       | 1654/5971 [16:00<41:44,  1.72it/s, loss=0.172, v_num=0, train/loss_simple_step=0.261, train/loss_vlb_step=0.00125, train/loss_step=0.261, global_step=161.0]    
Epoch 0:  28%|██▊       | 1655/5971 [16:00<41:44,  1.72it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0952, train/loss_vlb_step=0.000318, train/loss_step=0.0952, global_step=161.0]
Epoch 0:  28%|██▊       | 1656/5971 [16:03<41:48,  1.72it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0952, train/loss_vlb_step=0.000318, train/loss_step=0.0952, global_step=161.0]
Epoch 0:  28%|██▊       | 1656/5971 [16:03<41:48,  1.72it/s, loss=0.173, v_num=0, train/loss_simple_step=0.386, train/loss_vlb_step=0.00249, train/loss_step=0.386, global_step=161.0]   
Epoch 0:  28%|██▊       | 1657/5971 [16:04<41:49,  1.72it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00221, train/loss_vlb_step=1.32e-5, train/loss_step=0.00221, global_step=162.0]
Epoch 0:  28%|██▊       | 1658/5971 [16:05<41:49,  1.72it/s, loss=0.174, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.000935, train/loss_step=0.264, global_step=162.0]   
Epoch 0:  28%|██▊       | 1659/5971 [16:06<41:49,  1.72it/s, loss=0.179, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000607, train/loss_step=0.170, global_step=162.0]
Epoch 0:  28%|██▊       | 1660/5971 [16:08<41:53,  1.72it/s, loss=0.179, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000607, train/loss_step=0.170, global_step=162.0]
Epoch 0:  28%|██▊       | 1660/5971 [16:08<41:53,  1.72it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0911, train/loss_vlb_step=0.000299, train/loss_step=0.0911, global_step=162.0]
Epoch 0:  28%|██▊       | 1661/5971 [16:09<41:53,  1.71it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00541, train/loss_vlb_step=2.83e-5, train/loss_step=0.00541, global_step=163.0]
Epoch 0:  28%|██▊       | 1662/5971 [16:10<41:53,  1.71it/s, loss=0.173, v_num=0, train/loss_simple_step=0.390, train/loss_vlb_step=0.00172, train/loss_step=0.390, global_step=163.0]    
Epoch 0:  28%|██▊       | 1663/5971 [16:11<41:53,  1.71it/s, loss=0.189, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00119, train/loss_step=0.318, global_step=163.0]
Epoch 0:  28%|██▊       | 1664/5971 [16:13<41:58,  1.71it/s, loss=0.189, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00119, train/loss_step=0.318, global_step=163.0]
Epoch 0:  28%|██▊       | 1664/5971 [16:13<41:58,  1.71it/s, loss=0.215, v_num=0, train/loss_simple_step=0.685, train/loss_vlb_step=0.0297, train/loss_step=0.685, global_step=163.0] 
Epoch 0:  28%|██▊       | 1665/5971 [16:14<41:59,  1.71it/s, loss=0.207, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000596, train/loss_step=0.174, global_step=164.0]
Epoch 0:  28%|██▊       | 1666/5971 [16:15<41:59,  1.71it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0324, train/loss_vlb_step=0.000125, train/loss_step=0.0324, global_step=164.0]
Epoch 0:  28%|██▊       | 1667/5971 [16:16<41:59,  1.71it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=7.2e-5, train/loss_step=0.0173, global_step=164.0]  
Epoch 0:  28%|██▊       | 1668/5971 [16:19<42:04,  1.70it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=7.2e-5, train/loss_step=0.0173, global_step=164.0]
Epoch 0:  28%|██▊       | 1668/5971 [16:19<42:04,  1.70it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0336, train/loss_vlb_step=0.000134, train/loss_step=0.0336, global_step=164.0]
Epoch 0:  28%|██▊       | 1669/5971 [16:19<42:04,  1.70it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00913, train/loss_vlb_step=4.16e-5, train/loss_step=0.00913, global_step=165.0]
Epoch 0:  28%|██▊       | 1670/5971 [16:20<42:04,  1.70it/s, loss=0.163, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000385, train/loss_step=0.117, global_step=165.0]   
Epoch 0:  28%|██▊       | 1671/5971 [16:21<42:04,  1.70it/s, loss=0.166, v_num=0, train/loss_simple_step=0.212, train/loss_vlb_step=0.000747, train/loss_step=0.212, global_step=165.0]
Epoch 0:  28%|██▊       | 1672/5971 [16:23<42:08,  1.70it/s, loss=0.166, v_num=0, train/loss_simple_step=0.212, train/loss_vlb_step=0.000747, train/loss_step=0.212, global_step=165.0]
Epoch 0:  28%|██▊       | 1672/5971 [16:23<42:08,  1.70it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0791, train/loss_vlb_step=0.000263, train/loss_step=0.0791, global_step=165.0]
Epoch 0:  28%|██▊       | 1673/5971 [16:24<42:08,  1.70it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.18e-5, train/loss_step=0.0115, global_step=166.0] 
Epoch 0:  28%|██▊       | 1674/5971 [16:25<42:08,  1.70it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00171, train/loss_vlb_step=1.04e-5, train/loss_step=0.00171, global_step=166.0]
Epoch 0:  28%|██▊       | 1675/5971 [16:26<42:08,  1.70it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.52e-5, train/loss_step=0.00481, global_step=166.0] 
Epoch 0:  28%|██▊       | 1676/5971 [16:28<42:12,  1.70it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.52e-5, train/loss_step=0.00481, global_step=166.0]
Epoch 0:  28%|██▊       | 1676/5971 [16:28<42:12,  1.70it/s, loss=0.144, v_num=0, train/loss_simple_step=0.270, train/loss_vlb_step=0.00176, train/loss_step=0.270, global_step=166.0]   
Epoch 0:  28%|██▊       | 1677/5971 [16:29<42:13,  1.70it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0482, train/loss_vlb_step=0.000173, train/loss_step=0.0482, global_step=167.0]
Epoch 0:  28%|██▊       | 1678/5971 [16:30<42:13,  1.69it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0082, train/loss_vlb_step=3.92e-5, train/loss_step=0.0082, global_step=167.0] 
Epoch 0:  28%|██▊       | 1679/5971 [16:31<42:13,  1.69it/s, loss=0.132, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000437, train/loss_step=0.131, global_step=167.0] 
Epoch 0:  28%|██▊       | 1680/5971 [16:33<42:16,  1.69it/s, loss=0.132, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000437, train/loss_step=0.131, global_step=167.0]
Epoch 0:  28%|██▊       | 1680/5971 [16:33<42:16,  1.69it/s, loss=0.14, v_num=0, train/loss_simple_step=0.256, train/loss_vlb_step=0.00094, train/loss_step=0.256, global_step=167.0]  
Epoch 0:  28%|██▊       | 1681/5971 [16:34<42:16,  1.69it/s, loss=0.173, v_num=0, train/loss_simple_step=0.666, train/loss_vlb_step=0.0115, train/loss_step=0.666, global_step=168.0]
Epoch 0:  28%|██▊       | 1682/5971 [16:35<42:16,  1.69it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00222, train/loss_vlb_step=1.3e-5, train/loss_step=0.00222, global_step=168.0]
Epoch 0:  28%|██▊       | 1683/5971 [16:36<42:17,  1.69it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.00028, train/loss_step=0.0851, global_step=168.0] 
Epoch 0:  28%|██▊       | 1684/5971 [16:38<42:21,  1.69it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.00028, train/loss_step=0.0851, global_step=168.0]
Epoch 0:  28%|██▊       | 1684/5971 [16:38<42:21,  1.69it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0201, train/loss_vlb_step=8.03e-5, train/loss_step=0.0201, global_step=168.0]
Epoch 0:  28%|██▊       | 1685/5971 [16:39<42:21,  1.69it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0809, train/loss_vlb_step=0.000267, train/loss_step=0.0809, global_step=169.0]
Epoch 0:  28%|██▊       | 1686/5971 [16:40<42:21,  1.69it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00408, train/loss_vlb_step=2.15e-5, train/loss_step=0.00408, global_step=169.0]
Epoch 0:  28%|██▊       | 1687/5971 [16:41<42:21,  1.69it/s, loss=0.137, v_num=0, train/loss_simple_step=0.707, train/loss_vlb_step=0.00884, train/loss_step=0.707, global_step=169.0]    
Epoch 0:  28%|██▊       | 1688/5971 [16:43<42:24,  1.68it/s, loss=0.137, v_num=0, train/loss_simple_step=0.707, train/loss_vlb_step=0.00884, train/loss_step=0.707, global_step=169.0]
Epoch 0:  28%|██▊       | 1688/5971 [16:43<42:24,  1.68it/s, loss=0.149, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.00101, train/loss_step=0.266, global_step=169.0]
Epoch 0:  28%|██▊       | 1689/5971 [16:44<42:25,  1.68it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00409, train/loss_vlb_step=2.2e-5, train/loss_step=0.00409, global_step=170.0]
Epoch 0:  28%|██▊       | 1690/5971 [16:45<42:25,  1.68it/s, loss=0.155, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.000945, train/loss_step=0.250, global_step=170.0]  
Epoch 0:  28%|██▊       | 1691/5971 [16:46<42:25,  1.68it/s, loss=0.148, v_num=0, train/loss_simple_step=0.056, train/loss_vlb_step=0.000195, train/loss_step=0.056, global_step=170.0]
Epoch 0:  28%|██▊       | 1692/5971 [16:48<42:28,  1.68it/s, loss=0.148, v_num=0, train/loss_simple_step=0.056, train/loss_vlb_step=0.000195, train/loss_step=0.056, global_step=170.0]
Epoch 0:  28%|██▊       | 1692/5971 [16:48<42:28,  1.68it/s, loss=0.159, v_num=0, train/loss_simple_step=0.297, train/loss_vlb_step=0.00137, train/loss_step=0.297, global_step=170.0] 
Epoch 0:  28%|██▊       | 1693/5971 [16:49<42:28,  1.68it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0212, train/loss_vlb_step=8.63e-5, train/loss_step=0.0212, global_step=171.0]
Epoch 0:  28%|██▊       | 1694/5971 [16:50<42:29,  1.68it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00663, train/loss_vlb_step=3.47e-5, train/loss_step=0.00663, global_step=171.0]
Epoch 0:  28%|██▊       | 1695/5971 [16:51<42:29,  1.68it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0418, train/loss_vlb_step=0.000158, train/loss_step=0.0418, global_step=171.0] 
Epoch 0:  28%|██▊       | 1696/5971 [16:53<42:33,  1.67it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0418, train/loss_vlb_step=0.000158, train/loss_step=0.0418, global_step=171.0]
Epoch 0:  28%|██▊       | 1696/5971 [16:53<42:33,  1.67it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0123, train/loss_vlb_step=5.21e-5, train/loss_step=0.0123, global_step=171.0] 
Epoch 0:  28%|██▊       | 1697/5971 [16:54<42:33,  1.67it/s, loss=0.158, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.000997, train/loss_step=0.251, global_step=172.0] 
Epoch 0:  28%|██▊       | 1698/5971 [16:55<42:33,  1.67it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00536, train/loss_vlb_step=2.74e-5, train/loss_step=0.00536, global_step=172.0]
Epoch 0:  28%|██▊       | 1699/5971 [16:56<42:33,  1.67it/s, loss=0.178, v_num=0, train/loss_simple_step=0.517, train/loss_vlb_step=0.00433, train/loss_step=0.517, global_step=172.0]    
Epoch 0:  28%|██▊       | 1700/5971 [16:58<42:37,  1.67it/s, loss=0.178, v_num=0, train/loss_simple_step=0.517, train/loss_vlb_step=0.00433, train/loss_step=0.517, global_step=172.0]
Epoch 0:  28%|██▊       | 1700/5971 [16:58<42:37,  1.67it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0539, train/loss_vlb_step=0.000197, train/loss_step=0.0539, global_step=172.0]
Epoch 0:  28%|██▊       | 1701/5971 [16:59<42:37,  1.67it/s, loss=0.158, v_num=0, train/loss_simple_step=0.478, train/loss_vlb_step=0.00497, train/loss_step=0.478, global_step=173.0]   
Epoch 0:  29%|██▊       | 1702/5971 [17:00<42:37,  1.67it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0826, train/loss_vlb_step=0.000271, train/loss_step=0.0826, global_step=173.0]
Epoch 0:  29%|██▊       | 1703/5971 [17:01<42:37,  1.67it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0219, train/loss_vlb_step=8.33e-5, train/loss_step=0.0219, global_step=173.0] 
Epoch 0:  29%|██▊       | 1704/5971 [17:03<42:42,  1.67it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0219, train/loss_vlb_step=8.33e-5, train/loss_step=0.0219, global_step=173.0]
Epoch 0:  29%|██▊       | 1704/5971 [17:03<42:42,  1.67it/s, loss=0.182, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00285, train/loss_step=0.476, global_step=173.0]  
Epoch 0:  29%|██▊       | 1705/5971 [17:04<42:42,  1.66it/s, loss=0.186, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000584, train/loss_step=0.175, global_step=174.0]
Epoch 0:  29%|██▊       | 1706/5971 [17:05<42:42,  1.66it/s, loss=0.205, v_num=0, train/loss_simple_step=0.369, train/loss_vlb_step=0.00185, train/loss_step=0.369, global_step=174.0] 
Epoch 0:  29%|██▊       | 1707/5971 [17:06<42:42,  1.66it/s, loss=0.195, v_num=0, train/loss_simple_step=0.509, train/loss_vlb_step=0.00405, train/loss_step=0.509, global_step=174.0]
Epoch 0:  29%|██▊       | 1708/5971 [17:08<42:45,  1.66it/s, loss=0.195, v_num=0, train/loss_simple_step=0.509, train/loss_vlb_step=0.00405, train/loss_step=0.509, global_step=174.0]
Epoch 0:  29%|██▊       | 1708/5971 [17:08<42:45,  1.66it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]  

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:09,  2.40it/s][A

Validating:   1%|          | 2/167 [00:00<00:48,  3.42it/s][A
Epoch 0:  29%|██▊       | 1712/5971 [17:09<42:39,  1.66it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.25it/s][A
Epoch 0:  29%|██▊       | 1716/5971 [17:09<42:31,  1.67it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.79it/s][A

Validating:   7%|▋         | 11/167 [00:00<00:08, 17.36it/s][A
Epoch 0:  29%|██▉       | 1720/5971 [17:09<42:23,  1.67it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:   8%|▊         | 14/167 [00:01<00:07, 20.11it/s][A
Epoch 0:  29%|██▉       | 1724/5971 [17:09<42:15,  1.68it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  10%|█         | 17/167 [00:01<00:06, 22.37it/s][A
Epoch 0:  29%|██▉       | 1728/5971 [17:09<42:07,  1.68it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 23.39it/s][A

Validating:  14%|█▍        | 23/167 [00:01<00:06, 23.45it/s][A
Epoch 0:  29%|██▉       | 1732/5971 [17:10<41:59,  1.68it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.24it/s][A
Epoch 0:  29%|██▉       | 1736/5971 [17:10<41:51,  1.69it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 24.14it/s][A
Epoch 0:  29%|██▉       | 1740/5971 [17:10<41:44,  1.69it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 24.11it/s][A

Validating:  21%|██        | 35/167 [00:01<00:05, 23.40it/s][A
Epoch 0:  29%|██▉       | 1744/5971 [17:10<41:36,  1.69it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  23%|██▎       | 39/167 [00:02<00:05, 25.52it/s][A
Epoch 0:  29%|██▉       | 1748/5971 [17:10<41:28,  1.70it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 25.72it/s][A
Epoch 0:  29%|██▉       | 1752/5971 [17:10<41:20,  1.70it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 24.65it/s][A
Epoch 0:  29%|██▉       | 1756/5971 [17:11<41:13,  1.70it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 25.31it/s][A
Epoch 0:  29%|██▉       | 1760/5971 [17:11<41:05,  1.71it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  31%|███       | 52/167 [00:02<00:04, 27.23it/s][A

Validating:  33%|███▎      | 55/167 [00:02<00:04, 26.66it/s][A
Epoch 0:  30%|██▉       | 1764/5971 [17:11<40:58,  1.71it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  35%|███▌      | 59/167 [00:02<00:03, 27.91it/s][A
Epoch 0:  30%|██▉       | 1768/5971 [17:11<40:50,  1.72it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  37%|███▋      | 62/167 [00:02<00:04, 26.11it/s][A
Epoch 0:  30%|██▉       | 1772/5971 [17:11<40:43,  1.72it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  39%|███▉      | 65/167 [00:03<00:03, 26.80it/s][A
Epoch 0:  30%|██▉       | 1776/5971 [17:11<40:35,  1.72it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  41%|████      | 68/167 [00:03<00:03, 26.03it/s][A

Validating:  43%|████▎     | 71/167 [00:03<00:03, 25.95it/s][A
Epoch 0:  30%|██▉       | 1780/5971 [17:11<40:28,  1.73it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 26.38it/s][A
Epoch 0:  30%|██▉       | 1784/5971 [17:12<40:20,  1.73it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 26.30it/s][A
Epoch 0:  30%|██▉       | 1788/5971 [17:12<40:13,  1.73it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 25.33it/s][A

Validating:  50%|████▉     | 83/167 [00:03<00:03, 26.18it/s][A
Epoch 0:  30%|███       | 1792/5971 [17:12<40:06,  1.74it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  51%|█████▏    | 86/167 [00:03<00:03, 26.79it/s][A
Epoch 0:  30%|███       | 1796/5971 [17:12<39:58,  1.74it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  53%|█████▎    | 89/167 [00:03<00:02, 27.26it/s][A
Epoch 0:  30%|███       | 1800/5971 [17:12<39:51,  1.74it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 27.35it/s][A

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 27.23it/s][A
Epoch 0:  30%|███       | 1804/5971 [17:12<39:44,  1.75it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 27.61it/s][A
Epoch 0:  30%|███       | 1808/5971 [17:12<39:37,  1.75it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  60%|██████    | 101/167 [00:04<00:02, 27.88it/s][A
Epoch 0:  30%|███       | 1812/5971 [17:13<39:29,  1.75it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 25.89it/s][A

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 26.81it/s][A
Epoch 0:  30%|███       | 1816/5971 [17:13<39:22,  1.76it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 26.93it/s][A
Epoch 0:  30%|███       | 1820/5971 [17:13<39:15,  1.76it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  68%|██████▊   | 113/167 [00:04<00:02, 26.92it/s][A
Epoch 0:  31%|███       | 1824/5971 [17:13<39:08,  1.77it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  69%|██████▉   | 116/167 [00:04<00:01, 27.10it/s][A

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 27.75it/s][A
Epoch 0:  31%|███       | 1828/5971 [17:13<39:01,  1.77it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 28.08it/s][A
Epoch 0:  31%|███       | 1832/5971 [17:13<38:54,  1.77it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 28.54it/s][A
Epoch 0:  31%|███       | 1836/5971 [17:13<38:47,  1.78it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 28.42it/s][A

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 27.51it/s][A
Epoch 0:  31%|███       | 1840/5971 [17:14<38:40,  1.78it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  80%|████████  | 134/167 [00:05<00:01, 26.44it/s][A
Epoch 0:  31%|███       | 1844/5971 [17:14<38:33,  1.78it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 26.47it/s][A
Epoch 0:  31%|███       | 1848/5971 [17:14<38:26,  1.79it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  84%|████████▍ | 140/167 [00:05<00:01, 26.12it/s][A

Validating:  86%|████████▌ | 143/167 [00:05<00:00, 26.58it/s][A
Epoch 0:  31%|███       | 1852/5971 [17:14<38:19,  1.79it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 26.52it/s][A
Epoch 0:  31%|███       | 1856/5971 [17:14<38:12,  1.79it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 26.15it/s][A
Epoch 0:  31%|███       | 1860/5971 [17:14<38:06,  1.80it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 25.19it/s][A

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 25.54it/s][A
Epoch 0:  31%|███       | 1864/5971 [17:15<37:59,  1.80it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 25.95it/s][A
Epoch 0:  31%|███▏      | 1868/5971 [17:15<37:52,  1.81it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 27.04it/s][A
Epoch 0:  31%|███▏      | 1872/5971 [17:15<37:45,  1.81it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 27.34it/s][A
Epoch 0:  31%|███▏      | 1876/5971 [17:15<37:39,  1.81it/s, loss=0.2, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00167, train/loss_step=0.382, global_step=174.0]
Epoch 0:  31%|███▏      | 1877/5971 [17:16<37:40,  1.81it/s, loss=0.2, v_num=0, train/loss_simple_step=0.00246, train/loss_vlb_step=1.48e-5, train/loss_step=0.00246, global_step=175.0]
Epoch 0:  31%|███▏      | 1878/5971 [17:17<37:40,  1.81it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0502, train/loss_vlb_step=0.000188, train/loss_step=0.0502, global_step=175.0]
Epoch 0:  31%|███▏      | 1879/5971 [17:18<37:40,  1.81it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0846, train/loss_vlb_step=0.000279, train/loss_step=0.0846, global_step=175.0]
Epoch 0:  31%|███▏      | 1880/5971 [17:21<37:44,  1.81it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0846, train/loss_vlb_step=0.000279, train/loss_step=0.0846, global_step=175.0]
Epoch 0:  31%|███▏      | 1880/5971 [17:21<37:44,  1.81it/s, loss=0.183, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000428, train/loss_step=0.127, global_step=175.0]  
Epoch 0:  32%|███▏      | 1881/5971 [17:22<37:44,  1.81it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0236, train/loss_vlb_step=9.25e-5, train/loss_step=0.0236, global_step=176.0]
Epoch 0:  32%|███▏      | 1882/5971 [17:22<37:44,  1.81it/s, loss=0.196, v_num=0, train/loss_simple_step=0.265, train/loss_vlb_step=0.00112, train/loss_step=0.265, global_step=176.0]  
Epoch 0:  32%|███▏      | 1883/5971 [17:23<37:44,  1.80it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0181, train/loss_vlb_step=8.03e-5, train/loss_step=0.0181, global_step=176.0]
Epoch 0:  32%|███▏      | 1884/5971 [17:25<37:47,  1.80it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0181, train/loss_vlb_step=8.03e-5, train/loss_step=0.0181, global_step=176.0]
Epoch 0:  32%|███▏      | 1884/5971 [17:25<37:47,  1.80it/s, loss=0.195, v_num=0, train/loss_simple_step=0.00596, train/loss_vlb_step=3.11e-5, train/loss_step=0.00596, global_step=176.0]
Epoch 0:  32%|███▏      | 1885/5971 [17:26<37:47,  1.80it/s, loss=0.182, v_num=0, train/loss_simple_step=0.00317, train/loss_vlb_step=1.81e-5, train/loss_step=0.00317, global_step=177.0]
Epoch 0:  32%|███▏      | 1886/5971 [17:27<37:48,  1.80it/s, loss=0.182, v_num=0, train/loss_simple_step=0.00413, train/loss_vlb_step=2.2e-5, train/loss_step=0.00413, global_step=177.0] 
Epoch 0:  32%|███▏      | 1887/5971 [17:28<37:48,  1.80it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.8e-5, train/loss_step=0.0105, global_step=177.0]  
Epoch 0:  32%|███▏      | 1888/5971 [17:30<37:51,  1.80it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.8e-5, train/loss_step=0.0105, global_step=177.0]
Epoch 0:  32%|███▏      | 1888/5971 [17:30<37:51,  1.80it/s, loss=0.168, v_num=0, train/loss_simple_step=0.280, train/loss_vlb_step=0.00131, train/loss_step=0.280, global_step=177.0] 
Epoch 0:  32%|███▏      | 1889/5971 [17:31<37:51,  1.80it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.85e-5, train/loss_step=0.0111, global_step=178.0]
Epoch 0:  32%|███▏      | 1890/5971 [17:32<37:51,  1.80it/s, loss=0.146, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.000334, train/loss_step=0.100, global_step=178.0] 
Epoch 0:  32%|███▏      | 1891/5971 [17:33<37:51,  1.80it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.4e-5, train/loss_step=0.0138, global_step=178.0]
Epoch 0:  32%|███▏      | 1892/5971 [17:35<37:54,  1.79it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.4e-5, train/loss_step=0.0138, global_step=178.0]
Epoch 0:  32%|███▏      | 1892/5971 [17:35<37:54,  1.79it/s, loss=0.152, v_num=0, train/loss_simple_step=0.612, train/loss_vlb_step=0.0103, train/loss_step=0.612, global_step=178.0]  
Epoch 0:  32%|███▏      | 1893/5971 [17:36<37:54,  1.79it/s, loss=0.145, v_num=0, train/loss_simple_step=0.029, train/loss_vlb_step=0.000112, train/loss_step=0.029, global_step=179.0]
Epoch 0:  32%|███▏      | 1894/5971 [17:37<37:54,  1.79it/s, loss=0.156, v_num=0, train/loss_simple_step=0.593, train/loss_vlb_step=0.00636, train/loss_step=0.593, global_step=179.0] 
Epoch 0:  32%|███▏      | 1895/5971 [17:38<37:54,  1.79it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0227, train/loss_vlb_step=9.45e-5, train/loss_step=0.0227, global_step=179.0]
Epoch 0:  32%|███▏      | 1896/5971 [17:40<37:57,  1.79it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0227, train/loss_vlb_step=9.45e-5, train/loss_step=0.0227, global_step=179.0]
Epoch 0:  32%|███▏      | 1896/5971 [17:40<37:57,  1.79it/s, loss=0.121, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000561, train/loss_step=0.162, global_step=179.0] 
Epoch 0:  32%|███▏      | 1897/5971 [17:41<37:58,  1.79it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0405, train/loss_vlb_step=0.000155, train/loss_step=0.0405, global_step=180.0]
Epoch 0:  32%|███▏      | 1898/5971 [17:42<37:58,  1.79it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00227, train/loss_vlb_step=1.38e-5, train/loss_step=0.00227, global_step=180.0]
Epoch 0:  32%|███▏      | 1899/5971 [17:43<37:58,  1.79it/s, loss=0.141, v_num=0, train/loss_simple_step=0.489, train/loss_vlb_step=0.00354, train/loss_step=0.489, global_step=180.0]   
Epoch 0:  32%|███▏      | 1900/5971 [17:45<38:01,  1.78it/s, loss=0.141, v_num=0, train/loss_simple_step=0.489, train/loss_vlb_step=0.00354, train/loss_step=0.489, global_step=180.0]
Epoch 0:  32%|███▏      | 1900/5971 [17:45<38:01,  1.78it/s, loss=0.136, v_num=0, train/loss_simple_step=0.043, train/loss_vlb_step=0.000156, train/loss_step=0.043, global_step=180.0]
Epoch 0:  32%|███▏      | 1901/5971 [17:46<38:01,  1.78it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0628, train/loss_vlb_step=0.000222, train/loss_step=0.0628, global_step=181.0]
Epoch 0:  32%|███▏      | 1902/5971 [17:47<38:02,  1.78it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0815, train/loss_vlb_step=0.000283, train/loss_step=0.0815, global_step=181.0]
Epoch 0:  32%|███▏      | 1903/5971 [17:48<38:02,  1.78it/s, loss=0.134, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000371, train/loss_step=0.111, global_step=181.0]  
Epoch 0:  32%|███▏      | 1904/5971 [17:50<38:05,  1.78it/s, loss=0.134, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000371, train/loss_step=0.111, global_step=181.0]
Epoch 0:  32%|███▏      | 1904/5971 [17:50<38:05,  1.78it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00392, train/loss_vlb_step=2.25e-5, train/loss_step=0.00392, global_step=181.0]
Epoch 0:  32%|███▏      | 1905/5971 [17:51<38:05,  1.78it/s, loss=0.151, v_num=0, train/loss_simple_step=0.356, train/loss_vlb_step=0.00179, train/loss_step=0.356, global_step=182.0]    
Epoch 0:  32%|███▏      | 1906/5971 [17:52<38:05,  1.78it/s, loss=0.155, v_num=0, train/loss_simple_step=0.066, train/loss_vlb_step=0.00022, train/loss_step=0.066, global_step=182.0]
Epoch 0:  32%|███▏      | 1907/5971 [17:53<38:05,  1.78it/s, loss=0.161, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000476, train/loss_step=0.143, global_step=182.0]
Epoch 0:  32%|███▏      | 1908/5971 [17:55<38:08,  1.78it/s, loss=0.161, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000476, train/loss_step=0.143, global_step=182.0]
Epoch 0:  32%|███▏      | 1908/5971 [17:55<38:08,  1.78it/s, loss=0.157, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.0007, train/loss_step=0.192, global_step=182.0]  
Epoch 0:  32%|███▏      | 1909/5971 [17:56<38:08,  1.77it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00857, train/loss_vlb_step=4.06e-5, train/loss_step=0.00857, global_step=183.0]
Epoch 0:  32%|███▏      | 1910/5971 [17:57<38:08,  1.77it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0248, train/loss_vlb_step=0.000101, train/loss_step=0.0248, global_step=183.0] 
Epoch 0:  32%|███▏      | 1911/5971 [17:57<38:08,  1.77it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0396, train/loss_vlb_step=0.000146, train/loss_step=0.0396, global_step=183.0]
Epoch 0:  32%|███▏      | 1912/5971 [18:00<38:12,  1.77it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0396, train/loss_vlb_step=0.000146, train/loss_step=0.0396, global_step=183.0]
Epoch 0:  32%|███▏      | 1912/5971 [18:00<38:12,  1.77it/s, loss=0.129, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.00036, train/loss_step=0.109, global_step=183.0]   
Epoch 0:  32%|███▏      | 1913/5971 [18:01<38:12,  1.77it/s, loss=0.129, v_num=0, train/loss_simple_step=0.022, train/loss_vlb_step=8.88e-5, train/loss_step=0.022, global_step=184.0]
Epoch 0:  32%|███▏      | 1914/5971 [18:02<38:12,  1.77it/s, loss=0.0995, v_num=0, train/loss_simple_step=0.00977, train/loss_vlb_step=4.5e-5, train/loss_step=0.00977, global_step=184.0]
Epoch 0:  32%|███▏      | 1915/5971 [18:02<38:12,  1.77it/s, loss=0.111, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.00105, train/loss_step=0.243, global_step=184.0]    
Epoch 0:  32%|███▏      | 1916/5971 [18:05<38:15,  1.77it/s, loss=0.111, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.00105, train/loss_step=0.243, global_step=184.0]
Epoch 0:  32%|███▏      | 1916/5971 [18:05<38:15,  1.77it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0189, train/loss_vlb_step=7.89e-5, train/loss_step=0.0189, global_step=184.0]
Epoch 0:  32%|███▏      | 1917/5971 [18:06<38:15,  1.77it/s, loss=0.103, v_num=0, train/loss_simple_step=0.043, train/loss_vlb_step=0.000155, train/loss_step=0.043, global_step=185.0] 
Epoch 0:  32%|███▏      | 1918/5971 [18:06<38:15,  1.77it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0538, train/loss_vlb_step=0.000194, train/loss_step=0.0538, global_step=185.0]
Epoch 0:  32%|███▏      | 1919/5971 [18:07<38:15,  1.77it/s, loss=0.11, v_num=0, train/loss_simple_step=0.575, train/loss_vlb_step=0.0055, train/loss_step=0.575, global_step=185.0]     
Epoch 0:  32%|███▏      | 1920/5971 [18:10<38:19,  1.76it/s, loss=0.11, v_num=0, train/loss_simple_step=0.575, train/loss_vlb_step=0.0055, train/loss_step=0.575, global_step=185.0]
Epoch 0:  32%|███▏      | 1920/5971 [18:10<38:19,  1.76it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00599, train/loss_vlb_step=3.06e-5, train/loss_step=0.00599, global_step=185.0]
Epoch 0:  32%|███▏      | 1921/5971 [18:11<38:19,  1.76it/s, loss=0.142, v_num=0, train/loss_simple_step=0.736, train/loss_vlb_step=0.0243, train/loss_step=0.736, global_step=186.0]     
Epoch 0:  32%|███▏      | 1922/5971 [18:12<38:20,  1.76it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00476, train/loss_vlb_step=2.47e-5, train/loss_step=0.00476, global_step=186.0]
Epoch 0:  32%|███▏      | 1923/5971 [18:13<38:20,  1.76it/s, loss=0.144, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000761, train/loss_step=0.217, global_step=186.0]   
Epoch 0:  32%|███▏      | 1924/5971 [18:15<38:22,  1.76it/s, loss=0.144, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000761, train/loss_step=0.217, global_step=186.0]
Epoch 0:  32%|███▏      | 1924/5971 [18:15<38:22,  1.76it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=4.63e-5, train/loss_step=0.0121, global_step=186.0]
Epoch 0:  32%|███▏      | 1925/5971 [18:16<38:23,  1.76it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00385, train/loss_vlb_step=2.15e-5, train/loss_step=0.00385, global_step=187.0]
Epoch 0:  32%|███▏      | 1926/5971 [18:17<38:23,  1.76it/s, loss=0.129, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000397, train/loss_step=0.120, global_step=187.0]   
Epoch 0:  32%|███▏      | 1927/5971 [18:18<38:23,  1.76it/s, loss=0.129, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.00043, train/loss_step=0.129, global_step=187.0] 
Epoch 0:  32%|███▏      | 1928/5971 [18:20<38:26,  1.75it/s, loss=0.129, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.00043, train/loss_step=0.129, global_step=187.0]
Epoch 0:  32%|███▏      | 1928/5971 [18:20<38:26,  1.75it/s, loss=0.119, v_num=0, train/loss_simple_step=0.010, train/loss_vlb_step=4.54e-5, train/loss_step=0.010, global_step=187.0]
Epoch 0:  32%|███▏      | 1929/5971 [18:21<38:26,  1.75it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00628, train/loss_vlb_step=3.12e-5, train/loss_step=0.00628, global_step=188.0]
Epoch 0:  32%|███▏      | 1930/5971 [18:22<38:26,  1.75it/s, loss=0.127, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.000635, train/loss_step=0.186, global_step=188.0]   
Epoch 0:  32%|███▏      | 1931/5971 [18:23<38:27,  1.75it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0348, train/loss_vlb_step=0.000127, train/loss_step=0.0348, global_step=188.0]
Epoch 0:  32%|███▏      | 1932/5971 [18:25<38:29,  1.75it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0348, train/loss_vlb_step=0.000127, train/loss_step=0.0348, global_step=188.0]
Epoch 0:  32%|███▏      | 1932/5971 [18:25<38:29,  1.75it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0165, train/loss_vlb_step=7.24e-5, train/loss_step=0.0165, global_step=188.0] 
Epoch 0:  32%|███▏      | 1933/5971 [18:26<38:29,  1.75it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000218, train/loss_step=0.0642, global_step=189.0]
Epoch 0:  32%|███▏      | 1934/5971 [18:27<38:29,  1.75it/s, loss=0.137, v_num=0, train/loss_simple_step=0.256, train/loss_vlb_step=0.000925, train/loss_step=0.256, global_step=189.0]  
Epoch 0:  32%|███▏      | 1935/5971 [18:28<38:30,  1.75it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.5e-5, train/loss_step=0.0121, global_step=189.0]
Epoch 0:  32%|███▏      | 1936/5971 [18:30<38:34,  1.74it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.5e-5, train/loss_step=0.0121, global_step=189.0]
Epoch 0:  32%|███▏      | 1936/5971 [18:30<38:34,  1.74it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0177, train/loss_vlb_step=7.4e-5, train/loss_step=0.0177, global_step=189.0]
Epoch 0:  32%|███▏      | 1937/5971 [18:31<38:34,  1.74it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00569, train/loss_vlb_step=2.96e-5, train/loss_step=0.00569, global_step=190.0]
Epoch 0:  32%|███▏      | 1938/5971 [18:32<38:34,  1.74it/s, loss=0.132, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.000775, train/loss_step=0.219, global_step=190.0]   
Epoch 0:  32%|███▏      | 1939/5971 [18:33<38:34,  1.74it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=5.16e-5, train/loss_step=0.0113, global_step=190.0]
Epoch 0:  32%|███▏      | 1940/5971 [18:36<38:38,  1.74it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=5.16e-5, train/loss_step=0.0113, global_step=190.0]
Epoch 0:  32%|███▏      | 1940/5971 [18:36<38:38,  1.74it/s, loss=0.11, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000491, train/loss_step=0.143, global_step=190.0]  
Epoch 0:  33%|███▎      | 1941/5971 [18:37<38:38,  1.74it/s, loss=0.0739, v_num=0, train/loss_simple_step=0.00874, train/loss_vlb_step=4.13e-5, train/loss_step=0.00874, global_step=191.0]
Epoch 0:  33%|███▎      | 1942/5971 [18:38<38:38,  1.74it/s, loss=0.0753, v_num=0, train/loss_simple_step=0.033, train/loss_vlb_step=0.000121, train/loss_step=0.033, global_step=191.0]   
Epoch 0:  33%|███▎      | 1943/5971 [18:38<38:38,  1.74it/s, loss=0.0693, v_num=0, train/loss_simple_step=0.0951, train/loss_vlb_step=0.000314, train/loss_step=0.0951, global_step=191.0]
Epoch 0:  33%|███▎      | 1944/5971 [18:41<38:41,  1.74it/s, loss=0.0693, v_num=0, train/loss_simple_step=0.0951, train/loss_vlb_step=0.000314, train/loss_step=0.0951, global_step=191.0]
Epoch 0:  33%|███▎      | 1944/5971 [18:41<38:41,  1.74it/s, loss=0.0763, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000507, train/loss_step=0.153, global_step=191.0]  
Epoch 0:  33%|███▎      | 1945/5971 [18:41<38:41,  1.73it/s, loss=0.0763, v_num=0, train/loss_simple_step=0.00386, train/loss_vlb_step=2.12e-5, train/loss_step=0.00386, global_step=192.0]
Epoch 0:  33%|███▎      | 1946/5971 [18:42<38:41,  1.73it/s, loss=0.0914, v_num=0, train/loss_simple_step=0.423, train/loss_vlb_step=0.0034, train/loss_step=0.423, global_step=192.0]     
Epoch 0:  33%|███▎      | 1947/5971 [18:43<38:41,  1.73it/s, loss=0.0861, v_num=0, train/loss_simple_step=0.0234, train/loss_vlb_step=9.16e-5, train/loss_step=0.0234, global_step=192.0]
Epoch 0:  33%|███▎      | 1948/5971 [18:45<38:43,  1.73it/s, loss=0.0861, v_num=0, train/loss_simple_step=0.0234, train/loss_vlb_step=9.16e-5, train/loss_step=0.0234, global_step=192.0]
Epoch 0:  33%|███▎      | 1948/5971 [18:45<38:43,  1.73it/s, loss=0.0866, v_num=0, train/loss_simple_step=0.0204, train/loss_vlb_step=7.99e-5, train/loss_step=0.0204, global_step=192.0]
Epoch 0:  33%|███▎      | 1949/5971 [18:46<38:44,  1.73it/s, loss=0.0968, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000796, train/loss_step=0.209, global_step=193.0] 
Epoch 0:  33%|███▎      | 1950/5971 [18:47<38:44,  1.73it/s, loss=0.106, v_num=0, train/loss_simple_step=0.374, train/loss_vlb_step=0.0018, train/loss_step=0.374, global_step=193.0]   
Epoch 0:  33%|███▎      | 1951/5971 [18:48<38:44,  1.73it/s, loss=0.127, v_num=0, train/loss_simple_step=0.461, train/loss_vlb_step=0.0032, train/loss_step=0.461, global_step=193.0]
Epoch 0:  33%|███▎      | 1952/5971 [18:50<38:47,  1.73it/s, loss=0.127, v_num=0, train/loss_simple_step=0.461, train/loss_vlb_step=0.0032, train/loss_step=0.461, global_step=193.0]
Epoch 0:  33%|███▎      | 1952/5971 [18:50<38:47,  1.73it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00959, train/loss_vlb_step=4.52e-5, train/loss_step=0.00959, global_step=193.0]
Epoch 0:  33%|███▎      | 1953/5971 [18:51<38:47,  1.73it/s, loss=0.151, v_num=0, train/loss_simple_step=0.534, train/loss_vlb_step=0.00409, train/loss_step=0.534, global_step=194.0]    
Epoch 0:  33%|███▎      | 1954/5971 [18:52<38:47,  1.73it/s, loss=0.158, v_num=0, train/loss_simple_step=0.394, train/loss_vlb_step=0.00196, train/loss_step=0.394, global_step=194.0]
Epoch 0:  33%|███▎      | 1955/5971 [18:53<38:47,  1.73it/s, loss=0.164, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000474, train/loss_step=0.144, global_step=194.0]
Epoch 0:  33%|███▎      | 1956/5971 [18:55<38:50,  1.72it/s, loss=0.164, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000474, train/loss_step=0.144, global_step=194.0]
Epoch 0:  33%|███▎      | 1956/5971 [18:55<38:50,  1.72it/s, loss=0.182, v_num=0, train/loss_simple_step=0.373, train/loss_vlb_step=0.0019, train/loss_step=0.373, global_step=194.0]  
Epoch 0:  33%|███▎      | 1957/5971 [18:56<38:50,  1.72it/s, loss=0.189, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000453, train/loss_step=0.138, global_step=195.0]
Epoch 0:  33%|███▎      | 1958/5971 [18:57<38:50,  1.72it/s, loss=0.191, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.00109, train/loss_step=0.268, global_step=195.0] 
Epoch 0:  33%|███▎      | 1959/5971 [18:58<38:50,  1.72it/s, loss=0.215, v_num=0, train/loss_simple_step=0.499, train/loss_vlb_step=0.00465, train/loss_step=0.499, global_step=195.0]
Epoch 0:  33%|███▎      | 1960/5971 [19:01<38:53,  1.72it/s, loss=0.215, v_num=0, train/loss_simple_step=0.499, train/loss_vlb_step=0.00465, train/loss_step=0.499, global_step=195.0]
Epoch 0:  33%|███▎      | 1960/5971 [19:01<38:53,  1.72it/s, loss=0.208, v_num=0, train/loss_simple_step=0.00291, train/loss_vlb_step=1.7e-5, train/loss_step=0.00291, global_step=195.0]
Epoch 0:  33%|███▎      | 1961/5971 [19:01<38:53,  1.72it/s, loss=0.214, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000394, train/loss_step=0.120, global_step=196.0]  
Epoch 0:  33%|███▎      | 1962/5971 [19:02<38:54,  1.72it/s, loss=0.22, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000575, train/loss_step=0.164, global_step=196.0] 
Epoch 0:  33%|███▎      | 1963/5971 [19:03<38:54,  1.72it/s, loss=0.216, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=5.02e-5, train/loss_step=0.0114, global_step=196.0]
Epoch 0:  33%|███▎      | 1964/5971 [19:05<38:56,  1.71it/s, loss=0.216, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=5.02e-5, train/loss_step=0.0114, global_step=196.0]
Epoch 0:  33%|███▎      | 1964/5971 [19:05<38:56,  1.71it/s, loss=0.209, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.46e-5, train/loss_step=0.0149, global_step=196.0]
Epoch 0:  33%|███▎      | 1965/5971 [19:06<38:56,  1.71it/s, loss=0.214, v_num=0, train/loss_simple_step=0.0956, train/loss_vlb_step=0.000315, train/loss_step=0.0956, global_step=197.0]
Epoch 0:  33%|███▎      | 1966/5971 [19:07<38:56,  1.71it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0474, train/loss_vlb_step=0.000179, train/loss_step=0.0474, global_step=197.0]
Epoch 0:  33%|███▎      | 1967/5971 [19:08<38:56,  1.71it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.67e-5, train/loss_step=0.0028, global_step=197.0] 
Epoch 0:  33%|███▎      | 1968/5971 [19:10<38:59,  1.71it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.67e-5, train/loss_step=0.0028, global_step=197.0]
Epoch 0:  33%|███▎      | 1968/5971 [19:10<38:59,  1.71it/s, loss=0.203, v_num=0, train/loss_simple_step=0.195, train/loss_vlb_step=0.000662, train/loss_step=0.195, global_step=197.0] 
Epoch 0:  33%|███▎      | 1969/5971 [19:11<38:59,  1.71it/s, loss=0.207, v_num=0, train/loss_simple_step=0.300, train/loss_vlb_step=0.00128, train/loss_step=0.300, global_step=198.0] 
Epoch 0:  33%|███▎      | 1970/5971 [19:12<39:00,  1.71it/s, loss=0.189, v_num=0, train/loss_simple_step=0.00271, train/loss_vlb_step=1.52e-5, train/loss_step=0.00271, global_step=198.0]
Epoch 0:  33%|███▎      | 1971/5971 [19:13<39:00,  1.71it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0344, train/loss_vlb_step=0.000128, train/loss_step=0.0344, global_step=198.0] 
Epoch 0:  33%|███▎      | 1972/5971 [19:16<39:04,  1.71it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0344, train/loss_vlb_step=0.000128, train/loss_step=0.0344, global_step=198.0]
Epoch 0:  33%|███▎      | 1972/5971 [19:16<39:04,  1.71it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0341, train/loss_vlb_step=0.000134, train/loss_step=0.0341, global_step=198.0]
Epoch 0:  33%|███▎      | 1973/5971 [19:17<39:04,  1.71it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00682, train/loss_vlb_step=3.56e-5, train/loss_step=0.00682, global_step=199.0]
Epoch 0:  33%|███▎      | 1974/5971 [19:18<39:04,  1.70it/s, loss=0.143, v_num=0, train/loss_simple_step=0.401, train/loss_vlb_step=0.00196, train/loss_step=0.401, global_step=199.0]    
Epoch 0:  33%|███▎      | 1975/5971 [19:19<39:04,  1.70it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0397, train/loss_vlb_step=0.000138, train/loss_step=0.0397, global_step=199.0]
Epoch 0:  33%|███▎      | 1976/5971 [19:21<39:06,  1.70it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0397, train/loss_vlb_step=0.000138, train/loss_step=0.0397, global_step=199.0]
Epoch 0:  33%|███▎      | 1976/5971 [19:21<39:06,  1.70it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:10,  2.35it/s][A

Validating:   1%|          | 2/167 [00:00<00:49,  3.31it/s][A
Epoch 0:  33%|███▎      | 1980/5971 [19:22<39:01,  1.70it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:   3%|▎         | 5/167 [00:00<00:18,  8.81it/s][A
Epoch 0:  33%|███▎      | 1984/5971 [19:22<38:54,  1.71it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:   5%|▍         | 8/167 [00:00<00:12, 12.81it/s][A

Validating:   7%|▋         | 11/167 [00:01<00:10, 15.43it/s][A
Epoch 0:  33%|███▎      | 1988/5971 [19:22<38:47,  1.71it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:   8%|▊         | 14/167 [00:01<00:08, 17.90it/s][A
Epoch 0:  33%|███▎      | 1992/5971 [19:22<38:41,  1.71it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  10%|█         | 17/167 [00:01<00:07, 20.11it/s][A
Epoch 0:  33%|███▎      | 1996/5971 [19:22<38:34,  1.72it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.41it/s][A

Validating:  14%|█▍        | 23/167 [00:01<00:06, 23.53it/s][A
Epoch 0:  33%|███▎      | 2000/5971 [19:22<38:27,  1.72it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.02it/s][A
Epoch 0:  34%|███▎      | 2004/5971 [19:23<38:21,  1.72it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 24.94it/s][A
Epoch 0:  34%|███▎      | 2008/5971 [19:23<38:14,  1.73it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 25.84it/s][A

Validating:  21%|██        | 35/167 [00:01<00:05, 25.81it/s][A
Epoch 0:  34%|███▎      | 2012/5971 [19:23<38:08,  1.73it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  23%|██▎       | 38/167 [00:02<00:04, 26.19it/s][A
Epoch 0:  34%|███▍      | 2016/5971 [19:23<38:01,  1.73it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 26.19it/s][A
Epoch 0:  34%|███▍      | 2020/5971 [19:23<37:54,  1.74it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 26.47it/s][A
Epoch 0:  34%|███▍      | 2024/5971 [19:23<37:48,  1.74it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 25.20it/s][A

Validating:  31%|███       | 51/167 [00:02<00:04, 24.19it/s][A
Epoch 0:  34%|███▍      | 2028/5971 [19:24<37:42,  1.74it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 23.78it/s][A
Epoch 0:  34%|███▍      | 2032/5971 [19:24<37:35,  1.75it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 24.85it/s][A
Epoch 0:  34%|███▍      | 2036/5971 [19:24<37:29,  1.75it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  36%|███▌      | 60/167 [00:02<00:04, 25.39it/s][A

Validating:  38%|███▊      | 63/167 [00:03<00:04, 24.65it/s][A
Epoch 0:  34%|███▍      | 2040/5971 [19:24<37:22,  1.75it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  40%|███▉      | 66/167 [00:03<00:03, 25.55it/s][A
Epoch 0:  34%|███▍      | 2044/5971 [19:24<37:16,  1.76it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 26.70it/s][A
Epoch 0:  34%|███▍      | 2048/5971 [19:24<37:10,  1.76it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 26.84it/s][A

Validating:  45%|████▍     | 75/167 [00:03<00:03, 26.55it/s][A
Epoch 0:  34%|███▍      | 2052/5971 [19:24<37:03,  1.76it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 26.62it/s][A
Epoch 0:  34%|███▍      | 2056/5971 [19:25<36:57,  1.77it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 27.33it/s][A
Epoch 0:  35%|███▍      | 2060/5971 [19:25<36:51,  1.77it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  50%|█████     | 84/167 [00:03<00:03, 25.92it/s][A

Validating:  52%|█████▏    | 87/167 [00:03<00:03, 25.57it/s][A
Epoch 0:  35%|███▍      | 2064/5971 [19:25<36:44,  1.77it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  54%|█████▍    | 90/167 [00:04<00:02, 26.38it/s][A
Epoch 0:  35%|███▍      | 2068/5971 [19:25<36:38,  1.78it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 26.90it/s][A
Epoch 0:  35%|███▍      | 2072/5971 [19:25<36:32,  1.78it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 27.29it/s][A

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 27.16it/s][A
Epoch 0:  35%|███▍      | 2076/5971 [19:25<36:26,  1.78it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  61%|██████    | 102/167 [00:04<00:02, 26.54it/s][A
Epoch 0:  35%|███▍      | 2080/5971 [19:25<36:20,  1.78it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 26.57it/s][A
Epoch 0:  35%|███▍      | 2084/5971 [19:26<36:13,  1.79it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 26.82it/s][A

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 26.52it/s][A
Epoch 0:  35%|███▍      | 2088/5971 [19:26<36:07,  1.79it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  68%|██████▊   | 114/167 [00:04<00:02, 25.66it/s][A
Epoch 0:  35%|███▌      | 2092/5971 [19:26<36:01,  1.79it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  71%|███████   | 118/167 [00:05<00:01, 27.26it/s][A
Epoch 0:  35%|███▌      | 2096/5971 [19:26<35:55,  1.80it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 25.87it/s][A
Epoch 0:  35%|███▌      | 2100/5971 [19:26<35:49,  1.80it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 25.56it/s][A

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 25.81it/s][A
Epoch 0:  35%|███▌      | 2104/5971 [19:26<35:43,  1.80it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 26.28it/s][A
Epoch 0:  35%|███▌      | 2108/5971 [19:27<35:37,  1.81it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 26.53it/s][A
Epoch 0:  35%|███▌      | 2112/5971 [19:27<35:31,  1.81it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 26.30it/s][A

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 26.29it/s][A
Epoch 0:  35%|███▌      | 2116/5971 [19:27<35:25,  1.81it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  85%|████████▌ | 142/167 [00:06<00:00, 26.91it/s][A
Epoch 0:  36%|███▌      | 2120/5971 [19:27<35:19,  1.82it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 26.77it/s][A
Epoch 0:  36%|███▌      | 2124/5971 [19:27<35:13,  1.82it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 26.88it/s][A

Validating:  90%|█████████ | 151/167 [00:06<00:00, 27.42it/s][A
Epoch 0:  36%|███▌      | 2128/5971 [19:27<35:07,  1.82it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 27.74it/s][A
Epoch 0:  36%|███▌      | 2132/5971 [19:27<35:02,  1.83it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 28.25it/s][A
Epoch 0:  36%|███▌      | 2136/5971 [19:28<34:56,  1.83it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 28.42it/s][A

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 27.42it/s][A
Epoch 0:  36%|███▌      | 2140/5971 [19:28<34:50,  1.83it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 27.54it/s][A
Epoch 0:  36%|███▌      | 2144/5971 [19:28<34:44,  1.84it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]
Epoch 0:  36%|███▌      | 2144/5971 [19:28<34:45,  1.84it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000239, train/loss_step=0.0699, global_step=199.0]

timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.36it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.42it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.16it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.75it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.12it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.52it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.69it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.79it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  4.83it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:08,  4.89it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  4.97it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.01it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:07,  5.06it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.17it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.23it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.12it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.18it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:04<00:06,  5.25it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.22it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.20it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.12it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.00it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:05<00:05,  5.02it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:05,  5.09it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.07it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.07it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.09it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:06<00:04,  5.08it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:06<00:04,  5.08it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.14it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.20it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.18it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.24it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:07<00:03,  5.25it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.14it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.21it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.20it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.23it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:08<00:02,  5.22it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.25it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.21it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.25it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.15it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:09<00:01,  5.16it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:00,  5.15it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.18it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.21it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.26it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:10<00:00,  5.22it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  5.11it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  4.89it/s]

Epoch 0:  36%|███▌      | 2145/5971 [19:41<35:06,  1.82it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0788, train/loss_vlb_step=0.00026, train/loss_step=0.0788, global_step=200.0]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A
Epoch 0:  36%|███▌      | 2145/5971 [19:43<35:09,  1.81it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0788, train/loss_vlb_step=0.00026, train/loss_step=0.0788, global_step=200.0]

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.31it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.31it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:15,  3.05it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.61it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:11,  4.05it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.44it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.72it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.93it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.05it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.15it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.16it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.08it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:07,  5.02it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:07,  4.99it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.06it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.17it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.15it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:04<00:06,  5.20it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.27it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.31it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.34it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.36it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.30it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.33it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.27it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.12it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.04it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.01it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:06<00:04,  4.91it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.02it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.05it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  4.98it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.01it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:07<00:03,  5.03it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.07it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.07it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.03it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.06it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:08<00:02,  5.11it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.10it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.01it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  4.98it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.07it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:09<00:01,  5.19it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:00,  5.32it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.43it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.47it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.52it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:10<00:00,  5.57it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  5.60it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  4.90it/s]

Epoch 0:  36%|███▌      | 2146/5971 [19:53<35:26,  1.80it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0788, train/loss_vlb_step=0.00026, train/loss_step=0.0788, global_step=200.0]
Epoch 0:  36%|███▌      | 2146/5971 [19:53<35:26,  1.80it/s, loss=0.124, v_num=0, train/loss_simple_step=0.357, train/loss_vlb_step=0.00256, train/loss_step=0.357, global_step=200.0]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.39it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.18it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.70it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.13it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:10,  4.37it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.62it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.76it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  4.81it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:08,  4.93it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.04it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.13it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:07,  5.12it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:07,  5.13it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.14it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.24it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.21it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:04<00:06,  5.24it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.24it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.19it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.15it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.11it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:05<00:05,  5.04it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:05,  5.13it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.20it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.20it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.20it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.20it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:06<00:04,  5.21it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.19it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.21it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.20it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.18it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:07<00:03,  5.22it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.32it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.37it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.34it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.27it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:08<00:02,  5.13it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.05it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.11it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.22it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.33it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:09<00:01,  5.38it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:00,  5.43it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.38it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.35it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.40it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.44it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  5.45it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  4.94it/s]

Epoch 0:  36%|███▌      | 2147/5971 [20:06<35:47,  1.78it/s, loss=0.124, v_num=0, train/loss_simple_step=0.357, train/loss_vlb_step=0.00256, train/loss_step=0.357, global_step=200.0]
Epoch 0:  36%|███▌      | 2147/5971 [20:06<35:47,  1.78it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0221, train/loss_vlb_step=9.3e-5, train/loss_step=0.0221, global_step=200.0]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.31it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.35it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.16it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.79it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.18it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.48it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.73it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.87it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  4.96it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.09it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.17it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.25it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:06,  5.33it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.30it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.24it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.17it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.14it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:06,  5.25it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.36it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.45it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.50it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.48it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.49it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.54it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.56it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.54it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.50it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.36it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.38it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.42it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.42it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.39it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.27it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:03,  5.24it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.25it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.37it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.44it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.49it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.54it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.58it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.51it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.23it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.22it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.27it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.35it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.40it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.48it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.52it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.52it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.56it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.07it/s]

Epoch 0:  36%|███▌      | 2148/5971 [20:19<36:10,  1.76it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0221, train/loss_vlb_step=9.3e-5, train/loss_step=0.0221, global_step=200.0]
Epoch 0:  36%|███▌      | 2148/5971 [20:19<36:10,  1.76it/s, loss=0.112, v_num=0, train/loss_simple_step=0.244, train/loss_vlb_step=0.000914, train/loss_step=0.244, global_step=200.0]
Epoch 0:  36%|███▌      | 2149/5971 [20:20<36:10,  1.76it/s, loss=0.112, v_num=0, train/loss_simple_step=0.244, train/loss_vlb_step=0.000914, train/loss_step=0.244, global_step=200.0]
Epoch 0:  36%|███▌      | 2149/5971 [20:20<36:10,  1.76it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00505, train/loss_vlb_step=2.68e-5, train/loss_step=0.00505, global_step=201.0]
Epoch 0:  36%|███▌      | 2150/5971 [20:21<36:10,  1.76it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00505, train/loss_vlb_step=2.68e-5, train/loss_step=0.00505, global_step=201.0]
Epoch 0:  36%|███▌      | 2150/5971 [20:21<36:10,  1.76it/s, loss=0.0989, v_num=0, train/loss_simple_step=0.0159, train/loss_vlb_step=6.96e-5, train/loss_step=0.0159, global_step=201.0] 
Epoch 0:  36%|███▌      | 2151/5971 [20:22<36:10,  1.76it/s, loss=0.0989, v_num=0, train/loss_simple_step=0.0159, train/loss_vlb_step=6.96e-5, train/loss_step=0.0159, global_step=201.0]
Epoch 0:  36%|███▌      | 2151/5971 [20:22<36:10,  1.76it/s, loss=0.0985, v_num=0, train/loss_simple_step=0.00436, train/loss_vlb_step=2.36e-5, train/loss_step=0.00436, global_step=201.0]
Epoch 0:  36%|███▌      | 2152/5971 [20:24<36:12,  1.76it/s, loss=0.0985, v_num=0, train/loss_simple_step=0.00436, train/loss_vlb_step=2.36e-5, train/loss_step=0.00436, global_step=201.0]
Epoch 0:  36%|███▌      | 2152/5971 [20:24<36:12,  1.76it/s, loss=0.0981, v_num=0, train/loss_simple_step=0.00668, train/loss_vlb_step=3.33e-5, train/loss_step=0.00668, global_step=201.0]
Epoch 0:  36%|███▌      | 2153/5971 [20:25<36:12,  1.76it/s, loss=0.0981, v_num=0, train/loss_simple_step=0.00668, train/loss_vlb_step=3.33e-5, train/loss_step=0.00668, global_step=201.0]
Epoch 0:  36%|███▌      | 2153/5971 [20:25<36:12,  1.76it/s, loss=0.0948, v_num=0, train/loss_simple_step=0.0294, train/loss_vlb_step=0.000117, train/loss_step=0.0294, global_step=202.0] 
Epoch 0:  36%|███▌      | 2154/5971 [20:26<36:12,  1.76it/s, loss=0.0948, v_num=0, train/loss_simple_step=0.0294, train/loss_vlb_step=0.000117, train/loss_step=0.0294, global_step=202.0]
Epoch 0:  36%|███▌      | 2154/5971 [20:26<36:12,  1.76it/s, loss=0.116, v_num=0, train/loss_simple_step=0.467, train/loss_vlb_step=0.00534, train/loss_step=0.467, global_step=202.0]    
Epoch 0:  36%|███▌      | 2155/5971 [20:27<36:12,  1.76it/s, loss=0.116, v_num=0, train/loss_simple_step=0.467, train/loss_vlb_step=0.00534, train/loss_step=0.467, global_step=202.0]
Epoch 0:  36%|███▌      | 2155/5971 [20:27<36:12,  1.76it/s, loss=0.116, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.06e-5, train/loss_step=0.017, global_step=202.0]
Epoch 0:  36%|███▌      | 2156/5971 [20:29<36:15,  1.75it/s, loss=0.116, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.06e-5, train/loss_step=0.017, global_step=202.0]
Epoch 0:  36%|███▌      | 2156/5971 [20:29<36:15,  1.75it/s, loss=0.117, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000803, train/loss_step=0.207, global_step=202.0]
Epoch 0:  36%|███▌      | 2157/5971 [20:30<36:15,  1.75it/s, loss=0.117, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000803, train/loss_step=0.207, global_step=202.0]
Epoch 0:  36%|███▌      | 2157/5971 [20:30<36:15,  1.75it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0023, train/loss_vlb_step=1.36e-5, train/loss_step=0.0023, global_step=203.0]
Epoch 0:  36%|███▌      | 2158/5971 [20:31<36:15,  1.75it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0023, train/loss_vlb_step=1.36e-5, train/loss_step=0.0023, global_step=203.0]
Epoch 0:  36%|███▌      | 2158/5971 [20:31<36:15,  1.75it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00629, train/loss_vlb_step=3.17e-5, train/loss_step=0.00629, global_step=203.0]
Epoch 0:  36%|███▌      | 2159/5971 [20:32<36:15,  1.75it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00629, train/loss_vlb_step=3.17e-5, train/loss_step=0.00629, global_step=203.0]
Epoch 0:  36%|███▌      | 2159/5971 [20:32<36:15,  1.75it/s, loss=0.114, v_num=0, train/loss_simple_step=0.257, train/loss_vlb_step=0.000942, train/loss_step=0.257, global_step=203.0]   
Epoch 0:  36%|███▌      | 2160/5971 [20:34<36:17,  1.75it/s, loss=0.114, v_num=0, train/loss_simple_step=0.257, train/loss_vlb_step=0.000942, train/loss_step=0.257, global_step=203.0]
Epoch 0:  36%|███▌      | 2160/5971 [20:34<36:17,  1.75it/s, loss=0.128, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.00248, train/loss_step=0.322, global_step=203.0] 
Epoch 0:  36%|███▌      | 2161/5971 [20:35<36:17,  1.75it/s, loss=0.128, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.00248, train/loss_step=0.322, global_step=203.0]
Epoch 0:  36%|███▌      | 2161/5971 [20:35<36:17,  1.75it/s, loss=0.135, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.00049, train/loss_step=0.148, global_step=204.0]
Epoch 0:  36%|███▌      | 2162/5971 [20:36<36:17,  1.75it/s, loss=0.135, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.00049, train/loss_step=0.148, global_step=204.0]
Epoch 0:  36%|███▌      | 2162/5971 [20:36<36:17,  1.75it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.39e-5, train/loss_step=0.0119, global_step=204.0]
Epoch 0:  36%|███▌      | 2163/5971 [20:37<36:17,  1.75it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.39e-5, train/loss_step=0.0119, global_step=204.0]
Epoch 0:  36%|███▌      | 2163/5971 [20:37<36:17,  1.75it/s, loss=0.115, v_num=0, train/loss_simple_step=0.036, train/loss_vlb_step=0.000127, train/loss_step=0.036, global_step=204.0] 
Epoch 0:  36%|███▌      | 2164/5971 [20:39<36:19,  1.75it/s, loss=0.115, v_num=0, train/loss_simple_step=0.036, train/loss_vlb_step=0.000127, train/loss_step=0.036, global_step=204.0]
Epoch 0:  36%|███▌      | 2164/5971 [20:39<36:19,  1.75it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.47e-5, train/loss_step=0.0119, global_step=204.0]
Epoch 0:  36%|███▋      | 2165/5971 [20:40<36:19,  1.75it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.47e-5, train/loss_step=0.0119, global_step=204.0]
Epoch 0:  36%|███▋      | 2165/5971 [20:40<36:19,  1.75it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0398, train/loss_vlb_step=0.000144, train/loss_step=0.0398, global_step=205.0]
Epoch 0:  36%|███▋      | 2166/5971 [20:41<36:19,  1.75it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0398, train/loss_vlb_step=0.000144, train/loss_step=0.0398, global_step=205.0]
Epoch 0:  36%|███▋      | 2166/5971 [20:41<36:19,  1.75it/s, loss=0.102, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000666, train/loss_step=0.196, global_step=205.0] 
Epoch 0:  36%|███▋      | 2167/5971 [20:42<36:19,  1.75it/s, loss=0.102, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000666, train/loss_step=0.196, global_step=205.0]
Epoch 0:  36%|███▋      | 2167/5971 [20:42<36:19,  1.75it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0894, train/loss_vlb_step=0.000295, train/loss_step=0.0894, global_step=205.0]
Epoch 0:  36%|███▋      | 2168/5971 [20:44<36:21,  1.74it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0894, train/loss_vlb_step=0.000295, train/loss_step=0.0894, global_step=205.0]
Epoch 0:  36%|███▋      | 2168/5971 [20:44<36:21,  1.74it/s, loss=0.102, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000597, train/loss_step=0.173, global_step=205.0]  
Epoch 0:  36%|███▋      | 2169/5971 [20:45<36:21,  1.74it/s, loss=0.102, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000597, train/loss_step=0.173, global_step=205.0]
Epoch 0:  36%|███▋      | 2169/5971 [20:45<36:21,  1.74it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00214, train/loss_vlb_step=1.26e-5, train/loss_step=0.00214, global_step=206.0]
Epoch 0:  36%|███▋      | 2170/5971 [20:46<36:21,  1.74it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00214, train/loss_vlb_step=1.26e-5, train/loss_step=0.00214, global_step=206.0]
Epoch 0:  36%|███▋      | 2170/5971 [20:46<36:21,  1.74it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00738, train/loss_vlb_step=3.58e-5, train/loss_step=0.00738, global_step=206.0]
Epoch 0:  36%|███▋      | 2171/5971 [20:46<36:21,  1.74it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00738, train/loss_vlb_step=3.58e-5, train/loss_step=0.00738, global_step=206.0]
Epoch 0:  36%|███▋      | 2171/5971 [20:46<36:21,  1.74it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0514, train/loss_vlb_step=0.000184, train/loss_step=0.0514, global_step=206.0] 
Epoch 0:  36%|███▋      | 2172/5971 [20:49<36:24,  1.74it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0514, train/loss_vlb_step=0.000184, train/loss_step=0.0514, global_step=206.0]
Epoch 0:  36%|███▋      | 2172/5971 [20:49<36:24,  1.74it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00403, train/loss_vlb_step=2.09e-5, train/loss_step=0.00403, global_step=206.0]
Epoch 0:  36%|███▋      | 2173/5971 [20:50<36:24,  1.74it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00403, train/loss_vlb_step=2.09e-5, train/loss_step=0.00403, global_step=206.0]
Epoch 0:  36%|███▋      | 2173/5971 [20:50<36:24,  1.74it/s, loss=0.116, v_num=0, train/loss_simple_step=0.271, train/loss_vlb_step=0.00116, train/loss_step=0.271, global_step=207.0]    
Epoch 0:  36%|███▋      | 2174/5971 [20:51<36:24,  1.74it/s, loss=0.116, v_num=0, train/loss_simple_step=0.271, train/loss_vlb_step=0.00116, train/loss_step=0.271, global_step=207.0]
Epoch 0:  36%|███▋      | 2174/5971 [20:51<36:24,  1.74it/s, loss=0.0932, v_num=0, train/loss_simple_step=0.0103, train/loss_vlb_step=4.84e-5, train/loss_step=0.0103, global_step=207.0]
Epoch 0:  36%|███▋      | 2175/5971 [20:51<36:23,  1.74it/s, loss=0.0932, v_num=0, train/loss_simple_step=0.0103, train/loss_vlb_step=4.84e-5, train/loss_step=0.0103, global_step=207.0]
Epoch 0:  36%|███▋      | 2175/5971 [20:51<36:23,  1.74it/s, loss=0.0927, v_num=0, train/loss_simple_step=0.00795, train/loss_vlb_step=3.74e-5, train/loss_step=0.00795, global_step=207.0]
Epoch 0:  36%|███▋      | 2176/5971 [20:54<36:26,  1.74it/s, loss=0.0927, v_num=0, train/loss_simple_step=0.00795, train/loss_vlb_step=3.74e-5, train/loss_step=0.00795, global_step=207.0]
Epoch 0:  36%|███▋      | 2176/5971 [20:54<36:26,  1.74it/s, loss=0.0842, v_num=0, train/loss_simple_step=0.037, train/loss_vlb_step=0.000136, train/loss_step=0.037, global_step=207.0]   
Epoch 0:  36%|███▋      | 2177/5971 [20:54<36:26,  1.74it/s, loss=0.0842, v_num=0, train/loss_simple_step=0.037, train/loss_vlb_step=0.000136, train/loss_step=0.037, global_step=207.0]
Epoch 0:  36%|███▋      | 2177/5971 [20:54<36:26,  1.74it/s, loss=0.0848, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=5.79e-5, train/loss_step=0.0139, global_step=208.0]
Epoch 0:  36%|███▋      | 2178/5971 [20:55<36:26,  1.74it/s, loss=0.0848, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=5.79e-5, train/loss_step=0.0139, global_step=208.0]
Epoch 0:  36%|███▋      | 2178/5971 [20:55<36:26,  1.74it/s, loss=0.0878, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.00022, train/loss_step=0.0655, global_step=208.0]
Epoch 0:  36%|███▋      | 2179/5971 [20:56<36:26,  1.73it/s, loss=0.0878, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.00022, train/loss_step=0.0655, global_step=208.0]
Epoch 0:  36%|███▋      | 2179/5971 [20:56<36:26,  1.73it/s, loss=0.0777, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000189, train/loss_step=0.0559, global_step=208.0]
Epoch 0:  37%|███▋      | 2180/5971 [20:58<36:28,  1.73it/s, loss=0.0777, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000189, train/loss_step=0.0559, global_step=208.0]
Epoch 0:  37%|███▋      | 2180/5971 [20:58<36:28,  1.73it/s, loss=0.0746, v_num=0, train/loss_simple_step=0.261, train/loss_vlb_step=0.000915, train/loss_step=0.261, global_step=208.0]  
Epoch 0:  37%|███▋      | 2181/5971 [20:59<36:28,  1.73it/s, loss=0.0746, v_num=0, train/loss_simple_step=0.261, train/loss_vlb_step=0.000915, train/loss_step=0.261, global_step=208.0]
Epoch 0:  37%|███▋      | 2181/5971 [20:59<36:28,  1.73it/s, loss=0.0688, v_num=0, train/loss_simple_step=0.0314, train/loss_vlb_step=0.000122, train/loss_step=0.0314, global_step=209.0]
Epoch 0:  37%|███▋      | 2182/5971 [21:00<36:28,  1.73it/s, loss=0.0688, v_num=0, train/loss_simple_step=0.0314, train/loss_vlb_step=0.000122, train/loss_step=0.0314, global_step=209.0]
Epoch 0:  37%|███▋      | 2182/5971 [21:00<36:28,  1.73it/s, loss=0.0703, v_num=0, train/loss_simple_step=0.0405, train/loss_vlb_step=0.000155, train/loss_step=0.0405, global_step=209.0]
Epoch 0:  37%|███▋      | 2183/5971 [21:01<36:28,  1.73it/s, loss=0.0703, v_num=0, train/loss_simple_step=0.0405, train/loss_vlb_step=0.000155, train/loss_step=0.0405, global_step=209.0]
Epoch 0:  37%|███▋      | 2183/5971 [21:01<36:28,  1.73it/s, loss=0.0779, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000692, train/loss_step=0.190, global_step=209.0]  
Epoch 0:  37%|███▋      | 2184/5971 [21:03<36:30,  1.73it/s, loss=0.0779, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000692, train/loss_step=0.190, global_step=209.0]
Epoch 0:  37%|███▋      | 2184/5971 [21:03<36:30,  1.73it/s, loss=0.0777, v_num=0, train/loss_simple_step=0.0065, train/loss_vlb_step=3.2e-5, train/loss_step=0.0065, global_step=209.0]
Epoch 0:  37%|███▋      | 2185/5971 [21:04<36:30,  1.73it/s, loss=0.0777, v_num=0, train/loss_simple_step=0.0065, train/loss_vlb_step=3.2e-5, train/loss_step=0.0065, global_step=209.0]
Epoch 0:  37%|███▋      | 2185/5971 [21:04<36:30,  1.73it/s, loss=0.0883, v_num=0, train/loss_simple_step=0.252, train/loss_vlb_step=0.000945, train/loss_step=0.252, global_step=210.0]
Epoch 0:  37%|███▋      | 2186/5971 [21:05<36:30,  1.73it/s, loss=0.0883, v_num=0, train/loss_simple_step=0.252, train/loss_vlb_step=0.000945, train/loss_step=0.252, global_step=210.0]
Epoch 0:  37%|███▋      | 2186/5971 [21:05<36:30,  1.73it/s, loss=0.0786, v_num=0, train/loss_simple_step=0.00292, train/loss_vlb_step=1.7e-5, train/loss_step=0.00292, global_step=210.0]
Epoch 0:  37%|███▋      | 2187/5971 [21:06<36:30,  1.73it/s, loss=0.0786, v_num=0, train/loss_simple_step=0.00292, train/loss_vlb_step=1.7e-5, train/loss_step=0.00292, global_step=210.0]
Epoch 0:  37%|███▋      | 2187/5971 [21:06<36:30,  1.73it/s, loss=0.0968, v_num=0, train/loss_simple_step=0.453, train/loss_vlb_step=0.00276, train/loss_step=0.453, global_step=210.0]   
Epoch 0:  37%|███▋      | 2188/5971 [21:08<36:32,  1.73it/s, loss=0.0968, v_num=0, train/loss_simple_step=0.453, train/loss_vlb_step=0.00276, train/loss_step=0.453, global_step=210.0]
Epoch 0:  37%|███▋      | 2188/5971 [21:08<36:32,  1.73it/s, loss=0.113, v_num=0, train/loss_simple_step=0.497, train/loss_vlb_step=0.00718, train/loss_step=0.497, global_step=210.0] 
Epoch 0:  37%|███▋      | 2189/5971 [21:09<36:32,  1.73it/s, loss=0.113, v_num=0, train/loss_simple_step=0.497, train/loss_vlb_step=0.00718, train/loss_step=0.497, global_step=210.0]
Epoch 0:  37%|███▋      | 2189/5971 [21:09<36:32,  1.73it/s, loss=0.135, v_num=0, train/loss_simple_step=0.435, train/loss_vlb_step=0.00269, train/loss_step=0.435, global_step=211.0]
Epoch 0:  37%|███▋      | 2190/5971 [21:10<36:32,  1.72it/s, loss=0.135, v_num=0, train/loss_simple_step=0.435, train/loss_vlb_step=0.00269, train/loss_step=0.435, global_step=211.0]
Epoch 0:  37%|███▋      | 2190/5971 [21:10<36:32,  1.72it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00209, train/loss_vlb_step=1.23e-5, train/loss_step=0.00209, global_step=211.0]
Epoch 0:  37%|███▋      | 2191/5971 [21:11<36:32,  1.72it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00209, train/loss_vlb_step=1.23e-5, train/loss_step=0.00209, global_step=211.0]
Epoch 0:  37%|███▋      | 2191/5971 [21:11<36:32,  1.72it/s, loss=0.142, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000685, train/loss_step=0.200, global_step=211.0]   
Epoch 0:  37%|███▋      | 2192/5971 [21:13<36:34,  1.72it/s, loss=0.142, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000685, train/loss_step=0.200, global_step=211.0]
Epoch 0:  37%|███▋      | 2192/5971 [21:13<36:34,  1.72it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0234, train/loss_vlb_step=9.94e-5, train/loss_step=0.0234, global_step=211.0]
Epoch 0:  37%|███▋      | 2193/5971 [21:14<36:34,  1.72it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0234, train/loss_vlb_step=9.94e-5, train/loss_step=0.0234, global_step=211.0]
Epoch 0:  37%|███▋      | 2193/5971 [21:14<36:34,  1.72it/s, loss=0.147, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00164, train/loss_step=0.345, global_step=212.0]  
Epoch 0:  37%|███▋      | 2194/5971 [21:15<36:34,  1.72it/s, loss=0.147, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00164, train/loss_step=0.345, global_step=212.0]
Epoch 0:  37%|███▋      | 2194/5971 [21:15<36:34,  1.72it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=212.0]
Epoch 0:  37%|███▋      | 2195/5971 [21:16<36:34,  1.72it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=212.0]
Epoch 0:  37%|███▋      | 2195/5971 [21:16<36:34,  1.72it/s, loss=0.183, v_num=0, train/loss_simple_step=0.662, train/loss_vlb_step=0.00798, train/loss_step=0.662, global_step=212.0]  
Epoch 0:  37%|███▋      | 2196/5971 [21:18<36:36,  1.72it/s, loss=0.183, v_num=0, train/loss_simple_step=0.662, train/loss_vlb_step=0.00798, train/loss_step=0.662, global_step=212.0]
Epoch 0:  37%|███▋      | 2196/5971 [21:18<36:36,  1.72it/s, loss=0.194, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.000997, train/loss_step=0.251, global_step=212.0]
Epoch 0:  37%|███▋      | 2197/5971 [21:19<36:36,  1.72it/s, loss=0.194, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.000997, train/loss_step=0.251, global_step=212.0]
Epoch 0:  37%|███▋      | 2197/5971 [21:19<36:36,  1.72it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0313, train/loss_vlb_step=0.000121, train/loss_step=0.0313, global_step=213.0]
Epoch 0:  37%|███▋      | 2198/5971 [21:20<36:36,  1.72it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0313, train/loss_vlb_step=0.000121, train/loss_step=0.0313, global_step=213.0]
Epoch 0:  37%|███▋      | 2198/5971 [21:20<36:36,  1.72it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0217, train/loss_vlb_step=8.88e-5, train/loss_step=0.0217, global_step=213.0] 
Epoch 0:  37%|███▋      | 2199/5971 [21:21<36:36,  1.72it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0217, train/loss_vlb_step=8.88e-5, train/loss_step=0.0217, global_step=213.0]
Epoch 0:  37%|███▋      | 2199/5971 [21:21<36:36,  1.72it/s, loss=0.213, v_num=0, train/loss_simple_step=0.478, train/loss_vlb_step=0.00288, train/loss_step=0.478, global_step=213.0]  
Epoch 0:  37%|███▋      | 2200/5971 [21:24<36:40,  1.71it/s, loss=0.213, v_num=0, train/loss_simple_step=0.478, train/loss_vlb_step=0.00288, train/loss_step=0.478, global_step=213.0]
Epoch 0:  37%|███▋      | 2200/5971 [21:24<36:40,  1.71it/s, loss=0.218, v_num=0, train/loss_simple_step=0.346, train/loss_vlb_step=0.00163, train/loss_step=0.346, global_step=213.0]
Epoch 0:  37%|███▋      | 2201/5971 [21:24<36:39,  1.71it/s, loss=0.218, v_num=0, train/loss_simple_step=0.346, train/loss_vlb_step=0.00163, train/loss_step=0.346, global_step=213.0]
Epoch 0:  37%|███▋      | 2201/5971 [21:24<36:39,  1.71it/s, loss=0.224, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000533, train/loss_step=0.155, global_step=214.0]
Epoch 0:  37%|███▋      | 2202/5971 [21:25<36:39,  1.71it/s, loss=0.224, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000533, train/loss_step=0.155, global_step=214.0]
Epoch 0:  37%|███▋      | 2202/5971 [21:25<36:39,  1.71it/s, loss=0.222, v_num=0, train/loss_simple_step=0.00309, train/loss_vlb_step=1.77e-5, train/loss_step=0.00309, global_step=214.0]
Epoch 0:  37%|███▋      | 2203/5971 [21:26<36:39,  1.71it/s, loss=0.222, v_num=0, train/loss_simple_step=0.00309, train/loss_vlb_step=1.77e-5, train/loss_step=0.00309, global_step=214.0]
Epoch 0:  37%|███▋      | 2203/5971 [21:26<36:39,  1.71it/s, loss=0.249, v_num=0, train/loss_simple_step=0.732, train/loss_vlb_step=0.0379, train/loss_step=0.732, global_step=214.0]     
Epoch 0:  37%|███▋      | 2204/5971 [21:29<36:42,  1.71it/s, loss=0.249, v_num=0, train/loss_simple_step=0.732, train/loss_vlb_step=0.0379, train/loss_step=0.732, global_step=214.0]
Epoch 0:  37%|███▋      | 2204/5971 [21:29<36:42,  1.71it/s, loss=0.25, v_num=0, train/loss_simple_step=0.0296, train/loss_vlb_step=0.000111, train/loss_step=0.0296, global_step=214.0]
Epoch 0:  37%|███▋      | 2205/5971 [21:29<36:42,  1.71it/s, loss=0.25, v_num=0, train/loss_simple_step=0.0296, train/loss_vlb_step=0.000111, train/loss_step=0.0296, global_step=214.0]
Epoch 0:  37%|███▋      | 2205/5971 [21:29<36:42,  1.71it/s, loss=0.238, v_num=0, train/loss_simple_step=0.0151, train/loss_vlb_step=5.58e-5, train/loss_step=0.0151, global_step=215.0]
Epoch 0:  37%|███▋      | 2206/5971 [21:30<36:42,  1.71it/s, loss=0.238, v_num=0, train/loss_simple_step=0.0151, train/loss_vlb_step=5.58e-5, train/loss_step=0.0151, global_step=215.0]
Epoch 0:  37%|███▋      | 2206/5971 [21:30<36:42,  1.71it/s, loss=0.239, v_num=0, train/loss_simple_step=0.00784, train/loss_vlb_step=3.81e-5, train/loss_step=0.00784, global_step=215.0]
Epoch 0:  37%|███▋      | 2207/5971 [21:31<36:41,  1.71it/s, loss=0.239, v_num=0, train/loss_simple_step=0.00784, train/loss_vlb_step=3.81e-5, train/loss_step=0.00784, global_step=215.0]
Epoch 0:  37%|███▋      | 2207/5971 [21:31<36:41,  1.71it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0117, train/loss_vlb_step=5.55e-5, train/loss_step=0.0117, global_step=215.0]  
Epoch 0:  37%|███▋      | 2208/5971 [21:33<36:44,  1.71it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0117, train/loss_vlb_step=5.55e-5, train/loss_step=0.0117, global_step=215.0]
Epoch 0:  37%|███▋      | 2208/5971 [21:33<36:44,  1.71it/s, loss=0.197, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000344, train/loss_step=0.104, global_step=215.0] 
Epoch 0:  37%|███▋      | 2209/5971 [21:34<36:43,  1.71it/s, loss=0.197, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000344, train/loss_step=0.104, global_step=215.0]
Epoch 0:  37%|███▋      | 2209/5971 [21:34<36:43,  1.71it/s, loss=0.182, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000472, train/loss_step=0.133, global_step=216.0]
Epoch 0:  37%|███▋      | 2210/5971 [21:35<36:43,  1.71it/s, loss=0.182, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000472, train/loss_step=0.133, global_step=216.0]
Epoch 0:  37%|███▋      | 2210/5971 [21:35<36:43,  1.71it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0539, train/loss_vlb_step=0.000184, train/loss_step=0.0539, global_step=216.0]
Epoch 0:  37%|███▋      | 2211/5971 [21:36<36:43,  1.71it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0539, train/loss_vlb_step=0.000184, train/loss_step=0.0539, global_step=216.0]
Epoch 0:  37%|███▋      | 2211/5971 [21:36<36:43,  1.71it/s, loss=0.183, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.0006, train/loss_step=0.181, global_step=216.0]    
Epoch 0:  37%|███▋      | 2212/5971 [21:39<36:46,  1.70it/s, loss=0.183, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.0006, train/loss_step=0.181, global_step=216.0]
Epoch 0:  37%|███▋      | 2212/5971 [21:39<36:46,  1.70it/s, loss=0.205, v_num=0, train/loss_simple_step=0.457, train/loss_vlb_step=0.00475, train/loss_step=0.457, global_step=216.0]
Epoch 0:  37%|███▋      | 2213/5971 [21:39<36:46,  1.70it/s, loss=0.205, v_num=0, train/loss_simple_step=0.457, train/loss_vlb_step=0.00475, train/loss_step=0.457, global_step=216.0]
Epoch 0:  37%|███▋      | 2213/5971 [21:39<36:46,  1.70it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0151, train/loss_vlb_step=6.27e-5, train/loss_step=0.0151, global_step=217.0]
Epoch 0:  37%|███▋      | 2214/5971 [21:40<36:46,  1.70it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0151, train/loss_vlb_step=6.27e-5, train/loss_step=0.0151, global_step=217.0]
Epoch 0:  37%|███▋      | 2214/5971 [21:40<36:46,  1.70it/s, loss=0.217, v_num=0, train/loss_simple_step=0.656, train/loss_vlb_step=0.011, train/loss_step=0.656, global_step=217.0]    
Epoch 0:  37%|███▋      | 2215/5971 [21:41<36:46,  1.70it/s, loss=0.217, v_num=0, train/loss_simple_step=0.656, train/loss_vlb_step=0.011, train/loss_step=0.656, global_step=217.0]
Epoch 0:  37%|███▋      | 2215/5971 [21:41<36:46,  1.70it/s, loss=0.184, v_num=0, train/loss_simple_step=0.00253, train/loss_vlb_step=1.4e-5, train/loss_step=0.00253, global_step=217.0]
Epoch 0:  37%|███▋      | 2216/5971 [21:44<36:49,  1.70it/s, loss=0.184, v_num=0, train/loss_simple_step=0.00253, train/loss_vlb_step=1.4e-5, train/loss_step=0.00253, global_step=217.0]
Epoch 0:  37%|███▋      | 2216/5971 [21:44<36:49,  1.70it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0334, train/loss_vlb_step=0.000133, train/loss_step=0.0334, global_step=217.0]
Epoch 0:  37%|███▋      | 2217/5971 [21:45<36:48,  1.70it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0334, train/loss_vlb_step=0.000133, train/loss_step=0.0334, global_step=217.0]
Epoch 0:  37%|███▋      | 2217/5971 [21:45<36:48,  1.70it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00453, train/loss_vlb_step=2.43e-5, train/loss_step=0.00453, global_step=218.0]
Epoch 0:  37%|███▋      | 2218/5971 [21:46<36:48,  1.70it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00453, train/loss_vlb_step=2.43e-5, train/loss_step=0.00453, global_step=218.0]
Epoch 0:  37%|███▋      | 2218/5971 [21:46<36:48,  1.70it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0407, train/loss_vlb_step=0.000145, train/loss_step=0.0407, global_step=218.0] 
Epoch 0:  37%|███▋      | 2219/5971 [21:46<36:48,  1.70it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0407, train/loss_vlb_step=0.000145, train/loss_step=0.0407, global_step=218.0]
Epoch 0:  37%|███▋      | 2219/5971 [21:46<36:48,  1.70it/s, loss=0.165, v_num=0, train/loss_simple_step=0.316, train/loss_vlb_step=0.00141, train/loss_step=0.316, global_step=218.0]   
Epoch 0:  37%|███▋      | 2220/5971 [21:50<36:52,  1.70it/s, loss=0.165, v_num=0, train/loss_simple_step=0.316, train/loss_vlb_step=0.00141, train/loss_step=0.316, global_step=218.0]
Epoch 0:  37%|███▋      | 2220/5971 [21:50<36:52,  1.70it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.31e-5, train/loss_step=0.00481, global_step=218.0]
Epoch 0:  37%|███▋      | 2221/5971 [21:50<36:52,  1.69it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.31e-5, train/loss_step=0.00481, global_step=218.0]
Epoch 0:  37%|███▋      | 2221/5971 [21:50<36:52,  1.69it/s, loss=0.148, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.00051, train/loss_step=0.150, global_step=219.0]    
Epoch 0:  37%|███▋      | 2222/5971 [21:51<36:52,  1.69it/s, loss=0.148, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.00051, train/loss_step=0.150, global_step=219.0]
Epoch 0:  37%|███▋      | 2222/5971 [21:51<36:52,  1.69it/s, loss=0.148, v_num=0, train/loss_simple_step=0.012, train/loss_vlb_step=5.43e-5, train/loss_step=0.012, global_step=219.0]
Epoch 0:  37%|███▋      | 2223/5971 [21:52<36:52,  1.69it/s, loss=0.148, v_num=0, train/loss_simple_step=0.012, train/loss_vlb_step=5.43e-5, train/loss_step=0.012, global_step=219.0]
Epoch 0:  37%|███▋      | 2223/5971 [21:52<36:52,  1.69it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00854, train/loss_vlb_step=3.95e-5, train/loss_step=0.00854, global_step=219.0]
Epoch 0:  37%|███▋      | 2224/5971 [21:55<36:54,  1.69it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00854, train/loss_vlb_step=3.95e-5, train/loss_step=0.00854, global_step=219.0]
Epoch 0:  37%|███▋      | 2224/5971 [21:55<36:54,  1.69it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00206, train/loss_vlb_step=1.2e-5, train/loss_step=0.00206, global_step=219.0]  
Epoch 0:  37%|███▋      | 2225/5971 [21:56<36:54,  1.69it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00206, train/loss_vlb_step=1.2e-5, train/loss_step=0.00206, global_step=219.0]
Epoch 0:  37%|███▋      | 2225/5971 [21:56<36:54,  1.69it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00964, train/loss_vlb_step=4.52e-5, train/loss_step=0.00964, global_step=220.0]
Epoch 0:  37%|███▋      | 2226/5971 [21:56<36:54,  1.69it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00964, train/loss_vlb_step=4.52e-5, train/loss_step=0.00964, global_step=220.0]
Epoch 0:  37%|███▋      | 2226/5971 [21:56<36:54,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.380, train/loss_vlb_step=0.00237, train/loss_step=0.380, global_step=220.0]   
Epoch 0:  37%|███▋      | 2227/5971 [21:57<36:54,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.380, train/loss_vlb_step=0.00237, train/loss_step=0.380, global_step=220.0]
Epoch 0:  37%|███▋      | 2227/5971 [21:57<36:54,  1.69it/s, loss=0.134, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000394, train/loss_step=0.120, global_step=220.0]
Epoch 0:  37%|███▋      | 2228/5971 [21:59<36:56,  1.69it/s, loss=0.134, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000394, train/loss_step=0.120, global_step=220.0]
Epoch 0:  37%|███▋      | 2228/5971 [21:59<36:56,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00476, train/loss_vlb_step=2.52e-5, train/loss_step=0.00476, global_step=220.0]
Epoch 0:  37%|███▋      | 2229/5971 [22:00<36:56,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00476, train/loss_vlb_step=2.52e-5, train/loss_step=0.00476, global_step=220.0]
Epoch 0:  37%|███▋      | 2229/5971 [22:00<36:56,  1.69it/s, loss=0.135, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000844, train/loss_step=0.240, global_step=221.0]   
Epoch 0:  37%|███▋      | 2230/5971 [22:01<36:56,  1.69it/s, loss=0.135, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000844, train/loss_step=0.240, global_step=221.0]
Epoch 0:  37%|███▋      | 2230/5971 [22:01<36:56,  1.69it/s, loss=0.158, v_num=0, train/loss_simple_step=0.512, train/loss_vlb_step=0.0064, train/loss_step=0.512, global_step=221.0]  
Epoch 0:  37%|███▋      | 2231/5971 [22:02<36:56,  1.69it/s, loss=0.158, v_num=0, train/loss_simple_step=0.512, train/loss_vlb_step=0.0064, train/loss_step=0.512, global_step=221.0]
Epoch 0:  37%|███▋      | 2231/5971 [22:02<36:56,  1.69it/s, loss=0.156, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000477, train/loss_step=0.144, global_step=221.0]
Epoch 0:  37%|███▋      | 2232/5971 [22:05<36:58,  1.69it/s, loss=0.156, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000477, train/loss_step=0.144, global_step=221.0]
Epoch 0:  37%|███▋      | 2232/5971 [22:05<36:58,  1.69it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0502, train/loss_vlb_step=0.000173, train/loss_step=0.0502, global_step=221.0]
Epoch 0:  37%|███▋      | 2233/5971 [22:05<36:58,  1.68it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0502, train/loss_vlb_step=0.000173, train/loss_step=0.0502, global_step=221.0]
Epoch 0:  37%|███▋      | 2233/5971 [22:05<36:58,  1.68it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00737, train/loss_vlb_step=3.61e-5, train/loss_step=0.00737, global_step=222.0]
Epoch 0:  37%|███▋      | 2234/5971 [22:06<36:58,  1.68it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00737, train/loss_vlb_step=3.61e-5, train/loss_step=0.00737, global_step=222.0]
Epoch 0:  37%|███▋      | 2234/5971 [22:06<36:58,  1.68it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0054, train/loss_vlb_step=2.77e-5, train/loss_step=0.0054, global_step=222.0]  
Epoch 0:  37%|███▋      | 2235/5971 [22:07<36:58,  1.68it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0054, train/loss_vlb_step=2.77e-5, train/loss_step=0.0054, global_step=222.0]
Epoch 0:  37%|███▋      | 2235/5971 [22:07<36:58,  1.68it/s, loss=0.113, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.00079, train/loss_step=0.213, global_step=222.0]  
Epoch 0:  37%|███▋      | 2236/5971 [22:09<37:00,  1.68it/s, loss=0.113, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.00079, train/loss_step=0.213, global_step=222.0]
Epoch 0:  37%|███▋      | 2236/5971 [22:09<37:00,  1.68it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0942, train/loss_vlb_step=0.00031, train/loss_step=0.0942, global_step=222.0]
Epoch 0:  37%|███▋      | 2237/5971 [22:10<37:00,  1.68it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0942, train/loss_vlb_step=0.00031, train/loss_step=0.0942, global_step=222.0]
Epoch 0:  37%|███▋      | 2237/5971 [22:10<37:00,  1.68it/s, loss=0.128, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.000975, train/loss_step=0.253, global_step=223.0] 
Epoch 0:  37%|███▋      | 2238/5971 [22:11<37:00,  1.68it/s, loss=0.128, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.000975, train/loss_step=0.253, global_step=223.0]
Epoch 0:  37%|███▋      | 2238/5971 [22:11<37:00,  1.68it/s, loss=0.132, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000383, train/loss_step=0.114, global_step=223.0]
Epoch 0:  37%|███▋      | 2239/5971 [22:12<37:00,  1.68it/s, loss=0.132, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000383, train/loss_step=0.114, global_step=223.0]
Epoch 0:  37%|███▋      | 2239/5971 [22:12<37:00,  1.68it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=223.0]
Epoch 0:  38%|███▊      | 2240/5971 [22:14<37:02,  1.68it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=223.0]
Epoch 0:  38%|███▊      | 2240/5971 [22:14<37:02,  1.68it/s, loss=0.122, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.00037, train/loss_step=0.112, global_step=223.0]    
Epoch 0:  38%|███▊      | 2241/5971 [22:15<37:01,  1.68it/s, loss=0.122, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.00037, train/loss_step=0.112, global_step=223.0]
Epoch 0:  38%|███▊      | 2241/5971 [22:15<37:01,  1.68it/s, loss=0.127, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.0011, train/loss_step=0.251, global_step=224.0] 
Epoch 0:  38%|███▊      | 2242/5971 [22:16<37:01,  1.68it/s, loss=0.127, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.0011, train/loss_step=0.251, global_step=224.0]
Epoch 0:  38%|███▊      | 2242/5971 [22:16<37:01,  1.68it/s, loss=0.135, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000622, train/loss_step=0.168, global_step=224.0]
Epoch 0:  38%|███▊      | 2243/5971 [22:17<37:01,  1.68it/s, loss=0.135, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000622, train/loss_step=0.168, global_step=224.0]
Epoch 0:  38%|███▊      | 2243/5971 [22:17<37:01,  1.68it/s, loss=0.147, v_num=0, train/loss_simple_step=0.248, train/loss_vlb_step=0.00106, train/loss_step=0.248, global_step=224.0] 
Epoch 0:  38%|███▊      | 2244/5971 [22:19<37:03,  1.68it/s, loss=0.147, v_num=0, train/loss_simple_step=0.248, train/loss_vlb_step=0.00106, train/loss_step=0.248, global_step=224.0]
Epoch 0:  38%|███▊      | 2244/5971 [22:19<37:03,  1.68it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:18,  2.11it/s][A
Epoch 0:  38%|███▊      | 2246/5971 [22:20<37:01,  1.68it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:   1%|          | 2/167 [00:00<01:00,  2.73it/s][A
Epoch 0:  38%|███▊      | 2248/5971 [22:20<36:58,  1.68it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:   3%|▎         | 5/167 [00:00<00:20,  7.72it/s][A
Epoch 0:  38%|███▊      | 2251/5971 [22:20<36:54,  1.68it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:   5%|▍         | 8/167 [00:00<00:13, 11.87it/s][A
Epoch 0:  38%|███▊      | 2254/5971 [22:20<36:49,  1.68it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:   7%|▋         | 11/167 [00:01<00:10, 15.26it/s][A
Epoch 0:  38%|███▊      | 2257/5971 [22:20<36:45,  1.68it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:   8%|▊         | 14/167 [00:01<00:08, 18.27it/s][A
Epoch 0:  38%|███▊      | 2260/5971 [22:20<36:40,  1.69it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  10%|█         | 17/167 [00:01<00:07, 20.58it/s][A
Epoch 0:  38%|███▊      | 2263/5971 [22:20<36:36,  1.69it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.70it/s][A
Epoch 0:  38%|███▊      | 2266/5971 [22:21<36:31,  1.69it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 23.84it/s][A
Epoch 0:  38%|███▊      | 2269/5971 [22:21<36:27,  1.69it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.09it/s][A
Epoch 0:  38%|███▊      | 2272/5971 [22:21<36:22,  1.69it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 24.62it/s][A
Epoch 0:  38%|███▊      | 2275/5971 [22:21<36:18,  1.70it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 25.02it/s][A
Epoch 0:  38%|███▊      | 2278/5971 [22:21<36:13,  1.70it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  21%|██        | 35/167 [00:02<00:05, 24.47it/s][A
Epoch 0:  38%|███▊      | 2281/5971 [22:21<36:09,  1.70it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  23%|██▎       | 38/167 [00:02<00:05, 24.72it/s][A
Epoch 0:  38%|███▊      | 2284/5971 [22:21<36:05,  1.70it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  25%|██▍       | 41/167 [00:02<00:05, 24.85it/s][A
Epoch 0:  38%|███▊      | 2287/5971 [22:21<36:00,  1.71it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  26%|██▋       | 44/167 [00:02<00:05, 23.11it/s][A
Epoch 0:  38%|███▊      | 2290/5971 [22:22<35:56,  1.71it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 24.49it/s][A
Epoch 0:  38%|███▊      | 2293/5971 [22:22<35:51,  1.71it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 25.39it/s][A
Epoch 0:  38%|███▊      | 2296/5971 [22:22<35:47,  1.71it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 26.40it/s][A
Epoch 0:  39%|███▊      | 2299/5971 [22:22<35:43,  1.71it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 26.62it/s][A
Epoch 0:  39%|███▊      | 2302/5971 [22:22<35:38,  1.72it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  35%|███▌      | 59/167 [00:02<00:04, 26.75it/s][A
Epoch 0:  39%|███▊      | 2305/5971 [22:22<35:34,  1.72it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  37%|███▋      | 62/167 [00:03<00:04, 25.37it/s][A
Epoch 0:  39%|███▊      | 2308/5971 [22:22<35:30,  1.72it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  39%|███▉      | 65/167 [00:03<00:03, 26.51it/s][A
Epoch 0:  39%|███▊      | 2312/5971 [22:22<35:24,  1.72it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  41%|████      | 68/167 [00:03<00:03, 26.61it/s][A

Validating:  43%|████▎     | 71/167 [00:03<00:03, 25.57it/s][A
Epoch 0:  39%|███▉      | 2316/5971 [22:23<35:18,  1.73it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 25.42it/s][A
Epoch 0:  39%|███▉      | 2320/5971 [22:23<35:12,  1.73it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  46%|████▌     | 77/167 [00:03<00:04, 20.34it/s][A
Epoch 0:  39%|███▉      | 2324/5971 [22:23<35:07,  1.73it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  48%|████▊     | 80/167 [00:03<00:04, 21.72it/s][A

Validating:  50%|████▉     | 83/167 [00:03<00:03, 23.20it/s][A
Epoch 0:  39%|███▉      | 2328/5971 [22:23<35:01,  1.73it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  51%|█████▏    | 86/167 [00:04<00:03, 24.14it/s][A
Epoch 0:  39%|███▉      | 2332/5971 [22:23<34:55,  1.74it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  53%|█████▎    | 89/167 [00:04<00:03, 24.39it/s][A
Epoch 0:  39%|███▉      | 2336/5971 [22:23<34:50,  1.74it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 25.80it/s][A

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 25.23it/s][A
Epoch 0:  39%|███▉      | 2340/5971 [22:24<34:44,  1.74it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 25.85it/s][A
Epoch 0:  39%|███▉      | 2344/5971 [22:24<34:39,  1.74it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  60%|██████    | 101/167 [00:04<00:02, 26.54it/s][A
Epoch 0:  39%|███▉      | 2348/5971 [22:24<34:33,  1.75it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 27.27it/s][A

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 27.90it/s][A
Epoch 0:  39%|███▉      | 2352/5971 [22:24<34:27,  1.75it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 27.12it/s][A
Epoch 0:  39%|███▉      | 2356/5971 [22:24<34:22,  1.75it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  68%|██████▊   | 113/167 [00:05<00:02, 26.07it/s][A
Epoch 0:  40%|███▉      | 2360/5971 [22:24<34:16,  1.76it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  69%|██████▉   | 116/167 [00:05<00:01, 26.32it/s][A

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 26.69it/s][A
Epoch 0:  40%|███▉      | 2364/5971 [22:24<34:11,  1.76it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 27.42it/s][A
Epoch 0:  40%|███▉      | 2368/5971 [22:25<34:05,  1.76it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 27.66it/s][A
Epoch 0:  40%|███▉      | 2372/5971 [22:25<34:00,  1.76it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 26.74it/s][A

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 25.09it/s][A
Epoch 0:  40%|███▉      | 2376/5971 [22:25<33:54,  1.77it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  80%|████████  | 134/167 [00:05<00:01, 24.64it/s][A
Epoch 0:  40%|███▉      | 2380/5971 [22:25<33:49,  1.77it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  82%|████████▏ | 137/167 [00:06<00:01, 25.71it/s][A
Epoch 0:  40%|███▉      | 2384/5971 [22:25<33:43,  1.77it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  84%|████████▍ | 140/167 [00:06<00:01, 25.63it/s][A

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 25.90it/s][A
Epoch 0:  40%|███▉      | 2388/5971 [22:25<33:38,  1.78it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 25.40it/s][A
Epoch 0:  40%|████      | 2392/5971 [22:25<33:33,  1.78it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 25.65it/s][A
Epoch 0:  40%|████      | 2396/5971 [22:26<33:27,  1.78it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 25.88it/s][A

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 25.88it/s][A
Epoch 0:  40%|████      | 2400/5971 [22:26<33:22,  1.78it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 24.58it/s][A
Epoch 0:  40%|████      | 2404/5971 [22:26<33:17,  1.79it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 25.84it/s][A
Epoch 0:  40%|████      | 2408/5971 [22:26<33:11,  1.79it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Validating:  98%|█████████▊| 164/167 [00:07<00:00, 24.49it/s][A

Validating: 100%|██████████| 167/167 [00:07<00:00, 25.19it/s][A
Epoch 0:  40%|████      | 2412/5971 [22:26<33:06,  1.79it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]
Epoch 0:  40%|████      | 2412/5971 [22:27<33:07,  1.79it/s, loss=0.154, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000534, train/loss_step=0.158, global_step=224.0]

Epoch 0:  40%|████      | 2413/5971 [22:28<33:07,  1.79it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0594, train/loss_vlb_step=0.000206, train/loss_step=0.0594, global_step=225.0]
Epoch 0:  40%|████      | 2414/5971 [22:29<33:07,  1.79it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00329, train/loss_vlb_step=1.87e-5, train/loss_step=0.00329, global_step=225.0]
Epoch 0:  40%|████      | 2415/5971 [22:30<33:07,  1.79it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.24e-5, train/loss_step=0.0147, global_step=225.0]  
Epoch 0:  40%|████      | 2416/5971 [22:32<33:08,  1.79it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.24e-5, train/loss_step=0.0147, global_step=225.0]
Epoch 0:  40%|████      | 2416/5971 [22:32<33:08,  1.79it/s, loss=0.133, v_num=0, train/loss_simple_step=0.002, train/loss_vlb_step=1.19e-5, train/loss_step=0.002, global_step=225.0]  
Epoch 0:  40%|████      | 2417/5971 [22:33<33:08,  1.79it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0355, train/loss_vlb_step=0.000135, train/loss_step=0.0355, global_step=226.0]
Epoch 0:  40%|████      | 2418/5971 [22:34<33:08,  1.79it/s, loss=0.0971, v_num=0, train/loss_simple_step=0.00773, train/loss_vlb_step=3.72e-5, train/loss_step=0.00773, global_step=226.0]
Epoch 0:  41%|████      | 2419/5971 [22:34<33:08,  1.79it/s, loss=0.094, v_num=0, train/loss_simple_step=0.0805, train/loss_vlb_step=0.000268, train/loss_step=0.0805, global_step=226.0]  
Epoch 0:  41%|████      | 2420/5971 [22:37<33:10,  1.78it/s, loss=0.094, v_num=0, train/loss_simple_step=0.0805, train/loss_vlb_step=0.000268, train/loss_step=0.0805, global_step=226.0]
Epoch 0:  41%|████      | 2420/5971 [22:37<33:10,  1.78it/s, loss=0.0955, v_num=0, train/loss_simple_step=0.0816, train/loss_vlb_step=0.000268, train/loss_step=0.0816, global_step=226.0]
Epoch 0:  41%|████      | 2421/5971 [22:38<33:10,  1.78it/s, loss=0.114, v_num=0, train/loss_simple_step=0.372, train/loss_vlb_step=0.00217, train/loss_step=0.372, global_step=227.0]    
Epoch 0:  41%|████      | 2422/5971 [22:39<33:10,  1.78it/s, loss=0.122, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000544, train/loss_step=0.165, global_step=227.0]
Epoch 0:  41%|████      | 2423/5971 [22:39<33:10,  1.78it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00191, train/loss_vlb_step=1.16e-5, train/loss_step=0.00191, global_step=227.0]
Epoch 0:  41%|████      | 2424/5971 [22:42<33:12,  1.78it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00191, train/loss_vlb_step=1.16e-5, train/loss_step=0.00191, global_step=227.0]
Epoch 0:  41%|████      | 2424/5971 [22:42<33:12,  1.78it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0413, train/loss_vlb_step=0.000146, train/loss_step=0.0413, global_step=227.0] 
Epoch 0:  41%|████      | 2425/5971 [22:42<33:12,  1.78it/s, loss=0.103, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000502, train/loss_step=0.149, global_step=228.0]  
Epoch 0:  41%|████      | 2426/5971 [22:43<33:12,  1.78it/s, loss=0.11, v_num=0, train/loss_simple_step=0.241, train/loss_vlb_step=0.000912, train/loss_step=0.241, global_step=228.0] 
Epoch 0:  41%|████      | 2427/5971 [22:44<33:11,  1.78it/s, loss=0.121, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000855, train/loss_step=0.222, global_step=228.0]
Epoch 0:  41%|████      | 2428/5971 [22:47<33:14,  1.78it/s, loss=0.121, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000855, train/loss_step=0.222, global_step=228.0]
Epoch 0:  41%|████      | 2428/5971 [22:47<33:14,  1.78it/s, loss=0.123, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000559, train/loss_step=0.168, global_step=228.0]
Epoch 0:  41%|████      | 2429/5971 [22:47<33:13,  1.78it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00133, train/loss_vlb_step=8.05e-6, train/loss_step=0.00133, global_step=229.0]
Epoch 0:  41%|████      | 2430/5971 [22:48<33:13,  1.78it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=7.52e-5, train/loss_step=0.0173, global_step=229.0]  
Epoch 0:  41%|████      | 2431/5971 [22:49<33:13,  1.78it/s, loss=0.121, v_num=0, train/loss_simple_step=0.602, train/loss_vlb_step=0.0117, train/loss_step=0.602, global_step=229.0]   
Epoch 0:  41%|████      | 2432/5971 [22:51<33:15,  1.77it/s, loss=0.121, v_num=0, train/loss_simple_step=0.602, train/loss_vlb_step=0.0117, train/loss_step=0.602, global_step=229.0]
Epoch 0:  41%|████      | 2432/5971 [22:51<33:15,  1.77it/s, loss=0.125, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.000921, train/loss_step=0.236, global_step=229.0]
Epoch 0:  41%|████      | 2433/5971 [22:52<33:15,  1.77it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0482, train/loss_vlb_step=0.000174, train/loss_step=0.0482, global_step=230.0]
Epoch 0:  41%|████      | 2434/5971 [22:53<33:15,  1.77it/s, loss=0.139, v_num=0, train/loss_simple_step=0.290, train/loss_vlb_step=0.00117, train/loss_step=0.290, global_step=230.0]   
Epoch 0:  41%|████      | 2435/5971 [22:54<33:15,  1.77it/s, loss=0.145, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.00043, train/loss_step=0.131, global_step=230.0]
Epoch 0:  41%|████      | 2436/5971 [22:56<33:16,  1.77it/s, loss=0.145, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.00043, train/loss_step=0.131, global_step=230.0]
Epoch 0:  41%|████      | 2436/5971 [22:56<33:16,  1.77it/s, loss=0.151, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.00044, train/loss_step=0.134, global_step=230.0]
Epoch 0:  41%|████      | 2437/5971 [22:57<33:16,  1.77it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0575, train/loss_vlb_step=0.00019, train/loss_step=0.0575, global_step=231.0]
Epoch 0:  41%|████      | 2438/5971 [22:58<33:16,  1.77it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00935, train/loss_vlb_step=4.46e-5, train/loss_step=0.00935, global_step=231.0]
Epoch 0:  41%|████      | 2439/5971 [22:59<33:16,  1.77it/s, loss=0.176, v_num=0, train/loss_simple_step=0.546, train/loss_vlb_step=0.00529, train/loss_step=0.546, global_step=231.0]    
Epoch 0:  41%|████      | 2440/5971 [23:01<33:18,  1.77it/s, loss=0.176, v_num=0, train/loss_simple_step=0.546, train/loss_vlb_step=0.00529, train/loss_step=0.546, global_step=231.0]
Epoch 0:  41%|████      | 2440/5971 [23:01<33:18,  1.77it/s, loss=0.173, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=9.06e-5, train/loss_step=0.026, global_step=231.0]
Epoch 0:  41%|████      | 2441/5971 [23:02<33:18,  1.77it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0178, train/loss_vlb_step=7.69e-5, train/loss_step=0.0178, global_step=232.0]
Epoch 0:  41%|████      | 2442/5971 [23:03<33:18,  1.77it/s, loss=0.163, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00184, train/loss_step=0.319, global_step=232.0]  
Epoch 0:  41%|████      | 2443/5971 [23:04<33:18,  1.77it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0922, train/loss_vlb_step=0.000308, train/loss_step=0.0922, global_step=232.0]
Epoch 0:  41%|████      | 2444/5971 [23:06<33:20,  1.76it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0922, train/loss_vlb_step=0.000308, train/loss_step=0.0922, global_step=232.0]
Epoch 0:  41%|████      | 2444/5971 [23:06<33:20,  1.76it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0378, train/loss_vlb_step=0.000139, train/loss_step=0.0378, global_step=232.0]
Epoch 0:  41%|████      | 2445/5971 [23:07<33:19,  1.76it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=5.38e-5, train/loss_step=0.0114, global_step=233.0]  
Epoch 0:  41%|████      | 2446/5971 [23:08<33:19,  1.76it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0718, train/loss_vlb_step=0.000241, train/loss_step=0.0718, global_step=233.0]
Epoch 0:  41%|████      | 2447/5971 [23:09<33:19,  1.76it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00656, train/loss_vlb_step=3.39e-5, train/loss_step=0.00656, global_step=233.0]
Epoch 0:  41%|████      | 2448/5971 [23:11<33:22,  1.76it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00656, train/loss_vlb_step=3.39e-5, train/loss_step=0.00656, global_step=233.0]
Epoch 0:  41%|████      | 2448/5971 [23:11<33:22,  1.76it/s, loss=0.151, v_num=0, train/loss_simple_step=0.370, train/loss_vlb_step=0.00157, train/loss_step=0.370, global_step=233.0]    
Epoch 0:  41%|████      | 2449/5971 [23:12<33:21,  1.76it/s, loss=0.186, v_num=0, train/loss_simple_step=0.695, train/loss_vlb_step=0.0151, train/loss_step=0.695, global_step=234.0] 
Epoch 0:  41%|████      | 2450/5971 [23:13<33:21,  1.76it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0827, train/loss_vlb_step=0.000284, train/loss_step=0.0827, global_step=234.0]
Epoch 0:  41%|████      | 2451/5971 [23:14<33:21,  1.76it/s, loss=0.183, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00249, train/loss_step=0.476, global_step=234.0]   
Epoch 0:  41%|████      | 2452/5971 [23:16<33:23,  1.76it/s, loss=0.183, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00249, train/loss_step=0.476, global_step=234.0]
Epoch 0:  41%|████      | 2452/5971 [23:16<33:23,  1.76it/s, loss=0.184, v_num=0, train/loss_simple_step=0.258, train/loss_vlb_step=0.00111, train/loss_step=0.258, global_step=234.0]
Epoch 0:  41%|████      | 2453/5971 [23:17<33:23,  1.76it/s, loss=0.19, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000537, train/loss_step=0.161, global_step=235.0]
Epoch 0:  41%|████      | 2454/5971 [23:18<33:23,  1.76it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0387, train/loss_vlb_step=0.000142, train/loss_step=0.0387, global_step=235.0]
Epoch 0:  41%|████      | 2455/5971 [23:19<33:23,  1.76it/s, loss=0.184, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.00112, train/loss_step=0.262, global_step=235.0]   
Epoch 0:  41%|████      | 2456/5971 [23:21<33:24,  1.75it/s, loss=0.184, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.00112, train/loss_step=0.262, global_step=235.0]
Epoch 0:  41%|████      | 2456/5971 [23:21<33:24,  1.75it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00295, train/loss_vlb_step=1.62e-5, train/loss_step=0.00295, global_step=235.0]
Epoch 0:  41%|████      | 2457/5971 [23:22<33:24,  1.75it/s, loss=0.174, v_num=0, train/loss_simple_step=0.00384, train/loss_vlb_step=2.09e-5, train/loss_step=0.00384, global_step=236.0]
Epoch 0:  41%|████      | 2458/5971 [23:23<33:24,  1.75it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.59e-5, train/loss_step=0.0132, global_step=236.0]  
Epoch 0:  41%|████      | 2459/5971 [23:23<33:24,  1.75it/s, loss=0.158, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.000787, train/loss_step=0.210, global_step=236.0] 
Epoch 0:  41%|████      | 2460/5971 [23:26<33:26,  1.75it/s, loss=0.158, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.000787, train/loss_step=0.210, global_step=236.0]
Epoch 0:  41%|████      | 2460/5971 [23:26<33:26,  1.75it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00376, train/loss_vlb_step=2.02e-5, train/loss_step=0.00376, global_step=236.0]
Epoch 0:  41%|████      | 2461/5971 [23:27<33:25,  1.75it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00257, train/loss_vlb_step=1.49e-5, train/loss_step=0.00257, global_step=237.0]
Epoch 0:  41%|████      | 2462/5971 [23:27<33:25,  1.75it/s, loss=0.153, v_num=0, train/loss_simple_step=0.259, train/loss_vlb_step=0.00101, train/loss_step=0.259, global_step=237.0]    
Epoch 0:  41%|████      | 2463/5971 [23:28<33:25,  1.75it/s, loss=0.164, v_num=0, train/loss_simple_step=0.317, train/loss_vlb_step=0.00158, train/loss_step=0.317, global_step=237.0]
Epoch 0:  41%|████▏     | 2464/5971 [23:31<33:27,  1.75it/s, loss=0.164, v_num=0, train/loss_simple_step=0.317, train/loss_vlb_step=0.00158, train/loss_step=0.317, global_step=237.0]
Epoch 0:  41%|████▏     | 2464/5971 [23:31<33:27,  1.75it/s, loss=0.177, v_num=0, train/loss_simple_step=0.285, train/loss_vlb_step=0.00105, train/loss_step=0.285, global_step=237.0]
Epoch 0:  41%|████▏     | 2465/5971 [23:32<33:27,  1.75it/s, loss=0.191, v_num=0, train/loss_simple_step=0.302, train/loss_vlb_step=0.00152, train/loss_step=0.302, global_step=238.0]
Epoch 0:  41%|████▏     | 2466/5971 [23:33<33:27,  1.75it/s, loss=0.217, v_num=0, train/loss_simple_step=0.589, train/loss_vlb_step=0.00851, train/loss_step=0.589, global_step=238.0]
Epoch 0:  41%|████▏     | 2467/5971 [23:33<33:27,  1.75it/s, loss=0.218, v_num=0, train/loss_simple_step=0.0255, train/loss_vlb_step=9.68e-5, train/loss_step=0.0255, global_step=238.0]
Epoch 0:  41%|████▏     | 2468/5971 [23:36<33:29,  1.74it/s, loss=0.218, v_num=0, train/loss_simple_step=0.0255, train/loss_vlb_step=9.68e-5, train/loss_step=0.0255, global_step=238.0]
Epoch 0:  41%|████▏     | 2468/5971 [23:36<33:29,  1.74it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0186, train/loss_vlb_step=7.55e-5, train/loss_step=0.0186, global_step=238.0]  
Epoch 0:  41%|████▏     | 2469/5971 [23:37<33:29,  1.74it/s, loss=0.166, v_num=0, train/loss_simple_step=0.015, train/loss_vlb_step=6.2e-5, train/loss_step=0.015, global_step=239.0] 
Epoch 0:  41%|████▏     | 2470/5971 [23:38<33:29,  1.74it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0279, train/loss_vlb_step=0.00011, train/loss_step=0.0279, global_step=239.0]
Epoch 0:  41%|████▏     | 2471/5971 [23:39<33:29,  1.74it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0351, train/loss_vlb_step=0.000127, train/loss_step=0.0351, global_step=239.0]
Epoch 0:  41%|████▏     | 2472/5971 [23:41<33:30,  1.74it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0351, train/loss_vlb_step=0.000127, train/loss_step=0.0351, global_step=239.0]
Epoch 0:  41%|████▏     | 2472/5971 [23:41<33:30,  1.74it/s, loss=0.148, v_num=0, train/loss_simple_step=0.396, train/loss_vlb_step=0.00261, train/loss_step=0.396, global_step=239.0]   
Epoch 0:  41%|████▏     | 2473/5971 [23:42<33:30,  1.74it/s, loss=0.152, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.000922, train/loss_step=0.236, global_step=240.0]
Epoch 0:  41%|████▏     | 2474/5971 [23:43<33:30,  1.74it/s, loss=0.171, v_num=0, train/loss_simple_step=0.415, train/loss_vlb_step=0.0025, train/loss_step=0.415, global_step=240.0]  
Epoch 0:  41%|████▏     | 2475/5971 [23:43<33:30,  1.74it/s, loss=0.186, v_num=0, train/loss_simple_step=0.571, train/loss_vlb_step=0.00844, train/loss_step=0.571, global_step=240.0]
Epoch 0:  41%|████▏     | 2476/5971 [23:46<33:32,  1.74it/s, loss=0.186, v_num=0, train/loss_simple_step=0.571, train/loss_vlb_step=0.00844, train/loss_step=0.571, global_step=240.0]
Epoch 0:  41%|████▏     | 2476/5971 [23:46<33:32,  1.74it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0474, train/loss_vlb_step=0.000166, train/loss_step=0.0474, global_step=240.0]
Epoch 0:  41%|████▏     | 2477/5971 [23:46<33:32,  1.74it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0177, train/loss_vlb_step=7.79e-5, train/loss_step=0.0177, global_step=241.0] 
Epoch 0:  42%|████▏     | 2478/5971 [23:47<33:31,  1.74it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.87e-5, train/loss_step=0.0149, global_step=241.0]
Epoch 0:  42%|████▏     | 2479/5971 [23:48<33:31,  1.74it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0776, train/loss_vlb_step=0.00026, train/loss_step=0.0776, global_step=241.0]
Epoch 0:  42%|████▏     | 2480/5971 [23:51<33:33,  1.73it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0776, train/loss_vlb_step=0.00026, train/loss_step=0.0776, global_step=241.0]
Epoch 0:  42%|████▏     | 2480/5971 [23:51<33:33,  1.73it/s, loss=0.196, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00109, train/loss_step=0.269, global_step=241.0]  
Epoch 0:  42%|████▏     | 2481/5971 [23:51<33:33,  1.73it/s, loss=0.196, v_num=0, train/loss_simple_step=0.00865, train/loss_vlb_step=3.99e-5, train/loss_step=0.00865, global_step=242.0]
Epoch 0:  42%|████▏     | 2482/5971 [23:52<33:33,  1.73it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0299, train/loss_vlb_step=0.000117, train/loss_step=0.0299, global_step=242.0] 
Epoch 0:  42%|████▏     | 2483/5971 [23:53<33:33,  1.73it/s, loss=0.181, v_num=0, train/loss_simple_step=0.238, train/loss_vlb_step=0.000972, train/loss_step=0.238, global_step=242.0]  
Epoch 0:  42%|████▏     | 2484/5971 [23:55<33:35,  1.73it/s, loss=0.181, v_num=0, train/loss_simple_step=0.238, train/loss_vlb_step=0.000972, train/loss_step=0.238, global_step=242.0]
Epoch 0:  42%|████▏     | 2484/5971 [23:55<33:35,  1.73it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00414, train/loss_vlb_step=2.2e-5, train/loss_step=0.00414, global_step=242.0]
Epoch 0:  42%|████▏     | 2485/5971 [23:56<33:34,  1.73it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00224, train/loss_vlb_step=1.34e-5, train/loss_step=0.00224, global_step=243.0]
Epoch 0:  42%|████▏     | 2486/5971 [23:57<33:34,  1.73it/s, loss=0.149, v_num=0, train/loss_simple_step=0.524, train/loss_vlb_step=0.00383, train/loss_step=0.524, global_step=243.0]    
Epoch 0:  42%|████▏     | 2487/5971 [23:58<33:34,  1.73it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0487, train/loss_vlb_step=0.000173, train/loss_step=0.0487, global_step=243.0]
Epoch 0:  42%|████▏     | 2488/5971 [24:00<33:36,  1.73it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0487, train/loss_vlb_step=0.000173, train/loss_step=0.0487, global_step=243.0]
Epoch 0:  42%|████▏     | 2488/5971 [24:00<33:36,  1.73it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0158, train/loss_vlb_step=6.34e-5, train/loss_step=0.0158, global_step=243.0] 
Epoch 0:  42%|████▏     | 2489/5971 [24:01<33:36,  1.73it/s, loss=0.156, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000479, train/loss_step=0.145, global_step=244.0]
Epoch 0:  42%|████▏     | 2490/5971 [24:02<33:36,  1.73it/s, loss=0.16, v_num=0, train/loss_simple_step=0.099, train/loss_vlb_step=0.000325, train/loss_step=0.099, global_step=244.0] 
Epoch 0:  42%|████▏     | 2491/5971 [24:03<33:35,  1.73it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0449, train/loss_vlb_step=0.000156, train/loss_step=0.0449, global_step=244.0]
Epoch 0:  42%|████▏     | 2492/5971 [24:05<33:37,  1.72it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0449, train/loss_vlb_step=0.000156, train/loss_step=0.0449, global_step=244.0]
Epoch 0:  42%|████▏     | 2492/5971 [24:05<33:37,  1.72it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0298, train/loss_vlb_step=0.000115, train/loss_step=0.0298, global_step=244.0]
Epoch 0:  42%|████▏     | 2493/5971 [24:06<33:37,  1.72it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.13e-5, train/loss_step=0.0138, global_step=245.0] 
Epoch 0:  42%|████▏     | 2494/5971 [24:07<33:37,  1.72it/s, loss=0.121, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000819, train/loss_step=0.218, global_step=245.0] 
Epoch 0:  42%|████▏     | 2495/5971 [24:08<33:37,  1.72it/s, loss=0.0929, v_num=0, train/loss_simple_step=0.00931, train/loss_vlb_step=4.43e-5, train/loss_step=0.00931, global_step=245.0]
Epoch 0:  42%|████▏     | 2496/5971 [24:10<33:39,  1.72it/s, loss=0.0929, v_num=0, train/loss_simple_step=0.00931, train/loss_vlb_step=4.43e-5, train/loss_step=0.00931, global_step=245.0]
Epoch 0:  42%|████▏     | 2496/5971 [24:10<33:39,  1.72it/s, loss=0.0949, v_num=0, train/loss_simple_step=0.0887, train/loss_vlb_step=0.000292, train/loss_step=0.0887, global_step=245.0] 
Epoch 0:  42%|████▏     | 2497/5971 [24:11<33:39,  1.72it/s, loss=0.103, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=246.0]   
Epoch 0:  42%|████▏     | 2498/5971 [24:12<33:38,  1.72it/s, loss=0.109, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000446, train/loss_step=0.135, global_step=246.0]
Epoch 0:  42%|████▏     | 2499/5971 [24:13<33:38,  1.72it/s, loss=0.129, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00309, train/loss_step=0.476, global_step=246.0] 
Epoch 0:  42%|████▏     | 2500/5971 [24:15<33:40,  1.72it/s, loss=0.129, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00309, train/loss_step=0.476, global_step=246.0]
Epoch 0:  42%|████▏     | 2500/5971 [24:15<33:40,  1.72it/s, loss=0.132, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00163, train/loss_step=0.329, global_step=246.0]
Epoch 0:  42%|████▏     | 2501/5971 [24:16<33:40,  1.72it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00319, train/loss_vlb_step=1.77e-5, train/loss_step=0.00319, global_step=247.0]
Epoch 0:  42%|████▏     | 2502/5971 [24:17<33:39,  1.72it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0324, train/loss_vlb_step=0.000128, train/loss_step=0.0324, global_step=247.0] 
Epoch 0:  42%|████▏     | 2503/5971 [24:18<33:39,  1.72it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00215, train/loss_vlb_step=1.24e-5, train/loss_step=0.00215, global_step=247.0]
Epoch 0:  42%|████▏     | 2504/5971 [24:20<33:41,  1.72it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00215, train/loss_vlb_step=1.24e-5, train/loss_step=0.00215, global_step=247.0]
Epoch 0:  42%|████▏     | 2504/5971 [24:20<33:41,  1.72it/s, loss=0.131, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000892, train/loss_step=0.233, global_step=247.0]  
Epoch 0:  42%|████▏     | 2505/5971 [24:21<33:41,  1.71it/s, loss=0.156, v_num=0, train/loss_simple_step=0.492, train/loss_vlb_step=0.00269, train/loss_step=0.492, global_step=248.0] 
Epoch 0:  42%|████▏     | 2506/5971 [24:22<33:40,  1.71it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0137, train/loss_vlb_step=5.8e-5, train/loss_step=0.0137, global_step=248.0]
Epoch 0:  42%|████▏     | 2507/5971 [24:23<33:40,  1.71it/s, loss=0.156, v_num=0, train/loss_simple_step=0.557, train/loss_vlb_step=0.00532, train/loss_step=0.557, global_step=248.0]
Epoch 0:  42%|████▏     | 2508/5971 [24:25<33:42,  1.71it/s, loss=0.156, v_num=0, train/loss_simple_step=0.557, train/loss_vlb_step=0.00532, train/loss_step=0.557, global_step=248.0]
Epoch 0:  42%|████▏     | 2508/5971 [24:25<33:42,  1.71it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0378, train/loss_vlb_step=0.000141, train/loss_step=0.0378, global_step=248.0]
Epoch 0:  42%|████▏     | 2509/5971 [24:26<33:42,  1.71it/s, loss=0.15, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.96e-5, train/loss_step=0.021, global_step=249.0]    
Epoch 0:  42%|████▏     | 2510/5971 [24:26<33:41,  1.71it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00273, train/loss_vlb_step=1.61e-5, train/loss_step=0.00273, global_step=249.0]
Epoch 0:  42%|████▏     | 2511/5971 [24:27<33:41,  1.71it/s, loss=0.167, v_num=0, train/loss_simple_step=0.463, train/loss_vlb_step=0.0052, train/loss_step=0.463, global_step=249.0]     
Epoch 0:  42%|████▏     | 2512/5971 [24:29<33:43,  1.71it/s, loss=0.167, v_num=0, train/loss_simple_step=0.463, train/loss_vlb_step=0.0052, train/loss_step=0.463, global_step=249.0]
Epoch 0:  42%|████▏     | 2512/5971 [24:29<33:43,  1.71it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:01<02:46,  1.00s/it][A

Validating:   1%|          | 2/167 [00:01<01:50,  1.49it/s][A
Epoch 0:  42%|████▏     | 2516/5971 [24:31<33:39,  1.71it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:   3%|▎         | 5/167 [00:01<00:34,  4.64it/s][A
Epoch 0:  42%|████▏     | 2520/5971 [24:31<33:34,  1.71it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:   5%|▍         | 8/167 [00:01<00:19,  7.98it/s][A

Validating:   7%|▋         | 11/167 [00:01<00:14, 11.14it/s][A
Epoch 0:  42%|████▏     | 2524/5971 [24:31<33:29,  1.72it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:   8%|▊         | 14/167 [00:01<00:11, 13.86it/s][A
Epoch 0:  42%|████▏     | 2528/5971 [24:31<33:23,  1.72it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  10%|█         | 17/167 [00:02<00:09, 16.23it/s][A
Epoch 0:  42%|████▏     | 2532/5971 [24:32<33:18,  1.72it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  12%|█▏        | 20/167 [00:02<00:07, 18.39it/s][A

Validating:  14%|█▍        | 23/167 [00:02<00:07, 19.39it/s][A
Epoch 0:  42%|████▏     | 2536/5971 [24:32<33:13,  1.72it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  16%|█▌        | 26/167 [00:02<00:06, 21.10it/s][A
Epoch 0:  43%|████▎     | 2540/5971 [24:32<33:08,  1.73it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  17%|█▋        | 29/167 [00:02<00:06, 22.26it/s][A
Epoch 0:  43%|████▎     | 2544/5971 [24:32<33:02,  1.73it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  19%|█▉        | 32/167 [00:02<00:05, 22.88it/s][A

Validating:  21%|██        | 35/167 [00:02<00:05, 23.73it/s][A
Epoch 0:  43%|████▎     | 2548/5971 [24:32<32:57,  1.73it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  23%|██▎       | 38/167 [00:02<00:05, 23.99it/s][A
Epoch 0:  43%|████▎     | 2552/5971 [24:32<32:52,  1.73it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  25%|██▍       | 41/167 [00:02<00:05, 25.04it/s][A
Epoch 0:  43%|████▎     | 2556/5971 [24:33<32:47,  1.74it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  26%|██▋       | 44/167 [00:03<00:04, 25.50it/s][A

Validating:  28%|██▊       | 47/167 [00:03<00:04, 26.47it/s][A
Epoch 0:  43%|████▎     | 2560/5971 [24:33<32:42,  1.74it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  30%|██▉       | 50/167 [00:03<00:04, 26.61it/s][A
Epoch 0:  43%|████▎     | 2564/5971 [24:33<32:36,  1.74it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  32%|███▏      | 53/167 [00:03<00:04, 26.54it/s][A
Epoch 0:  43%|████▎     | 2568/5971 [24:33<32:31,  1.74it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  34%|███▎      | 56/167 [00:03<00:04, 26.74it/s][A

Validating:  35%|███▌      | 59/167 [00:03<00:04, 26.74it/s][A
Epoch 0:  43%|████▎     | 2572/5971 [24:33<32:26,  1.75it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  37%|███▋      | 62/167 [00:03<00:03, 26.91it/s][A
Epoch 0:  43%|████▎     | 2576/5971 [24:33<32:21,  1.75it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  39%|███▉      | 65/167 [00:03<00:03, 26.40it/s][A
Epoch 0:  43%|████▎     | 2580/5971 [24:33<32:16,  1.75it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  41%|████      | 68/167 [00:03<00:03, 26.34it/s][A

Validating:  43%|████▎     | 71/167 [00:04<00:03, 25.82it/s][A
Epoch 0:  43%|████▎     | 2584/5971 [24:34<32:11,  1.75it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  44%|████▍     | 74/167 [00:04<00:03, 25.44it/s][A
Epoch 0:  43%|████▎     | 2588/5971 [24:34<32:06,  1.76it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  46%|████▌     | 77/167 [00:04<00:03, 24.93it/s][A
Epoch 0:  43%|████▎     | 2592/5971 [24:34<32:01,  1.76it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  48%|████▊     | 80/167 [00:04<00:03, 25.10it/s][A

Validating:  50%|████▉     | 83/167 [00:04<00:03, 25.99it/s][A
Epoch 0:  43%|████▎     | 2596/5971 [24:34<31:56,  1.76it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  51%|█████▏    | 86/167 [00:04<00:03, 25.63it/s][A
Epoch 0:  44%|████▎     | 2600/5971 [24:34<31:51,  1.76it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  53%|█████▎    | 89/167 [00:04<00:03, 25.24it/s][A
Epoch 0:  44%|████▎     | 2604/5971 [24:34<31:46,  1.77it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 25.78it/s][A

Validating:  57%|█████▋    | 95/167 [00:05<00:02, 25.09it/s][A
Epoch 0:  44%|████▎     | 2608/5971 [24:35<31:41,  1.77it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  59%|█████▊    | 98/167 [00:05<00:02, 25.32it/s][A
Epoch 0:  44%|████▎     | 2612/5971 [24:35<31:36,  1.77it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  60%|██████    | 101/167 [00:05<00:02, 24.52it/s][A
Epoch 0:  44%|████▍     | 2616/5971 [24:35<31:31,  1.77it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  62%|██████▏   | 104/167 [00:05<00:02, 23.80it/s][A

Validating:  64%|██████▍   | 107/167 [00:05<00:02, 23.70it/s][A
Epoch 0:  44%|████▍     | 2620/5971 [24:35<31:26,  1.78it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  66%|██████▌   | 110/167 [00:05<00:02, 23.50it/s][A
Epoch 0:  44%|████▍     | 2624/5971 [24:35<31:21,  1.78it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  68%|██████▊   | 113/167 [00:05<00:02, 23.93it/s][A
Epoch 0:  44%|████▍     | 2628/5971 [24:35<31:16,  1.78it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  69%|██████▉   | 116/167 [00:05<00:02, 23.86it/s][A

Validating:  71%|███████▏  | 119/167 [00:06<00:01, 25.13it/s][A
Epoch 0:  44%|████▍     | 2632/5971 [24:36<31:11,  1.78it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  73%|███████▎  | 122/167 [00:06<00:01, 23.85it/s][A
Epoch 0:  44%|████▍     | 2636/5971 [24:36<31:06,  1.79it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  75%|███████▍  | 125/167 [00:06<00:01, 23.60it/s][A
Epoch 0:  44%|████▍     | 2640/5971 [24:36<31:02,  1.79it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  77%|███████▋  | 128/167 [00:06<00:01, 24.92it/s][A

Validating:  78%|███████▊  | 131/167 [00:06<00:01, 25.39it/s][A
Epoch 0:  44%|████▍     | 2644/5971 [24:36<30:57,  1.79it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  80%|████████  | 134/167 [00:06<00:01, 26.09it/s][A
Epoch 0:  44%|████▍     | 2648/5971 [24:36<30:52,  1.79it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  82%|████████▏ | 137/167 [00:06<00:01, 26.45it/s][A
Epoch 0:  44%|████▍     | 2652/5971 [24:36<30:47,  1.80it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  84%|████████▍ | 140/167 [00:06<00:01, 26.31it/s][A

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 26.92it/s][A
Epoch 0:  44%|████▍     | 2656/5971 [24:36<30:42,  1.80it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  87%|████████▋ | 146/167 [00:07<00:00, 24.49it/s][A
Epoch 0:  45%|████▍     | 2660/5971 [24:37<30:37,  1.80it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  89%|████████▉ | 149/167 [00:07<00:00, 24.09it/s][A
Epoch 0:  45%|████▍     | 2664/5971 [24:37<30:33,  1.80it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  91%|█████████ | 152/167 [00:07<00:00, 24.22it/s][A

Validating:  93%|█████████▎| 155/167 [00:07<00:00, 25.31it/s][A
Epoch 0:  45%|████▍     | 2668/5971 [24:37<30:28,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  95%|█████████▍| 158/167 [00:07<00:00, 24.72it/s][A
Epoch 0:  45%|████▍     | 2672/5971 [24:37<30:23,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  97%|█████████▋| 162/167 [00:07<00:00, 26.29it/s][A
Epoch 0:  45%|████▍     | 2676/5971 [24:37<30:18,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Validating:  99%|█████████▉| 165/167 [00:07<00:00, 26.44it/s][A
Epoch 0:  45%|████▍     | 2680/5971 [24:37<30:14,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]
Epoch 0:  45%|████▍     | 2680/5971 [24:38<30:14,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=249.0]

Epoch 0:  45%|████▍     | 2681/5971 [24:39<30:14,  1.81it/s, loss=0.197, v_num=0, train/loss_simple_step=0.647, train/loss_vlb_step=0.00809, train/loss_step=0.647, global_step=250.0]  
Epoch 0:  45%|████▍     | 2682/5971 [24:40<30:14,  1.81it/s, loss=0.193, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000498, train/loss_step=0.142, global_step=250.0]
Epoch 0:  45%|████▍     | 2683/5971 [24:40<30:14,  1.81it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0525, train/loss_vlb_step=0.000184, train/loss_step=0.0525, global_step=250.0]
Epoch 0:  45%|████▍     | 2684/5971 [24:44<30:16,  1.81it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0525, train/loss_vlb_step=0.000184, train/loss_step=0.0525, global_step=250.0]
Epoch 0:  45%|████▍     | 2684/5971 [24:44<30:16,  1.81it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0445, train/loss_vlb_step=0.000163, train/loss_step=0.0445, global_step=250.0]
Epoch 0:  45%|████▍     | 2685/5971 [24:44<30:16,  1.81it/s, loss=0.184, v_num=0, train/loss_simple_step=0.00232, train/loss_vlb_step=1.31e-5, train/loss_step=0.00232, global_step=251.0]
Epoch 0:  45%|████▍     | 2686/5971 [24:45<30:16,  1.81it/s, loss=0.203, v_num=0, train/loss_simple_step=0.513, train/loss_vlb_step=0.00353, train/loss_step=0.513, global_step=251.0]    
Epoch 0:  45%|████▌     | 2687/5971 [24:46<30:16,  1.81it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.58e-5, train/loss_step=0.0028, global_step=251.0]
Epoch 0:  45%|████▌     | 2688/5971 [24:48<30:17,  1.81it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.58e-5, train/loss_step=0.0028, global_step=251.0]
Epoch 0:  45%|████▌     | 2688/5971 [24:48<30:17,  1.81it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00292, train/loss_vlb_step=1.63e-5, train/loss_step=0.00292, global_step=251.0]
Epoch 0:  45%|████▌     | 2689/5971 [24:49<30:17,  1.81it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.83e-5, train/loss_step=0.0111, global_step=252.0]  
Epoch 0:  45%|████▌     | 2690/5971 [24:50<30:17,  1.81it/s, loss=0.169, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000444, train/loss_step=0.135, global_step=252.0] 
Epoch 0:  45%|████▌     | 2691/5971 [24:51<30:17,  1.81it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0292, train/loss_vlb_step=0.000114, train/loss_step=0.0292, global_step=252.0]
Epoch 0:  45%|████▌     | 2692/5971 [24:53<30:18,  1.80it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0292, train/loss_vlb_step=0.000114, train/loss_step=0.0292, global_step=252.0]
Epoch 0:  45%|████▌     | 2692/5971 [24:53<30:18,  1.80it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00338, train/loss_vlb_step=1.84e-5, train/loss_step=0.00338, global_step=252.0]
Epoch 0:  45%|████▌     | 2693/5971 [24:54<30:18,  1.80it/s, loss=0.143, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000563, train/loss_step=0.168, global_step=253.0]   
Epoch 0:  45%|████▌     | 2694/5971 [24:55<30:18,  1.80it/s, loss=0.149, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000453, train/loss_step=0.133, global_step=253.0]
Epoch 0:  45%|████▌     | 2695/5971 [24:56<30:18,  1.80it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00626, train/loss_vlb_step=3.14e-5, train/loss_step=0.00626, global_step=253.0]
Epoch 0:  45%|████▌     | 2696/5971 [24:58<30:19,  1.80it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00626, train/loss_vlb_step=3.14e-5, train/loss_step=0.00626, global_step=253.0]
Epoch 0:  45%|████▌     | 2696/5971 [24:58<30:19,  1.80it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00342, train/loss_vlb_step=1.86e-5, train/loss_step=0.00342, global_step=253.0]
Epoch 0:  45%|████▌     | 2697/5971 [24:59<30:19,  1.80it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0264, train/loss_vlb_step=0.000104, train/loss_step=0.0264, global_step=254.0]  
Epoch 0:  45%|████▌     | 2698/5971 [25:00<30:19,  1.80it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00434, train/loss_vlb_step=2.32e-5, train/loss_step=0.00434, global_step=254.0]
Epoch 0:  45%|████▌     | 2699/5971 [25:01<30:19,  1.80it/s, loss=0.103, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000427, train/loss_step=0.130, global_step=254.0]  
Epoch 0:  45%|████▌     | 2700/5971 [25:03<30:20,  1.80it/s, loss=0.103, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000427, train/loss_step=0.130, global_step=254.0]
Epoch 0:  45%|████▌     | 2700/5971 [25:03<30:20,  1.80it/s, loss=0.112, v_num=0, train/loss_simple_step=0.191, train/loss_vlb_step=0.000701, train/loss_step=0.191, global_step=254.0]
Epoch 0:  45%|████▌     | 2701/5971 [25:04<30:20,  1.80it/s, loss=0.0801, v_num=0, train/loss_simple_step=0.00233, train/loss_vlb_step=1.35e-5, train/loss_step=0.00233, global_step=255.0]
Epoch 0:  45%|████▌     | 2702/5971 [25:05<30:20,  1.80it/s, loss=0.0964, v_num=0, train/loss_simple_step=0.468, train/loss_vlb_step=0.0032, train/loss_step=0.468, global_step=255.0]     
Epoch 0:  45%|████▌     | 2703/5971 [25:06<30:20,  1.80it/s, loss=0.0939, v_num=0, train/loss_simple_step=0.00155, train/loss_vlb_step=9.35e-6, train/loss_step=0.00155, global_step=255.0]
Epoch 0:  45%|████▌     | 2704/5971 [25:08<30:21,  1.79it/s, loss=0.0939, v_num=0, train/loss_simple_step=0.00155, train/loss_vlb_step=9.35e-6, train/loss_step=0.00155, global_step=255.0]
Epoch 0:  45%|████▌     | 2704/5971 [25:08<30:21,  1.79it/s, loss=0.128, v_num=0, train/loss_simple_step=0.720, train/loss_vlb_step=0.0212, train/loss_step=0.720, global_step=255.0]      
Epoch 0:  45%|████▌     | 2705/5971 [25:09<30:21,  1.79it/s, loss=0.151, v_num=0, train/loss_simple_step=0.478, train/loss_vlb_step=0.00284, train/loss_step=0.478, global_step=256.0]
Epoch 0:  45%|████▌     | 2706/5971 [25:10<30:21,  1.79it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0542, train/loss_vlb_step=0.000192, train/loss_step=0.0542, global_step=256.0]
Epoch 0:  45%|████▌     | 2707/5971 [25:11<30:21,  1.79it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00534, train/loss_vlb_step=2.72e-5, train/loss_step=0.00534, global_step=256.0]
Epoch 0:  45%|████▌     | 2708/5971 [25:13<30:22,  1.79it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00534, train/loss_vlb_step=2.72e-5, train/loss_step=0.00534, global_step=256.0]
Epoch 0:  45%|████▌     | 2708/5971 [25:13<30:22,  1.79it/s, loss=0.158, v_num=0, train/loss_simple_step=0.590, train/loss_vlb_step=0.0055, train/loss_step=0.590, global_step=256.0]     
Epoch 0:  45%|████▌     | 2709/5971 [25:14<30:22,  1.79it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0907, train/loss_vlb_step=0.000302, train/loss_step=0.0907, global_step=257.0]
Epoch 0:  45%|████▌     | 2710/5971 [25:14<30:22,  1.79it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0169, train/loss_vlb_step=7.68e-5, train/loss_step=0.0169, global_step=257.0] 
Epoch 0:  45%|████▌     | 2711/5971 [25:15<30:22,  1.79it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0803, train/loss_vlb_step=0.000265, train/loss_step=0.0803, global_step=257.0]
Epoch 0:  45%|████▌     | 2712/5971 [25:18<30:23,  1.79it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0803, train/loss_vlb_step=0.000265, train/loss_step=0.0803, global_step=257.0]
Epoch 0:  45%|████▌     | 2712/5971 [25:18<30:23,  1.79it/s, loss=0.174, v_num=0, train/loss_simple_step=0.303, train/loss_vlb_step=0.00136, train/loss_step=0.303, global_step=257.0]   
Epoch 0:  45%|████▌     | 2713/5971 [25:19<30:23,  1.79it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0612, train/loss_vlb_step=0.000214, train/loss_step=0.0612, global_step=258.0]
Epoch 0:  45%|████▌     | 2714/5971 [25:20<30:23,  1.79it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00891, train/loss_vlb_step=4.24e-5, train/loss_step=0.00891, global_step=258.0]
Epoch 0:  45%|████▌     | 2715/5971 [25:20<30:23,  1.79it/s, loss=0.177, v_num=0, train/loss_simple_step=0.300, train/loss_vlb_step=0.00148, train/loss_step=0.300, global_step=258.0]    
Epoch 0:  45%|████▌     | 2716/5971 [25:23<30:24,  1.78it/s, loss=0.177, v_num=0, train/loss_simple_step=0.300, train/loss_vlb_step=0.00148, train/loss_step=0.300, global_step=258.0]
Epoch 0:  45%|████▌     | 2716/5971 [25:23<30:24,  1.78it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00499, train/loss_vlb_step=2.62e-5, train/loss_step=0.00499, global_step=258.0]
Epoch 0:  46%|████▌     | 2717/5971 [25:24<30:24,  1.78it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00685, train/loss_vlb_step=3.4e-5, train/loss_step=0.00685, global_step=259.0] 
Epoch 0:  46%|████▌     | 2718/5971 [25:25<30:24,  1.78it/s, loss=0.19, v_num=0, train/loss_simple_step=0.295, train/loss_vlb_step=0.00142, train/loss_step=0.295, global_step=259.0]    
Epoch 0:  46%|████▌     | 2719/5971 [25:25<30:24,  1.78it/s, loss=0.198, v_num=0, train/loss_simple_step=0.279, train/loss_vlb_step=0.00096, train/loss_step=0.279, global_step=259.0]
Epoch 0:  46%|████▌     | 2720/5971 [25:28<30:25,  1.78it/s, loss=0.198, v_num=0, train/loss_simple_step=0.279, train/loss_vlb_step=0.00096, train/loss_step=0.279, global_step=259.0]
Epoch 0:  46%|████▌     | 2720/5971 [25:28<30:25,  1.78it/s, loss=0.194, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000371, train/loss_step=0.113, global_step=259.0]
Epoch 0:  46%|████▌     | 2721/5971 [25:28<30:25,  1.78it/s, loss=0.219, v_num=0, train/loss_simple_step=0.495, train/loss_vlb_step=0.00296, train/loss_step=0.495, global_step=260.0] 
Epoch 0:  46%|████▌     | 2722/5971 [25:29<30:25,  1.78it/s, loss=0.203, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000544, train/loss_step=0.164, global_step=260.0]
Epoch 0:  46%|████▌     | 2723/5971 [25:30<30:25,  1.78it/s, loss=0.205, v_num=0, train/loss_simple_step=0.024, train/loss_vlb_step=9.19e-5, train/loss_step=0.024, global_step=260.0] 
Epoch 0:  46%|████▌     | 2724/5971 [25:33<30:26,  1.78it/s, loss=0.205, v_num=0, train/loss_simple_step=0.024, train/loss_vlb_step=9.19e-5, train/loss_step=0.024, global_step=260.0]
Epoch 0:  46%|████▌     | 2724/5971 [25:33<30:26,  1.78it/s, loss=0.183, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00105, train/loss_step=0.282, global_step=260.0]
Epoch 0:  46%|████▌     | 2725/5971 [25:33<30:26,  1.78it/s, loss=0.175, v_num=0, train/loss_simple_step=0.330, train/loss_vlb_step=0.00156, train/loss_step=0.330, global_step=261.0]
Epoch 0:  46%|████▌     | 2726/5971 [25:34<30:26,  1.78it/s, loss=0.181, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000543, train/loss_step=0.165, global_step=261.0]
Epoch 0:  46%|████▌     | 2727/5971 [25:35<30:26,  1.78it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0632, train/loss_vlb_step=0.000212, train/loss_step=0.0632, global_step=261.0]
Epoch 0:  46%|████▌     | 2728/5971 [25:37<30:27,  1.77it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0632, train/loss_vlb_step=0.000212, train/loss_step=0.0632, global_step=261.0]
Epoch 0:  46%|████▌     | 2728/5971 [25:37<30:27,  1.77it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0323, train/loss_vlb_step=0.00012, train/loss_step=0.0323, global_step=261.0] 
Epoch 0:  46%|████▌     | 2729/5971 [25:38<30:27,  1.77it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0397, train/loss_vlb_step=0.000142, train/loss_step=0.0397, global_step=262.0]
Epoch 0:  46%|████▌     | 2730/5971 [25:39<30:27,  1.77it/s, loss=0.168, v_num=0, train/loss_simple_step=0.304, train/loss_vlb_step=0.00129, train/loss_step=0.304, global_step=262.0]   
Epoch 0:  46%|████▌     | 2731/5971 [25:40<30:26,  1.77it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0398, train/loss_vlb_step=0.000142, train/loss_step=0.0398, global_step=262.0]
Epoch 0:  46%|████▌     | 2732/5971 [25:43<30:28,  1.77it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0398, train/loss_vlb_step=0.000142, train/loss_step=0.0398, global_step=262.0]
Epoch 0:  46%|████▌     | 2732/5971 [25:43<30:28,  1.77it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0703, train/loss_vlb_step=0.000246, train/loss_step=0.0703, global_step=262.0]
Epoch 0:  46%|████▌     | 2733/5971 [25:43<30:28,  1.77it/s, loss=0.162, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.000909, train/loss_step=0.231, global_step=263.0]  
Epoch 0:  46%|████▌     | 2734/5971 [25:44<30:28,  1.77it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0599, train/loss_vlb_step=0.000208, train/loss_step=0.0599, global_step=263.0]
Epoch 0:  46%|████▌     | 2735/5971 [25:45<30:28,  1.77it/s, loss=0.168, v_num=0, train/loss_simple_step=0.370, train/loss_vlb_step=0.00197, train/loss_step=0.370, global_step=263.0]   
Epoch 0:  46%|████▌     | 2736/5971 [25:48<30:30,  1.77it/s, loss=0.168, v_num=0, train/loss_simple_step=0.370, train/loss_vlb_step=0.00197, train/loss_step=0.370, global_step=263.0]
Epoch 0:  46%|████▌     | 2736/5971 [25:48<30:30,  1.77it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00856, train/loss_vlb_step=4.19e-5, train/loss_step=0.00856, global_step=263.0]
Epoch 0:  46%|████▌     | 2737/5971 [25:49<30:29,  1.77it/s, loss=0.174, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000391, train/loss_step=0.119, global_step=264.0]   
Epoch 0:  46%|████▌     | 2738/5971 [25:50<30:29,  1.77it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=5.8e-5, train/loss_step=0.0136, global_step=264.0] 
Epoch 0:  46%|████▌     | 2739/5971 [25:51<30:29,  1.77it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0468, train/loss_vlb_step=0.000156, train/loss_step=0.0468, global_step=264.0]
Epoch 0:  46%|████▌     | 2740/5971 [25:53<30:31,  1.76it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0468, train/loss_vlb_step=0.000156, train/loss_step=0.0468, global_step=264.0]
Epoch 0:  46%|████▌     | 2740/5971 [25:53<30:31,  1.76it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00274, train/loss_vlb_step=1.56e-5, train/loss_step=0.00274, global_step=264.0]
Epoch 0:  46%|████▌     | 2741/5971 [25:54<30:31,  1.76it/s, loss=0.128, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000621, train/loss_step=0.185, global_step=265.0]   
Epoch 0:  46%|████▌     | 2742/5971 [25:55<30:30,  1.76it/s, loss=0.142, v_num=0, train/loss_simple_step=0.445, train/loss_vlb_step=0.00263, train/loss_step=0.445, global_step=265.0] 
Epoch 0:  46%|████▌     | 2743/5971 [25:56<30:30,  1.76it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.71e-5, train/loss_step=0.0121, global_step=265.0]
Epoch 0:  46%|████▌     | 2744/5971 [25:58<30:32,  1.76it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.71e-5, train/loss_step=0.0121, global_step=265.0]
Epoch 0:  46%|████▌     | 2744/5971 [25:58<30:32,  1.76it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00815, train/loss_vlb_step=4.07e-5, train/loss_step=0.00815, global_step=265.0]
Epoch 0:  46%|████▌     | 2745/5971 [25:59<30:32,  1.76it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.95e-5, train/loss_step=0.0135, global_step=266.0]  
Epoch 0:  46%|████▌     | 2746/5971 [26:00<30:31,  1.76it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0528, train/loss_vlb_step=0.000183, train/loss_step=0.0528, global_step=266.0]
Epoch 0:  46%|████▌     | 2747/5971 [26:01<30:31,  1.76it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00319, train/loss_vlb_step=1.81e-5, train/loss_step=0.00319, global_step=266.0]
Epoch 0:  46%|████▌     | 2748/5971 [26:03<30:32,  1.76it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00319, train/loss_vlb_step=1.81e-5, train/loss_step=0.00319, global_step=266.0]
Epoch 0:  46%|████▌     | 2748/5971 [26:03<30:32,  1.76it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00675, train/loss_vlb_step=3.48e-5, train/loss_step=0.00675, global_step=266.0]
Epoch 0:  46%|████▌     | 2749/5971 [26:04<30:32,  1.76it/s, loss=0.136, v_num=0, train/loss_simple_step=0.734, train/loss_vlb_step=0.0295, train/loss_step=0.734, global_step=267.0]     
Epoch 0:  46%|████▌     | 2750/5971 [26:05<30:32,  1.76it/s, loss=0.151, v_num=0, train/loss_simple_step=0.599, train/loss_vlb_step=0.0168, train/loss_step=0.599, global_step=267.0]
Epoch 0:  46%|████▌     | 2751/5971 [26:05<30:32,  1.76it/s, loss=0.163, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.00105, train/loss_step=0.268, global_step=267.0]
Epoch 0:  46%|████▌     | 2752/5971 [26:08<30:34,  1.75it/s, loss=0.163, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.00105, train/loss_step=0.268, global_step=267.0]
Epoch 0:  46%|████▌     | 2752/5971 [26:08<30:34,  1.75it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0137, train/loss_vlb_step=5.91e-5, train/loss_step=0.0137, global_step=267.0]
Epoch 0:  46%|████▌     | 2753/5971 [26:09<30:34,  1.75it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.36e-5, train/loss_step=0.00468, global_step=268.0]
Epoch 0:  46%|████▌     | 2754/5971 [26:10<30:33,  1.75it/s, loss=0.152, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.00044, train/loss_step=0.132, global_step=268.0]    
Epoch 0:  46%|████▌     | 2755/5971 [26:11<30:33,  1.75it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00801, train/loss_vlb_step=4.06e-5, train/loss_step=0.00801, global_step=268.0]
Epoch 0:  46%|████▌     | 2756/5971 [26:13<30:35,  1.75it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00801, train/loss_vlb_step=4.06e-5, train/loss_step=0.00801, global_step=268.0]
Epoch 0:  46%|████▌     | 2756/5971 [26:13<30:35,  1.75it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00526, train/loss_vlb_step=2.6e-5, train/loss_step=0.00526, global_step=268.0] 
Epoch 0:  46%|████▌     | 2757/5971 [26:14<30:35,  1.75it/s, loss=0.161, v_num=0, train/loss_simple_step=0.667, train/loss_vlb_step=0.0234, train/loss_step=0.667, global_step=269.0]    
Epoch 0:  46%|████▌     | 2758/5971 [26:15<30:34,  1.75it/s, loss=0.179, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00227, train/loss_step=0.375, global_step=269.0]
Epoch 0:  46%|████▌     | 2759/5971 [26:16<30:34,  1.75it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0396, train/loss_vlb_step=0.000144, train/loss_step=0.0396, global_step=269.0]
Epoch 0:  46%|████▌     | 2760/5971 [26:19<30:37,  1.75it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0396, train/loss_vlb_step=0.000144, train/loss_step=0.0396, global_step=269.0]
Epoch 0:  46%|████▌     | 2760/5971 [26:19<30:37,  1.75it/s, loss=0.191, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000862, train/loss_step=0.240, global_step=269.0]  
Epoch 0:  46%|████▌     | 2761/5971 [26:20<30:37,  1.75it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0421, train/loss_vlb_step=0.000152, train/loss_step=0.0421, global_step=270.0]
Epoch 0:  46%|████▋     | 2762/5971 [26:21<30:37,  1.75it/s, loss=0.169, v_num=0, train/loss_simple_step=0.159, train/loss_vlb_step=0.00054, train/loss_step=0.159, global_step=270.0]   
Epoch 0:  46%|████▋     | 2763/5971 [26:22<30:36,  1.75it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00243, train/loss_vlb_step=1.37e-5, train/loss_step=0.00243, global_step=270.0]
Epoch 0:  46%|████▋     | 2764/5971 [26:25<30:38,  1.74it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00243, train/loss_vlb_step=1.37e-5, train/loss_step=0.00243, global_step=270.0]
Epoch 0:  46%|████▋     | 2764/5971 [26:25<30:38,  1.74it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0341, train/loss_vlb_step=0.00012, train/loss_step=0.0341, global_step=270.0]   
Epoch 0:  46%|████▋     | 2765/5971 [26:26<30:38,  1.74it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00861, train/loss_vlb_step=4.28e-5, train/loss_step=0.00861, global_step=271.0]
Epoch 0:  46%|████▋     | 2766/5971 [26:27<30:38,  1.74it/s, loss=0.178, v_num=0, train/loss_simple_step=0.215, train/loss_vlb_step=0.000784, train/loss_step=0.215, global_step=271.0]  
Epoch 0:  46%|████▋     | 2767/5971 [26:28<30:38,  1.74it/s, loss=0.179, v_num=0, train/loss_simple_step=0.028, train/loss_vlb_step=0.000107, train/loss_step=0.028, global_step=271.0]
Epoch 0:  46%|████▋     | 2768/5971 [26:30<30:39,  1.74it/s, loss=0.179, v_num=0, train/loss_simple_step=0.028, train/loss_vlb_step=0.000107, train/loss_step=0.028, global_step=271.0]
Epoch 0:  46%|████▋     | 2768/5971 [26:30<30:39,  1.74it/s, loss=0.179, v_num=0, train/loss_simple_step=0.00486, train/loss_vlb_step=2.54e-5, train/loss_step=0.00486, global_step=271.0]
Epoch 0:  46%|████▋     | 2769/5971 [26:31<30:39,  1.74it/s, loss=0.156, v_num=0, train/loss_simple_step=0.273, train/loss_vlb_step=0.00117, train/loss_step=0.273, global_step=272.0]    
Epoch 0:  46%|████▋     | 2770/5971 [26:31<30:39,  1.74it/s, loss=0.139, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00103, train/loss_step=0.269, global_step=272.0]
Epoch 0:  46%|████▋     | 2771/5971 [26:32<30:38,  1.74it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00739, train/loss_vlb_step=3.66e-5, train/loss_step=0.00739, global_step=272.0]
Epoch 0:  46%|████▋     | 2772/5971 [26:34<30:40,  1.74it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00739, train/loss_vlb_step=3.66e-5, train/loss_step=0.00739, global_step=272.0]
Epoch 0:  46%|████▋     | 2772/5971 [26:34<30:40,  1.74it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0608, train/loss_vlb_step=0.000213, train/loss_step=0.0608, global_step=272.0] 
Epoch 0:  46%|████▋     | 2773/5971 [26:35<30:39,  1.74it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0441, train/loss_vlb_step=0.000167, train/loss_step=0.0441, global_step=273.0]
Epoch 0:  46%|████▋     | 2774/5971 [26:36<30:39,  1.74it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00354, train/loss_vlb_step=1.87e-5, train/loss_step=0.00354, global_step=273.0]
Epoch 0:  46%|████▋     | 2775/5971 [26:37<30:39,  1.74it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0273, train/loss_vlb_step=0.000108, train/loss_step=0.0273, global_step=273.0] 
Epoch 0:  46%|████▋     | 2776/5971 [26:39<30:40,  1.74it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0273, train/loss_vlb_step=0.000108, train/loss_step=0.0273, global_step=273.0]
Epoch 0:  46%|████▋     | 2776/5971 [26:39<30:40,  1.74it/s, loss=0.149, v_num=0, train/loss_simple_step=0.485, train/loss_vlb_step=0.00322, train/loss_step=0.485, global_step=273.0]   
Epoch 0:  47%|████▋     | 2777/5971 [26:40<30:40,  1.74it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0241, train/loss_vlb_step=9.66e-5, train/loss_step=0.0241, global_step=274.0]
Epoch 0:  47%|████▋     | 2778/5971 [26:41<30:40,  1.74it/s, loss=0.0986, v_num=0, train/loss_simple_step=0.00448, train/loss_vlb_step=2.39e-5, train/loss_step=0.00448, global_step=274.0]
Epoch 0:  47%|████▋     | 2779/5971 [26:42<30:39,  1.73it/s, loss=0.0984, v_num=0, train/loss_simple_step=0.0349, train/loss_vlb_step=0.000139, train/loss_step=0.0349, global_step=274.0] 
Epoch 0:  47%|████▋     | 2780/5971 [26:44<30:41,  1.73it/s, loss=0.0984, v_num=0, train/loss_simple_step=0.0349, train/loss_vlb_step=0.000139, train/loss_step=0.0349, global_step=274.0]
Epoch 0:  47%|████▋     | 2780/5971 [26:44<30:41,  1.73it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]    

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:09,  2.38it/s][A

Validating:   1%|          | 2/167 [00:00<00:56,  2.90it/s][A
Epoch 0:  47%|████▋     | 2784/5971 [26:45<30:37,  1.73it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:   3%|▎         | 5/167 [00:00<00:20,  7.97it/s][A
Epoch 0:  47%|████▋     | 2788/5971 [26:45<30:32,  1.74it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:   5%|▍         | 8/167 [00:00<00:12, 12.56it/s][A
Epoch 0:  47%|████▋     | 2792/5971 [26:46<30:27,  1.74it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:   7%|▋         | 12/167 [00:01<00:09, 16.79it/s][A

Validating:   9%|▉         | 15/167 [00:01<00:07, 19.03it/s][A
Epoch 0:  47%|████▋     | 2796/5971 [26:46<30:23,  1.74it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  11%|█         | 18/167 [00:01<00:07, 21.00it/s][A
Epoch 0:  47%|████▋     | 2800/5971 [26:46<30:18,  1.74it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 22.54it/s][A
Epoch 0:  47%|████▋     | 2804/5971 [26:46<30:13,  1.75it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  14%|█▍        | 24/167 [00:01<00:06, 23.00it/s][A

Validating:  16%|█▌        | 27/167 [00:01<00:06, 22.97it/s][A
Epoch 0:  47%|████▋     | 2808/5971 [26:46<30:09,  1.75it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 23.69it/s][A
Epoch 0:  47%|████▋     | 2812/5971 [26:46<30:04,  1.75it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 23.88it/s][A
Epoch 0:  47%|████▋     | 2816/5971 [26:46<29:59,  1.75it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  22%|██▏       | 36/167 [00:02<00:05, 24.71it/s][A

Validating:  23%|██▎       | 39/167 [00:02<00:05, 25.35it/s][A
Epoch 0:  47%|████▋     | 2820/5971 [26:47<29:55,  1.76it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 26.25it/s][A
Epoch 0:  47%|████▋     | 2824/5971 [26:47<29:50,  1.76it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 27.12it/s][A
Epoch 0:  47%|████▋     | 2828/5971 [26:47<29:45,  1.76it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 27.44it/s][A

Validating:  31%|███       | 51/167 [00:02<00:04, 27.64it/s][A
Epoch 0:  47%|████▋     | 2832/5971 [26:47<29:41,  1.76it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 26.24it/s][A
Epoch 0:  47%|████▋     | 2836/5971 [26:47<29:36,  1.76it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 25.11it/s][A
Epoch 0:  48%|████▊     | 2840/5971 [26:47<29:32,  1.77it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  36%|███▌      | 60/167 [00:02<00:04, 25.23it/s][A

Validating:  38%|███▊      | 63/167 [00:03<00:04, 25.31it/s][A
Epoch 0:  48%|████▊     | 2844/5971 [26:48<29:27,  1.77it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  40%|███▉      | 66/167 [00:03<00:04, 24.57it/s][A
Epoch 0:  48%|████▊     | 2848/5971 [26:48<29:22,  1.77it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 25.79it/s][A
Epoch 0:  48%|████▊     | 2852/5971 [26:48<29:18,  1.77it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 25.76it/s][A

Validating:  45%|████▍     | 75/167 [00:03<00:03, 25.24it/s][A
Epoch 0:  48%|████▊     | 2856/5971 [26:48<29:13,  1.78it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 25.88it/s][A
Epoch 0:  48%|████▊     | 2860/5971 [26:48<29:09,  1.78it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 26.81it/s][A
Epoch 0:  48%|████▊     | 2864/5971 [26:48<29:04,  1.78it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  50%|█████     | 84/167 [00:03<00:03, 26.56it/s][A
Epoch 0:  48%|████▊     | 2868/5971 [26:48<29:00,  1.78it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  53%|█████▎    | 88/167 [00:03<00:02, 28.22it/s][A
Epoch 0:  48%|████▊     | 2872/5971 [26:49<28:55,  1.79it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 28.49it/s][A

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 28.80it/s][A
Epoch 0:  48%|████▊     | 2876/5971 [26:49<28:51,  1.79it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 28.45it/s][A
Epoch 0:  48%|████▊     | 2880/5971 [26:49<28:46,  1.79it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  60%|██████    | 101/167 [00:04<00:02, 27.08it/s][A
Epoch 0:  48%|████▊     | 2884/5971 [26:49<28:42,  1.79it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 26.24it/s][A

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 26.61it/s][A
Epoch 0:  48%|████▊     | 2888/5971 [26:49<28:37,  1.79it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 26.02it/s][A
Epoch 0:  48%|████▊     | 2892/5971 [26:49<28:33,  1.80it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  68%|██████▊   | 113/167 [00:04<00:02, 25.70it/s][A
Epoch 0:  49%|████▊     | 2896/5971 [26:49<28:28,  1.80it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  69%|██████▉   | 116/167 [00:05<00:02, 25.21it/s][A

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 24.89it/s][A
Epoch 0:  49%|████▊     | 2900/5971 [26:50<28:24,  1.80it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 25.16it/s][A
Epoch 0:  49%|████▊     | 2904/5971 [26:50<28:20,  1.80it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 25.56it/s][A
Epoch 0:  49%|████▊     | 2908/5971 [26:50<28:15,  1.81it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 25.54it/s][A

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 25.95it/s][A
Epoch 0:  49%|████▉     | 2912/5971 [26:50<28:11,  1.81it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  80%|████████  | 134/167 [00:05<00:01, 25.90it/s][A
Epoch 0:  49%|████▉     | 2916/5971 [26:50<28:06,  1.81it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 25.96it/s][A
Epoch 0:  49%|████▉     | 2920/5971 [26:50<28:02,  1.81it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  84%|████████▍ | 140/167 [00:05<00:01, 26.01it/s][A

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 24.66it/s][A
Epoch 0:  49%|████▉     | 2924/5971 [26:51<27:58,  1.82it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 24.66it/s][A
Epoch 0:  49%|████▉     | 2928/5971 [26:51<27:53,  1.82it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 25.10it/s][A
Epoch 0:  49%|████▉     | 2932/5971 [26:51<27:49,  1.82it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 25.06it/s][A

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 24.55it/s][A
Epoch 0:  49%|████▉     | 2936/5971 [26:51<27:45,  1.82it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 24.63it/s][A
Epoch 0:  49%|████▉     | 2940/5971 [26:51<27:41,  1.82it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 24.58it/s][A
Epoch 0:  49%|████▉     | 2944/5971 [26:51<27:36,  1.83it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 25.76it/s][A
Epoch 0:  49%|████▉     | 2948/5971 [26:52<27:32,  1.83it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00141, train/loss_step=0.305, global_step=274.0]
Epoch 0:  49%|████▉     | 2949/5971 [26:53<27:32,  1.83it/s, loss=0.0999, v_num=0, train/loss_simple_step=0.00636, train/loss_vlb_step=3.22e-5, train/loss_step=0.00636, global_step=275.0]
Epoch 0:  49%|████▉     | 2950/5971 [26:54<27:32,  1.83it/s, loss=0.101, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000664, train/loss_step=0.190, global_step=275.0]    
Epoch 0:  49%|████▉     | 2951/5971 [26:55<27:32,  1.83it/s, loss=0.138, v_num=0, train/loss_simple_step=0.724, train/loss_vlb_step=0.0617, train/loss_step=0.724, global_step=275.0]  
Epoch 0:  49%|████▉     | 2952/5971 [26:58<27:34,  1.82it/s, loss=0.138, v_num=0, train/loss_simple_step=0.724, train/loss_vlb_step=0.0617, train/loss_step=0.724, global_step=275.0]
Epoch 0:  49%|████▉     | 2952/5971 [26:58<27:34,  1.82it/s, loss=0.159, v_num=0, train/loss_simple_step=0.473, train/loss_vlb_step=0.00365, train/loss_step=0.473, global_step=275.0]
Epoch 0:  49%|████▉     | 2953/5971 [26:59<27:34,  1.82it/s, loss=0.193, v_num=0, train/loss_simple_step=0.672, train/loss_vlb_step=0.0146, train/loss_step=0.672, global_step=276.0] 
Epoch 0:  49%|████▉     | 2954/5971 [26:59<27:33,  1.82it/s, loss=0.2, v_num=0, train/loss_simple_step=0.371, train/loss_vlb_step=0.00232, train/loss_step=0.371, global_step=276.0] 
Epoch 0:  49%|████▉     | 2955/5971 [27:00<27:33,  1.82it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000232, train/loss_step=0.0655, global_step=276.0]
Epoch 0:  50%|████▉     | 2956/5971 [27:03<27:34,  1.82it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000232, train/loss_step=0.0655, global_step=276.0]
Epoch 0:  50%|████▉     | 2956/5971 [27:03<27:34,  1.82it/s, loss=0.203, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=5.01e-5, train/loss_step=0.0114, global_step=276.0] 
Epoch 0:  50%|████▉     | 2957/5971 [27:03<27:34,  1.82it/s, loss=0.189, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.02e-6, train/loss_step=0.00151, global_step=277.0]
Epoch 0:  50%|████▉     | 2958/5971 [27:04<27:34,  1.82it/s, loss=0.185, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000718, train/loss_step=0.189, global_step=277.0]   
Epoch 0:  50%|████▉     | 2959/5971 [27:05<27:34,  1.82it/s, loss=0.191, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000407, train/loss_step=0.123, global_step=277.0]
Epoch 0:  50%|████▉     | 2960/5971 [27:07<27:35,  1.82it/s, loss=0.191, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000407, train/loss_step=0.123, global_step=277.0]
Epoch 0:  50%|████▉     | 2960/5971 [27:07<27:35,  1.82it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00262, train/loss_vlb_step=1.58e-5, train/loss_step=0.00262, global_step=277.0]
Epoch 0:  50%|████▉     | 2961/5971 [27:08<27:35,  1.82it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00228, train/loss_vlb_step=1.34e-5, train/loss_step=0.00228, global_step=278.0]
Epoch 0:  50%|████▉     | 2962/5971 [27:09<27:34,  1.82it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0653, train/loss_vlb_step=0.000232, train/loss_step=0.0653, global_step=278.0] 
Epoch 0:  50%|████▉     | 2963/5971 [27:10<27:34,  1.82it/s, loss=0.19, v_num=0, train/loss_simple_step=0.053, train/loss_vlb_step=0.000186, train/loss_step=0.053, global_step=278.0]   
Epoch 0:  50%|████▉     | 2964/5971 [27:12<27:35,  1.82it/s, loss=0.19, v_num=0, train/loss_simple_step=0.053, train/loss_vlb_step=0.000186, train/loss_step=0.053, global_step=278.0]
Epoch 0:  50%|████▉     | 2964/5971 [27:12<27:35,  1.82it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00697, train/loss_vlb_step=3.37e-5, train/loss_step=0.00697, global_step=278.0]
Epoch 0:  50%|████▉     | 2965/5971 [27:13<27:35,  1.82it/s, loss=0.184, v_num=0, train/loss_simple_step=0.384, train/loss_vlb_step=0.00252, train/loss_step=0.384, global_step=279.0]    
Epoch 0:  50%|████▉     | 2966/5971 [27:14<27:35,  1.82it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0264, train/loss_vlb_step=0.000109, train/loss_step=0.0264, global_step=279.0]
Epoch 0:  50%|████▉     | 2967/5971 [27:15<27:35,  1.82it/s, loss=0.194, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000765, train/loss_step=0.208, global_step=279.0]  
Epoch 0:  50%|████▉     | 2968/5971 [27:17<27:36,  1.81it/s, loss=0.194, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000765, train/loss_step=0.208, global_step=279.0]
Epoch 0:  50%|████▉     | 2968/5971 [27:17<27:36,  1.81it/s, loss=0.185, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=279.0]
Epoch 0:  50%|████▉     | 2969/5971 [27:18<27:35,  1.81it/s, loss=0.208, v_num=0, train/loss_simple_step=0.473, train/loss_vlb_step=0.00319, train/loss_step=0.473, global_step=280.0] 
Epoch 0:  50%|████▉     | 2970/5971 [27:19<27:35,  1.81it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0681, train/loss_vlb_step=0.000226, train/loss_step=0.0681, global_step=280.0]
Epoch 0:  50%|████▉     | 2971/5971 [27:20<27:35,  1.81it/s, loss=0.172, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000387, train/loss_step=0.118, global_step=280.0]  
Epoch 0:  50%|████▉     | 2972/5971 [27:22<27:36,  1.81it/s, loss=0.172, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000387, train/loss_step=0.118, global_step=280.0]
Epoch 0:  50%|████▉     | 2972/5971 [27:22<27:36,  1.81it/s, loss=0.18, v_num=0, train/loss_simple_step=0.629, train/loss_vlb_step=0.00719, train/loss_step=0.629, global_step=280.0]  
Epoch 0:  50%|████▉     | 2973/5971 [27:23<27:36,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.380, train/loss_vlb_step=0.00202, train/loss_step=0.380, global_step=281.0]
Epoch 0:  50%|████▉     | 2974/5971 [27:23<27:36,  1.81it/s, loss=0.185, v_num=0, train/loss_simple_step=0.763, train/loss_vlb_step=0.065, train/loss_step=0.763, global_step=281.0]  
Epoch 0:  50%|████▉     | 2975/5971 [27:24<27:35,  1.81it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0169, train/loss_vlb_step=7.13e-5, train/loss_step=0.0169, global_step=281.0]
Epoch 0:  50%|████▉     | 2976/5971 [27:26<27:36,  1.81it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0169, train/loss_vlb_step=7.13e-5, train/loss_step=0.0169, global_step=281.0]
Epoch 0:  50%|████▉     | 2976/5971 [27:26<27:36,  1.81it/s, loss=0.19, v_num=0, train/loss_simple_step=0.169, train/loss_vlb_step=0.000572, train/loss_step=0.169, global_step=281.0]  
Epoch 0:  50%|████▉     | 2977/5971 [27:27<27:36,  1.81it/s, loss=0.19, v_num=0, train/loss_simple_step=0.00202, train/loss_vlb_step=1.17e-5, train/loss_step=0.00202, global_step=282.0]
Epoch 0:  50%|████▉     | 2978/5971 [27:28<27:36,  1.81it/s, loss=0.187, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000392, train/loss_step=0.116, global_step=282.0]  
Epoch 0:  50%|████▉     | 2979/5971 [27:29<27:36,  1.81it/s, loss=0.19, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000687, train/loss_step=0.202, global_step=282.0] 
Epoch 0:  50%|████▉     | 2980/5971 [27:31<27:37,  1.80it/s, loss=0.19, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000687, train/loss_step=0.202, global_step=282.0]
Epoch 0:  50%|████▉     | 2980/5971 [27:31<27:37,  1.80it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00415, train/loss_vlb_step=2.24e-5, train/loss_step=0.00415, global_step=282.0]
Epoch 0:  50%|████▉     | 2981/5971 [27:32<27:37,  1.80it/s, loss=0.196, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000389, train/loss_step=0.118, global_step=283.0]   
Epoch 0:  50%|████▉     | 2982/5971 [27:33<27:37,  1.80it/s, loss=0.193, v_num=0, train/loss_simple_step=0.00852, train/loss_vlb_step=4e-5, train/loss_step=0.00852, global_step=283.0]
Epoch 0:  50%|████▉     | 2983/5971 [27:34<27:36,  1.80it/s, loss=0.197, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000425, train/loss_step=0.129, global_step=283.0]
Epoch 0:  50%|████▉     | 2984/5971 [27:37<27:38,  1.80it/s, loss=0.197, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000425, train/loss_step=0.129, global_step=283.0]
Epoch 0:  50%|████▉     | 2984/5971 [27:37<27:38,  1.80it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0639, train/loss_vlb_step=0.000218, train/loss_step=0.0639, global_step=283.0]
Epoch 0:  50%|████▉     | 2985/5971 [27:38<27:38,  1.80it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0187, train/loss_vlb_step=8.08e-5, train/loss_step=0.0187, global_step=284.0]
Epoch 0:  50%|█████     | 2986/5971 [27:39<27:37,  1.80it/s, loss=0.187, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000428, train/loss_step=0.130, global_step=284.0] 
Epoch 0:  50%|█████     | 2987/5971 [27:39<27:37,  1.80it/s, loss=0.196, v_num=0, train/loss_simple_step=0.379, train/loss_vlb_step=0.00268, train/loss_step=0.379, global_step=284.0] 
Epoch 0:  50%|█████     | 2988/5971 [27:42<27:38,  1.80it/s, loss=0.196, v_num=0, train/loss_simple_step=0.379, train/loss_vlb_step=0.00268, train/loss_step=0.379, global_step=284.0]
Epoch 0:  50%|█████     | 2988/5971 [27:42<27:38,  1.80it/s, loss=0.192, v_num=0, train/loss_simple_step=0.047, train/loss_vlb_step=0.000168, train/loss_step=0.047, global_step=284.0]
Epoch 0:  50%|█████     | 2989/5971 [27:42<27:38,  1.80it/s, loss=0.179, v_num=0, train/loss_simple_step=0.212, train/loss_vlb_step=0.000753, train/loss_step=0.212, global_step=285.0]
Epoch 0:  50%|█████     | 2990/5971 [27:43<27:38,  1.80it/s, loss=0.186, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000759, train/loss_step=0.217, global_step=285.0]
Epoch 0:  50%|█████     | 2991/5971 [27:44<27:38,  1.80it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0333, train/loss_vlb_step=0.000126, train/loss_step=0.0333, global_step=285.0]
Epoch 0:  50%|█████     | 2992/5971 [27:46<27:39,  1.80it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0333, train/loss_vlb_step=0.000126, train/loss_step=0.0333, global_step=285.0]
Epoch 0:  50%|█████     | 2992/5971 [27:46<27:39,  1.80it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0686, train/loss_vlb_step=0.000235, train/loss_step=0.0686, global_step=285.0]
Epoch 0:  50%|█████     | 2993/5971 [27:47<27:38,  1.80it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0649, train/loss_vlb_step=0.000215, train/loss_step=0.0649, global_step=286.0]
Epoch 0:  50%|█████     | 2994/5971 [27:48<27:38,  1.79it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0743, train/loss_vlb_step=0.000251, train/loss_step=0.0743, global_step=286.0]
Epoch 0:  50%|█████     | 2995/5971 [27:49<27:38,  1.79it/s, loss=0.111, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000556, train/loss_step=0.163, global_step=286.0]  
Epoch 0:  50%|█████     | 2996/5971 [27:51<27:39,  1.79it/s, loss=0.111, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000556, train/loss_step=0.163, global_step=286.0]
Epoch 0:  50%|█████     | 2996/5971 [27:51<27:39,  1.79it/s, loss=0.109, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000468, train/loss_step=0.136, global_step=286.0]
Epoch 0:  50%|█████     | 2997/5971 [27:52<27:39,  1.79it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00166, train/loss_vlb_step=9.58e-6, train/loss_step=0.00166, global_step=287.0]
Epoch 0:  50%|█████     | 2998/5971 [27:53<27:38,  1.79it/s, loss=0.118, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.0012, train/loss_step=0.282, global_step=287.0]     
Epoch 0:  50%|█████     | 2999/5971 [27:54<27:38,  1.79it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00229, train/loss_vlb_step=1.39e-5, train/loss_step=0.00229, global_step=287.0]
Epoch 0:  50%|█████     | 3000/5971 [27:56<27:39,  1.79it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00229, train/loss_vlb_step=1.39e-5, train/loss_step=0.00229, global_step=287.0]
Epoch 0:  50%|█████     | 3000/5971 [27:56<27:39,  1.79it/s, loss=0.108, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=4.85e-5, train/loss_step=0.011, global_step=287.0]    
Epoch 0:  50%|█████     | 3001/5971 [27:57<27:39,  1.79it/s, loss=0.11, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000536, train/loss_step=0.157, global_step=288.0]
Epoch 0:  50%|█████     | 3002/5971 [27:58<27:39,  1.79it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00541, train/loss_vlb_step=2.75e-5, train/loss_step=0.00541, global_step=288.0]
Epoch 0:  50%|█████     | 3003/5971 [27:59<27:39,  1.79it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00244, train/loss_vlb_step=1.34e-5, train/loss_step=0.00244, global_step=288.0]
Epoch 0:  50%|█████     | 3004/5971 [28:01<27:40,  1.79it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00244, train/loss_vlb_step=1.34e-5, train/loss_step=0.00244, global_step=288.0]
Epoch 0:  50%|█████     | 3004/5971 [28:01<27:40,  1.79it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0993, train/loss_vlb_step=0.000329, train/loss_step=0.0993, global_step=288.0] 
Epoch 0:  50%|█████     | 3005/5971 [28:02<27:40,  1.79it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00313, train/loss_vlb_step=1.68e-5, train/loss_step=0.00313, global_step=289.0]
Epoch 0:  50%|█████     | 3006/5971 [28:03<27:40,  1.79it/s, loss=0.0994, v_num=0, train/loss_simple_step=0.0283, train/loss_vlb_step=0.000108, train/loss_step=0.0283, global_step=289.0]
Epoch 0:  50%|█████     | 3007/5971 [28:04<27:39,  1.79it/s, loss=0.0815, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.35e-5, train/loss_step=0.021, global_step=289.0]   
Epoch 0:  50%|█████     | 3008/5971 [28:07<27:41,  1.78it/s, loss=0.0815, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.35e-5, train/loss_step=0.021, global_step=289.0]
Epoch 0:  50%|█████     | 3008/5971 [28:07<27:41,  1.78it/s, loss=0.0894, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000753, train/loss_step=0.204, global_step=289.0]
Epoch 0:  50%|█████     | 3009/5971 [28:08<27:41,  1.78it/s, loss=0.0795, v_num=0, train/loss_simple_step=0.0152, train/loss_vlb_step=6.74e-5, train/loss_step=0.0152, global_step=290.0]
Epoch 0:  50%|█████     | 3010/5971 [28:09<27:41,  1.78it/s, loss=0.083, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00124, train/loss_step=0.287, global_step=290.0]   
Epoch 0:  50%|█████     | 3011/5971 [28:10<27:40,  1.78it/s, loss=0.0903, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000634, train/loss_step=0.179, global_step=290.0]
Epoch 0:  50%|█████     | 3012/5971 [28:12<27:41,  1.78it/s, loss=0.0903, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000634, train/loss_step=0.179, global_step=290.0]
Epoch 0:  50%|█████     | 3012/5971 [28:12<27:41,  1.78it/s, loss=0.103, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00154, train/loss_step=0.320, global_step=290.0]  
Epoch 0:  50%|█████     | 3013/5971 [28:13<27:41,  1.78it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0231, train/loss_vlb_step=9.04e-5, train/loss_step=0.0231, global_step=291.0]
Epoch 0:  50%|█████     | 3014/5971 [28:13<27:41,  1.78it/s, loss=0.103, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000393, train/loss_step=0.118, global_step=291.0] 
Epoch 0:  50%|█████     | 3015/5971 [28:14<27:41,  1.78it/s, loss=0.112, v_num=0, train/loss_simple_step=0.340, train/loss_vlb_step=0.00153, train/loss_step=0.340, global_step=291.0] 
Epoch 0:  51%|█████     | 3016/5971 [28:16<27:42,  1.78it/s, loss=0.112, v_num=0, train/loss_simple_step=0.340, train/loss_vlb_step=0.00153, train/loss_step=0.340, global_step=291.0]
Epoch 0:  51%|█████     | 3016/5971 [28:16<27:42,  1.78it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0178, train/loss_vlb_step=7.51e-5, train/loss_step=0.0178, global_step=291.0]
Epoch 0:  51%|█████     | 3017/5971 [28:17<27:41,  1.78it/s, loss=0.118, v_num=0, train/loss_simple_step=0.238, train/loss_vlb_step=0.000891, train/loss_step=0.238, global_step=292.0] 
Epoch 0:  51%|█████     | 3018/5971 [28:18<27:41,  1.78it/s, loss=0.111, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000479, train/loss_step=0.142, global_step=292.0]
Epoch 0:  51%|█████     | 3019/5971 [28:19<27:41,  1.78it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0469, train/loss_vlb_step=0.000174, train/loss_step=0.0469, global_step=292.0]
Epoch 0:  51%|█████     | 3020/5971 [28:23<27:43,  1.77it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0469, train/loss_vlb_step=0.000174, train/loss_step=0.0469, global_step=292.0]
Epoch 0:  51%|█████     | 3020/5971 [28:23<27:43,  1.77it/s, loss=0.12, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000513, train/loss_step=0.154, global_step=292.0]   
Epoch 0:  51%|█████     | 3021/5971 [28:24<27:43,  1.77it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00348, train/loss_vlb_step=1.89e-5, train/loss_step=0.00348, global_step=293.0]
Epoch 0:  51%|█████     | 3022/5971 [28:24<27:43,  1.77it/s, loss=0.119, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000461, train/loss_step=0.140, global_step=293.0]   
Epoch 0:  51%|█████     | 3023/5971 [28:25<27:42,  1.77it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0183, train/loss_vlb_step=7.76e-5, train/loss_step=0.0183, global_step=293.0]
Epoch 0:  51%|█████     | 3024/5971 [28:28<27:44,  1.77it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0183, train/loss_vlb_step=7.76e-5, train/loss_step=0.0183, global_step=293.0]
Epoch 0:  51%|█████     | 3024/5971 [28:28<27:44,  1.77it/s, loss=0.121, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000428, train/loss_step=0.130, global_step=293.0]
Epoch 0:  51%|█████     | 3025/5971 [28:29<27:44,  1.77it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0239, train/loss_vlb_step=9.95e-5, train/loss_step=0.0239, global_step=294.0]
Epoch 0:  51%|█████     | 3026/5971 [28:30<27:43,  1.77it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0307, train/loss_vlb_step=0.000119, train/loss_step=0.0307, global_step=294.0]
Epoch 0:  51%|█████     | 3027/5971 [28:31<27:43,  1.77it/s, loss=0.129, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000515, train/loss_step=0.152, global_step=294.0]  
Epoch 0:  51%|█████     | 3028/5971 [28:34<27:45,  1.77it/s, loss=0.129, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000515, train/loss_step=0.152, global_step=294.0]
Epoch 0:  51%|█████     | 3028/5971 [28:34<27:45,  1.77it/s, loss=0.142, v_num=0, train/loss_simple_step=0.469, train/loss_vlb_step=0.00251, train/loss_step=0.469, global_step=294.0] 
Epoch 0:  51%|█████     | 3029/5971 [28:34<27:45,  1.77it/s, loss=0.149, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000483, train/loss_step=0.144, global_step=295.0]
Epoch 0:  51%|█████     | 3030/5971 [28:35<27:44,  1.77it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0562, train/loss_vlb_step=0.000189, train/loss_step=0.0562, global_step=295.0]
Epoch 0:  51%|█████     | 3031/5971 [28:36<27:44,  1.77it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0134, train/loss_vlb_step=5.93e-5, train/loss_step=0.0134, global_step=295.0] 
Epoch 0:  51%|█████     | 3032/5971 [28:39<27:46,  1.76it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0134, train/loss_vlb_step=5.93e-5, train/loss_step=0.0134, global_step=295.0]
Epoch 0:  51%|█████     | 3032/5971 [28:39<27:46,  1.76it/s, loss=0.132, v_num=0, train/loss_simple_step=0.386, train/loss_vlb_step=0.00166, train/loss_step=0.386, global_step=295.0]  
Epoch 0:  51%|█████     | 3033/5971 [28:40<27:46,  1.76it/s, loss=0.146, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00155, train/loss_step=0.305, global_step=296.0]
Epoch 0:  51%|█████     | 3034/5971 [28:41<27:46,  1.76it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0598, train/loss_vlb_step=0.000207, train/loss_step=0.0598, global_step=296.0]
Epoch 0:  51%|█████     | 3035/5971 [28:42<27:45,  1.76it/s, loss=0.135, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000588, train/loss_step=0.177, global_step=296.0]  
Epoch 0:  51%|█████     | 3036/5971 [28:45<27:47,  1.76it/s, loss=0.135, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000588, train/loss_step=0.177, global_step=296.0]
Epoch 0:  51%|█████     | 3036/5971 [28:45<27:47,  1.76it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000179, train/loss_step=0.0497, global_step=296.0]
Epoch 0:  51%|█████     | 3037/5971 [28:46<27:47,  1.76it/s, loss=0.132, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.0005, train/loss_step=0.149, global_step=297.0]    
Epoch 0:  51%|█████     | 3038/5971 [28:47<27:47,  1.76it/s, loss=0.136, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000858, train/loss_step=0.217, global_step=297.0]
Epoch 0:  51%|█████     | 3039/5971 [28:48<27:46,  1.76it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0239, train/loss_vlb_step=9.36e-5, train/loss_step=0.0239, global_step=297.0]
Epoch 0:  51%|█████     | 3040/5971 [28:50<27:48,  1.76it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0239, train/loss_vlb_step=9.36e-5, train/loss_step=0.0239, global_step=297.0]
Epoch 0:  51%|█████     | 3040/5971 [28:50<27:48,  1.76it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00204, train/loss_vlb_step=1.15e-5, train/loss_step=0.00204, global_step=297.0]
Epoch 0:  51%|█████     | 3041/5971 [28:51<27:47,  1.76it/s, loss=0.143, v_num=0, train/loss_simple_step=0.306, train/loss_vlb_step=0.00143, train/loss_step=0.306, global_step=298.0]    
Epoch 0:  51%|█████     | 3042/5971 [28:52<27:47,  1.76it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0733, train/loss_vlb_step=0.000241, train/loss_step=0.0733, global_step=298.0]
Epoch 0:  51%|█████     | 3043/5971 [28:53<27:47,  1.76it/s, loss=0.143, v_num=0, train/loss_simple_step=0.094, train/loss_vlb_step=0.000313, train/loss_step=0.094, global_step=298.0]  
Epoch 0:  51%|█████     | 3044/5971 [28:56<27:48,  1.75it/s, loss=0.143, v_num=0, train/loss_simple_step=0.094, train/loss_vlb_step=0.000313, train/loss_step=0.094, global_step=298.0]
Epoch 0:  51%|█████     | 3044/5971 [28:56<27:48,  1.75it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.6e-5, train/loss_step=0.0127, global_step=298.0]
Epoch 0:  51%|█████     | 3045/5971 [28:57<27:48,  1.75it/s, loss=0.168, v_num=0, train/loss_simple_step=0.649, train/loss_vlb_step=0.0146, train/loss_step=0.649, global_step=299.0]  
Epoch 0:  51%|█████     | 3046/5971 [28:57<27:48,  1.75it/s, loss=0.2, v_num=0, train/loss_simple_step=0.659, train/loss_vlb_step=0.0114, train/loss_step=0.659, global_step=299.0]  
Epoch 0:  51%|█████     | 3047/5971 [28:58<27:48,  1.75it/s, loss=0.2, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000508, train/loss_step=0.152, global_step=299.0]
Epoch 0:  51%|█████     | 3048/5971 [29:01<27:49,  1.75it/s, loss=0.2, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000508, train/loss_step=0.152, global_step=299.0]
Epoch 0:  51%|█████     | 3048/5971 [29:01<27:49,  1.75it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:14,  2.22it/s][A

Validating:   1%|          | 2/167 [00:00<00:45,  3.67it/s][A
Epoch 0:  51%|█████     | 3052/5971 [29:01<27:45,  1.75it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:   2%|▏         | 4/167 [00:00<00:22,  7.18it/s][A

Validating:   4%|▎         | 6/167 [00:00<00:16,  9.89it/s][A
Epoch 0:  51%|█████     | 3056/5971 [29:01<27:41,  1.75it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:   5%|▍         | 8/167 [00:00<00:13, 11.93it/s][A

Validating:   7%|▋         | 11/167 [00:01<00:10, 15.42it/s][A
Epoch 0:  51%|█████     | 3060/5971 [29:02<27:36,  1.76it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:   8%|▊         | 14/167 [00:01<00:08, 17.95it/s][A
Epoch 0:  51%|█████▏    | 3064/5971 [29:02<27:32,  1.76it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  10%|█         | 17/167 [00:01<00:07, 19.71it/s][A
Epoch 0:  51%|█████▏    | 3068/5971 [29:02<27:28,  1.76it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 21.03it/s][A

Validating:  14%|█▍        | 23/167 [00:01<00:06, 22.33it/s][A
Epoch 0:  51%|█████▏    | 3072/5971 [29:02<27:23,  1.76it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 23.95it/s][A
Epoch 0:  52%|█████▏    | 3076/5971 [29:02<27:19,  1.77it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 23.86it/s][A
Epoch 0:  52%|█████▏    | 3080/5971 [29:02<27:15,  1.77it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 24.91it/s][A

Validating:  21%|██        | 35/167 [00:02<00:05, 24.83it/s][A
Epoch 0:  52%|█████▏    | 3084/5971 [29:03<27:11,  1.77it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  23%|██▎       | 38/167 [00:02<00:04, 26.12it/s][A
Epoch 0:  52%|█████▏    | 3088/5971 [29:03<27:07,  1.77it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 25.23it/s][A
Epoch 0:  52%|█████▏    | 3092/5971 [29:03<27:02,  1.77it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 26.12it/s][A

Validating:  28%|██▊       | 47/167 [00:02<00:04, 26.68it/s][A
Epoch 0:  52%|█████▏    | 3096/5971 [29:03<26:58,  1.78it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 26.43it/s][A
Epoch 0:  52%|█████▏    | 3100/5971 [29:03<26:54,  1.78it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 26.87it/s][A
Epoch 0:  52%|█████▏    | 3104/5971 [29:03<26:50,  1.78it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  34%|███▍      | 57/167 [00:02<00:03, 27.86it/s][A
Epoch 0:  52%|█████▏    | 3108/5971 [29:04<26:46,  1.78it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  36%|███▌      | 60/167 [00:02<00:03, 27.21it/s][A

Validating:  38%|███▊      | 63/167 [00:03<00:03, 26.78it/s][A
Epoch 0:  52%|█████▏    | 3112/5971 [29:04<26:41,  1.78it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  40%|███▉      | 66/167 [00:03<00:03, 26.83it/s][A
Epoch 0:  52%|█████▏    | 3116/5971 [29:04<26:37,  1.79it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 27.00it/s][A
Epoch 0:  52%|█████▏    | 3120/5971 [29:04<26:33,  1.79it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 24.64it/s][A

Validating:  45%|████▍     | 75/167 [00:03<00:03, 25.64it/s][A
Epoch 0:  52%|█████▏    | 3124/5971 [29:04<26:29,  1.79it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 25.51it/s][A
Epoch 0:  52%|█████▏    | 3128/5971 [29:04<26:25,  1.79it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  49%|████▉     | 82/167 [00:03<00:03, 27.39it/s][A
Epoch 0:  52%|█████▏    | 3132/5971 [29:04<26:21,  1.80it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  51%|█████▏    | 86/167 [00:03<00:02, 28.14it/s][A
Epoch 0:  53%|█████▎    | 3136/5971 [29:05<26:17,  1.80it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  53%|█████▎    | 89/167 [00:04<00:02, 27.08it/s][A
Epoch 0:  53%|█████▎    | 3140/5971 [29:05<26:12,  1.80it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 25.51it/s][A

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 24.93it/s][A
Epoch 0:  53%|█████▎    | 3144/5971 [29:05<26:08,  1.80it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 23.67it/s][A
Epoch 0:  53%|█████▎    | 3148/5971 [29:05<26:04,  1.80it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  60%|██████    | 101/167 [00:04<00:02, 24.26it/s][A
Epoch 0:  53%|█████▎    | 3152/5971 [29:05<26:00,  1.81it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 23.92it/s][A

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 24.24it/s][A
Epoch 0:  53%|█████▎    | 3156/5971 [29:05<25:56,  1.81it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 24.56it/s][A
Epoch 0:  53%|█████▎    | 3160/5971 [29:06<25:52,  1.81it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  68%|██████▊   | 113/167 [00:05<00:02, 24.43it/s][A
Epoch 0:  53%|█████▎    | 3164/5971 [29:06<25:48,  1.81it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  69%|██████▉   | 116/167 [00:05<00:01, 25.77it/s][A

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 26.77it/s][A
Epoch 0:  53%|█████▎    | 3168/5971 [29:06<25:44,  1.81it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 26.70it/s][A
Epoch 0:  53%|█████▎    | 3172/5971 [29:06<25:40,  1.82it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 27.00it/s][A
Epoch 0:  53%|█████▎    | 3176/5971 [29:06<25:36,  1.82it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 26.06it/s][A

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 25.70it/s][A
Epoch 0:  53%|█████▎    | 3180/5971 [29:06<25:32,  1.82it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  80%|████████  | 134/167 [00:05<00:01, 24.74it/s][A
Epoch 0:  53%|█████▎    | 3184/5971 [29:06<25:28,  1.82it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 25.14it/s][A
Epoch 0:  53%|█████▎    | 3188/5971 [29:07<25:24,  1.83it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  84%|████████▍ | 140/167 [00:06<00:01, 26.04it/s][A

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 26.37it/s][A
Epoch 0:  53%|█████▎    | 3192/5971 [29:07<25:20,  1.83it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 27.12it/s][A
Epoch 0:  54%|█████▎    | 3196/5971 [29:07<25:16,  1.83it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 26.34it/s][A
Epoch 0:  54%|█████▎    | 3200/5971 [29:07<25:12,  1.83it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 26.67it/s][A

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 25.79it/s][A
Epoch 0:  54%|█████▎    | 3204/5971 [29:07<25:08,  1.83it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 26.44it/s][A
Epoch 0:  54%|█████▎    | 3208/5971 [29:07<25:04,  1.84it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 27.85it/s][A
Epoch 0:  54%|█████▍    | 3212/5971 [29:08<25:01,  1.84it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

Validating:  99%|█████████▉| 165/167 [00:07<00:00, 26.96it/s][A
Epoch 0:  54%|█████▍    | 3216/5971 [29:08<24:57,  1.84it/s, loss=0.182, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=299.0]

timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:35,  1.36it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.46it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.29it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.93it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.41it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.77it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.02it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.20it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.31it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.34it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.40it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.43it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.44it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.47it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.50it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.49it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.50it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.49it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.50it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.51it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.51it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.52it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.53it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.49it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.35it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.32it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.37it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.41it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.45it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.47it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.50it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.51it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.52it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:03,  5.27it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.24it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.26it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.29it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.36it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.41it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.44it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.47it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.49it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.51it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.52it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.53it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.53it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.54it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.55it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.38it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.41it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.16it/s]

Epoch 0:  54%|█████▍    | 3217/5971 [29:20<25:06,  1.83it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=6.53e-5, train/loss_step=0.0166, global_step=300.0]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.36it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.43it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.25it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.87it/s][A
Epoch 0:  54%|█████▍    | 3217/5971 [29:23<25:08,  1.83it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=6.53e-5, train/loss_step=0.0166, global_step=300.0]

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.33it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.67it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.91it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.07it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.22it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.29it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.36it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.39it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.34it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.33it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.31it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.39it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.43it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.51it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.57it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.60it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.59it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.62it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.63it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.65it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.25it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:05,  4.37it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  4.66it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  4.87it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:04,  5.05it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.17it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.30it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  4.90it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  4.28it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:07<00:03,  4.59it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:03,  4.84it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.05it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  4.96it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.00it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:08<00:02,  5.04it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.07it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.05it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.17it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.31it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.40it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:00,  5.28it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.26it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.23it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.22it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.20it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  5.19it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  4.93it/s]

Epoch 0:  54%|█████▍    | 3218/5971 [29:32<25:16,  1.82it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=6.53e-5, train/loss_step=0.0166, global_step=300.0]
Epoch 0:  54%|█████▍    | 3218/5971 [29:32<25:16,  1.82it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0464, train/loss_vlb_step=0.000156, train/loss_step=0.0464, global_step=300.0]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.42it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.24it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.87it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.34it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.62it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.81it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.88it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  4.89it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:08,  4.96it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.02it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.08it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:07,  5.16it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.24it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.25it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.30it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.37it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.44it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.37it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.27it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.20it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.19it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.19it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:05,  5.13it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.20it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.33it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.41it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.44it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:06<00:03,  5.45it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.44it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.47it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.52it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.57it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.56it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.57it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.59it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.61it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.63it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.65it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.64it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.50it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.56it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.60it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.63it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.53it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.49it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.44it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.42it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.35it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.35it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.09it/s]

Epoch 0:  54%|█████▍    | 3219/5971 [29:45<25:25,  1.80it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0464, train/loss_vlb_step=0.000156, train/loss_step=0.0464, global_step=300.0]
Epoch 0:  54%|█████▍    | 3219/5971 [29:45<25:25,  1.80it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0533, train/loss_vlb_step=0.000188, train/loss_step=0.0533, global_step=300.0]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.35it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.40it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.17it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.75it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.13it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.44it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.69it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.94it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.12it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.25it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.36it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.43it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.49it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.52it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.55it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.38it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.38it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.40it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.42it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.45it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.45it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.45it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.50it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.35it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.35it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.40it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.42it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.34it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.38it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.40it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.26it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.25it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.31it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.37it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.41it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.43it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.48it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.51it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.40it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.36it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.34it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.31it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.28it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  4.81it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:01,  4.22it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  4.54it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  4.78it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  4.91it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.01it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  5.15it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  4.98it/s]

Epoch 0:  54%|█████▍    | 3220/5971 [29:59<25:36,  1.79it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0533, train/loss_vlb_step=0.000188, train/loss_step=0.0533, global_step=300.0]
Epoch 0:  54%|█████▍    | 3220/5971 [29:59<25:36,  1.79it/s, loss=0.168, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000619, train/loss_step=0.185, global_step=300.0]  
Epoch 0:  54%|█████▍    | 3221/5971 [29:59<25:36,  1.79it/s, loss=0.168, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000619, train/loss_step=0.185, global_step=300.0]
Epoch 0:  54%|█████▍    | 3221/5971 [29:59<25:36,  1.79it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00195, train/loss_vlb_step=1.16e-5, train/loss_step=0.00195, global_step=301.0]
Epoch 0:  54%|█████▍    | 3222/5971 [30:00<25:36,  1.79it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00195, train/loss_vlb_step=1.16e-5, train/loss_step=0.00195, global_step=301.0]
Epoch 0:  54%|█████▍    | 3222/5971 [30:00<25:36,  1.79it/s, loss=0.158, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000603, train/loss_step=0.168, global_step=301.0]   
Epoch 0:  54%|█████▍    | 3223/5971 [30:01<25:35,  1.79it/s, loss=0.158, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000603, train/loss_step=0.168, global_step=301.0]
Epoch 0:  54%|█████▍    | 3223/5971 [30:01<25:35,  1.79it/s, loss=0.151, v_num=0, train/loss_simple_step=0.039, train/loss_vlb_step=0.000149, train/loss_step=0.039, global_step=301.0]
Epoch 0:  54%|█████▍    | 3224/5971 [30:04<25:36,  1.79it/s, loss=0.151, v_num=0, train/loss_simple_step=0.039, train/loss_vlb_step=0.000149, train/loss_step=0.039, global_step=301.0]
Epoch 0:  54%|█████▍    | 3224/5971 [30:04<25:36,  1.79it/s, loss=0.15, v_num=0, train/loss_simple_step=0.027, train/loss_vlb_step=0.000104, train/loss_step=0.027, global_step=301.0] 
Epoch 0:  54%|█████▍    | 3225/5971 [30:04<25:36,  1.79it/s, loss=0.15, v_num=0, train/loss_simple_step=0.027, train/loss_vlb_step=0.000104, train/loss_step=0.027, global_step=301.0]
Epoch 0:  54%|█████▍    | 3225/5971 [30:04<25:36,  1.79it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0543, train/loss_vlb_step=0.000186, train/loss_step=0.0543, global_step=302.0]
Epoch 0:  54%|█████▍    | 3226/5971 [30:05<25:36,  1.79it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0543, train/loss_vlb_step=0.000186, train/loss_step=0.0543, global_step=302.0]
Epoch 0:  54%|█████▍    | 3226/5971 [30:05<25:36,  1.79it/s, loss=0.176, v_num=0, train/loss_simple_step=0.840, train/loss_vlb_step=0.0435, train/loss_step=0.840, global_step=302.0]    
Epoch 0:  54%|█████▍    | 3227/5971 [30:06<25:35,  1.79it/s, loss=0.176, v_num=0, train/loss_simple_step=0.840, train/loss_vlb_step=0.0435, train/loss_step=0.840, global_step=302.0]
Epoch 0:  54%|█████▍    | 3227/5971 [30:06<25:35,  1.79it/s, loss=0.186, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000824, train/loss_step=0.228, global_step=302.0]
Epoch 0:  54%|█████▍    | 3228/5971 [30:08<25:36,  1.79it/s, loss=0.186, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000824, train/loss_step=0.228, global_step=302.0]
Epoch 0:  54%|█████▍    | 3228/5971 [30:08<25:36,  1.79it/s, loss=0.187, v_num=0, train/loss_simple_step=0.00902, train/loss_vlb_step=4.37e-5, train/loss_step=0.00902, global_step=302.0]
Epoch 0:  54%|█████▍    | 3229/5971 [30:09<25:36,  1.78it/s, loss=0.187, v_num=0, train/loss_simple_step=0.00902, train/loss_vlb_step=4.37e-5, train/loss_step=0.00902, global_step=302.0]
Epoch 0:  54%|█████▍    | 3229/5971 [30:09<25:36,  1.78it/s, loss=0.178, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000434, train/loss_step=0.132, global_step=303.0]   
Epoch 0:  54%|█████▍    | 3230/5971 [30:10<25:36,  1.78it/s, loss=0.178, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000434, train/loss_step=0.132, global_step=303.0]
Epoch 0:  54%|█████▍    | 3230/5971 [30:10<25:36,  1.78it/s, loss=0.197, v_num=0, train/loss_simple_step=0.462, train/loss_vlb_step=0.00396, train/loss_step=0.462, global_step=303.0] 
Epoch 0:  54%|█████▍    | 3231/5971 [30:11<25:35,  1.78it/s, loss=0.197, v_num=0, train/loss_simple_step=0.462, train/loss_vlb_step=0.00396, train/loss_step=0.462, global_step=303.0]
Epoch 0:  54%|█████▍    | 3231/5971 [30:11<25:35,  1.78it/s, loss=0.199, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000402, train/loss_step=0.122, global_step=303.0]
Epoch 0:  54%|█████▍    | 3232/5971 [30:13<25:36,  1.78it/s, loss=0.199, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000402, train/loss_step=0.122, global_step=303.0]
Epoch 0:  54%|█████▍    | 3232/5971 [30:13<25:36,  1.78it/s, loss=0.198, v_num=0, train/loss_simple_step=0.00187, train/loss_vlb_step=1.09e-5, train/loss_step=0.00187, global_step=303.0]
Epoch 0:  54%|█████▍    | 3233/5971 [30:14<25:36,  1.78it/s, loss=0.198, v_num=0, train/loss_simple_step=0.00187, train/loss_vlb_step=1.09e-5, train/loss_step=0.00187, global_step=303.0]
Epoch 0:  54%|█████▍    | 3233/5971 [30:14<25:36,  1.78it/s, loss=0.181, v_num=0, train/loss_simple_step=0.304, train/loss_vlb_step=0.00129, train/loss_step=0.304, global_step=304.0]    
Epoch 0:  54%|█████▍    | 3234/5971 [30:15<25:36,  1.78it/s, loss=0.181, v_num=0, train/loss_simple_step=0.304, train/loss_vlb_step=0.00129, train/loss_step=0.304, global_step=304.0]
Epoch 0:  54%|█████▍    | 3234/5971 [30:15<25:36,  1.78it/s, loss=0.169, v_num=0, train/loss_simple_step=0.408, train/loss_vlb_step=0.00483, train/loss_step=0.408, global_step=304.0]
Epoch 0:  54%|█████▍    | 3235/5971 [30:16<25:35,  1.78it/s, loss=0.169, v_num=0, train/loss_simple_step=0.408, train/loss_vlb_step=0.00483, train/loss_step=0.408, global_step=304.0]
Epoch 0:  54%|█████▍    | 3235/5971 [30:16<25:35,  1.78it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0832, train/loss_vlb_step=0.000274, train/loss_step=0.0832, global_step=304.0]
Epoch 0:  54%|█████▍    | 3236/5971 [30:18<25:36,  1.78it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0832, train/loss_vlb_step=0.000274, train/loss_step=0.0832, global_step=304.0]
Epoch 0:  54%|█████▍    | 3236/5971 [30:18<25:36,  1.78it/s, loss=0.168, v_num=0, train/loss_simple_step=0.183, train/loss_vlb_step=0.000637, train/loss_step=0.183, global_step=304.0]  
Epoch 0:  54%|█████▍    | 3237/5971 [30:19<25:36,  1.78it/s, loss=0.168, v_num=0, train/loss_simple_step=0.183, train/loss_vlb_step=0.000637, train/loss_step=0.183, global_step=304.0]
Epoch 0:  54%|█████▍    | 3237/5971 [30:19<25:36,  1.78it/s, loss=0.174, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=305.0]
Epoch 0:  54%|█████▍    | 3238/5971 [30:20<25:36,  1.78it/s, loss=0.174, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=305.0]
Epoch 0:  54%|█████▍    | 3238/5971 [30:20<25:36,  1.78it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0602, train/loss_vlb_step=0.000208, train/loss_step=0.0602, global_step=305.0]
Epoch 0:  54%|█████▍    | 3239/5971 [30:21<25:35,  1.78it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0602, train/loss_vlb_step=0.000208, train/loss_step=0.0602, global_step=305.0]
Epoch 0:  54%|█████▍    | 3239/5971 [30:21<25:35,  1.78it/s, loss=0.175, v_num=0, train/loss_simple_step=0.063, train/loss_vlb_step=0.000217, train/loss_step=0.063, global_step=305.0]  
Epoch 0:  54%|█████▍    | 3240/5971 [30:23<25:36,  1.78it/s, loss=0.175, v_num=0, train/loss_simple_step=0.063, train/loss_vlb_step=0.000217, train/loss_step=0.063, global_step=305.0]
Epoch 0:  54%|█████▍    | 3240/5971 [30:23<25:36,  1.78it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0539, train/loss_vlb_step=0.00019, train/loss_step=0.0539, global_step=305.0]
Epoch 0:  54%|█████▍    | 3241/5971 [30:24<25:36,  1.78it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0539, train/loss_vlb_step=0.00019, train/loss_step=0.0539, global_step=305.0]
Epoch 0:  54%|█████▍    | 3241/5971 [30:24<25:36,  1.78it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0337, train/loss_vlb_step=0.000123, train/loss_step=0.0337, global_step=306.0]
Epoch 0:  54%|█████▍    | 3242/5971 [30:25<25:36,  1.78it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0337, train/loss_vlb_step=0.000123, train/loss_step=0.0337, global_step=306.0]
Epoch 0:  54%|█████▍    | 3242/5971 [30:25<25:36,  1.78it/s, loss=0.17, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000536, train/loss_step=0.163, global_step=306.0]  
Epoch 0:  54%|█████▍    | 3243/5971 [30:26<25:35,  1.78it/s, loss=0.17, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000536, train/loss_step=0.163, global_step=306.0]
Epoch 0:  54%|█████▍    | 3243/5971 [30:26<25:35,  1.78it/s, loss=0.178, v_num=0, train/loss_simple_step=0.215, train/loss_vlb_step=0.000796, train/loss_step=0.215, global_step=306.0]
Epoch 0:  54%|█████▍    | 3244/5971 [30:28<25:36,  1.77it/s, loss=0.178, v_num=0, train/loss_simple_step=0.215, train/loss_vlb_step=0.000796, train/loss_step=0.215, global_step=306.0]
Epoch 0:  54%|█████▍    | 3244/5971 [30:28<25:36,  1.77it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00366, train/loss_vlb_step=1.93e-5, train/loss_step=0.00366, global_step=306.0]
Epoch 0:  54%|█████▍    | 3245/5971 [30:29<25:36,  1.77it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00366, train/loss_vlb_step=1.93e-5, train/loss_step=0.00366, global_step=306.0]
Epoch 0:  54%|█████▍    | 3245/5971 [30:29<25:36,  1.77it/s, loss=0.182, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000518, train/loss_step=0.153, global_step=307.0]   
Epoch 0:  54%|█████▍    | 3246/5971 [30:30<25:35,  1.77it/s, loss=0.182, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000518, train/loss_step=0.153, global_step=307.0]
Epoch 0:  54%|█████▍    | 3246/5971 [30:30<25:35,  1.77it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00417, train/loss_vlb_step=2.21e-5, train/loss_step=0.00417, global_step=307.0]
Epoch 0:  54%|█████▍    | 3247/5971 [30:30<25:35,  1.77it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00417, train/loss_vlb_step=2.21e-5, train/loss_step=0.00417, global_step=307.0]
Epoch 0:  54%|█████▍    | 3247/5971 [30:30<25:35,  1.77it/s, loss=0.135, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000382, train/loss_step=0.116, global_step=307.0]  
Epoch 0:  54%|█████▍    | 3248/5971 [30:33<25:36,  1.77it/s, loss=0.135, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000382, train/loss_step=0.116, global_step=307.0]
Epoch 0:  54%|█████▍    | 3248/5971 [30:33<25:36,  1.77it/s, loss=0.135, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.56e-5, train/loss_step=0.021, global_step=307.0] 
Epoch 0:  54%|█████▍    | 3249/5971 [30:34<25:36,  1.77it/s, loss=0.135, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.56e-5, train/loss_step=0.021, global_step=307.0]
Epoch 0:  54%|█████▍    | 3249/5971 [30:34<25:36,  1.77it/s, loss=0.16, v_num=0, train/loss_simple_step=0.630, train/loss_vlb_step=0.00788, train/loss_step=0.630, global_step=308.0] 
Epoch 0:  54%|█████▍    | 3250/5971 [30:35<25:35,  1.77it/s, loss=0.16, v_num=0, train/loss_simple_step=0.630, train/loss_vlb_step=0.00788, train/loss_step=0.630, global_step=308.0]
Epoch 0:  54%|█████▍    | 3250/5971 [30:35<25:35,  1.77it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00459, train/loss_vlb_step=2.29e-5, train/loss_step=0.00459, global_step=308.0]
Epoch 0:  54%|█████▍    | 3251/5971 [30:36<25:35,  1.77it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00459, train/loss_vlb_step=2.29e-5, train/loss_step=0.00459, global_step=308.0]
Epoch 0:  54%|█████▍    | 3251/5971 [30:36<25:35,  1.77it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0258, train/loss_vlb_step=0.000104, train/loss_step=0.0258, global_step=308.0] 
Epoch 0:  54%|█████▍    | 3252/5971 [30:38<25:36,  1.77it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0258, train/loss_vlb_step=0.000104, train/loss_step=0.0258, global_step=308.0]
Epoch 0:  54%|█████▍    | 3252/5971 [30:38<25:36,  1.77it/s, loss=0.142, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000668, train/loss_step=0.190, global_step=308.0]  
Epoch 0:  54%|█████▍    | 3253/5971 [30:39<25:36,  1.77it/s, loss=0.142, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000668, train/loss_step=0.190, global_step=308.0]
Epoch 0:  54%|█████▍    | 3253/5971 [30:39<25:36,  1.77it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0393, train/loss_vlb_step=0.000138, train/loss_step=0.0393, global_step=309.0]
Epoch 0:  54%|█████▍    | 3254/5971 [30:40<25:36,  1.77it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0393, train/loss_vlb_step=0.000138, train/loss_step=0.0393, global_step=309.0]
Epoch 0:  54%|█████▍    | 3254/5971 [30:40<25:36,  1.77it/s, loss=0.12, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000757, train/loss_step=0.227, global_step=309.0]   
Epoch 0:  55%|█████▍    | 3255/5971 [30:41<25:35,  1.77it/s, loss=0.12, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000757, train/loss_step=0.227, global_step=309.0]
Epoch 0:  55%|█████▍    | 3255/5971 [30:41<25:35,  1.77it/s, loss=0.122, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000432, train/loss_step=0.132, global_step=309.0]
Epoch 0:  55%|█████▍    | 3256/5971 [30:43<25:36,  1.77it/s, loss=0.122, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000432, train/loss_step=0.132, global_step=309.0]
Epoch 0:  55%|█████▍    | 3256/5971 [30:43<25:36,  1.77it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.82e-5, train/loss_step=0.00347, global_step=309.0]
Epoch 0:  55%|█████▍    | 3257/5971 [30:44<25:36,  1.77it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.82e-5, train/loss_step=0.00347, global_step=309.0]
Epoch 0:  55%|█████▍    | 3257/5971 [30:44<25:36,  1.77it/s, loss=0.107, v_num=0, train/loss_simple_step=0.00805, train/loss_vlb_step=3.95e-5, train/loss_step=0.00805, global_step=310.0]
Epoch 0:  55%|█████▍    | 3258/5971 [30:45<25:35,  1.77it/s, loss=0.107, v_num=0, train/loss_simple_step=0.00805, train/loss_vlb_step=3.95e-5, train/loss_step=0.00805, global_step=310.0]
Epoch 0:  55%|█████▍    | 3258/5971 [30:45<25:35,  1.77it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0032, train/loss_vlb_step=1.73e-5, train/loss_step=0.0032, global_step=310.0]  
Epoch 0:  55%|█████▍    | 3259/5971 [30:46<25:35,  1.77it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0032, train/loss_vlb_step=1.73e-5, train/loss_step=0.0032, global_step=310.0]
Epoch 0:  55%|█████▍    | 3259/5971 [30:46<25:35,  1.77it/s, loss=0.117, v_num=0, train/loss_simple_step=0.315, train/loss_vlb_step=0.00176, train/loss_step=0.315, global_step=310.0]  
Epoch 0:  55%|█████▍    | 3260/5971 [30:48<25:36,  1.76it/s, loss=0.117, v_num=0, train/loss_simple_step=0.315, train/loss_vlb_step=0.00176, train/loss_step=0.315, global_step=310.0]
Epoch 0:  55%|█████▍    | 3260/5971 [30:48<25:36,  1.76it/s, loss=0.125, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000708, train/loss_step=0.207, global_step=310.0]
Epoch 0:  55%|█████▍    | 3261/5971 [30:49<25:36,  1.76it/s, loss=0.125, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000708, train/loss_step=0.207, global_step=310.0]
Epoch 0:  55%|█████▍    | 3261/5971 [30:49<25:36,  1.76it/s, loss=0.135, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.00092, train/loss_step=0.237, global_step=311.0] 
Epoch 0:  55%|█████▍    | 3262/5971 [30:50<25:35,  1.76it/s, loss=0.135, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.00092, train/loss_step=0.237, global_step=311.0]
Epoch 0:  55%|█████▍    | 3262/5971 [30:50<25:35,  1.76it/s, loss=0.135, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000565, train/loss_step=0.164, global_step=311.0]
Epoch 0:  55%|█████▍    | 3263/5971 [30:50<25:35,  1.76it/s, loss=0.135, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000565, train/loss_step=0.164, global_step=311.0]
Epoch 0:  55%|█████▍    | 3263/5971 [30:50<25:35,  1.76it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0338, train/loss_vlb_step=0.000125, train/loss_step=0.0338, global_step=311.0]
Epoch 0:  55%|█████▍    | 3264/5971 [30:53<25:36,  1.76it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0338, train/loss_vlb_step=0.000125, train/loss_step=0.0338, global_step=311.0]
Epoch 0:  55%|█████▍    | 3264/5971 [30:53<25:36,  1.76it/s, loss=0.134, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000524, train/loss_step=0.158, global_step=311.0]  
Epoch 0:  55%|█████▍    | 3265/5971 [30:53<25:36,  1.76it/s, loss=0.134, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000524, train/loss_step=0.158, global_step=311.0]
Epoch 0:  55%|█████▍    | 3265/5971 [30:53<25:36,  1.76it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0394, train/loss_vlb_step=0.000137, train/loss_step=0.0394, global_step=312.0]
Epoch 0:  55%|█████▍    | 3266/5971 [30:54<25:35,  1.76it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0394, train/loss_vlb_step=0.000137, train/loss_step=0.0394, global_step=312.0]
Epoch 0:  55%|█████▍    | 3266/5971 [30:54<25:35,  1.76it/s, loss=0.134, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000414, train/loss_step=0.124, global_step=312.0]  
Epoch 0:  55%|█████▍    | 3267/5971 [30:55<25:35,  1.76it/s, loss=0.134, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000414, train/loss_step=0.124, global_step=312.0]
Epoch 0:  55%|█████▍    | 3267/5971 [30:55<25:35,  1.76it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0234, train/loss_vlb_step=9.33e-5, train/loss_step=0.0234, global_step=312.0]
Epoch 0:  55%|█████▍    | 3268/5971 [30:57<25:36,  1.76it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0234, train/loss_vlb_step=9.33e-5, train/loss_step=0.0234, global_step=312.0]
Epoch 0:  55%|█████▍    | 3268/5971 [30:57<25:36,  1.76it/s, loss=0.138, v_num=0, train/loss_simple_step=0.198, train/loss_vlb_step=0.000705, train/loss_step=0.198, global_step=312.0] 
Epoch 0:  55%|█████▍    | 3269/5971 [30:58<25:35,  1.76it/s, loss=0.138, v_num=0, train/loss_simple_step=0.198, train/loss_vlb_step=0.000705, train/loss_step=0.198, global_step=312.0]
Epoch 0:  55%|█████▍    | 3269/5971 [30:58<25:35,  1.76it/s, loss=0.117, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000789, train/loss_step=0.208, global_step=313.0]
Epoch 0:  55%|█████▍    | 3270/5971 [30:59<25:35,  1.76it/s, loss=0.117, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000789, train/loss_step=0.208, global_step=313.0]
Epoch 0:  55%|█████▍    | 3270/5971 [30:59<25:35,  1.76it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0322, train/loss_vlb_step=0.000123, train/loss_step=0.0322, global_step=313.0]
Epoch 0:  55%|█████▍    | 3271/5971 [31:00<25:35,  1.76it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0322, train/loss_vlb_step=0.000123, train/loss_step=0.0322, global_step=313.0]
Epoch 0:  55%|█████▍    | 3271/5971 [31:00<25:35,  1.76it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00205, train/loss_vlb_step=1.19e-5, train/loss_step=0.00205, global_step=313.0]
Epoch 0:  55%|█████▍    | 3272/5971 [31:02<25:36,  1.76it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00205, train/loss_vlb_step=1.19e-5, train/loss_step=0.00205, global_step=313.0]
Epoch 0:  55%|█████▍    | 3272/5971 [31:02<25:36,  1.76it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.00072, train/loss_step=0.190, global_step=313.0]    
Epoch 0:  55%|█████▍    | 3273/5971 [31:03<25:35,  1.76it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.00072, train/loss_step=0.190, global_step=313.0]
Epoch 0:  55%|█████▍    | 3273/5971 [31:03<25:35,  1.76it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0217, train/loss_vlb_step=8.58e-5, train/loss_step=0.0217, global_step=314.0]
Epoch 0:  55%|█████▍    | 3274/5971 [31:04<25:35,  1.76it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0217, train/loss_vlb_step=8.58e-5, train/loss_step=0.0217, global_step=314.0]
Epoch 0:  55%|█████▍    | 3274/5971 [31:04<25:35,  1.76it/s, loss=0.116, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.00085, train/loss_step=0.213, global_step=314.0]  
Epoch 0:  55%|█████▍    | 3275/5971 [31:05<25:35,  1.76it/s, loss=0.116, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.00085, train/loss_step=0.213, global_step=314.0]
Epoch 0:  55%|█████▍    | 3275/5971 [31:05<25:35,  1.76it/s, loss=0.138, v_num=0, train/loss_simple_step=0.578, train/loss_vlb_step=0.00576, train/loss_step=0.578, global_step=314.0]
Epoch 0:  55%|█████▍    | 3276/5971 [31:07<25:35,  1.75it/s, loss=0.138, v_num=0, train/loss_simple_step=0.578, train/loss_vlb_step=0.00576, train/loss_step=0.578, global_step=314.0]
Epoch 0:  55%|█████▍    | 3276/5971 [31:07<25:35,  1.75it/s, loss=0.145, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000442, train/loss_step=0.135, global_step=314.0]
Epoch 0:  55%|█████▍    | 3277/5971 [31:08<25:35,  1.75it/s, loss=0.145, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000442, train/loss_step=0.135, global_step=314.0]
Epoch 0:  55%|█████▍    | 3277/5971 [31:08<25:35,  1.75it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0499, train/loss_vlb_step=0.000178, train/loss_step=0.0499, global_step=315.0]
Epoch 0:  55%|█████▍    | 3278/5971 [31:09<25:35,  1.75it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0499, train/loss_vlb_step=0.000178, train/loss_step=0.0499, global_step=315.0]
Epoch 0:  55%|█████▍    | 3278/5971 [31:09<25:35,  1.75it/s, loss=0.162, v_num=0, train/loss_simple_step=0.315, train/loss_vlb_step=0.00116, train/loss_step=0.315, global_step=315.0]   
Epoch 0:  55%|█████▍    | 3279/5971 [31:10<25:35,  1.75it/s, loss=0.162, v_num=0, train/loss_simple_step=0.315, train/loss_vlb_step=0.00116, train/loss_step=0.315, global_step=315.0]
Epoch 0:  55%|█████▍    | 3279/5971 [31:10<25:35,  1.75it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0171, train/loss_vlb_step=7.21e-5, train/loss_step=0.0171, global_step=315.0]
Epoch 0:  55%|█████▍    | 3280/5971 [31:12<25:35,  1.75it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0171, train/loss_vlb_step=7.21e-5, train/loss_step=0.0171, global_step=315.0]
Epoch 0:  55%|█████▍    | 3280/5971 [31:12<25:35,  1.75it/s, loss=0.142, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000333, train/loss_step=0.101, global_step=315.0] 
Epoch 0:  55%|█████▍    | 3281/5971 [31:13<25:35,  1.75it/s, loss=0.142, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000333, train/loss_step=0.101, global_step=315.0]
Epoch 0:  55%|█████▍    | 3281/5971 [31:13<25:35,  1.75it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0734, train/loss_vlb_step=0.000246, train/loss_step=0.0734, global_step=316.0]
Epoch 0:  55%|█████▍    | 3282/5971 [31:14<25:35,  1.75it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0734, train/loss_vlb_step=0.000246, train/loss_step=0.0734, global_step=316.0]
Epoch 0:  55%|█████▍    | 3282/5971 [31:14<25:35,  1.75it/s, loss=0.164, v_num=0, train/loss_simple_step=0.761, train/loss_vlb_step=0.0224, train/loss_step=0.761, global_step=316.0]    
Epoch 0:  55%|█████▍    | 3283/5971 [31:15<25:34,  1.75it/s, loss=0.164, v_num=0, train/loss_simple_step=0.761, train/loss_vlb_step=0.0224, train/loss_step=0.761, global_step=316.0]
Epoch 0:  55%|█████▍    | 3283/5971 [31:15<25:34,  1.75it/s, loss=0.192, v_num=0, train/loss_simple_step=0.596, train/loss_vlb_step=0.00824, train/loss_step=0.596, global_step=316.0]
Epoch 0:  55%|█████▍    | 3284/5971 [31:17<25:35,  1.75it/s, loss=0.192, v_num=0, train/loss_simple_step=0.596, train/loss_vlb_step=0.00824, train/loss_step=0.596, global_step=316.0]
Epoch 0:  55%|█████▍    | 3284/5971 [31:17<25:35,  1.75it/s, loss=0.196, v_num=0, train/loss_simple_step=0.241, train/loss_vlb_step=0.00094, train/loss_step=0.241, global_step=316.0]
Epoch 0:  55%|█████▌    | 3285/5971 [31:18<25:35,  1.75it/s, loss=0.196, v_num=0, train/loss_simple_step=0.241, train/loss_vlb_step=0.00094, train/loss_step=0.241, global_step=316.0]
Epoch 0:  55%|█████▌    | 3285/5971 [31:18<25:35,  1.75it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0023, train/loss_vlb_step=1.3e-5, train/loss_step=0.0023, global_step=317.0]
Epoch 0:  55%|█████▌    | 3286/5971 [31:19<25:35,  1.75it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0023, train/loss_vlb_step=1.3e-5, train/loss_step=0.0023, global_step=317.0]
Epoch 0:  55%|█████▌    | 3286/5971 [31:19<25:35,  1.75it/s, loss=0.204, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00156, train/loss_step=0.318, global_step=317.0] 
Epoch 0:  55%|█████▌    | 3287/5971 [31:20<25:34,  1.75it/s, loss=0.204, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00156, train/loss_step=0.318, global_step=317.0]
Epoch 0:  55%|█████▌    | 3287/5971 [31:20<25:34,  1.75it/s, loss=0.209, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000443, train/loss_step=0.128, global_step=317.0]
Epoch 0:  55%|█████▌    | 3288/5971 [31:22<25:35,  1.75it/s, loss=0.209, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000443, train/loss_step=0.128, global_step=317.0]
Epoch 0:  55%|█████▌    | 3288/5971 [31:22<25:35,  1.75it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0966, train/loss_vlb_step=0.000318, train/loss_step=0.0966, global_step=317.0]
Epoch 0:  55%|█████▌    | 3289/5971 [31:23<25:35,  1.75it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0966, train/loss_vlb_step=0.000318, train/loss_step=0.0966, global_step=317.0]
Epoch 0:  55%|█████▌    | 3289/5971 [31:23<25:35,  1.75it/s, loss=0.194, v_num=0, train/loss_simple_step=0.00978, train/loss_vlb_step=4.57e-5, train/loss_step=0.00978, global_step=318.0]
Epoch 0:  55%|█████▌    | 3290/5971 [31:24<25:35,  1.75it/s, loss=0.194, v_num=0, train/loss_simple_step=0.00978, train/loss_vlb_step=4.57e-5, train/loss_step=0.00978, global_step=318.0]
Epoch 0:  55%|█████▌    | 3290/5971 [31:24<25:35,  1.75it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.00033, train/loss_step=0.0991, global_step=318.0]  
Epoch 0:  55%|█████▌    | 3291/5971 [31:25<25:34,  1.75it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.00033, train/loss_step=0.0991, global_step=318.0]
Epoch 0:  55%|█████▌    | 3291/5971 [31:25<25:34,  1.75it/s, loss=0.219, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00251, train/loss_step=0.426, global_step=318.0]  
Epoch 0:  55%|█████▌    | 3292/5971 [31:27<25:35,  1.74it/s, loss=0.219, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00251, train/loss_step=0.426, global_step=318.0]
Epoch 0:  55%|█████▌    | 3292/5971 [31:27<25:35,  1.74it/s, loss=0.223, v_num=0, train/loss_simple_step=0.279, train/loss_vlb_step=0.00103, train/loss_step=0.279, global_step=318.0]
Epoch 0:  55%|█████▌    | 3293/5971 [31:28<25:35,  1.74it/s, loss=0.223, v_num=0, train/loss_simple_step=0.279, train/loss_vlb_step=0.00103, train/loss_step=0.279, global_step=318.0]
Epoch 0:  55%|█████▌    | 3293/5971 [31:28<25:35,  1.74it/s, loss=0.227, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000345, train/loss_step=0.105, global_step=319.0]
Epoch 0:  55%|█████▌    | 3294/5971 [31:29<25:34,  1.74it/s, loss=0.227, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000345, train/loss_step=0.105, global_step=319.0]
Epoch 0:  55%|█████▌    | 3294/5971 [31:29<25:34,  1.74it/s, loss=0.229, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.00117, train/loss_step=0.250, global_step=319.0] 
Epoch 0:  55%|█████▌    | 3295/5971 [31:29<25:34,  1.74it/s, loss=0.229, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.00117, train/loss_step=0.250, global_step=319.0]
Epoch 0:  55%|█████▌    | 3295/5971 [31:29<25:34,  1.74it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000321, train/loss_step=0.0967, global_step=319.0]
Epoch 0:  55%|█████▌    | 3296/5971 [31:32<25:35,  1.74it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000321, train/loss_step=0.0967, global_step=319.0]
Epoch 0:  55%|█████▌    | 3296/5971 [31:32<25:35,  1.74it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0521, train/loss_vlb_step=0.000181, train/loss_step=0.0521, global_step=319.0]
Epoch 0:  55%|█████▌    | 3297/5971 [31:32<25:34,  1.74it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0521, train/loss_vlb_step=0.000181, train/loss_step=0.0521, global_step=319.0]
Epoch 0:  55%|█████▌    | 3297/5971 [31:32<25:34,  1.74it/s, loss=0.208, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.00066, train/loss_step=0.186, global_step=320.0]   
Epoch 0:  55%|█████▌    | 3298/5971 [31:33<25:34,  1.74it/s, loss=0.208, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.00066, train/loss_step=0.186, global_step=320.0]
Epoch 0:  55%|█████▌    | 3298/5971 [31:33<25:34,  1.74it/s, loss=0.201, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000616, train/loss_step=0.184, global_step=320.0]
Epoch 0:  55%|█████▌    | 3299/5971 [31:34<25:34,  1.74it/s, loss=0.201, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000616, train/loss_step=0.184, global_step=320.0]
Epoch 0:  55%|█████▌    | 3299/5971 [31:34<25:34,  1.74it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.64e-5, train/loss_step=0.0125, global_step=320.0]
Epoch 0:  55%|█████▌    | 3300/5971 [31:37<25:35,  1.74it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.64e-5, train/loss_step=0.0125, global_step=320.0]
Epoch 0:  55%|█████▌    | 3300/5971 [31:37<25:35,  1.74it/s, loss=0.204, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000559, train/loss_step=0.158, global_step=320.0] 
Epoch 0:  55%|█████▌    | 3301/5971 [31:38<25:34,  1.74it/s, loss=0.204, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000559, train/loss_step=0.158, global_step=320.0]
Epoch 0:  55%|█████▌    | 3301/5971 [31:38<25:34,  1.74it/s, loss=0.2, v_num=0, train/loss_simple_step=0.00925, train/loss_vlb_step=4.09e-5, train/loss_step=0.00925, global_step=321.0]
Epoch 0:  55%|█████▌    | 3302/5971 [31:38<25:34,  1.74it/s, loss=0.2, v_num=0, train/loss_simple_step=0.00925, train/loss_vlb_step=4.09e-5, train/loss_step=0.00925, global_step=321.0]
Epoch 0:  55%|█████▌    | 3302/5971 [31:38<25:34,  1.74it/s, loss=0.175, v_num=0, train/loss_simple_step=0.257, train/loss_vlb_step=0.000889, train/loss_step=0.257, global_step=321.0] 
Epoch 0:  55%|█████▌    | 3303/5971 [31:39<25:34,  1.74it/s, loss=0.175, v_num=0, train/loss_simple_step=0.257, train/loss_vlb_step=0.000889, train/loss_step=0.257, global_step=321.0]
Epoch 0:  55%|█████▌    | 3303/5971 [31:39<25:34,  1.74it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.46e-5, train/loss_step=0.00468, global_step=321.0]
Epoch 0:  55%|█████▌    | 3304/5971 [31:41<25:34,  1.74it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.46e-5, train/loss_step=0.00468, global_step=321.0]
Epoch 0:  55%|█████▌    | 3304/5971 [31:41<25:34,  1.74it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00359, train/loss_vlb_step=1.93e-5, train/loss_step=0.00359, global_step=321.0]
Epoch 0:  55%|█████▌    | 3305/5971 [31:42<25:34,  1.74it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00359, train/loss_vlb_step=1.93e-5, train/loss_step=0.00359, global_step=321.0]
Epoch 0:  55%|█████▌    | 3305/5971 [31:42<25:34,  1.74it/s, loss=0.164, v_num=0, train/loss_simple_step=0.599, train/loss_vlb_step=0.00749, train/loss_step=0.599, global_step=322.0]    
Epoch 0:  55%|█████▌    | 3306/5971 [31:43<25:34,  1.74it/s, loss=0.164, v_num=0, train/loss_simple_step=0.599, train/loss_vlb_step=0.00749, train/loss_step=0.599, global_step=322.0]
Epoch 0:  55%|█████▌    | 3306/5971 [31:43<25:34,  1.74it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00659, train/loss_vlb_step=3.09e-5, train/loss_step=0.00659, global_step=322.0]
Epoch 0:  55%|█████▌    | 3307/5971 [31:44<25:33,  1.74it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00659, train/loss_vlb_step=3.09e-5, train/loss_step=0.00659, global_step=322.0]
Epoch 0:  55%|█████▌    | 3307/5971 [31:44<25:33,  1.74it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00235, train/loss_vlb_step=1.32e-5, train/loss_step=0.00235, global_step=322.0]
Epoch 0:  55%|█████▌    | 3308/5971 [31:46<25:34,  1.74it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00235, train/loss_vlb_step=1.32e-5, train/loss_step=0.00235, global_step=322.0]
Epoch 0:  55%|█████▌    | 3308/5971 [31:46<25:34,  1.74it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00479, train/loss_vlb_step=2.54e-5, train/loss_step=0.00479, global_step=322.0]
Epoch 0:  55%|█████▌    | 3309/5971 [31:47<25:34,  1.74it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00479, train/loss_vlb_step=2.54e-5, train/loss_step=0.00479, global_step=322.0]
Epoch 0:  55%|█████▌    | 3309/5971 [31:47<25:34,  1.74it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0221, train/loss_vlb_step=8.63e-5, train/loss_step=0.0221, global_step=323.0]  
Epoch 0:  55%|█████▌    | 3310/5971 [31:48<25:33,  1.73it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0221, train/loss_vlb_step=8.63e-5, train/loss_step=0.0221, global_step=323.0]
Epoch 0:  55%|█████▌    | 3310/5971 [31:48<25:33,  1.73it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0264, train/loss_vlb_step=0.0001, train/loss_step=0.0264, global_step=323.0] 
Epoch 0:  55%|█████▌    | 3311/5971 [31:49<25:33,  1.73it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0264, train/loss_vlb_step=0.0001, train/loss_step=0.0264, global_step=323.0]
Epoch 0:  55%|█████▌    | 3311/5971 [31:49<25:33,  1.73it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0828, train/loss_vlb_step=0.000283, train/loss_step=0.0828, global_step=323.0]
Epoch 0:  55%|█████▌    | 3312/5971 [31:51<25:34,  1.73it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0828, train/loss_vlb_step=0.000283, train/loss_step=0.0828, global_step=323.0]
Epoch 0:  55%|█████▌    | 3312/5971 [31:51<25:34,  1.73it/s, loss=0.117, v_num=0, train/loss_simple_step=0.284, train/loss_vlb_step=0.00116, train/loss_step=0.284, global_step=323.0]   
Epoch 0:  55%|█████▌    | 3313/5971 [31:52<25:33,  1.73it/s, loss=0.117, v_num=0, train/loss_simple_step=0.284, train/loss_vlb_step=0.00116, train/loss_step=0.284, global_step=323.0]
Epoch 0:  55%|█████▌    | 3313/5971 [31:52<25:33,  1.73it/s, loss=0.113, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.5e-5, train/loss_step=0.016, global_step=324.0] 
Epoch 0:  56%|█████▌    | 3314/5971 [31:53<25:33,  1.73it/s, loss=0.113, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.5e-5, train/loss_step=0.016, global_step=324.0]
Epoch 0:  56%|█████▌    | 3314/5971 [31:53<25:33,  1.73it/s, loss=0.113, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.00112, train/loss_step=0.247, global_step=324.0]
Epoch 0:  56%|█████▌    | 3315/5971 [31:54<25:33,  1.73it/s, loss=0.113, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.00112, train/loss_step=0.247, global_step=324.0]
Epoch 0:  56%|█████▌    | 3315/5971 [31:54<25:33,  1.73it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0389, train/loss_vlb_step=0.000143, train/loss_step=0.0389, global_step=324.0]
Epoch 0:  56%|█████▌    | 3316/5971 [31:56<25:33,  1.73it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0389, train/loss_vlb_step=0.000143, train/loss_step=0.0389, global_step=324.0]
Epoch 0:  56%|█████▌    | 3316/5971 [31:56<25:33,  1.73it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]    

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:17,  2.16it/s][A
Epoch 0:  56%|█████▌    | 3318/5971 [31:56<25:32,  1.73it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:   1%|          | 2/167 [00:00<00:44,  3.70it/s][A
Epoch 0:  56%|█████▌    | 3320/5971 [31:57<25:30,  1.73it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:   3%|▎         | 5/167 [00:00<00:16,  9.81it/s][A
Epoch 0:  56%|█████▌    | 3323/5971 [31:57<25:27,  1.73it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.88it/s][A
Epoch 0:  56%|█████▌    | 3326/5971 [31:57<25:24,  1.74it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:   7%|▋         | 11/167 [00:00<00:08, 17.35it/s][A
Epoch 0:  56%|█████▌    | 3329/5971 [31:57<25:21,  1.74it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.62it/s][A
Epoch 0:  56%|█████▌    | 3332/5971 [31:57<25:18,  1.74it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  10%|█         | 17/167 [00:01<00:06, 21.82it/s][A
Epoch 0:  56%|█████▌    | 3335/5971 [31:57<25:15,  1.74it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 23.97it/s][A
Epoch 0:  56%|█████▌    | 3338/5971 [31:57<25:12,  1.74it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 23.44it/s][A
Epoch 0:  56%|█████▌    | 3341/5971 [31:57<25:09,  1.74it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.43it/s][A
Epoch 0:  56%|█████▌    | 3344/5971 [31:57<25:06,  1.74it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 24.77it/s][A
Epoch 0:  56%|█████▌    | 3348/5971 [31:58<25:02,  1.75it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 25.12it/s][A

Validating:  21%|██        | 35/167 [00:01<00:05, 26.18it/s][A
Epoch 0:  56%|█████▌    | 3352/5971 [31:58<24:58,  1.75it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  23%|██▎       | 38/167 [00:01<00:04, 26.28it/s][A
Epoch 0:  56%|█████▌    | 3356/5971 [31:58<24:54,  1.75it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 26.85it/s][A
Epoch 0:  56%|█████▋    | 3360/5971 [31:58<24:50,  1.75it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 25.88it/s][A

Validating:  28%|██▊       | 47/167 [00:02<00:04, 25.56it/s][A
Epoch 0:  56%|█████▋    | 3364/5971 [31:58<24:46,  1.75it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 25.70it/s][A
Epoch 0:  56%|█████▋    | 3368/5971 [31:58<24:42,  1.76it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 26.28it/s][A
Epoch 0:  56%|█████▋    | 3372/5971 [31:59<24:38,  1.76it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 26.12it/s][A

Validating:  35%|███▌      | 59/167 [00:02<00:04, 26.09it/s][A
Epoch 0:  57%|█████▋    | 3376/5971 [31:59<24:34,  1.76it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  37%|███▋      | 62/167 [00:02<00:04, 25.93it/s][A
Epoch 0:  57%|█████▋    | 3380/5971 [31:59<24:30,  1.76it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  39%|███▉      | 65/167 [00:03<00:03, 26.52it/s][A
Epoch 0:  57%|█████▋    | 3384/5971 [31:59<24:26,  1.76it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  41%|████      | 68/167 [00:03<00:03, 27.07it/s][A

Validating:  43%|████▎     | 71/167 [00:03<00:03, 26.06it/s][A
Epoch 0:  57%|█████▋    | 3388/5971 [31:59<24:23,  1.77it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 25.41it/s][A
Epoch 0:  57%|█████▋    | 3392/5971 [31:59<24:19,  1.77it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 25.86it/s][A
Epoch 0:  57%|█████▋    | 3396/5971 [31:59<24:15,  1.77it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 25.10it/s][A

Validating:  50%|████▉     | 83/167 [00:03<00:03, 25.91it/s][A
Epoch 0:  57%|█████▋    | 3400/5971 [32:00<24:11,  1.77it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  51%|█████▏    | 86/167 [00:03<00:03, 25.26it/s][A
Epoch 0:  57%|█████▋    | 3404/5971 [32:00<24:07,  1.77it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  53%|█████▎    | 89/167 [00:03<00:03, 24.23it/s][A
Epoch 0:  57%|█████▋    | 3408/5971 [32:00<24:03,  1.78it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 25.43it/s][A
Epoch 0:  57%|█████▋    | 3412/5971 [32:00<24:00,  1.78it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 26.30it/s][A

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 26.44it/s][A
Epoch 0:  57%|█████▋    | 3416/5971 [32:00<23:56,  1.78it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 27.57it/s][A
Epoch 0:  57%|█████▋    | 3420/5971 [32:00<23:52,  1.78it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 27.71it/s][A
Epoch 0:  57%|█████▋    | 3424/5971 [32:01<23:48,  1.78it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 26.66it/s][A
Epoch 0:  57%|█████▋    | 3428/5971 [32:01<23:44,  1.78it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  67%|██████▋   | 112/167 [00:04<00:02, 27.15it/s][A

Validating:  69%|██████▉   | 115/167 [00:04<00:01, 26.07it/s][A
Epoch 0:  57%|█████▋    | 3432/5971 [32:01<23:40,  1.79it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  71%|███████   | 118/167 [00:05<00:01, 26.11it/s][A
Epoch 0:  58%|█████▊    | 3436/5971 [32:01<23:37,  1.79it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 26.40it/s][A
Epoch 0:  58%|█████▊    | 3440/5971 [32:01<23:33,  1.79it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 27.37it/s][A

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 27.64it/s][A
Epoch 0:  58%|█████▊    | 3444/5971 [32:01<23:29,  1.79it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 27.39it/s][A
Epoch 0:  58%|█████▊    | 3448/5971 [32:01<23:25,  1.79it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 27.94it/s][A
Epoch 0:  58%|█████▊    | 3452/5971 [32:02<23:22,  1.80it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 28.48it/s][A

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 27.10it/s][A
Epoch 0:  58%|█████▊    | 3456/5971 [32:02<23:18,  1.80it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  85%|████████▌ | 142/167 [00:05<00:00, 27.16it/s][A
Epoch 0:  58%|█████▊    | 3460/5971 [32:02<23:14,  1.80it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 27.57it/s][A
Epoch 0:  58%|█████▊    | 3464/5971 [32:02<23:10,  1.80it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 26.05it/s][A

Validating:  90%|█████████ | 151/167 [00:06<00:00, 26.38it/s][A
Epoch 0:  58%|█████▊    | 3468/5971 [32:02<23:07,  1.80it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 25.60it/s][A
Epoch 0:  58%|█████▊    | 3472/5971 [32:02<23:03,  1.81it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 26.49it/s][A
Epoch 0:  58%|█████▊    | 3476/5971 [32:02<22:59,  1.81it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 25.72it/s][A

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 25.70it/s][A
Epoch 0:  58%|█████▊    | 3480/5971 [32:03<22:56,  1.81it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 25.39it/s][A
Epoch 0:  58%|█████▊    | 3484/5971 [32:03<22:52,  1.81it/s, loss=0.153, v_num=0, train/loss_simple_step=0.918, train/loss_vlb_step=0.117, train/loss_step=0.918, global_step=324.0]
Epoch 0:  58%|█████▊    | 3485/5971 [32:04<22:52,  1.81it/s, loss=0.15, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=325.0]
Epoch 0:  58%|█████▊    | 3486/5971 [32:05<22:52,  1.81it/s, loss=0.141, v_num=0, train/loss_simple_step=0.006, train/loss_vlb_step=2.94e-5, train/loss_step=0.006, global_step=325.0]
Epoch 0:  58%|█████▊    | 3487/5971 [32:06<22:51,  1.81it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0315, train/loss_vlb_step=0.000122, train/loss_step=0.0315, global_step=325.0]
Epoch 0:  58%|█████▊    | 3488/5971 [32:08<22:52,  1.81it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0315, train/loss_vlb_step=0.000122, train/loss_step=0.0315, global_step=325.0]
Epoch 0:  58%|█████▊    | 3488/5971 [32:08<22:52,  1.81it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.72e-5, train/loss_step=0.0128, global_step=325.0] 
Epoch 0:  58%|█████▊    | 3489/5971 [32:09<22:52,  1.81it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0899, train/loss_vlb_step=0.000301, train/loss_step=0.0899, global_step=326.0]
Epoch 0:  58%|█████▊    | 3490/5971 [32:10<22:52,  1.81it/s, loss=0.137, v_num=0, train/loss_simple_step=0.221, train/loss_vlb_step=0.000888, train/loss_step=0.221, global_step=326.0]  
Epoch 0:  58%|█████▊    | 3491/5971 [32:11<22:51,  1.81it/s, loss=0.142, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.00036, train/loss_step=0.110, global_step=326.0] 
Epoch 0:  58%|█████▊    | 3492/5971 [32:13<22:52,  1.81it/s, loss=0.142, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.00036, train/loss_step=0.110, global_step=326.0]
Epoch 0:  58%|█████▊    | 3492/5971 [32:13<22:52,  1.81it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00553, train/loss_vlb_step=2.74e-5, train/loss_step=0.00553, global_step=326.0]
Epoch 0:  58%|█████▊    | 3493/5971 [32:14<22:51,  1.81it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00397, train/loss_vlb_step=2.07e-5, train/loss_step=0.00397, global_step=327.0]
Epoch 0:  59%|█████▊    | 3494/5971 [32:15<22:51,  1.81it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00503, train/loss_vlb_step=2.6e-5, train/loss_step=0.00503, global_step=327.0] 
Epoch 0:  59%|█████▊    | 3495/5971 [32:16<22:51,  1.81it/s, loss=0.121, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000627, train/loss_step=0.175, global_step=327.0]  
Epoch 0:  59%|█████▊    | 3496/5971 [32:19<22:52,  1.80it/s, loss=0.121, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000627, train/loss_step=0.175, global_step=327.0]
Epoch 0:  59%|█████▊    | 3496/5971 [32:19<22:52,  1.80it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00866, train/loss_vlb_step=4.25e-5, train/loss_step=0.00866, global_step=327.0]
Epoch 0:  59%|█████▊    | 3497/5971 [32:19<22:52,  1.80it/s, loss=0.147, v_num=0, train/loss_simple_step=0.545, train/loss_vlb_step=0.00447, train/loss_step=0.545, global_step=328.0]    
Epoch 0:  59%|█████▊    | 3498/5971 [32:20<22:51,  1.80it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000283, train/loss_step=0.0854, global_step=328.0]
Epoch 0:  59%|█████▊    | 3499/5971 [32:21<22:51,  1.80it/s, loss=0.152, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000352, train/loss_step=0.106, global_step=328.0] 
Epoch 0:  59%|█████▊    | 3500/5971 [32:23<22:51,  1.80it/s, loss=0.152, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000352, train/loss_step=0.106, global_step=328.0]
Epoch 0:  59%|█████▊    | 3500/5971 [32:23<22:51,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0331, train/loss_vlb_step=0.000119, train/loss_step=0.0331, global_step=328.0]
Epoch 0:  59%|█████▊    | 3501/5971 [32:24<22:51,  1.80it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0724, train/loss_vlb_step=0.000242, train/loss_step=0.0724, global_step=329.0]
Epoch 0:  59%|█████▊    | 3502/5971 [32:25<22:51,  1.80it/s, loss=0.151, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00237, train/loss_step=0.426, global_step=329.0]   
Epoch 0:  59%|█████▊    | 3503/5971 [32:26<22:50,  1.80it/s, loss=0.16, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000781, train/loss_step=0.227, global_step=329.0]
Epoch 0:  59%|█████▊    | 3504/5971 [32:28<22:51,  1.80it/s, loss=0.16, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000781, train/loss_step=0.227, global_step=329.0]
Epoch 0:  59%|█████▊    | 3504/5971 [32:28<22:51,  1.80it/s, loss=0.119, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000339, train/loss_step=0.103, global_step=329.0]
Epoch 0:  59%|█████▊    | 3505/5971 [32:29<22:51,  1.80it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00336, train/loss_vlb_step=1.85e-5, train/loss_step=0.00336, global_step=330.0]
Epoch 0:  59%|█████▊    | 3506/5971 [32:30<22:50,  1.80it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0345, train/loss_vlb_step=0.000127, train/loss_step=0.0345, global_step=330.0] 
Epoch 0:  59%|█████▊    | 3507/5971 [32:31<22:50,  1.80it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0152, train/loss_vlb_step=6.59e-5, train/loss_step=0.0152, global_step=330.0] 
Epoch 0:  59%|█████▉    | 3508/5971 [32:33<22:51,  1.80it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0152, train/loss_vlb_step=6.59e-5, train/loss_step=0.0152, global_step=330.0]
Epoch 0:  59%|█████▉    | 3508/5971 [32:33<22:51,  1.80it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0834, train/loss_vlb_step=0.00028, train/loss_step=0.0834, global_step=330.0]
Epoch 0:  59%|█████▉    | 3509/5971 [32:34<22:50,  1.80it/s, loss=0.143, v_num=0, train/loss_simple_step=0.590, train/loss_vlb_step=0.0102, train/loss_step=0.590, global_step=331.0]   
Epoch 0:  59%|█████▉    | 3510/5971 [32:35<22:50,  1.80it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0604, train/loss_vlb_step=0.000205, train/loss_step=0.0604, global_step=331.0]
Epoch 0:  59%|█████▉    | 3511/5971 [32:36<22:50,  1.80it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000225, train/loss_step=0.0642, global_step=331.0]
Epoch 0:  59%|█████▉    | 3512/5971 [32:38<22:51,  1.79it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000225, train/loss_step=0.0642, global_step=331.0]
Epoch 0:  59%|█████▉    | 3512/5971 [32:38<22:51,  1.79it/s, loss=0.152, v_num=0, train/loss_simple_step=0.392, train/loss_vlb_step=0.00297, train/loss_step=0.392, global_step=331.0]   
Epoch 0:  59%|█████▉    | 3513/5971 [32:39<22:50,  1.79it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00804, train/loss_vlb_step=3.94e-5, train/loss_step=0.00804, global_step=332.0]
Epoch 0:  59%|█████▉    | 3514/5971 [32:40<22:50,  1.79it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0442, train/loss_vlb_step=0.000153, train/loss_step=0.0442, global_step=332.0] 
Epoch 0:  59%|█████▉    | 3515/5971 [32:41<22:50,  1.79it/s, loss=0.163, v_num=0, train/loss_simple_step=0.355, train/loss_vlb_step=0.00224, train/loss_step=0.355, global_step=332.0]   
Epoch 0:  59%|█████▉    | 3516/5971 [32:43<22:50,  1.79it/s, loss=0.163, v_num=0, train/loss_simple_step=0.355, train/loss_vlb_step=0.00224, train/loss_step=0.355, global_step=332.0]
Epoch 0:  59%|█████▉    | 3516/5971 [32:43<22:50,  1.79it/s, loss=0.188, v_num=0, train/loss_simple_step=0.507, train/loss_vlb_step=0.00461, train/loss_step=0.507, global_step=332.0]
Epoch 0:  59%|█████▉    | 3517/5971 [32:44<22:50,  1.79it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00246, train/loss_vlb_step=1.44e-5, train/loss_step=0.00246, global_step=333.0]
Epoch 0:  59%|█████▉    | 3518/5971 [32:45<22:50,  1.79it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.79e-5, train/loss_step=0.0111, global_step=333.0]  
Epoch 0:  59%|█████▉    | 3519/5971 [32:46<22:49,  1.79it/s, loss=0.164, v_num=0, train/loss_simple_step=0.254, train/loss_vlb_step=0.000877, train/loss_step=0.254, global_step=333.0] 
Epoch 0:  59%|█████▉    | 3520/5971 [32:48<22:50,  1.79it/s, loss=0.164, v_num=0, train/loss_simple_step=0.254, train/loss_vlb_step=0.000877, train/loss_step=0.254, global_step=333.0]
Epoch 0:  59%|█████▉    | 3520/5971 [32:48<22:50,  1.79it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.84e-5, train/loss_step=0.0104, global_step=333.0]
Epoch 0:  59%|█████▉    | 3521/5971 [32:49<22:50,  1.79it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00552, train/loss_vlb_step=2.84e-5, train/loss_step=0.00552, global_step=334.0]
Epoch 0:  59%|█████▉    | 3522/5971 [32:50<22:49,  1.79it/s, loss=0.144, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000363, train/loss_step=0.111, global_step=334.0]  
Epoch 0:  59%|█████▉    | 3523/5971 [32:51<22:49,  1.79it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00903, train/loss_vlb_step=4.22e-5, train/loss_step=0.00903, global_step=334.0]
Epoch 0:  59%|█████▉    | 3524/5971 [32:53<22:49,  1.79it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00903, train/loss_vlb_step=4.22e-5, train/loss_step=0.00903, global_step=334.0]
Epoch 0:  59%|█████▉    | 3524/5971 [32:53<22:49,  1.79it/s, loss=0.154, v_num=0, train/loss_simple_step=0.517, train/loss_vlb_step=0.0047, train/loss_step=0.517, global_step=334.0]     
Epoch 0:  59%|█████▉    | 3525/5971 [32:54<22:49,  1.79it/s, loss=0.16, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000398, train/loss_step=0.121, global_step=335.0]
Epoch 0:  59%|█████▉    | 3526/5971 [32:55<22:49,  1.79it/s, loss=0.169, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.00095, train/loss_step=0.213, global_step=335.0]
Epoch 0:  59%|█████▉    | 3527/5971 [32:56<22:48,  1.79it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00391, train/loss_vlb_step=2.11e-5, train/loss_step=0.00391, global_step=335.0]
Epoch 0:  59%|█████▉    | 3528/5971 [32:58<22:49,  1.78it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00391, train/loss_vlb_step=2.11e-5, train/loss_step=0.00391, global_step=335.0]
Epoch 0:  59%|█████▉    | 3528/5971 [32:58<22:49,  1.78it/s, loss=0.186, v_num=0, train/loss_simple_step=0.441, train/loss_vlb_step=0.00255, train/loss_step=0.441, global_step=335.0]    
Epoch 0:  59%|█████▉    | 3529/5971 [32:59<22:49,  1.78it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00964, train/loss_vlb_step=4.58e-5, train/loss_step=0.00964, global_step=336.0]
Epoch 0:  59%|█████▉    | 3530/5971 [33:00<22:48,  1.78it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0239, train/loss_vlb_step=9.38e-5, train/loss_step=0.0239, global_step=336.0]  
Epoch 0:  59%|█████▉    | 3531/5971 [33:00<22:48,  1.78it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0288, train/loss_vlb_step=0.000107, train/loss_step=0.0288, global_step=336.0]
Epoch 0:  59%|█████▉    | 3532/5971 [33:03<22:49,  1.78it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0288, train/loss_vlb_step=0.000107, train/loss_step=0.0288, global_step=336.0]
Epoch 0:  59%|█████▉    | 3532/5971 [33:03<22:49,  1.78it/s, loss=0.142, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000531, train/loss_step=0.161, global_step=336.0]  
Epoch 0:  59%|█████▉    | 3533/5971 [33:04<22:49,  1.78it/s, loss=0.163, v_num=0, train/loss_simple_step=0.422, train/loss_vlb_step=0.00221, train/loss_step=0.422, global_step=337.0] 
Epoch 0:  59%|█████▉    | 3534/5971 [33:05<22:48,  1.78it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0184, train/loss_vlb_step=8.07e-5, train/loss_step=0.0184, global_step=337.0]
Epoch 0:  59%|█████▉    | 3535/5971 [33:06<22:48,  1.78it/s, loss=0.151, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.00052, train/loss_step=0.157, global_step=337.0]  
Epoch 0:  59%|█████▉    | 3536/5971 [33:08<22:49,  1.78it/s, loss=0.151, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.00052, train/loss_step=0.157, global_step=337.0]
Epoch 0:  59%|█████▉    | 3536/5971 [33:08<22:49,  1.78it/s, loss=0.134, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000524, train/loss_step=0.156, global_step=337.0]
Epoch 0:  59%|█████▉    | 3537/5971 [33:09<22:48,  1.78it/s, loss=0.142, v_num=0, train/loss_simple_step=0.169, train/loss_vlb_step=0.000594, train/loss_step=0.169, global_step=338.0]
Epoch 0:  59%|█████▉    | 3538/5971 [33:10<22:48,  1.78it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00575, train/loss_vlb_step=2.77e-5, train/loss_step=0.00575, global_step=338.0]
Epoch 0:  59%|█████▉    | 3539/5971 [33:11<22:48,  1.78it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00391, train/loss_vlb_step=2e-5, train/loss_step=0.00391, global_step=338.0]   
Epoch 0:  59%|█████▉    | 3540/5971 [33:13<22:48,  1.78it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00391, train/loss_vlb_step=2e-5, train/loss_step=0.00391, global_step=338.0]
Epoch 0:  59%|█████▉    | 3540/5971 [33:13<22:48,  1.78it/s, loss=0.148, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00185, train/loss_step=0.375, global_step=338.0] 
Epoch 0:  59%|█████▉    | 3541/5971 [33:14<22:48,  1.78it/s, loss=0.158, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000885, train/loss_step=0.222, global_step=339.0]
Epoch 0:  59%|█████▉    | 3542/5971 [33:15<22:47,  1.78it/s, loss=0.169, v_num=0, train/loss_simple_step=0.317, train/loss_vlb_step=0.00145, train/loss_step=0.317, global_step=339.0] 
Epoch 0:  59%|█████▉    | 3543/5971 [33:16<22:47,  1.78it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00316, train/loss_vlb_step=1.76e-5, train/loss_step=0.00316, global_step=339.0]
Epoch 0:  59%|█████▉    | 3544/5971 [33:18<22:48,  1.77it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00316, train/loss_vlb_step=1.76e-5, train/loss_step=0.00316, global_step=339.0]
Epoch 0:  59%|█████▉    | 3544/5971 [33:18<22:48,  1.77it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00274, train/loss_vlb_step=1.56e-5, train/loss_step=0.00274, global_step=339.0]
Epoch 0:  59%|█████▉    | 3545/5971 [33:19<22:47,  1.77it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0261, train/loss_vlb_step=0.000104, train/loss_step=0.0261, global_step=340.0] 
Epoch 0:  59%|█████▉    | 3546/5971 [33:20<22:47,  1.77it/s, loss=0.147, v_num=0, train/loss_simple_step=0.398, train/loss_vlb_step=0.00199, train/loss_step=0.398, global_step=340.0]   
Epoch 0:  59%|█████▉    | 3547/5971 [33:21<22:47,  1.77it/s, loss=0.166, v_num=0, train/loss_simple_step=0.370, train/loss_vlb_step=0.00301, train/loss_step=0.370, global_step=340.0]
Epoch 0:  59%|█████▉    | 3548/5971 [33:23<22:47,  1.77it/s, loss=0.166, v_num=0, train/loss_simple_step=0.370, train/loss_vlb_step=0.00301, train/loss_step=0.370, global_step=340.0]
Epoch 0:  59%|█████▉    | 3548/5971 [33:23<22:47,  1.77it/s, loss=0.151, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.00051, train/loss_step=0.153, global_step=340.0]
Epoch 0:  59%|█████▉    | 3549/5971 [33:24<22:47,  1.77it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.64e-5, train/loss_step=0.0157, global_step=341.0]
Epoch 0:  59%|█████▉    | 3550/5971 [33:25<22:47,  1.77it/s, loss=0.162, v_num=0, train/loss_simple_step=0.232, train/loss_vlb_step=0.000926, train/loss_step=0.232, global_step=341.0] 
Epoch 0:  59%|█████▉    | 3551/5971 [33:26<22:46,  1.77it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0548, train/loss_vlb_step=0.000188, train/loss_step=0.0548, global_step=341.0]
Epoch 0:  59%|█████▉    | 3552/5971 [33:28<22:47,  1.77it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0548, train/loss_vlb_step=0.000188, train/loss_step=0.0548, global_step=341.0]
Epoch 0:  59%|█████▉    | 3552/5971 [33:28<22:47,  1.77it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=5.02e-5, train/loss_step=0.0114, global_step=341.0] 
Epoch 0:  60%|█████▉    | 3553/5971 [33:29<22:47,  1.77it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0463, train/loss_vlb_step=0.000167, train/loss_step=0.0463, global_step=342.0]
Epoch 0:  60%|█████▉    | 3554/5971 [33:30<22:46,  1.77it/s, loss=0.16, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00347, train/loss_step=0.480, global_step=342.0]    
Epoch 0:  60%|█████▉    | 3555/5971 [33:31<22:46,  1.77it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0448, train/loss_vlb_step=0.000161, train/loss_step=0.0448, global_step=342.0]
Epoch 0:  60%|█████▉    | 3556/5971 [33:33<22:47,  1.77it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0448, train/loss_vlb_step=0.000161, train/loss_step=0.0448, global_step=342.0]
Epoch 0:  60%|█████▉    | 3556/5971 [33:33<22:47,  1.77it/s, loss=0.159, v_num=0, train/loss_simple_step=0.248, train/loss_vlb_step=0.000979, train/loss_step=0.248, global_step=342.0]  
Epoch 0:  60%|█████▉    | 3557/5971 [33:34<22:47,  1.77it/s, loss=0.156, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000335, train/loss_step=0.102, global_step=343.0]
Epoch 0:  60%|█████▉    | 3558/5971 [33:35<22:46,  1.77it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00202, train/loss_vlb_step=1.21e-5, train/loss_step=0.00202, global_step=343.0]
Epoch 0:  60%|█████▉    | 3559/5971 [33:36<22:46,  1.77it/s, loss=0.181, v_num=0, train/loss_simple_step=0.516, train/loss_vlb_step=0.00367, train/loss_step=0.516, global_step=343.0]    
Epoch 0:  60%|█████▉    | 3560/5971 [33:38<22:46,  1.76it/s, loss=0.181, v_num=0, train/loss_simple_step=0.516, train/loss_vlb_step=0.00367, train/loss_step=0.516, global_step=343.0]
Epoch 0:  60%|█████▉    | 3560/5971 [33:38<22:46,  1.76it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0901, train/loss_vlb_step=0.000296, train/loss_step=0.0901, global_step=343.0]
Epoch 0:  60%|█████▉    | 3561/5971 [33:39<22:46,  1.76it/s, loss=0.165, v_num=0, train/loss_simple_step=0.183, train/loss_vlb_step=0.000656, train/loss_step=0.183, global_step=344.0]  
Epoch 0:  60%|█████▉    | 3562/5971 [33:40<22:46,  1.76it/s, loss=0.167, v_num=0, train/loss_simple_step=0.357, train/loss_vlb_step=0.00164, train/loss_step=0.357, global_step=344.0] 
Epoch 0:  60%|█████▉    | 3563/5971 [33:41<22:45,  1.76it/s, loss=0.182, v_num=0, train/loss_simple_step=0.315, train/loss_vlb_step=0.0012, train/loss_step=0.315, global_step=344.0] 
Epoch 0:  60%|█████▉    | 3564/5971 [33:43<22:46,  1.76it/s, loss=0.182, v_num=0, train/loss_simple_step=0.315, train/loss_vlb_step=0.0012, train/loss_step=0.315, global_step=344.0]
Epoch 0:  60%|█████▉    | 3564/5971 [33:43<22:46,  1.76it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0437, train/loss_vlb_step=0.000157, train/loss_step=0.0437, global_step=344.0]
Epoch 0:  60%|█████▉    | 3565/5971 [33:44<22:46,  1.76it/s, loss=0.206, v_num=0, train/loss_simple_step=0.459, train/loss_vlb_step=0.00423, train/loss_step=0.459, global_step=345.0]   
Epoch 0:  60%|█████▉    | 3566/5971 [33:45<22:45,  1.76it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00376, train/loss_vlb_step=1.97e-5, train/loss_step=0.00376, global_step=345.0]
Epoch 0:  60%|█████▉    | 3567/5971 [33:46<22:45,  1.76it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0158, train/loss_vlb_step=6.92e-5, train/loss_step=0.0158, global_step=345.0]  
Epoch 0:  60%|█████▉    | 3568/5971 [33:48<22:45,  1.76it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0158, train/loss_vlb_step=6.92e-5, train/loss_step=0.0158, global_step=345.0]
Epoch 0:  60%|█████▉    | 3568/5971 [33:48<22:45,  1.76it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00993, train/loss_vlb_step=4.41e-5, train/loss_step=0.00993, global_step=345.0]
Epoch 0:  60%|█████▉    | 3569/5971 [33:49<22:45,  1.76it/s, loss=0.171, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000708, train/loss_step=0.209, global_step=346.0]   
Epoch 0:  60%|█████▉    | 3570/5971 [33:50<22:45,  1.76it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.72e-5, train/loss_step=0.0198, global_step=346.0]
Epoch 0:  60%|█████▉    | 3571/5971 [33:51<22:44,  1.76it/s, loss=0.179, v_num=0, train/loss_simple_step=0.422, train/loss_vlb_step=0.00242, train/loss_step=0.422, global_step=346.0]  
Epoch 0:  60%|█████▉    | 3572/5971 [33:53<22:45,  1.76it/s, loss=0.179, v_num=0, train/loss_simple_step=0.422, train/loss_vlb_step=0.00242, train/loss_step=0.422, global_step=346.0]
Epoch 0:  60%|█████▉    | 3572/5971 [33:53<22:45,  1.76it/s, loss=0.212, v_num=0, train/loss_simple_step=0.664, train/loss_vlb_step=0.00846, train/loss_step=0.664, global_step=346.0]
Epoch 0:  60%|█████▉    | 3573/5971 [33:54<22:44,  1.76it/s, loss=0.21, v_num=0, train/loss_simple_step=0.023, train/loss_vlb_step=9.62e-5, train/loss_step=0.023, global_step=347.0] 
Epoch 0:  60%|█████▉    | 3574/5971 [33:55<22:44,  1.76it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0355, train/loss_vlb_step=0.000135, train/loss_step=0.0355, global_step=347.0]
Epoch 0:  60%|█████▉    | 3575/5971 [33:56<22:44,  1.76it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0044, train/loss_vlb_step=2.34e-5, train/loss_step=0.0044, global_step=347.0] 
Epoch 0:  60%|█████▉    | 3576/5971 [33:58<22:44,  1.75it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0044, train/loss_vlb_step=2.34e-5, train/loss_step=0.0044, global_step=347.0]
Epoch 0:  60%|█████▉    | 3576/5971 [33:58<22:44,  1.75it/s, loss=0.185, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.000807, train/loss_step=0.220, global_step=347.0] 
Epoch 0:  60%|█████▉    | 3577/5971 [33:59<22:44,  1.75it/s, loss=0.2, v_num=0, train/loss_simple_step=0.405, train/loss_vlb_step=0.00286, train/loss_step=0.405, global_step=348.0]   
Epoch 0:  60%|█████▉    | 3578/5971 [34:00<22:44,  1.75it/s, loss=0.207, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000675, train/loss_step=0.151, global_step=348.0]
Epoch 0:  60%|█████▉    | 3579/5971 [34:01<22:43,  1.75it/s, loss=0.182, v_num=0, train/loss_simple_step=0.003, train/loss_vlb_step=1.73e-5, train/loss_step=0.003, global_step=348.0] 
Epoch 0:  60%|█████▉    | 3580/5971 [34:03<22:44,  1.75it/s, loss=0.182, v_num=0, train/loss_simple_step=0.003, train/loss_vlb_step=1.73e-5, train/loss_step=0.003, global_step=348.0]
Epoch 0:  60%|█████▉    | 3580/5971 [34:03<22:44,  1.75it/s, loss=0.208, v_num=0, train/loss_simple_step=0.618, train/loss_vlb_step=0.00788, train/loss_step=0.618, global_step=348.0]
Epoch 0:  60%|█████▉    | 3581/5971 [34:04<22:43,  1.75it/s, loss=0.207, v_num=0, train/loss_simple_step=0.169, train/loss_vlb_step=0.000597, train/loss_step=0.169, global_step=349.0]
Epoch 0:  60%|█████▉    | 3582/5971 [34:05<22:43,  1.75it/s, loss=0.19, v_num=0, train/loss_simple_step=0.00484, train/loss_vlb_step=2.35e-5, train/loss_step=0.00484, global_step=349.0]
Epoch 0:  60%|██████    | 3583/5971 [34:05<22:43,  1.75it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.15e-5, train/loss_step=0.0139, global_step=349.0] 
Epoch 0:  60%|██████    | 3584/5971 [34:08<22:43,  1.75it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.15e-5, train/loss_step=0.0139, global_step=349.0]
Epoch 0:  60%|██████    | 3584/5971 [34:08<22:43,  1.75it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]  

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:07,  2.47it/s][A

Validating:   1%|          | 2/167 [00:00<00:45,  3.65it/s][A
Epoch 0:  60%|██████    | 3588/5971 [34:09<22:40,  1.75it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:   3%|▎         | 5/167 [00:00<00:16,  9.57it/s][A
Epoch 0:  60%|██████    | 3592/5971 [34:09<22:36,  1.75it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:   5%|▍         | 8/167 [00:00<00:11, 14.08it/s][A

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.42it/s][A
Epoch 0:  60%|██████    | 3596/5971 [34:09<22:33,  1.76it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:   8%|▊         | 14/167 [00:01<00:08, 18.25it/s][A
Epoch 0:  60%|██████    | 3600/5971 [34:09<22:29,  1.76it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  10%|█         | 17/167 [00:01<00:07, 20.46it/s][A
Epoch 0:  60%|██████    | 3604/5971 [34:09<22:25,  1.76it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 23.51it/s][A
Epoch 0:  60%|██████    | 3608/5971 [34:09<22:22,  1.76it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  14%|█▍        | 24/167 [00:01<00:05, 24.25it/s][A

Validating:  16%|█▌        | 27/167 [00:01<00:05, 23.81it/s][A
Epoch 0:  60%|██████    | 3612/5971 [34:10<22:18,  1.76it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 24.59it/s][A
Epoch 0:  61%|██████    | 3616/5971 [34:10<22:14,  1.76it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 25.91it/s][A
Epoch 0:  61%|██████    | 3620/5971 [34:10<22:11,  1.77it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  22%|██▏       | 36/167 [00:01<00:04, 26.74it/s][A

Validating:  23%|██▎       | 39/167 [00:02<00:04, 27.17it/s][A
Epoch 0:  61%|██████    | 3624/5971 [34:10<22:07,  1.77it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 27.13it/s][A
Epoch 0:  61%|██████    | 3628/5971 [34:10<22:03,  1.77it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 27.05it/s][A
Epoch 0:  61%|██████    | 3632/5971 [34:10<22:00,  1.77it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 25.69it/s][A

Validating:  31%|███       | 51/167 [00:02<00:04, 26.00it/s][A
Epoch 0:  61%|██████    | 3636/5971 [34:10<21:56,  1.77it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 27.03it/s][A
Epoch 0:  61%|██████    | 3640/5971 [34:11<21:53,  1.78it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  34%|███▍      | 57/167 [00:02<00:03, 27.60it/s][A
Epoch 0:  61%|██████    | 3644/5971 [34:11<21:49,  1.78it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  36%|███▌      | 60/167 [00:02<00:04, 26.32it/s][A

Validating:  38%|███▊      | 63/167 [00:02<00:03, 26.80it/s][A
Epoch 0:  61%|██████    | 3648/5971 [34:11<21:45,  1.78it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  40%|███▉      | 66/167 [00:03<00:03, 25.56it/s][A
Epoch 0:  61%|██████    | 3652/5971 [34:11<21:42,  1.78it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 26.37it/s][A
Epoch 0:  61%|██████    | 3656/5971 [34:11<21:38,  1.78it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 26.60it/s][A

Validating:  45%|████▍     | 75/167 [00:03<00:03, 26.77it/s][A
Epoch 0:  61%|██████▏   | 3660/5971 [34:11<21:35,  1.78it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 27.43it/s][A
Epoch 0:  61%|██████▏   | 3664/5971 [34:11<21:31,  1.79it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 25.79it/s][A
Epoch 0:  61%|██████▏   | 3668/5971 [34:12<21:28,  1.79it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  50%|█████     | 84/167 [00:03<00:03, 25.25it/s][A

Validating:  52%|█████▏    | 87/167 [00:03<00:03, 26.01it/s][A
Epoch 0:  61%|██████▏   | 3672/5971 [34:12<21:24,  1.79it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  54%|█████▍    | 91/167 [00:03<00:02, 26.81it/s][A
Epoch 0:  62%|██████▏   | 3676/5971 [34:12<21:21,  1.79it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 27.42it/s][A
Epoch 0:  62%|██████▏   | 3680/5971 [34:12<21:17,  1.79it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 28.17it/s][A
Epoch 0:  62%|██████▏   | 3684/5971 [34:12<21:13,  1.80it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  60%|██████    | 101/167 [00:04<00:02, 27.71it/s][A
Epoch 0:  62%|██████▏   | 3688/5971 [34:12<21:10,  1.80it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 28.29it/s][A

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 27.87it/s][A
Epoch 0:  62%|██████▏   | 3692/5971 [34:12<21:06,  1.80it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 27.54it/s][A
Epoch 0:  62%|██████▏   | 3696/5971 [34:13<21:03,  1.80it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  68%|██████▊   | 113/167 [00:04<00:01, 27.48it/s][A
Epoch 0:  62%|██████▏   | 3700/5971 [34:13<20:59,  1.80it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  69%|██████▉   | 116/167 [00:04<00:02, 25.38it/s][A

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 25.60it/s][A
Epoch 0:  62%|██████▏   | 3704/5971 [34:13<20:56,  1.80it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 26.19it/s][A
Epoch 0:  62%|██████▏   | 3708/5971 [34:13<20:52,  1.81it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 25.61it/s][A
Epoch 0:  62%|██████▏   | 3712/5971 [34:13<20:49,  1.81it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 24.47it/s][A

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 25.76it/s][A
Epoch 0:  62%|██████▏   | 3716/5971 [34:13<20:46,  1.81it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  80%|████████  | 134/167 [00:05<00:01, 25.96it/s][A
Epoch 0:  62%|██████▏   | 3720/5971 [34:14<20:42,  1.81it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 25.43it/s][A
Epoch 0:  62%|██████▏   | 3724/5971 [34:14<20:39,  1.81it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  84%|████████▍ | 140/167 [00:05<00:01, 26.44it/s][A

Validating:  86%|████████▌ | 143/167 [00:05<00:00, 25.73it/s][A
Epoch 0:  62%|██████▏   | 3728/5971 [34:14<20:35,  1.82it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 26.66it/s][A
Epoch 0:  63%|██████▎   | 3732/5971 [34:14<20:32,  1.82it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 26.45it/s][A
Epoch 0:  63%|██████▎   | 3736/5971 [34:14<20:28,  1.82it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 26.40it/s][A

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 26.38it/s][A
Epoch 0:  63%|██████▎   | 3740/5971 [34:14<20:25,  1.82it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 26.52it/s][A
Epoch 0:  63%|██████▎   | 3744/5971 [34:14<20:22,  1.82it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 26.09it/s][A
Epoch 0:  63%|██████▎   | 3748/5971 [34:15<20:18,  1.82it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 24.71it/s][A
Epoch 0:  63%|██████▎   | 3752/5971 [34:15<20:15,  1.83it/s, loss=0.195, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00265, train/loss_step=0.448, global_step=349.0]
Epoch 0:  63%|██████▎   | 3753/5971 [34:16<20:15,  1.83it/s, loss=0.175, v_num=0, train/loss_simple_step=0.053, train/loss_vlb_step=0.000181, train/loss_step=0.053, global_step=350.0]
Epoch 0:  63%|██████▎   | 3754/5971 [34:17<20:14,  1.83it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0046, train/loss_vlb_step=2.54e-5, train/loss_step=0.0046, global_step=350.0]
Epoch 0:  63%|██████▎   | 3755/5971 [34:18<20:14,  1.82it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0852, train/loss_vlb_step=0.000284, train/loss_step=0.0852, global_step=350.0]
Epoch 0:  63%|██████▎   | 3756/5971 [34:20<20:14,  1.82it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0852, train/loss_vlb_step=0.000284, train/loss_step=0.0852, global_step=350.0]
Epoch 0:  63%|██████▎   | 3756/5971 [34:20<20:14,  1.82it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00169, train/loss_vlb_step=1.02e-5, train/loss_step=0.00169, global_step=350.0]
Epoch 0:  63%|██████▎   | 3757/5971 [34:21<20:14,  1.82it/s, loss=0.178, v_num=0, train/loss_simple_step=0.221, train/loss_vlb_step=0.000782, train/loss_step=0.221, global_step=351.0]   
Epoch 0:  63%|██████▎   | 3758/5971 [34:22<20:14,  1.82it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=7.97e-5, train/loss_step=0.0192, global_step=351.0]
Epoch 0:  63%|██████▎   | 3759/5971 [34:23<20:13,  1.82it/s, loss=0.192, v_num=0, train/loss_simple_step=0.706, train/loss_vlb_step=0.018, train/loss_step=0.706, global_step=351.0]    
Epoch 0:  63%|██████▎   | 3760/5971 [34:25<20:14,  1.82it/s, loss=0.192, v_num=0, train/loss_simple_step=0.706, train/loss_vlb_step=0.018, train/loss_step=0.706, global_step=351.0]
Epoch 0:  63%|██████▎   | 3760/5971 [34:25<20:14,  1.82it/s, loss=0.17, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.00087, train/loss_step=0.219, global_step=351.0]
Epoch 0:  63%|██████▎   | 3761/5971 [34:26<20:13,  1.82it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00381, train/loss_vlb_step=2.03e-5, train/loss_step=0.00381, global_step=352.0]
Epoch 0:  63%|██████▎   | 3762/5971 [34:27<20:13,  1.82it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00372, train/loss_vlb_step=2.04e-5, train/loss_step=0.00372, global_step=352.0]
Epoch 0:  63%|██████▎   | 3763/5971 [34:28<20:13,  1.82it/s, loss=0.179, v_num=0, train/loss_simple_step=0.238, train/loss_vlb_step=0.000892, train/loss_step=0.238, global_step=352.0]   
Epoch 0:  63%|██████▎   | 3764/5971 [34:30<20:13,  1.82it/s, loss=0.179, v_num=0, train/loss_simple_step=0.238, train/loss_vlb_step=0.000892, train/loss_step=0.238, global_step=352.0]
Epoch 0:  63%|██████▎   | 3764/5971 [34:30<20:13,  1.82it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0556, train/loss_vlb_step=0.000198, train/loss_step=0.0556, global_step=352.0]
Epoch 0:  63%|██████▎   | 3765/5971 [34:31<20:13,  1.82it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00564, train/loss_vlb_step=2.75e-5, train/loss_step=0.00564, global_step=353.0]
Epoch 0:  63%|██████▎   | 3766/5971 [34:32<20:12,  1.82it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00208, train/loss_vlb_step=1.24e-5, train/loss_step=0.00208, global_step=353.0]
Epoch 0:  63%|██████▎   | 3767/5971 [34:32<20:12,  1.82it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0466, train/loss_vlb_step=0.000158, train/loss_step=0.0466, global_step=353.0] 
Epoch 0:  63%|██████▎   | 3768/5971 [34:35<20:12,  1.82it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0466, train/loss_vlb_step=0.000158, train/loss_step=0.0466, global_step=353.0]
Epoch 0:  63%|██████▎   | 3768/5971 [34:35<20:12,  1.82it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00849, train/loss_vlb_step=3.93e-5, train/loss_step=0.00849, global_step=353.0]
Epoch 0:  63%|██████▎   | 3769/5971 [34:36<20:12,  1.82it/s, loss=0.145, v_num=0, train/loss_simple_step=0.767, train/loss_vlb_step=0.0308, train/loss_step=0.767, global_step=354.0]     
Epoch 0:  63%|██████▎   | 3770/5971 [34:36<20:12,  1.82it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00918, train/loss_vlb_step=4.31e-5, train/loss_step=0.00918, global_step=354.0]
Epoch 0:  63%|██████▎   | 3771/5971 [34:37<20:11,  1.82it/s, loss=0.189, v_num=0, train/loss_simple_step=0.882, train/loss_vlb_step=0.223, train/loss_step=0.882, global_step=354.0]      
Epoch 0:  63%|██████▎   | 3772/5971 [34:40<20:12,  1.81it/s, loss=0.189, v_num=0, train/loss_simple_step=0.882, train/loss_vlb_step=0.223, train/loss_step=0.882, global_step=354.0]
Epoch 0:  63%|██████▎   | 3772/5971 [34:40<20:12,  1.81it/s, loss=0.177, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000719, train/loss_step=0.209, global_step=354.0]
Epoch 0:  63%|██████▎   | 3773/5971 [34:40<20:11,  1.81it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0832, train/loss_vlb_step=0.000283, train/loss_step=0.0832, global_step=355.0]
Epoch 0:  63%|██████▎   | 3774/5971 [34:41<20:11,  1.81it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0778, train/loss_vlb_step=0.00026, train/loss_step=0.0778, global_step=355.0] 
Epoch 0:  63%|██████▎   | 3775/5971 [34:42<20:11,  1.81it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=6.32e-5, train/loss_step=0.0144, global_step=355.0]
Epoch 0:  63%|██████▎   | 3776/5971 [34:44<20:11,  1.81it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=6.32e-5, train/loss_step=0.0144, global_step=355.0]
Epoch 0:  63%|██████▎   | 3776/5971 [34:44<20:11,  1.81it/s, loss=0.196, v_num=0, train/loss_simple_step=0.342, train/loss_vlb_step=0.00293, train/loss_step=0.342, global_step=355.0]  
Epoch 0:  63%|██████▎   | 3777/5971 [34:45<20:11,  1.81it/s, loss=0.194, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000669, train/loss_step=0.193, global_step=356.0]
Epoch 0:  63%|██████▎   | 3778/5971 [34:46<20:10,  1.81it/s, loss=0.204, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.00078, train/loss_step=0.207, global_step=356.0] 
Epoch 0:  63%|██████▎   | 3779/5971 [34:47<20:10,  1.81it/s, loss=0.206, v_num=0, train/loss_simple_step=0.752, train/loss_vlb_step=0.0326, train/loss_step=0.752, global_step=356.0] 
Epoch 0:  63%|██████▎   | 3780/5971 [34:49<20:11,  1.81it/s, loss=0.206, v_num=0, train/loss_simple_step=0.752, train/loss_vlb_step=0.0326, train/loss_step=0.752, global_step=356.0]
Epoch 0:  63%|██████▎   | 3780/5971 [34:49<20:11,  1.81it/s, loss=0.226, v_num=0, train/loss_simple_step=0.611, train/loss_vlb_step=0.00486, train/loss_step=0.611, global_step=356.0]
Epoch 0:  63%|██████▎   | 3781/5971 [34:50<20:10,  1.81it/s, loss=0.226, v_num=0, train/loss_simple_step=0.005, train/loss_vlb_step=2.45e-5, train/loss_step=0.005, global_step=357.0]
Epoch 0:  63%|██████▎   | 3782/5971 [34:51<20:10,  1.81it/s, loss=0.252, v_num=0, train/loss_simple_step=0.524, train/loss_vlb_step=0.00465, train/loss_step=0.524, global_step=357.0]
Epoch 0:  63%|██████▎   | 3783/5971 [34:52<20:09,  1.81it/s, loss=0.243, v_num=0, train/loss_simple_step=0.064, train/loss_vlb_step=0.000216, train/loss_step=0.064, global_step=357.0]
Epoch 0:  63%|██████▎   | 3784/5971 [34:55<20:10,  1.81it/s, loss=0.243, v_num=0, train/loss_simple_step=0.064, train/loss_vlb_step=0.000216, train/loss_step=0.064, global_step=357.0]
Epoch 0:  63%|██████▎   | 3784/5971 [34:55<20:10,  1.81it/s, loss=0.253, v_num=0, train/loss_simple_step=0.256, train/loss_vlb_step=0.00097, train/loss_step=0.256, global_step=357.0] 
Epoch 0:  63%|██████▎   | 3785/5971 [34:55<20:10,  1.81it/s, loss=0.253, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.57e-5, train/loss_step=0.0127, global_step=358.0]
Epoch 0:  63%|██████▎   | 3786/5971 [34:56<20:09,  1.81it/s, loss=0.253, v_num=0, train/loss_simple_step=0.00218, train/loss_vlb_step=1.29e-5, train/loss_step=0.00218, global_step=358.0]
Epoch 0:  63%|██████▎   | 3787/5971 [34:57<20:09,  1.81it/s, loss=0.278, v_num=0, train/loss_simple_step=0.539, train/loss_vlb_step=0.00484, train/loss_step=0.539, global_step=358.0]    
Epoch 0:  63%|██████▎   | 3788/5971 [34:59<20:09,  1.80it/s, loss=0.278, v_num=0, train/loss_simple_step=0.539, train/loss_vlb_step=0.00484, train/loss_step=0.539, global_step=358.0]
Epoch 0:  63%|██████▎   | 3788/5971 [34:59<20:09,  1.80it/s, loss=0.279, v_num=0, train/loss_simple_step=0.024, train/loss_vlb_step=9.66e-5, train/loss_step=0.024, global_step=358.0]
Epoch 0:  63%|██████▎   | 3789/5971 [35:00<20:09,  1.80it/s, loss=0.241, v_num=0, train/loss_simple_step=0.00707, train/loss_vlb_step=3.52e-5, train/loss_step=0.00707, global_step=359.0]
Epoch 0:  63%|██████▎   | 3790/5971 [35:01<20:09,  1.80it/s, loss=0.24, v_num=0, train/loss_simple_step=0.00378, train/loss_vlb_step=2.02e-5, train/loss_step=0.00378, global_step=359.0] 
Epoch 0:  63%|██████▎   | 3791/5971 [35:02<20:08,  1.80it/s, loss=0.226, v_num=0, train/loss_simple_step=0.589, train/loss_vlb_step=0.00652, train/loss_step=0.589, global_step=359.0]   
Epoch 0:  64%|██████▎   | 3792/5971 [35:04<20:09,  1.80it/s, loss=0.226, v_num=0, train/loss_simple_step=0.589, train/loss_vlb_step=0.00652, train/loss_step=0.589, global_step=359.0]
Epoch 0:  64%|██████▎   | 3792/5971 [35:04<20:09,  1.80it/s, loss=0.218, v_num=0, train/loss_simple_step=0.0512, train/loss_vlb_step=0.000176, train/loss_step=0.0512, global_step=359.0]
Epoch 0:  64%|██████▎   | 3793/5971 [35:05<20:08,  1.80it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0581, train/loss_vlb_step=0.0002, train/loss_step=0.0581, global_step=360.0]  
Epoch 0:  64%|██████▎   | 3794/5971 [35:06<20:08,  1.80it/s, loss=0.229, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.00126, train/loss_step=0.322, global_step=360.0] 
Epoch 0:  64%|██████▎   | 3795/5971 [35:07<20:08,  1.80it/s, loss=0.251, v_num=0, train/loss_simple_step=0.451, train/loss_vlb_step=0.00339, train/loss_step=0.451, global_step=360.0]
Epoch 0:  64%|██████▎   | 3796/5971 [35:09<20:08,  1.80it/s, loss=0.251, v_num=0, train/loss_simple_step=0.451, train/loss_vlb_step=0.00339, train/loss_step=0.451, global_step=360.0]
Epoch 0:  64%|██████▎   | 3796/5971 [35:09<20:08,  1.80it/s, loss=0.257, v_num=0, train/loss_simple_step=0.464, train/loss_vlb_step=0.00308, train/loss_step=0.464, global_step=360.0]
Epoch 0:  64%|██████▎   | 3797/5971 [35:10<20:08,  1.80it/s, loss=0.251, v_num=0, train/loss_simple_step=0.073, train/loss_vlb_step=0.000245, train/loss_step=0.073, global_step=361.0]
Epoch 0:  64%|██████▎   | 3798/5971 [35:11<20:07,  1.80it/s, loss=0.251, v_num=0, train/loss_simple_step=0.215, train/loss_vlb_step=0.00078, train/loss_step=0.215, global_step=361.0] 
Epoch 0:  64%|██████▎   | 3799/5971 [35:12<20:07,  1.80it/s, loss=0.214, v_num=0, train/loss_simple_step=0.012, train/loss_vlb_step=5.37e-5, train/loss_step=0.012, global_step=361.0]
Epoch 0:  64%|██████▎   | 3800/5971 [35:14<20:07,  1.80it/s, loss=0.214, v_num=0, train/loss_simple_step=0.012, train/loss_vlb_step=5.37e-5, train/loss_step=0.012, global_step=361.0]
Epoch 0:  64%|██████▎   | 3800/5971 [35:14<20:07,  1.80it/s, loss=0.193, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.00076, train/loss_step=0.194, global_step=361.0]
Epoch 0:  64%|██████▎   | 3801/5971 [35:15<20:07,  1.80it/s, loss=0.197, v_num=0, train/loss_simple_step=0.088, train/loss_vlb_step=0.000291, train/loss_step=0.088, global_step=362.0]
Epoch 0:  64%|██████▎   | 3802/5971 [35:16<20:07,  1.80it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00544, train/loss_vlb_step=2.78e-5, train/loss_step=0.00544, global_step=362.0]
Epoch 0:  64%|██████▎   | 3803/5971 [35:17<20:06,  1.80it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00212, train/loss_vlb_step=1.25e-5, train/loss_step=0.00212, global_step=362.0]
Epoch 0:  64%|██████▎   | 3804/5971 [35:19<20:07,  1.80it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00212, train/loss_vlb_step=1.25e-5, train/loss_step=0.00212, global_step=362.0]
Epoch 0:  64%|██████▎   | 3804/5971 [35:19<20:07,  1.80it/s, loss=0.17, v_num=0, train/loss_simple_step=0.293, train/loss_vlb_step=0.00107, train/loss_step=0.293, global_step=362.0]     
Epoch 0:  64%|██████▎   | 3805/5971 [35:20<20:06,  1.79it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00292, train/loss_vlb_step=1.67e-5, train/loss_step=0.00292, global_step=363.0]
Epoch 0:  64%|██████▎   | 3806/5971 [35:21<20:06,  1.79it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0646, train/loss_vlb_step=0.000227, train/loss_step=0.0646, global_step=363.0]
Epoch 0:  64%|██████▍   | 3807/5971 [35:22<20:05,  1.79it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0265, train/loss_vlb_step=0.0001, train/loss_step=0.0265, global_step=363.0]  
Epoch 0:  64%|██████▍   | 3808/5971 [35:24<20:06,  1.79it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0265, train/loss_vlb_step=0.0001, train/loss_step=0.0265, global_step=363.0]
Epoch 0:  64%|██████▍   | 3808/5971 [35:24<20:06,  1.79it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0534, train/loss_vlb_step=0.000184, train/loss_step=0.0534, global_step=363.0]
Epoch 0:  64%|██████▍   | 3809/5971 [35:25<20:05,  1.79it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00154, train/loss_vlb_step=9.21e-6, train/loss_step=0.00154, global_step=364.0]
Epoch 0:  64%|██████▍   | 3810/5971 [35:26<20:05,  1.79it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0385, train/loss_vlb_step=0.00015, train/loss_step=0.0385, global_step=364.0]   
Epoch 0:  64%|██████▍   | 3811/5971 [35:26<20:05,  1.79it/s, loss=0.137, v_num=0, train/loss_simple_step=0.317, train/loss_vlb_step=0.00147, train/loss_step=0.317, global_step=364.0] 
Epoch 0:  64%|██████▍   | 3812/5971 [35:29<20:05,  1.79it/s, loss=0.137, v_num=0, train/loss_simple_step=0.317, train/loss_vlb_step=0.00147, train/loss_step=0.317, global_step=364.0]
Epoch 0:  64%|██████▍   | 3812/5971 [35:29<20:05,  1.79it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00501, train/loss_vlb_step=2.56e-5, train/loss_step=0.00501, global_step=364.0]
Epoch 0:  64%|██████▍   | 3813/5971 [35:30<20:05,  1.79it/s, loss=0.163, v_num=0, train/loss_simple_step=0.639, train/loss_vlb_step=0.0073, train/loss_step=0.639, global_step=365.0]     
Epoch 0:  64%|██████▍   | 3814/5971 [35:31<20:04,  1.79it/s, loss=0.159, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.0012, train/loss_step=0.242, global_step=365.0]
Epoch 0:  64%|██████▍   | 3815/5971 [35:31<20:04,  1.79it/s, loss=0.145, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000513, train/loss_step=0.155, global_step=365.0]
Epoch 0:  64%|██████▍   | 3816/5971 [35:34<20:04,  1.79it/s, loss=0.145, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000513, train/loss_step=0.155, global_step=365.0]
Epoch 0:  64%|██████▍   | 3816/5971 [35:34<20:04,  1.79it/s, loss=0.123, v_num=0, train/loss_simple_step=0.032, train/loss_vlb_step=0.000129, train/loss_step=0.032, global_step=365.0]
Epoch 0:  64%|██████▍   | 3817/5971 [35:34<20:04,  1.79it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00569, train/loss_vlb_step=2.87e-5, train/loss_step=0.00569, global_step=366.0]
Epoch 0:  64%|██████▍   | 3818/5971 [35:35<20:04,  1.79it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00158, train/loss_vlb_step=9.13e-6, train/loss_step=0.00158, global_step=366.0]
Epoch 0:  64%|██████▍   | 3819/5971 [35:36<20:03,  1.79it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0129, train/loss_vlb_step=5.42e-5, train/loss_step=0.0129, global_step=366.0]  
Epoch 0:  64%|██████▍   | 3820/5971 [35:39<20:04,  1.79it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0129, train/loss_vlb_step=5.42e-5, train/loss_step=0.0129, global_step=366.0]
Epoch 0:  64%|██████▍   | 3820/5971 [35:39<20:04,  1.79it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0887, train/loss_vlb_step=0.000298, train/loss_step=0.0887, global_step=366.0]
Epoch 0:  64%|██████▍   | 3821/5971 [35:39<20:03,  1.79it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0292, train/loss_vlb_step=0.000117, train/loss_step=0.0292, global_step=367.0]
Epoch 0:  64%|██████▍   | 3822/5971 [35:40<20:03,  1.79it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0255, train/loss_vlb_step=9.44e-5, train/loss_step=0.0255, global_step=367.0] 
Epoch 0:  64%|██████▍   | 3823/5971 [35:41<20:03,  1.79it/s, loss=0.107, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000364, train/loss_step=0.110, global_step=367.0] 
Epoch 0:  64%|██████▍   | 3824/5971 [35:43<20:03,  1.78it/s, loss=0.107, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000364, train/loss_step=0.110, global_step=367.0]
Epoch 0:  64%|██████▍   | 3824/5971 [35:43<20:03,  1.78it/s, loss=0.0927, v_num=0, train/loss_simple_step=0.00252, train/loss_vlb_step=1.46e-5, train/loss_step=0.00252, global_step=367.0]
Epoch 0:  64%|██████▍   | 3825/5971 [35:44<20:02,  1.78it/s, loss=0.0946, v_num=0, train/loss_simple_step=0.0424, train/loss_vlb_step=0.000159, train/loss_step=0.0424, global_step=368.0] 
Epoch 0:  64%|██████▍   | 3826/5971 [35:45<20:02,  1.78it/s, loss=0.102, v_num=0, train/loss_simple_step=0.215, train/loss_vlb_step=0.000828, train/loss_step=0.215, global_step=368.0]   
Epoch 0:  64%|██████▍   | 3827/5971 [35:46<20:02,  1.78it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0158, train/loss_vlb_step=6.44e-5, train/loss_step=0.0158, global_step=368.0]
Epoch 0:  64%|██████▍   | 3828/5971 [35:48<20:02,  1.78it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0158, train/loss_vlb_step=6.44e-5, train/loss_step=0.0158, global_step=368.0]
Epoch 0:  64%|██████▍   | 3828/5971 [35:48<20:02,  1.78it/s, loss=0.0997, v_num=0, train/loss_simple_step=0.015, train/loss_vlb_step=6.51e-5, train/loss_step=0.015, global_step=368.0] 
Epoch 0:  64%|██████▍   | 3829/5971 [35:49<20:02,  1.78it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0974, train/loss_vlb_step=0.00032, train/loss_step=0.0974, global_step=369.0]
Epoch 0:  64%|██████▍   | 3830/5971 [35:50<20:01,  1.78it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00538, train/loss_vlb_step=2.74e-5, train/loss_step=0.00538, global_step=369.0]
Epoch 0:  64%|██████▍   | 3831/5971 [35:51<20:01,  1.78it/s, loss=0.0875, v_num=0, train/loss_simple_step=0.0107, train/loss_vlb_step=4.93e-5, train/loss_step=0.0107, global_step=369.0] 
Epoch 0:  64%|██████▍   | 3832/5971 [35:53<20:01,  1.78it/s, loss=0.0875, v_num=0, train/loss_simple_step=0.0107, train/loss_vlb_step=4.93e-5, train/loss_step=0.0107, global_step=369.0]
Epoch 0:  64%|██████▍   | 3832/5971 [35:53<20:01,  1.78it/s, loss=0.0889, v_num=0, train/loss_simple_step=0.0337, train/loss_vlb_step=0.000126, train/loss_step=0.0337, global_step=369.0]
Epoch 0:  64%|██████▍   | 3833/5971 [35:54<20:01,  1.78it/s, loss=0.0611, v_num=0, train/loss_simple_step=0.0812, train/loss_vlb_step=0.000271, train/loss_step=0.0812, global_step=370.0]
Epoch 0:  64%|██████▍   | 3834/5971 [35:55<20:00,  1.78it/s, loss=0.0494, v_num=0, train/loss_simple_step=0.00914, train/loss_vlb_step=4.36e-5, train/loss_step=0.00914, global_step=370.0]
Epoch 0:  64%|██████▍   | 3835/5971 [35:56<20:00,  1.78it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0065, train/loss_step=0.643, global_step=370.0]     
Epoch 0:  64%|██████▍   | 3836/5971 [35:58<20:00,  1.78it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0065, train/loss_step=0.643, global_step=370.0]
Epoch 0:  64%|██████▍   | 3836/5971 [35:58<20:00,  1.78it/s, loss=0.0727, v_num=0, train/loss_simple_step=0.00852, train/loss_vlb_step=4.05e-5, train/loss_step=0.00852, global_step=370.0]
Epoch 0:  64%|██████▍   | 3837/5971 [35:59<20:00,  1.78it/s, loss=0.11, v_num=0, train/loss_simple_step=0.762, train/loss_vlb_step=0.036, train/loss_step=0.762, global_step=371.0]        
Epoch 0:  64%|██████▍   | 3838/5971 [35:59<20:00,  1.78it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0247, train/loss_vlb_step=0.000102, train/loss_step=0.0247, global_step=371.0]
Epoch 0:  64%|██████▍   | 3839/5971 [36:00<19:59,  1.78it/s, loss=0.118, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000511, train/loss_step=0.147, global_step=371.0]  
Epoch 0:  64%|██████▍   | 3840/5971 [36:03<20:00,  1.78it/s, loss=0.118, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000511, train/loss_step=0.147, global_step=371.0]
Epoch 0:  64%|██████▍   | 3840/5971 [36:03<20:00,  1.78it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0336, train/loss_vlb_step=0.000129, train/loss_step=0.0336, global_step=371.0]
Epoch 0:  64%|██████▍   | 3841/5971 [36:04<19:59,  1.78it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.06e-5, train/loss_step=0.0115, global_step=372.0] 
Epoch 0:  64%|██████▍   | 3842/5971 [36:05<19:59,  1.77it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.79e-5, train/loss_step=0.0128, global_step=372.0]
Epoch 0:  64%|██████▍   | 3843/5971 [36:05<19:59,  1.77it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0899, train/loss_vlb_step=0.000305, train/loss_step=0.0899, global_step=372.0]
Epoch 0:  64%|██████▍   | 3844/5971 [36:08<19:59,  1.77it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0899, train/loss_vlb_step=0.000305, train/loss_step=0.0899, global_step=372.0]
Epoch 0:  64%|██████▍   | 3844/5971 [36:08<19:59,  1.77it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00159, train/loss_vlb_step=9.68e-6, train/loss_step=0.00159, global_step=372.0]
Epoch 0:  64%|██████▍   | 3845/5971 [36:09<19:58,  1.77it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0237, train/loss_vlb_step=9.59e-5, train/loss_step=0.0237, global_step=373.0]  
Epoch 0:  64%|██████▍   | 3846/5971 [36:09<19:58,  1.77it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00849, train/loss_vlb_step=4.17e-5, train/loss_step=0.00849, global_step=373.0]
Epoch 0:  64%|██████▍   | 3847/5971 [36:10<19:58,  1.77it/s, loss=0.12, v_num=0, train/loss_simple_step=0.380, train/loss_vlb_step=0.00186, train/loss_step=0.380, global_step=373.0]     
Epoch 0:  64%|██████▍   | 3848/5971 [36:12<19:58,  1.77it/s, loss=0.12, v_num=0, train/loss_simple_step=0.380, train/loss_vlb_step=0.00186, train/loss_step=0.380, global_step=373.0]
Epoch 0:  64%|██████▍   | 3848/5971 [36:12<19:58,  1.77it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=6.32e-5, train/loss_step=0.0143, global_step=373.0]
Epoch 0:  64%|██████▍   | 3849/5971 [36:13<19:58,  1.77it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00307, train/loss_vlb_step=1.71e-5, train/loss_step=0.00307, global_step=374.0]
Epoch 0:  64%|██████▍   | 3850/5971 [36:14<19:57,  1.77it/s, loss=0.122, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000471, train/loss_step=0.135, global_step=374.0]   
Epoch 0:  64%|██████▍   | 3851/5971 [36:15<19:57,  1.77it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00218, train/loss_vlb_step=1.31e-5, train/loss_step=0.00218, global_step=374.0]
Epoch 0:  65%|██████▍   | 3852/5971 [36:17<19:57,  1.77it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00218, train/loss_vlb_step=1.31e-5, train/loss_step=0.00218, global_step=374.0]
Epoch 0:  65%|██████▍   | 3852/5971 [36:17<19:57,  1.77it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]   

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:01<03:22,  1.22s/it][A

Validating:   2%|▏         | 3/167 [00:01<00:59,  2.75it/s][A
Epoch 0:  65%|██████▍   | 3856/5971 [36:19<19:55,  1.77it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:   4%|▍         | 7/167 [00:01<00:22,  7.19it/s][A
Epoch 0:  65%|██████▍   | 3860/5971 [36:19<19:51,  1.77it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:   6%|▌         | 10/167 [00:01<00:15, 10.42it/s][A
Epoch 0:  65%|██████▍   | 3864/5971 [36:19<19:48,  1.77it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:   8%|▊         | 14/167 [00:01<00:10, 14.64it/s][A
Epoch 0:  65%|██████▍   | 3868/5971 [36:19<19:44,  1.78it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  10%|█         | 17/167 [00:01<00:08, 17.24it/s][A
Epoch 0:  65%|██████▍   | 3872/5971 [36:19<19:41,  1.78it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  12%|█▏        | 20/167 [00:01<00:07, 18.90it/s][A

Validating:  14%|█▍        | 23/167 [00:02<00:06, 20.99it/s][A
Epoch 0:  65%|██████▍   | 3876/5971 [36:19<19:37,  1.78it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  16%|█▌        | 27/167 [00:02<00:05, 23.88it/s][A
Epoch 0:  65%|██████▍   | 3880/5971 [36:20<19:34,  1.78it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  18%|█▊        | 30/167 [00:02<00:05, 24.51it/s][A
Epoch 0:  65%|██████▌   | 3884/5971 [36:20<19:31,  1.78it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  20%|█▉        | 33/167 [00:02<00:05, 24.57it/s][A
Epoch 0:  65%|██████▌   | 3888/5971 [36:20<19:27,  1.78it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  22%|██▏       | 36/167 [00:02<00:05, 24.12it/s][A

Validating:  23%|██▎       | 39/167 [00:02<00:05, 23.99it/s][A
Epoch 0:  65%|██████▌   | 3892/5971 [36:20<19:24,  1.79it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  25%|██▌       | 42/167 [00:02<00:05, 24.79it/s][A
Epoch 0:  65%|██████▌   | 3896/5971 [36:20<19:21,  1.79it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 25.74it/s][A
Epoch 0:  65%|██████▌   | 3900/5971 [36:20<19:17,  1.79it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  29%|██▊       | 48/167 [00:03<00:04, 26.81it/s][A
Epoch 0:  65%|██████▌   | 3904/5971 [36:21<19:14,  1.79it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  31%|███       | 52/167 [00:03<00:04, 28.18it/s][A

Validating:  33%|███▎      | 55/167 [00:03<00:04, 27.99it/s][A
Epoch 0:  65%|██████▌   | 3908/5971 [36:21<19:11,  1.79it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  35%|███▍      | 58/167 [00:03<00:04, 27.25it/s][A
Epoch 0:  66%|██████▌   | 3912/5971 [36:21<19:07,  1.79it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  37%|███▋      | 61/167 [00:03<00:03, 26.64it/s][A
Epoch 0:  66%|██████▌   | 3916/5971 [36:21<19:04,  1.80it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  38%|███▊      | 64/167 [00:03<00:03, 25.97it/s][A

Validating:  40%|████      | 67/167 [00:03<00:03, 26.07it/s][A
Epoch 0:  66%|██████▌   | 3920/5971 [36:21<19:01,  1.80it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 25.40it/s][A
Epoch 0:  66%|██████▌   | 3924/5971 [36:21<18:57,  1.80it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 26.41it/s][A
Epoch 0:  66%|██████▌   | 3928/5971 [36:21<18:54,  1.80it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  46%|████▌     | 77/167 [00:04<00:03, 26.12it/s][A
Epoch 0:  66%|██████▌   | 3932/5971 [36:22<18:51,  1.80it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  48%|████▊     | 80/167 [00:04<00:03, 26.58it/s][A

Validating:  50%|████▉     | 83/167 [00:04<00:03, 25.12it/s][A
Epoch 0:  66%|██████▌   | 3936/5971 [36:22<18:47,  1.80it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  52%|█████▏    | 87/167 [00:04<00:03, 26.37it/s][A
Epoch 0:  66%|██████▌   | 3940/5971 [36:22<18:44,  1.81it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  54%|█████▍    | 90/167 [00:04<00:02, 26.89it/s][A
Epoch 0:  66%|██████▌   | 3944/5971 [36:22<18:41,  1.81it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 25.40it/s][A
Epoch 0:  66%|██████▌   | 3948/5971 [36:22<18:38,  1.81it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 27.07it/s][A
Epoch 0:  66%|██████▌   | 3952/5971 [36:22<18:34,  1.81it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 26.36it/s][A
Epoch 0:  66%|██████▋   | 3956/5971 [36:22<18:31,  1.81it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  62%|██████▏   | 104/167 [00:05<00:02, 27.02it/s][A

Validating:  64%|██████▍   | 107/167 [00:05<00:02, 26.43it/s][A
Epoch 0:  66%|██████▋   | 3960/5971 [36:23<18:28,  1.81it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  66%|██████▌   | 110/167 [00:05<00:02, 27.13it/s][A
Epoch 0:  66%|██████▋   | 3964/5971 [36:23<18:25,  1.82it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  68%|██████▊   | 113/167 [00:05<00:02, 26.89it/s][A
Epoch 0:  66%|██████▋   | 3968/5971 [36:23<18:21,  1.82it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  69%|██████▉   | 116/167 [00:05<00:01, 26.38it/s][A

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 26.17it/s][A
Epoch 0:  67%|██████▋   | 3972/5971 [36:23<18:18,  1.82it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 25.80it/s][A
Epoch 0:  67%|██████▋   | 3976/5971 [36:23<18:15,  1.82it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 25.99it/s][A
Epoch 0:  67%|██████▋   | 3980/5971 [36:23<18:12,  1.82it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  77%|███████▋  | 128/167 [00:06<00:01, 26.89it/s][A

Validating:  78%|███████▊  | 131/167 [00:06<00:01, 26.12it/s][A
Epoch 0:  67%|██████▋   | 3984/5971 [36:24<18:09,  1.82it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  80%|████████  | 134/167 [00:06<00:01, 26.60it/s][A
Epoch 0:  67%|██████▋   | 3988/5971 [36:24<18:05,  1.83it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  82%|████████▏ | 137/167 [00:06<00:01, 26.52it/s][A
Epoch 0:  67%|██████▋   | 3992/5971 [36:24<18:02,  1.83it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  84%|████████▍ | 140/167 [00:06<00:01, 26.29it/s][A

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 25.57it/s][A
Epoch 0:  67%|██████▋   | 3996/5971 [36:24<17:59,  1.83it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 26.67it/s][A
Epoch 0:  67%|██████▋   | 4000/5971 [36:24<17:56,  1.83it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 26.27it/s][A
Epoch 0:  67%|██████▋   | 4004/5971 [36:24<17:53,  1.83it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 26.25it/s][A
Epoch 0:  67%|██████▋   | 4008/5971 [36:24<17:49,  1.83it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  93%|█████████▎| 156/167 [00:07<00:00, 27.08it/s][A

Validating:  95%|█████████▌| 159/167 [00:07<00:00, 27.37it/s][A
Epoch 0:  67%|██████▋   | 4012/5971 [36:25<17:46,  1.84it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  97%|█████████▋| 162/167 [00:07<00:00, 26.92it/s][A
Epoch 0:  67%|██████▋   | 4016/5971 [36:25<17:43,  1.84it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]

Validating:  99%|█████████▉| 165/167 [00:07<00:00, 26.30it/s][A
Epoch 0:  67%|██████▋   | 4020/5971 [36:25<17:40,  1.84it/s, loss=0.126, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=374.0]
Epoch 0:  67%|██████▋   | 4021/5971 [36:26<17:40,  1.84it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.51e-5, train/loss_step=0.0124, global_step=375.0]
Epoch 0:  67%|██████▋   | 4022/5971 [36:27<17:39,  1.84it/s, loss=0.139, v_num=0, train/loss_simple_step=0.338, train/loss_vlb_step=0.00145, train/loss_step=0.338, global_step=375.0]  
Epoch 0:  67%|██████▋   | 4023/5971 [36:28<17:39,  1.84it/s, loss=0.107, v_num=0, train/loss_simple_step=0.00521, train/loss_vlb_step=2.67e-5, train/loss_step=0.00521, global_step=375.0]
Epoch 0:  67%|██████▋   | 4024/5971 [36:30<17:39,  1.84it/s, loss=0.107, v_num=0, train/loss_simple_step=0.00521, train/loss_vlb_step=2.67e-5, train/loss_step=0.00521, global_step=375.0]
Epoch 0:  67%|██████▋   | 4024/5971 [36:30<17:39,  1.84it/s, loss=0.116, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.00069, train/loss_step=0.197, global_step=375.0]    
Epoch 0:  67%|██████▋   | 4025/5971 [36:31<17:39,  1.84it/s, loss=0.0964, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00216, train/loss_step=0.363, global_step=376.0]
Epoch 0:  67%|██████▋   | 4026/5971 [36:32<17:39,  1.84it/s, loss=0.107, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.000919, train/loss_step=0.243, global_step=376.0]
Epoch 0:  67%|██████▋   | 4027/5971 [36:33<17:38,  1.84it/s, loss=0.106, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.00041, train/loss_step=0.122, global_step=376.0] 
Epoch 0:  67%|██████▋   | 4028/5971 [36:35<17:38,  1.83it/s, loss=0.106, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.00041, train/loss_step=0.122, global_step=376.0]
Epoch 0:  67%|██████▋   | 4028/5971 [36:35<17:38,  1.83it/s, loss=0.137, v_num=0, train/loss_simple_step=0.649, train/loss_vlb_step=0.0127, train/loss_step=0.649, global_step=376.0] 
Epoch 0:  67%|██████▋   | 4029/5971 [36:36<17:38,  1.83it/s, loss=0.152, v_num=0, train/loss_simple_step=0.314, train/loss_vlb_step=0.00136, train/loss_step=0.314, global_step=377.0]
Epoch 0:  67%|██████▋   | 4030/5971 [36:37<17:38,  1.83it/s, loss=0.157, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000376, train/loss_step=0.114, global_step=377.0]
Epoch 0:  68%|██████▊   | 4031/5971 [36:38<17:37,  1.83it/s, loss=0.172, v_num=0, train/loss_simple_step=0.389, train/loss_vlb_step=0.00209, train/loss_step=0.389, global_step=377.0] 
Epoch 0:  68%|██████▊   | 4032/5971 [36:40<17:38,  1.83it/s, loss=0.172, v_num=0, train/loss_simple_step=0.389, train/loss_vlb_step=0.00209, train/loss_step=0.389, global_step=377.0]
Epoch 0:  68%|██████▊   | 4032/5971 [36:40<17:38,  1.83it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0026, train/loss_vlb_step=1.52e-5, train/loss_step=0.0026, global_step=377.0]
Epoch 0:  68%|██████▊   | 4033/5971 [36:41<17:37,  1.83it/s, loss=0.212, v_num=0, train/loss_simple_step=0.830, train/loss_vlb_step=0.0429, train/loss_step=0.830, global_step=378.0]   
Epoch 0:  68%|██████▊   | 4034/5971 [36:42<17:37,  1.83it/s, loss=0.215, v_num=0, train/loss_simple_step=0.0707, train/loss_vlb_step=0.000235, train/loss_step=0.0707, global_step=378.0]
Epoch 0:  68%|██████▊   | 4035/5971 [36:43<17:36,  1.83it/s, loss=0.216, v_num=0, train/loss_simple_step=0.401, train/loss_vlb_step=0.00205, train/loss_step=0.401, global_step=378.0]   
Epoch 0:  68%|██████▊   | 4036/5971 [36:45<17:37,  1.83it/s, loss=0.216, v_num=0, train/loss_simple_step=0.401, train/loss_vlb_step=0.00205, train/loss_step=0.401, global_step=378.0]
Epoch 0:  68%|██████▊   | 4036/5971 [36:45<17:37,  1.83it/s, loss=0.222, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000383, train/loss_step=0.116, global_step=378.0]
Epoch 0:  68%|██████▊   | 4037/5971 [36:46<17:36,  1.83it/s, loss=0.236, v_num=0, train/loss_simple_step=0.294, train/loss_vlb_step=0.00123, train/loss_step=0.294, global_step=379.0] 
Epoch 0:  68%|██████▊   | 4038/5971 [36:47<17:36,  1.83it/s, loss=0.23, v_num=0, train/loss_simple_step=0.00936, train/loss_vlb_step=4.09e-5, train/loss_step=0.00936, global_step=379.0]
Epoch 0:  68%|██████▊   | 4039/5971 [36:48<17:36,  1.83it/s, loss=0.231, v_num=0, train/loss_simple_step=0.0348, train/loss_vlb_step=0.000128, train/loss_step=0.0348, global_step=379.0]
Epoch 0:  68%|██████▊   | 4040/5971 [36:50<17:36,  1.83it/s, loss=0.231, v_num=0, train/loss_simple_step=0.0348, train/loss_vlb_step=0.000128, train/loss_step=0.0348, global_step=379.0]
Epoch 0:  68%|██████▊   | 4040/5971 [36:50<17:36,  1.83it/s, loss=0.242, v_num=0, train/loss_simple_step=0.339, train/loss_vlb_step=0.00206, train/loss_step=0.339, global_step=379.0]   
Epoch 0:  68%|██████▊   | 4041/5971 [36:51<17:36,  1.83it/s, loss=0.242, v_num=0, train/loss_simple_step=0.00415, train/loss_vlb_step=2.18e-5, train/loss_step=0.00415, global_step=380.0]
Epoch 0:  68%|██████▊   | 4042/5971 [36:52<17:35,  1.83it/s, loss=0.228, v_num=0, train/loss_simple_step=0.0551, train/loss_vlb_step=0.000189, train/loss_step=0.0551, global_step=380.0] 
Epoch 0:  68%|██████▊   | 4043/5971 [36:53<17:35,  1.83it/s, loss=0.242, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.00109, train/loss_step=0.289, global_step=380.0]   
Epoch 0:  68%|██████▊   | 4044/5971 [36:55<17:35,  1.83it/s, loss=0.242, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.00109, train/loss_step=0.289, global_step=380.0]
Epoch 0:  68%|██████▊   | 4044/5971 [36:55<17:35,  1.83it/s, loss=0.24, v_num=0, train/loss_simple_step=0.159, train/loss_vlb_step=0.00053, train/loss_step=0.159, global_step=380.0] 
Epoch 0:  68%|██████▊   | 4045/5971 [36:56<17:35,  1.83it/s, loss=0.223, v_num=0, train/loss_simple_step=0.0199, train/loss_vlb_step=7.88e-5, train/loss_step=0.0199, global_step=381.0]
Epoch 0:  68%|██████▊   | 4046/5971 [36:57<17:34,  1.83it/s, loss=0.216, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000359, train/loss_step=0.109, global_step=381.0] 
Epoch 0:  68%|██████▊   | 4047/5971 [36:58<17:34,  1.82it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.59e-5, train/loss_step=0.0154, global_step=381.0]
Epoch 0:  68%|██████▊   | 4048/5971 [37:00<17:34,  1.82it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.59e-5, train/loss_step=0.0154, global_step=381.0]
Epoch 0:  68%|██████▊   | 4048/5971 [37:00<17:34,  1.82it/s, loss=0.189, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.000781, train/loss_step=0.220, global_step=381.0] 
Epoch 0:  68%|██████▊   | 4049/5971 [37:01<17:34,  1.82it/s, loss=0.18, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000396, train/loss_step=0.120, global_step=382.0] 
Epoch 0:  68%|██████▊   | 4050/5971 [37:02<17:33,  1.82it/s, loss=0.174, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.06e-5, train/loss_step=0.00174, global_step=382.0]
Epoch 0:  68%|██████▊   | 4051/5971 [37:03<17:33,  1.82it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000154, train/loss_step=0.0423, global_step=382.0] 
Epoch 0:  68%|██████▊   | 4052/5971 [37:05<17:33,  1.82it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000154, train/loss_step=0.0423, global_step=382.0]
Epoch 0:  68%|██████▊   | 4052/5971 [37:05<17:33,  1.82it/s, loss=0.171, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.0039, train/loss_step=0.282, global_step=382.0]    
Epoch 0:  68%|██████▊   | 4053/5971 [37:06<17:33,  1.82it/s, loss=0.137, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000513, train/loss_step=0.156, global_step=383.0]
Epoch 0:  68%|██████▊   | 4054/5971 [37:07<17:32,  1.82it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0024, train/loss_vlb_step=1.38e-5, train/loss_step=0.0024, global_step=383.0]
Epoch 0:  68%|██████▊   | 4055/5971 [37:07<17:32,  1.82it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0984, train/loss_vlb_step=0.000326, train/loss_step=0.0984, global_step=383.0]
Epoch 0:  68%|██████▊   | 4056/5971 [37:10<17:32,  1.82it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0984, train/loss_vlb_step=0.000326, train/loss_step=0.0984, global_step=383.0]
Epoch 0:  68%|██████▊   | 4056/5971 [37:10<17:32,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00142, train/loss_vlb_step=8.66e-6, train/loss_step=0.00142, global_step=383.0]
Epoch 0:  68%|██████▊   | 4057/5971 [37:11<17:32,  1.82it/s, loss=0.0999, v_num=0, train/loss_simple_step=0.0375, train/loss_vlb_step=0.000143, train/loss_step=0.0375, global_step=384.0]
Epoch 0:  68%|██████▊   | 4058/5971 [37:12<17:32,  1.82it/s, loss=0.111, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.000809, train/loss_step=0.234, global_step=384.0]   
Epoch 0:  68%|██████▊   | 4059/5971 [37:13<17:31,  1.82it/s, loss=0.133, v_num=0, train/loss_simple_step=0.471, train/loss_vlb_step=0.00307, train/loss_step=0.471, global_step=384.0] 
Epoch 0:  68%|██████▊   | 4060/5971 [37:15<17:31,  1.82it/s, loss=0.133, v_num=0, train/loss_simple_step=0.471, train/loss_vlb_step=0.00307, train/loss_step=0.471, global_step=384.0]
Epoch 0:  68%|██████▊   | 4060/5971 [37:15<17:31,  1.82it/s, loss=0.126, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000744, train/loss_step=0.204, global_step=384.0]
Epoch 0:  68%|██████▊   | 4061/5971 [37:16<17:31,  1.82it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0182, train/loss_vlb_step=7.68e-5, train/loss_step=0.0182, global_step=385.0]
Epoch 0:  68%|██████▊   | 4062/5971 [37:16<17:31,  1.82it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00215, train/loss_vlb_step=1.21e-5, train/loss_step=0.00215, global_step=385.0]
Epoch 0:  68%|██████▊   | 4063/5971 [37:17<17:30,  1.82it/s, loss=0.115, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000341, train/loss_step=0.103, global_step=385.0]   
Epoch 0:  68%|██████▊   | 4064/5971 [37:19<17:30,  1.81it/s, loss=0.115, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000341, train/loss_step=0.103, global_step=385.0]
Epoch 0:  68%|██████▊   | 4064/5971 [37:19<17:30,  1.81it/s, loss=0.107, v_num=0, train/loss_simple_step=0.00816, train/loss_vlb_step=3.71e-5, train/loss_step=0.00816, global_step=385.0]
Epoch 0:  68%|██████▊   | 4065/5971 [37:20<17:30,  1.81it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0468, train/loss_vlb_step=0.000169, train/loss_step=0.0468, global_step=386.0] 
Epoch 0:  68%|██████▊   | 4066/5971 [37:21<17:30,  1.81it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0206, train/loss_vlb_step=8.23e-5, train/loss_step=0.0206, global_step=386.0] 
Epoch 0:  68%|██████▊   | 4067/5971 [37:22<17:29,  1.81it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0526, train/loss_vlb_step=0.000185, train/loss_step=0.0526, global_step=386.0]
Epoch 0:  68%|██████▊   | 4068/5971 [37:24<17:29,  1.81it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0526, train/loss_vlb_step=0.000185, train/loss_step=0.0526, global_step=386.0]
Epoch 0:  68%|██████▊   | 4068/5971 [37:24<17:29,  1.81it/s, loss=0.11, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.00119, train/loss_step=0.288, global_step=386.0]    
Epoch 0:  68%|██████▊   | 4069/5971 [37:25<17:29,  1.81it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=6.27e-5, train/loss_step=0.0142, global_step=387.0]
Epoch 0:  68%|██████▊   | 4070/5971 [37:26<17:29,  1.81it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00295, train/loss_vlb_step=1.69e-5, train/loss_step=0.00295, global_step=387.0]
Epoch 0:  68%|██████▊   | 4071/5971 [37:27<17:28,  1.81it/s, loss=0.119, v_num=0, train/loss_simple_step=0.328, train/loss_vlb_step=0.00145, train/loss_step=0.328, global_step=387.0]    
Epoch 0:  68%|██████▊   | 4072/5971 [37:29<17:28,  1.81it/s, loss=0.119, v_num=0, train/loss_simple_step=0.328, train/loss_vlb_step=0.00145, train/loss_step=0.328, global_step=387.0]
Epoch 0:  68%|██████▊   | 4072/5971 [37:29<17:28,  1.81it/s, loss=0.11, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000343, train/loss_step=0.102, global_step=387.0]
Epoch 0:  68%|██████▊   | 4073/5971 [37:30<17:28,  1.81it/s, loss=0.103, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.65e-5, train/loss_step=0.018, global_step=388.0]
Epoch 0:  68%|██████▊   | 4074/5971 [37:31<17:28,  1.81it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00699, train/loss_vlb_step=3.41e-5, train/loss_step=0.00699, global_step=388.0]
Epoch 0:  68%|██████▊   | 4075/5971 [37:32<17:27,  1.81it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0444, train/loss_vlb_step=0.000152, train/loss_step=0.0444, global_step=388.0]   
Epoch 0:  68%|██████▊   | 4076/5971 [37:34<17:28,  1.81it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0444, train/loss_vlb_step=0.000152, train/loss_step=0.0444, global_step=388.0]
Epoch 0:  68%|██████▊   | 4076/5971 [37:34<17:28,  1.81it/s, loss=0.138, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0263, train/loss_step=0.750, global_step=388.0]  
Epoch 0:  68%|██████▊   | 4077/5971 [37:35<17:27,  1.81it/s, loss=0.143, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000523, train/loss_step=0.150, global_step=389.0]
Epoch 0:  68%|██████▊   | 4078/5971 [37:36<17:27,  1.81it/s, loss=0.132, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=6.95e-5, train/loss_step=0.017, global_step=389.0] 
Epoch 0:  68%|██████▊   | 4079/5971 [37:37<17:26,  1.81it/s, loss=0.132, v_num=0, train/loss_simple_step=0.459, train/loss_vlb_step=0.00296, train/loss_step=0.459, global_step=389.0]
Epoch 0:  68%|██████▊   | 4080/5971 [37:39<17:27,  1.81it/s, loss=0.132, v_num=0, train/loss_simple_step=0.459, train/loss_vlb_step=0.00296, train/loss_step=0.459, global_step=389.0]
Epoch 0:  68%|██████▊   | 4080/5971 [37:39<17:27,  1.81it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0936, train/loss_vlb_step=0.000308, train/loss_step=0.0936, global_step=389.0]
Epoch 0:  68%|██████▊   | 4081/5971 [37:40<17:26,  1.81it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0554, train/loss_vlb_step=0.000194, train/loss_step=0.0554, global_step=390.0]
Epoch 0:  68%|██████▊   | 4082/5971 [37:41<17:26,  1.81it/s, loss=0.136, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000581, train/loss_step=0.168, global_step=390.0]  
Epoch 0:  68%|██████▊   | 4083/5971 [37:42<17:25,  1.81it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0231, train/loss_vlb_step=9.17e-5, train/loss_step=0.0231, global_step=390.0]
Epoch 0:  68%|██████▊   | 4084/5971 [37:44<17:26,  1.80it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0231, train/loss_vlb_step=9.17e-5, train/loss_step=0.0231, global_step=390.0]
Epoch 0:  68%|██████▊   | 4084/5971 [37:44<17:26,  1.80it/s, loss=0.151, v_num=0, train/loss_simple_step=0.372, train/loss_vlb_step=0.0025, train/loss_step=0.372, global_step=390.0]   
Epoch 0:  68%|██████▊   | 4085/5971 [37:45<17:25,  1.80it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00542, train/loss_vlb_step=2.75e-5, train/loss_step=0.00542, global_step=391.0]
Epoch 0:  68%|██████▊   | 4086/5971 [37:46<17:25,  1.80it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0886, train/loss_vlb_step=0.000292, train/loss_step=0.0886, global_step=391.0] 
Epoch 0:  68%|██████▊   | 4087/5971 [37:47<17:24,  1.80it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0058, train/loss_vlb_step=2.83e-5, train/loss_step=0.0058, global_step=391.0]  
Epoch 0:  68%|██████▊   | 4088/5971 [37:49<17:24,  1.80it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0058, train/loss_vlb_step=2.83e-5, train/loss_step=0.0058, global_step=391.0]
Epoch 0:  68%|██████▊   | 4088/5971 [37:49<17:24,  1.80it/s, loss=0.162, v_num=0, train/loss_simple_step=0.529, train/loss_vlb_step=0.00534, train/loss_step=0.529, global_step=391.0] 
Epoch 0:  68%|██████▊   | 4089/5971 [37:50<17:24,  1.80it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0306, train/loss_vlb_step=0.000111, train/loss_step=0.0306, global_step=392.0]
Epoch 0:  68%|██████▊   | 4090/5971 [37:50<17:24,  1.80it/s, loss=0.171, v_num=0, train/loss_simple_step=0.169, train/loss_vlb_step=0.00058, train/loss_step=0.169, global_step=392.0]   
Epoch 0:  69%|██████▊   | 4091/5971 [37:51<17:23,  1.80it/s, loss=0.178, v_num=0, train/loss_simple_step=0.473, train/loss_vlb_step=0.0043, train/loss_step=0.473, global_step=392.0] 
Epoch 0:  69%|██████▊   | 4092/5971 [37:53<17:23,  1.80it/s, loss=0.178, v_num=0, train/loss_simple_step=0.473, train/loss_vlb_step=0.0043, train/loss_step=0.473, global_step=392.0]
Epoch 0:  69%|██████▊   | 4092/5971 [37:53<17:23,  1.80it/s, loss=0.183, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000744, train/loss_step=0.205, global_step=392.0]
Epoch 0:  69%|██████▊   | 4093/5971 [37:54<17:23,  1.80it/s, loss=0.2, v_num=0, train/loss_simple_step=0.347, train/loss_vlb_step=0.00185, train/loss_step=0.347, global_step=393.0]   
Epoch 0:  69%|██████▊   | 4094/5971 [37:55<17:23,  1.80it/s, loss=0.199, v_num=0, train/loss_simple_step=0.0038, train/loss_vlb_step=2.04e-5, train/loss_step=0.0038, global_step=393.0]
Epoch 0:  69%|██████▊   | 4095/5971 [37:56<17:22,  1.80it/s, loss=0.207, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000616, train/loss_step=0.188, global_step=393.0] 
Epoch 0:  69%|██████▊   | 4096/5971 [37:58<17:22,  1.80it/s, loss=0.207, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000616, train/loss_step=0.188, global_step=393.0]
Epoch 0:  69%|██████▊   | 4096/5971 [37:58<17:22,  1.80it/s, loss=0.191, v_num=0, train/loss_simple_step=0.434, train/loss_vlb_step=0.00439, train/loss_step=0.434, global_step=393.0] 
Epoch 0:  69%|██████▊   | 4097/5971 [37:59<17:22,  1.80it/s, loss=0.184, v_num=0, train/loss_simple_step=0.00946, train/loss_vlb_step=4.32e-5, train/loss_step=0.00946, global_step=394.0]
Epoch 0:  69%|██████▊   | 4098/5971 [38:00<17:22,  1.80it/s, loss=0.197, v_num=0, train/loss_simple_step=0.280, train/loss_vlb_step=0.00118, train/loss_step=0.280, global_step=394.0]    
Epoch 0:  69%|██████▊   | 4099/5971 [38:01<17:21,  1.80it/s, loss=0.174, v_num=0, train/loss_simple_step=0.00217, train/loss_vlb_step=1.25e-5, train/loss_step=0.00217, global_step=394.0]
Epoch 0:  69%|██████▊   | 4100/5971 [38:03<17:21,  1.80it/s, loss=0.174, v_num=0, train/loss_simple_step=0.00217, train/loss_vlb_step=1.25e-5, train/loss_step=0.00217, global_step=394.0]
Epoch 0:  69%|██████▊   | 4100/5971 [38:03<17:21,  1.80it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0688, train/loss_vlb_step=0.000239, train/loss_step=0.0688, global_step=394.0] 
Epoch 0:  69%|██████▊   | 4101/5971 [38:04<17:21,  1.80it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0382, train/loss_vlb_step=0.000146, train/loss_step=0.0382, global_step=395.0]
Epoch 0:  69%|██████▊   | 4102/5971 [38:05<17:21,  1.80it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0336, train/loss_vlb_step=0.000117, train/loss_step=0.0336, global_step=395.0]
Epoch 0:  69%|██████▊   | 4103/5971 [38:06<17:20,  1.79it/s, loss=0.19, v_num=0, train/loss_simple_step=0.513, train/loss_vlb_step=0.00666, train/loss_step=0.513, global_step=395.0]    
Epoch 0:  69%|██████▊   | 4104/5971 [38:08<17:20,  1.79it/s, loss=0.19, v_num=0, train/loss_simple_step=0.513, train/loss_vlb_step=0.00666, train/loss_step=0.513, global_step=395.0]
Epoch 0:  69%|██████▊   | 4104/5971 [38:08<17:20,  1.79it/s, loss=0.177, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.00037, train/loss_step=0.113, global_step=395.0]
Epoch 0:  69%|██████▊   | 4105/5971 [38:09<17:20,  1.79it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.42e-5, train/loss_step=0.0119, global_step=396.0]
Epoch 0:  69%|██████▉   | 4106/5971 [38:10<17:20,  1.79it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00742, train/loss_vlb_step=3.68e-5, train/loss_step=0.00742, global_step=396.0]
Epoch 0:  69%|██████▉   | 4107/5971 [38:11<17:19,  1.79it/s, loss=0.179, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.00038, train/loss_step=0.116, global_step=396.0]    
Epoch 0:  69%|██████▉   | 4108/5971 [38:13<17:19,  1.79it/s, loss=0.179, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.00038, train/loss_step=0.116, global_step=396.0]
Epoch 0:  69%|██████▉   | 4108/5971 [38:13<17:19,  1.79it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00466, train/loss_vlb_step=2.48e-5, train/loss_step=0.00466, global_step=396.0]
Epoch 0:  69%|██████▉   | 4109/5971 [38:14<17:19,  1.79it/s, loss=0.163, v_num=0, train/loss_simple_step=0.238, train/loss_vlb_step=0.000837, train/loss_step=0.238, global_step=397.0]   
Epoch 0:  69%|██████▉   | 4110/5971 [38:15<17:19,  1.79it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0504, train/loss_vlb_step=0.000183, train/loss_step=0.0504, global_step=397.0]
Epoch 0:  69%|██████▉   | 4111/5971 [38:16<17:18,  1.79it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00636, train/loss_vlb_step=3.16e-5, train/loss_step=0.00636, global_step=397.0]
Epoch 0:  69%|██████▉   | 4112/5971 [38:18<17:18,  1.79it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00636, train/loss_vlb_step=3.16e-5, train/loss_step=0.00636, global_step=397.0]
Epoch 0:  69%|██████▉   | 4112/5971 [38:18<17:18,  1.79it/s, loss=0.138, v_num=0, train/loss_simple_step=0.294, train/loss_vlb_step=0.00115, train/loss_step=0.294, global_step=397.0]    
Epoch 0:  69%|██████▉   | 4113/5971 [38:19<17:18,  1.79it/s, loss=0.126, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000379, train/loss_step=0.115, global_step=398.0]
Epoch 0:  69%|██████▉   | 4114/5971 [38:20<17:18,  1.79it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0385, train/loss_vlb_step=0.000141, train/loss_step=0.0385, global_step=398.0]
Epoch 0:  69%|██████▉   | 4115/5971 [38:21<17:17,  1.79it/s, loss=0.141, v_num=0, train/loss_simple_step=0.445, train/loss_vlb_step=0.00334, train/loss_step=0.445, global_step=398.0]   
Epoch 0:  69%|██████▉   | 4116/5971 [38:23<17:17,  1.79it/s, loss=0.141, v_num=0, train/loss_simple_step=0.445, train/loss_vlb_step=0.00334, train/loss_step=0.445, global_step=398.0]
Epoch 0:  69%|██████▉   | 4116/5971 [38:23<17:17,  1.79it/s, loss=0.126, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000446, train/loss_step=0.133, global_step=398.0]
Epoch 0:  69%|██████▉   | 4117/5971 [38:24<17:17,  1.79it/s, loss=0.166, v_num=0, train/loss_simple_step=0.814, train/loss_vlb_step=0.0327, train/loss_step=0.814, global_step=399.0]  
Epoch 0:  69%|██████▉   | 4118/5971 [38:25<17:17,  1.79it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00357, train/loss_vlb_step=2.04e-5, train/loss_step=0.00357, global_step=399.0]
Epoch 0:  69%|██████▉   | 4119/5971 [38:26<17:16,  1.79it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00455, train/loss_vlb_step=2.33e-5, train/loss_step=0.00455, global_step=399.0]
Epoch 0:  69%|██████▉   | 4120/5971 [38:28<17:16,  1.79it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00455, train/loss_vlb_step=2.33e-5, train/loss_step=0.00455, global_step=399.0]
Epoch 0:  69%|██████▉   | 4120/5971 [38:28<17:16,  1.79it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]      

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:25,  1.95it/s][A

Validating:   2%|▏         | 3/167 [00:00<00:28,  5.72it/s][A
Epoch 0:  69%|██████▉   | 4124/5971 [38:28<17:13,  1.79it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:   4%|▎         | 6/167 [00:00<00:14, 10.77it/s][A
Epoch 0:  69%|██████▉   | 4128/5971 [38:29<17:10,  1.79it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:   5%|▌         | 9/167 [00:00<00:10, 14.67it/s][A
Epoch 0:  69%|██████▉   | 4132/5971 [38:29<17:07,  1.79it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:   7%|▋         | 12/167 [00:01<00:09, 16.93it/s][A

Validating:   9%|▉         | 15/167 [00:01<00:07, 19.17it/s][A
Epoch 0:  69%|██████▉   | 4136/5971 [38:29<17:04,  1.79it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  11%|█         | 18/167 [00:01<00:07, 20.29it/s][A
Epoch 0:  69%|██████▉   | 4140/5971 [38:29<17:01,  1.79it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 22.50it/s][A
Epoch 0:  69%|██████▉   | 4144/5971 [38:29<16:58,  1.79it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  14%|█▍        | 24/167 [00:01<00:05, 24.26it/s][A

Validating:  16%|█▌        | 27/167 [00:01<00:05, 24.51it/s][A
Epoch 0:  69%|██████▉   | 4148/5971 [38:29<16:54,  1.80it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 25.87it/s][A
Epoch 0:  70%|██████▉   | 4152/5971 [38:29<16:51,  1.80it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 25.89it/s][A
Epoch 0:  70%|██████▉   | 4156/5971 [38:30<16:48,  1.80it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  22%|██▏       | 36/167 [00:01<00:05, 25.97it/s][A

Validating:  23%|██▎       | 39/167 [00:02<00:04, 25.76it/s][A
Epoch 0:  70%|██████▉   | 4160/5971 [38:30<16:45,  1.80it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  25%|██▌       | 42/167 [00:02<00:05, 24.36it/s][A
Epoch 0:  70%|██████▉   | 4164/5971 [38:30<16:42,  1.80it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 24.55it/s][A
Epoch 0:  70%|██████▉   | 4168/5971 [38:30<16:39,  1.80it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 25.46it/s][A

Validating:  31%|███       | 51/167 [00:02<00:04, 25.42it/s][A
Epoch 0:  70%|██████▉   | 4172/5971 [38:30<16:36,  1.81it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 26.50it/s][A
Epoch 0:  70%|██████▉   | 4176/5971 [38:30<16:33,  1.81it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 27.22it/s][A
Epoch 0:  70%|███████   | 4180/5971 [38:31<16:29,  1.81it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  36%|███▌      | 60/167 [00:02<00:04, 25.41it/s][A

Validating:  38%|███▊      | 63/167 [00:02<00:04, 24.40it/s][A
Epoch 0:  70%|███████   | 4184/5971 [38:31<16:26,  1.81it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  40%|███▉      | 66/167 [00:03<00:04, 24.77it/s][A
Epoch 0:  70%|███████   | 4188/5971 [38:31<16:23,  1.81it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 25.30it/s][A
Epoch 0:  70%|███████   | 4192/5971 [38:31<16:20,  1.81it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 26.42it/s][A

Validating:  45%|████▍     | 75/167 [00:03<00:03, 25.07it/s][A
Epoch 0:  70%|███████   | 4196/5971 [38:31<16:17,  1.82it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 25.78it/s][A
Epoch 0:  70%|███████   | 4200/5971 [38:31<16:14,  1.82it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 25.21it/s][A
Epoch 0:  70%|███████   | 4204/5971 [38:32<16:11,  1.82it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  50%|█████     | 84/167 [00:03<00:03, 25.35it/s][A

Validating:  52%|█████▏    | 87/167 [00:03<00:03, 24.85it/s][A
Epoch 0:  70%|███████   | 4208/5971 [38:32<16:08,  1.82it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  54%|█████▍    | 90/167 [00:04<00:03, 25.03it/s][A
Epoch 0:  71%|███████   | 4212/5971 [38:32<16:05,  1.82it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  56%|█████▌    | 93/167 [00:04<00:03, 23.24it/s][A
Epoch 0:  71%|███████   | 4216/5971 [38:32<16:02,  1.82it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 23.86it/s][A

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 24.17it/s][A
Epoch 0:  71%|███████   | 4220/5971 [38:32<15:59,  1.83it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  61%|██████    | 102/167 [00:04<00:02, 24.00it/s][A
Epoch 0:  71%|███████   | 4224/5971 [38:32<15:56,  1.83it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 21.69it/s][A
Epoch 0:  71%|███████   | 4228/5971 [38:33<15:53,  1.83it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 20.87it/s][A

Validating:  66%|██████▋   | 111/167 [00:05<00:02, 20.97it/s][A
Epoch 0:  71%|███████   | 4232/5971 [38:33<15:50,  1.83it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  68%|██████▊   | 114/167 [00:05<00:02, 22.17it/s][A
Epoch 0:  71%|███████   | 4236/5971 [38:33<15:47,  1.83it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  70%|███████   | 117/167 [00:05<00:02, 23.24it/s][A
Epoch 0:  71%|███████   | 4240/5971 [38:33<15:44,  1.83it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 24.66it/s][A

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 24.58it/s][A
Epoch 0:  71%|███████   | 4244/5971 [38:33<15:41,  1.83it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 25.11it/s][A
Epoch 0:  71%|███████   | 4248/5971 [38:33<15:38,  1.84it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 25.53it/s][A
Epoch 0:  71%|███████   | 4252/5971 [38:34<15:35,  1.84it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 23.96it/s][A

Validating:  81%|████████  | 135/167 [00:06<00:01, 22.78it/s][A
Epoch 0:  71%|███████▏  | 4256/5971 [38:34<15:32,  1.84it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  83%|████████▎ | 138/167 [00:06<00:01, 22.61it/s][A
Epoch 0:  71%|███████▏  | 4260/5971 [38:34<15:29,  1.84it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  84%|████████▍ | 141/167 [00:06<00:01, 23.50it/s][A
Epoch 0:  71%|███████▏  | 4264/5971 [38:34<15:26,  1.84it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 23.73it/s][A

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 23.35it/s][A
Epoch 0:  71%|███████▏  | 4268/5971 [38:34<15:23,  1.84it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 24.43it/s][A
Epoch 0:  72%|███████▏  | 4272/5971 [38:34<15:20,  1.85it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 23.22it/s][A
Epoch 0:  72%|███████▏  | 4276/5971 [38:35<15:17,  1.85it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 24.11it/s][A

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 25.58it/s][A
Epoch 0:  72%|███████▏  | 4280/5971 [38:35<15:14,  1.85it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  97%|█████████▋| 162/167 [00:07<00:00, 24.47it/s][A
Epoch 0:  72%|███████▏  | 4284/5971 [38:35<15:11,  1.85it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

Validating:  99%|█████████▉| 166/167 [00:07<00:00, 26.66it/s][A
Epoch 0:  72%|███████▏  | 4288/5971 [38:35<15:08,  1.85it/s, loss=0.196, v_num=0, train/loss_simple_step=0.935, train/loss_vlb_step=0.236, train/loss_step=0.935, global_step=399.0]

timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:38,  1.29it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:21,  2.26it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:16,  2.93it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:13,  3.48it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:11,  3.96it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:10,  4.33it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.52it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:09,  4.64it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  4.85it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.02it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.13it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.24it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:06,  5.29it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.32it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.35it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.42it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.48it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:04<00:05,  5.39it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.37it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.42it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.48it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.52it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.56it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.59it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.62it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.61it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.56it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.54it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:06<00:03,  5.51it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.55it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.58it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.60it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.59it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.62it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.63it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.63it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.63it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.49it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.51it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.53it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.57it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.56it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.58it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.60it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.62it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.64it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.66it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.65it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.63it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.64it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.12it/s]

Epoch 0:  72%|███████▏  | 4289/5971 [38:48<15:12,  1.84it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.2e-5, train/loss_step=0.0125, global_step=400.0]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.42it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.27it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.89it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.30it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.65it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.90it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.08it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.20it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.25it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.36it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.36it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.41it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.49it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.46it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.44it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.43it/s][A
Epoch 0:  72%|███████▏  | 4289/5971 [38:53<15:14,  1.84it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.2e-5, train/loss_step=0.0125, global_step=400.0]

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.36it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.35it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.36it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.36it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.38it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.41it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.41it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.47it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.50it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.48it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.51it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.55it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.58it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.61it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.62it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.62it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.63it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.65it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.65it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.65it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.65it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.66it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.64it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.65it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.65it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.62it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.62it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.45it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.36it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.32it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.25it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.23it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.19it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.16it/s]

Epoch 0:  72%|███████▏  | 4290/5971 [39:00<15:16,  1.83it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.2e-5, train/loss_step=0.0125, global_step=400.0]
Epoch 0:  72%|███████▏  | 4290/5971 [39:00<15:16,  1.83it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0426, train/loss_vlb_step=0.000163, train/loss_step=0.0426, global_step=400.0]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.44it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.30it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.96it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.46it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.82it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.07it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.25it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.38it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.47it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.54it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.59it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.56it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.59it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.62it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.64it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.66it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.66it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.66it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.66it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.67it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.67it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.68it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.67it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:04<00:04,  5.68it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.68it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.67it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.64it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.59it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.62it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.63it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.63it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.64it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.63it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.63it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.63it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.64it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.64it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.64it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.63it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.64it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:07<00:01,  5.65it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.65it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.65it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.65it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.65it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:08<00:00,  5.64it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.65it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.65it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.65it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.32it/s]

Epoch 0:  72%|███████▏  | 4291/5971 [39:11<15:20,  1.82it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0426, train/loss_vlb_step=0.000163, train/loss_step=0.0426, global_step=400.0]
Epoch 0:  72%|███████▏  | 4291/5971 [39:11<15:20,  1.82it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=4.98e-5, train/loss_step=0.0109, global_step=400.0]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.43it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.28it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.91it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.40it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.76it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.03it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.21it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.34it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.45it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.52it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.57it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.58it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.61it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.63it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.64it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.65it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.65it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.65it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.59it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.61it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.62it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.61it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.61it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.62it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.63it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.64it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.62it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.63it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.64it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.65it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.67it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:02,  5.67it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.68it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.67it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.64it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.66it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.66it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.67it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.67it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.66it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.65it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.55it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.50it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.47it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.51it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:08<00:00,  5.55it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.58it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.60it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.59it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.29it/s]

Epoch 0:  72%|███████▏  | 4292/5971 [39:24<15:24,  1.82it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=4.98e-5, train/loss_step=0.0109, global_step=400.0]
Epoch 0:  72%|███████▏  | 4292/5971 [39:24<15:24,  1.82it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0371, train/loss_vlb_step=0.000144, train/loss_step=0.0371, global_step=400.0]
Epoch 0:  72%|███████▏  | 4293/5971 [39:25<15:24,  1.82it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0371, train/loss_vlb_step=0.000144, train/loss_step=0.0371, global_step=400.0]
Epoch 0:  72%|███████▏  | 4293/5971 [39:25<15:24,  1.82it/s, loss=0.178, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.000897, train/loss_step=0.253, global_step=401.0]  
Epoch 0:  72%|███████▏  | 4294/5971 [39:26<15:24,  1.81it/s, loss=0.178, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.000897, train/loss_step=0.253, global_step=401.0]
Epoch 0:  72%|███████▏  | 4294/5971 [39:26<15:24,  1.81it/s, loss=0.188, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.00073, train/loss_step=0.210, global_step=401.0] 
Epoch 0:  72%|███████▏  | 4295/5971 [39:27<15:23,  1.81it/s, loss=0.188, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.00073, train/loss_step=0.210, global_step=401.0]
Epoch 0:  72%|███████▏  | 4295/5971 [39:27<15:23,  1.81it/s, loss=0.189, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000461, train/loss_step=0.139, global_step=401.0]
Epoch 0:  72%|███████▏  | 4296/5971 [39:30<15:23,  1.81it/s, loss=0.189, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000461, train/loss_step=0.139, global_step=401.0]
Epoch 0:  72%|███████▏  | 4296/5971 [39:30<15:23,  1.81it/s, loss=0.19, v_num=0, train/loss_simple_step=0.00954, train/loss_vlb_step=4.22e-5, train/loss_step=0.00954, global_step=401.0]
Epoch 0:  72%|███████▏  | 4297/5971 [39:30<15:23,  1.81it/s, loss=0.19, v_num=0, train/loss_simple_step=0.00954, train/loss_vlb_step=4.22e-5, train/loss_step=0.00954, global_step=401.0]
Epoch 0:  72%|███████▏  | 4297/5971 [39:30<15:23,  1.81it/s, loss=0.186, v_num=0, train/loss_simple_step=0.169, train/loss_vlb_step=0.000571, train/loss_step=0.169, global_step=402.0]  
Epoch 0:  72%|███████▏  | 4298/5971 [39:31<15:23,  1.81it/s, loss=0.186, v_num=0, train/loss_simple_step=0.169, train/loss_vlb_step=0.000571, train/loss_step=0.169, global_step=402.0]
Epoch 0:  72%|███████▏  | 4298/5971 [39:31<15:23,  1.81it/s, loss=0.193, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.000637, train/loss_step=0.186, global_step=402.0]
Epoch 0:  72%|███████▏  | 4299/5971 [39:32<15:22,  1.81it/s, loss=0.193, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.000637, train/loss_step=0.186, global_step=402.0]
Epoch 0:  72%|███████▏  | 4299/5971 [39:32<15:22,  1.81it/s, loss=0.205, v_num=0, train/loss_simple_step=0.248, train/loss_vlb_step=0.001, train/loss_step=0.248, global_step=402.0]   
Epoch 0:  72%|███████▏  | 4300/5971 [39:34<15:22,  1.81it/s, loss=0.205, v_num=0, train/loss_simple_step=0.248, train/loss_vlb_step=0.001, train/loss_step=0.248, global_step=402.0]
Epoch 0:  72%|███████▏  | 4300/5971 [39:34<15:22,  1.81it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00932, train/loss_vlb_step=4.46e-5, train/loss_step=0.00932, global_step=402.0]
Epoch 0:  72%|███████▏  | 4301/5971 [39:35<15:22,  1.81it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00932, train/loss_vlb_step=4.46e-5, train/loss_step=0.00932, global_step=402.0]
Epoch 0:  72%|███████▏  | 4301/5971 [39:35<15:22,  1.81it/s, loss=0.19, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000356, train/loss_step=0.102, global_step=403.0]    
Epoch 0:  72%|███████▏  | 4302/5971 [39:36<15:21,  1.81it/s, loss=0.19, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000356, train/loss_step=0.102, global_step=403.0]
Epoch 0:  72%|███████▏  | 4302/5971 [39:36<15:21,  1.81it/s, loss=0.194, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000359, train/loss_step=0.109, global_step=403.0]
Epoch 0:  72%|███████▏  | 4303/5971 [39:37<15:21,  1.81it/s, loss=0.194, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000359, train/loss_step=0.109, global_step=403.0]
Epoch 0:  72%|███████▏  | 4303/5971 [39:37<15:21,  1.81it/s, loss=0.192, v_num=0, train/loss_simple_step=0.402, train/loss_vlb_step=0.00261, train/loss_step=0.402, global_step=403.0] 
Epoch 0:  72%|███████▏  | 4304/5971 [39:40<15:21,  1.81it/s, loss=0.192, v_num=0, train/loss_simple_step=0.402, train/loss_vlb_step=0.00261, train/loss_step=0.402, global_step=403.0]
Epoch 0:  72%|███████▏  | 4304/5971 [39:40<15:21,  1.81it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0152, train/loss_vlb_step=6.11e-5, train/loss_step=0.0152, global_step=403.0]
Epoch 0:  72%|███████▏  | 4305/5971 [39:41<15:21,  1.81it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0152, train/loss_vlb_step=6.11e-5, train/loss_step=0.0152, global_step=403.0]
Epoch 0:  72%|███████▏  | 4305/5971 [39:41<15:21,  1.81it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00409, train/loss_vlb_step=2.24e-5, train/loss_step=0.00409, global_step=404.0]
Epoch 0:  72%|███████▏  | 4306/5971 [39:42<15:20,  1.81it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00409, train/loss_vlb_step=2.24e-5, train/loss_step=0.00409, global_step=404.0]
Epoch 0:  72%|███████▏  | 4306/5971 [39:42<15:20,  1.81it/s, loss=0.155, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000829, train/loss_step=0.196, global_step=404.0]   
Epoch 0:  72%|███████▏  | 4307/5971 [39:42<15:20,  1.81it/s, loss=0.155, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000829, train/loss_step=0.196, global_step=404.0]
Epoch 0:  72%|███████▏  | 4307/5971 [39:42<15:20,  1.81it/s, loss=0.162, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000536, train/loss_step=0.154, global_step=404.0]
Epoch 0:  72%|███████▏  | 4308/5971 [39:45<15:20,  1.81it/s, loss=0.162, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000536, train/loss_step=0.154, global_step=404.0]
Epoch 0:  72%|███████▏  | 4308/5971 [39:45<15:20,  1.81it/s, loss=0.125, v_num=0, train/loss_simple_step=0.191, train/loss_vlb_step=0.000775, train/loss_step=0.191, global_step=404.0]
Epoch 0:  72%|███████▏  | 4309/5971 [39:46<15:20,  1.81it/s, loss=0.125, v_num=0, train/loss_simple_step=0.191, train/loss_vlb_step=0.000775, train/loss_step=0.191, global_step=404.0]
Epoch 0:  72%|███████▏  | 4309/5971 [39:46<15:20,  1.81it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=7.31e-5, train/loss_step=0.0173, global_step=405.0]
Epoch 0:  72%|███████▏  | 4310/5971 [39:47<15:19,  1.81it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=7.31e-5, train/loss_step=0.0173, global_step=405.0]
Epoch 0:  72%|███████▏  | 4310/5971 [39:47<15:19,  1.81it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00842, train/loss_vlb_step=4.03e-5, train/loss_step=0.00842, global_step=405.0]
Epoch 0:  72%|███████▏  | 4311/5971 [39:47<15:19,  1.81it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00842, train/loss_vlb_step=4.03e-5, train/loss_step=0.00842, global_step=405.0]
Epoch 0:  72%|███████▏  | 4311/5971 [39:47<15:19,  1.81it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0379, train/loss_vlb_step=0.000139, train/loss_step=0.0379, global_step=405.0] 
Epoch 0:  72%|███████▏  | 4312/5971 [39:50<15:19,  1.80it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0379, train/loss_vlb_step=0.000139, train/loss_step=0.0379, global_step=405.0]
Epoch 0:  72%|███████▏  | 4312/5971 [39:50<15:19,  1.80it/s, loss=0.134, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.000749, train/loss_step=0.219, global_step=405.0]  
Epoch 0:  72%|███████▏  | 4313/5971 [39:51<15:19,  1.80it/s, loss=0.134, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.000749, train/loss_step=0.219, global_step=405.0]
Epoch 0:  72%|███████▏  | 4313/5971 [39:51<15:19,  1.80it/s, loss=0.142, v_num=0, train/loss_simple_step=0.411, train/loss_vlb_step=0.00189, train/loss_step=0.411, global_step=406.0] 
Epoch 0:  72%|███████▏  | 4314/5971 [39:52<15:18,  1.80it/s, loss=0.142, v_num=0, train/loss_simple_step=0.411, train/loss_vlb_step=0.00189, train/loss_step=0.411, global_step=406.0]
Epoch 0:  72%|███████▏  | 4314/5971 [39:52<15:18,  1.80it/s, loss=0.132, v_num=0, train/loss_simple_step=0.005, train/loss_vlb_step=2.7e-5, train/loss_step=0.005, global_step=406.0] 
Epoch 0:  72%|███████▏  | 4315/5971 [39:53<15:18,  1.80it/s, loss=0.132, v_num=0, train/loss_simple_step=0.005, train/loss_vlb_step=2.7e-5, train/loss_step=0.005, global_step=406.0]
Epoch 0:  72%|███████▏  | 4315/5971 [39:53<15:18,  1.80it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00195, train/loss_vlb_step=1.17e-5, train/loss_step=0.00195, global_step=406.0]
Epoch 0:  72%|███████▏  | 4316/5971 [39:56<15:18,  1.80it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00195, train/loss_vlb_step=1.17e-5, train/loss_step=0.00195, global_step=406.0]
Epoch 0:  72%|███████▏  | 4316/5971 [39:56<15:18,  1.80it/s, loss=0.145, v_num=0, train/loss_simple_step=0.408, train/loss_vlb_step=0.00269, train/loss_step=0.408, global_step=406.0]    
Epoch 0:  72%|███████▏  | 4317/5971 [39:57<15:18,  1.80it/s, loss=0.145, v_num=0, train/loss_simple_step=0.408, train/loss_vlb_step=0.00269, train/loss_step=0.408, global_step=406.0]
Epoch 0:  72%|███████▏  | 4317/5971 [39:57<15:18,  1.80it/s, loss=0.137, v_num=0, train/loss_simple_step=0.012, train/loss_vlb_step=5.41e-5, train/loss_step=0.012, global_step=407.0]
Epoch 0:  72%|███████▏  | 4318/5971 [39:58<15:17,  1.80it/s, loss=0.137, v_num=0, train/loss_simple_step=0.012, train/loss_vlb_step=5.41e-5, train/loss_step=0.012, global_step=407.0]
Epoch 0:  72%|███████▏  | 4318/5971 [39:58<15:17,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.221, train/loss_vlb_step=0.00075, train/loss_step=0.221, global_step=407.0]
Epoch 0:  72%|███████▏  | 4319/5971 [39:59<15:17,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.221, train/loss_vlb_step=0.00075, train/loss_step=0.221, global_step=407.0]
Epoch 0:  72%|███████▏  | 4319/5971 [39:59<15:17,  1.80it/s, loss=0.135, v_num=0, train/loss_simple_step=0.167, train/loss_vlb_step=0.000577, train/loss_step=0.167, global_step=407.0]
Epoch 0:  72%|███████▏  | 4320/5971 [40:02<15:17,  1.80it/s, loss=0.135, v_num=0, train/loss_simple_step=0.167, train/loss_vlb_step=0.000577, train/loss_step=0.167, global_step=407.0]
Epoch 0:  72%|███████▏  | 4320/5971 [40:02<15:17,  1.80it/s, loss=0.146, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000834, train/loss_step=0.228, global_step=407.0]
Epoch 0:  72%|███████▏  | 4321/5971 [40:03<15:17,  1.80it/s, loss=0.146, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000834, train/loss_step=0.228, global_step=407.0]
Epoch 0:  72%|███████▏  | 4321/5971 [40:03<15:17,  1.80it/s, loss=0.141, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.22e-5, train/loss_step=0.017, global_step=408.0] 
Epoch 0:  72%|███████▏  | 4322/5971 [40:03<15:16,  1.80it/s, loss=0.141, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.22e-5, train/loss_step=0.017, global_step=408.0]
Epoch 0:  72%|███████▏  | 4322/5971 [40:03<15:16,  1.80it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00978, train/loss_vlb_step=4.48e-5, train/loss_step=0.00978, global_step=408.0]
Epoch 0:  72%|███████▏  | 4323/5971 [40:04<15:16,  1.80it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00978, train/loss_vlb_step=4.48e-5, train/loss_step=0.00978, global_step=408.0]
Epoch 0:  72%|███████▏  | 4323/5971 [40:04<15:16,  1.80it/s, loss=0.126, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000763, train/loss_step=0.205, global_step=408.0]   
Epoch 0:  72%|███████▏  | 4324/5971 [40:07<15:16,  1.80it/s, loss=0.126, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000763, train/loss_step=0.205, global_step=408.0]
Epoch 0:  72%|███████▏  | 4324/5971 [40:07<15:16,  1.80it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0182, train/loss_vlb_step=7.78e-5, train/loss_step=0.0182, global_step=408.0]
Epoch 0:  72%|███████▏  | 4325/5971 [40:07<15:16,  1.80it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0182, train/loss_vlb_step=7.78e-5, train/loss_step=0.0182, global_step=408.0]
Epoch 0:  72%|███████▏  | 4325/5971 [40:07<15:16,  1.80it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=6.03e-5, train/loss_step=0.0144, global_step=409.0]
Epoch 0:  72%|███████▏  | 4326/5971 [40:08<15:15,  1.80it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=6.03e-5, train/loss_step=0.0144, global_step=409.0]
Epoch 0:  72%|███████▏  | 4326/5971 [40:08<15:15,  1.80it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0323, train/loss_vlb_step=0.000119, train/loss_step=0.0323, global_step=409.0]
Epoch 0:  72%|███████▏  | 4327/5971 [40:09<15:15,  1.80it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0323, train/loss_vlb_step=0.000119, train/loss_step=0.0323, global_step=409.0]
Epoch 0:  72%|███████▏  | 4327/5971 [40:09<15:15,  1.80it/s, loss=0.117, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000386, train/loss_step=0.117, global_step=409.0]  
Epoch 0:  72%|███████▏  | 4328/5971 [40:11<15:15,  1.79it/s, loss=0.117, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000386, train/loss_step=0.117, global_step=409.0]
Epoch 0:  72%|███████▏  | 4328/5971 [40:11<15:15,  1.79it/s, loss=0.113, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000381, train/loss_step=0.116, global_step=409.0]
Epoch 0:  73%|███████▎  | 4329/5971 [40:12<15:14,  1.79it/s, loss=0.113, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000381, train/loss_step=0.116, global_step=409.0]
Epoch 0:  73%|███████▎  | 4329/5971 [40:12<15:14,  1.79it/s, loss=0.123, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.000703, train/loss_step=0.206, global_step=410.0]
Epoch 0:  73%|███████▎  | 4330/5971 [40:13<15:14,  1.79it/s, loss=0.123, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.000703, train/loss_step=0.206, global_step=410.0]
Epoch 0:  73%|███████▎  | 4330/5971 [40:13<15:14,  1.79it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00895, train/loss_vlb_step=4.23e-5, train/loss_step=0.00895, global_step=410.0]
Epoch 0:  73%|███████▎  | 4331/5971 [40:14<15:14,  1.79it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00895, train/loss_vlb_step=4.23e-5, train/loss_step=0.00895, global_step=410.0]
Epoch 0:  73%|███████▎  | 4331/5971 [40:14<15:14,  1.79it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0168, train/loss_vlb_step=7.21e-5, train/loss_step=0.0168, global_step=410.0]  
Epoch 0:  73%|███████▎  | 4332/5971 [40:16<15:14,  1.79it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0168, train/loss_vlb_step=7.21e-5, train/loss_step=0.0168, global_step=410.0]
Epoch 0:  73%|███████▎  | 4332/5971 [40:16<15:14,  1.79it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00397, train/loss_vlb_step=2.15e-5, train/loss_step=0.00397, global_step=410.0]
Epoch 0:  73%|███████▎  | 4333/5971 [40:17<15:13,  1.79it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00397, train/loss_vlb_step=2.15e-5, train/loss_step=0.00397, global_step=410.0]
Epoch 0:  73%|███████▎  | 4333/5971 [40:17<15:13,  1.79it/s, loss=0.0909, v_num=0, train/loss_simple_step=0.00757, train/loss_vlb_step=3.51e-5, train/loss_step=0.00757, global_step=411.0]
Epoch 0:  73%|███████▎  | 4334/5971 [40:18<15:13,  1.79it/s, loss=0.0909, v_num=0, train/loss_simple_step=0.00757, train/loss_vlb_step=3.51e-5, train/loss_step=0.00757, global_step=411.0]
Epoch 0:  73%|███████▎  | 4334/5971 [40:18<15:13,  1.79it/s, loss=0.0907, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=8.94e-6, train/loss_step=0.00151, global_step=411.0]
Epoch 0:  73%|███████▎  | 4335/5971 [40:19<15:12,  1.79it/s, loss=0.0907, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=8.94e-6, train/loss_step=0.00151, global_step=411.0]
Epoch 0:  73%|███████▎  | 4335/5971 [40:19<15:12,  1.79it/s, loss=0.0918, v_num=0, train/loss_simple_step=0.0238, train/loss_vlb_step=9.56e-5, train/loss_step=0.0238, global_step=411.0]  
Epoch 0:  73%|███████▎  | 4336/5971 [40:21<15:13,  1.79it/s, loss=0.0918, v_num=0, train/loss_simple_step=0.0238, train/loss_vlb_step=9.56e-5, train/loss_step=0.0238, global_step=411.0]
Epoch 0:  73%|███████▎  | 4336/5971 [40:21<15:13,  1.79it/s, loss=0.0733, v_num=0, train/loss_simple_step=0.0385, train/loss_vlb_step=0.000137, train/loss_step=0.0385, global_step=411.0]
Epoch 0:  73%|███████▎  | 4337/5971 [40:22<15:12,  1.79it/s, loss=0.0733, v_num=0, train/loss_simple_step=0.0385, train/loss_vlb_step=0.000137, train/loss_step=0.0385, global_step=411.0]
Epoch 0:  73%|███████▎  | 4337/5971 [40:22<15:12,  1.79it/s, loss=0.0798, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000466, train/loss_step=0.142, global_step=412.0]  
Epoch 0:  73%|███████▎  | 4338/5971 [40:23<15:12,  1.79it/s, loss=0.0798, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000466, train/loss_step=0.142, global_step=412.0]
Epoch 0:  73%|███████▎  | 4338/5971 [40:23<15:12,  1.79it/s, loss=0.0697, v_num=0, train/loss_simple_step=0.0204, train/loss_vlb_step=7.9e-5, train/loss_step=0.0204, global_step=412.0]
Epoch 0:  73%|███████▎  | 4339/5971 [40:24<15:11,  1.79it/s, loss=0.0697, v_num=0, train/loss_simple_step=0.0204, train/loss_vlb_step=7.9e-5, train/loss_step=0.0204, global_step=412.0]
Epoch 0:  73%|███████▎  | 4339/5971 [40:24<15:11,  1.79it/s, loss=0.0616, v_num=0, train/loss_simple_step=0.00471, train/loss_vlb_step=2.49e-5, train/loss_step=0.00471, global_step=412.0]
Epoch 0:  73%|███████▎  | 4340/5971 [40:27<15:12,  1.79it/s, loss=0.0616, v_num=0, train/loss_simple_step=0.00471, train/loss_vlb_step=2.49e-5, train/loss_step=0.00471, global_step=412.0]
Epoch 0:  73%|███████▎  | 4340/5971 [40:27<15:12,  1.79it/s, loss=0.0528, v_num=0, train/loss_simple_step=0.0528, train/loss_vlb_step=0.000189, train/loss_step=0.0528, global_step=412.0] 
Epoch 0:  73%|███████▎  | 4341/5971 [40:28<15:11,  1.79it/s, loss=0.0528, v_num=0, train/loss_simple_step=0.0528, train/loss_vlb_step=0.000189, train/loss_step=0.0528, global_step=412.0]
Epoch 0:  73%|███████▎  | 4341/5971 [40:28<15:11,  1.79it/s, loss=0.0522, v_num=0, train/loss_simple_step=0.00522, train/loss_vlb_step=2.69e-5, train/loss_step=0.00522, global_step=413.0]
Epoch 0:  73%|███████▎  | 4342/5971 [40:29<15:11,  1.79it/s, loss=0.0522, v_num=0, train/loss_simple_step=0.00522, train/loss_vlb_step=2.69e-5, train/loss_step=0.00522, global_step=413.0]
Epoch 0:  73%|███████▎  | 4342/5971 [40:29<15:11,  1.79it/s, loss=0.0524, v_num=0, train/loss_simple_step=0.0131, train/loss_vlb_step=5.52e-5, train/loss_step=0.0131, global_step=413.0]  
Epoch 0:  73%|███████▎  | 4343/5971 [40:30<15:10,  1.79it/s, loss=0.0524, v_num=0, train/loss_simple_step=0.0131, train/loss_vlb_step=5.52e-5, train/loss_step=0.0131, global_step=413.0]
Epoch 0:  73%|███████▎  | 4343/5971 [40:30<15:10,  1.79it/s, loss=0.0451, v_num=0, train/loss_simple_step=0.0594, train/loss_vlb_step=0.000203, train/loss_step=0.0594, global_step=413.0]
Epoch 0:  73%|███████▎  | 4344/5971 [40:32<15:10,  1.79it/s, loss=0.0451, v_num=0, train/loss_simple_step=0.0594, train/loss_vlb_step=0.000203, train/loss_step=0.0594, global_step=413.0]
Epoch 0:  73%|███████▎  | 4344/5971 [40:32<15:10,  1.79it/s, loss=0.0444, v_num=0, train/loss_simple_step=0.0044, train/loss_vlb_step=2.35e-5, train/loss_step=0.0044, global_step=413.0] 
Epoch 0:  73%|███████▎  | 4345/5971 [40:33<15:10,  1.79it/s, loss=0.0444, v_num=0, train/loss_simple_step=0.0044, train/loss_vlb_step=2.35e-5, train/loss_step=0.0044, global_step=413.0]
Epoch 0:  73%|███████▎  | 4345/5971 [40:33<15:10,  1.79it/s, loss=0.0439, v_num=0, train/loss_simple_step=0.00418, train/loss_vlb_step=2.23e-5, train/loss_step=0.00418, global_step=414.0]
Epoch 0:  73%|███████▎  | 4346/5971 [40:34<15:09,  1.79it/s, loss=0.0439, v_num=0, train/loss_simple_step=0.00418, train/loss_vlb_step=2.23e-5, train/loss_step=0.00418, global_step=414.0]
Epoch 0:  73%|███████▎  | 4346/5971 [40:34<15:09,  1.79it/s, loss=0.0493, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.00046, train/loss_step=0.140, global_step=414.0]    
Epoch 0:  73%|███████▎  | 4347/5971 [40:35<15:09,  1.79it/s, loss=0.0493, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.00046, train/loss_step=0.140, global_step=414.0]
Epoch 0:  73%|███████▎  | 4347/5971 [40:35<15:09,  1.79it/s, loss=0.0442, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.59e-5, train/loss_step=0.0154, global_step=414.0]
Epoch 0:  73%|███████▎  | 4348/5971 [40:37<15:09,  1.78it/s, loss=0.0442, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.59e-5, train/loss_step=0.0154, global_step=414.0]
Epoch 0:  73%|███████▎  | 4348/5971 [40:37<15:09,  1.78it/s, loss=0.0406, v_num=0, train/loss_simple_step=0.0441, train/loss_vlb_step=0.000163, train/loss_step=0.0441, global_step=414.0]
Epoch 0:  73%|███████▎  | 4349/5971 [40:38<15:09,  1.78it/s, loss=0.0406, v_num=0, train/loss_simple_step=0.0441, train/loss_vlb_step=0.000163, train/loss_step=0.0441, global_step=414.0]
Epoch 0:  73%|███████▎  | 4349/5971 [40:38<15:09,  1.78it/s, loss=0.0315, v_num=0, train/loss_simple_step=0.0225, train/loss_vlb_step=8.93e-5, train/loss_step=0.0225, global_step=415.0] 
Epoch 0:  73%|███████▎  | 4350/5971 [40:39<15:08,  1.78it/s, loss=0.0315, v_num=0, train/loss_simple_step=0.0225, train/loss_vlb_step=8.93e-5, train/loss_step=0.0225, global_step=415.0]
Epoch 0:  73%|███████▎  | 4350/5971 [40:39<15:08,  1.78it/s, loss=0.0357, v_num=0, train/loss_simple_step=0.0948, train/loss_vlb_step=0.000312, train/loss_step=0.0948, global_step=415.0]
Epoch 0:  73%|███████▎  | 4351/5971 [40:40<15:08,  1.78it/s, loss=0.0357, v_num=0, train/loss_simple_step=0.0948, train/loss_vlb_step=0.000312, train/loss_step=0.0948, global_step=415.0]
Epoch 0:  73%|███████▎  | 4351/5971 [40:40<15:08,  1.78it/s, loss=0.0354, v_num=0, train/loss_simple_step=0.0106, train/loss_vlb_step=4.82e-5, train/loss_step=0.0106, global_step=415.0] 
Epoch 0:  73%|███████▎  | 4352/5971 [40:42<15:08,  1.78it/s, loss=0.0354, v_num=0, train/loss_simple_step=0.0106, train/loss_vlb_step=4.82e-5, train/loss_step=0.0106, global_step=415.0]
Epoch 0:  73%|███████▎  | 4352/5971 [40:42<15:08,  1.78it/s, loss=0.0355, v_num=0, train/loss_simple_step=0.00457, train/loss_vlb_step=2.42e-5, train/loss_step=0.00457, global_step=415.0]
Epoch 0:  73%|███████▎  | 4353/5971 [40:43<15:07,  1.78it/s, loss=0.0355, v_num=0, train/loss_simple_step=0.00457, train/loss_vlb_step=2.42e-5, train/loss_step=0.00457, global_step=415.0]
Epoch 0:  73%|███████▎  | 4353/5971 [40:43<15:07,  1.78it/s, loss=0.0352, v_num=0, train/loss_simple_step=0.00196, train/loss_vlb_step=1.19e-5, train/loss_step=0.00196, global_step=416.0]
Epoch 0:  73%|███████▎  | 4354/5971 [40:44<15:07,  1.78it/s, loss=0.0352, v_num=0, train/loss_simple_step=0.00196, train/loss_vlb_step=1.19e-5, train/loss_step=0.00196, global_step=416.0]
Epoch 0:  73%|███████▎  | 4354/5971 [40:44<15:07,  1.78it/s, loss=0.0648, v_num=0, train/loss_simple_step=0.595, train/loss_vlb_step=0.00554, train/loss_step=0.595, global_step=416.0]    
Epoch 0:  73%|███████▎  | 4355/5971 [40:45<15:07,  1.78it/s, loss=0.0648, v_num=0, train/loss_simple_step=0.595, train/loss_vlb_step=0.00554, train/loss_step=0.595, global_step=416.0]
Epoch 0:  73%|███████▎  | 4355/5971 [40:45<15:07,  1.78it/s, loss=0.0647, v_num=0, train/loss_simple_step=0.0206, train/loss_vlb_step=8.1e-5, train/loss_step=0.0206, global_step=416.0]
Epoch 0:  73%|███████▎  | 4356/5971 [40:47<15:07,  1.78it/s, loss=0.0647, v_num=0, train/loss_simple_step=0.0206, train/loss_vlb_step=8.1e-5, train/loss_step=0.0206, global_step=416.0]
Epoch 0:  73%|███████▎  | 4356/5971 [40:47<15:07,  1.78it/s, loss=0.0836, v_num=0, train/loss_simple_step=0.417, train/loss_vlb_step=0.00224, train/loss_step=0.417, global_step=416.0] 
Epoch 0:  73%|███████▎  | 4357/5971 [40:48<15:06,  1.78it/s, loss=0.0836, v_num=0, train/loss_simple_step=0.417, train/loss_vlb_step=0.00224, train/loss_step=0.417, global_step=416.0]
Epoch 0:  73%|███████▎  | 4357/5971 [40:48<15:06,  1.78it/s, loss=0.0954, v_num=0, train/loss_simple_step=0.379, train/loss_vlb_step=0.00181, train/loss_step=0.379, global_step=417.0]
Epoch 0:  73%|███████▎  | 4358/5971 [40:49<15:06,  1.78it/s, loss=0.0954, v_num=0, train/loss_simple_step=0.379, train/loss_vlb_step=0.00181, train/loss_step=0.379, global_step=417.0]
Epoch 0:  73%|███████▎  | 4358/5971 [40:49<15:06,  1.78it/s, loss=0.101, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000483, train/loss_step=0.136, global_step=417.0]
Epoch 0:  73%|███████▎  | 4359/5971 [40:50<15:05,  1.78it/s, loss=0.101, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000483, train/loss_step=0.136, global_step=417.0]
Epoch 0:  73%|███████▎  | 4359/5971 [40:50<15:05,  1.78it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0443, train/loss_vlb_step=0.000151, train/loss_step=0.0443, global_step=417.0]
Epoch 0:  73%|███████▎  | 4360/5971 [40:52<15:05,  1.78it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0443, train/loss_vlb_step=0.000151, train/loss_step=0.0443, global_step=417.0]
Epoch 0:  73%|███████▎  | 4360/5971 [40:52<15:05,  1.78it/s, loss=0.11, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000639, train/loss_step=0.190, global_step=417.0]   
Epoch 0:  73%|███████▎  | 4361/5971 [40:53<15:05,  1.78it/s, loss=0.11, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000639, train/loss_step=0.190, global_step=417.0]
Epoch 0:  73%|███████▎  | 4361/5971 [40:53<15:05,  1.78it/s, loss=0.123, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.00107, train/loss_step=0.274, global_step=418.0]
Epoch 0:  73%|███████▎  | 4362/5971 [40:54<15:05,  1.78it/s, loss=0.123, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.00107, train/loss_step=0.274, global_step=418.0]
Epoch 0:  73%|███████▎  | 4362/5971 [40:54<15:05,  1.78it/s, loss=0.136, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.00103, train/loss_step=0.266, global_step=418.0]
Epoch 0:  73%|███████▎  | 4363/5971 [40:55<15:04,  1.78it/s, loss=0.136, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.00103, train/loss_step=0.266, global_step=418.0]
Epoch 0:  73%|███████▎  | 4363/5971 [40:55<15:04,  1.78it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.26e-5, train/loss_step=0.0139, global_step=418.0]
Epoch 0:  73%|███████▎  | 4364/5971 [40:57<15:04,  1.78it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.26e-5, train/loss_step=0.0139, global_step=418.0]
Epoch 0:  73%|███████▎  | 4364/5971 [40:57<15:04,  1.78it/s, loss=0.139, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000346, train/loss_step=0.105, global_step=418.0] 
Epoch 0:  73%|███████▎  | 4365/5971 [40:58<15:04,  1.78it/s, loss=0.139, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000346, train/loss_step=0.105, global_step=418.0]
Epoch 0:  73%|███████▎  | 4365/5971 [40:58<15:04,  1.78it/s, loss=0.17, v_num=0, train/loss_simple_step=0.621, train/loss_vlb_step=0.0099, train/loss_step=0.621, global_step=419.0]   
Epoch 0:  73%|███████▎  | 4366/5971 [40:58<15:03,  1.78it/s, loss=0.17, v_num=0, train/loss_simple_step=0.621, train/loss_vlb_step=0.0099, train/loss_step=0.621, global_step=419.0]
Epoch 0:  73%|███████▎  | 4366/5971 [40:58<15:03,  1.78it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00474, train/loss_vlb_step=2.52e-5, train/loss_step=0.00474, global_step=419.0]
Epoch 0:  73%|███████▎  | 4367/5971 [40:59<15:03,  1.78it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00474, train/loss_vlb_step=2.52e-5, train/loss_step=0.00474, global_step=419.0]
Epoch 0:  73%|███████▎  | 4367/5971 [40:59<15:03,  1.78it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00784, train/loss_vlb_step=3.79e-5, train/loss_step=0.00784, global_step=419.0]
Epoch 0:  73%|███████▎  | 4368/5971 [41:02<15:03,  1.77it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00784, train/loss_vlb_step=3.79e-5, train/loss_step=0.00784, global_step=419.0]
Epoch 0:  73%|███████▎  | 4368/5971 [41:02<15:03,  1.77it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.84e-5, train/loss_step=0.00347, global_step=419.0]
Epoch 0:  73%|███████▎  | 4369/5971 [41:03<15:02,  1.77it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.84e-5, train/loss_step=0.00347, global_step=419.0]
Epoch 0:  73%|███████▎  | 4369/5971 [41:03<15:02,  1.77it/s, loss=0.165, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.00038, train/loss_step=0.115, global_step=420.0]    
Epoch 0:  73%|███████▎  | 4370/5971 [41:03<15:02,  1.77it/s, loss=0.165, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.00038, train/loss_step=0.115, global_step=420.0]
Epoch 0:  73%|███████▎  | 4370/5971 [41:03<15:02,  1.77it/s, loss=0.166, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000337, train/loss_step=0.102, global_step=420.0]
Epoch 0:  73%|███████▎  | 4371/5971 [41:04<15:02,  1.77it/s, loss=0.166, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000337, train/loss_step=0.102, global_step=420.0]
Epoch 0:  73%|███████▎  | 4371/5971 [41:04<15:02,  1.77it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00785, train/loss_vlb_step=3.95e-5, train/loss_step=0.00785, global_step=420.0]
Epoch 0:  73%|███████▎  | 4372/5971 [41:06<15:02,  1.77it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00785, train/loss_vlb_step=3.95e-5, train/loss_step=0.00785, global_step=420.0]
Epoch 0:  73%|███████▎  | 4372/5971 [41:06<15:02,  1.77it/s, loss=0.174, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000591, train/loss_step=0.174, global_step=420.0]   
Epoch 0:  73%|███████▎  | 4373/5971 [41:07<15:01,  1.77it/s, loss=0.174, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000591, train/loss_step=0.174, global_step=420.0]
Epoch 0:  73%|███████▎  | 4373/5971 [41:07<15:01,  1.77it/s, loss=0.212, v_num=0, train/loss_simple_step=0.761, train/loss_vlb_step=0.0648, train/loss_step=0.761, global_step=421.0]  
Epoch 0:  73%|███████▎  | 4374/5971 [41:08<15:01,  1.77it/s, loss=0.212, v_num=0, train/loss_simple_step=0.761, train/loss_vlb_step=0.0648, train/loss_step=0.761, global_step=421.0]
Epoch 0:  73%|███████▎  | 4374/5971 [41:08<15:01,  1.77it/s, loss=0.182, v_num=0, train/loss_simple_step=0.00227, train/loss_vlb_step=1.28e-5, train/loss_step=0.00227, global_step=421.0]
Epoch 0:  73%|███████▎  | 4375/5971 [41:09<15:00,  1.77it/s, loss=0.182, v_num=0, train/loss_simple_step=0.00227, train/loss_vlb_step=1.28e-5, train/loss_step=0.00227, global_step=421.0]
Epoch 0:  73%|███████▎  | 4375/5971 [41:09<15:00,  1.77it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.61e-5, train/loss_step=0.0101, global_step=421.0]  
Epoch 0:  73%|███████▎  | 4376/5971 [41:12<15:00,  1.77it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.61e-5, train/loss_step=0.0101, global_step=421.0]
Epoch 0:  73%|███████▎  | 4376/5971 [41:12<15:00,  1.77it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00994, train/loss_vlb_step=4.36e-5, train/loss_step=0.00994, global_step=421.0]
Epoch 0:  73%|███████▎  | 4377/5971 [41:12<15:00,  1.77it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00994, train/loss_vlb_step=4.36e-5, train/loss_step=0.00994, global_step=421.0]
Epoch 0:  73%|███████▎  | 4377/5971 [41:12<15:00,  1.77it/s, loss=0.167, v_num=0, train/loss_simple_step=0.499, train/loss_vlb_step=0.00324, train/loss_step=0.499, global_step=422.0]    
Epoch 0:  73%|███████▎  | 4378/5971 [41:13<14:59,  1.77it/s, loss=0.167, v_num=0, train/loss_simple_step=0.499, train/loss_vlb_step=0.00324, train/loss_step=0.499, global_step=422.0]
Epoch 0:  73%|███████▎  | 4378/5971 [41:13<14:59,  1.77it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0107, train/loss_vlb_step=5.19e-5, train/loss_step=0.0107, global_step=422.0]
Epoch 0:  73%|███████▎  | 4379/5971 [41:14<14:59,  1.77it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0107, train/loss_vlb_step=5.19e-5, train/loss_step=0.0107, global_step=422.0]
Epoch 0:  73%|███████▎  | 4379/5971 [41:14<14:59,  1.77it/s, loss=0.176, v_num=0, train/loss_simple_step=0.350, train/loss_vlb_step=0.00169, train/loss_step=0.350, global_step=422.0]  
Epoch 0:  73%|███████▎  | 4380/5971 [41:16<14:59,  1.77it/s, loss=0.176, v_num=0, train/loss_simple_step=0.350, train/loss_vlb_step=0.00169, train/loss_step=0.350, global_step=422.0]
Epoch 0:  73%|███████▎  | 4380/5971 [41:16<14:59,  1.77it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00302, train/loss_vlb_step=1.68e-5, train/loss_step=0.00302, global_step=422.0]
Epoch 0:  73%|███████▎  | 4381/5971 [41:17<14:59,  1.77it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00302, train/loss_vlb_step=1.68e-5, train/loss_step=0.00302, global_step=422.0]
Epoch 0:  73%|███████▎  | 4381/5971 [41:17<14:59,  1.77it/s, loss=0.168, v_num=0, train/loss_simple_step=0.299, train/loss_vlb_step=0.00136, train/loss_step=0.299, global_step=423.0]    
Epoch 0:  73%|███████▎  | 4382/5971 [41:18<14:58,  1.77it/s, loss=0.168, v_num=0, train/loss_simple_step=0.299, train/loss_vlb_step=0.00136, train/loss_step=0.299, global_step=423.0]
Epoch 0:  73%|███████▎  | 4382/5971 [41:18<14:58,  1.77it/s, loss=0.162, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000468, train/loss_step=0.137, global_step=423.0]
Epoch 0:  73%|███████▎  | 4383/5971 [41:19<14:58,  1.77it/s, loss=0.162, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000468, train/loss_step=0.137, global_step=423.0]
Epoch 0:  73%|███████▎  | 4383/5971 [41:19<14:58,  1.77it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00788, train/loss_vlb_step=3.86e-5, train/loss_step=0.00788, global_step=423.0]
Epoch 0:  73%|███████▎  | 4384/5971 [41:21<14:58,  1.77it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00788, train/loss_vlb_step=3.86e-5, train/loss_step=0.00788, global_step=423.0]
Epoch 0:  73%|███████▎  | 4384/5971 [41:21<14:58,  1.77it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0158, train/loss_vlb_step=6.53e-5, train/loss_step=0.0158, global_step=423.0]  
Epoch 0:  73%|███████▎  | 4385/5971 [41:22<14:57,  1.77it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0158, train/loss_vlb_step=6.53e-5, train/loss_step=0.0158, global_step=423.0]
Epoch 0:  73%|███████▎  | 4385/5971 [41:22<14:57,  1.77it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00416, train/loss_vlb_step=2.13e-5, train/loss_step=0.00416, global_step=424.0]
Epoch 0:  73%|███████▎  | 4386/5971 [41:23<14:57,  1.77it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00416, train/loss_vlb_step=2.13e-5, train/loss_step=0.00416, global_step=424.0]
Epoch 0:  73%|███████▎  | 4386/5971 [41:23<14:57,  1.77it/s, loss=0.127, v_num=0, train/loss_simple_step=0.030, train/loss_vlb_step=0.000107, train/loss_step=0.030, global_step=424.0]   
Epoch 0:  73%|███████▎  | 4387/5971 [41:24<14:56,  1.77it/s, loss=0.127, v_num=0, train/loss_simple_step=0.030, train/loss_vlb_step=0.000107, train/loss_step=0.030, global_step=424.0]
Epoch 0:  73%|███████▎  | 4387/5971 [41:24<14:56,  1.77it/s, loss=0.153, v_num=0, train/loss_simple_step=0.525, train/loss_vlb_step=0.00726, train/loss_step=0.525, global_step=424.0] 
Epoch 0:  73%|███████▎  | 4388/5971 [41:26<14:56,  1.77it/s, loss=0.153, v_num=0, train/loss_simple_step=0.525, train/loss_vlb_step=0.00726, train/loss_step=0.525, global_step=424.0]
Epoch 0:  73%|███████▎  | 4388/5971 [41:26<14:56,  1.77it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:11,  2.31it/s]
Epoch 0:  74%|███████▎  | 4390/5971 [41:26<14:55,  1.77it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:   1%|          | 2/167 [00:00<00:44,  3.69it/s]
Epoch 0:  74%|███████▎  | 4392/5971 [41:27<14:53,  1.77it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:   3%|▎         | 5/167 [00:00<00:16,  9.70it/s]
Epoch 0:  74%|███████▎  | 4395/5971 [41:27<14:51,  1.77it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.96it/s]
Epoch 0:  74%|███████▎  | 4398/5971 [41:27<14:49,  1.77it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:   7%|▋         | 11/167 [00:00<00:08, 17.37it/s]
Epoch 0:  74%|███████▎  | 4401/5971 [41:27<14:47,  1.77it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:   8%|▊         | 14/167 [00:01<00:07, 20.11it/s]
Epoch 0:  74%|███████▍  | 4404/5971 [41:27<14:44,  1.77it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  11%|█         | 18/167 [00:01<00:06, 23.13it/s]
Epoch 0:  74%|███████▍  | 4407/5971 [41:27<14:42,  1.77it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 23.69it/s]
Epoch 0:  74%|███████▍  | 4410/5971 [41:27<14:40,  1.77it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  14%|█▍        | 24/167 [00:01<00:06, 23.34it/s][A
Epoch 0:  74%|███████▍  | 4413/5971 [41:27<14:38,  1.77it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 24.25it/s][A
Epoch 0:  74%|███████▍  | 4416/5971 [41:28<14:35,  1.78it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 23.52it/s][A
Epoch 0:  74%|███████▍  | 4419/5971 [41:28<14:33,  1.78it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 24.43it/s][A
Epoch 0:  74%|███████▍  | 4422/5971 [41:28<14:31,  1.78it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  22%|██▏       | 36/167 [00:01<00:05, 25.42it/s][A
Epoch 0:  74%|███████▍  | 4425/5971 [41:28<14:29,  1.78it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  23%|██▎       | 39/167 [00:02<00:04, 25.62it/s][A
Epoch 0:  74%|███████▍  | 4428/5971 [41:28<14:26,  1.78it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  25%|██▌       | 42/167 [00:02<00:05, 24.28it/s][A
Epoch 0:  74%|███████▍  | 4431/5971 [41:28<14:24,  1.78it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  27%|██▋       | 45/167 [00:02<00:05, 23.78it/s][A
Epoch 0:  74%|███████▍  | 4434/5971 [41:28<14:22,  1.78it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 24.88it/s][A
Epoch 0:  74%|███████▍  | 4437/5971 [41:28<14:20,  1.78it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  31%|███       | 51/167 [00:02<00:04, 25.87it/s][A
Epoch 0:  74%|███████▍  | 4440/5971 [41:29<14:18,  1.78it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 26.12it/s][A
Epoch 0:  74%|███████▍  | 4443/5971 [41:29<14:15,  1.79it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 25.86it/s][A
Epoch 0:  74%|███████▍  | 4446/5971 [41:29<14:13,  1.79it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  36%|███▌      | 60/167 [00:02<00:04, 26.54it/s][A
Epoch 0:  75%|███████▍  | 4449/5971 [41:29<14:11,  1.79it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  38%|███▊      | 64/167 [00:02<00:03, 27.75it/s][A
Epoch 0:  75%|███████▍  | 4453/5971 [41:29<14:08,  1.79it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  40%|████      | 67/167 [00:03<00:03, 27.81it/s][A
Epoch 0:  75%|███████▍  | 4457/5971 [41:29<14:05,  1.79it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 27.64it/s][A
Epoch 0:  75%|███████▍  | 4461/5971 [41:29<14:02,  1.79it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 27.42it/s][A
Epoch 0:  75%|███████▍  | 4465/5971 [41:29<13:59,  1.79it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 28.42it/s][A
Epoch 0:  75%|███████▍  | 4469/5971 [41:30<13:56,  1.80it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 27.91it/s][A

Validating:  50%|█████     | 84/167 [00:03<00:02, 28.41it/s][A
Epoch 0:  75%|███████▍  | 4473/5971 [41:30<13:53,  1.80it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  52%|█████▏    | 87/167 [00:03<00:02, 28.68it/s][A
Epoch 0:  75%|███████▍  | 4477/5971 [41:30<13:50,  1.80it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  54%|█████▍    | 90/167 [00:03<00:02, 27.92it/s][A
Epoch 0:  75%|███████▌  | 4481/5971 [41:30<13:47,  1.80it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 26.58it/s][A

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 26.35it/s][A
Epoch 0:  75%|███████▌  | 4485/5971 [41:30<13:45,  1.80it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 27.02it/s][A
Epoch 0:  75%|███████▌  | 4489/5971 [41:30<13:42,  1.80it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  61%|██████    | 102/167 [00:04<00:02, 26.29it/s][A
Epoch 0:  75%|███████▌  | 4493/5971 [41:30<13:39,  1.80it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 26.72it/s][A

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 26.94it/s][A
Epoch 0:  75%|███████▌  | 4497/5971 [41:31<13:36,  1.81it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 26.76it/s][A
Epoch 0:  75%|███████▌  | 4501/5971 [41:31<13:33,  1.81it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  68%|██████▊   | 114/167 [00:04<00:01, 27.26it/s][A
Epoch 0:  75%|███████▌  | 4505/5971 [41:31<13:30,  1.81it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  70%|███████   | 117/167 [00:04<00:01, 27.16it/s][A

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 27.35it/s][A
Epoch 0:  76%|███████▌  | 4509/5971 [41:31<13:27,  1.81it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 25.59it/s][A
Epoch 0:  76%|███████▌  | 4513/5971 [41:31<13:24,  1.81it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 25.39it/s][A
Epoch 0:  76%|███████▌  | 4517/5971 [41:31<13:21,  1.81it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 26.57it/s][A

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 17.51it/s][A
Epoch 0:  76%|███████▌  | 4521/5971 [41:32<13:19,  1.81it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  81%|████████  | 135/167 [00:05<00:01, 19.61it/s][A
Epoch 0:  76%|███████▌  | 4525/5971 [41:32<13:16,  1.82it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 20.29it/s][A
Epoch 0:  76%|███████▌  | 4529/5971 [41:32<13:13,  1.82it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  84%|████████▍ | 141/167 [00:06<00:01, 21.86it/s][A

Validating:  86%|████████▌ | 144/167 [00:06<00:01, 20.59it/s][A
Epoch 0:  76%|███████▌  | 4533/5971 [41:32<13:10,  1.82it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  88%|████████▊ | 147/167 [00:06<00:01, 19.21it/s][A
Epoch 0:  76%|███████▌  | 4537/5971 [41:33<13:07,  1.82it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 17.53it/s][A
Epoch 0:  76%|███████▌  | 4541/5971 [41:33<13:04,  1.82it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 19.50it/s][A

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 20.40it/s][A
Epoch 0:  76%|███████▌  | 4545/5971 [41:33<13:02,  1.82it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 21.88it/s][A
Epoch 0:  76%|███████▌  | 4549/5971 [41:33<12:59,  1.82it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  97%|█████████▋| 162/167 [00:07<00:00, 22.48it/s][A
Epoch 0:  76%|███████▋  | 4553/5971 [41:33<12:56,  1.83it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]

Validating:  99%|█████████▉| 165/167 [00:07<00:00, 22.83it/s][A
Epoch 0:  76%|███████▋  | 4556/5971 [41:34<12:54,  1.83it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]
Epoch 0:  76%|███████▋  | 4557/5971 [41:35<12:54,  1.83it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.82e-5, train/loss_step=0.0272, global_step=424.0]
Epoch 0:  76%|███████▋  | 4557/5971 [41:35<12:54,  1.83it/s, loss=0.157, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.000635, train/loss_step=0.171, global_step=425.0] 
Epoch 0:  76%|███████▋  | 4558/5971 [41:36<12:53,  1.83it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0407, train/loss_vlb_step=0.000148, train/loss_step=0.0407, global_step=425.0]
Epoch 0:  76%|███████▋  | 4559/5971 [41:36<12:53,  1.83it/s, loss=0.173, v_num=0, train/loss_simple_step=0.380, train/loss_vlb_step=0.00205, train/loss_step=0.380, global_step=425.0]   
Epoch 0:  76%|███████▋  | 4560/5971 [41:39<12:53,  1.83it/s, loss=0.172, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000504, train/loss_step=0.149, global_step=425.0]
Epoch 0:  76%|███████▋  | 4561/5971 [41:39<12:52,  1.82it/s, loss=0.172, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000504, train/loss_step=0.149, global_step=425.0]
Epoch 0:  76%|███████▋  | 4561/5971 [41:39<12:52,  1.82it/s, loss=0.142, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000606, train/loss_step=0.172, global_step=426.0]
Epoch 0:  76%|███████▋  | 4562/5971 [41:40<12:52,  1.82it/s, loss=0.179, v_num=0, train/loss_simple_step=0.740, train/loss_vlb_step=0.0259, train/loss_step=0.740, global_step=426.0]  
Epoch 0:  76%|███████▋  | 4563/5971 [41:41<12:51,  1.82it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0957, train/loss_vlb_step=0.000314, train/loss_step=0.0957, global_step=426.0]
Epoch 0:  76%|███████▋  | 4564/5971 [41:43<12:51,  1.82it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00722, train/loss_vlb_step=3.6e-5, train/loss_step=0.00722, global_step=426.0]
Epoch 0:  76%|███████▋  | 4565/5971 [41:44<12:51,  1.82it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00722, train/loss_vlb_step=3.6e-5, train/loss_step=0.00722, global_step=426.0]
Epoch 0:  76%|███████▋  | 4565/5971 [41:44<12:51,  1.82it/s, loss=0.166, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000525, train/loss_step=0.153, global_step=427.0]  
Epoch 0:  76%|███████▋  | 4566/5971 [41:45<12:50,  1.82it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0995, train/loss_vlb_step=0.000327, train/loss_step=0.0995, global_step=427.0]
Epoch 0:  76%|███████▋  | 4567/5971 [41:46<12:50,  1.82it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.43e-6, train/loss_step=0.00156, global_step=427.0]
Epoch 0:  77%|███████▋  | 4568/5971 [41:48<12:50,  1.82it/s, loss=0.154, v_num=0, train/loss_simple_step=0.025, train/loss_vlb_step=9.84e-5, train/loss_step=0.025, global_step=427.0]    
Epoch 0:  77%|███████▋  | 4569/5971 [41:49<12:49,  1.82it/s, loss=0.154, v_num=0, train/loss_simple_step=0.025, train/loss_vlb_step=9.84e-5, train/loss_step=0.025, global_step=427.0]
Epoch 0:  77%|███████▋  | 4569/5971 [41:49<12:49,  1.82it/s, loss=0.157, v_num=0, train/loss_simple_step=0.350, train/loss_vlb_step=0.00155, train/loss_step=0.350, global_step=428.0]
Epoch 0:  77%|███████▋  | 4570/5971 [41:50<12:49,  1.82it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00158, train/loss_vlb_step=9.59e-6, train/loss_step=0.00158, global_step=428.0]
Epoch 0:  77%|███████▋  | 4571/5971 [41:51<12:48,  1.82it/s, loss=0.175, v_num=0, train/loss_simple_step=0.508, train/loss_vlb_step=0.00553, train/loss_step=0.508, global_step=428.0]   
Epoch 0:  77%|███████▋  | 4572/5971 [41:53<12:48,  1.82it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0894, train/loss_vlb_step=0.000296, train/loss_step=0.0894, global_step=428.0]
Epoch 0:  77%|███████▋  | 4573/5971 [41:54<12:48,  1.82it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0894, train/loss_vlb_step=0.000296, train/loss_step=0.0894, global_step=428.0]
Epoch 0:  77%|███████▋  | 4573/5971 [41:54<12:48,  1.82it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0302, train/loss_vlb_step=0.000115, train/loss_step=0.0302, global_step=429.0] 
Epoch 0:  77%|███████▋  | 4574/5971 [41:55<12:48,  1.82it/s, loss=0.194, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00139, train/loss_step=0.305, global_step=429.0]  
Epoch 0:  77%|███████▋  | 4575/5971 [41:56<12:47,  1.82it/s, loss=0.174, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000454, train/loss_step=0.137, global_step=429.0]
Epoch 0:  77%|███████▋  | 4576/5971 [41:58<12:47,  1.82it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00265, train/loss_vlb_step=1.54e-5, train/loss_step=0.00265, global_step=429.0]
Epoch 0:  77%|███████▋  | 4577/5971 [41:59<12:47,  1.82it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00265, train/loss_vlb_step=1.54e-5, train/loss_step=0.00265, global_step=429.0]
Epoch 0:  77%|███████▋  | 4577/5971 [41:59<12:47,  1.82it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00547, train/loss_vlb_step=2.87e-5, train/loss_step=0.00547, global_step=430.0]
Epoch 0:  77%|███████▋  | 4578/5971 [42:00<12:46,  1.82it/s, loss=0.177, v_num=0, train/loss_simple_step=0.291, train/loss_vlb_step=0.00116, train/loss_step=0.291, global_step=430.0]    
Epoch 0:  77%|███████▋  | 4579/5971 [42:00<12:46,  1.82it/s, loss=0.177, v_num=0, train/loss_simple_step=0.383, train/loss_vlb_step=0.00262, train/loss_step=0.383, global_step=430.0]
Epoch 0:  77%|███████▋  | 4580/5971 [42:03<12:46,  1.82it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000235, train/loss_step=0.0699, global_step=430.0]
Epoch 0:  77%|███████▋  | 4581/5971 [42:04<12:45,  1.82it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000235, train/loss_step=0.0699, global_step=430.0]
Epoch 0:  77%|███████▋  | 4581/5971 [42:04<12:45,  1.82it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0261, train/loss_vlb_step=0.000102, train/loss_step=0.0261, global_step=431.0]
Epoch 0:  77%|███████▋  | 4582/5971 [42:05<12:45,  1.81it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00636, train/loss_vlb_step=3.27e-5, train/loss_step=0.00636, global_step=431.0]
Epoch 0:  77%|███████▋  | 4583/5971 [42:06<12:44,  1.81it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00581, train/loss_vlb_step=2.94e-5, train/loss_step=0.00581, global_step=431.0]
Epoch 0:  77%|███████▋  | 4584/5971 [42:08<12:44,  1.81it/s, loss=0.134, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000623, train/loss_step=0.185, global_step=431.0]   
Epoch 0:  77%|███████▋  | 4585/5971 [42:09<12:44,  1.81it/s, loss=0.134, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000623, train/loss_step=0.185, global_step=431.0]
Epoch 0:  77%|███████▋  | 4585/5971 [42:09<12:44,  1.81it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0554, train/loss_vlb_step=0.000199, train/loss_step=0.0554, global_step=432.0]
Epoch 0:  77%|███████▋  | 4586/5971 [42:09<12:43,  1.81it/s, loss=0.138, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.00106, train/loss_step=0.274, global_step=432.0]   
Epoch 0:  77%|███████▋  | 4587/5971 [42:10<12:43,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.552, train/loss_vlb_step=0.00418, train/loss_step=0.552, global_step=432.0]
Epoch 0:  77%|███████▋  | 4588/5971 [42:13<12:43,  1.81it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00159, train/loss_vlb_step=9.47e-6, train/loss_step=0.00159, global_step=432.0]
Epoch 0:  77%|███████▋  | 4589/5971 [42:14<12:43,  1.81it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00159, train/loss_vlb_step=9.47e-6, train/loss_step=0.00159, global_step=432.0]
Epoch 0:  77%|███████▋  | 4589/5971 [42:14<12:43,  1.81it/s, loss=0.156, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000629, train/loss_step=0.185, global_step=433.0]   
Epoch 0:  77%|███████▋  | 4590/5971 [42:15<12:42,  1.81it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00141, train/loss_vlb_step=8.57e-6, train/loss_step=0.00141, global_step=433.0]
Epoch 0:  77%|███████▋  | 4591/5971 [42:16<12:42,  1.81it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0553, train/loss_vlb_step=0.000192, train/loss_step=0.0553, global_step=433.0] 
Epoch 0:  77%|███████▋  | 4592/5971 [42:18<12:42,  1.81it/s, loss=0.137, v_num=0, train/loss_simple_step=0.167, train/loss_vlb_step=0.000551, train/loss_step=0.167, global_step=433.0]  
Epoch 0:  77%|███████▋  | 4593/5971 [42:19<12:41,  1.81it/s, loss=0.137, v_num=0, train/loss_simple_step=0.167, train/loss_vlb_step=0.000551, train/loss_step=0.167, global_step=433.0]
Epoch 0:  77%|███████▋  | 4593/5971 [42:19<12:41,  1.81it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0361, train/loss_vlb_step=0.000136, train/loss_step=0.0361, global_step=434.0]
Epoch 0:  77%|███████▋  | 4594/5971 [42:20<12:41,  1.81it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00761, train/loss_vlb_step=3.59e-5, train/loss_step=0.00761, global_step=434.0]
Epoch 0:  77%|███████▋  | 4595/5971 [42:21<12:40,  1.81it/s, loss=0.126, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000816, train/loss_step=0.209, global_step=434.0]   
Epoch 0:  77%|███████▋  | 4596/5971 [42:23<12:40,  1.81it/s, loss=0.133, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000565, train/loss_step=0.153, global_step=434.0]
Epoch 0:  77%|███████▋  | 4597/5971 [42:24<12:40,  1.81it/s, loss=0.133, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000565, train/loss_step=0.153, global_step=434.0]
Epoch 0:  77%|███████▋  | 4597/5971 [42:24<12:40,  1.81it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00515, train/loss_vlb_step=2.68e-5, train/loss_step=0.00515, global_step=435.0]
Epoch 0:  77%|███████▋  | 4598/5971 [42:24<12:39,  1.81it/s, loss=0.12, v_num=0, train/loss_simple_step=0.030, train/loss_vlb_step=0.000112, train/loss_step=0.030, global_step=435.0]    
Epoch 0:  77%|███████▋  | 4599/5971 [42:25<12:39,  1.81it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0347, train/loss_vlb_step=0.000127, train/loss_step=0.0347, global_step=435.0]
Epoch 0:  77%|███████▋  | 4600/5971 [42:28<12:39,  1.81it/s, loss=0.114, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.00131, train/loss_step=0.288, global_step=435.0]   
Epoch 0:  77%|███████▋  | 4601/5971 [42:29<12:38,  1.81it/s, loss=0.114, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.00131, train/loss_step=0.288, global_step=435.0]
Epoch 0:  77%|███████▋  | 4601/5971 [42:29<12:38,  1.81it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00326, train/loss_vlb_step=1.77e-5, train/loss_step=0.00326, global_step=436.0]
Epoch 0:  77%|███████▋  | 4602/5971 [42:30<12:38,  1.80it/s, loss=0.132, v_num=0, train/loss_simple_step=0.384, train/loss_vlb_step=0.00198, train/loss_step=0.384, global_step=436.0]    
Epoch 0:  77%|███████▋  | 4603/5971 [42:31<12:37,  1.80it/s, loss=0.143, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00112, train/loss_step=0.234, global_step=436.0]
Epoch 0:  77%|███████▋  | 4604/5971 [42:33<12:37,  1.80it/s, loss=0.162, v_num=0, train/loss_simple_step=0.567, train/loss_vlb_step=0.0082, train/loss_step=0.567, global_step=436.0] 
Epoch 0:  77%|███████▋  | 4605/5971 [42:34<12:37,  1.80it/s, loss=0.162, v_num=0, train/loss_simple_step=0.567, train/loss_vlb_step=0.0082, train/loss_step=0.567, global_step=436.0]
Epoch 0:  77%|███████▋  | 4605/5971 [42:34<12:37,  1.80it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0728, train/loss_vlb_step=0.000245, train/loss_step=0.0728, global_step=437.0]
Epoch 0:  77%|███████▋  | 4606/5971 [42:34<12:36,  1.80it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0176, train/loss_vlb_step=7.76e-5, train/loss_step=0.0176, global_step=437.0]  
Epoch 0:  77%|███████▋  | 4607/5971 [42:35<12:36,  1.80it/s, loss=0.134, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000823, train/loss_step=0.228, global_step=437.0]
Epoch 0:  77%|███████▋  | 4608/5971 [42:37<12:36,  1.80it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.16e-5, train/loss_step=0.0138, global_step=437.0]
Epoch 0:  77%|███████▋  | 4609/5971 [42:38<12:35,  1.80it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.16e-5, train/loss_step=0.0138, global_step=437.0]
Epoch 0:  77%|███████▋  | 4609/5971 [42:38<12:35,  1.80it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0724, train/loss_vlb_step=0.000246, train/loss_step=0.0724, global_step=438.0]
Epoch 0:  77%|███████▋  | 4610/5971 [42:39<12:35,  1.80it/s, loss=0.158, v_num=0, train/loss_simple_step=0.574, train/loss_vlb_step=0.00731, train/loss_step=0.574, global_step=438.0]   
Epoch 0:  77%|███████▋  | 4611/5971 [42:40<12:35,  1.80it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0473, train/loss_vlb_step=0.00016, train/loss_step=0.0473, global_step=438.0]
Epoch 0:  77%|███████▋  | 4612/5971 [42:43<12:35,  1.80it/s, loss=0.168, v_num=0, train/loss_simple_step=0.387, train/loss_vlb_step=0.00185, train/loss_step=0.387, global_step=438.0]  
Epoch 0:  77%|███████▋  | 4613/5971 [42:43<12:34,  1.80it/s, loss=0.168, v_num=0, train/loss_simple_step=0.387, train/loss_vlb_step=0.00185, train/loss_step=0.387, global_step=438.0]
Epoch 0:  77%|███████▋  | 4613/5971 [42:43<12:34,  1.80it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0895, train/loss_vlb_step=0.000295, train/loss_step=0.0895, global_step=439.0]
Epoch 0:  77%|███████▋  | 4614/5971 [42:44<12:34,  1.80it/s, loss=0.183, v_num=0, train/loss_simple_step=0.258, train/loss_vlb_step=0.00106, train/loss_step=0.258, global_step=439.0]   
Epoch 0:  77%|███████▋  | 4615/5971 [42:45<12:33,  1.80it/s, loss=0.185, v_num=0, train/loss_simple_step=0.245, train/loss_vlb_step=0.000919, train/loss_step=0.245, global_step=439.0]
Epoch 0:  77%|███████▋  | 4616/5971 [42:48<12:33,  1.80it/s, loss=0.184, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000448, train/loss_step=0.136, global_step=439.0]
Epoch 0:  77%|███████▋  | 4617/5971 [42:49<12:33,  1.80it/s, loss=0.184, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000448, train/loss_step=0.136, global_step=439.0]
Epoch 0:  77%|███████▋  | 4617/5971 [42:49<12:33,  1.80it/s, loss=0.19, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000387, train/loss_step=0.118, global_step=440.0] 
Epoch 0:  77%|███████▋  | 4618/5971 [42:50<12:32,  1.80it/s, loss=0.189, v_num=0, train/loss_simple_step=0.00583, train/loss_vlb_step=2.94e-5, train/loss_step=0.00583, global_step=440.0]
Epoch 0:  77%|███████▋  | 4619/5971 [42:50<12:32,  1.80it/s, loss=0.188, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=9.22e-5, train/loss_step=0.021, global_step=440.0]    
Epoch 0:  77%|███████▋  | 4620/5971 [42:53<12:32,  1.80it/s, loss=0.187, v_num=0, train/loss_simple_step=0.265, train/loss_vlb_step=0.00129, train/loss_step=0.265, global_step=440.0]
Epoch 0:  77%|███████▋  | 4621/5971 [42:54<12:31,  1.80it/s, loss=0.187, v_num=0, train/loss_simple_step=0.265, train/loss_vlb_step=0.00129, train/loss_step=0.265, global_step=440.0]
Epoch 0:  77%|███████▋  | 4621/5971 [42:54<12:31,  1.80it/s, loss=0.196, v_num=0, train/loss_simple_step=0.178, train/loss_vlb_step=0.000623, train/loss_step=0.178, global_step=441.0]
Epoch 0:  77%|███████▋  | 4622/5971 [42:55<12:31,  1.80it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00984, train/loss_vlb_step=4.6e-5, train/loss_step=0.00984, global_step=441.0]
Epoch 0:  77%|███████▋  | 4623/5971 [42:55<12:30,  1.80it/s, loss=0.168, v_num=0, train/loss_simple_step=0.051, train/loss_vlb_step=0.000181, train/loss_step=0.051, global_step=441.0]  
Epoch 0:  77%|███████▋  | 4624/5971 [42:58<12:30,  1.79it/s, loss=0.147, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.00049, train/loss_step=0.149, global_step=441.0] 
Epoch 0:  77%|███████▋  | 4625/5971 [42:58<12:30,  1.79it/s, loss=0.147, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.00049, train/loss_step=0.149, global_step=441.0]
Epoch 0:  77%|███████▋  | 4625/5971 [42:58<12:30,  1.79it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0756, train/loss_vlb_step=0.000251, train/loss_step=0.0756, global_step=442.0]
Epoch 0:  77%|███████▋  | 4626/5971 [42:59<12:29,  1.79it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0319, train/loss_vlb_step=0.000122, train/loss_step=0.0319, global_step=442.0]
Epoch 0:  77%|███████▋  | 4627/5971 [43:00<12:29,  1.79it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00352, train/loss_vlb_step=1.94e-5, train/loss_step=0.00352, global_step=442.0]
Epoch 0:  78%|███████▊  | 4628/5971 [43:02<12:29,  1.79it/s, loss=0.143, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000463, train/loss_step=0.135, global_step=442.0]   
Epoch 0:  78%|███████▊  | 4629/5971 [43:03<12:28,  1.79it/s, loss=0.143, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000463, train/loss_step=0.135, global_step=442.0]
Epoch 0:  78%|███████▊  | 4629/5971 [43:03<12:28,  1.79it/s, loss=0.151, v_num=0, train/loss_simple_step=0.248, train/loss_vlb_step=0.000878, train/loss_step=0.248, global_step=443.0]
Epoch 0:  78%|███████▊  | 4630/5971 [43:04<12:28,  1.79it/s, loss=0.132, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000669, train/loss_step=0.193, global_step=443.0]
Epoch 0:  78%|███████▊  | 4631/5971 [43:05<12:27,  1.79it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00185, train/loss_vlb_step=1.13e-5, train/loss_step=0.00185, global_step=443.0]
Epoch 0:  78%|███████▊  | 4632/5971 [43:07<12:27,  1.79it/s, loss=0.127, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.00155, train/loss_step=0.322, global_step=443.0]   
Epoch 0:  78%|███████▊  | 4633/5971 [43:08<12:27,  1.79it/s, loss=0.127, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.00155, train/loss_step=0.322, global_step=443.0]
Epoch 0:  78%|███████▊  | 4633/5971 [43:08<12:27,  1.79it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0483, train/loss_vlb_step=0.000176, train/loss_step=0.0483, global_step=444.0]
Epoch 0:  78%|███████▊  | 4634/5971 [43:09<12:26,  1.79it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0148, train/loss_vlb_step=6.37e-5, train/loss_step=0.0148, global_step=444.0] 
Epoch 0:  78%|███████▊  | 4635/5971 [43:10<12:26,  1.79it/s, loss=0.105, v_num=0, train/loss_simple_step=0.088, train/loss_vlb_step=0.00029, train/loss_step=0.088, global_step=444.0]  
Epoch 0:  78%|███████▊  | 4636/5971 [43:12<12:26,  1.79it/s, loss=0.106, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000529, train/loss_step=0.157, global_step=444.0]
Epoch 0:  78%|███████▊  | 4637/5971 [43:13<12:25,  1.79it/s, loss=0.106, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000529, train/loss_step=0.157, global_step=444.0]
Epoch 0:  78%|███████▊  | 4637/5971 [43:13<12:25,  1.79it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=6.41e-5, train/loss_step=0.0156, global_step=445.0]
Epoch 0:  78%|███████▊  | 4638/5971 [43:14<12:25,  1.79it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.82e-5, train/loss_step=0.00565, global_step=445.0]
Epoch 0:  78%|███████▊  | 4639/5971 [43:15<12:25,  1.79it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=5.01e-5, train/loss_step=0.0105, global_step=445.0]    
Epoch 0:  78%|███████▊  | 4640/5971 [43:17<12:24,  1.79it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00165, train/loss_step=0.305, global_step=445.0]
Epoch 0:  78%|███████▊  | 4641/5971 [43:18<12:24,  1.79it/s, loss=0.102, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00165, train/loss_step=0.305, global_step=445.0]
Epoch 0:  78%|███████▊  | 4641/5971 [43:18<12:24,  1.79it/s, loss=0.0938, v_num=0, train/loss_simple_step=0.010, train/loss_vlb_step=4.57e-5, train/loss_step=0.010, global_step=446.0]
Epoch 0:  78%|███████▊  | 4642/5971 [43:19<12:24,  1.79it/s, loss=0.0936, v_num=0, train/loss_simple_step=0.00519, train/loss_vlb_step=2.66e-5, train/loss_step=0.00519, global_step=446.0]
Epoch 0:  78%|███████▊  | 4643/5971 [43:20<12:23,  1.79it/s, loss=0.105, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.00108, train/loss_step=0.289, global_step=446.0]     
Epoch 0:  78%|███████▊  | 4644/5971 [43:22<12:23,  1.78it/s, loss=0.0982, v_num=0, train/loss_simple_step=0.00388, train/loss_vlb_step=2.06e-5, train/loss_step=0.00388, global_step=446.0]
Epoch 0:  78%|███████▊  | 4645/5971 [43:23<12:23,  1.78it/s, loss=0.0982, v_num=0, train/loss_simple_step=0.00388, train/loss_vlb_step=2.06e-5, train/loss_step=0.00388, global_step=446.0]
Epoch 0:  78%|███████▊  | 4645/5971 [43:23<12:23,  1.78it/s, loss=0.103, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000556, train/loss_step=0.162, global_step=447.0]    
Epoch 0:  78%|███████▊  | 4646/5971 [43:24<12:22,  1.78it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00547, train/loss_vlb_step=2.77e-5, train/loss_step=0.00547, global_step=447.0]
Epoch 0:  78%|███████▊  | 4647/5971 [43:25<12:22,  1.78it/s, loss=0.116, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.00123, train/loss_step=0.289, global_step=447.0]    
Epoch 0:  78%|███████▊  | 4648/5971 [43:27<12:22,  1.78it/s, loss=0.129, v_num=0, train/loss_simple_step=0.404, train/loss_vlb_step=0.00245, train/loss_step=0.404, global_step=447.0]
Epoch 0:  78%|███████▊  | 4649/5971 [43:28<12:21,  1.78it/s, loss=0.129, v_num=0, train/loss_simple_step=0.404, train/loss_vlb_step=0.00245, train/loss_step=0.404, global_step=447.0]
Epoch 0:  78%|███████▊  | 4649/5971 [43:28<12:21,  1.78it/s, loss=0.142, v_num=0, train/loss_simple_step=0.501, train/loss_vlb_step=0.00351, train/loss_step=0.501, global_step=448.0]
Epoch 0:  78%|███████▊  | 4650/5971 [43:29<12:21,  1.78it/s, loss=0.136, v_num=0, train/loss_simple_step=0.073, train/loss_vlb_step=0.000243, train/loss_step=0.073, global_step=448.0]
Epoch 0:  78%|███████▊  | 4651/5971 [43:30<12:20,  1.78it/s, loss=0.149, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.00116, train/loss_step=0.278, global_step=448.0] 
Epoch 0:  78%|███████▊  | 4652/5971 [43:32<12:20,  1.78it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0434, train/loss_vlb_step=0.000157, train/loss_step=0.0434, global_step=448.0]
Epoch 0:  78%|███████▊  | 4653/5971 [43:33<12:20,  1.78it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0434, train/loss_vlb_step=0.000157, train/loss_step=0.0434, global_step=448.0]
Epoch 0:  78%|███████▊  | 4653/5971 [43:33<12:20,  1.78it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00555, train/loss_vlb_step=2.77e-5, train/loss_step=0.00555, global_step=449.0]
Epoch 0:  78%|███████▊  | 4654/5971 [43:34<12:19,  1.78it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=6.08e-5, train/loss_step=0.0142, global_step=449.0]  
Epoch 0:  78%|███████▊  | 4655/5971 [43:35<12:19,  1.78it/s, loss=0.139, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.000733, train/loss_step=0.206, global_step=449.0] 
Epoch 0:  78%|███████▊  | 4656/5971 [43:37<12:19,  1.78it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]
Epoch 0:  78%|███████▊  | 4657/5971 [43:37<12:18,  1.78it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:26,  1.91it/s][A

Validating:   1%|          | 2/167 [00:00<00:46,  3.58it/s][A
Epoch 0:  78%|███████▊  | 4661/5971 [43:37<12:15,  1.78it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.17it/s][A

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.74it/s][A
Epoch 0:  78%|███████▊  | 4665/5971 [43:38<12:12,  1.78it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.60it/s][A
Epoch 0:  78%|███████▊  | 4669/5971 [43:38<12:09,  1.78it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.72it/s][A
Epoch 0:  78%|███████▊  | 4673/5971 [43:38<12:07,  1.79it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  10%|█         | 17/167 [00:01<00:06, 21.97it/s][A

Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.92it/s][A
Epoch 0:  78%|███████▊  | 4677/5971 [43:38<12:04,  1.79it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 23.50it/s][A
Epoch 0:  78%|███████▊  | 4681/5971 [43:38<12:01,  1.79it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.22it/s][A
Epoch 0:  78%|███████▊  | 4685/5971 [43:38<11:58,  1.79it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 23.76it/s][A

Validating:  19%|█▉        | 32/167 [00:01<00:05, 23.64it/s][A
Epoch 0:  79%|███████▊  | 4689/5971 [43:39<11:55,  1.79it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  21%|██        | 35/167 [00:01<00:05, 24.56it/s][A
Epoch 0:  79%|███████▊  | 4693/5971 [43:39<11:53,  1.79it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  23%|██▎       | 38/167 [00:02<00:05, 24.71it/s][A
Epoch 0:  79%|███████▊  | 4697/5971 [43:39<11:50,  1.79it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  25%|██▍       | 41/167 [00:02<00:05, 24.72it/s][A

Validating:  26%|██▋       | 44/167 [00:02<00:04, 24.85it/s][A
Epoch 0:  79%|███████▊  | 4701/5971 [43:39<11:47,  1.79it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 25.07it/s][A
Epoch 0:  79%|███████▉  | 4705/5971 [43:39<11:44,  1.80it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 24.95it/s][A
Epoch 0:  79%|███████▉  | 4709/5971 [43:39<11:41,  1.80it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 26.18it/s][A

Validating:  34%|███▎      | 56/167 [00:02<00:04, 26.51it/s][A
Epoch 0:  79%|███████▉  | 4713/5971 [43:39<11:39,  1.80it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  35%|███▌      | 59/167 [00:02<00:03, 27.03it/s][A
Epoch 0:  79%|███████▉  | 4717/5971 [43:40<11:36,  1.80it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  37%|███▋      | 62/167 [00:02<00:03, 27.30it/s][A
Epoch 0:  79%|███████▉  | 4721/5971 [43:40<11:33,  1.80it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  40%|███▉      | 66/167 [00:03<00:03, 28.43it/s][A
Epoch 0:  79%|███████▉  | 4725/5971 [43:40<11:30,  1.80it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 28.15it/s][A

Validating:  43%|████▎     | 72/167 [00:03<00:03, 27.28it/s][A
Epoch 0:  79%|███████▉  | 4729/5971 [43:40<11:28,  1.80it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 27.36it/s][A
Epoch 0:  79%|███████▉  | 4733/5971 [43:40<11:25,  1.81it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 26.68it/s][A
Epoch 0:  79%|███████▉  | 4737/5971 [43:40<11:22,  1.81it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 26.55it/s][A

Validating:  50%|█████     | 84/167 [00:03<00:03, 27.24it/s][A
Epoch 0:  79%|███████▉  | 4741/5971 [43:41<11:19,  1.81it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  52%|█████▏    | 87/167 [00:03<00:02, 27.66it/s][A
Epoch 0:  79%|███████▉  | 4745/5971 [43:41<11:17,  1.81it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  54%|█████▍    | 90/167 [00:03<00:02, 26.09it/s][A
Epoch 0:  80%|███████▉  | 4749/5971 [43:41<11:14,  1.81it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 25.18it/s][A

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 25.35it/s][A
Epoch 0:  80%|███████▉  | 4753/5971 [43:41<11:11,  1.81it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 25.96it/s][A
Epoch 0:  80%|███████▉  | 4757/5971 [43:41<11:08,  1.81it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  61%|██████    | 102/167 [00:04<00:02, 25.71it/s][A
Epoch 0:  80%|███████▉  | 4761/5971 [43:41<11:06,  1.82it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 25.92it/s][A

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 26.45it/s][A
Epoch 0:  80%|███████▉  | 4765/5971 [43:41<11:03,  1.82it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 27.29it/s][A
Epoch 0:  80%|███████▉  | 4769/5971 [43:42<11:00,  1.82it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  68%|██████▊   | 114/167 [00:04<00:01, 27.48it/s][A
Epoch 0:  80%|███████▉  | 4773/5971 [43:42<10:58,  1.82it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  70%|███████   | 117/167 [00:05<00:01, 27.70it/s][A
Epoch 0:  80%|████████  | 4777/5971 [43:42<10:55,  1.82it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 27.99it/s][A

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 27.12it/s][A
Epoch 0:  80%|████████  | 4781/5971 [43:42<10:52,  1.82it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 27.05it/s][A
Epoch 0:  80%|████████  | 4785/5971 [43:42<10:49,  1.82it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 27.71it/s][A
Epoch 0:  80%|████████  | 4789/5971 [43:42<10:47,  1.83it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 26.07it/s][A

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 24.75it/s][A
Epoch 0:  80%|████████  | 4793/5971 [43:43<10:44,  1.83it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 25.14it/s][A
Epoch 0:  80%|████████  | 4797/5971 [43:43<10:41,  1.83it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  85%|████████▌ | 142/167 [00:05<00:01, 24.97it/s][A
Epoch 0:  80%|████████  | 4801/5971 [43:43<10:39,  1.83it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 26.22it/s][A
Epoch 0:  80%|████████  | 4805/5971 [43:43<10:36,  1.83it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 27.73it/s][A

Validating:  91%|█████████ | 152/167 [00:06<00:00, 27.22it/s][A
Epoch 0:  81%|████████  | 4809/5971 [43:43<10:33,  1.83it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 26.78it/s][A
Epoch 0:  81%|████████  | 4813/5971 [43:43<10:31,  1.83it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 27.01it/s][A
Epoch 0:  81%|████████  | 4817/5971 [43:43<10:28,  1.84it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 25.87it/s][A

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 25.72it/s][A
Epoch 0:  81%|████████  | 4821/5971 [43:44<10:25,  1.84it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Validating: 100%|██████████| 167/167 [00:06<00:00, 26.02it/s][A
Epoch 0:  81%|████████  | 4824/5971 [43:44<10:23,  1.84it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]

Epoch 0:  81%|████████  | 4825/5971 [43:45<10:23,  1.84it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=449.0]
Epoch 0:  81%|████████  | 4825/5971 [43:45<10:23,  1.84it/s, loss=0.163, v_num=0, train/loss_simple_step=0.633, train/loss_vlb_step=0.0363, train/loss_step=0.633, global_step=450.0]   
Epoch 0:  81%|████████  | 4826/5971 [43:46<10:22,  1.84it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0428, train/loss_vlb_step=0.00016, train/loss_step=0.0428, global_step=450.0]
Epoch 0:  81%|████████  | 4827/5971 [43:47<10:22,  1.84it/s, loss=0.199, v_num=0, train/loss_simple_step=0.694, train/loss_vlb_step=0.015, train/loss_step=0.694, global_step=450.0]    
Epoch 0:  81%|████████  | 4828/5971 [43:49<10:22,  1.84it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0248, train/loss_vlb_step=9.61e-5, train/loss_step=0.0248, global_step=450.0]
Epoch 0:  81%|████████  | 4829/5971 [43:50<10:21,  1.84it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0248, train/loss_vlb_step=9.61e-5, train/loss_step=0.0248, global_step=450.0]
Epoch 0:  81%|████████  | 4829/5971 [43:50<10:21,  1.84it/s, loss=0.185, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.18e-5, train/loss_step=0.004, global_step=451.0]  
Epoch 0:  81%|████████  | 4830/5971 [43:51<10:21,  1.84it/s, loss=0.202, v_num=0, train/loss_simple_step=0.354, train/loss_vlb_step=0.00142, train/loss_step=0.354, global_step=451.0]
Epoch 0:  81%|████████  | 4831/5971 [43:52<10:21,  1.84it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=5.03e-5, train/loss_step=0.0114, global_step=451.0]
Epoch 0:  81%|████████  | 4832/5971 [43:54<10:20,  1.83it/s, loss=0.226, v_num=0, train/loss_simple_step=0.760, train/loss_vlb_step=0.0266, train/loss_step=0.760, global_step=451.0]   
Epoch 0:  81%|████████  | 4833/5971 [43:55<10:20,  1.83it/s, loss=0.226, v_num=0, train/loss_simple_step=0.760, train/loss_vlb_step=0.0266, train/loss_step=0.760, global_step=451.0]
Epoch 0:  81%|████████  | 4833/5971 [43:55<10:20,  1.83it/s, loss=0.218, v_num=0, train/loss_simple_step=0.00203, train/loss_vlb_step=1.17e-5, train/loss_step=0.00203, global_step=452.0]
Epoch 0:  81%|████████  | 4834/5971 [43:56<10:19,  1.83it/s, loss=0.218, v_num=0, train/loss_simple_step=0.00154, train/loss_vlb_step=9.3e-6, train/loss_step=0.00154, global_step=452.0] 
Epoch 0:  81%|████████  | 4835/5971 [43:57<10:19,  1.83it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0099, train/loss_vlb_step=4.41e-5, train/loss_step=0.0099, global_step=452.0] 
Epoch 0:  81%|████████  | 4836/5971 [43:59<10:19,  1.83it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0204, train/loss_vlb_step=8.06e-5, train/loss_step=0.0204, global_step=452.0]
Epoch 0:  81%|████████  | 4837/5971 [44:00<10:18,  1.83it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0204, train/loss_vlb_step=8.06e-5, train/loss_step=0.0204, global_step=452.0]
Epoch 0:  81%|████████  | 4837/5971 [44:00<10:18,  1.83it/s, loss=0.188, v_num=0, train/loss_simple_step=0.569, train/loss_vlb_step=0.00842, train/loss_step=0.569, global_step=453.0]  
Epoch 0:  81%|████████  | 4838/5971 [44:01<10:18,  1.83it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0715, train/loss_vlb_step=0.000237, train/loss_step=0.0715, global_step=453.0]
Epoch 0:  81%|████████  | 4839/5971 [44:02<10:18,  1.83it/s, loss=0.18, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000371, train/loss_step=0.113, global_step=453.0]   
Epoch 0:  81%|████████  | 4840/5971 [44:04<10:17,  1.83it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00732, train/loss_vlb_step=3.7e-5, train/loss_step=0.00732, global_step=453.0]
Epoch 0:  81%|████████  | 4841/5971 [44:05<10:17,  1.83it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00732, train/loss_vlb_step=3.7e-5, train/loss_step=0.00732, global_step=453.0]
Epoch 0:  81%|████████  | 4841/5971 [44:05<10:17,  1.83it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.07e-5, train/loss_step=0.0141, global_step=454.0] 
Epoch 0:  81%|████████  | 4842/5971 [44:06<10:16,  1.83it/s, loss=0.201, v_num=0, train/loss_simple_step=0.475, train/loss_vlb_step=0.00443, train/loss_step=0.475, global_step=454.0]  
Epoch 0:  81%|████████  | 4843/5971 [44:07<10:16,  1.83it/s, loss=0.208, v_num=0, train/loss_simple_step=0.337, train/loss_vlb_step=0.0017, train/loss_step=0.337, global_step=454.0] 
Epoch 0:  81%|████████  | 4844/5971 [44:09<10:16,  1.83it/s, loss=0.208, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.58e-5, train/loss_step=0.016, global_step=454.0]
Epoch 0:  81%|████████  | 4845/5971 [44:10<10:15,  1.83it/s, loss=0.208, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.58e-5, train/loss_step=0.016, global_step=454.0]
Epoch 0:  81%|████████  | 4845/5971 [44:10<10:15,  1.83it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00538, train/loss_vlb_step=2.71e-5, train/loss_step=0.00538, global_step=455.0]
Epoch 0:  81%|████████  | 4846/5971 [44:11<10:15,  1.83it/s, loss=0.199, v_num=0, train/loss_simple_step=0.486, train/loss_vlb_step=0.00498, train/loss_step=0.486, global_step=455.0]    
Epoch 0:  81%|████████  | 4847/5971 [44:12<10:14,  1.83it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0205, train/loss_vlb_step=8.45e-5, train/loss_step=0.0205, global_step=455.0]
Epoch 0:  81%|████████  | 4848/5971 [44:14<10:14,  1.83it/s, loss=0.175, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000743, train/loss_step=0.216, global_step=455.0] 
Epoch 0:  81%|████████  | 4849/5971 [44:15<10:14,  1.83it/s, loss=0.175, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000743, train/loss_step=0.216, global_step=455.0]
Epoch 0:  81%|████████  | 4849/5971 [44:15<10:14,  1.83it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0803, train/loss_vlb_step=0.00028, train/loss_step=0.0803, global_step=456.0]
Epoch 0:  81%|████████  | 4850/5971 [44:16<10:13,  1.83it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00157, train/loss_vlb_step=9.58e-6, train/loss_step=0.00157, global_step=456.0]
Epoch 0:  81%|████████  | 4851/5971 [44:17<10:13,  1.83it/s, loss=0.167, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000437, train/loss_step=0.133, global_step=456.0]   
Epoch 0:  81%|████████▏ | 4852/5971 [44:19<10:13,  1.82it/s, loss=0.146, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00211, train/loss_step=0.345, global_step=456.0] 
Epoch 0:  81%|████████▏ | 4853/5971 [44:20<10:12,  1.82it/s, loss=0.146, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00211, train/loss_step=0.345, global_step=456.0]
Epoch 0:  81%|████████▏ | 4853/5971 [44:20<10:12,  1.82it/s, loss=0.182, v_num=0, train/loss_simple_step=0.721, train/loss_vlb_step=0.0313, train/loss_step=0.721, global_step=457.0] 
Epoch 0:  81%|████████▏ | 4854/5971 [44:21<10:12,  1.82it/s, loss=0.209, v_num=0, train/loss_simple_step=0.538, train/loss_vlb_step=0.00576, train/loss_step=0.538, global_step=457.0]
Epoch 0:  81%|████████▏ | 4855/5971 [44:22<10:11,  1.82it/s, loss=0.214, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000347, train/loss_step=0.104, global_step=457.0]
Epoch 0:  81%|████████▏ | 4856/5971 [44:24<10:11,  1.82it/s, loss=0.225, v_num=0, train/loss_simple_step=0.252, train/loss_vlb_step=0.000948, train/loss_step=0.252, global_step=457.0]
Epoch 0:  81%|████████▏ | 4857/5971 [44:25<10:11,  1.82it/s, loss=0.225, v_num=0, train/loss_simple_step=0.252, train/loss_vlb_step=0.000948, train/loss_step=0.252, global_step=457.0]
Epoch 0:  81%|████████▏ | 4857/5971 [44:25<10:11,  1.82it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000326, train/loss_step=0.0991, global_step=458.0]
Epoch 0:  81%|████████▏ | 4858/5971 [44:26<10:10,  1.82it/s, loss=0.223, v_num=0, train/loss_simple_step=0.495, train/loss_vlb_step=0.0041, train/loss_step=0.495, global_step=458.0]    
Epoch 0:  81%|████████▏ | 4859/5971 [44:27<10:10,  1.82it/s, loss=0.25, v_num=0, train/loss_simple_step=0.653, train/loss_vlb_step=0.0141, train/loss_step=0.653, global_step=458.0] 
Epoch 0:  81%|████████▏ | 4860/5971 [44:29<10:10,  1.82it/s, loss=0.251, v_num=0, train/loss_simple_step=0.0338, train/loss_vlb_step=0.000123, train/loss_step=0.0338, global_step=458.0]
Epoch 0:  81%|████████▏ | 4861/5971 [44:30<10:09,  1.82it/s, loss=0.251, v_num=0, train/loss_simple_step=0.0338, train/loss_vlb_step=0.000123, train/loss_step=0.0338, global_step=458.0]
Epoch 0:  81%|████████▏ | 4861/5971 [44:30<10:09,  1.82it/s, loss=0.267, v_num=0, train/loss_simple_step=0.339, train/loss_vlb_step=0.00147, train/loss_step=0.339, global_step=459.0]   
Epoch 0:  81%|████████▏ | 4862/5971 [44:31<10:09,  1.82it/s, loss=0.245, v_num=0, train/loss_simple_step=0.0203, train/loss_vlb_step=8.34e-5, train/loss_step=0.0203, global_step=459.0]
Epoch 0:  81%|████████▏ | 4863/5971 [44:32<10:08,  1.82it/s, loss=0.242, v_num=0, train/loss_simple_step=0.293, train/loss_vlb_step=0.00142, train/loss_step=0.293, global_step=459.0]  
Epoch 0:  81%|████████▏ | 4864/5971 [44:35<10:08,  1.82it/s, loss=0.242, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=5.69e-5, train/loss_step=0.0136, global_step=459.0]
Epoch 0:  81%|████████▏ | 4865/5971 [44:36<10:08,  1.82it/s, loss=0.242, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=5.69e-5, train/loss_step=0.0136, global_step=459.0]
Epoch 0:  81%|████████▏ | 4865/5971 [44:36<10:08,  1.82it/s, loss=0.244, v_num=0, train/loss_simple_step=0.0445, train/loss_vlb_step=0.000162, train/loss_step=0.0445, global_step=460.0]
Epoch 0:  81%|████████▏ | 4866/5971 [44:36<10:07,  1.82it/s, loss=0.228, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000494, train/loss_step=0.150, global_step=460.0]  
Epoch 0:  82%|████████▏ | 4867/5971 [44:37<10:07,  1.82it/s, loss=0.227, v_num=0, train/loss_simple_step=0.00233, train/loss_vlb_step=1.35e-5, train/loss_step=0.00233, global_step=460.0]
Epoch 0:  82%|████████▏ | 4868/5971 [44:39<10:07,  1.82it/s, loss=0.258, v_num=0, train/loss_simple_step=0.843, train/loss_vlb_step=0.0436, train/loss_step=0.843, global_step=460.0]     
Epoch 0:  82%|████████▏ | 4869/5971 [44:40<10:06,  1.82it/s, loss=0.258, v_num=0, train/loss_simple_step=0.843, train/loss_vlb_step=0.0436, train/loss_step=0.843, global_step=460.0]
Epoch 0:  82%|████████▏ | 4869/5971 [44:40<10:06,  1.82it/s, loss=0.262, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000574, train/loss_step=0.170, global_step=461.0]
Epoch 0:  82%|████████▏ | 4870/5971 [44:41<10:06,  1.82it/s, loss=0.272, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000635, train/loss_step=0.184, global_step=461.0]
Epoch 0:  82%|████████▏ | 4871/5971 [44:42<10:05,  1.82it/s, loss=0.267, v_num=0, train/loss_simple_step=0.0331, train/loss_vlb_step=0.000121, train/loss_step=0.0331, global_step=461.0]
Epoch 0:  82%|████████▏ | 4872/5971 [44:45<10:05,  1.81it/s, loss=0.264, v_num=0, train/loss_simple_step=0.290, train/loss_vlb_step=0.00127, train/loss_step=0.290, global_step=461.0]   
Epoch 0:  82%|████████▏ | 4873/5971 [44:45<10:05,  1.81it/s, loss=0.264, v_num=0, train/loss_simple_step=0.290, train/loss_vlb_step=0.00127, train/loss_step=0.290, global_step=461.0]
Epoch 0:  82%|████████▏ | 4873/5971 [44:45<10:05,  1.81it/s, loss=0.244, v_num=0, train/loss_simple_step=0.315, train/loss_vlb_step=0.00156, train/loss_step=0.315, global_step=462.0]
Epoch 0:  82%|████████▏ | 4874/5971 [44:46<10:04,  1.81it/s, loss=0.218, v_num=0, train/loss_simple_step=0.0197, train/loss_vlb_step=8.23e-5, train/loss_step=0.0197, global_step=462.0]
Epoch 0:  82%|████████▏ | 4875/5971 [44:47<10:04,  1.81it/s, loss=0.213, v_num=0, train/loss_simple_step=0.00322, train/loss_vlb_step=1.87e-5, train/loss_step=0.00322, global_step=462.0]
Epoch 0:  82%|████████▏ | 4876/5971 [44:50<10:03,  1.81it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0402, train/loss_vlb_step=0.000143, train/loss_step=0.0402, global_step=462.0] 
Epoch 0:  82%|████████▏ | 4877/5971 [44:50<10:03,  1.81it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0402, train/loss_vlb_step=0.000143, train/loss_step=0.0402, global_step=462.0]
Epoch 0:  82%|████████▏ | 4877/5971 [44:50<10:03,  1.81it/s, loss=0.225, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.00511, train/loss_step=0.563, global_step=463.0]   
Epoch 0:  82%|████████▏ | 4878/5971 [44:51<10:03,  1.81it/s, loss=0.212, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.0008, train/loss_step=0.225, global_step=463.0] 
Epoch 0:  82%|████████▏ | 4879/5971 [44:52<10:02,  1.81it/s, loss=0.189, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000666, train/loss_step=0.188, global_step=463.0]
Epoch 0:  82%|████████▏ | 4880/5971 [44:55<10:02,  1.81it/s, loss=0.189, v_num=0, train/loss_simple_step=0.038, train/loss_vlb_step=0.000134, train/loss_step=0.038, global_step=463.0]
Epoch 0:  82%|████████▏ | 4881/5971 [44:55<10:01,  1.81it/s, loss=0.189, v_num=0, train/loss_simple_step=0.038, train/loss_vlb_step=0.000134, train/loss_step=0.038, global_step=463.0]
Epoch 0:  82%|████████▏ | 4881/5971 [44:55<10:01,  1.81it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.37e-5, train/loss_step=0.0115, global_step=464.0]
Epoch 0:  82%|████████▏ | 4882/5971 [44:56<10:01,  1.81it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.00011, train/loss_step=0.0285, global_step=464.0]
Epoch 0:  82%|████████▏ | 4883/5971 [44:57<10:00,  1.81it/s, loss=0.179, v_num=0, train/loss_simple_step=0.425, train/loss_vlb_step=0.00281, train/loss_step=0.425, global_step=464.0]  
Epoch 0:  82%|████████▏ | 4884/5971 [45:00<10:00,  1.81it/s, loss=0.217, v_num=0, train/loss_simple_step=0.771, train/loss_vlb_step=0.0334, train/loss_step=0.771, global_step=464.0] 
Epoch 0:  82%|████████▏ | 4885/5971 [45:00<10:00,  1.81it/s, loss=0.217, v_num=0, train/loss_simple_step=0.771, train/loss_vlb_step=0.0334, train/loss_step=0.771, global_step=464.0]
Epoch 0:  82%|████████▏ | 4885/5971 [45:00<10:00,  1.81it/s, loss=0.227, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.000869, train/loss_step=0.235, global_step=465.0]
Epoch 0:  82%|████████▏ | 4886/5971 [45:01<09:59,  1.81it/s, loss=0.231, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.000854, train/loss_step=0.237, global_step=465.0]
Epoch 0:  82%|████████▏ | 4887/5971 [45:02<09:59,  1.81it/s, loss=0.241, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000735, train/loss_step=0.204, global_step=465.0]
Epoch 0:  82%|████████▏ | 4888/5971 [45:05<09:59,  1.81it/s, loss=0.204, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000334, train/loss_step=0.102, global_step=465.0]
Epoch 0:  82%|████████▏ | 4889/5971 [45:06<09:58,  1.81it/s, loss=0.204, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000334, train/loss_step=0.102, global_step=465.0]
Epoch 0:  82%|████████▏ | 4889/5971 [45:06<09:58,  1.81it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0833, train/loss_vlb_step=0.000275, train/loss_step=0.0833, global_step=466.0]
Epoch 0:  82%|████████▏ | 4890/5971 [45:07<09:58,  1.81it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00813, train/loss_vlb_step=3.94e-5, train/loss_step=0.00813, global_step=466.0]
Epoch 0:  82%|████████▏ | 4891/5971 [45:08<09:57,  1.81it/s, loss=0.215, v_num=0, train/loss_simple_step=0.518, train/loss_vlb_step=0.00636, train/loss_step=0.518, global_step=466.0]    
Epoch 0:  82%|████████▏ | 4892/5971 [45:10<09:57,  1.81it/s, loss=0.206, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000364, train/loss_step=0.111, global_step=466.0]
Epoch 0:  82%|████████▏ | 4893/5971 [45:11<09:57,  1.80it/s, loss=0.206, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000364, train/loss_step=0.111, global_step=466.0]
Epoch 0:  82%|████████▏ | 4893/5971 [45:11<09:57,  1.80it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0238, train/loss_vlb_step=9.53e-5, train/loss_step=0.0238, global_step=467.0]
Epoch 0:  82%|████████▏ | 4894/5971 [45:12<09:56,  1.80it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00693, train/loss_vlb_step=3.41e-5, train/loss_step=0.00693, global_step=467.0]
Epoch 0:  82%|████████▏ | 4895/5971 [45:13<09:56,  1.80it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0277, train/loss_vlb_step=9.96e-5, train/loss_step=0.0277, global_step=467.0]  
Epoch 0:  82%|████████▏ | 4896/5971 [45:15<09:56,  1.80it/s, loss=0.202, v_num=0, train/loss_simple_step=0.238, train/loss_vlb_step=0.000817, train/loss_step=0.238, global_step=467.0] 
Epoch 0:  82%|████████▏ | 4897/5971 [45:16<09:55,  1.80it/s, loss=0.202, v_num=0, train/loss_simple_step=0.238, train/loss_vlb_step=0.000817, train/loss_step=0.238, global_step=467.0]
Epoch 0:  82%|████████▏ | 4897/5971 [45:16<09:55,  1.80it/s, loss=0.174, v_num=0, train/loss_simple_step=0.00218, train/loss_vlb_step=1.27e-5, train/loss_step=0.00218, global_step=468.0]
Epoch 0:  82%|████████▏ | 4898/5971 [45:17<09:55,  1.80it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0602, train/loss_vlb_step=0.000202, train/loss_step=0.0602, global_step=468.0] 
Epoch 0:  82%|████████▏ | 4899/5971 [45:18<09:54,  1.80it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.87e-5, train/loss_step=0.0132, global_step=468.0] 
Epoch 0:  82%|████████▏ | 4900/5971 [45:20<09:54,  1.80it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0355, train/loss_vlb_step=0.000133, train/loss_step=0.0355, global_step=468.0]
Epoch 0:  82%|████████▏ | 4901/5971 [45:21<09:54,  1.80it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0355, train/loss_vlb_step=0.000133, train/loss_step=0.0355, global_step=468.0]
Epoch 0:  82%|████████▏ | 4901/5971 [45:21<09:54,  1.80it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0274, train/loss_vlb_step=0.000104, train/loss_step=0.0274, global_step=469.0]
Epoch 0:  82%|████████▏ | 4902/5971 [45:22<09:53,  1.80it/s, loss=0.173, v_num=0, train/loss_simple_step=0.323, train/loss_vlb_step=0.00128, train/loss_step=0.323, global_step=469.0]   
Epoch 0:  82%|████████▏ | 4903/5971 [45:23<09:53,  1.80it/s, loss=0.164, v_num=0, train/loss_simple_step=0.245, train/loss_vlb_step=0.000912, train/loss_step=0.245, global_step=469.0]
Epoch 0:  82%|████████▏ | 4904/5971 [45:25<09:52,  1.80it/s, loss=0.152, v_num=0, train/loss_simple_step=0.541, train/loss_vlb_step=0.00749, train/loss_step=0.541, global_step=469.0] 
Epoch 0:  82%|████████▏ | 4905/5971 [45:26<09:52,  1.80it/s, loss=0.152, v_num=0, train/loss_simple_step=0.541, train/loss_vlb_step=0.00749, train/loss_step=0.541, global_step=469.0]
Epoch 0:  82%|████████▏ | 4905/5971 [45:26<09:52,  1.80it/s, loss=0.154, v_num=0, train/loss_simple_step=0.273, train/loss_vlb_step=0.000992, train/loss_step=0.273, global_step=470.0]
Epoch 0:  82%|████████▏ | 4906/5971 [45:27<09:51,  1.80it/s, loss=0.147, v_num=0, train/loss_simple_step=0.096, train/loss_vlb_step=0.000319, train/loss_step=0.096, global_step=470.0]
Epoch 0:  82%|████████▏ | 4907/5971 [45:28<09:51,  1.80it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00404, train/loss_vlb_step=2.16e-5, train/loss_step=0.00404, global_step=470.0]
Epoch 0:  82%|████████▏ | 4908/5971 [45:30<09:51,  1.80it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0471, train/loss_vlb_step=0.000178, train/loss_step=0.0471, global_step=470.0] 
Epoch 0:  82%|████████▏ | 4909/5971 [45:31<09:50,  1.80it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0471, train/loss_vlb_step=0.000178, train/loss_step=0.0471, global_step=470.0]
Epoch 0:  82%|████████▏ | 4909/5971 [45:31<09:50,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.000646, train/loss_step=0.186, global_step=471.0]  
Epoch 0:  82%|████████▏ | 4910/5971 [45:32<09:50,  1.80it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0393, train/loss_vlb_step=0.000149, train/loss_step=0.0393, global_step=471.0]
Epoch 0:  82%|████████▏ | 4911/5971 [45:33<09:49,  1.80it/s, loss=0.126, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000903, train/loss_step=0.228, global_step=471.0]  
Epoch 0:  82%|████████▏ | 4912/5971 [45:35<09:49,  1.80it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0369, train/loss_vlb_step=0.000143, train/loss_step=0.0369, global_step=471.0]
Epoch 0:  82%|████████▏ | 4913/5971 [45:36<09:49,  1.80it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0369, train/loss_vlb_step=0.000143, train/loss_step=0.0369, global_step=471.0]
Epoch 0:  82%|████████▏ | 4913/5971 [45:36<09:49,  1.80it/s, loss=0.128, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000464, train/loss_step=0.140, global_step=472.0]  
Epoch 0:  82%|████████▏ | 4914/5971 [45:37<09:48,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.552, train/loss_vlb_step=0.00346, train/loss_step=0.552, global_step=472.0] 
Epoch 0:  82%|████████▏ | 4915/5971 [45:38<09:48,  1.80it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00696, train/loss_vlb_step=3.42e-5, train/loss_step=0.00696, global_step=472.0]
Epoch 0:  82%|████████▏ | 4916/5971 [45:40<09:47,  1.79it/s, loss=0.148, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000359, train/loss_step=0.109, global_step=472.0]   
Epoch 0:  82%|████████▏ | 4917/5971 [45:41<09:47,  1.79it/s, loss=0.148, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000359, train/loss_step=0.109, global_step=472.0]
Epoch 0:  82%|████████▏ | 4917/5971 [45:41<09:47,  1.79it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0257, train/loss_vlb_step=0.000101, train/loss_step=0.0257, global_step=473.0]
Epoch 0:  82%|████████▏ | 4918/5971 [45:42<09:47,  1.79it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0284, train/loss_vlb_step=0.000107, train/loss_step=0.0284, global_step=473.0]
Epoch 0:  82%|████████▏ | 4919/5971 [45:43<09:46,  1.79it/s, loss=0.171, v_num=0, train/loss_simple_step=0.468, train/loss_vlb_step=0.00325, train/loss_step=0.468, global_step=473.0]   
Epoch 0:  82%|████████▏ | 4920/5971 [45:45<09:46,  1.79it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00341, train/loss_vlb_step=1.83e-5, train/loss_step=0.00341, global_step=473.0]
Epoch 0:  82%|████████▏ | 4921/5971 [45:46<09:45,  1.79it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00341, train/loss_vlb_step=1.83e-5, train/loss_step=0.00341, global_step=473.0]
Epoch 0:  82%|████████▏ | 4921/5971 [45:46<09:45,  1.79it/s, loss=0.187, v_num=0, train/loss_simple_step=0.385, train/loss_vlb_step=0.0019, train/loss_step=0.385, global_step=474.0]     
Epoch 0:  82%|████████▏ | 4922/5971 [45:47<09:45,  1.79it/s, loss=0.178, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000492, train/loss_step=0.142, global_step=474.0]
Epoch 0:  82%|████████▏ | 4923/5971 [45:47<09:44,  1.79it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0811, train/loss_vlb_step=0.000269, train/loss_step=0.0811, global_step=474.0]
Epoch 0:  82%|████████▏ | 4924/5971 [45:50<09:44,  1.79it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]   
Epoch 0:  82%|████████▏ | 4925/5971 [45:50<09:44,  1.79it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:08,  2.41it/s][A

Validating:   1%|          | 2/167 [00:00<00:45,  3.61it/s][A
Epoch 0:  83%|████████▎ | 4929/5971 [45:50<09:41,  1.79it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:   3%|▎         | 5/167 [00:00<00:16,  9.62it/s][A

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.71it/s][A
Epoch 0:  83%|████████▎ | 4933/5971 [45:51<09:38,  1.79it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.73it/s][A
Epoch 0:  83%|████████▎ | 4937/5971 [45:51<09:36,  1.79it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.98it/s][A
Epoch 0:  83%|████████▎ | 4941/5971 [45:51<09:33,  1.80it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  10%|█         | 17/167 [00:01<00:06, 21.45it/s][A

Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.79it/s][A
Epoch 0:  83%|████████▎ | 4945/5971 [45:51<09:30,  1.80it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 23.96it/s][A
Epoch 0:  83%|████████▎ | 4949/5971 [45:51<09:28,  1.80it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 25.13it/s][A
Epoch 0:  83%|████████▎ | 4953/5971 [45:51<09:25,  1.80it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 24.67it/s][A

Validating:  19%|█▉        | 32/167 [00:01<00:05, 24.83it/s][A
Epoch 0:  83%|████████▎ | 4957/5971 [45:52<09:22,  1.80it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  21%|██        | 35/167 [00:01<00:05, 25.60it/s][A
Epoch 0:  83%|████████▎ | 4961/5971 [45:52<09:20,  1.80it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  23%|██▎       | 38/167 [00:01<00:04, 26.43it/s][A
Epoch 0:  83%|████████▎ | 4965/5971 [45:52<09:17,  1.80it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 26.05it/s][A

Validating:  26%|██▋       | 44/167 [00:02<00:04, 26.82it/s][A
Epoch 0:  83%|████████▎ | 4969/5971 [45:52<09:14,  1.81it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 25.76it/s][A
Epoch 0:  83%|████████▎ | 4973/5971 [45:52<09:12,  1.81it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 24.28it/s][A
Epoch 0:  83%|████████▎ | 4977/5971 [45:52<09:09,  1.81it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 24.64it/s][A

Validating:  34%|███▎      | 56/167 [00:02<00:04, 25.75it/s][A
Epoch 0:  83%|████████▎ | 4981/5971 [45:52<09:07,  1.81it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  35%|███▌      | 59/167 [00:02<00:05, 20.61it/s][A
Epoch 0:  83%|████████▎ | 4985/5971 [45:53<09:04,  1.81it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  37%|███▋      | 62/167 [00:03<00:05, 20.52it/s][A
Epoch 0:  84%|████████▎ | 4989/5971 [45:53<09:01,  1.81it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  39%|███▉      | 65/167 [00:03<00:04, 20.78it/s][A

Validating:  41%|████      | 68/167 [00:03<00:05, 18.86it/s][A
Epoch 0:  84%|████████▎ | 4993/5971 [45:53<08:59,  1.81it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  43%|████▎     | 71/167 [00:03<00:04, 19.73it/s][A
Epoch 0:  84%|████████▎ | 4997/5971 [45:53<08:56,  1.81it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  44%|████▍     | 74/167 [00:03<00:04, 21.46it/s][A
Epoch 0:  84%|████████▍ | 5001/5971 [45:54<08:54,  1.82it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  46%|████▌     | 77/167 [00:03<00:04, 22.44it/s][A

Validating:  48%|████▊     | 80/167 [00:03<00:03, 22.55it/s][A
Epoch 0:  84%|████████▍ | 5005/5971 [45:54<08:51,  1.82it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 24.00it/s][A
Epoch 0:  84%|████████▍ | 5009/5971 [45:54<08:48,  1.82it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  51%|█████▏    | 86/167 [00:04<00:04, 17.54it/s][A
Epoch 0:  84%|████████▍ | 5013/5971 [45:54<08:46,  1.82it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  53%|█████▎    | 89/167 [00:04<00:04, 17.95it/s][A

Validating:  55%|█████▌    | 92/167 [00:04<00:03, 18.88it/s][A
Epoch 0:  84%|████████▍ | 5017/5971 [45:54<08:43,  1.82it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  57%|█████▋    | 95/167 [00:04<00:03, 19.71it/s][A
Epoch 0:  84%|████████▍ | 5021/5971 [45:55<08:41,  1.82it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  59%|█████▊    | 98/167 [00:04<00:03, 20.59it/s][A
Epoch 0:  84%|████████▍ | 5025/5971 [45:55<08:38,  1.82it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  61%|██████    | 102/167 [00:04<00:02, 22.44it/s][A
Epoch 0:  84%|████████▍ | 5029/5971 [45:55<08:36,  1.83it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  63%|██████▎   | 105/167 [00:05<00:02, 23.46it/s][A

Validating:  65%|██████▍   | 108/167 [00:05<00:02, 23.55it/s][A
Epoch 0:  84%|████████▍ | 5033/5971 [45:55<08:33,  1.83it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  66%|██████▋   | 111/167 [00:05<00:02, 24.16it/s][A
Epoch 0:  84%|████████▍ | 5037/5971 [45:55<08:30,  1.83it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  68%|██████▊   | 114/167 [00:05<00:02, 24.10it/s][A
Epoch 0:  84%|████████▍ | 5041/5971 [45:55<08:28,  1.83it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  70%|███████   | 117/167 [00:05<00:02, 24.89it/s][A

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 25.01it/s][A
Epoch 0:  84%|████████▍ | 5045/5971 [45:55<08:25,  1.83it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 25.52it/s][A
Epoch 0:  85%|████████▍ | 5049/5971 [45:56<08:23,  1.83it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 25.10it/s][A
Epoch 0:  85%|████████▍ | 5053/5971 [45:56<08:20,  1.83it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  77%|███████▋  | 129/167 [00:06<00:01, 21.45it/s][A

Validating:  79%|███████▉  | 132/167 [00:06<00:01, 22.01it/s][A
Epoch 0:  85%|████████▍ | 5057/5971 [45:56<08:18,  1.83it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  81%|████████▏ | 136/167 [00:06<00:01, 23.95it/s][A
Epoch 0:  85%|████████▍ | 5061/5971 [45:56<08:15,  1.84it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  83%|████████▎ | 139/167 [00:06<00:01, 24.79it/s][A
Epoch 0:  85%|████████▍ | 5065/5971 [45:56<08:13,  1.84it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  85%|████████▌ | 142/167 [00:06<00:00, 25.68it/s][A
Epoch 0:  85%|████████▍ | 5069/5971 [45:56<08:10,  1.84it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 25.82it/s][A

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 26.01it/s][A
Epoch 0:  85%|████████▍ | 5073/5971 [45:57<08:07,  1.84it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 26.48it/s][A
Epoch 0:  85%|████████▌ | 5077/5971 [45:57<08:05,  1.84it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  92%|█████████▏| 154/167 [00:07<00:00, 26.77it/s][A
Epoch 0:  85%|████████▌ | 5081/5971 [45:57<08:02,  1.84it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  94%|█████████▍| 157/167 [00:07<00:00, 26.18it/s][A

Validating:  96%|█████████▌| 160/167 [00:07<00:00, 18.91it/s][A
Epoch 0:  85%|████████▌ | 5085/5971 [45:57<08:00,  1.84it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  98%|█████████▊| 163/167 [00:07<00:00, 15.37it/s][A
Epoch 0:  85%|████████▌ | 5089/5971 [45:58<07:57,  1.85it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Validating:  99%|█████████▉| 166/167 [00:07<00:00, 17.40it/s][A
Epoch 0:  85%|████████▌ | 5092/5971 [45:58<07:56,  1.85it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]

Epoch 0:  85%|████████▌ | 5093/5971 [45:59<07:55,  1.85it/s, loss=0.184, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0478, train/loss_step=0.834, global_step=474.0]
Epoch 0:  85%|████████▌ | 5093/5971 [45:59<07:55,  1.85it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0178, train/loss_vlb_step=7.34e-5, train/loss_step=0.0178, global_step=475.0]
Epoch 0:  85%|████████▌ | 5094/5971 [46:00<07:55,  1.85it/s, loss=0.184, v_num=0, train/loss_simple_step=0.355, train/loss_vlb_step=0.00201, train/loss_step=0.355, global_step=475.0]  
Epoch 0:  85%|████████▌ | 5095/5971 [46:01<07:54,  1.85it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0321, train/loss_vlb_step=0.000123, train/loss_step=0.0321, global_step=475.0]
Epoch 0:  85%|████████▌ | 5096/5971 [46:03<07:54,  1.84it/s, loss=0.194, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.000716, train/loss_step=0.210, global_step=475.0]  
Epoch 0:  85%|████████▌ | 5097/5971 [46:04<07:54,  1.84it/s, loss=0.194, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.000716, train/loss_step=0.210, global_step=475.0]
Epoch 0:  85%|████████▌ | 5097/5971 [46:04<07:54,  1.84it/s, loss=0.185, v_num=0, train/loss_simple_step=0.00322, train/loss_vlb_step=1.74e-5, train/loss_step=0.00322, global_step=476.0]
Epoch 0:  85%|████████▌ | 5098/5971 [46:05<07:53,  1.84it/s, loss=0.2, v_num=0, train/loss_simple_step=0.342, train/loss_vlb_step=0.00199, train/loss_step=0.342, global_step=476.0]      
Epoch 0:  85%|████████▌ | 5099/5971 [46:06<07:53,  1.84it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0609, train/loss_vlb_step=0.000204, train/loss_step=0.0609, global_step=476.0]
Epoch 0:  85%|████████▌ | 5100/5971 [46:08<07:52,  1.84it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0279, train/loss_vlb_step=0.000106, train/loss_step=0.0279, global_step=476.0]
Epoch 0:  85%|████████▌ | 5101/5971 [46:09<07:52,  1.84it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0279, train/loss_vlb_step=0.000106, train/loss_step=0.0279, global_step=476.0]
Epoch 0:  85%|████████▌ | 5101/5971 [46:09<07:52,  1.84it/s, loss=0.185, v_num=0, train/loss_simple_step=0.010, train/loss_vlb_step=4.28e-5, train/loss_step=0.010, global_step=477.0]   
Epoch 0:  85%|████████▌ | 5102/5971 [46:10<07:51,  1.84it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0885, train/loss_vlb_step=0.000296, train/loss_step=0.0885, global_step=477.0]
Epoch 0:  85%|████████▌ | 5103/5971 [46:11<07:51,  1.84it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0103, train/loss_vlb_step=4.66e-5, train/loss_step=0.0103, global_step=477.0] 
Epoch 0:  85%|████████▌ | 5104/5971 [46:14<07:51,  1.84it/s, loss=0.19, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0119, train/loss_step=0.668, global_step=477.0]    
Epoch 0:  85%|████████▌ | 5105/5971 [46:15<07:50,  1.84it/s, loss=0.19, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0119, train/loss_step=0.668, global_step=477.0]
Epoch 0:  85%|████████▌ | 5105/5971 [46:15<07:50,  1.84it/s, loss=0.219, v_num=0, train/loss_simple_step=0.616, train/loss_vlb_step=0.00981, train/loss_step=0.616, global_step=478.0]
Epoch 0:  86%|████████▌ | 5106/5971 [46:15<07:50,  1.84it/s, loss=0.218, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=6.09e-5, train/loss_step=0.0132, global_step=478.0]
Epoch 0:  86%|████████▌ | 5107/5971 [46:16<07:49,  1.84it/s, loss=0.195, v_num=0, train/loss_simple_step=0.00319, train/loss_vlb_step=1.77e-5, train/loss_step=0.00319, global_step=478.0]
Epoch 0:  86%|████████▌ | 5108/5971 [46:18<07:49,  1.84it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0698, train/loss_vlb_step=0.000232, train/loss_step=0.0698, global_step=478.0] 
Epoch 0:  86%|████████▌ | 5109/5971 [46:19<07:48,  1.84it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0698, train/loss_vlb_step=0.000232, train/loss_step=0.0698, global_step=478.0]
Epoch 0:  86%|████████▌ | 5109/5971 [46:19<07:48,  1.84it/s, loss=0.222, v_num=0, train/loss_simple_step=0.864, train/loss_vlb_step=0.0736, train/loss_step=0.864, global_step=479.0]    
Epoch 0:  86%|████████▌ | 5110/5971 [46:20<07:48,  1.84it/s, loss=0.222, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.00043, train/loss_step=0.131, global_step=479.0]
Epoch 0:  86%|████████▌ | 5111/5971 [46:21<07:47,  1.84it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0203, train/loss_vlb_step=7.96e-5, train/loss_step=0.0203, global_step=479.0]
Epoch 0:  86%|████████▌ | 5112/5971 [46:23<07:47,  1.84it/s, loss=0.223, v_num=0, train/loss_simple_step=0.914, train/loss_vlb_step=0.231, train/loss_step=0.914, global_step=479.0]    
Epoch 0:  86%|████████▌ | 5113/5971 [46:24<07:47,  1.84it/s, loss=0.223, v_num=0, train/loss_simple_step=0.914, train/loss_vlb_step=0.231, train/loss_step=0.914, global_step=479.0]
Epoch 0:  86%|████████▌ | 5113/5971 [46:24<07:47,  1.84it/s, loss=0.222, v_num=0, train/loss_simple_step=0.00316, train/loss_vlb_step=1.78e-5, train/loss_step=0.00316, global_step=480.0]
Epoch 0:  86%|████████▌ | 5114/5971 [46:25<07:46,  1.84it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.28e-5, train/loss_step=0.0149, global_step=480.0]  
Epoch 0:  86%|████████▌ | 5115/5971 [46:26<07:46,  1.84it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0223, train/loss_vlb_step=9.12e-5, train/loss_step=0.0223, global_step=480.0]
Epoch 0:  86%|████████▌ | 5116/5971 [46:28<07:45,  1.83it/s, loss=0.207, v_num=0, train/loss_simple_step=0.261, train/loss_vlb_step=0.000924, train/loss_step=0.261, global_step=480.0] 
Epoch 0:  86%|████████▌ | 5117/5971 [46:29<07:45,  1.83it/s, loss=0.207, v_num=0, train/loss_simple_step=0.261, train/loss_vlb_step=0.000924, train/loss_step=0.261, global_step=480.0]
Epoch 0:  86%|████████▌ | 5117/5971 [46:29<07:45,  1.83it/s, loss=0.221, v_num=0, train/loss_simple_step=0.273, train/loss_vlb_step=0.00137, train/loss_step=0.273, global_step=481.0] 
Epoch 0:  86%|████████▌ | 5118/5971 [46:30<07:45,  1.83it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0189, train/loss_vlb_step=7.63e-5, train/loss_step=0.0189, global_step=481.0]
Epoch 0:  86%|████████▌ | 5119/5971 [46:31<07:44,  1.83it/s, loss=0.208, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=481.0] 
Epoch 0:  86%|████████▌ | 5120/5971 [46:33<07:44,  1.83it/s, loss=0.215, v_num=0, train/loss_simple_step=0.167, train/loss_vlb_step=0.000569, train/loss_step=0.167, global_step=481.0]
Epoch 0:  86%|████████▌ | 5121/5971 [46:34<07:43,  1.83it/s, loss=0.215, v_num=0, train/loss_simple_step=0.167, train/loss_vlb_step=0.000569, train/loss_step=0.167, global_step=481.0]
Epoch 0:  86%|████████▌ | 5121/5971 [46:34<07:43,  1.83it/s, loss=0.214, v_num=0, train/loss_simple_step=0.00552, train/loss_vlb_step=2.86e-5, train/loss_step=0.00552, global_step=482.0]
Epoch 0:  86%|████████▌ | 5122/5971 [46:35<07:43,  1.83it/s, loss=0.224, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.00119, train/loss_step=0.278, global_step=482.0]    
Epoch 0:  86%|████████▌ | 5123/5971 [46:36<07:42,  1.83it/s, loss=0.224, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.09e-5, train/loss_step=0.0139, global_step=482.0]
Epoch 0:  86%|████████▌ | 5124/5971 [46:38<07:42,  1.83it/s, loss=0.198, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000499, train/loss_step=0.143, global_step=482.0] 
Epoch 0:  86%|████████▌ | 5125/5971 [46:39<07:42,  1.83it/s, loss=0.198, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000499, train/loss_step=0.143, global_step=482.0]
Epoch 0:  86%|████████▌ | 5125/5971 [46:39<07:42,  1.83it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0384, train/loss_vlb_step=0.000141, train/loss_step=0.0384, global_step=483.0]
Epoch 0:  86%|████████▌ | 5126/5971 [46:40<07:41,  1.83it/s, loss=0.177, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000576, train/loss_step=0.170, global_step=483.0]  
Epoch 0:  86%|████████▌ | 5127/5971 [46:41<07:41,  1.83it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0463, train/loss_vlb_step=0.000161, train/loss_step=0.0463, global_step=483.0]
Epoch 0:  86%|████████▌ | 5128/5971 [46:43<07:40,  1.83it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0034, train/loss_vlb_step=1.82e-5, train/loss_step=0.0034, global_step=483.0] 
Epoch 0:  86%|████████▌ | 5129/5971 [46:44<07:40,  1.83it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0034, train/loss_vlb_step=1.82e-5, train/loss_step=0.0034, global_step=483.0]
Epoch 0:  86%|████████▌ | 5129/5971 [46:44<07:40,  1.83it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00263, train/loss_vlb_step=1.48e-5, train/loss_step=0.00263, global_step=484.0]
Epoch 0:  86%|████████▌ | 5130/5971 [46:45<07:39,  1.83it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0465, train/loss_vlb_step=0.000166, train/loss_step=0.0465, global_step=484.0] 
Epoch 0:  86%|████████▌ | 5131/5971 [46:46<07:39,  1.83it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00203, train/loss_vlb_step=1.23e-5, train/loss_step=0.00203, global_step=484.0]
Epoch 0:  86%|████████▌ | 5132/5971 [46:48<07:39,  1.83it/s, loss=0.0818, v_num=0, train/loss_simple_step=0.00249, train/loss_vlb_step=1.44e-5, train/loss_step=0.00249, global_step=484.0]
Epoch 0:  86%|████████▌ | 5133/5971 [46:49<07:38,  1.83it/s, loss=0.0818, v_num=0, train/loss_simple_step=0.00249, train/loss_vlb_step=1.44e-5, train/loss_step=0.00249, global_step=484.0]
Epoch 0:  86%|████████▌ | 5133/5971 [46:49<07:38,  1.83it/s, loss=0.087, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000359, train/loss_step=0.107, global_step=485.0]    
Epoch 0:  86%|████████▌ | 5134/5971 [46:50<07:38,  1.83it/s, loss=0.118, v_num=0, train/loss_simple_step=0.638, train/loss_vlb_step=0.0104, train/loss_step=0.638, global_step=485.0]  
Epoch 0:  86%|████████▌ | 5135/5971 [46:51<07:37,  1.83it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0194, train/loss_vlb_step=8.2e-5, train/loss_step=0.0194, global_step=485.0]
Epoch 0:  86%|████████▌ | 5136/5971 [46:53<07:37,  1.83it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0829, train/loss_vlb_step=0.000273, train/loss_step=0.0829, global_step=485.0]
Epoch 0:  86%|████████▌ | 5137/5971 [46:54<07:36,  1.83it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0829, train/loss_vlb_step=0.000273, train/loss_step=0.0829, global_step=485.0]
Epoch 0:  86%|████████▌ | 5137/5971 [46:54<07:36,  1.83it/s, loss=0.119, v_num=0, train/loss_simple_step=0.470, train/loss_vlb_step=0.00378, train/loss_step=0.470, global_step=486.0]   
Epoch 0:  86%|████████▌ | 5138/5971 [46:55<07:36,  1.83it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0706, train/loss_vlb_step=0.000241, train/loss_step=0.0706, global_step=486.0]
Epoch 0:  86%|████████▌ | 5139/5971 [46:55<07:35,  1.83it/s, loss=0.128, v_num=0, train/loss_simple_step=0.260, train/loss_vlb_step=0.00112, train/loss_step=0.260, global_step=486.0]   
Epoch 0:  86%|████████▌ | 5140/5971 [46:58<07:35,  1.82it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.41e-5, train/loss_step=0.0149, global_step=486.0]
Epoch 0:  86%|████████▌ | 5141/5971 [46:58<07:35,  1.82it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.41e-5, train/loss_step=0.0149, global_step=486.0]
Epoch 0:  86%|████████▌ | 5141/5971 [46:58<07:35,  1.82it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00626, train/loss_vlb_step=3.19e-5, train/loss_step=0.00626, global_step=487.0]
Epoch 0:  86%|████████▌ | 5142/5971 [46:59<07:34,  1.82it/s, loss=0.126, v_num=0, train/loss_simple_step=0.380, train/loss_vlb_step=0.00177, train/loss_step=0.380, global_step=487.0]    
Epoch 0:  86%|████████▌ | 5143/5971 [47:00<07:34,  1.82it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0778, train/loss_vlb_step=0.000267, train/loss_step=0.0778, global_step=487.0]
Epoch 0:  86%|████████▌ | 5144/5971 [47:02<07:33,  1.82it/s, loss=0.126, v_num=0, train/loss_simple_step=0.084, train/loss_vlb_step=0.000283, train/loss_step=0.084, global_step=487.0]  
Epoch 0:  86%|████████▌ | 5145/5971 [47:03<07:33,  1.82it/s, loss=0.126, v_num=0, train/loss_simple_step=0.084, train/loss_vlb_step=0.000283, train/loss_step=0.084, global_step=487.0]
Epoch 0:  86%|████████▌ | 5145/5971 [47:03<07:33,  1.82it/s, loss=0.142, v_num=0, train/loss_simple_step=0.347, train/loss_vlb_step=0.00168, train/loss_step=0.347, global_step=488.0] 
Epoch 0:  86%|████████▌ | 5146/5971 [47:04<07:32,  1.82it/s, loss=0.139, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000403, train/loss_step=0.121, global_step=488.0]
Epoch 0:  86%|████████▌ | 5147/5971 [47:05<07:32,  1.82it/s, loss=0.146, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000608, train/loss_step=0.179, global_step=488.0]
Epoch 0:  86%|████████▌ | 5148/5971 [47:07<07:31,  1.82it/s, loss=0.156, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.000713, train/loss_step=0.206, global_step=488.0]
Epoch 0:  86%|████████▌ | 5149/5971 [47:08<07:31,  1.82it/s, loss=0.156, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.000713, train/loss_step=0.206, global_step=488.0]
Epoch 0:  86%|████████▌ | 5149/5971 [47:08<07:31,  1.82it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0391, train/loss_vlb_step=0.000144, train/loss_step=0.0391, global_step=489.0]
Epoch 0:  86%|████████▋ | 5150/5971 [47:09<07:30,  1.82it/s, loss=0.163, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000521, train/loss_step=0.151, global_step=489.0]  
Epoch 0:  86%|████████▋ | 5151/5971 [47:10<07:30,  1.82it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0581, train/loss_vlb_step=0.000202, train/loss_step=0.0581, global_step=489.0]
Epoch 0:  86%|████████▋ | 5152/5971 [47:12<07:30,  1.82it/s, loss=0.182, v_num=0, train/loss_simple_step=0.331, train/loss_vlb_step=0.00169, train/loss_step=0.331, global_step=489.0]   
Epoch 0:  86%|████████▋ | 5153/5971 [47:13<07:29,  1.82it/s, loss=0.182, v_num=0, train/loss_simple_step=0.331, train/loss_vlb_step=0.00169, train/loss_step=0.331, global_step=489.0]
Epoch 0:  86%|████████▋ | 5153/5971 [47:13<07:29,  1.82it/s, loss=0.203, v_num=0, train/loss_simple_step=0.535, train/loss_vlb_step=0.00499, train/loss_step=0.535, global_step=490.0]
Epoch 0:  86%|████████▋ | 5154/5971 [47:14<07:29,  1.82it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00908, train/loss_vlb_step=4.24e-5, train/loss_step=0.00908, global_step=490.0]
Epoch 0:  86%|████████▋ | 5155/5971 [47:15<07:28,  1.82it/s, loss=0.215, v_num=0, train/loss_simple_step=0.872, train/loss_vlb_step=0.0639, train/loss_step=0.872, global_step=490.0]     
Epoch 0:  86%|████████▋ | 5156/5971 [47:17<07:28,  1.82it/s, loss=0.223, v_num=0, train/loss_simple_step=0.246, train/loss_vlb_step=0.000867, train/loss_step=0.246, global_step=490.0]
Epoch 0:  86%|████████▋ | 5157/5971 [47:18<07:27,  1.82it/s, loss=0.223, v_num=0, train/loss_simple_step=0.246, train/loss_vlb_step=0.000867, train/loss_step=0.246, global_step=490.0]
Epoch 0:  86%|████████▋ | 5157/5971 [47:18<07:27,  1.82it/s, loss=0.2, v_num=0, train/loss_simple_step=0.00394, train/loss_vlb_step=2.16e-5, train/loss_step=0.00394, global_step=491.0]
Epoch 0:  86%|████████▋ | 5158/5971 [47:19<07:27,  1.82it/s, loss=0.199, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000231, train/loss_step=0.0699, global_step=491.0]
Epoch 0:  86%|████████▋ | 5159/5971 [47:19<07:26,  1.82it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0336, train/loss_vlb_step=0.00013, train/loss_step=0.0336, global_step=491.0] 
Epoch 0:  86%|████████▋ | 5160/5971 [47:21<07:26,  1.82it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.52e-5, train/loss_step=0.0124, global_step=491.0]
Epoch 0:  86%|████████▋ | 5161/5971 [47:22<07:26,  1.82it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.52e-5, train/loss_step=0.0124, global_step=491.0]
Epoch 0:  86%|████████▋ | 5161/5971 [47:22<07:26,  1.82it/s, loss=0.193, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.000331, train/loss_step=0.100, global_step=492.0] 
Epoch 0:  86%|████████▋ | 5162/5971 [47:23<07:25,  1.82it/s, loss=0.179, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000382, train/loss_step=0.114, global_step=492.0]
Epoch 0:  86%|████████▋ | 5163/5971 [47:24<07:25,  1.82it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00277, train/loss_vlb_step=1.58e-5, train/loss_step=0.00277, global_step=492.0]
Epoch 0:  86%|████████▋ | 5164/5971 [47:26<07:24,  1.81it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.12e-5, train/loss_step=0.0192, global_step=492.0]  
Epoch 0:  87%|████████▋ | 5165/5971 [47:27<07:24,  1.81it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.12e-5, train/loss_step=0.0192, global_step=492.0]
Epoch 0:  87%|████████▋ | 5165/5971 [47:27<07:24,  1.81it/s, loss=0.162, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.000492, train/loss_step=0.146, global_step=493.0] 
Epoch 0:  87%|████████▋ | 5166/5971 [47:28<07:23,  1.81it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00845, train/loss_vlb_step=3.77e-5, train/loss_step=0.00845, global_step=493.0]
Epoch 0:  87%|████████▋ | 5167/5971 [47:29<07:23,  1.81it/s, loss=0.155, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000499, train/loss_step=0.152, global_step=493.0]   
Epoch 0:  87%|████████▋ | 5168/5971 [47:31<07:22,  1.81it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00579, train/loss_vlb_step=2.91e-5, train/loss_step=0.00579, global_step=493.0]
Epoch 0:  87%|████████▋ | 5169/5971 [47:32<07:22,  1.81it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00579, train/loss_vlb_step=2.91e-5, train/loss_step=0.00579, global_step=493.0]
Epoch 0:  87%|████████▋ | 5169/5971 [47:32<07:22,  1.81it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.000271, train/loss_step=0.0817, global_step=494.0] 
Epoch 0:  87%|████████▋ | 5170/5971 [47:33<07:21,  1.81it/s, loss=0.149, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000634, train/loss_step=0.181, global_step=494.0]  
Epoch 0:  87%|████████▋ | 5171/5971 [47:34<07:21,  1.81it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0666, train/loss_vlb_step=0.000223, train/loss_step=0.0666, global_step=494.0]
Epoch 0:  87%|████████▋ | 5172/5971 [47:36<07:21,  1.81it/s, loss=0.152, v_num=0, train/loss_simple_step=0.373, train/loss_vlb_step=0.00166, train/loss_step=0.373, global_step=494.0]  
Epoch 0:  87%|████████▋ | 5173/5971 [47:37<07:20,  1.81it/s, loss=0.152, v_num=0, train/loss_simple_step=0.373, train/loss_vlb_step=0.00166, train/loss_step=0.373, global_step=494.0]
Epoch 0:  87%|████████▋ | 5173/5971 [47:37<07:20,  1.81it/s, loss=0.135, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000751, train/loss_step=0.196, global_step=495.0]
Epoch 0:  87%|████████▋ | 5174/5971 [47:38<07:20,  1.81it/s, loss=0.144, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.000746, train/loss_step=0.186, global_step=495.0]
Epoch 0:  87%|████████▋ | 5175/5971 [47:38<07:19,  1.81it/s, loss=0.11, v_num=0, train/loss_simple_step=0.198, train/loss_vlb_step=0.000674, train/loss_step=0.198, global_step=495.0] 
Epoch 0:  87%|████████▋ | 5176/5971 [47:41<07:19,  1.81it/s, loss=0.11, v_num=0, train/loss_simple_step=0.241, train/loss_vlb_step=0.000834, train/loss_step=0.241, global_step=495.0]
Epoch 0:  87%|████████▋ | 5177/5971 [47:41<07:18,  1.81it/s, loss=0.11, v_num=0, train/loss_simple_step=0.241, train/loss_vlb_step=0.000834, train/loss_step=0.241, global_step=495.0]
Epoch 0:  87%|████████▋ | 5177/5971 [47:41<07:18,  1.81it/s, loss=0.119, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000617, train/loss_step=0.184, global_step=496.0]
Epoch 0:  87%|████████▋ | 5178/5971 [47:42<07:18,  1.81it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00406, train/loss_vlb_step=2.16e-5, train/loss_step=0.00406, global_step=496.0]
Epoch 0:  87%|████████▋ | 5179/5971 [47:43<07:17,  1.81it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00342, train/loss_vlb_step=1.89e-5, train/loss_step=0.00342, global_step=496.0]
Epoch 0:  87%|████████▋ | 5180/5971 [47:45<07:17,  1.81it/s, loss=0.116, v_num=0, train/loss_simple_step=0.049, train/loss_vlb_step=0.000166, train/loss_step=0.049, global_step=496.0]   
Epoch 0:  87%|████████▋ | 5181/5971 [47:46<07:17,  1.81it/s, loss=0.116, v_num=0, train/loss_simple_step=0.049, train/loss_vlb_step=0.000166, train/loss_step=0.049, global_step=496.0]
Epoch 0:  87%|████████▋ | 5181/5971 [47:46<07:17,  1.81it/s, loss=0.116, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000333, train/loss_step=0.101, global_step=497.0]
Epoch 0:  87%|████████▋ | 5182/5971 [47:47<07:16,  1.81it/s, loss=0.122, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.00086, train/loss_step=0.243, global_step=497.0] 
Epoch 0:  87%|████████▋ | 5183/5971 [47:48<07:16,  1.81it/s, loss=0.133, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000763, train/loss_step=0.211, global_step=497.0]
Epoch 0:  87%|████████▋ | 5184/5971 [47:50<07:15,  1.81it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0658, train/loss_vlb_step=0.00022, train/loss_step=0.0658, global_step=497.0]
Epoch 0:  87%|████████▋ | 5185/5971 [47:51<07:15,  1.81it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0658, train/loss_vlb_step=0.00022, train/loss_step=0.0658, global_step=497.0]
Epoch 0:  87%|████████▋ | 5185/5971 [47:51<07:15,  1.81it/s, loss=0.141, v_num=0, train/loss_simple_step=0.276, train/loss_vlb_step=0.0011, train/loss_step=0.276, global_step=498.0]   
Epoch 0:  87%|████████▋ | 5186/5971 [47:52<07:14,  1.81it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0663, train/loss_vlb_step=0.000226, train/loss_step=0.0663, global_step=498.0]
Epoch 0:  87%|████████▋ | 5187/5971 [47:53<07:14,  1.81it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00334, train/loss_vlb_step=1.81e-5, train/loss_step=0.00334, global_step=498.0]
Epoch 0:  87%|████████▋ | 5188/5971 [47:55<07:13,  1.80it/s, loss=0.161, v_num=0, train/loss_simple_step=0.481, train/loss_vlb_step=0.00479, train/loss_step=0.481, global_step=498.0]    
Epoch 0:  87%|████████▋ | 5189/5971 [47:56<07:13,  1.80it/s, loss=0.161, v_num=0, train/loss_simple_step=0.481, train/loss_vlb_step=0.00479, train/loss_step=0.481, global_step=498.0]
Epoch 0:  87%|████████▋ | 5189/5971 [47:56<07:13,  1.80it/s, loss=0.168, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.000785, train/loss_step=0.229, global_step=499.0]
Epoch 0:  87%|████████▋ | 5189/5971 [48:13<07:15,  1.79it/s, loss=0.168, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.000785, train/loss_step=0.229, global_step=499.0]
Epoch 0:  87%|████████▋ | 5190/5971 [48:32<07:18,  1.78it/s, loss=0.168, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.000785, train/loss_step=0.229, global_step=499.0]
Epoch 0:  87%|████████▋ | 5190/5971 [48:32<07:18,  1.78it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0191, train/loss_vlb_step=7.64e-5, train/loss_step=0.0191, global_step=499.0]
Epoch 0:  87%|████████▋ | 5191/5971 [48:33<07:17,  1.78it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0191, train/loss_vlb_step=7.64e-5, train/loss_step=0.0191, global_step=499.0]
Epoch 0:  87%|████████▋ | 5191/5971 [48:33<07:17,  1.78it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0256, train/loss_vlb_step=9.7e-5, train/loss_step=0.0256, global_step=499.0]
Epoch 0:  87%|████████▋ | 5192/5971 [48:35<07:17,  1.78it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0256, train/loss_vlb_step=9.7e-5, train/loss_step=0.0256, global_step=499.0]
Epoch 0:  87%|████████▋ | 5192/5971 [48:35<07:17,  1.78it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0] 

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:20,  2.06it/s][A
Epoch 0:  87%|████████▋ | 5194/5971 [48:36<07:16,  1.78it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:   1%|          | 2/167 [00:00<00:49,  3.32it/s][A
Epoch 0:  87%|████████▋ | 5196/5971 [48:36<07:14,  1.78it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:   3%|▎         | 5/167 [00:00<00:18,  8.80it/s][A
Epoch 0:  87%|████████▋ | 5199/5971 [48:36<07:13,  1.78it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.32it/s][A
Epoch 0:  87%|████████▋ | 5202/5971 [48:36<07:11,  1.78it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:   7%|▋         | 11/167 [00:01<00:09, 15.69it/s][A
Epoch 0:  87%|████████▋ | 5205/5971 [48:36<07:09,  1.78it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:   8%|▊         | 14/167 [00:01<00:08, 18.47it/s][A
Epoch 0:  87%|████████▋ | 5208/5971 [48:36<07:07,  1.79it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  10%|█         | 17/167 [00:01<00:07, 21.19it/s][A
Epoch 0:  87%|████████▋ | 5211/5971 [48:37<07:05,  1.79it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.66it/s][A
Epoch 0:  87%|████████▋ | 5214/5971 [48:37<07:03,  1.79it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  14%|█▍        | 24/167 [00:01<00:05, 25.25it/s][A
Epoch 0:  87%|████████▋ | 5218/5971 [48:37<07:00,  1.79it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 24.26it/s][A
Epoch 0:  87%|████████▋ | 5222/5971 [48:37<06:58,  1.79it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 25.29it/s][A

Validating:  20%|█▉        | 33/167 [00:01<00:05, 25.76it/s][A
Epoch 0:  88%|████████▊ | 5226/5971 [48:37<06:55,  1.79it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  22%|██▏       | 37/167 [00:01<00:04, 27.60it/s][A
Epoch 0:  88%|████████▊ | 5230/5971 [48:37<06:53,  1.79it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  24%|██▍       | 40/167 [00:02<00:04, 26.73it/s][A
Epoch 0:  88%|████████▊ | 5234/5971 [48:37<06:50,  1.79it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  26%|██▌       | 43/167 [00:02<00:04, 26.54it/s][A
Epoch 0:  88%|████████▊ | 5238/5971 [48:38<06:48,  1.80it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  28%|██▊       | 46/167 [00:02<00:04, 26.77it/s][A

Validating:  29%|██▉       | 49/167 [00:02<00:04, 26.19it/s][A
Epoch 0:  88%|████████▊ | 5242/5971 [48:38<06:45,  1.80it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  31%|███       | 52/167 [00:02<00:04, 26.91it/s][A
Epoch 0:  88%|████████▊ | 5246/5971 [48:38<06:43,  1.80it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  33%|███▎      | 55/167 [00:02<00:04, 25.75it/s][A
Epoch 0:  88%|████████▊ | 5250/5971 [48:38<06:40,  1.80it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  35%|███▌      | 59/167 [00:02<00:04, 26.99it/s][A
Epoch 0:  88%|████████▊ | 5254/5971 [48:38<06:38,  1.80it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  37%|███▋      | 62/167 [00:02<00:03, 26.87it/s][A

Validating:  39%|███▉      | 65/167 [00:03<00:03, 26.25it/s][A
Epoch 0:  88%|████████▊ | 5258/5971 [48:38<06:35,  1.80it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  41%|████      | 68/167 [00:03<00:03, 26.97it/s][A
Epoch 0:  88%|████████▊ | 5262/5971 [48:38<06:33,  1.80it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 26.93it/s][A
Epoch 0:  88%|████████▊ | 5266/5971 [48:39<06:30,  1.80it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 27.03it/s][A

Validating:  46%|████▌     | 77/167 [00:03<00:03, 27.59it/s][A
Epoch 0:  88%|████████▊ | 5270/5971 [48:39<06:28,  1.81it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 27.24it/s][A
Epoch 0:  88%|████████▊ | 5274/5971 [48:39<06:25,  1.81it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 26.95it/s][A
Epoch 0:  88%|████████▊ | 5278/5971 [48:39<06:23,  1.81it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  51%|█████▏    | 86/167 [00:03<00:03, 25.46it/s][A

Validating:  53%|█████▎    | 89/167 [00:03<00:03, 25.55it/s][A
Epoch 0:  88%|████████▊ | 5282/5971 [48:39<06:20,  1.81it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 26.07it/s][A
Epoch 0:  89%|████████▊ | 5286/5971 [48:39<06:18,  1.81it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 25.33it/s][A
Epoch 0:  89%|████████▊ | 5290/5971 [48:40<06:15,  1.81it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 24.91it/s][A

Validating:  60%|██████    | 101/167 [00:04<00:02, 24.13it/s][A
Epoch 0:  89%|████████▊ | 5294/5971 [48:40<06:13,  1.81it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 25.11it/s][A
Epoch 0:  89%|████████▊ | 5298/5971 [48:40<06:10,  1.81it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 26.01it/s][A
Epoch 0:  89%|████████▉ | 5302/5971 [48:40<06:08,  1.82it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  67%|██████▋   | 112/167 [00:04<00:02, 26.59it/s][A
Epoch 0:  89%|████████▉ | 5306/5971 [48:40<06:05,  1.82it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  69%|██████▉   | 115/167 [00:04<00:01, 26.83it/s][A
Epoch 0:  89%|████████▉ | 5310/5971 [48:40<06:03,  1.82it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  71%|███████   | 118/167 [00:05<00:01, 26.76it/s][A

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 26.46it/s][A
Epoch 0:  89%|████████▉ | 5314/5971 [48:40<06:01,  1.82it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 27.15it/s][A
Epoch 0:  89%|████████▉ | 5318/5971 [48:41<05:58,  1.82it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 27.06it/s][A
Epoch 0:  89%|████████▉ | 5322/5971 [48:41<05:56,  1.82it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 25.07it/s][A

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 25.21it/s][A
Epoch 0:  89%|████████▉ | 5326/5971 [48:41<05:53,  1.82it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 26.35it/s][A
Epoch 0:  89%|████████▉ | 5330/5971 [48:41<05:51,  1.82it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 26.34it/s][A
Epoch 0:  89%|████████▉ | 5334/5971 [48:41<05:48,  1.83it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  85%|████████▌ | 142/167 [00:05<00:00, 26.90it/s][A

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 27.54it/s][A
Epoch 0:  89%|████████▉ | 5338/5971 [48:41<05:46,  1.83it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 27.25it/s][A
Epoch 0:  89%|████████▉ | 5342/5971 [48:42<05:43,  1.83it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 27.98it/s][A
Epoch 0:  90%|████████▉ | 5346/5971 [48:42<05:41,  1.83it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 28.34it/s][A

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 25.19it/s][A
Epoch 0:  90%|████████▉ | 5350/5971 [48:42<05:39,  1.83it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 25.19it/s][A
Epoch 0:  90%|████████▉ | 5354/5971 [48:42<05:36,  1.83it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 26.69it/s][A
Epoch 0:  90%|████████▉ | 5358/5971 [48:42<05:34,  1.83it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]
Epoch 0:  90%|████████▉ | 5360/5971 [48:42<05:33,  1.83it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.31it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.35it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.14it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.72it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.12it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.45it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.77it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.98it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.13it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.25it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.36it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.44it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.50it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.54it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.56it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.58it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.60it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.52it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.41it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.38it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.43it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.39it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.33it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.39it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.43it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.48it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.53it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.57it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.60it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.63it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.64it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.60it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.54it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.47it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.51it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.53it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.54it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.57it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.60it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.58it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.54it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.57it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.60it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.61it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.63it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.61it/s][A
Epoch 0:  90%|████████▉ | 5360/5971 [48:53<05:34,  1.83it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.58it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.55it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.55it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.52it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.17it/s]

Epoch 0:  90%|████████▉ | 5361/5971 [48:55<05:33,  1.83it/s, loss=0.158, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00263, train/loss_step=0.375, global_step=499.0]
Epoch 0:  90%|████████▉ | 5361/5971 [48:55<05:33,  1.83it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00422, train/loss_vlb_step=2.27e-5, train/loss_step=0.00422, global_step=500.0]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.40it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.20it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.82it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.30it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.65it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.88it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.02it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.11it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.19it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.10it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.19it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:07,  5.26it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.35it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.44it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.51it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.56it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.60it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.61it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.62it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.63it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.42it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.38it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.36it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.37it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.41it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.42it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.44it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.47it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.51it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.52it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.51it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.42it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.46it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.47it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.43it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.45it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.49it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.51it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.53it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.57it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.59it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.62it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.64it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.65it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.66it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.67it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.66it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.64it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.64it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.17it/s]

Epoch 0:  90%|████████▉ | 5362/5971 [49:06<05:34,  1.82it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00422, train/loss_vlb_step=2.27e-5, train/loss_step=0.00422, global_step=500.0]
Epoch 0:  90%|████████▉ | 5362/5971 [49:06<05:34,  1.82it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00692, train/loss_vlb_step=3.42e-5, train/loss_step=0.00692, global_step=500.0]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.43it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.29it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.90it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.33it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.62it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.92it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.13it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.19it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.21it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.34it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.43it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.50it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.55it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.58it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.61it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.63it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.64it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.54it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.58it/s]
Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.55it/s]
Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.54it/s]
Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.58it/s]
Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.59it/s]
Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.60it/s]
Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.61it/s]
Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.61it/s]
Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.61it/s]
Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.60it/s]
Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.62it/s]
Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.63it/s]
Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.64it/s]
Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.56it/s]
Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.60it/s]
Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.64it/s]
Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.67it/s]
Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.66it/s]
Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.67it/s]
Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.67it/s]
Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.68it/s]
Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.70it/s]
Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.70it/s]
Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.70it/s]
Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.70it/s]
Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.53it/s]
Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.49it/s]
Spaced Sampler:  94%|█████████▍| 47/50 [00:08<00:00,  5.47it/s]
Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.39it/s]
Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.26it/s]
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.38it/s]
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.23it/s]

Epoch 0:  90%|████████▉ | 5363/5971 [49:18<05:35,  1.81it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00692, train/loss_vlb_step=3.42e-5, train/loss_step=0.00692, global_step=500.0]
Epoch 0:  90%|████████▉ | 5363/5971 [49:18<05:35,  1.81it/s, loss=0.15, v_num=0, train/loss_simple_step=0.413, train/loss_vlb_step=0.00226, train/loss_step=0.413, global_step=500.0]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s]
Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.32it/s]
Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.41it/s]
Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.26it/s]
Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.76it/s]
Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.18it/s]
Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.49it/s]
Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.79it/s]
Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.00it/s]
Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.15it/s]
Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.24it/s]
Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.31it/s]
Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.39it/s]
Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.47it/s]
Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.53it/s]
Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.51it/s]
Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.49it/s]
Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.51it/s]
Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.48it/s]
Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.45it/s]
Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.47it/s]
Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.52it/s]
Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.58it/s]
Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.62it/s]
Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.64it/s]
Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.66it/s]
Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.64it/s]
Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.65it/s]
Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.65it/s]
Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.65it/s]
Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.65it/s]
Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.66it/s]
Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.66it/s]
Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:02,  5.68it/s]
Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.69it/s]
Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.70it/s]
Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.71it/s]
Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.69it/s]
Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.70it/s]
Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.70it/s]
Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.71it/s]
Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.70it/s]
Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.70it/s]
Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.70it/s]
Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.71it/s]
Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.72it/s]
Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.72it/s]
Spaced Sampler:  94%|█████████▍| 47/50 [00:08<00:00,  5.72it/s]
Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.68it/s]
Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.70it/s]
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.70it/s]
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.26it/s]

Epoch 0:  90%|████████▉ | 5364/5971 [49:32<05:36,  1.81it/s, loss=0.15, v_num=0, train/loss_simple_step=0.413, train/loss_vlb_step=0.00226, train/loss_step=0.413, global_step=500.0]
Epoch 0:  90%|████████▉ | 5364/5971 [49:32<05:36,  1.81it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0292, train/loss_vlb_step=0.000111, train/loss_step=0.0292, global_step=500.0]
Epoch 0:  90%|████████▉ | 5365/5971 [49:32<05:35,  1.80it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0292, train/loss_vlb_step=0.000111, train/loss_step=0.0292, global_step=500.0]
Epoch 0:  90%|████████▉ | 5365/5971 [49:32<05:35,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000619, train/loss_step=0.177, global_step=501.0] 
Epoch 0:  90%|████████▉ | 5366/5971 [49:33<05:35,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000619, train/loss_step=0.177, global_step=501.0]
Epoch 0:  90%|████████▉ | 5366/5971 [49:33<05:35,  1.80it/s, loss=0.153, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.00126, train/loss_step=0.278, global_step=501.0] 
Epoch 0:  90%|████████▉ | 5367/5971 [49:34<05:34,  1.80it/s, loss=0.153, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.00126, train/loss_step=0.278, global_step=501.0]
Epoch 0:  90%|████████▉ | 5367/5971 [49:34<05:34,  1.80it/s, loss=0.159, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000388, train/loss_step=0.118, global_step=501.0]
Epoch 0:  90%|████████▉ | 5368/5971 [49:37<05:34,  1.80it/s, loss=0.159, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000388, train/loss_step=0.118, global_step=501.0]
Epoch 0:  90%|████████▉ | 5368/5971 [49:37<05:34,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00343, train/loss_vlb_step=1.9e-5, train/loss_step=0.00343, global_step=501.0]
Epoch 0:  90%|████████▉ | 5369/5971 [49:38<05:33,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00343, train/loss_vlb_step=1.9e-5, train/loss_step=0.00343, global_step=501.0]
Epoch 0:  90%|████████▉ | 5369/5971 [49:38<05:33,  1.80it/s, loss=0.167, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00122, train/loss_step=0.308, global_step=502.0]   
Epoch 0:  90%|████████▉ | 5370/5971 [49:38<05:33,  1.80it/s, loss=0.167, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00122, train/loss_step=0.308, global_step=502.0]
Epoch 0:  90%|████████▉ | 5370/5971 [49:38<05:33,  1.80it/s, loss=0.175, v_num=0, train/loss_simple_step=0.409, train/loss_vlb_step=0.00273, train/loss_step=0.409, global_step=502.0]
Epoch 0:  90%|████████▉ | 5371/5971 [49:39<05:32,  1.80it/s, loss=0.175, v_num=0, train/loss_simple_step=0.409, train/loss_vlb_step=0.00273, train/loss_step=0.409, global_step=502.0]
Epoch 0:  90%|████████▉ | 5371/5971 [49:39<05:32,  1.80it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0255, train/loss_vlb_step=9.86e-5, train/loss_step=0.0255, global_step=502.0]
Epoch 0:  90%|████████▉ | 5372/5971 [49:42<05:32,  1.80it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0255, train/loss_vlb_step=9.86e-5, train/loss_step=0.0255, global_step=502.0]
Epoch 0:  90%|████████▉ | 5372/5971 [49:42<05:32,  1.80it/s, loss=0.169, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000464, train/loss_step=0.140, global_step=502.0] 
Epoch 0:  90%|████████▉ | 5373/5971 [49:42<05:31,  1.80it/s, loss=0.169, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000464, train/loss_step=0.140, global_step=502.0]
Epoch 0:  90%|████████▉ | 5373/5971 [49:42<05:31,  1.80it/s, loss=0.172, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00135, train/loss_step=0.318, global_step=503.0] 
Epoch 0:  90%|█████████ | 5374/5971 [49:43<05:31,  1.80it/s, loss=0.172, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00135, train/loss_step=0.318, global_step=503.0]
Epoch 0:  90%|█████████ | 5374/5971 [49:43<05:31,  1.80it/s, loss=0.185, v_num=0, train/loss_simple_step=0.328, train/loss_vlb_step=0.00141, train/loss_step=0.328, global_step=503.0]
Epoch 0:  90%|█████████ | 5375/5971 [49:44<05:30,  1.80it/s, loss=0.185, v_num=0, train/loss_simple_step=0.328, train/loss_vlb_step=0.00141, train/loss_step=0.328, global_step=503.0]
Epoch 0:  90%|█████████ | 5375/5971 [49:44<05:30,  1.80it/s, loss=0.208, v_num=0, train/loss_simple_step=0.461, train/loss_vlb_step=0.00253, train/loss_step=0.461, global_step=503.0]
Epoch 0:  90%|█████████ | 5376/5971 [49:46<05:30,  1.80it/s, loss=0.208, v_num=0, train/loss_simple_step=0.461, train/loss_vlb_step=0.00253, train/loss_step=0.461, global_step=503.0]
Epoch 0:  90%|█████████ | 5376/5971 [49:46<05:30,  1.80it/s, loss=0.209, v_num=0, train/loss_simple_step=0.514, train/loss_vlb_step=0.00499, train/loss_step=0.514, global_step=503.0]
Epoch 0:  90%|█████████ | 5377/5971 [49:47<05:29,  1.80it/s, loss=0.209, v_num=0, train/loss_simple_step=0.514, train/loss_vlb_step=0.00499, train/loss_step=0.514, global_step=503.0]
Epoch 0:  90%|█████████ | 5377/5971 [49:47<05:29,  1.80it/s, loss=0.203, v_num=0, train/loss_simple_step=0.0992, train/loss_vlb_step=0.000327, train/loss_step=0.0992, global_step=504.0]
Epoch 0:  90%|█████████ | 5378/5971 [49:48<05:29,  1.80it/s, loss=0.203, v_num=0, train/loss_simple_step=0.0992, train/loss_vlb_step=0.000327, train/loss_step=0.0992, global_step=504.0]
Epoch 0:  90%|█████████ | 5378/5971 [49:48<05:29,  1.80it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0019, train/loss_vlb_step=1.14e-5, train/loss_step=0.0019, global_step=504.0] 
Epoch 0:  90%|█████████ | 5379/5971 [49:49<05:28,  1.80it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0019, train/loss_vlb_step=1.14e-5, train/loss_step=0.0019, global_step=504.0]
Epoch 0:  90%|█████████ | 5379/5971 [49:49<05:28,  1.80it/s, loss=0.203, v_num=0, train/loss_simple_step=0.049, train/loss_vlb_step=0.000169, train/loss_step=0.049, global_step=504.0] 
Epoch 0:  90%|█████████ | 5380/5971 [49:51<05:28,  1.80it/s, loss=0.203, v_num=0, train/loss_simple_step=0.049, train/loss_vlb_step=0.000169, train/loss_step=0.049, global_step=504.0]
Epoch 0:  90%|█████████ | 5380/5971 [49:51<05:28,  1.80it/s, loss=0.198, v_num=0, train/loss_simple_step=0.276, train/loss_vlb_step=0.00107, train/loss_step=0.276, global_step=504.0] 
Epoch 0:  90%|█████████ | 5381/5971 [49:52<05:28,  1.80it/s, loss=0.198, v_num=0, train/loss_simple_step=0.276, train/loss_vlb_step=0.00107, train/loss_step=0.276, global_step=504.0]
Epoch 0:  90%|█████████ | 5381/5971 [49:52<05:28,  1.80it/s, loss=0.203, v_num=0, train/loss_simple_step=0.0984, train/loss_vlb_step=0.000323, train/loss_step=0.0984, global_step=505.0]
Epoch 0:  90%|█████████ | 5382/5971 [49:53<05:27,  1.80it/s, loss=0.203, v_num=0, train/loss_simple_step=0.0984, train/loss_vlb_step=0.000323, train/loss_step=0.0984, global_step=505.0]
Epoch 0:  90%|█████████ | 5382/5971 [49:53<05:27,  1.80it/s, loss=0.218, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00139, train/loss_step=0.308, global_step=505.0]   
Epoch 0:  90%|█████████ | 5383/5971 [49:54<05:27,  1.80it/s, loss=0.218, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00139, train/loss_step=0.308, global_step=505.0]
Epoch 0:  90%|█████████ | 5383/5971 [49:54<05:27,  1.80it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0255, train/loss_vlb_step=0.0001, train/loss_step=0.0255, global_step=505.0]
Epoch 0:  90%|█████████ | 5384/5971 [49:56<05:26,  1.80it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0255, train/loss_vlb_step=0.0001, train/loss_step=0.0255, global_step=505.0]
Epoch 0:  90%|█████████ | 5384/5971 [49:56<05:26,  1.80it/s, loss=0.225, v_num=0, train/loss_simple_step=0.561, train/loss_vlb_step=0.0062, train/loss_step=0.561, global_step=505.0]  
Epoch 0:  90%|█████████ | 5385/5971 [49:57<05:26,  1.80it/s, loss=0.225, v_num=0, train/loss_simple_step=0.561, train/loss_vlb_step=0.0062, train/loss_step=0.561, global_step=505.0]
Epoch 0:  90%|█████████ | 5385/5971 [49:57<05:26,  1.80it/s, loss=0.223, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000495, train/loss_step=0.145, global_step=506.0]
Epoch 0:  90%|█████████ | 5386/5971 [49:58<05:25,  1.80it/s, loss=0.223, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000495, train/loss_step=0.145, global_step=506.0]
Epoch 0:  90%|█████████ | 5386/5971 [49:58<05:25,  1.80it/s, loss=0.222, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.00103, train/loss_step=0.250, global_step=506.0] 
Epoch 0:  90%|█████████ | 5387/5971 [49:59<05:25,  1.80it/s, loss=0.222, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.00103, train/loss_step=0.250, global_step=506.0]
Epoch 0:  90%|█████████ | 5387/5971 [49:59<05:25,  1.80it/s, loss=0.231, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00138, train/loss_step=0.292, global_step=506.0]
Epoch 0:  90%|█████████ | 5388/5971 [50:01<05:24,  1.80it/s, loss=0.231, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00138, train/loss_step=0.292, global_step=506.0]
Epoch 0:  90%|█████████ | 5388/5971 [50:01<05:24,  1.80it/s, loss=0.237, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000446, train/loss_step=0.133, global_step=506.0]
Epoch 0:  90%|█████████ | 5389/5971 [50:02<05:24,  1.80it/s, loss=0.237, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000446, train/loss_step=0.133, global_step=506.0]
Epoch 0:  90%|█████████ | 5389/5971 [50:02<05:24,  1.80it/s, loss=0.225, v_num=0, train/loss_simple_step=0.0627, train/loss_vlb_step=0.000206, train/loss_step=0.0627, global_step=507.0]
Epoch 0:  90%|█████████ | 5390/5971 [50:03<05:23,  1.80it/s, loss=0.225, v_num=0, train/loss_simple_step=0.0627, train/loss_vlb_step=0.000206, train/loss_step=0.0627, global_step=507.0]
Epoch 0:  90%|█████████ | 5390/5971 [50:03<05:23,  1.80it/s, loss=0.209, v_num=0, train/loss_simple_step=0.0819, train/loss_vlb_step=0.000273, train/loss_step=0.0819, global_step=507.0]
Epoch 0:  90%|█████████ | 5391/5971 [50:04<05:23,  1.79it/s, loss=0.209, v_num=0, train/loss_simple_step=0.0819, train/loss_vlb_step=0.000273, train/loss_step=0.0819, global_step=507.0]
Epoch 0:  90%|█████████ | 5391/5971 [50:04<05:23,  1.79it/s, loss=0.209, v_num=0, train/loss_simple_step=0.0339, train/loss_vlb_step=0.000131, train/loss_step=0.0339, global_step=507.0]
Epoch 0:  90%|█████████ | 5392/5971 [50:06<05:22,  1.79it/s, loss=0.209, v_num=0, train/loss_simple_step=0.0339, train/loss_vlb_step=0.000131, train/loss_step=0.0339, global_step=507.0]
Epoch 0:  90%|█████████ | 5392/5971 [50:06<05:22,  1.79it/s, loss=0.202, v_num=0, train/loss_simple_step=0.00267, train/loss_vlb_step=1.52e-5, train/loss_step=0.00267, global_step=507.0]
Epoch 0:  90%|█████████ | 5393/5971 [50:07<05:22,  1.79it/s, loss=0.202, v_num=0, train/loss_simple_step=0.00267, train/loss_vlb_step=1.52e-5, train/loss_step=0.00267, global_step=507.0]
Epoch 0:  90%|█████████ | 5393/5971 [50:07<05:22,  1.79it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.77e-5, train/loss_step=0.0157, global_step=508.0]  
Epoch 0:  90%|█████████ | 5394/5971 [50:08<05:21,  1.79it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.77e-5, train/loss_step=0.0157, global_step=508.0]
Epoch 0:  90%|█████████ | 5394/5971 [50:08<05:21,  1.79it/s, loss=0.187, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00137, train/loss_step=0.327, global_step=508.0]  
Epoch 0:  90%|█████████ | 5395/5971 [50:09<05:21,  1.79it/s, loss=0.187, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00137, train/loss_step=0.327, global_step=508.0]
Epoch 0:  90%|█████████ | 5395/5971 [50:09<05:21,  1.79it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0214, train/loss_vlb_step=8.81e-5, train/loss_step=0.0214, global_step=508.0]
Epoch 0:  90%|█████████ | 5396/5971 [50:11<05:20,  1.79it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0214, train/loss_vlb_step=8.81e-5, train/loss_step=0.0214, global_step=508.0]
Epoch 0:  90%|█████████ | 5396/5971 [50:11<05:20,  1.79it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0262, train/loss_vlb_step=9.89e-5, train/loss_step=0.0262, global_step=508.0] 
Epoch 0:  90%|█████████ | 5397/5971 [50:12<05:20,  1.79it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0262, train/loss_vlb_step=9.89e-5, train/loss_step=0.0262, global_step=508.0]
Epoch 0:  90%|█████████ | 5397/5971 [50:12<05:20,  1.79it/s, loss=0.146, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000742, train/loss_step=0.211, global_step=509.0]
Epoch 0:  90%|█████████ | 5398/5971 [50:13<05:19,  1.79it/s, loss=0.146, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000742, train/loss_step=0.211, global_step=509.0]
Epoch 0:  90%|█████████ | 5398/5971 [50:13<05:19,  1.79it/s, loss=0.158, v_num=0, train/loss_simple_step=0.232, train/loss_vlb_step=0.000929, train/loss_step=0.232, global_step=509.0]
Epoch 0:  90%|█████████ | 5399/5971 [50:14<05:19,  1.79it/s, loss=0.158, v_num=0, train/loss_simple_step=0.232, train/loss_vlb_step=0.000929, train/loss_step=0.232, global_step=509.0]
Epoch 0:  90%|█████████ | 5399/5971 [50:14<05:19,  1.79it/s, loss=0.16, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.000329, train/loss_step=0.100, global_step=509.0] 
Epoch 0:  90%|█████████ | 5400/5971 [50:16<05:18,  1.79it/s, loss=0.16, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.000329, train/loss_step=0.100, global_step=509.0]
Epoch 0:  90%|█████████ | 5400/5971 [50:16<05:18,  1.79it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0039, train/loss_vlb_step=2.12e-5, train/loss_step=0.0039, global_step=509.0]
Epoch 0:  90%|█████████ | 5401/5971 [50:17<05:18,  1.79it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0039, train/loss_vlb_step=2.12e-5, train/loss_step=0.0039, global_step=509.0]
Epoch 0:  90%|█████████ | 5401/5971 [50:17<05:18,  1.79it/s, loss=0.174, v_num=0, train/loss_simple_step=0.639, train/loss_vlb_step=0.00847, train/loss_step=0.639, global_step=510.0]  
Epoch 0:  90%|█████████ | 5402/5971 [50:18<05:17,  1.79it/s, loss=0.174, v_num=0, train/loss_simple_step=0.639, train/loss_vlb_step=0.00847, train/loss_step=0.639, global_step=510.0]
Epoch 0:  90%|█████████ | 5402/5971 [50:18<05:17,  1.79it/s, loss=0.175, v_num=0, train/loss_simple_step=0.334, train/loss_vlb_step=0.00159, train/loss_step=0.334, global_step=510.0]
Epoch 0:  90%|█████████ | 5403/5971 [50:19<05:17,  1.79it/s, loss=0.175, v_num=0, train/loss_simple_step=0.334, train/loss_vlb_step=0.00159, train/loss_step=0.334, global_step=510.0]
Epoch 0:  90%|█████████ | 5403/5971 [50:19<05:17,  1.79it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0201, train/loss_vlb_step=8.33e-5, train/loss_step=0.0201, global_step=510.0]
Epoch 0:  91%|█████████ | 5404/5971 [50:21<05:16,  1.79it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0201, train/loss_vlb_step=8.33e-5, train/loss_step=0.0201, global_step=510.0]
Epoch 0:  91%|█████████ | 5404/5971 [50:21<05:16,  1.79it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0175, train/loss_vlb_step=7.26e-5, train/loss_step=0.0175, global_step=510.0]
Epoch 0:  91%|█████████ | 5405/5971 [50:22<05:16,  1.79it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0175, train/loss_vlb_step=7.26e-5, train/loss_step=0.0175, global_step=510.0]
Epoch 0:  91%|█████████ | 5405/5971 [50:22<05:16,  1.79it/s, loss=0.154, v_num=0, train/loss_simple_step=0.270, train/loss_vlb_step=0.00103, train/loss_step=0.270, global_step=511.0]  
Epoch 0:  91%|█████████ | 5406/5971 [50:23<05:15,  1.79it/s, loss=0.154, v_num=0, train/loss_simple_step=0.270, train/loss_vlb_step=0.00103, train/loss_step=0.270, global_step=511.0]
Epoch 0:  91%|█████████ | 5406/5971 [50:23<05:15,  1.79it/s, loss=0.148, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000498, train/loss_step=0.145, global_step=511.0]
Epoch 0:  91%|█████████ | 5407/5971 [50:24<05:15,  1.79it/s, loss=0.148, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000498, train/loss_step=0.145, global_step=511.0]
Epoch 0:  91%|█████████ | 5407/5971 [50:24<05:15,  1.79it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.53e-5, train/loss_step=0.0121, global_step=511.0]
Epoch 0:  91%|█████████ | 5408/5971 [50:26<05:15,  1.79it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.53e-5, train/loss_step=0.0121, global_step=511.0]
Epoch 0:  91%|█████████ | 5408/5971 [50:26<05:15,  1.79it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0514, train/loss_vlb_step=0.000184, train/loss_step=0.0514, global_step=511.0]
Epoch 0:  91%|█████████ | 5409/5971 [50:27<05:14,  1.79it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0514, train/loss_vlb_step=0.000184, train/loss_step=0.0514, global_step=511.0]
Epoch 0:  91%|█████████ | 5409/5971 [50:27<05:14,  1.79it/s, loss=0.15, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.00323, train/loss_step=0.454, global_step=512.0]   
Epoch 0:  91%|█████████ | 5410/5971 [50:28<05:13,  1.79it/s, loss=0.15, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.00323, train/loss_step=0.454, global_step=512.0]
Epoch 0:  91%|█████████ | 5410/5971 [50:28<05:13,  1.79it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0131, train/loss_vlb_step=5.83e-5, train/loss_step=0.0131, global_step=512.0]
Epoch 0:  91%|█████████ | 5411/5971 [50:29<05:13,  1.79it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0131, train/loss_vlb_step=5.83e-5, train/loss_step=0.0131, global_step=512.0]
Epoch 0:  91%|█████████ | 5411/5971 [50:29<05:13,  1.79it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.59e-5, train/loss_step=0.0105, global_step=512.0]
Epoch 0:  91%|█████████ | 5412/5971 [50:31<05:13,  1.79it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.59e-5, train/loss_step=0.0105, global_step=512.0]
Epoch 0:  91%|█████████ | 5412/5971 [50:31<05:13,  1.79it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0871, train/loss_vlb_step=0.00029, train/loss_step=0.0871, global_step=512.0] 
Epoch 0:  91%|█████████ | 5413/5971 [50:32<05:12,  1.79it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0871, train/loss_vlb_step=0.00029, train/loss_step=0.0871, global_step=512.0]
Epoch 0:  91%|█████████ | 5413/5971 [50:32<05:12,  1.79it/s, loss=0.168, v_num=0, train/loss_simple_step=0.393, train/loss_vlb_step=0.0029, train/loss_step=0.393, global_step=513.0]  
Epoch 0:  91%|█████████ | 5414/5971 [50:33<05:12,  1.79it/s, loss=0.168, v_num=0, train/loss_simple_step=0.393, train/loss_vlb_step=0.0029, train/loss_step=0.393, global_step=513.0]
Epoch 0:  91%|█████████ | 5414/5971 [50:33<05:12,  1.79it/s, loss=0.168, v_num=0, train/loss_simple_step=0.321, train/loss_vlb_step=0.0014, train/loss_step=0.321, global_step=513.0]
Epoch 0:  91%|█████████ | 5415/5971 [50:34<05:11,  1.78it/s, loss=0.168, v_num=0, train/loss_simple_step=0.321, train/loss_vlb_step=0.0014, train/loss_step=0.321, global_step=513.0]
Epoch 0:  91%|█████████ | 5415/5971 [50:34<05:11,  1.78it/s, loss=0.178, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.00079, train/loss_step=0.229, global_step=513.0]
Epoch 0:  91%|█████████ | 5416/5971 [50:36<05:11,  1.78it/s, loss=0.178, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.00079, train/loss_step=0.229, global_step=513.0]
Epoch 0:  91%|█████████ | 5416/5971 [50:36<05:11,  1.78it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00533, train/loss_vlb_step=2.65e-5, train/loss_step=0.00533, global_step=513.0]
Epoch 0:  91%|█████████ | 5417/5971 [50:37<05:10,  1.78it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00533, train/loss_vlb_step=2.65e-5, train/loss_step=0.00533, global_step=513.0]
Epoch 0:  91%|█████████ | 5417/5971 [50:37<05:10,  1.78it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0917, train/loss_vlb_step=0.000311, train/loss_step=0.0917, global_step=514.0] 
Epoch 0:  91%|█████████ | 5418/5971 [50:38<05:10,  1.78it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0917, train/loss_vlb_step=0.000311, train/loss_step=0.0917, global_step=514.0]
Epoch 0:  91%|█████████ | 5418/5971 [50:38<05:10,  1.78it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0886, train/loss_vlb_step=0.000291, train/loss_step=0.0886, global_step=514.0]
Epoch 0:  91%|█████████ | 5419/5971 [50:38<05:09,  1.78it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0886, train/loss_vlb_step=0.000291, train/loss_step=0.0886, global_step=514.0]
Epoch 0:  91%|█████████ | 5419/5971 [50:38<05:09,  1.78it/s, loss=0.191, v_num=0, train/loss_simple_step=0.628, train/loss_vlb_step=0.00592, train/loss_step=0.628, global_step=514.0]   
Epoch 0:  91%|█████████ | 5420/5971 [50:41<05:09,  1.78it/s, loss=0.191, v_num=0, train/loss_simple_step=0.628, train/loss_vlb_step=0.00592, train/loss_step=0.628, global_step=514.0]
Epoch 0:  91%|█████████ | 5420/5971 [50:41<05:09,  1.78it/s, loss=0.214, v_num=0, train/loss_simple_step=0.479, train/loss_vlb_step=0.00401, train/loss_step=0.479, global_step=514.0]
Epoch 0:  91%|█████████ | 5421/5971 [50:42<05:08,  1.78it/s, loss=0.214, v_num=0, train/loss_simple_step=0.479, train/loss_vlb_step=0.00401, train/loss_step=0.479, global_step=514.0]
Epoch 0:  91%|█████████ | 5421/5971 [50:42<05:08,  1.78it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=5.14e-5, train/loss_step=0.0114, global_step=515.0]
Epoch 0:  91%|█████████ | 5422/5971 [50:43<05:08,  1.78it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=5.14e-5, train/loss_step=0.0114, global_step=515.0]
Epoch 0:  91%|█████████ | 5422/5971 [50:43<05:08,  1.78it/s, loss=0.18, v_num=0, train/loss_simple_step=0.280, train/loss_vlb_step=0.00112, train/loss_step=0.280, global_step=515.0]   
Epoch 0:  91%|█████████ | 5423/5971 [50:44<05:07,  1.78it/s, loss=0.18, v_num=0, train/loss_simple_step=0.280, train/loss_vlb_step=0.00112, train/loss_step=0.280, global_step=515.0]
Epoch 0:  91%|█████████ | 5423/5971 [50:44<05:07,  1.78it/s, loss=0.183, v_num=0, train/loss_simple_step=0.066, train/loss_vlb_step=0.000221, train/loss_step=0.066, global_step=515.0]
Epoch 0:  91%|█████████ | 5424/5971 [50:46<05:07,  1.78it/s, loss=0.183, v_num=0, train/loss_simple_step=0.066, train/loss_vlb_step=0.000221, train/loss_step=0.066, global_step=515.0]
Epoch 0:  91%|█████████ | 5424/5971 [50:46<05:07,  1.78it/s, loss=0.206, v_num=0, train/loss_simple_step=0.475, train/loss_vlb_step=0.00266, train/loss_step=0.475, global_step=515.0] 
Epoch 0:  91%|█████████ | 5425/5971 [50:47<05:06,  1.78it/s, loss=0.206, v_num=0, train/loss_simple_step=0.475, train/loss_vlb_step=0.00266, train/loss_step=0.475, global_step=515.0]
Epoch 0:  91%|█████████ | 5425/5971 [50:47<05:06,  1.78it/s, loss=0.199, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000433, train/loss_step=0.131, global_step=516.0]
Epoch 0:  91%|█████████ | 5426/5971 [50:48<05:06,  1.78it/s, loss=0.199, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000433, train/loss_step=0.131, global_step=516.0]
Epoch 0:  91%|█████████ | 5426/5971 [50:48<05:06,  1.78it/s, loss=0.208, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00143, train/loss_step=0.329, global_step=516.0] 
Epoch 0:  91%|█████████ | 5427/5971 [50:49<05:05,  1.78it/s, loss=0.208, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00143, train/loss_step=0.329, global_step=516.0]
Epoch 0:  91%|█████████ | 5427/5971 [50:49<05:05,  1.78it/s, loss=0.217, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000701, train/loss_step=0.202, global_step=516.0]
Epoch 0:  91%|█████████ | 5428/5971 [50:51<05:05,  1.78it/s, loss=0.217, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000701, train/loss_step=0.202, global_step=516.0]
Epoch 0:  91%|█████████ | 5428/5971 [50:51<05:05,  1.78it/s, loss=0.225, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.000674, train/loss_step=0.197, global_step=516.0]
Epoch 0:  91%|█████████ | 5429/5971 [50:52<05:04,  1.78it/s, loss=0.225, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.000674, train/loss_step=0.197, global_step=516.0]
Epoch 0:  91%|█████████ | 5429/5971 [50:52<05:04,  1.78it/s, loss=0.207, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.00037, train/loss_step=0.112, global_step=517.0] 
Epoch 0:  91%|█████████ | 5430/5971 [50:53<05:04,  1.78it/s, loss=0.207, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.00037, train/loss_step=0.112, global_step=517.0]
Epoch 0:  91%|█████████ | 5430/5971 [50:53<05:04,  1.78it/s, loss=0.207, v_num=0, train/loss_simple_step=0.00228, train/loss_vlb_step=1.29e-5, train/loss_step=0.00228, global_step=517.0]
Epoch 0:  91%|█████████ | 5431/5971 [50:54<05:03,  1.78it/s, loss=0.207, v_num=0, train/loss_simple_step=0.00228, train/loss_vlb_step=1.29e-5, train/loss_step=0.00228, global_step=517.0]
Epoch 0:  91%|█████████ | 5431/5971 [50:54<05:03,  1.78it/s, loss=0.235, v_num=0, train/loss_simple_step=0.574, train/loss_vlb_step=0.00572, train/loss_step=0.574, global_step=517.0]    
Epoch 0:  91%|█████████ | 5432/5971 [50:56<05:03,  1.78it/s, loss=0.235, v_num=0, train/loss_simple_step=0.574, train/loss_vlb_step=0.00572, train/loss_step=0.574, global_step=517.0]
Epoch 0:  91%|█████████ | 5432/5971 [50:56<05:03,  1.78it/s, loss=0.235, v_num=0, train/loss_simple_step=0.0752, train/loss_vlb_step=0.000258, train/loss_step=0.0752, global_step=517.0]
Epoch 0:  91%|█████████ | 5433/5971 [50:57<05:02,  1.78it/s, loss=0.235, v_num=0, train/loss_simple_step=0.0752, train/loss_vlb_step=0.000258, train/loss_step=0.0752, global_step=517.0]
Epoch 0:  91%|█████████ | 5433/5971 [50:57<05:02,  1.78it/s, loss=0.223, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000558, train/loss_step=0.168, global_step=518.0]  
Epoch 0:  91%|█████████ | 5434/5971 [50:58<05:02,  1.78it/s, loss=0.223, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000558, train/loss_step=0.168, global_step=518.0]
Epoch 0:  91%|█████████ | 5434/5971 [50:58<05:02,  1.78it/s, loss=0.208, v_num=0, train/loss_simple_step=0.00805, train/loss_vlb_step=3.85e-5, train/loss_step=0.00805, global_step=518.0]
Epoch 0:  91%|█████████ | 5435/5971 [50:59<05:01,  1.78it/s, loss=0.208, v_num=0, train/loss_simple_step=0.00805, train/loss_vlb_step=3.85e-5, train/loss_step=0.00805, global_step=518.0]
Epoch 0:  91%|█████████ | 5435/5971 [50:59<05:01,  1.78it/s, loss=0.204, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000564, train/loss_step=0.165, global_step=518.0]   
Epoch 0:  91%|█████████ | 5436/5971 [51:01<05:01,  1.78it/s, loss=0.204, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000564, train/loss_step=0.165, global_step=518.0]
Epoch 0:  91%|█████████ | 5436/5971 [51:01<05:01,  1.78it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.55e-5, train/loss_step=0.0128, global_step=518.0]
Epoch 0:  91%|█████████ | 5437/5971 [51:02<05:00,  1.78it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.55e-5, train/loss_step=0.0128, global_step=518.0]
Epoch 0:  91%|█████████ | 5437/5971 [51:02<05:00,  1.78it/s, loss=0.224, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.0054, train/loss_step=0.480, global_step=519.0]   
Epoch 0:  91%|█████████ | 5438/5971 [51:03<05:00,  1.78it/s, loss=0.224, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.0054, train/loss_step=0.480, global_step=519.0]
Epoch 0:  91%|█████████ | 5438/5971 [51:03<05:00,  1.78it/s, loss=0.224, v_num=0, train/loss_simple_step=0.0863, train/loss_vlb_step=0.000286, train/loss_step=0.0863, global_step=519.0]
Epoch 0:  91%|█████████ | 5439/5971 [51:04<04:59,  1.78it/s, loss=0.224, v_num=0, train/loss_simple_step=0.0863, train/loss_vlb_step=0.000286, train/loss_step=0.0863, global_step=519.0]
Epoch 0:  91%|█████████ | 5439/5971 [51:04<04:59,  1.78it/s, loss=0.206, v_num=0, train/loss_simple_step=0.271, train/loss_vlb_step=0.00106, train/loss_step=0.271, global_step=519.0]   
Epoch 0:  91%|█████████ | 5440/5971 [51:06<04:59,  1.77it/s, loss=0.206, v_num=0, train/loss_simple_step=0.271, train/loss_vlb_step=0.00106, train/loss_step=0.271, global_step=519.0]
Epoch 0:  91%|█████████ | 5440/5971 [51:06<04:59,  1.77it/s, loss=0.183, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.24e-5, train/loss_step=0.017, global_step=519.0]
Epoch 0:  91%|█████████ | 5441/5971 [51:07<04:58,  1.77it/s, loss=0.183, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.24e-5, train/loss_step=0.017, global_step=519.0]
Epoch 0:  91%|█████████ | 5441/5971 [51:07<04:58,  1.77it/s, loss=0.189, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000434, train/loss_step=0.132, global_step=520.0]
Epoch 0:  91%|█████████ | 5442/5971 [51:08<04:58,  1.77it/s, loss=0.189, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000434, train/loss_step=0.132, global_step=520.0]
Epoch 0:  91%|█████████ | 5442/5971 [51:08<04:58,  1.77it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0407, train/loss_vlb_step=0.000142, train/loss_step=0.0407, global_step=520.0]
Epoch 0:  91%|█████████ | 5443/5971 [51:09<04:57,  1.77it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0407, train/loss_vlb_step=0.000142, train/loss_step=0.0407, global_step=520.0]
Epoch 0:  91%|█████████ | 5443/5971 [51:09<04:57,  1.77it/s, loss=0.18, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000397, train/loss_step=0.120, global_step=520.0]   
Epoch 0:  91%|█████████ | 5444/5971 [51:11<04:57,  1.77it/s, loss=0.18, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000397, train/loss_step=0.120, global_step=520.0]
Epoch 0:  91%|█████████ | 5444/5971 [51:11<04:57,  1.77it/s, loss=0.172, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.00156, train/loss_step=0.322, global_step=520.0]
Epoch 0:  91%|█████████ | 5445/5971 [51:12<04:56,  1.77it/s, loss=0.172, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.00156, train/loss_step=0.322, global_step=520.0]
Epoch 0:  91%|█████████ | 5445/5971 [51:12<04:56,  1.77it/s, loss=0.175, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.000624, train/loss_step=0.176, global_step=521.0]
Epoch 0:  91%|█████████ | 5446/5971 [51:12<04:56,  1.77it/s, loss=0.175, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.000624, train/loss_step=0.176, global_step=521.0]
Epoch 0:  91%|█████████ | 5446/5971 [51:12<04:56,  1.77it/s, loss=0.174, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00132, train/loss_step=0.318, global_step=521.0] 
Epoch 0:  91%|█████████ | 5447/5971 [51:13<04:55,  1.77it/s, loss=0.174, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00132, train/loss_step=0.318, global_step=521.0]
Epoch 0:  91%|█████████ | 5447/5971 [51:13<04:55,  1.77it/s, loss=0.173, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000627, train/loss_step=0.180, global_step=521.0]
Epoch 0:  91%|█████████ | 5448/5971 [51:16<04:55,  1.77it/s, loss=0.173, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000627, train/loss_step=0.180, global_step=521.0]
Epoch 0:  91%|█████████ | 5448/5971 [51:16<04:55,  1.77it/s, loss=0.164, v_num=0, train/loss_simple_step=0.015, train/loss_vlb_step=6.66e-5, train/loss_step=0.015, global_step=521.0] 
Epoch 0:  91%|█████████▏| 5449/5971 [51:17<04:54,  1.77it/s, loss=0.164, v_num=0, train/loss_simple_step=0.015, train/loss_vlb_step=6.66e-5, train/loss_step=0.015, global_step=521.0]
Epoch 0:  91%|█████████▏| 5449/5971 [51:17<04:54,  1.77it/s, loss=0.165, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.00044, train/loss_step=0.133, global_step=522.0]
Epoch 0:  91%|█████████▏| 5450/5971 [51:18<04:54,  1.77it/s, loss=0.165, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.00044, train/loss_step=0.133, global_step=522.0]
Epoch 0:  91%|█████████▏| 5450/5971 [51:18<04:54,  1.77it/s, loss=0.181, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00182, train/loss_step=0.318, global_step=522.0]
Epoch 0:  91%|█████████▏| 5451/5971 [51:19<04:53,  1.77it/s, loss=0.181, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00182, train/loss_step=0.318, global_step=522.0]
Epoch 0:  91%|█████████▏| 5451/5971 [51:19<04:53,  1.77it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00542, train/loss_vlb_step=2.62e-5, train/loss_step=0.00542, global_step=522.0]
Epoch 0:  91%|█████████▏| 5452/5971 [51:21<04:53,  1.77it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00542, train/loss_vlb_step=2.62e-5, train/loss_step=0.00542, global_step=522.0]
Epoch 0:  91%|█████████▏| 5452/5971 [51:21<04:53,  1.77it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00209, train/loss_vlb_step=1.25e-5, train/loss_step=0.00209, global_step=522.0]
Epoch 0:  91%|█████████▏| 5453/5971 [51:22<04:52,  1.77it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00209, train/loss_vlb_step=1.25e-5, train/loss_step=0.00209, global_step=522.0]
Epoch 0:  91%|█████████▏| 5453/5971 [51:22<04:52,  1.77it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00294, train/loss_vlb_step=1.65e-5, train/loss_step=0.00294, global_step=523.0] 
Epoch 0:  91%|█████████▏| 5454/5971 [51:23<04:52,  1.77it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00294, train/loss_vlb_step=1.65e-5, train/loss_step=0.00294, global_step=523.0]
Epoch 0:  91%|█████████▏| 5454/5971 [51:23<04:52,  1.77it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0393, train/loss_vlb_step=0.000152, train/loss_step=0.0393, global_step=523.0]
Epoch 0:  91%|█████████▏| 5455/5971 [51:24<04:51,  1.77it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0393, train/loss_vlb_step=0.000152, train/loss_step=0.0393, global_step=523.0]
Epoch 0:  91%|█████████▏| 5455/5971 [51:24<04:51,  1.77it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0468, train/loss_vlb_step=0.000165, train/loss_step=0.0468, global_step=523.0]
Epoch 0:  91%|█████████▏| 5456/5971 [51:26<04:51,  1.77it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0468, train/loss_vlb_step=0.000165, train/loss_step=0.0468, global_step=523.0]
Epoch 0:  91%|█████████▏| 5456/5971 [51:26<04:51,  1.77it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.46e-5, train/loss_step=0.00487, global_step=523.0]
Epoch 0:  91%|█████████▏| 5457/5971 [51:27<04:50,  1.77it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.46e-5, train/loss_step=0.00487, global_step=523.0]
Epoch 0:  91%|█████████▏| 5457/5971 [51:27<04:50,  1.77it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0328, train/loss_vlb_step=0.000131, train/loss_step=0.0328, global_step=524.0] 
Epoch 0:  91%|█████████▏| 5458/5971 [51:28<04:50,  1.77it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0328, train/loss_vlb_step=0.000131, train/loss_step=0.0328, global_step=524.0]
Epoch 0:  91%|█████████▏| 5458/5971 [51:28<04:50,  1.77it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00411, train/loss_vlb_step=2.23e-5, train/loss_step=0.00411, global_step=524.0]
Epoch 0:  91%|█████████▏| 5459/5971 [51:28<04:49,  1.77it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00411, train/loss_vlb_step=2.23e-5, train/loss_step=0.00411, global_step=524.0]
Epoch 0:  91%|█████████▏| 5459/5971 [51:28<04:49,  1.77it/s, loss=0.0961, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.77e-5, train/loss_step=0.0125, global_step=524.0] 
Epoch 0:  91%|█████████▏| 5460/5971 [51:31<04:49,  1.77it/s, loss=0.0961, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.77e-5, train/loss_step=0.0125, global_step=524.0]
Epoch 0:  91%|█████████▏| 5460/5971 [51:31<04:49,  1.77it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:16,  2.18it/s][A
Epoch 0:  91%|█████████▏| 5462/5971 [51:31<04:48,  1.77it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:   1%|          | 2/167 [00:00<00:43,  3.83it/s][A
Epoch 0:  92%|█████████▏| 5464/5971 [51:31<04:46,  1.77it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:   3%|▎         | 5/167 [00:00<00:16,  9.55it/s][A
Epoch 0:  92%|█████████▏| 5467/5971 [51:31<04:44,  1.77it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:   4%|▍         | 7/167 [00:00<00:13, 12.12it/s][A
Epoch 0:  92%|█████████▏| 5470/5971 [51:32<04:43,  1.77it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:   6%|▌         | 10/167 [00:00<00:09, 16.14it/s][A
Epoch 0:  92%|█████████▏| 5473/5971 [51:32<04:41,  1.77it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:   8%|▊         | 13/167 [00:01<00:07, 19.30it/s][A
Epoch 0:  92%|█████████▏| 5476/5971 [51:32<04:39,  1.77it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  10%|▉         | 16/167 [00:01<00:07, 20.00it/s][A
Epoch 0:  92%|█████████▏| 5479/5971 [51:32<04:37,  1.77it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  11%|█▏        | 19/167 [00:01<00:06, 21.25it/s][A
Epoch 0:  92%|█████████▏| 5482/5971 [51:32<04:35,  1.77it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  13%|█▎        | 22/167 [00:01<00:06, 22.72it/s][A
Epoch 0:  92%|█████████▏| 5485/5971 [51:32<04:33,  1.77it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  15%|█▍        | 25/167 [00:01<00:06, 22.76it/s][A
Epoch 0:  92%|█████████▏| 5488/5971 [51:32<04:32,  1.77it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  17%|█▋        | 28/167 [00:01<00:05, 23.45it/s][A
Epoch 0:  92%|█████████▏| 5491/5971 [51:32<04:30,  1.78it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  19%|█▊        | 31/167 [00:01<00:05, 23.76it/s][A
Epoch 0:  92%|█████████▏| 5494/5971 [51:32<04:28,  1.78it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  20%|██        | 34/167 [00:01<00:05, 25.06it/s][A
Epoch 0:  92%|█████████▏| 5497/5971 [51:33<04:26,  1.78it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  22%|██▏       | 37/167 [00:02<00:05, 23.90it/s][A
Epoch 0:  92%|█████████▏| 5500/5971 [51:33<04:24,  1.78it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  24%|██▍       | 40/167 [00:02<00:05, 23.99it/s][A
Epoch 0:  92%|█████████▏| 5503/5971 [51:33<04:23,  1.78it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  26%|██▌       | 43/167 [00:02<00:06, 19.51it/s][A
Epoch 0:  92%|█████████▏| 5506/5971 [51:33<04:21,  1.78it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  28%|██▊       | 46/167 [00:02<00:05, 21.39it/s][A
Epoch 0:  92%|█████████▏| 5509/5971 [51:33<04:19,  1.78it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  29%|██▉       | 49/167 [00:02<00:05, 22.88it/s][A
Epoch 0:  92%|█████████▏| 5512/5971 [51:33<04:17,  1.78it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  31%|███       | 52/167 [00:02<00:04, 24.21it/s][A
Epoch 0:  92%|█████████▏| 5515/5971 [51:33<04:15,  1.78it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  33%|███▎      | 55/167 [00:02<00:04, 24.38it/s][A
Epoch 0:  92%|█████████▏| 5518/5971 [51:34<04:13,  1.78it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  35%|███▍      | 58/167 [00:02<00:04, 24.89it/s][A

Validating:  37%|███▋      | 61/167 [00:03<00:04, 26.05it/s][A
Epoch 0:  92%|█████████▏| 5522/5971 [51:34<04:11,  1.78it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  38%|███▊      | 64/167 [00:03<00:03, 26.79it/s][A
Epoch 0:  93%|█████████▎| 5526/5971 [51:34<04:09,  1.79it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  40%|████      | 67/167 [00:03<00:03, 26.27it/s][A
Epoch 0:  93%|█████████▎| 5530/5971 [51:34<04:06,  1.79it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 26.27it/s][A

Validating:  44%|████▎     | 73/167 [00:03<00:03, 25.75it/s][A
Epoch 0:  93%|█████████▎| 5534/5971 [51:34<04:04,  1.79it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 26.68it/s][A
Epoch 0:  93%|█████████▎| 5538/5971 [51:34<04:01,  1.79it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 26.51it/s][A
Epoch 0:  93%|█████████▎| 5542/5971 [51:34<03:59,  1.79it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  49%|████▉     | 82/167 [00:03<00:03, 27.43it/s][A

Validating:  51%|█████     | 85/167 [00:03<00:03, 26.69it/s][A
Epoch 0:  93%|█████████▎| 5546/5971 [51:35<03:57,  1.79it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  53%|█████▎    | 88/167 [00:04<00:03, 26.24it/s][A
Epoch 0:  93%|█████████▎| 5550/5971 [51:35<03:54,  1.79it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  54%|█████▍    | 91/167 [00:04<00:02, 26.59it/s][A
Epoch 0:  93%|█████████▎| 5554/5971 [51:35<03:52,  1.79it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 27.88it/s][A
Epoch 0:  93%|█████████▎| 5558/5971 [51:35<03:49,  1.80it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 25.86it/s][A

Validating:  60%|██████    | 101/167 [00:04<00:02, 26.02it/s][A
Epoch 0:  93%|█████████▎| 5562/5971 [51:35<03:47,  1.80it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 25.80it/s][A
Epoch 0:  93%|█████████▎| 5566/5971 [51:35<03:45,  1.80it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 25.40it/s][A
Epoch 0:  93%|█████████▎| 5570/5971 [51:35<03:42,  1.80it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 25.71it/s][A

Validating:  68%|██████▊   | 113/167 [00:05<00:02, 26.21it/s][A
Epoch 0:  93%|█████████▎| 5574/5971 [51:36<03:40,  1.80it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  69%|██████▉   | 116/167 [00:05<00:01, 25.81it/s][A
Epoch 0:  93%|█████████▎| 5578/5971 [51:36<03:38,  1.80it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 26.32it/s][A
Epoch 0:  93%|█████████▎| 5582/5971 [51:36<03:35,  1.80it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 26.66it/s][A

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 25.19it/s][A
Epoch 0:  94%|█████████▎| 5586/5971 [51:36<03:33,  1.80it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 26.00it/s][A
Epoch 0:  94%|█████████▎| 5590/5971 [51:36<03:31,  1.81it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 24.49it/s][A
Epoch 0:  94%|█████████▎| 5594/5971 [51:36<03:28,  1.81it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  80%|████████  | 134/167 [00:05<00:01, 25.84it/s][A

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 26.32it/s][A
Epoch 0:  94%|█████████▍| 5598/5971 [51:37<03:26,  1.81it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  84%|████████▍ | 140/167 [00:06<00:01, 26.48it/s][A
Epoch 0:  94%|█████████▍| 5602/5971 [51:37<03:23,  1.81it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 27.80it/s][A
Epoch 0:  94%|█████████▍| 5606/5971 [51:37<03:21,  1.81it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 27.28it/s][A
Epoch 0:  94%|█████████▍| 5610/5971 [51:37<03:19,  1.81it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 24.32it/s][A

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 21.65it/s][A
Epoch 0:  94%|█████████▍| 5614/5971 [51:37<03:16,  1.81it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 21.31it/s][A
Epoch 0:  94%|█████████▍| 5618/5971 [51:37<03:14,  1.81it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 22.27it/s][A
Epoch 0:  94%|█████████▍| 5622/5971 [51:38<03:12,  1.81it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Validating:  97%|█████████▋| 162/167 [00:07<00:00, 23.65it/s][A

Validating:  99%|█████████▉| 165/167 [00:07<00:00, 24.74it/s][A
Epoch 0:  94%|█████████▍| 5626/5971 [51:38<03:09,  1.82it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]
Epoch 0:  94%|█████████▍| 5628/5971 [51:38<03:08,  1.82it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.08e-5, train/loss_step=0.0192, global_step=524.0]

Epoch 0:  94%|█████████▍| 5629/5971 [51:39<03:08,  1.82it/s, loss=0.0914, v_num=0, train/loss_simple_step=0.036, train/loss_vlb_step=0.00013, train/loss_step=0.036, global_step=525.0]  
Epoch 0:  94%|█████████▍| 5630/5971 [51:40<03:07,  1.82it/s, loss=0.0914, v_num=0, train/loss_simple_step=0.036, train/loss_vlb_step=0.00013, train/loss_step=0.036, global_step=525.0]
Epoch 0:  94%|█████████▍| 5630/5971 [51:40<03:07,  1.82it/s, loss=0.0975, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.00054, train/loss_step=0.162, global_step=525.0]
Epoch 0:  94%|█████████▍| 5631/5971 [51:41<03:07,  1.82it/s, loss=0.092, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.91e-5, train/loss_step=0.0102, global_step=525.0]
Epoch 0:  94%|█████████▍| 5632/5971 [51:43<03:06,  1.82it/s, loss=0.0855, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.00069, train/loss_step=0.192, global_step=525.0] 
Epoch 0:  94%|█████████▍| 5633/5971 [51:44<03:06,  1.81it/s, loss=0.0981, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00259, train/loss_step=0.427, global_step=526.0]
Epoch 0:  94%|█████████▍| 5634/5971 [51:45<03:05,  1.81it/s, loss=0.0981, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00259, train/loss_step=0.427, global_step=526.0]
Epoch 0:  94%|█████████▍| 5634/5971 [51:45<03:05,  1.81it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.0922, train/loss_vlb_step=0.000304, train/loss_step=0.0922, global_step=526.0]
Epoch 0:  94%|█████████▍| 5635/5971 [51:46<03:05,  1.81it/s, loss=0.078, v_num=0, train/loss_simple_step=0.00355, train/loss_vlb_step=1.95e-5, train/loss_step=0.00355, global_step=526.0]
Epoch 0:  94%|█████████▍| 5636/5971 [51:48<03:04,  1.81it/s, loss=0.0774, v_num=0, train/loss_simple_step=0.00399, train/loss_vlb_step=2.19e-5, train/loss_step=0.00399, global_step=526.0]
Epoch 0:  94%|█████████▍| 5637/5971 [51:49<03:04,  1.81it/s, loss=0.0711, v_num=0, train/loss_simple_step=0.00716, train/loss_vlb_step=3.61e-5, train/loss_step=0.00716, global_step=527.0]
Epoch 0:  94%|█████████▍| 5638/5971 [51:49<03:03,  1.81it/s, loss=0.0711, v_num=0, train/loss_simple_step=0.00716, train/loss_vlb_step=3.61e-5, train/loss_step=0.00716, global_step=527.0]
Epoch 0:  94%|█████████▍| 5638/5971 [51:49<03:03,  1.81it/s, loss=0.057, v_num=0, train/loss_simple_step=0.0348, train/loss_vlb_step=0.000128, train/loss_step=0.0348, global_step=527.0]  
Epoch 0:  94%|█████████▍| 5639/5971 [51:50<03:03,  1.81it/s, loss=0.0792, v_num=0, train/loss_simple_step=0.449, train/loss_vlb_step=0.00368, train/loss_step=0.449, global_step=527.0]  
Epoch 0:  94%|█████████▍| 5640/5971 [51:53<03:02,  1.81it/s, loss=0.0918, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000994, train/loss_step=0.255, global_step=527.0]
Epoch 0:  94%|█████████▍| 5641/5971 [51:53<03:02,  1.81it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.0488, train/loss_vlb_step=0.000172, train/loss_step=0.0488, global_step=528.0]
Epoch 0:  94%|█████████▍| 5642/5971 [51:54<03:01,  1.81it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.0488, train/loss_vlb_step=0.000172, train/loss_step=0.0488, global_step=528.0]
Epoch 0:  94%|█████████▍| 5642/5971 [51:54<03:01,  1.81it/s, loss=0.0998, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000504, train/loss_step=0.153, global_step=528.0]  
Epoch 0:  95%|█████████▍| 5643/5971 [51:55<03:01,  1.81it/s, loss=0.129, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00713, train/loss_step=0.624, global_step=528.0]  
Epoch 0:  95%|█████████▍| 5644/5971 [51:57<03:00,  1.81it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0039, train/loss_vlb_step=2.05e-5, train/loss_step=0.0039, global_step=528.0]
Epoch 0:  95%|█████████▍| 5645/5971 [51:58<03:00,  1.81it/s, loss=0.143, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.00148, train/loss_step=0.322, global_step=529.0]  
Epoch 0:  95%|█████████▍| 5646/5971 [51:59<02:59,  1.81it/s, loss=0.143, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.00148, train/loss_step=0.322, global_step=529.0]
Epoch 0:  95%|█████████▍| 5646/5971 [51:59<02:59,  1.81it/s, loss=0.15, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000461, train/loss_step=0.140, global_step=529.0]
Epoch 0:  95%|█████████▍| 5647/5971 [52:00<02:59,  1.81it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00153, train/loss_vlb_step=9.19e-6, train/loss_step=0.00153, global_step=529.0]
Epoch 0:  95%|█████████▍| 5648/5971 [52:02<02:58,  1.81it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0353, train/loss_vlb_step=0.00013, train/loss_step=0.0353, global_step=529.0]   
Epoch 0:  95%|█████████▍| 5649/5971 [52:03<02:58,  1.81it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00695, train/loss_vlb_step=3.58e-5, train/loss_step=0.00695, global_step=530.0]
Epoch 0:  95%|█████████▍| 5650/5971 [52:04<02:57,  1.81it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00695, train/loss_vlb_step=3.58e-5, train/loss_step=0.00695, global_step=530.0]
Epoch 0:  95%|█████████▍| 5650/5971 [52:04<02:57,  1.81it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0628, train/loss_vlb_step=0.000218, train/loss_step=0.0628, global_step=530.0] 
Epoch 0:  95%|█████████▍| 5651/5971 [52:05<02:56,  1.81it/s, loss=0.161, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00215, train/loss_step=0.348, global_step=530.0]   
Epoch 0:  95%|█████████▍| 5652/5971 [52:07<02:56,  1.81it/s, loss=0.156, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000363, train/loss_step=0.109, global_step=530.0]
Epoch 0:  95%|█████████▍| 5653/5971 [52:08<02:55,  1.81it/s, loss=0.144, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.000657, train/loss_step=0.187, global_step=531.0]
Epoch 0:  95%|█████████▍| 5654/5971 [52:09<02:55,  1.81it/s, loss=0.144, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.000657, train/loss_step=0.187, global_step=531.0]
Epoch 0:  95%|█████████▍| 5654/5971 [52:09<02:55,  1.81it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0289, train/loss_vlb_step=0.000112, train/loss_step=0.0289, global_step=531.0]
Epoch 0:  95%|█████████▍| 5655/5971 [52:10<02:54,  1.81it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00869, train/loss_vlb_step=4.12e-5, train/loss_step=0.00869, global_step=531.0]
Epoch 0:  95%|█████████▍| 5656/5971 [52:12<02:54,  1.81it/s, loss=0.167, v_num=0, train/loss_simple_step=0.515, train/loss_vlb_step=0.00376, train/loss_step=0.515, global_step=531.0]    
Epoch 0:  95%|█████████▍| 5657/5971 [52:13<02:53,  1.81it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0659, train/loss_vlb_step=0.000224, train/loss_step=0.0659, global_step=532.0]
Epoch 0:  95%|█████████▍| 5658/5971 [52:13<02:53,  1.81it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0659, train/loss_vlb_step=0.000224, train/loss_step=0.0659, global_step=532.0]
Epoch 0:  95%|█████████▍| 5658/5971 [52:13<02:53,  1.81it/s, loss=0.169, v_num=0, train/loss_simple_step=0.012, train/loss_vlb_step=5.58e-5, train/loss_step=0.012, global_step=532.0]  
Epoch 0:  95%|█████████▍| 5659/5971 [52:14<02:52,  1.81it/s, loss=0.179, v_num=0, train/loss_simple_step=0.653, train/loss_vlb_step=0.0123, train/loss_step=0.653, global_step=532.0] 
Epoch 0:  95%|█████████▍| 5660/5971 [52:17<02:52,  1.80it/s, loss=0.174, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000554, train/loss_step=0.158, global_step=532.0]
Epoch 0:  95%|█████████▍| 5661/5971 [52:17<02:51,  1.80it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00157, train/loss_vlb_step=9.54e-6, train/loss_step=0.00157, global_step=533.0]
Epoch 0:  95%|█████████▍| 5662/5971 [52:18<02:51,  1.80it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00157, train/loss_vlb_step=9.54e-6, train/loss_step=0.00157, global_step=533.0]
Epoch 0:  95%|█████████▍| 5662/5971 [52:18<02:51,  1.80it/s, loss=0.194, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.00876, train/loss_step=0.606, global_step=533.0]    
Epoch 0:  95%|█████████▍| 5663/5971 [52:19<02:50,  1.80it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00261, train/loss_vlb_step=1.51e-5, train/loss_step=0.00261, global_step=533.0]
Epoch 0:  95%|█████████▍| 5664/5971 [52:21<02:50,  1.80it/s, loss=0.167, v_num=0, train/loss_simple_step=0.082, train/loss_vlb_step=0.000274, train/loss_step=0.082, global_step=533.0]   
Epoch 0:  95%|█████████▍| 5665/5971 [52:22<02:49,  1.80it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0249, train/loss_vlb_step=9.89e-5, train/loss_step=0.0249, global_step=534.0]
Epoch 0:  95%|█████████▍| 5666/5971 [52:23<02:49,  1.80it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0249, train/loss_vlb_step=9.89e-5, train/loss_step=0.0249, global_step=534.0]
Epoch 0:  95%|█████████▍| 5666/5971 [52:23<02:49,  1.80it/s, loss=0.169, v_num=0, train/loss_simple_step=0.481, train/loss_vlb_step=0.00375, train/loss_step=0.481, global_step=534.0]  
Epoch 0:  95%|█████████▍| 5667/5971 [52:24<02:48,  1.80it/s, loss=0.212, v_num=0, train/loss_simple_step=0.849, train/loss_vlb_step=0.108, train/loss_step=0.849, global_step=534.0]  
Epoch 0:  95%|█████████▍| 5668/5971 [52:26<02:48,  1.80it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0226, train/loss_vlb_step=8.95e-5, train/loss_step=0.0226, global_step=534.0]
Epoch 0:  95%|█████████▍| 5669/5971 [52:27<02:47,  1.80it/s, loss=0.211, v_num=0, train/loss_simple_step=0.00179, train/loss_vlb_step=1.09e-5, train/loss_step=0.00179, global_step=535.0]
Epoch 0:  95%|█████████▍| 5670/5971 [52:28<02:47,  1.80it/s, loss=0.211, v_num=0, train/loss_simple_step=0.00179, train/loss_vlb_step=1.09e-5, train/loss_step=0.00179, global_step=535.0]
Epoch 0:  95%|█████████▍| 5670/5971 [52:28<02:47,  1.80it/s, loss=0.208, v_num=0, train/loss_simple_step=0.0026, train/loss_vlb_step=1.46e-5, train/loss_step=0.0026, global_step=535.0]  
Epoch 0:  95%|█████████▍| 5671/5971 [52:29<02:46,  1.80it/s, loss=0.196, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000338, train/loss_step=0.101, global_step=535.0] 
Epoch 0:  95%|█████████▍| 5672/5971 [52:31<02:46,  1.80it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0349, train/loss_vlb_step=0.000129, train/loss_step=0.0349, global_step=535.0]
Epoch 0:  95%|█████████▌| 5673/5971 [52:32<02:45,  1.80it/s, loss=0.199, v_num=0, train/loss_simple_step=0.336, train/loss_vlb_step=0.00131, train/loss_step=0.336, global_step=536.0]   
Epoch 0:  95%|█████████▌| 5674/5971 [52:33<02:45,  1.80it/s, loss=0.199, v_num=0, train/loss_simple_step=0.336, train/loss_vlb_step=0.00131, train/loss_step=0.336, global_step=536.0]
Epoch 0:  95%|█████████▌| 5674/5971 [52:33<02:45,  1.80it/s, loss=0.21, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.000928, train/loss_step=0.243, global_step=536.0]
Epoch 0:  95%|█████████▌| 5675/5971 [52:34<02:44,  1.80it/s, loss=0.212, v_num=0, train/loss_simple_step=0.0523, train/loss_vlb_step=0.000184, train/loss_step=0.0523, global_step=536.0]
Epoch 0:  95%|█████████▌| 5676/5971 [52:36<02:44,  1.80it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0693, train/loss_vlb_step=0.000239, train/loss_step=0.0693, global_step=536.0] 
Epoch 0:  95%|█████████▌| 5677/5971 [52:37<02:43,  1.80it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0292, train/loss_vlb_step=0.000118, train/loss_step=0.0292, global_step=537.0]
Epoch 0:  95%|█████████▌| 5678/5971 [52:38<02:42,  1.80it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0292, train/loss_vlb_step=0.000118, train/loss_step=0.0292, global_step=537.0]
Epoch 0:  95%|█████████▌| 5678/5971 [52:38<02:42,  1.80it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0264, train/loss_vlb_step=0.000108, train/loss_step=0.0264, global_step=537.0]
Epoch 0:  95%|█████████▌| 5679/5971 [52:39<02:42,  1.80it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0065, train/loss_vlb_step=3.1e-5, train/loss_step=0.0065, global_step=537.0]  
Epoch 0:  95%|█████████▌| 5680/5971 [52:41<02:41,  1.80it/s, loss=0.159, v_num=0, train/loss_simple_step=0.198, train/loss_vlb_step=0.000654, train/loss_step=0.198, global_step=537.0]
Epoch 0:  95%|█████████▌| 5681/5971 [52:42<02:41,  1.80it/s, loss=0.162, v_num=0, train/loss_simple_step=0.076, train/loss_vlb_step=0.000258, train/loss_step=0.076, global_step=538.0]
Epoch 0:  95%|█████████▌| 5682/5971 [52:42<02:40,  1.80it/s, loss=0.162, v_num=0, train/loss_simple_step=0.076, train/loss_vlb_step=0.000258, train/loss_step=0.076, global_step=538.0]
Epoch 0:  95%|█████████▌| 5682/5971 [52:42<02:40,  1.80it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00276, train/loss_vlb_step=1.54e-5, train/loss_step=0.00276, global_step=538.0]
Epoch 0:  95%|█████████▌| 5683/5971 [52:43<02:40,  1.80it/s, loss=0.146, v_num=0, train/loss_simple_step=0.276, train/loss_vlb_step=0.00107, train/loss_step=0.276, global_step=538.0]    
Epoch 0:  95%|█████████▌| 5684/5971 [52:46<02:39,  1.80it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0694, train/loss_vlb_step=0.000244, train/loss_step=0.0694, global_step=538.0]
Epoch 0:  95%|█████████▌| 5685/5971 [52:47<02:39,  1.80it/s, loss=0.158, v_num=0, train/loss_simple_step=0.286, train/loss_vlb_step=0.00118, train/loss_step=0.286, global_step=539.0]   
Epoch 0:  95%|█████████▌| 5686/5971 [52:48<02:38,  1.80it/s, loss=0.158, v_num=0, train/loss_simple_step=0.286, train/loss_vlb_step=0.00118, train/loss_step=0.286, global_step=539.0]
Epoch 0:  95%|█████████▌| 5686/5971 [52:48<02:38,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.433, train/loss_vlb_step=0.00359, train/loss_step=0.433, global_step=539.0]
Epoch 0:  95%|█████████▌| 5687/5971 [52:49<02:38,  1.79it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00322, train/loss_vlb_step=1.71e-5, train/loss_step=0.00322, global_step=539.0]
Epoch 0:  95%|█████████▌| 5688/5971 [52:51<02:37,  1.79it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0129, train/loss_vlb_step=5.54e-5, train/loss_step=0.0129, global_step=539.0]  
Epoch 0:  95%|█████████▌| 5689/5971 [52:52<02:37,  1.79it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00339, train/loss_vlb_step=1.81e-5, train/loss_step=0.00339, global_step=540.0]
Epoch 0:  95%|█████████▌| 5690/5971 [52:53<02:36,  1.79it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00339, train/loss_vlb_step=1.81e-5, train/loss_step=0.00339, global_step=540.0]
Epoch 0:  95%|█████████▌| 5690/5971 [52:53<02:36,  1.79it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0483, train/loss_vlb_step=0.000171, train/loss_step=0.0483, global_step=540.0] 
Epoch 0:  95%|█████████▌| 5691/5971 [52:54<02:36,  1.79it/s, loss=0.117, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000472, train/loss_step=0.143, global_step=540.0]  
Epoch 0:  95%|█████████▌| 5692/5971 [52:56<02:35,  1.79it/s, loss=0.127, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000865, train/loss_step=0.228, global_step=540.0]
Epoch 0:  95%|█████████▌| 5693/5971 [52:57<02:35,  1.79it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0502, train/loss_vlb_step=0.00019, train/loss_step=0.0502, global_step=541.0]
Epoch 0:  95%|█████████▌| 5694/5971 [52:58<02:34,  1.79it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0502, train/loss_vlb_step=0.00019, train/loss_step=0.0502, global_step=541.0]
Epoch 0:  95%|█████████▌| 5694/5971 [52:58<02:34,  1.79it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=5.24e-5, train/loss_step=0.0118, global_step=541.0]
Epoch 0:  95%|█████████▌| 5695/5971 [52:59<02:34,  1.79it/s, loss=0.0988, v_num=0, train/loss_simple_step=0.00195, train/loss_vlb_step=1.15e-5, train/loss_step=0.00195, global_step=541.0]
Epoch 0:  95%|█████████▌| 5696/5971 [53:01<02:33,  1.79it/s, loss=0.0963, v_num=0, train/loss_simple_step=0.0189, train/loss_vlb_step=7.72e-5, train/loss_step=0.0189, global_step=541.0]  
Epoch 0:  95%|█████████▌| 5697/5971 [53:02<02:33,  1.79it/s, loss=0.0949, v_num=0, train/loss_simple_step=0.00266, train/loss_vlb_step=1.48e-5, train/loss_step=0.00266, global_step=542.0]
Epoch 0:  95%|█████████▌| 5698/5971 [53:03<02:32,  1.79it/s, loss=0.0949, v_num=0, train/loss_simple_step=0.00266, train/loss_vlb_step=1.48e-5, train/loss_step=0.00266, global_step=542.0]
Epoch 0:  95%|█████████▌| 5698/5971 [53:03<02:32,  1.79it/s, loss=0.1, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000444, train/loss_step=0.134, global_step=542.0]      
Epoch 0:  95%|█████████▌| 5699/5971 [53:04<02:31,  1.79it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0189, train/loss_vlb_step=7.88e-5, train/loss_step=0.0189, global_step=542.0]
Epoch 0:  95%|█████████▌| 5700/5971 [53:06<02:31,  1.79it/s, loss=0.1, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.00063, train/loss_step=0.185, global_step=542.0]    
Epoch 0:  95%|█████████▌| 5701/5971 [53:07<02:30,  1.79it/s, loss=0.0979, v_num=0, train/loss_simple_step=0.028, train/loss_vlb_step=0.000106, train/loss_step=0.028, global_step=543.0]
Epoch 0:  95%|█████████▌| 5702/5971 [53:08<02:30,  1.79it/s, loss=0.0979, v_num=0, train/loss_simple_step=0.028, train/loss_vlb_step=0.000106, train/loss_step=0.028, global_step=543.0]
Epoch 0:  95%|█████████▌| 5702/5971 [53:08<02:30,  1.79it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0849, train/loss_vlb_step=0.000283, train/loss_step=0.0849, global_step=543.0]
Epoch 0:  96%|█████████▌| 5703/5971 [53:09<02:29,  1.79it/s, loss=0.0883, v_num=0, train/loss_simple_step=0.00148, train/loss_vlb_step=8.96e-6, train/loss_step=0.00148, global_step=543.0]
Epoch 0:  96%|█████████▌| 5704/5971 [53:11<02:29,  1.79it/s, loss=0.0857, v_num=0, train/loss_simple_step=0.0177, train/loss_vlb_step=7.27e-5, train/loss_step=0.0177, global_step=543.0]  
Epoch 0:  96%|█████████▌| 5705/5971 [53:12<02:28,  1.79it/s, loss=0.0717, v_num=0, train/loss_simple_step=0.0063, train/loss_vlb_step=3.07e-5, train/loss_step=0.0063, global_step=544.0]
Epoch 0:  96%|█████████▌| 5706/5971 [53:12<02:28,  1.79it/s, loss=0.0717, v_num=0, train/loss_simple_step=0.0063, train/loss_vlb_step=3.07e-5, train/loss_step=0.0063, global_step=544.0]
Epoch 0:  96%|█████████▌| 5706/5971 [53:12<02:28,  1.79it/s, loss=0.0503, v_num=0, train/loss_simple_step=0.00397, train/loss_vlb_step=2.05e-5, train/loss_step=0.00397, global_step=544.0]
Epoch 0:  96%|█████████▌| 5707/5971 [53:13<02:27,  1.79it/s, loss=0.0679, v_num=0, train/loss_simple_step=0.356, train/loss_vlb_step=0.00201, train/loss_step=0.356, global_step=544.0]    
Epoch 0:  96%|█████████▌| 5708/5971 [53:16<02:27,  1.79it/s, loss=0.0683, v_num=0, train/loss_simple_step=0.0215, train/loss_vlb_step=9.1e-5, train/loss_step=0.0215, global_step=544.0]
Epoch 0:  96%|█████████▌| 5709/5971 [53:16<02:26,  1.79it/s, loss=0.0702, v_num=0, train/loss_simple_step=0.0415, train/loss_vlb_step=0.000156, train/loss_step=0.0415, global_step=545.0]
Epoch 0:  96%|█████████▌| 5710/5971 [53:17<02:26,  1.79it/s, loss=0.0702, v_num=0, train/loss_simple_step=0.0415, train/loss_vlb_step=0.000156, train/loss_step=0.0415, global_step=545.0]
Epoch 0:  96%|█████████▌| 5710/5971 [53:17<02:26,  1.79it/s, loss=0.086, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00167, train/loss_step=0.364, global_step=545.0]    
Epoch 0:  96%|█████████▌| 5711/5971 [53:18<02:25,  1.79it/s, loss=0.0796, v_num=0, train/loss_simple_step=0.0151, train/loss_vlb_step=6.52e-5, train/loss_step=0.0151, global_step=545.0]
Epoch 0:  96%|█████████▌| 5712/5971 [53:20<02:25,  1.78it/s, loss=0.071, v_num=0, train/loss_simple_step=0.0558, train/loss_vlb_step=0.000197, train/loss_step=0.0558, global_step=545.0]
Epoch 0:  96%|█████████▌| 5713/5971 [53:21<02:24,  1.78it/s, loss=0.0846, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.00199, train/loss_step=0.322, global_step=546.0]  
Epoch 0:  96%|█████████▌| 5714/5971 [53:22<02:24,  1.78it/s, loss=0.0846, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.00199, train/loss_step=0.322, global_step=546.0]
Epoch 0:  96%|█████████▌| 5714/5971 [53:22<02:24,  1.78it/s, loss=0.0883, v_num=0, train/loss_simple_step=0.0861, train/loss_vlb_step=0.000293, train/loss_step=0.0861, global_step=546.0]
Epoch 0:  96%|█████████▌| 5715/5971 [53:23<02:23,  1.78it/s, loss=0.0889, v_num=0, train/loss_simple_step=0.014, train/loss_vlb_step=6.36e-5, train/loss_step=0.014, global_step=546.0]   
Epoch 0:  96%|█████████▌| 5716/5971 [53:25<02:22,  1.78it/s, loss=0.0883, v_num=0, train/loss_simple_step=0.00535, train/loss_vlb_step=2.74e-5, train/loss_step=0.00535, global_step=546.0]
Epoch 0:  96%|█████████▌| 5717/5971 [53:26<02:22,  1.78it/s, loss=0.089, v_num=0, train/loss_simple_step=0.0174, train/loss_vlb_step=7.72e-5, train/loss_step=0.0174, global_step=547.0]   
Epoch 0:  96%|█████████▌| 5718/5971 [53:27<02:21,  1.78it/s, loss=0.089, v_num=0, train/loss_simple_step=0.0174, train/loss_vlb_step=7.72e-5, train/loss_step=0.0174, global_step=547.0]
Epoch 0:  96%|█████████▌| 5718/5971 [53:27<02:21,  1.78it/s, loss=0.0859, v_num=0, train/loss_simple_step=0.0725, train/loss_vlb_step=0.000247, train/loss_step=0.0725, global_step=547.0]
Epoch 0:  96%|█████████▌| 5719/5971 [53:28<02:21,  1.78it/s, loss=0.0874, v_num=0, train/loss_simple_step=0.0473, train/loss_vlb_step=0.000171, train/loss_step=0.0473, global_step=547.0]
Epoch 0:  96%|█████████▌| 5720/5971 [53:30<02:20,  1.78it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0546, train/loss_vlb_step=0.000181, train/loss_step=0.0546, global_step=547.0]
Epoch 0:  96%|█████████▌| 5721/5971 [53:31<02:20,  1.78it/s, loss=0.0801, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=6.1e-5, train/loss_step=0.0144, global_step=548.0]  
Epoch 0:  96%|█████████▌| 5722/5971 [53:32<02:19,  1.78it/s, loss=0.0801, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=6.1e-5, train/loss_step=0.0144, global_step=548.0]
Epoch 0:  96%|█████████▌| 5722/5971 [53:32<02:19,  1.78it/s, loss=0.0943, v_num=0, train/loss_simple_step=0.368, train/loss_vlb_step=0.00178, train/loss_step=0.368, global_step=548.0] 
Epoch 0:  96%|█████████▌| 5723/5971 [53:33<02:19,  1.78it/s, loss=0.11, v_num=0, train/loss_simple_step=0.311, train/loss_vlb_step=0.00135, train/loss_step=0.311, global_step=548.0]  
Epoch 0:  96%|█████████▌| 5724/5971 [53:35<02:18,  1.78it/s, loss=0.126, v_num=0, train/loss_simple_step=0.341, train/loss_vlb_step=0.00147, train/loss_step=0.341, global_step=548.0]
Epoch 0:  96%|█████████▌| 5725/5971 [53:36<02:18,  1.78it/s, loss=0.155, v_num=0, train/loss_simple_step=0.590, train/loss_vlb_step=0.00916, train/loss_step=0.590, global_step=549.0]
Epoch 0:  96%|█████████▌| 5726/5971 [53:37<02:17,  1.78it/s, loss=0.155, v_num=0, train/loss_simple_step=0.590, train/loss_vlb_step=0.00916, train/loss_step=0.590, global_step=549.0]
Epoch 0:  96%|█████████▌| 5726/5971 [53:37<02:17,  1.78it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00771, train/loss_vlb_step=3.57e-5, train/loss_step=0.00771, global_step=549.0]
Epoch 0:  96%|█████████▌| 5727/5971 [53:38<02:17,  1.78it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0133, train/loss_vlb_step=5.59e-5, train/loss_step=0.0133, global_step=549.0]  
Epoch 0:  96%|█████████▌| 5728/5971 [53:40<02:16,  1.78it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:10,  2.37it/s][A
Epoch 0:  96%|█████████▌| 5730/5971 [53:41<02:15,  1.78it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:   1%|          | 2/167 [00:00<01:01,  2.70it/s][A

Validating:   3%|▎         | 5/167 [00:00<00:21,  7.46it/s][A
Epoch 0:  96%|█████████▌| 5734/5971 [53:41<02:13,  1.78it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:   5%|▍         | 8/167 [00:01<00:14, 11.27it/s][A
Epoch 0:  96%|█████████▌| 5738/5971 [53:41<02:10,  1.78it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:   7%|▋         | 11/167 [00:01<00:10, 14.47it/s][A
Epoch 0:  96%|█████████▌| 5742/5971 [53:42<02:08,  1.78it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:   9%|▉         | 15/167 [00:01<00:08, 18.69it/s][A
Epoch 0:  96%|█████████▌| 5746/5971 [53:42<02:06,  1.78it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  11%|█         | 18/167 [00:01<00:07, 21.21it/s][A

Validating:  13%|█▎        | 21/167 [00:01<00:06, 22.06it/s][A
Epoch 0:  96%|█████████▋| 5750/5971 [53:42<02:03,  1.78it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  14%|█▍        | 24/167 [00:01<00:06, 21.95it/s][A
Epoch 0:  96%|█████████▋| 5754/5971 [53:42<02:01,  1.79it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 23.45it/s][A
Epoch 0:  96%|█████████▋| 5758/5971 [53:42<01:59,  1.79it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 25.09it/s][A

Validating:  20%|█▉        | 33/167 [00:01<00:05, 25.91it/s][A
Epoch 0:  96%|█████████▋| 5762/5971 [53:42<01:56,  1.79it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  22%|██▏       | 36/167 [00:02<00:05, 26.03it/s][A
Epoch 0:  97%|█████████▋| 5766/5971 [53:42<01:54,  1.79it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  23%|██▎       | 39/167 [00:02<00:04, 25.77it/s][A
Epoch 0:  97%|█████████▋| 5770/5971 [53:43<01:52,  1.79it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 26.84it/s][A

Validating:  27%|██▋       | 45/167 [00:02<00:04, 27.16it/s][A
Epoch 0:  97%|█████████▋| 5774/5971 [53:43<01:49,  1.79it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  29%|██▉       | 49/167 [00:02<00:04, 28.86it/s][A
Epoch 0:  97%|█████████▋| 5778/5971 [53:43<01:47,  1.79it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  31%|███       | 52/167 [00:02<00:04, 28.37it/s][A
Epoch 0:  97%|█████████▋| 5782/5971 [53:43<01:45,  1.79it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  33%|███▎      | 55/167 [00:02<00:04, 27.54it/s][A
Epoch 0:  97%|█████████▋| 5786/5971 [53:43<01:43,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  35%|███▍      | 58/167 [00:02<00:04, 25.73it/s][A
Epoch 0:  97%|█████████▋| 5790/5971 [53:43<01:40,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  37%|███▋      | 62/167 [00:03<00:03, 26.30it/s][A

Validating:  39%|███▉      | 65/167 [00:03<00:03, 25.81it/s][A
Epoch 0:  97%|█████████▋| 5794/5971 [53:43<01:38,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  41%|████      | 68/167 [00:03<00:04, 24.35it/s][A
Epoch 0:  97%|█████████▋| 5798/5971 [53:44<01:36,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 24.45it/s][A
Epoch 0:  97%|█████████▋| 5802/5971 [53:44<01:33,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 25.50it/s][A

Validating:  46%|████▌     | 77/167 [00:03<00:03, 25.50it/s][A
Epoch 0:  97%|█████████▋| 5806/5971 [53:44<01:31,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 25.69it/s][A
Epoch 0:  97%|█████████▋| 5810/5971 [53:44<01:29,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 24.91it/s][A
Epoch 0:  97%|█████████▋| 5814/5971 [53:44<01:27,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  51%|█████▏    | 86/167 [00:03<00:03, 25.65it/s][A

Validating:  53%|█████▎    | 89/167 [00:04<00:02, 26.09it/s][A
Epoch 0:  97%|█████████▋| 5818/5971 [53:44<01:24,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 26.58it/s][A
Epoch 0:  98%|█████████▊| 5822/5971 [53:45<01:22,  1.81it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 25.40it/s][A
Epoch 0:  98%|█████████▊| 5826/5971 [53:45<01:20,  1.81it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 25.63it/s][A

Validating:  60%|██████    | 101/167 [00:04<00:02, 25.37it/s][A
Epoch 0:  98%|█████████▊| 5830/5971 [53:45<01:17,  1.81it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 25.76it/s][A
Epoch 0:  98%|█████████▊| 5834/5971 [53:45<01:15,  1.81it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 23.87it/s][A
Epoch 0:  98%|█████████▊| 5838/5971 [53:45<01:13,  1.81it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 24.86it/s][A

Validating:  68%|██████▊   | 113/167 [00:05<00:02, 24.89it/s][A
Epoch 0:  98%|█████████▊| 5842/5971 [53:45<01:11,  1.81it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  69%|██████▉   | 116/167 [00:05<00:02, 25.47it/s][A
Epoch 0:  98%|█████████▊| 5846/5971 [53:46<01:08,  1.81it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 25.63it/s][A
Epoch 0:  98%|█████████▊| 5850/5971 [53:46<01:06,  1.81it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 25.37it/s][A

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 25.83it/s][A
Epoch 0:  98%|█████████▊| 5854/5971 [53:46<01:04,  1.81it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 25.15it/s][A
Epoch 0:  98%|█████████▊| 5858/5971 [53:46<01:02,  1.82it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 25.11it/s][A
Epoch 0:  98%|█████████▊| 5862/5971 [53:46<00:59,  1.82it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  80%|████████  | 134/167 [00:05<00:01, 24.77it/s][A

Validating:  82%|████████▏ | 137/167 [00:06<00:01, 25.08it/s][A
Epoch 0:  98%|█████████▊| 5866/5971 [53:46<00:57,  1.82it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  84%|████████▍ | 140/167 [00:06<00:01, 26.21it/s][A
Epoch 0:  98%|█████████▊| 5870/5971 [53:46<00:55,  1.82it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 26.99it/s][A
Epoch 0:  98%|█████████▊| 5874/5971 [53:47<00:53,  1.82it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 28.00it/s][A
Epoch 0:  98%|█████████▊| 5878/5971 [53:47<00:51,  1.82it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 28.10it/s][A
Epoch 0:  99%|█████████▊| 5882/5971 [53:47<00:48,  1.82it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 27.67it/s][A
Epoch 0:  99%|█████████▊| 5886/5971 [53:47<00:46,  1.82it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 27.17it/s][A
Epoch 0:  99%|█████████▊| 5890/5971 [53:47<00:44,  1.83it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 28.82it/s][A

Validating:  99%|█████████▉| 165/167 [00:06<00:00, 28.41it/s][A
Epoch 0:  99%|█████████▊| 5894/5971 [53:47<00:42,  1.83it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]
Epoch 0:  99%|█████████▊| 5896/5971 [53:48<00:41,  1.83it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000165, train/loss_step=0.0453, global_step=549.0]

Epoch 0:  99%|█████████▉| 5897/5971 [53:49<00:40,  1.83it/s, loss=0.144, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000427, train/loss_step=0.130, global_step=550.0]  
Epoch 0:  99%|█████████▉| 5898/5971 [53:50<00:39,  1.83it/s, loss=0.144, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000427, train/loss_step=0.130, global_step=550.0]
Epoch 0:  99%|█████████▉| 5898/5971 [53:50<00:39,  1.83it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0641, train/loss_vlb_step=0.000222, train/loss_step=0.0641, global_step=550.0]
Epoch 0:  99%|█████████▉| 5899/5971 [53:50<00:39,  1.83it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0027, train/loss_vlb_step=1.56e-5, train/loss_step=0.0027, global_step=550.0] 
Epoch 0:  99%|█████████▉| 5900/5971 [53:53<00:38,  1.83it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0032, train/loss_vlb_step=1.76e-5, train/loss_step=0.0032, global_step=550.0]
Epoch 0:  99%|█████████▉| 5901/5971 [53:54<00:38,  1.82it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0155, train/loss_vlb_step=6.4e-5, train/loss_step=0.0155, global_step=551.0]  
Epoch 0:  99%|█████████▉| 5902/5971 [53:54<00:37,  1.82it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0155, train/loss_vlb_step=6.4e-5, train/loss_step=0.0155, global_step=551.0]
Epoch 0:  99%|█████████▉| 5902/5971 [53:54<00:37,  1.82it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00539, train/loss_vlb_step=2.77e-5, train/loss_step=0.00539, global_step=551.0]
Epoch 0:  99%|█████████▉| 5903/5971 [53:55<00:37,  1.82it/s, loss=0.123, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00193, train/loss_step=0.348, global_step=551.0]    
Epoch 0:  99%|█████████▉| 5904/5971 [53:58<00:36,  1.82it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0336, train/loss_vlb_step=0.000117, train/loss_step=0.0336, global_step=551.0]
Epoch 0:  99%|█████████▉| 5905/5971 [53:58<00:36,  1.82it/s, loss=0.137, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.00141, train/loss_step=0.278, global_step=552.0]   
Epoch 0:  99%|█████████▉| 5906/5971 [53:59<00:35,  1.82it/s, loss=0.137, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.00141, train/loss_step=0.278, global_step=552.0]
Epoch 0:  99%|█████████▉| 5906/5971 [53:59<00:35,  1.82it/s, loss=0.16, v_num=0, train/loss_simple_step=0.521, train/loss_vlb_step=0.00753, train/loss_step=0.521, global_step=552.0] 
Epoch 0:  99%|█████████▉| 5907/5971 [54:00<00:35,  1.82it/s, loss=0.169, v_num=0, train/loss_simple_step=0.239, train/loss_vlb_step=0.000895, train/loss_step=0.239, global_step=552.0]
Epoch 0:  99%|█████████▉| 5908/5971 [54:02<00:34,  1.82it/s, loss=0.187, v_num=0, train/loss_simple_step=0.409, train/loss_vlb_step=0.00205, train/loss_step=0.409, global_step=552.0] 
Epoch 0:  99%|█████████▉| 5909/5971 [54:03<00:34,  1.82it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.26e-5, train/loss_step=0.0138, global_step=553.0]
Epoch 0:  99%|█████████▉| 5910/5971 [54:04<00:33,  1.82it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.26e-5, train/loss_step=0.0138, global_step=553.0]
Epoch 0:  99%|█████████▉| 5910/5971 [54:04<00:33,  1.82it/s, loss=0.187, v_num=0, train/loss_simple_step=0.360, train/loss_vlb_step=0.00194, train/loss_step=0.360, global_step=553.0]  
Epoch 0:  99%|█████████▉| 5911/5971 [54:05<00:32,  1.82it/s, loss=0.171, v_num=0, train/loss_simple_step=0.00683, train/loss_vlb_step=3.25e-5, train/loss_step=0.00683, global_step=553.0]
Epoch 0:  99%|█████████▉| 5912/5971 [54:07<00:32,  1.82it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0313, train/loss_vlb_step=0.000118, train/loss_step=0.0313, global_step=553.0] 
Epoch 0:  99%|█████████▉| 5913/5971 [54:08<00:31,  1.82it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0597, train/loss_vlb_step=0.000202, train/loss_step=0.0597, global_step=554.0]
Epoch 0:  99%|█████████▉| 5914/5971 [54:09<00:31,  1.82it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0597, train/loss_vlb_step=0.000202, train/loss_step=0.0597, global_step=554.0]
Epoch 0:  99%|█████████▉| 5914/5971 [54:09<00:31,  1.82it/s, loss=0.14, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.000811, train/loss_step=0.229, global_step=554.0]   
Epoch 0:  99%|█████████▉| 5915/5971 [54:10<00:30,  1.82it/s, loss=0.151, v_num=0, train/loss_simple_step=0.215, train/loss_vlb_step=0.000778, train/loss_step=0.215, global_step=554.0]
Epoch 0:  99%|█████████▉| 5916/5971 [54:12<00:30,  1.82it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00223, train/loss_vlb_step=1.34e-5, train/loss_step=0.00223, global_step=554.0]
Epoch 0:  99%|█████████▉| 5917/5971 [54:13<00:29,  1.82it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0765, train/loss_vlb_step=0.000257, train/loss_step=0.0765, global_step=555.0] 
Epoch 0:  99%|█████████▉| 5918/5971 [54:14<00:29,  1.82it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0765, train/loss_vlb_step=0.000257, train/loss_step=0.0765, global_step=555.0]
Epoch 0:  99%|█████████▉| 5918/5971 [54:14<00:29,  1.82it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0744, train/loss_vlb_step=0.000247, train/loss_step=0.0744, global_step=555.0]
Epoch 0:  99%|█████████▉| 5919/5971 [54:15<00:28,  1.82it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00493, train/loss_vlb_step=2.64e-5, train/loss_step=0.00493, global_step=555.0]
Epoch 0:  99%|█████████▉| 5920/5971 [54:17<00:28,  1.82it/s, loss=0.156, v_num=0, train/loss_simple_step=0.198, train/loss_vlb_step=0.000697, train/loss_step=0.198, global_step=555.0]   
Epoch 0:  99%|█████████▉| 5921/5971 [54:18<00:27,  1.82it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0763, train/loss_vlb_step=0.000258, train/loss_step=0.0763, global_step=556.0]
Epoch 0:  99%|█████████▉| 5922/5971 [54:19<00:26,  1.82it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0763, train/loss_vlb_step=0.000258, train/loss_step=0.0763, global_step=556.0]
Epoch 0:  99%|█████████▉| 5922/5971 [54:19<00:26,  1.82it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00247, train/loss_vlb_step=1.38e-5, train/loss_step=0.00247, global_step=556.0]
Epoch 0:  99%|█████████▉| 5923/5971 [54:20<00:26,  1.82it/s, loss=0.151, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000691, train/loss_step=0.193, global_step=556.0]   
Epoch 0:  99%|█████████▉| 5924/5971 [54:22<00:25,  1.82it/s, loss=0.172, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.00288, train/loss_step=0.454, global_step=556.0] 
Epoch 0:  99%|█████████▉| 5925/5971 [54:23<00:25,  1.82it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00261, train/loss_vlb_step=1.52e-5, train/loss_step=0.00261, global_step=557.0]
Epoch 0:  99%|█████████▉| 5926/5971 [54:23<00:24,  1.82it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00261, train/loss_vlb_step=1.52e-5, train/loss_step=0.00261, global_step=557.0]
Epoch 0:  99%|█████████▉| 5926/5971 [54:23<00:24,  1.82it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00203, train/loss_vlb_step=1.19e-5, train/loss_step=0.00203, global_step=557.0]
Epoch 0:  99%|█████████▉| 5927/5971 [54:24<00:24,  1.82it/s, loss=0.127, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000455, train/loss_step=0.138, global_step=557.0]   
Epoch 0:  99%|█████████▉| 5928/5971 [54:27<00:23,  1.81it/s, loss=0.107, v_num=0, train/loss_simple_step=0.00247, train/loss_vlb_step=1.43e-5, train/loss_step=0.00247, global_step=557.0]
Epoch 0:  99%|█████████▉| 5929/5971 [54:28<00:23,  1.81it/s, loss=0.124, v_num=0, train/loss_simple_step=0.346, train/loss_vlb_step=0.00139, train/loss_step=0.346, global_step=558.0]    
Epoch 0:  99%|█████████▉| 5930/5971 [54:28<00:22,  1.81it/s, loss=0.124, v_num=0, train/loss_simple_step=0.346, train/loss_vlb_step=0.00139, train/loss_step=0.346, global_step=558.0]
Epoch 0:  99%|█████████▉| 5930/5971 [54:28<00:22,  1.81it/s, loss=0.111, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000342, train/loss_step=0.104, global_step=558.0]
Epoch 0:  99%|█████████▉| 5931/5971 [54:29<00:22,  1.81it/s, loss=0.146, v_num=0, train/loss_simple_step=0.714, train/loss_vlb_step=0.0459, train/loss_step=0.714, global_step=558.0]  
Epoch 0:  99%|█████████▉| 5932/5971 [54:32<00:21,  1.81it/s, loss=0.154, v_num=0, train/loss_simple_step=0.178, train/loss_vlb_step=0.000642, train/loss_step=0.178, global_step=558.0]
Epoch 0:  99%|█████████▉| 5933/5971 [54:33<00:20,  1.81it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00286, train/loss_vlb_step=1.6e-5, train/loss_step=0.00286, global_step=559.0]
Epoch 0:  99%|█████████▉| 5934/5971 [54:34<00:20,  1.81it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00286, train/loss_vlb_step=1.6e-5, train/loss_step=0.00286, global_step=559.0]
Epoch 0:  99%|█████████▉| 5934/5971 [54:34<00:20,  1.81it/s, loss=0.149, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.00078, train/loss_step=0.203, global_step=559.0]   
Epoch 0:  99%|█████████▉| 5935/5971 [54:35<00:19,  1.81it/s, loss=0.169, v_num=0, train/loss_simple_step=0.608, train/loss_vlb_step=0.0132, train/loss_step=0.608, global_step=559.0] 
Epoch 0:  99%|█████████▉| 5936/5971 [54:37<00:19,  1.81it/s, loss=0.178, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.000587, train/loss_step=0.176, global_step=559.0]
Epoch 0:  99%|█████████▉| 5937/5971 [54:38<00:18,  1.81it/s, loss=0.188, v_num=0, train/loss_simple_step=0.290, train/loss_vlb_step=0.00143, train/loss_step=0.290, global_step=560.0] 
Epoch 0:  99%|█████████▉| 5938/5971 [54:39<00:18,  1.81it/s, loss=0.188, v_num=0, train/loss_simple_step=0.290, train/loss_vlb_step=0.00143, train/loss_step=0.290, global_step=560.0]
Epoch 0:  99%|█████████▉| 5938/5971 [54:39<00:18,  1.81it/s, loss=0.185, v_num=0, train/loss_simple_step=0.00265, train/loss_vlb_step=1.52e-5, train/loss_step=0.00265, global_step=560.0]
Epoch 0:  99%|█████████▉| 5939/5971 [54:40<00:17,  1.81it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0351, train/loss_vlb_step=0.000138, train/loss_step=0.0351, global_step=560.0] 
Epoch 0:  99%|█████████▉| 5940/5971 [54:42<00:17,  1.81it/s, loss=0.198, v_num=0, train/loss_simple_step=0.440, train/loss_vlb_step=0.00346, train/loss_step=0.440, global_step=560.0]   
Epoch 0:  99%|█████████▉| 5941/5971 [54:43<00:16,  1.81it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0146, train/loss_vlb_step=6.51e-5, train/loss_step=0.0146, global_step=561.0]
Epoch 0: 100%|█████████▉| 5942/5971 [54:44<00:16,  1.81it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0146, train/loss_vlb_step=6.51e-5, train/loss_step=0.0146, global_step=561.0]
Epoch 0: 100%|█████████▉| 5942/5971 [54:44<00:16,  1.81it/s, loss=0.199, v_num=0, train/loss_simple_step=0.0662, train/loss_vlb_step=0.000234, train/loss_step=0.0662, global_step=561.0]
Epoch 0: 100%|█████████▉| 5943/5971 [54:45<00:15,  1.81it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0973, train/loss_vlb_step=0.000321, train/loss_step=0.0973, global_step=561.0]
Epoch 0: 100%|█████████▉| 5944/5971 [54:47<00:14,  1.81it/s, loss=0.177, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000381, train/loss_step=0.116, global_step=561.0]  
Epoch 0: 100%|█████████▉| 5945/5971 [54:48<00:14,  1.81it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00463, train/loss_vlb_step=2.48e-5, train/loss_step=0.00463, global_step=562.0]
Epoch 0: 100%|█████████▉| 5946/5971 [54:49<00:13,  1.81it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00463, train/loss_vlb_step=2.48e-5, train/loss_step=0.00463, global_step=562.0]
Epoch 0: 100%|█████████▉| 5946/5971 [54:49<00:13,  1.81it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00518, train/loss_vlb_step=2.83e-5, train/loss_step=0.00518, global_step=562.0]
Epoch 0: 100%|█████████▉| 5947/5971 [54:50<00:13,  1.81it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00188, train/loss_vlb_step=1.13e-5, train/loss_step=0.00188, global_step=562.0] 
Epoch 0: 100%|█████████▉| 5948/5971 [54:52<00:12,  1.81it/s, loss=0.171, v_num=0, train/loss_simple_step=0.00992, train/loss_vlb_step=4.64e-5, train/loss_step=0.00992, global_step=562.0]
Epoch 0: 100%|█████████▉| 5949/5971 [54:53<00:12,  1.81it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0955, train/loss_vlb_step=0.000316, train/loss_step=0.0955, global_step=563.0] 
Epoch 0: 100%|█████████▉| 5950/5971 [54:54<00:11,  1.81it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0955, train/loss_vlb_step=0.000316, train/loss_step=0.0955, global_step=563.0]
Epoch 0: 100%|█████████▉| 5950/5971 [54:54<00:11,  1.81it/s, loss=0.178, v_num=0, train/loss_simple_step=0.492, train/loss_vlb_step=0.00334, train/loss_step=0.492, global_step=563.0]   
Epoch 0: 100%|█████████▉| 5951/5971 [54:54<00:11,  1.81it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0631, train/loss_vlb_step=0.000223, train/loss_step=0.0631, global_step=563.0]
Epoch 0: 100%|█████████▉| 5952/5971 [54:57<00:10,  1.81it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=563.0] 
Epoch 0: 100%|█████████▉| 5953/5971 [54:57<00:09,  1.81it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00212, train/loss_vlb_step=1.27e-5, train/loss_step=0.00212, global_step=564.0]
Epoch 0: 100%|█████████▉| 5954/5971 [54:58<00:09,  1.81it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00212, train/loss_vlb_step=1.27e-5, train/loss_step=0.00212, global_step=564.0]
Epoch 0: 100%|█████████▉| 5954/5971 [54:58<00:09,  1.81it/s, loss=0.14, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000649, train/loss_step=0.184, global_step=564.0]   
Epoch 0: 100%|█████████▉| 5955/5971 [54:59<00:08,  1.80it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000115, train/loss_step=0.0318, global_step=564.0]
Epoch 0: 100%|█████████▉| 5956/5971 [55:01<00:08,  1.80it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=7.1e-5, train/loss_step=0.0166, global_step=564.0]  
Epoch 0: 100%|█████████▉| 5957/5971 [55:02<00:07,  1.80it/s, loss=0.0898, v_num=0, train/loss_simple_step=0.0315, train/loss_vlb_step=0.000118, train/loss_step=0.0315, global_step=565.0]
Epoch 0: 100%|█████████▉| 5958/5971 [55:03<00:07,  1.80it/s, loss=0.0898, v_num=0, train/loss_simple_step=0.0315, train/loss_vlb_step=0.000118, train/loss_step=0.0315, global_step=565.0]
Epoch 0: 100%|█████████▉| 5958/5971 [55:03<00:07,  1.80it/s, loss=0.0903, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.67e-5, train/loss_step=0.0127, global_step=565.0] 
Epoch 0: 100%|█████████▉| 5959/5971 [55:04<00:06,  1.80it/s, loss=0.107, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00204, train/loss_step=0.364, global_step=565.0]   
Epoch 0: 100%|█████████▉| 5960/5971 [55:07<00:06,  1.80it/s, loss=0.0885, v_num=0, train/loss_simple_step=0.0734, train/loss_vlb_step=0.000245, train/loss_step=0.0734, global_step=565.0]
Epoch 0: 100%|█████████▉| 5961/5971 [55:08<00:05,  1.80it/s, loss=0.0879, v_num=0, train/loss_simple_step=0.00226, train/loss_vlb_step=1.25e-5, train/loss_step=0.00226, global_step=566.0]
Epoch 0: 100%|█████████▉| 5962/5971 [55:09<00:04,  1.80it/s, loss=0.0879, v_num=0, train/loss_simple_step=0.00226, train/loss_vlb_step=1.25e-5, train/loss_step=0.00226, global_step=566.0]
Epoch 0: 100%|█████████▉| 5962/5971 [55:09<00:04,  1.80it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.0443, train/loss_vlb_step=0.000163, train/loss_step=0.0443, global_step=566.0] 
Epoch 0: 100%|█████████▉| 5963/5971 [55:09<00:04,  1.80it/s, loss=0.0829, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.35e-5, train/loss_step=0.0198, global_step=566.0] 
Epoch 0: 100%|█████████▉| 5964/5971 [55:11<00:03,  1.80it/s, loss=0.111, v_num=0, train/loss_simple_step=0.680, train/loss_vlb_step=0.0211, train/loss_step=0.680, global_step=566.0]    
Epoch 0: 100%|█████████▉| 5965/5971 [55:13<00:03,  1.80it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0148, train/loss_vlb_step=6.06e-5, train/loss_step=0.0148, global_step=567.0]
Epoch 0: 100%|█████████▉| 5966/5971 [55:14<00:02,  1.80it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0148, train/loss_vlb_step=6.06e-5, train/loss_step=0.0148, global_step=567.0]
Epoch 0: 100%|█████████▉| 5966/5971 [55:14<00:02,  1.80it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.16e-5, train/loss_step=0.0139, global_step=567.0]
Epoch 0: 100%|█████████▉| 5967/5971 [55:14<00:02,  1.80it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0133, train/loss_vlb_step=5.71e-5, train/loss_step=0.0133, global_step=567.0]
Epoch 0: 100%|█████████▉| 5968/5971 [55:17<00:01,  1.80it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0249, train/loss_vlb_step=0.000102, train/loss_step=0.0249, global_step=567.0]
Epoch 0: 100%|█████████▉| 5969/5971 [55:17<00:01,  1.80it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00618, train/loss_vlb_step=3.11e-5, train/loss_step=0.00618, global_step=568.0]
Epoch 0: 100%|█████████▉| 5970/5971 [55:18<00:00,  1.80it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00618, train/loss_vlb_step=3.11e-5, train/loss_step=0.00618, global_step=568.0]
Epoch 0: 100%|█████████▉| 5970/5971 [55:18<00:00,  1.80it/s, loss=0.085, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.36e-5, train/loss_step=0.0142, global_step=568.0]  
Epoch 0: 100%|██████████| 5971/5971 [55:19<00:00,  1.80it/s, loss=0.101, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00206, train/loss_step=0.382, global_step=568.0]  
Epoch 0: 100%|██████████| 5971/5971 [55:21<00:00,  1.80it/s, loss=0.0971, v_num=0, train/loss_simple_step=0.00979, train/loss_vlb_step=4.71e-5, train/loss_step=0.00979, global_step=568.0]
Epoch 0: 100%|██████████| 5971/5971 [55:22<00:00,  1.80it/s, loss=0.0983, v_num=0, train/loss_simple_step=0.0261, train/loss_vlb_step=9.43e-5, train/loss_step=0.0261, global_step=569.0]  
Epoch 0: 100%|██████████| 5971/5971 [55:23<00:00,  1.80it/s, loss=0.111, v_num=0, train/loss_simple_step=0.429, train/loss_vlb_step=0.00348, train/loss_step=0.429, global_step=569.0]   
Epoch 0: 100%|██████████| 5971/5971 [55:24<00:00,  1.80it/s, loss=0.123, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00159, train/loss_step=0.287, global_step=569.0]
Epoch 0: 100%|██████████| 5971/5971 [55:26<00:00,  1.80it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00516, train/loss_vlb_step=2.62e-5, train/loss_step=0.00516, global_step=569.0]
Epoch 0: 100%|██████████| 5971/5971 [55:27<00:00,  1.79it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0082, train/loss_vlb_step=3.89e-5, train/loss_step=0.0082, global_step=570.0]  
Epoch 0: 100%|██████████| 5971/5971 [55:28<00:00,  1.79it/s, loss=0.133, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000881, train/loss_step=0.242, global_step=570.0] 
Epoch 0: 100%|██████████| 5971/5971 [55:29<00:00,  1.79it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.17e-5, train/loss_step=0.0157, global_step=570.0]
Epoch 0: 100%|██████████| 5971/5971 [55:31<00:00,  1.79it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0811, train/loss_vlb_step=0.000268, train/loss_step=0.0811, global_step=570.0]
Epoch 0: 100%|██████████| 5971/5971 [55:32<00:00,  1.79it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00917, train/loss_vlb_step=4.34e-5, train/loss_step=0.00917, global_step=571.0]
Epoch 0: 100%|██████████| 5971/5971 [55:33<00:00,  1.79it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00917, train/loss_vlb_step=4.34e-5, train/loss_step=0.00917, global_step=571.0]
Epoch 0: 100%|██████████| 5971/5971 [55:33<00:00,  1.79it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0029, train/loss_vlb_step=1.59e-5, train/loss_step=0.0029, global_step=571.0]  
Epoch 0: 100%|██████████| 5971/5971 [55:34<00:00,  1.79it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0841, train/loss_vlb_step=0.000277, train/loss_step=0.0841, global_step=571.0]
Epoch 0: 100%|██████████| 5971/5971 [55:37<00:00,  1.79it/s, loss=0.0853, v_num=0, train/loss_simple_step=0.0366, train/loss_vlb_step=0.000135, train/loss_step=0.0366, global_step=571.0]
Epoch 0: 100%|██████████| 5971/5971 [55:38<00:00,  1.79it/s, loss=0.0847, v_num=0, train/loss_simple_step=0.00331, train/loss_vlb_step=1.84e-5, train/loss_step=0.00331, global_step=572.0]
Epoch 0: 100%|██████████| 5971/5971 [55:38<00:00,  1.79it/s, loss=0.0872, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000228, train/loss_step=0.0642, global_step=572.0] 
Epoch 0: 100%|██████████| 5971/5971 [55:39<00:00,  1.79it/s, loss=0.087, v_num=0, train/loss_simple_step=0.0093, train/loss_vlb_step=4.5e-5, train/loss_step=0.0093, global_step=572.0]   
Epoch 0: 100%|██████████| 5971/5971 [55:42<00:00,  1.79it/s, loss=0.102, v_num=0, train/loss_simple_step=0.334, train/loss_vlb_step=0.00223, train/loss_step=0.334, global_step=572.0] 
Epoch 0: 100%|██████████| 5971/5971 [55:42<00:00,  1.79it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00758, train/loss_vlb_step=3.44e-5, train/loss_step=0.00758, global_step=573.0]
Epoch 0: 100%|██████████| 5971/5971 [55:43<00:00,  1.79it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0334, train/loss_vlb_step=0.000125, train/loss_step=0.0334, global_step=573.0] 
Epoch 0: 100%|██████████| 5971/5971 [55:44<00:00,  1.79it/s, loss=0.112, v_num=0, train/loss_simple_step=0.557, train/loss_vlb_step=0.0077, train/loss_step=0.557, global_step=573.0]    
Epoch 0: 100%|██████████| 5971/5971 [55:47<00:00,  1.78it/s, loss=0.124, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.000982, train/loss_step=0.249, global_step=573.0]
Epoch 0: 100%|██████████| 5971/5971 [55:50<00:00,  1.78it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.73e-5, train/loss_step=0.0157, global_step=574.0]
Epoch 0:   0%|          | 0/5971 [00:00<00:00, 11037.64it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.73e-5, train/loss_step=0.0157, global_step=574.0]
Epoch 1:   0%|          | 0/5971 [00:00<00:02, 2250.16it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.73e-5, train/loss_step=0.0157, global_step=574.0] 
Epoch 1:   0%|          | 1/5971 [00:02<2:03:12,  1.24s/it, loss=0.124, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.73e-5, train/loss_step=0.0157, global_step=574.0]
Epoch 1:   0%|          | 1/5971 [00:02<2:03:17,  1.24s/it, loss=0.103, v_num=0, train/loss_simple_step=0.025, train/loss_vlb_step=0.000101, train/loss_step=0.025, global_step=575.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 2/5971 [00:03<1:51:44,  1.12s/it, loss=0.103, v_num=0, train/loss_simple_step=0.025, train/loss_vlb_step=0.000101, train/loss_step=0.025, global_step=575.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 2/5971 [00:03<1:51:46,  1.12s/it, loss=0.0961, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000457, train/loss_step=0.138, global_step=575.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 3/5971 [00:04<1:46:01,  1.07s/it, loss=0.0961, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000457, train/loss_step=0.138, global_step=575.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 3/5971 [00:04<1:46:03,  1.07s/it, loss=0.102, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000381, train/loss_step=0.115, global_step=575.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   0%|          | 4/5971 [00:06<2:14:26,  1.35s/it, loss=0.102, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000381, train/loss_step=0.115, global_step=575.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 4/5971 [00:06<2:14:28,  1.35s/it, loss=0.124, v_num=0, train/loss_simple_step=0.455, train/loss_vlb_step=0.00441, train/loss_step=0.455, global_step=575.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   0%|          | 5/5971 [00:07<2:07:06,  1.28s/it, loss=0.124, v_num=0, train/loss_simple_step=0.455, train/loss_vlb_step=0.00441, train/loss_step=0.455, global_step=575.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 5/5971 [00:07<2:07:07,  1.28s/it, loss=0.12, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000541, train/loss_step=0.165, global_step=576.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 6/5971 [00:08<2:01:21,  1.22s/it, loss=0.12, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000541, train/loss_step=0.165, global_step=576.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 6/5971 [00:08<2:01:22,  1.22s/it, loss=0.124, v_num=0, train/loss_simple_step=0.0865, train/loss_vlb_step=0.000287, train/loss_step=0.0865, global_step=576.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 7/5971 [00:09<1:57:18,  1.18s/it, loss=0.124, v_num=0, train/loss_simple_step=0.0865, train/loss_vlb_step=0.000287, train/loss_step=0.0865, global_step=576.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 7/5971 [00:09<1:57:19,  1.18s/it, loss=0.125, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000348, train/loss_step=0.105, global_step=576.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:   0%|          | 8/5971 [00:12<2:17:38,  1.39s/it, loss=0.125, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000348, train/loss_step=0.105, global_step=576.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 8/5971 [00:12<2:17:39,  1.39s/it, loss=0.127, v_num=0, train/loss_simple_step=0.0479, train/loss_vlb_step=0.000173, train/loss_step=0.0479, global_step=576.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 9/5971 [00:13<2:12:59,  1.34s/it, loss=0.127, v_num=0, train/loss_simple_step=0.0479, train/loss_vlb_step=0.000173, train/loss_step=0.0479, global_step=576.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 9/5971 [00:13<2:12:59,  1.34s/it, loss=0.172, v_num=0, train/loss_simple_step=0.911, train/loss_vlb_step=0.458, train/loss_step=0.911, global_step=577.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:   0%|          | 10/5971 [00:14<2:08:51,  1.30s/it, loss=0.172, v_num=0, train/loss_simple_step=0.911, train/loss_vlb_step=0.458, train/loss_step=0.911, global_step=577.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 10/5971 [00:14<2:08:51,  1.30s/it, loss=0.203, v_num=0, train/loss_simple_step=0.710, train/loss_vlb_step=0.0181, train/loss_step=0.710, global_step=577.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 11/5971 [00:15<2:05:25,  1.26s/it, loss=0.203, v_num=0, train/loss_simple_step=0.710, train/loss_vlb_step=0.0181, train/loss_step=0.710, global_step=577.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 11/5971 [00:15<2:05:26,  1.26s/it, loss=0.202, v_num=0, train/loss_simple_step=0.00808, train/loss_vlb_step=3.73e-5, train/loss_step=0.00808, global_step=577.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 12/5971 [00:17<2:15:32,  1.36s/it, loss=0.202, v_num=0, train/loss_simple_step=0.00808, train/loss_vlb_step=3.73e-5, train/loss_step=0.00808, global_step=577.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 12/5971 [00:17<2:15:33,  1.36s/it, loss=0.204, v_num=0, train/loss_simple_step=0.0381, train/loss_vlb_step=0.00014, train/loss_step=0.0381, global_step=577.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:   0%|          | 13/5971 [00:18<2:12:25,  1.33s/it, loss=0.204, v_num=0, train/loss_simple_step=0.0381, train/loss_vlb_step=0.00014, train/loss_step=0.0381, global_step=577.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 13/5971 [00:18<2:12:25,  1.33s/it, loss=0.201, v_num=0, train/loss_simple_step=0.0194, train/loss_vlb_step=7.82e-5, train/loss_step=0.0194, global_step=578.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 14/5971 [00:19<2:09:35,  1.31s/it, loss=0.201, v_num=0, train/loss_simple_step=0.0194, train/loss_vlb_step=7.82e-5, train/loss_step=0.0194, global_step=578.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 14/5971 [00:19<2:09:35,  1.31s/it, loss=0.201, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.7e-5, train/loss_step=0.0101, global_step=578.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   0%|          | 15/5971 [00:20<2:06:55,  1.28s/it, loss=0.201, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.7e-5, train/loss_step=0.0101, global_step=578.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 15/5971 [00:20<2:06:55,  1.28s/it, loss=0.228, v_num=0, train/loss_simple_step=0.868, train/loss_vlb_step=0.0884, train/loss_step=0.868, global_step=578.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:   0%|          | 16/5971 [00:23<2:16:59,  1.38s/it, loss=0.228, v_num=0, train/loss_simple_step=0.868, train/loss_vlb_step=0.0884, train/loss_step=0.868, global_step=578.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 16/5971 [00:23<2:17:00,  1.38s/it, loss=0.232, v_num=0, train/loss_simple_step=0.0751, train/loss_vlb_step=0.000248, train/loss_step=0.0751, global_step=578.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 17/5971 [00:24<2:14:18,  1.35s/it, loss=0.232, v_num=0, train/loss_simple_step=0.0751, train/loss_vlb_step=0.000248, train/loss_step=0.0751, global_step=578.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 17/5971 [00:24<2:14:18,  1.35s/it, loss=0.236, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000382, train/loss_step=0.116, global_step=579.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:   0%|          | 18/5971 [00:25<2:11:52,  1.33s/it, loss=0.236, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000382, train/loss_step=0.116, global_step=579.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 18/5971 [00:25<2:11:52,  1.33s/it, loss=0.208, v_num=0, train/loss_simple_step=0.0015, train/loss_vlb_step=9.02e-6, train/loss_step=0.0015, global_step=579.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 19/5971 [00:26<2:09:40,  1.31s/it, loss=0.208, v_num=0, train/loss_simple_step=0.0015, train/loss_vlb_step=9.02e-6, train/loss_step=0.0015, global_step=579.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 19/5971 [00:26<2:09:40,  1.31s/it, loss=0.196, v_num=0, train/loss_simple_step=0.0163, train/loss_vlb_step=6.87e-5, train/loss_step=0.0163, global_step=579.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 20/5971 [00:28<2:15:32,  1.37s/it, loss=0.196, v_num=0, train/loss_simple_step=0.0163, train/loss_vlb_step=6.87e-5, train/loss_step=0.0163, global_step=579.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 20/5971 [00:28<2:15:32,  1.37s/it, loss=0.196, v_num=0, train/loss_simple_step=0.00612, train/loss_vlb_step=3.07e-5, train/loss_step=0.00612, global_step=579.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 21/5971 [00:29<2:13:24,  1.35s/it, loss=0.196, v_num=0, train/loss_simple_step=0.00612, train/loss_vlb_step=3.07e-5, train/loss_step=0.00612, global_step=579.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 21/5971 [00:29<2:13:25,  1.35s/it, loss=0.196, v_num=0, train/loss_simple_step=0.0355, train/loss_vlb_step=0.000128, train/loss_step=0.0355, global_step=580.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   0%|          | 22/5971 [00:30<2:11:24,  1.33s/it, loss=0.196, v_num=0, train/loss_simple_step=0.0355, train/loss_vlb_step=0.000128, train/loss_step=0.0355, global_step=580.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 22/5971 [00:30<2:11:24,  1.33s/it, loss=0.196, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000447, train/loss_step=0.135, global_step=580.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:   0%|          | 23/5971 [00:31<2:10:04,  1.31s/it, loss=0.196, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000447, train/loss_step=0.135, global_step=580.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 23/5971 [00:31<2:10:04,  1.31s/it, loss=0.193, v_num=0, train/loss_simple_step=0.0471, train/loss_vlb_step=0.00017, train/loss_step=0.0471, global_step=580.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 24/5971 [00:33<2:13:48,  1.35s/it, loss=0.193, v_num=0, train/loss_simple_step=0.0471, train/loss_vlb_step=0.00017, train/loss_step=0.0471, global_step=580.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 24/5971 [00:33<2:13:48,  1.35s/it, loss=0.187, v_num=0, train/loss_simple_step=0.332, train/loss_vlb_step=0.00203, train/loss_step=0.332, global_step=580.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:   0%|          | 25/5971 [00:34<2:12:03,  1.33s/it, loss=0.187, v_num=0, train/loss_simple_step=0.332, train/loss_vlb_step=0.00203, train/loss_step=0.332, global_step=580.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 25/5971 [00:34<2:12:03,  1.33s/it, loss=0.186, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000537, train/loss_step=0.162, global_step=581.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 26/5971 [00:35<2:10:26,  1.32s/it, loss=0.186, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000537, train/loss_step=0.162, global_step=581.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 26/5971 [00:35<2:10:26,  1.32s/it, loss=0.19, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.0005, train/loss_step=0.150, global_step=581.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:   0%|          | 27/5971 [00:36<2:08:53,  1.30s/it, loss=0.19, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.0005, train/loss_step=0.150, global_step=581.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 27/5971 [00:36<2:08:53,  1.30s/it, loss=0.188, v_num=0, train/loss_simple_step=0.0746, train/loss_vlb_step=0.000246, train/loss_step=0.0746, global_step=581.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 28/5971 [00:38<2:12:16,  1.34s/it, loss=0.188, v_num=0, train/loss_simple_step=0.0746, train/loss_vlb_step=0.000246, train/loss_step=0.0746, global_step=581.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 28/5971 [00:38<2:12:17,  1.34s/it, loss=0.187, v_num=0, train/loss_simple_step=0.0167, train/loss_vlb_step=7.43e-5, train/loss_step=0.0167, global_step=581.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   0%|          | 29/5971 [00:39<2:10:51,  1.32s/it, loss=0.187, v_num=0, train/loss_simple_step=0.0167, train/loss_vlb_step=7.43e-5, train/loss_step=0.0167, global_step=581.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   0%|          | 29/5971 [00:39<2:10:52,  1.32s/it, loss=0.149, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000598, train/loss_step=0.158, global_step=582.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   1%|          | 30/5971 [00:40<2:09:24,  1.31s/it, loss=0.149, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000598, train/loss_step=0.158, global_step=582.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 30/5971 [00:40<2:09:24,  1.31s/it, loss=0.114, v_num=0, train/loss_simple_step=0.012, train/loss_vlb_step=5.05e-5, train/loss_step=0.012, global_step=582.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   1%|          | 31/5971 [00:41<2:08:08,  1.29s/it, loss=0.114, v_num=0, train/loss_simple_step=0.012, train/loss_vlb_step=5.05e-5, train/loss_step=0.012, global_step=582.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 31/5971 [00:41<2:08:08,  1.29s/it, loss=0.122, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000605, train/loss_step=0.174, global_step=582.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 32/5971 [00:43<2:10:29,  1.32s/it, loss=0.122, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000605, train/loss_step=0.174, global_step=582.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 32/5971 [00:43<2:10:29,  1.32s/it, loss=0.123, v_num=0, train/loss_simple_step=0.0593, train/loss_vlb_step=0.000202, train/loss_step=0.0593, global_step=582.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 33/5971 [00:44<2:09:14,  1.31s/it, loss=0.123, v_num=0, train/loss_simple_step=0.0593, train/loss_vlb_step=0.000202, train/loss_step=0.0593, global_step=582.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 33/5971 [00:44<2:09:14,  1.31s/it, loss=0.169, v_num=0, train/loss_simple_step=0.934, train/loss_vlb_step=0.470, train/loss_step=0.934, global_step=583.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:   1%|          | 34/5971 [00:45<2:08:02,  1.29s/it, loss=0.169, v_num=0, train/loss_simple_step=0.934, train/loss_vlb_step=0.470, train/loss_step=0.934, global_step=583.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 34/5971 [00:45<2:08:02,  1.29s/it, loss=0.175, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000403, train/loss_step=0.119, global_step=583.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 35/5971 [00:46<2:06:55,  1.28s/it, loss=0.175, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000403, train/loss_step=0.119, global_step=583.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 35/5971 [00:46<2:06:55,  1.28s/it, loss=0.133, v_num=0, train/loss_simple_step=0.0339, train/loss_vlb_step=0.000133, train/loss_step=0.0339, global_step=583.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 36/5971 [00:48<2:09:53,  1.31s/it, loss=0.133, v_num=0, train/loss_simple_step=0.0339, train/loss_vlb_step=0.000133, train/loss_step=0.0339, global_step=583.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 36/5971 [00:48<2:09:53,  1.31s/it, loss=0.165, v_num=0, train/loss_simple_step=0.717, train/loss_vlb_step=0.0145, train/loss_step=0.717, global_step=583.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:   1%|          | 37/5971 [00:49<2:08:48,  1.30s/it, loss=0.165, v_num=0, train/loss_simple_step=0.717, train/loss_vlb_step=0.0145, train/loss_step=0.717, global_step=583.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 37/5971 [00:49<2:08:48,  1.30s/it, loss=0.159, v_num=0, train/loss_simple_step=0.00397, train/loss_vlb_step=2.09e-5, train/loss_step=0.00397, global_step=584.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 38/5971 [00:50<2:07:43,  1.29s/it, loss=0.159, v_num=0, train/loss_simple_step=0.00397, train/loss_vlb_step=2.09e-5, train/loss_step=0.00397, global_step=584.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 38/5971 [00:50<2:07:43,  1.29s/it, loss=0.163, v_num=0, train/loss_simple_step=0.081, train/loss_vlb_step=0.000273, train/loss_step=0.081, global_step=584.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:   1%|          | 39/5971 [00:51<2:06:40,  1.28s/it, loss=0.163, v_num=0, train/loss_simple_step=0.081, train/loss_vlb_step=0.000273, train/loss_step=0.081, global_step=584.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 39/5971 [00:51<2:06:40,  1.28s/it, loss=0.164, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.18e-5, train/loss_step=0.0202, global_step=584.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 40/5971 [00:53<2:08:36,  1.30s/it, loss=0.164, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.18e-5, train/loss_step=0.0202, global_step=584.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 40/5971 [00:53<2:08:36,  1.30s/it, loss=0.187, v_num=0, train/loss_simple_step=0.478, train/loss_vlb_step=0.00451, train/loss_step=0.478, global_step=584.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:   1%|          | 41/5971 [00:54<2:07:38,  1.29s/it, loss=0.187, v_num=0, train/loss_simple_step=0.478, train/loss_vlb_step=0.00451, train/loss_step=0.478, global_step=584.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 41/5971 [00:54<2:07:38,  1.29s/it, loss=0.203, v_num=0, train/loss_simple_step=0.351, train/loss_vlb_step=0.00174, train/loss_step=0.351, global_step=585.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 42/5971 [00:55<2:06:40,  1.28s/it, loss=0.203, v_num=0, train/loss_simple_step=0.351, train/loss_vlb_step=0.00174, train/loss_step=0.351, global_step=585.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 42/5971 [00:55<2:06:40,  1.28s/it, loss=0.203, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000466, train/loss_step=0.142, global_step=585.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 43/5971 [00:56<2:05:45,  1.27s/it, loss=0.203, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000466, train/loss_step=0.142, global_step=585.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 43/5971 [00:56<2:05:45,  1.27s/it, loss=0.209, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000568, train/loss_step=0.165, global_step=585.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 44/5971 [00:58<2:07:34,  1.29s/it, loss=0.209, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000568, train/loss_step=0.165, global_step=585.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 44/5971 [00:58<2:07:34,  1.29s/it, loss=0.217, v_num=0, train/loss_simple_step=0.490, train/loss_vlb_step=0.0051, train/loss_step=0.490, global_step=585.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:   1%|          | 45/5971 [00:59<2:06:43,  1.28s/it, loss=0.217, v_num=0, train/loss_simple_step=0.490, train/loss_vlb_step=0.0051, train/loss_step=0.490, global_step=585.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 45/5971 [00:59<2:06:43,  1.28s/it, loss=0.209, v_num=0, train/loss_simple_step=0.00961, train/loss_vlb_step=4.47e-5, train/loss_step=0.00961, global_step=586.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 46/5971 [00:59<2:05:50,  1.27s/it, loss=0.209, v_num=0, train/loss_simple_step=0.00961, train/loss_vlb_step=4.47e-5, train/loss_step=0.00961, global_step=586.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 46/5971 [00:59<2:05:50,  1.27s/it, loss=0.204, v_num=0, train/loss_simple_step=0.0411, train/loss_vlb_step=0.000147, train/loss_step=0.0411, global_step=586.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   1%|          | 47/5971 [01:00<2:04:59,  1.27s/it, loss=0.204, v_num=0, train/loss_simple_step=0.0411, train/loss_vlb_step=0.000147, train/loss_step=0.0411, global_step=586.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 47/5971 [01:00<2:04:59,  1.27s/it, loss=0.206, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000387, train/loss_step=0.118, global_step=586.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:   1%|          | 48/5971 [01:02<2:06:40,  1.28s/it, loss=0.206, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000387, train/loss_step=0.118, global_step=586.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 48/5971 [01:02<2:06:40,  1.28s/it, loss=0.223, v_num=0, train/loss_simple_step=0.354, train/loss_vlb_step=0.00171, train/loss_step=0.354, global_step=586.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   1%|          | 49/5971 [01:03<2:05:56,  1.28s/it, loss=0.223, v_num=0, train/loss_simple_step=0.354, train/loss_vlb_step=0.00171, train/loss_step=0.354, global_step=586.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 49/5971 [01:03<2:05:56,  1.28s/it, loss=0.215, v_num=0, train/loss_simple_step=0.00204, train/loss_vlb_step=1.23e-5, train/loss_step=0.00204, global_step=587.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 50/5971 [01:04<2:05:08,  1.27s/it, loss=0.215, v_num=0, train/loss_simple_step=0.00204, train/loss_vlb_step=1.23e-5, train/loss_step=0.00204, global_step=587.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 50/5971 [01:04<2:05:08,  1.27s/it, loss=0.225, v_num=0, train/loss_simple_step=0.215, train/loss_vlb_step=0.000742, train/loss_step=0.215, global_step=587.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:   1%|          | 51/5971 [01:05<2:04:22,  1.26s/it, loss=0.225, v_num=0, train/loss_simple_step=0.215, train/loss_vlb_step=0.000742, train/loss_step=0.215, global_step=587.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 51/5971 [01:05<2:04:22,  1.26s/it, loss=0.224, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000509, train/loss_step=0.154, global_step=587.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 52/5971 [01:07<2:05:56,  1.28s/it, loss=0.224, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000509, train/loss_step=0.154, global_step=587.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 52/5971 [01:07<2:05:56,  1.28s/it, loss=0.222, v_num=0, train/loss_simple_step=0.00927, train/loss_vlb_step=4.26e-5, train/loss_step=0.00927, global_step=587.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 53/5971 [01:08<2:05:12,  1.27s/it, loss=0.222, v_num=0, train/loss_simple_step=0.00927, train/loss_vlb_step=4.26e-5, train/loss_step=0.00927, global_step=587.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 53/5971 [01:08<2:05:12,  1.27s/it, loss=0.175, v_num=0, train/loss_simple_step=0.0023, train/loss_vlb_step=1.37e-5, train/loss_step=0.0023, global_step=588.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:   1%|          | 54/5971 [01:09<2:04:30,  1.26s/it, loss=0.175, v_num=0, train/loss_simple_step=0.0023, train/loss_vlb_step=1.37e-5, train/loss_step=0.0023, global_step=588.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 54/5971 [01:09<2:04:31,  1.26s/it, loss=0.173, v_num=0, train/loss_simple_step=0.0647, train/loss_vlb_step=0.000227, train/loss_step=0.0647, global_step=588.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 55/5971 [01:10<2:03:49,  1.26s/it, loss=0.173, v_num=0, train/loss_simple_step=0.0647, train/loss_vlb_step=0.000227, train/loss_step=0.0647, global_step=588.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 55/5971 [01:10<2:03:49,  1.26s/it, loss=0.184, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00136, train/loss_step=0.267, global_step=588.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:   1%|          | 56/5971 [01:12<2:05:41,  1.27s/it, loss=0.184, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00136, train/loss_step=0.267, global_step=588.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 56/5971 [01:12<2:05:41,  1.27s/it, loss=0.172, v_num=0, train/loss_simple_step=0.469, train/loss_vlb_step=0.00397, train/loss_step=0.469, global_step=588.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 57/5971 [01:13<2:05:02,  1.27s/it, loss=0.172, v_num=0, train/loss_simple_step=0.469, train/loss_vlb_step=0.00397, train/loss_step=0.469, global_step=588.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 57/5971 [01:13<2:05:02,  1.27s/it, loss=0.186, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.00119, train/loss_step=0.278, global_step=589.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 58/5971 [01:14<2:04:22,  1.26s/it, loss=0.186, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.00119, train/loss_step=0.278, global_step=589.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 58/5971 [01:14<2:04:22,  1.26s/it, loss=0.187, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000365, train/loss_step=0.110, global_step=589.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 59/5971 [01:15<2:03:44,  1.26s/it, loss=0.187, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000365, train/loss_step=0.110, global_step=589.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 59/5971 [01:15<2:03:44,  1.26s/it, loss=0.188, v_num=0, train/loss_simple_step=0.0341, train/loss_vlb_step=0.000129, train/loss_step=0.0341, global_step=589.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 60/5971 [01:17<2:05:06,  1.27s/it, loss=0.188, v_num=0, train/loss_simple_step=0.0341, train/loss_vlb_step=0.000129, train/loss_step=0.0341, global_step=589.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 60/5971 [01:17<2:05:06,  1.27s/it, loss=0.184, v_num=0, train/loss_simple_step=0.413, train/loss_vlb_step=0.00218, train/loss_step=0.413, global_step=589.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:   1%|          | 61/5971 [01:18<2:04:29,  1.26s/it, loss=0.184, v_num=0, train/loss_simple_step=0.413, train/loss_vlb_step=0.00218, train/loss_step=0.413, global_step=589.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 61/5971 [01:18<2:04:29,  1.26s/it, loss=0.188, v_num=0, train/loss_simple_step=0.429, train/loss_vlb_step=0.00213, train/loss_step=0.429, global_step=590.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 62/5971 [01:19<2:03:51,  1.26s/it, loss=0.188, v_num=0, train/loss_simple_step=0.429, train/loss_vlb_step=0.00213, train/loss_step=0.429, global_step=590.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 62/5971 [01:19<2:03:51,  1.26s/it, loss=0.182, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.17e-5, train/loss_step=0.004, global_step=590.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 63/5971 [01:20<2:03:16,  1.25s/it, loss=0.182, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.17e-5, train/loss_step=0.004, global_step=590.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 63/5971 [01:20<2:03:16,  1.25s/it, loss=0.175, v_num=0, train/loss_simple_step=0.0389, train/loss_vlb_step=0.000141, train/loss_step=0.0389, global_step=590.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 64/5971 [01:22<2:04:56,  1.27s/it, loss=0.175, v_num=0, train/loss_simple_step=0.0389, train/loss_vlb_step=0.000141, train/loss_step=0.0389, global_step=590.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 64/5971 [01:22<2:04:56,  1.27s/it, loss=0.159, v_num=0, train/loss_simple_step=0.167, train/loss_vlb_step=0.000609, train/loss_step=0.167, global_step=590.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:   1%|          | 65/5971 [01:23<2:04:20,  1.26s/it, loss=0.159, v_num=0, train/loss_simple_step=0.167, train/loss_vlb_step=0.000609, train/loss_step=0.167, global_step=590.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 65/5971 [01:23<2:04:20,  1.26s/it, loss=0.164, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000339, train/loss_step=0.103, global_step=591.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 66/5971 [01:24<2:03:45,  1.26s/it, loss=0.164, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000339, train/loss_step=0.103, global_step=591.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 66/5971 [01:24<2:03:45,  1.26s/it, loss=0.181, v_num=0, train/loss_simple_step=0.381, train/loss_vlb_step=0.00234, train/loss_step=0.381, global_step=591.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   1%|          | 67/5971 [01:25<2:03:12,  1.25s/it, loss=0.181, v_num=0, train/loss_simple_step=0.381, train/loss_vlb_step=0.00234, train/loss_step=0.381, global_step=591.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 67/5971 [01:25<2:03:12,  1.25s/it, loss=0.175, v_num=0, train/loss_simple_step=0.00459, train/loss_vlb_step=2.46e-5, train/loss_step=0.00459, global_step=591.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 68/5971 [01:27<2:04:23,  1.26s/it, loss=0.175, v_num=0, train/loss_simple_step=0.00459, train/loss_vlb_step=2.46e-5, train/loss_step=0.00459, global_step=591.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 68/5971 [01:27<2:04:23,  1.26s/it, loss=0.175, v_num=0, train/loss_simple_step=0.354, train/loss_vlb_step=0.00163, train/loss_step=0.354, global_step=591.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:   1%|          | 69/5971 [01:28<2:03:52,  1.26s/it, loss=0.175, v_num=0, train/loss_simple_step=0.354, train/loss_vlb_step=0.00163, train/loss_step=0.354, global_step=591.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 69/5971 [01:28<2:03:52,  1.26s/it, loss=0.19, v_num=0, train/loss_simple_step=0.304, train/loss_vlb_step=0.00128, train/loss_step=0.304, global_step=592.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   1%|          | 70/5971 [01:29<2:03:18,  1.25s/it, loss=0.19, v_num=0, train/loss_simple_step=0.304, train/loss_vlb_step=0.00128, train/loss_step=0.304, global_step=592.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 70/5971 [01:29<2:03:18,  1.25s/it, loss=0.186, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000416, train/loss_step=0.124, global_step=592.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 71/5971 [01:29<2:02:47,  1.25s/it, loss=0.186, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000416, train/loss_step=0.124, global_step=592.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 71/5971 [01:29<2:02:47,  1.25s/it, loss=0.181, v_num=0, train/loss_simple_step=0.0718, train/loss_vlb_step=0.000241, train/loss_step=0.0718, global_step=592.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 72/5971 [01:32<2:03:55,  1.26s/it, loss=0.181, v_num=0, train/loss_simple_step=0.0718, train/loss_vlb_step=0.000241, train/loss_step=0.0718, global_step=592.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 72/5971 [01:32<2:03:55,  1.26s/it, loss=0.203, v_num=0, train/loss_simple_step=0.431, train/loss_vlb_step=0.00442, train/loss_step=0.431, global_step=592.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:   1%|          | 73/5971 [01:32<2:03:26,  1.26s/it, loss=0.203, v_num=0, train/loss_simple_step=0.431, train/loss_vlb_step=0.00442, train/loss_step=0.431, global_step=592.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 73/5971 [01:32<2:03:26,  1.26s/it, loss=0.203, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.09e-5, train/loss_step=0.00184, global_step=593.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 74/5971 [01:33<2:02:55,  1.25s/it, loss=0.203, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.09e-5, train/loss_step=0.00184, global_step=593.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|          | 74/5971 [01:33<2:02:55,  1.25s/it, loss=0.215, v_num=0, train/loss_simple_step=0.310, train/loss_vlb_step=0.00131, train/loss_step=0.310, global_step=593.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:   1%|▏         | 75/5971 [01:34<2:02:24,  1.25s/it, loss=0.215, v_num=0, train/loss_simple_step=0.310, train/loss_vlb_step=0.00131, train/loss_step=0.310, global_step=593.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|▏         | 75/5971 [01:34<2:02:24,  1.25s/it, loss=0.202, v_num=0, train/loss_simple_step=0.00618, train/loss_vlb_step=3.22e-5, train/loss_step=0.00618, global_step=593.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|▏         | 76/5971 [01:36<2:03:28,  1.26s/it, loss=0.202, v_num=0, train/loss_simple_step=0.00618, train/loss_vlb_step=3.22e-5, train/loss_step=0.00618, global_step=593.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|▏         | 76/5971 [01:36<2:03:28,  1.26s/it, loss=0.179, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.78e-5, train/loss_step=0.0126, global_step=593.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:   1%|▏         | 77/5971 [01:37<2:03:01,  1.25s/it, loss=0.179, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.78e-5, train/loss_step=0.0126, global_step=593.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|▏         | 77/5971 [01:37<2:03:01,  1.25s/it, loss=0.171, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000428, train/loss_step=0.127, global_step=594.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   1%|▏         | 78/5971 [01:38<2:02:31,  1.25s/it, loss=0.171, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000428, train/loss_step=0.127, global_step=594.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|▏         | 78/5971 [01:38<2:02:31,  1.25s/it, loss=0.171, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=594.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|▏         | 79/5971 [01:39<2:02:03,  1.24s/it, loss=0.171, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=594.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|▏         | 79/5971 [01:39<2:02:03,  1.24s/it, loss=0.17, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=5.08e-5, train/loss_step=0.0113, global_step=594.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|▏         | 80/5971 [01:41<2:03:04,  1.25s/it, loss=0.17, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=5.08e-5, train/loss_step=0.0113, global_step=594.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|▏         | 80/5971 [01:41<2:03:04,  1.25s/it, loss=0.153, v_num=0, train/loss_simple_step=0.0748, train/loss_vlb_step=0.000248, train/loss_step=0.0748, global_step=594.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|▏         | 81/5971 [01:42<2:02:38,  1.25s/it, loss=0.153, v_num=0, train/loss_simple_step=0.0748, train/loss_vlb_step=0.000248, train/loss_step=0.0748, global_step=594.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|▏         | 81/5971 [01:42<2:02:38,  1.25s/it, loss=0.153, v_num=0, train/loss_simple_step=0.436, train/loss_vlb_step=0.00247, train/loss_step=0.436, global_step=595.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:   1%|▏         | 82/5971 [01:43<2:02:10,  1.24s/it, loss=0.153, v_num=0, train/loss_simple_step=0.436, train/loss_vlb_step=0.00247, train/loss_step=0.436, global_step=595.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|▏         | 82/5971 [01:43<2:02:10,  1.24s/it, loss=0.154, v_num=0, train/loss_simple_step=0.0191, train/loss_vlb_step=8.11e-5, train/loss_step=0.0191, global_step=595.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|▏         | 83/5971 [01:44<2:01:42,  1.24s/it, loss=0.154, v_num=0, train/loss_simple_step=0.0191, train/loss_vlb_step=8.11e-5, train/loss_step=0.0191, global_step=595.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|▏         | 83/5971 [01:44<2:01:42,  1.24s/it, loss=0.163, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000774, train/loss_step=0.209, global_step=595.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   1%|▏         | 84/5971 [01:46<2:02:57,  1.25s/it, loss=0.163, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000774, train/loss_step=0.209, global_step=595.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|▏         | 84/5971 [01:46<2:02:57,  1.25s/it, loss=0.165, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.00094, train/loss_step=0.217, global_step=595.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   1%|▏         | 85/5971 [01:47<2:02:32,  1.25s/it, loss=0.165, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.00094, train/loss_step=0.217, global_step=595.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|▏         | 85/5971 [01:47<2:02:32,  1.25s/it, loss=0.168, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000563, train/loss_step=0.165, global_step=596.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|▏         | 86/5971 [01:48<2:02:06,  1.24s/it, loss=0.168, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000563, train/loss_step=0.165, global_step=596.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|▏         | 86/5971 [01:48<2:02:06,  1.24s/it, loss=0.152, v_num=0, train/loss_simple_step=0.061, train/loss_vlb_step=0.000203, train/loss_step=0.061, global_step=596.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|▏         | 87/5971 [01:49<2:01:40,  1.24s/it, loss=0.152, v_num=0, train/loss_simple_step=0.061, train/loss_vlb_step=0.000203, train/loss_step=0.061, global_step=596.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|▏         | 87/5971 [01:49<2:01:40,  1.24s/it, loss=0.155, v_num=0, train/loss_simple_step=0.0591, train/loss_vlb_step=0.000202, train/loss_step=0.0591, global_step=596.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|▏         | 88/5971 [01:51<2:02:35,  1.25s/it, loss=0.155, v_num=0, train/loss_simple_step=0.0591, train/loss_vlb_step=0.000202, train/loss_step=0.0591, global_step=596.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|▏         | 88/5971 [01:51<2:02:35,  1.25s/it, loss=0.138, v_num=0, train/loss_simple_step=0.00754, train/loss_vlb_step=3.48e-5, train/loss_step=0.00754, global_step=596.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|▏         | 89/5971 [01:52<2:02:11,  1.25s/it, loss=0.138, v_num=0, train/loss_simple_step=0.00754, train/loss_vlb_step=3.48e-5, train/loss_step=0.00754, global_step=596.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   1%|▏         | 89/5971 [01:52<2:02:11,  1.25s/it, loss=0.127, v_num=0, train/loss_simple_step=0.0861, train/loss_vlb_step=0.000283, train/loss_step=0.0861, global_step=597.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   2%|▏         | 90/5971 [01:53<2:01:47,  1.24s/it, loss=0.127, v_num=0, train/loss_simple_step=0.0861, train/loss_vlb_step=0.000283, train/loss_step=0.0861, global_step=597.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   2%|▏         | 90/5971 [01:53<2:01:47,  1.24s/it, loss=0.121, v_num=0, train/loss_simple_step=0.00819, train/loss_vlb_step=3.95e-5, train/loss_step=0.00819, global_step=597.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   2%|▏         | 91/5971 [01:53<2:01:23,  1.24s/it, loss=0.121, v_num=0, train/loss_simple_step=0.00819, train/loss_vlb_step=3.95e-5, train/loss_step=0.00819, global_step=597.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   2%|▏         | 91/5971 [01:53<2:01:23,  1.24s/it, loss=0.125, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000521, train/loss_step=0.156, global_step=597.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:   2%|▏         | 92/5971 [01:56<2:02:25,  1.25s/it, loss=0.125, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000521, train/loss_step=0.156, global_step=597.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   2%|▏         | 92/5971 [01:56<2:02:25,  1.25s/it, loss=0.11, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000435, train/loss_step=0.131, global_step=597.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   2%|▏         | 93/5971 [01:57<2:02:01,  1.25s/it, loss=0.11, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000435, train/loss_step=0.131, global_step=597.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   2%|▏         | 93/5971 [01:57<2:02:01,  1.25s/it, loss=0.112, v_num=0, train/loss_simple_step=0.0322, train/loss_vlb_step=0.000119, train/loss_step=0.0322, global_step=598.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   2%|▏         | 94/5971 [01:57<2:01:39,  1.24s/it, loss=0.112, v_num=0, train/loss_simple_step=0.0322, train/loss_vlb_step=0.000119, train/loss_step=0.0322, global_step=598.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   2%|▏         | 94/5971 [01:57<2:01:39,  1.24s/it, loss=0.1, v_num=0, train/loss_simple_step=0.0807, train/loss_vlb_step=0.000273, train/loss_step=0.0807, global_step=598.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:   2%|▏         | 95/5971 [01:58<2:01:15,  1.24s/it, loss=0.1, v_num=0, train/loss_simple_step=0.0807, train/loss_vlb_step=0.000273, train/loss_step=0.0807, global_step=598.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   2%|▏         | 95/5971 [01:58<2:01:15,  1.24s/it, loss=0.1, v_num=0, train/loss_simple_step=0.00485, train/loss_vlb_step=2.49e-5, train/loss_step=0.00485, global_step=598.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   2%|▏         | 96/5971 [02:00<2:02:08,  1.25s/it, loss=0.1, v_num=0, train/loss_simple_step=0.00485, train/loss_vlb_step=2.49e-5, train/loss_step=0.00485, global_step=598.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   2%|▏         | 96/5971 [02:00<2:02:08,  1.25s/it, loss=0.11, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000769, train/loss_step=0.218, global_step=598.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:   2%|▏         | 97/5971 [02:01<2:01:45,  1.24s/it, loss=0.11, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000769, train/loss_step=0.218, global_step=598.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   2%|▏         | 97/5971 [02:01<2:01:45,  1.24s/it, loss=0.111, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000499, train/loss_step=0.150, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   2%|▏         | 98/5971 [02:02<2:01:22,  1.24s/it, loss=0.111, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000499, train/loss_step=0.150, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   2%|▏         | 98/5971 [02:02<2:01:22,  1.24s/it, loss=0.135, v_num=0, train/loss_simple_step=0.566, train/loss_vlb_step=0.00879, train/loss_step=0.566, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   2%|▏         | 99/5971 [02:03<2:01:00,  1.24s/it, loss=0.135, v_num=0, train/loss_simple_step=0.566, train/loss_vlb_step=0.00879, train/loss_step=0.566, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   2%|▏         | 99/5971 [02:03<2:01:00,  1.24s/it, loss=0.141, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.000485, train/loss_step=0.146, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   2%|▏         | 100/5971 [02:05<2:01:54,  1.25s/it, loss=0.141, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.000485, train/loss_step=0.146, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   2%|▏         | 100/5971 [02:05<2:01:54,  1.25s/it, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:07,  2.45it/s][A
Epoch 1:   2%|▏         | 102/5971 [02:06<1:59:56,  1.23s/it, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   1%|          | 2/167 [00:00<00:46,  3.53it/s][A
Epoch 1:   2%|▏         | 104/5971 [02:06<1:57:50,  1.21s/it, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.12it/s][A
Epoch 1:   2%|▏         | 107/5971 [02:06<1:54:36,  1.17s/it, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.98it/s][A
Epoch 1:   2%|▏         | 110/5971 [02:06<1:51:32,  1.14s/it, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.82it/s][A
Epoch 1:   2%|▏         | 113/5971 [02:06<1:48:39,  1.11s/it, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.24it/s][A
Epoch 1:   2%|▏         | 116/5971 [02:07<1:45:55,  1.09s/it, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  10%|█         | 17/167 [00:01<00:07, 20.47it/s][A
Epoch 1:   2%|▏         | 119/5971 [02:07<1:43:19,  1.06s/it, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 21.98it/s][A
Epoch 1:   2%|▏         | 122/5971 [02:07<1:40:50,  1.03s/it, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 22.50it/s][A
Epoch 1:   2%|▏         | 125/5971 [02:07<1:38:29,  1.01s/it, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  16%|█▌        | 26/167 [00:01<00:06, 23.42it/s][A
Epoch 1:   2%|▏         | 128/5971 [02:07<1:36:14,  1.01it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 23.84it/s][A
Epoch 1:   2%|▏         | 131/5971 [02:07<1:34:06,  1.03it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 23.29it/s][A
Epoch 1:   2%|▏         | 134/5971 [02:07<1:32:03,  1.06it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  21%|██        | 35/167 [00:01<00:05, 24.17it/s][A
Epoch 1:   2%|▏         | 137/5971 [02:07<1:30:05,  1.08it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  23%|██▎       | 38/167 [00:02<00:05, 24.11it/s][A
Epoch 1:   2%|▏         | 140/5971 [02:07<1:28:12,  1.10it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  25%|██▍       | 41/167 [00:02<00:05, 24.59it/s][A
Epoch 1:   2%|▏         | 143/5971 [02:08<1:26:24,  1.12it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 25.59it/s][A
Epoch 1:   2%|▏         | 146/5971 [02:08<1:24:40,  1.15it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 24.48it/s][A
Epoch 1:   2%|▏         | 149/5971 [02:08<1:23:01,  1.17it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 25.70it/s][A
Epoch 1:   3%|▎         | 152/5971 [02:08<1:21:25,  1.19it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 26.38it/s][A
Epoch 1:   3%|▎         | 156/5971 [02:08<1:19:22,  1.22it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 27.00it/s][A

Validating:  35%|███▌      | 59/167 [00:02<00:04, 26.82it/s][A
Epoch 1:   3%|▎         | 160/5971 [02:08<1:17:26,  1.25it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  37%|███▋      | 62/167 [00:02<00:04, 26.18it/s][A
Epoch 1:   3%|▎         | 164/5971 [02:08<1:15:35,  1.28it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  39%|███▉      | 65/167 [00:03<00:03, 26.87it/s][A
Epoch 1:   3%|▎         | 168/5971 [02:09<1:13:50,  1.31it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  41%|████      | 68/167 [00:03<00:03, 26.92it/s][A

Validating:  43%|████▎     | 71/167 [00:03<00:04, 21.14it/s][A
Epoch 1:   3%|▎         | 172/5971 [02:09<1:12:13,  1.34it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  44%|████▍     | 74/167 [00:03<00:04, 23.14it/s][A
Epoch 1:   3%|▎         | 176/5971 [02:09<1:10:37,  1.37it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 23.73it/s][A
Epoch 1:   3%|▎         | 180/5971 [02:09<1:09:05,  1.40it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 24.86it/s][A

Validating:  50%|████▉     | 83/167 [00:03<00:03, 26.07it/s][A
Epoch 1:   3%|▎         | 184/5971 [02:09<1:07:37,  1.43it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  51%|█████▏    | 86/167 [00:03<00:03, 25.61it/s][A
Epoch 1:   3%|▎         | 188/5971 [02:09<1:06:13,  1.46it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  53%|█████▎    | 89/167 [00:04<00:02, 26.44it/s][A
Epoch 1:   3%|▎         | 192/5971 [02:10<1:04:53,  1.48it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 25.79it/s][A

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 25.10it/s][A
Epoch 1:   3%|▎         | 196/5971 [02:10<1:03:36,  1.51it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 25.69it/s][A
Epoch 1:   3%|▎         | 200/5971 [02:10<1:02:22,  1.54it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  60%|██████    | 101/167 [00:04<00:02, 24.82it/s][A
Epoch 1:   3%|▎         | 204/5971 [02:10<1:01:10,  1.57it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 25.78it/s][A

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 25.71it/s][A
Epoch 1:   3%|▎         | 208/5971 [02:10<1:00:02,  1.60it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 26.37it/s][A
Epoch 1:   4%|▎         | 212/5971 [02:10<58:56,  1.63it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  

Validating:  68%|██████▊   | 113/167 [00:04<00:02, 26.00it/s][A
Epoch 1:   4%|▎         | 216/5971 [02:10<57:52,  1.66it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  69%|██████▉   | 116/167 [00:05<00:01, 26.24it/s][A

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 26.07it/s][A
Epoch 1:   4%|▎         | 220/5971 [02:11<56:51,  1.69it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 26.70it/s][A
Epoch 1:   4%|▍         | 224/5971 [02:11<55:52,  1.71it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 25.20it/s][A
Epoch 1:   4%|▍         | 228/5971 [02:11<54:55,  1.74it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 25.46it/s][A

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 25.87it/s][A
Epoch 1:   4%|▍         | 232/5971 [02:11<54:00,  1.77it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  80%|████████  | 134/167 [00:05<00:01, 26.42it/s][A
Epoch 1:   4%|▍         | 236/5971 [02:11<53:07,  1.80it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 26.85it/s][A
Epoch 1:   4%|▍         | 240/5971 [02:11<52:15,  1.83it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  84%|████████▍ | 140/167 [00:06<00:01, 25.91it/s][A

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 26.55it/s][A
Epoch 1:   4%|▍         | 244/5971 [02:12<51:25,  1.86it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 27.21it/s][A
Epoch 1:   4%|▍         | 248/5971 [02:12<50:37,  1.88it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 26.28it/s][A
Epoch 1:   4%|▍         | 252/5971 [02:12<49:51,  1.91it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 24.70it/s][A

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 24.51it/s][A
Epoch 1:   4%|▍         | 256/5971 [02:12<49:06,  1.94it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 25.33it/s][A
Epoch 1:   4%|▍         | 260/5971 [02:12<48:22,  1.97it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 25.77it/s][A
Epoch 1:   4%|▍         | 264/5971 [02:12<47:39,  2.00it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 26.44it/s][A

Validating: 100%|██████████| 167/167 [00:07<00:00, 25.79it/s][A
Epoch 1:   4%|▍         | 268/5971 [02:12<46:58,  2.02it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   4%|▍         | 268/5971 [02:13<47:06,  2.02it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.33it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.39it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.18it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.77it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.22it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.50it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.67it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:09,  4.64it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  4.60it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:08,  4.72it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:08,  4.85it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  4.99it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:07,  5.07it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:07,  5.10it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.19it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.25it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.25it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:04<00:06,  5.23it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:06,  5.14it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.09it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.11it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.13it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:05<00:05,  5.12it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.26it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.23it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.27it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.28it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.31it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:06<00:04,  5.18it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.17it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.24it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.28it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.29it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:07<00:03,  5.25it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.21it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.11it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.10it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.08it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:08<00:02,  5.03it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:02,  4.95it/s][A
Epoch 1:   4%|▍         | 268/5971 [02:23<50:31,  1.88it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  4.91it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  4.95it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.02it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:09<00:01,  5.08it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:00,  5.20it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.28it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.24it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.20it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:10<00:00,  5.20it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  5.16it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  4.87it/s]

Epoch 1:   5%|▍         | 269/5971 [02:26<51:25,  1.85it/s, loss=0.143, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=599.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 269/5971 [02:26<51:26,  1.85it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0161, train/loss_vlb_step=6.9e-5, train/loss_step=0.0161, global_step=600.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.29it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.30it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:15,  3.12it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.71it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.20it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.57it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.82it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.99it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.08it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.09it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.16it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.23it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:07,  5.14it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:07,  5.12it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.07it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.05it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.14it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:04<00:06,  5.23it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.26it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.35it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.28it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.18it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.19it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.22it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.25it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.17it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.26it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.29it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:06<00:03,  5.34it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.42it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.49it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.43it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.40it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:07<00:02,  5.40it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.41it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.48it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.52it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.52it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.46it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.45it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.46it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.46it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.46it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.42it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:00,  5.39it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.40it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.41it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.31it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.26it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.23it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.01it/s]

Epoch 1:   5%|▍         | 270/5971 [02:38<55:36,  1.71it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0161, train/loss_vlb_step=6.9e-5, train/loss_step=0.0161, global_step=600.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 270/5971 [02:38<55:36,  1.71it/s, loss=0.127, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000401, train/loss_step=0.122, global_step=600.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.41it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.24it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.91it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.40it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.66it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.83it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.05it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.19it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.17it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.18it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.24it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:07,  5.22it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.26it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.33it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.35it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.36it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.39it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.42it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.43it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.44it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.43it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.36it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.32it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.30it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.34it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.35it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.35it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.32it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.33it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.37it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.40it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.39it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.37it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.35it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.39it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.43it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.45it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.39it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.38it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.38it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.45it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.52it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.57it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.59it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.45it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.42it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.44it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.45it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.40it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.09it/s]

Epoch 1:   5%|▍         | 271/5971 [02:51<59:46,  1.59it/s, loss=0.127, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000401, train/loss_step=0.122, global_step=600.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 271/5971 [02:51<59:46,  1.59it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0206, train/loss_vlb_step=8.18e-5, train/loss_step=0.0206, global_step=600.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.29it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.37it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.21it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.86it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.33it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.68it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.96it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.99it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.15it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.25it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.31it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.38it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.48it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.54it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.56it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.50it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.43it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.43it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.43it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.40it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.43it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.41it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.31it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.38it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.37it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.34it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.32it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.35it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.36it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.27it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.26it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.28it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.30it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.34it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.34it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.38it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.38it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.31it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.16it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.18it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.30it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.38it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.41it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.43it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.34it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.35it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.37it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.41it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.45it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.42it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.08it/s]

Epoch 1:   5%|▍         | 272/5971 [03:04<1:04:18,  1.48it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0206, train/loss_vlb_step=8.18e-5, train/loss_step=0.0206, global_step=600.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 272/5971 [03:04<1:04:18,  1.48it/s, loss=0.116, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000632, train/loss_step=0.185, global_step=600.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   5%|▍         | 273/5971 [03:05<1:04:22,  1.48it/s, loss=0.116, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000632, train/loss_step=0.185, global_step=600.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 273/5971 [03:05<1:04:22,  1.48it/s, loss=0.117, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000663, train/loss_step=0.196, global_step=601.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 274/5971 [03:06<1:04:25,  1.47it/s, loss=0.117, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000663, train/loss_step=0.196, global_step=601.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 274/5971 [03:06<1:04:25,  1.47it/s, loss=0.126, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.00097, train/loss_step=0.243, global_step=601.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   5%|▍         | 275/5971 [03:07<1:04:29,  1.47it/s, loss=0.126, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.00097, train/loss_step=0.243, global_step=601.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 275/5971 [03:07<1:04:29,  1.47it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000289, train/loss_step=0.0851, global_step=601.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 276/5971 [03:09<1:05:02,  1.46it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000289, train/loss_step=0.0851, global_step=601.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 276/5971 [03:09<1:05:02,  1.46it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0267, train/loss_vlb_step=0.000103, train/loss_step=0.0267, global_step=601.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 277/5971 [03:10<1:05:06,  1.46it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0267, train/loss_vlb_step=0.000103, train/loss_step=0.0267, global_step=601.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 277/5971 [03:10<1:05:06,  1.46it/s, loss=0.158, v_num=0, train/loss_simple_step=0.680, train/loss_vlb_step=0.0224, train/loss_step=0.680, global_step=602.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:   5%|▍         | 278/5971 [03:11<1:05:12,  1.46it/s, loss=0.158, v_num=0, train/loss_simple_step=0.680, train/loss_vlb_step=0.0224, train/loss_step=0.680, global_step=602.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 278/5971 [03:11<1:05:12,  1.46it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0582, train/loss_vlb_step=0.000208, train/loss_step=0.0582, global_step=602.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 279/5971 [03:12<1:05:20,  1.45it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0582, train/loss_vlb_step=0.000208, train/loss_step=0.0582, global_step=602.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 279/5971 [03:12<1:05:20,  1.45it/s, loss=0.17, v_num=0, train/loss_simple_step=0.341, train/loss_vlb_step=0.00187, train/loss_step=0.341, global_step=602.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:   5%|▍         | 280/5971 [03:14<1:05:48,  1.44it/s, loss=0.17, v_num=0, train/loss_simple_step=0.341, train/loss_vlb_step=0.00187, train/loss_step=0.341, global_step=602.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 280/5971 [03:14<1:05:48,  1.44it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00588, train/loss_vlb_step=2.86e-5, train/loss_step=0.00588, global_step=602.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 281/5971 [03:15<1:05:52,  1.44it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00588, train/loss_vlb_step=2.86e-5, train/loss_step=0.00588, global_step=602.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 281/5971 [03:15<1:05:52,  1.44it/s, loss=0.181, v_num=0, train/loss_simple_step=0.380, train/loss_vlb_step=0.00213, train/loss_step=0.380, global_step=603.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:   5%|▍         | 282/5971 [03:16<1:05:55,  1.44it/s, loss=0.181, v_num=0, train/loss_simple_step=0.380, train/loss_vlb_step=0.00213, train/loss_step=0.380, global_step=603.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 282/5971 [03:16<1:05:55,  1.44it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0342, train/loss_vlb_step=0.000126, train/loss_step=0.0342, global_step=603.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 283/5971 [03:17<1:05:58,  1.44it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0342, train/loss_vlb_step=0.000126, train/loss_step=0.0342, global_step=603.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 283/5971 [03:17<1:05:58,  1.44it/s, loss=0.188, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000645, train/loss_step=0.190, global_step=603.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:   5%|▍         | 284/5971 [03:19<1:06:27,  1.43it/s, loss=0.188, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000645, train/loss_step=0.190, global_step=603.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 284/5971 [03:19<1:06:27,  1.43it/s, loss=0.194, v_num=0, train/loss_simple_step=0.336, train/loss_vlb_step=0.00134, train/loss_step=0.336, global_step=603.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   5%|▍         | 285/5971 [03:20<1:06:30,  1.42it/s, loss=0.194, v_num=0, train/loss_simple_step=0.336, train/loss_vlb_step=0.00134, train/loss_step=0.336, global_step=603.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 285/5971 [03:20<1:06:30,  1.42it/s, loss=0.202, v_num=0, train/loss_simple_step=0.299, train/loss_vlb_step=0.00179, train/loss_step=0.299, global_step=604.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 286/5971 [03:21<1:06:33,  1.42it/s, loss=0.202, v_num=0, train/loss_simple_step=0.299, train/loss_vlb_step=0.00179, train/loss_step=0.299, global_step=604.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 286/5971 [03:21<1:06:33,  1.42it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0232, train/loss_vlb_step=8.99e-5, train/loss_step=0.0232, global_step=604.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 287/5971 [03:22<1:06:36,  1.42it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0232, train/loss_vlb_step=8.99e-5, train/loss_step=0.0232, global_step=604.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 287/5971 [03:22<1:06:36,  1.42it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0308, train/loss_vlb_step=0.000115, train/loss_step=0.0308, global_step=604.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 288/5971 [03:24<1:07:03,  1.41it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0308, train/loss_vlb_step=0.000115, train/loss_step=0.0308, global_step=604.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 288/5971 [03:24<1:07:03,  1.41it/s, loss=0.177, v_num=0, train/loss_simple_step=0.270, train/loss_vlb_step=0.00102, train/loss_step=0.270, global_step=604.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:   5%|▍         | 289/5971 [03:25<1:07:06,  1.41it/s, loss=0.177, v_num=0, train/loss_simple_step=0.270, train/loss_vlb_step=0.00102, train/loss_step=0.270, global_step=604.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 289/5971 [03:25<1:07:06,  1.41it/s, loss=0.18, v_num=0, train/loss_simple_step=0.066, train/loss_vlb_step=0.000218, train/loss_step=0.066, global_step=605.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 290/5971 [03:26<1:07:09,  1.41it/s, loss=0.18, v_num=0, train/loss_simple_step=0.066, train/loss_vlb_step=0.000218, train/loss_step=0.066, global_step=605.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 290/5971 [03:26<1:07:09,  1.41it/s, loss=0.2, v_num=0, train/loss_simple_step=0.535, train/loss_vlb_step=0.00439, train/loss_step=0.535, global_step=605.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:   5%|▍         | 291/5971 [03:27<1:07:11,  1.41it/s, loss=0.2, v_num=0, train/loss_simple_step=0.535, train/loss_vlb_step=0.00439, train/loss_step=0.535, global_step=605.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 291/5971 [03:27<1:07:11,  1.41it/s, loss=0.2, v_num=0, train/loss_simple_step=0.00893, train/loss_vlb_step=4.28e-5, train/loss_step=0.00893, global_step=605.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 292/5971 [03:29<1:07:38,  1.40it/s, loss=0.2, v_num=0, train/loss_simple_step=0.00893, train/loss_vlb_step=4.28e-5, train/loss_step=0.00893, global_step=605.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 292/5971 [03:29<1:07:38,  1.40it/s, loss=0.201, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000727, train/loss_step=0.204, global_step=605.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   5%|▍         | 293/5971 [03:30<1:07:40,  1.40it/s, loss=0.201, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000727, train/loss_step=0.204, global_step=605.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 293/5971 [03:30<1:07:40,  1.40it/s, loss=0.206, v_num=0, train/loss_simple_step=0.306, train/loss_vlb_step=0.00146, train/loss_step=0.306, global_step=606.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   5%|▍         | 294/5971 [03:31<1:07:43,  1.40it/s, loss=0.206, v_num=0, train/loss_simple_step=0.306, train/loss_vlb_step=0.00146, train/loss_step=0.306, global_step=606.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 294/5971 [03:31<1:07:43,  1.40it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0489, train/loss_vlb_step=0.000171, train/loss_step=0.0489, global_step=606.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 295/5971 [03:32<1:07:45,  1.40it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0489, train/loss_vlb_step=0.000171, train/loss_step=0.0489, global_step=606.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 295/5971 [03:32<1:07:45,  1.40it/s, loss=0.204, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.00101, train/loss_step=0.240, global_step=606.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:   5%|▍         | 296/5971 [03:35<1:08:41,  1.38it/s, loss=0.204, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.00101, train/loss_step=0.240, global_step=606.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 296/5971 [03:35<1:08:41,  1.38it/s, loss=0.208, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.000345, train/loss_step=0.100, global_step=606.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 297/5971 [03:36<1:08:44,  1.38it/s, loss=0.208, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.000345, train/loss_step=0.100, global_step=606.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 297/5971 [03:36<1:08:44,  1.38it/s, loss=0.179, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000342, train/loss_step=0.103, global_step=607.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 298/5971 [03:37<1:08:46,  1.37it/s, loss=0.179, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000342, train/loss_step=0.103, global_step=607.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▍         | 298/5971 [03:37<1:08:46,  1.37it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00737, train/loss_vlb_step=3.53e-5, train/loss_step=0.00737, global_step=607.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 299/5971 [03:38<1:08:48,  1.37it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00737, train/loss_vlb_step=3.53e-5, train/loss_step=0.00737, global_step=607.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 299/5971 [03:38<1:08:48,  1.37it/s, loss=0.174, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.00117, train/loss_step=0.289, global_step=607.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:   5%|▌         | 300/5971 [03:41<1:09:31,  1.36it/s, loss=0.174, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.00117, train/loss_step=0.289, global_step=607.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 300/5971 [03:41<1:09:31,  1.36it/s, loss=0.21, v_num=0, train/loss_simple_step=0.720, train/loss_vlb_step=0.015, train/loss_step=0.720, global_step=607.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:   5%|▌         | 301/5971 [03:42<1:09:34,  1.36it/s, loss=0.21, v_num=0, train/loss_simple_step=0.720, train/loss_vlb_step=0.015, train/loss_step=0.720, global_step=607.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 301/5971 [03:42<1:09:34,  1.36it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00943, train/loss_vlb_step=4.68e-5, train/loss_step=0.00943, global_step=608.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 302/5971 [03:43<1:09:36,  1.36it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00943, train/loss_vlb_step=4.68e-5, train/loss_step=0.00943, global_step=608.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 302/5971 [03:43<1:09:36,  1.36it/s, loss=0.197, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000529, train/loss_step=0.158, global_step=608.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:   5%|▌         | 303/5971 [03:44<1:09:38,  1.36it/s, loss=0.197, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000529, train/loss_step=0.158, global_step=608.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 303/5971 [03:44<1:09:38,  1.36it/s, loss=0.215, v_num=0, train/loss_simple_step=0.550, train/loss_vlb_step=0.00429, train/loss_step=0.550, global_step=608.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   5%|▌         | 304/5971 [03:46<1:10:03,  1.35it/s, loss=0.215, v_num=0, train/loss_simple_step=0.550, train/loss_vlb_step=0.00429, train/loss_step=0.550, global_step=608.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 304/5971 [03:46<1:10:03,  1.35it/s, loss=0.22, v_num=0, train/loss_simple_step=0.433, train/loss_vlb_step=0.00375, train/loss_step=0.433, global_step=608.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   5%|▌         | 305/5971 [03:47<1:10:06,  1.35it/s, loss=0.22, v_num=0, train/loss_simple_step=0.433, train/loss_vlb_step=0.00375, train/loss_step=0.433, global_step=608.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 305/5971 [03:47<1:10:06,  1.35it/s, loss=0.214, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000579, train/loss_step=0.172, global_step=609.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 306/5971 [03:48<1:10:08,  1.35it/s, loss=0.214, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000579, train/loss_step=0.172, global_step=609.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 306/5971 [03:48<1:10:08,  1.35it/s, loss=0.219, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000438, train/loss_step=0.132, global_step=609.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 307/5971 [03:48<1:10:09,  1.35it/s, loss=0.219, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000438, train/loss_step=0.132, global_step=609.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 307/5971 [03:48<1:10:09,  1.35it/s, loss=0.233, v_num=0, train/loss_simple_step=0.301, train/loss_vlb_step=0.0016, train/loss_step=0.301, global_step=609.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:   5%|▌         | 308/5971 [03:51<1:10:38,  1.34it/s, loss=0.233, v_num=0, train/loss_simple_step=0.301, train/loss_vlb_step=0.0016, train/loss_step=0.301, global_step=609.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 308/5971 [03:51<1:10:38,  1.34it/s, loss=0.22, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=6.45e-5, train/loss_step=0.0142, global_step=609.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 309/5971 [03:52<1:10:40,  1.34it/s, loss=0.22, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=6.45e-5, train/loss_step=0.0142, global_step=609.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 309/5971 [03:52<1:10:40,  1.34it/s, loss=0.217, v_num=0, train/loss_simple_step=0.00267, train/loss_vlb_step=1.53e-5, train/loss_step=0.00267, global_step=610.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 310/5971 [03:53<1:10:42,  1.33it/s, loss=0.217, v_num=0, train/loss_simple_step=0.00267, train/loss_vlb_step=1.53e-5, train/loss_step=0.00267, global_step=610.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 310/5971 [03:53<1:10:42,  1.33it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0727, train/loss_vlb_step=0.000258, train/loss_step=0.0727, global_step=610.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   5%|▌         | 311/5971 [03:53<1:10:43,  1.33it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0727, train/loss_vlb_step=0.000258, train/loss_step=0.0727, global_step=610.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 311/5971 [03:53<1:10:43,  1.33it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0082, train/loss_vlb_step=4e-5, train/loss_step=0.0082, global_step=610.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:   5%|▌         | 312/5971 [03:56<1:11:07,  1.33it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0082, train/loss_vlb_step=4e-5, train/loss_step=0.0082, global_step=610.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 312/5971 [03:56<1:11:07,  1.33it/s, loss=0.195, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.000842, train/loss_step=0.237, global_step=610.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 313/5971 [03:56<1:11:09,  1.33it/s, loss=0.195, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.000842, train/loss_step=0.237, global_step=610.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 313/5971 [03:56<1:11:09,  1.33it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0517, train/loss_vlb_step=0.000188, train/loss_step=0.0517, global_step=611.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 314/5971 [03:57<1:11:10,  1.32it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0517, train/loss_vlb_step=0.000188, train/loss_step=0.0517, global_step=611.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 314/5971 [03:57<1:11:10,  1.32it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0797, train/loss_vlb_step=0.000265, train/loss_step=0.0797, global_step=611.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 315/5971 [03:58<1:11:12,  1.32it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0797, train/loss_vlb_step=0.000265, train/loss_step=0.0797, global_step=611.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 315/5971 [03:58<1:11:12,  1.32it/s, loss=0.18, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000496, train/loss_step=0.150, global_step=611.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:   5%|▌         | 316/5971 [04:01<1:11:41,  1.31it/s, loss=0.18, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000496, train/loss_step=0.150, global_step=611.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 316/5971 [04:01<1:11:41,  1.31it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0296, train/loss_vlb_step=0.000121, train/loss_step=0.0296, global_step=611.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 317/5971 [04:02<1:11:43,  1.31it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0296, train/loss_vlb_step=0.000121, train/loss_step=0.0296, global_step=611.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 317/5971 [04:02<1:11:43,  1.31it/s, loss=0.177, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000408, train/loss_step=0.122, global_step=612.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:   5%|▌         | 318/5971 [04:02<1:11:44,  1.31it/s, loss=0.177, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000408, train/loss_step=0.122, global_step=612.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 318/5971 [04:02<1:11:44,  1.31it/s, loss=0.192, v_num=0, train/loss_simple_step=0.315, train/loss_vlb_step=0.00132, train/loss_step=0.315, global_step=612.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   5%|▌         | 319/5971 [04:03<1:11:46,  1.31it/s, loss=0.192, v_num=0, train/loss_simple_step=0.315, train/loss_vlb_step=0.00132, train/loss_step=0.315, global_step=612.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 319/5971 [04:03<1:11:46,  1.31it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0828, train/loss_vlb_step=0.000282, train/loss_step=0.0828, global_step=612.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 320/5971 [04:05<1:12:09,  1.31it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0828, train/loss_vlb_step=0.000282, train/loss_step=0.0828, global_step=612.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 320/5971 [04:05<1:12:09,  1.31it/s, loss=0.152, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.00041, train/loss_step=0.124, global_step=612.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:   5%|▌         | 321/5971 [04:06<1:12:11,  1.30it/s, loss=0.152, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.00041, train/loss_step=0.124, global_step=612.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 321/5971 [04:06<1:12:11,  1.30it/s, loss=0.167, v_num=0, train/loss_simple_step=0.295, train/loss_vlb_step=0.00109, train/loss_step=0.295, global_step=613.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 322/5971 [04:07<1:12:12,  1.30it/s, loss=0.167, v_num=0, train/loss_simple_step=0.295, train/loss_vlb_step=0.00109, train/loss_step=0.295, global_step=613.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 322/5971 [04:07<1:12:12,  1.30it/s, loss=0.171, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.000945, train/loss_step=0.243, global_step=613.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 323/5971 [04:08<1:12:13,  1.30it/s, loss=0.171, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.000945, train/loss_step=0.243, global_step=613.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 323/5971 [04:08<1:12:13,  1.30it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0533, train/loss_vlb_step=0.000178, train/loss_step=0.0533, global_step=613.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 324/5971 [04:10<1:12:36,  1.30it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0533, train/loss_vlb_step=0.000178, train/loss_step=0.0533, global_step=613.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 324/5971 [04:10<1:12:36,  1.30it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0419, train/loss_vlb_step=0.000152, train/loss_step=0.0419, global_step=613.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 325/5971 [04:11<1:12:37,  1.30it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0419, train/loss_vlb_step=0.000152, train/loss_step=0.0419, global_step=613.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 325/5971 [04:11<1:12:37,  1.30it/s, loss=0.126, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000519, train/loss_step=0.155, global_step=614.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:   5%|▌         | 326/5971 [04:12<1:12:38,  1.30it/s, loss=0.126, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000519, train/loss_step=0.155, global_step=614.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 326/5971 [04:12<1:12:38,  1.30it/s, loss=0.13, v_num=0, train/loss_simple_step=0.221, train/loss_vlb_step=0.000888, train/loss_step=0.221, global_step=614.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   5%|▌         | 327/5971 [04:13<1:12:39,  1.29it/s, loss=0.13, v_num=0, train/loss_simple_step=0.221, train/loss_vlb_step=0.000888, train/loss_step=0.221, global_step=614.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 327/5971 [04:13<1:12:39,  1.29it/s, loss=0.159, v_num=0, train/loss_simple_step=0.891, train/loss_vlb_step=0.0908, train/loss_step=0.891, global_step=614.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   5%|▌         | 328/5971 [04:15<1:13:02,  1.29it/s, loss=0.159, v_num=0, train/loss_simple_step=0.891, train/loss_vlb_step=0.0908, train/loss_step=0.891, global_step=614.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   5%|▌         | 328/5971 [04:15<1:13:02,  1.29it/s, loss=0.173, v_num=0, train/loss_simple_step=0.277, train/loss_vlb_step=0.00127, train/loss_step=0.277, global_step=614.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 329/5971 [04:16<1:13:03,  1.29it/s, loss=0.173, v_num=0, train/loss_simple_step=0.277, train/loss_vlb_step=0.00127, train/loss_step=0.277, global_step=614.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 329/5971 [04:16<1:13:03,  1.29it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0873, train/loss_vlb_step=0.000292, train/loss_step=0.0873, global_step=615.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 330/5971 [04:17<1:13:04,  1.29it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0873, train/loss_vlb_step=0.000292, train/loss_step=0.0873, global_step=615.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 330/5971 [04:17<1:13:04,  1.29it/s, loss=0.181, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000516, train/loss_step=0.154, global_step=615.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:   6%|▌         | 331/5971 [04:18<1:13:04,  1.29it/s, loss=0.181, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000516, train/loss_step=0.154, global_step=615.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 331/5971 [04:18<1:13:04,  1.29it/s, loss=0.192, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000814, train/loss_step=0.225, global_step=615.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 332/5971 [04:20<1:13:28,  1.28it/s, loss=0.192, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000814, train/loss_step=0.225, global_step=615.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 332/5971 [04:20<1:13:28,  1.28it/s, loss=0.19, v_num=0, train/loss_simple_step=0.199, train/loss_vlb_step=0.000683, train/loss_step=0.199, global_step=615.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   6%|▌         | 333/5971 [04:21<1:13:29,  1.28it/s, loss=0.19, v_num=0, train/loss_simple_step=0.199, train/loss_vlb_step=0.000683, train/loss_step=0.199, global_step=615.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 333/5971 [04:21<1:13:29,  1.28it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0176, train/loss_vlb_step=7.23e-5, train/loss_step=0.0176, global_step=616.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 334/5971 [04:22<1:13:29,  1.28it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0176, train/loss_vlb_step=7.23e-5, train/loss_step=0.0176, global_step=616.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 334/5971 [04:22<1:13:29,  1.28it/s, loss=0.184, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.62e-5, train/loss_step=0.00289, global_step=616.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 335/5971 [04:22<1:13:30,  1.28it/s, loss=0.184, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.62e-5, train/loss_step=0.00289, global_step=616.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 335/5971 [04:22<1:13:30,  1.28it/s, loss=0.189, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.000889, train/loss_step=0.235, global_step=616.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:   6%|▌         | 336/5971 [04:25<1:13:54,  1.27it/s, loss=0.189, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.000889, train/loss_step=0.235, global_step=616.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 336/5971 [04:25<1:13:54,  1.27it/s, loss=0.188, v_num=0, train/loss_simple_step=0.023, train/loss_vlb_step=9.17e-5, train/loss_step=0.023, global_step=616.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   6%|▌         | 337/5971 [04:26<1:13:55,  1.27it/s, loss=0.188, v_num=0, train/loss_simple_step=0.023, train/loss_vlb_step=9.17e-5, train/loss_step=0.023, global_step=616.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 337/5971 [04:26<1:13:55,  1.27it/s, loss=0.182, v_num=0, train/loss_simple_step=0.00571, train/loss_vlb_step=2.88e-5, train/loss_step=0.00571, global_step=617.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 338/5971 [04:26<1:13:55,  1.27it/s, loss=0.182, v_num=0, train/loss_simple_step=0.00571, train/loss_vlb_step=2.88e-5, train/loss_step=0.00571, global_step=617.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 338/5971 [04:26<1:13:56,  1.27it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00195, train/loss_vlb_step=1.19e-5, train/loss_step=0.00195, global_step=617.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 339/5971 [04:27<1:13:56,  1.27it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00195, train/loss_vlb_step=1.19e-5, train/loss_step=0.00195, global_step=617.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 339/5971 [04:27<1:13:56,  1.27it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0408, train/loss_vlb_step=0.000147, train/loss_step=0.0408, global_step=617.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   6%|▌         | 340/5971 [04:30<1:14:19,  1.26it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0408, train/loss_vlb_step=0.000147, train/loss_step=0.0408, global_step=617.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 340/5971 [04:30<1:14:19,  1.26it/s, loss=0.172, v_num=0, train/loss_simple_step=0.270, train/loss_vlb_step=0.00109, train/loss_step=0.270, global_step=617.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:   6%|▌         | 341/5971 [04:30<1:14:20,  1.26it/s, loss=0.172, v_num=0, train/loss_simple_step=0.270, train/loss_vlb_step=0.00109, train/loss_step=0.270, global_step=617.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 341/5971 [04:30<1:14:20,  1.26it/s, loss=0.164, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.00044, train/loss_step=0.129, global_step=618.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 342/5971 [04:31<1:14:20,  1.26it/s, loss=0.164, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.00044, train/loss_step=0.129, global_step=618.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 342/5971 [04:31<1:14:20,  1.26it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00781, train/loss_vlb_step=3.73e-5, train/loss_step=0.00781, global_step=618.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 343/5971 [04:32<1:14:21,  1.26it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00781, train/loss_vlb_step=3.73e-5, train/loss_step=0.00781, global_step=618.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 343/5971 [04:32<1:14:21,  1.26it/s, loss=0.171, v_num=0, train/loss_simple_step=0.443, train/loss_vlb_step=0.00281, train/loss_step=0.443, global_step=618.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:   6%|▌         | 344/5971 [04:35<1:14:45,  1.25it/s, loss=0.171, v_num=0, train/loss_simple_step=0.443, train/loss_vlb_step=0.00281, train/loss_step=0.443, global_step=618.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 344/5971 [04:35<1:14:45,  1.25it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0238, train/loss_vlb_step=9.72e-5, train/loss_step=0.0238, global_step=618.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 345/5971 [04:35<1:14:46,  1.25it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0238, train/loss_vlb_step=9.72e-5, train/loss_step=0.0238, global_step=618.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 345/5971 [04:35<1:14:46,  1.25it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00916, train/loss_vlb_step=4.04e-5, train/loss_step=0.00916, global_step=619.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 346/5971 [04:36<1:14:46,  1.25it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00916, train/loss_vlb_step=4.04e-5, train/loss_step=0.00916, global_step=619.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 346/5971 [04:36<1:14:46,  1.25it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0987, train/loss_vlb_step=0.00033, train/loss_step=0.0987, global_step=619.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:   6%|▌         | 347/5971 [04:37<1:14:46,  1.25it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0987, train/loss_vlb_step=0.00033, train/loss_step=0.0987, global_step=619.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 347/5971 [04:37<1:14:46,  1.25it/s, loss=0.144, v_num=0, train/loss_simple_step=0.627, train/loss_vlb_step=0.022, train/loss_step=0.627, global_step=619.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:   6%|▌         | 348/5971 [04:39<1:15:07,  1.25it/s, loss=0.144, v_num=0, train/loss_simple_step=0.627, train/loss_vlb_step=0.022, train/loss_step=0.627, global_step=619.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 348/5971 [04:39<1:15:07,  1.25it/s, loss=0.145, v_num=0, train/loss_simple_step=0.297, train/loss_vlb_step=0.00128, train/loss_step=0.297, global_step=619.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 349/5971 [04:40<1:15:08,  1.25it/s, loss=0.145, v_num=0, train/loss_simple_step=0.297, train/loss_vlb_step=0.00128, train/loss_step=0.297, global_step=619.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 349/5971 [04:40<1:15:08,  1.25it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0716, train/loss_vlb_step=0.000247, train/loss_step=0.0716, global_step=620.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 350/5971 [04:41<1:15:08,  1.25it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0716, train/loss_vlb_step=0.000247, train/loss_step=0.0716, global_step=620.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 350/5971 [04:41<1:15:08,  1.25it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00592, train/loss_vlb_step=2.74e-5, train/loss_step=0.00592, global_step=620.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 351/5971 [04:42<1:15:08,  1.25it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00592, train/loss_vlb_step=2.74e-5, train/loss_step=0.00592, global_step=620.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 351/5971 [04:42<1:15:08,  1.25it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=5.19e-5, train/loss_step=0.0108, global_step=620.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:   6%|▌         | 352/5971 [04:44<1:15:29,  1.24it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=5.19e-5, train/loss_step=0.0108, global_step=620.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 352/5971 [04:44<1:15:29,  1.24it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0341, train/loss_vlb_step=0.000134, train/loss_step=0.0341, global_step=620.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 353/5971 [04:45<1:15:30,  1.24it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0341, train/loss_vlb_step=0.000134, train/loss_step=0.0341, global_step=620.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 353/5971 [04:45<1:15:30,  1.24it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0019, train/loss_vlb_step=1.06e-5, train/loss_step=0.0019, global_step=621.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   6%|▌         | 354/5971 [04:46<1:15:30,  1.24it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0019, train/loss_vlb_step=1.06e-5, train/loss_step=0.0019, global_step=621.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 354/5971 [04:46<1:15:30,  1.24it/s, loss=0.129, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.000888, train/loss_step=0.243, global_step=621.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   6%|▌         | 355/5971 [04:47<1:15:31,  1.24it/s, loss=0.129, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.000888, train/loss_step=0.243, global_step=621.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 355/5971 [04:47<1:15:31,  1.24it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0955, train/loss_vlb_step=0.000314, train/loss_step=0.0955, global_step=621.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 356/5971 [04:49<1:15:50,  1.23it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0955, train/loss_vlb_step=0.000314, train/loss_step=0.0955, global_step=621.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 356/5971 [04:49<1:15:50,  1.23it/s, loss=0.149, v_num=0, train/loss_simple_step=0.560, train/loss_vlb_step=0.00774, train/loss_step=0.560, global_step=621.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:   6%|▌         | 357/5971 [04:50<1:15:51,  1.23it/s, loss=0.149, v_num=0, train/loss_simple_step=0.560, train/loss_vlb_step=0.00774, train/loss_step=0.560, global_step=621.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 357/5971 [04:50<1:15:51,  1.23it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0206, train/loss_vlb_step=8.45e-5, train/loss_step=0.0206, global_step=622.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 358/5971 [04:51<1:15:51,  1.23it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0206, train/loss_vlb_step=8.45e-5, train/loss_step=0.0206, global_step=622.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 358/5971 [04:51<1:15:51,  1.23it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0736, train/loss_vlb_step=0.000245, train/loss_step=0.0736, global_step=622.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 359/5971 [04:51<1:15:51,  1.23it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0736, train/loss_vlb_step=0.000245, train/loss_step=0.0736, global_step=622.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 359/5971 [04:51<1:15:51,  1.23it/s, loss=0.159, v_num=0, train/loss_simple_step=0.159, train/loss_vlb_step=0.000538, train/loss_step=0.159, global_step=622.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:   6%|▌         | 360/5971 [04:54<1:16:11,  1.23it/s, loss=0.159, v_num=0, train/loss_simple_step=0.159, train/loss_vlb_step=0.000538, train/loss_step=0.159, global_step=622.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 360/5971 [04:54<1:16:11,  1.23it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0363, train/loss_vlb_step=0.000135, train/loss_step=0.0363, global_step=622.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 361/5971 [04:55<1:16:12,  1.23it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0363, train/loss_vlb_step=0.000135, train/loss_step=0.0363, global_step=622.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 361/5971 [04:55<1:16:12,  1.23it/s, loss=0.146, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000365, train/loss_step=0.111, global_step=623.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:   6%|▌         | 362/5971 [04:55<1:16:12,  1.23it/s, loss=0.146, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000365, train/loss_step=0.111, global_step=623.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 362/5971 [04:55<1:16:12,  1.23it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00316, train/loss_vlb_step=1.77e-5, train/loss_step=0.00316, global_step=623.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 363/5971 [04:56<1:16:12,  1.23it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00316, train/loss_vlb_step=1.77e-5, train/loss_step=0.00316, global_step=623.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 363/5971 [04:56<1:16:12,  1.23it/s, loss=0.133, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.00064, train/loss_step=0.184, global_step=623.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:   6%|▌         | 364/5971 [04:59<1:16:36,  1.22it/s, loss=0.133, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.00064, train/loss_step=0.184, global_step=623.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 364/5971 [04:59<1:16:36,  1.22it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00646, train/loss_vlb_step=3.13e-5, train/loss_step=0.00646, global_step=623.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 365/5971 [05:00<1:16:37,  1.22it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00646, train/loss_vlb_step=3.13e-5, train/loss_step=0.00646, global_step=623.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 365/5971 [05:00<1:16:37,  1.22it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0161, train/loss_vlb_step=6.86e-5, train/loss_step=0.0161, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:   6%|▌         | 366/5971 [05:01<1:16:37,  1.22it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0161, train/loss_vlb_step=6.86e-5, train/loss_step=0.0161, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 366/5971 [05:01<1:16:37,  1.22it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0651, train/loss_vlb_step=0.000225, train/loss_step=0.0651, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 367/5971 [05:01<1:16:37,  1.22it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0651, train/loss_vlb_step=0.000225, train/loss_step=0.0651, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 367/5971 [05:01<1:16:37,  1.22it/s, loss=0.101, v_num=0, train/loss_simple_step=0.030, train/loss_vlb_step=0.000115, train/loss_step=0.030, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:   6%|▌         | 368/5971 [05:04<1:16:56,  1.21it/s, loss=0.101, v_num=0, train/loss_simple_step=0.030, train/loss_vlb_step=0.000115, train/loss_step=0.030, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   6%|▌         | 368/5971 [05:04<1:16:56,  1.21it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:10,  2.36it/s]
Epoch 1:   6%|▌         | 370/5971 [05:04<1:16:36,  1.22it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   1%|          | 2/167 [00:00<01:13,  2.23it/s]
Epoch 1:   6%|▌         | 372/5971 [05:04<1:16:18,  1.22it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   3%|▎         | 5/167 [00:01<00:25,  6.42it/s]
Epoch 1:   6%|▋         | 375/5971 [05:05<1:15:40,  1.23it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   5%|▍         | 8/167 [00:01<00:15, 10.32it/s]
Epoch 1:   6%|▋         | 378/5971 [05:05<1:15:04,  1.24it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   7%|▋         | 11/167 [00:01<00:11, 13.70it/s]
Epoch 1:   6%|▋         | 381/5971 [05:05<1:14:28,  1.25it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   8%|▊         | 14/167 [00:01<00:09, 16.08it/s]
Epoch 1:   6%|▋         | 384/5971 [05:05<1:13:53,  1.26it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  10%|█         | 17/167 [00:01<00:08, 18.56it/s]
Epoch 1:   6%|▋         | 387/5971 [05:05<1:13:18,  1.27it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 21.00it/s]
Epoch 1:   7%|▋         | 390/5971 [05:05<1:12:43,  1.28it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 22.69it/s]
Epoch 1:   7%|▋         | 393/5971 [05:05<1:12:09,  1.29it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 23.61it/s]
Epoch 1:   7%|▋         | 396/5971 [05:05<1:11:36,  1.30it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 23.83it/s]
Epoch 1:   7%|▋         | 399/5971 [05:06<1:11:03,  1.31it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  20%|█▉        | 33/167 [00:02<00:05, 26.20it/s]
Epoch 1:   7%|▋         | 403/5971 [05:06<1:10:19,  1.32it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  22%|██▏       | 36/167 [00:02<00:04, 26.43it/s]
Epoch 1:   7%|▋         | 407/5971 [05:06<1:09:37,  1.33it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  23%|██▎       | 39/167 [00:02<00:05, 25.54it/s]

Validating:  25%|██▌       | 42/167 [00:02<00:05, 24.76it/s]
Epoch 1:   7%|▋         | 411/5971 [05:06<1:08:56,  1.34it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 24.47it/s]
Epoch 1:   7%|▋         | 415/5971 [05:06<1:08:15,  1.36it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 24.92it/s]
Epoch 1:   7%|▋         | 419/5971 [05:06<1:07:36,  1.37it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  31%|███       | 51/167 [00:02<00:04, 24.96it/s]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 25.72it/s]
Epoch 1:   7%|▋         | 423/5971 [05:06<1:06:56,  1.38it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  34%|███▍      | 57/167 [00:03<00:04, 26.22it/s]
Epoch 1:   7%|▋         | 427/5971 [05:07<1:06:18,  1.39it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  36%|███▌      | 60/167 [00:03<00:05, 20.95it/s]
Epoch 1:   7%|▋         | 431/5971 [05:07<1:05:41,  1.41it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  38%|███▊      | 63/167 [00:03<00:04, 22.59it/s][A

Validating:  40%|███▉      | 66/167 [00:03<00:04, 23.73it/s][A
Epoch 1:   7%|▋         | 435/5971 [05:07<1:05:04,  1.42it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 24.61it/s][A
Epoch 1:   7%|▋         | 439/5971 [05:07<1:04:28,  1.43it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 24.69it/s][A
Epoch 1:   7%|▋         | 443/5971 [05:07<1:03:52,  1.44it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 25.95it/s][A

Validating:  47%|████▋     | 78/167 [00:03<00:03, 26.24it/s][A
Epoch 1:   7%|▋         | 447/5971 [05:07<1:03:17,  1.45it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  49%|████▊     | 81/167 [00:04<00:03, 25.66it/s][A
Epoch 1:   8%|▊         | 451/5971 [05:08<1:02:42,  1.47it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  50%|█████     | 84/167 [00:04<00:03, 26.51it/s][A
Epoch 1:   8%|▊         | 455/5971 [05:08<1:02:08,  1.48it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  52%|█████▏    | 87/167 [00:04<00:03, 26.63it/s][A

Validating:  54%|█████▍    | 90/167 [00:04<00:02, 27.29it/s][A
Epoch 1:   8%|▊         | 459/5971 [05:08<1:01:35,  1.49it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 27.13it/s][A
Epoch 1:   8%|▊         | 463/5971 [05:08<1:01:02,  1.50it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 28.20it/s][A
Epoch 1:   8%|▊         | 467/5971 [05:08<1:00:30,  1.52it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 27.54it/s][A
Epoch 1:   8%|▊         | 471/5971 [05:08<59:58,  1.53it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 27.54it/s][A

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 28.12it/s][A
Epoch 1:   8%|▊         | 475/5971 [05:08<59:27,  1.54it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  65%|██████▌   | 109/167 [00:05<00:02, 27.41it/s][A
Epoch 1:   8%|▊         | 479/5971 [05:09<58:56,  1.55it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  67%|██████▋   | 112/167 [00:05<00:01, 27.72it/s][A
Epoch 1:   8%|▊         | 483/5971 [05:09<58:26,  1.56it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  69%|██████▉   | 115/167 [00:05<00:01, 27.15it/s][A

Validating:  71%|███████   | 118/167 [00:05<00:01, 24.80it/s][A
Epoch 1:   8%|▊         | 487/5971 [05:09<57:57,  1.58it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 25.34it/s][A
Epoch 1:   8%|▊         | 491/5971 [05:09<57:28,  1.59it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 25.90it/s][A
Epoch 1:   8%|▊         | 495/5971 [05:09<56:59,  1.60it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 26.45it/s][A

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 25.94it/s][A
Epoch 1:   8%|▊         | 499/5971 [05:09<56:31,  1.61it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 25.81it/s][A
Epoch 1:   8%|▊         | 503/5971 [05:10<56:04,  1.63it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  81%|████████▏ | 136/167 [00:06<00:01, 25.65it/s][A
Epoch 1:   8%|▊         | 507/5971 [05:10<55:36,  1.64it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  83%|████████▎ | 139/167 [00:06<00:01, 25.61it/s][A

Validating:  85%|████████▌ | 142/167 [00:06<00:00, 26.06it/s][A
Epoch 1:   9%|▊         | 511/5971 [05:10<55:09,  1.65it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 27.05it/s][A
Epoch 1:   9%|▊         | 515/5971 [05:10<54:43,  1.66it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 27.50it/s][A
Epoch 1:   9%|▊         | 519/5971 [05:10<54:17,  1.67it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 27.72it/s][A
Epoch 1:   9%|▉         | 523/5971 [05:10<53:51,  1.69it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 28.11it/s][A

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 28.32it/s][A
Epoch 1:   9%|▉         | 527/5971 [05:10<53:25,  1.70it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 28.81it/s][A
Epoch 1:   9%|▉         | 531/5971 [05:11<53:00,  1.71it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  99%|█████████▉| 165/167 [00:07<00:00, 28.89it/s][A
Epoch 1:   9%|▉         | 535/5971 [05:11<52:36,  1.72it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   9%|▉         | 536/5971 [05:11<52:33,  1.72it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=624.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Epoch 1:   9%|▉         | 537/5971 [05:12<52:36,  1.72it/s, loss=0.113, v_num=0, train/loss_simple_step=0.509, train/loss_vlb_step=0.0053, train/loss_step=0.509, global_step=625.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:   9%|▉         | 538/5971 [05:13<52:39,  1.72it/s, loss=0.129, v_num=0, train/loss_simple_step=0.310, train/loss_vlb_step=0.00143, train/loss_step=0.310, global_step=625.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   9%|▉         | 539/5971 [05:14<52:41,  1.72it/s, loss=0.129, v_num=0, train/loss_simple_step=0.310, train/loss_vlb_step=0.00143, train/loss_step=0.310, global_step=625.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   9%|▉         | 539/5971 [05:14<52:41,  1.72it/s, loss=0.137, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000635, train/loss_step=0.188, global_step=625.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   9%|▉         | 540/5971 [05:16<52:59,  1.71it/s, loss=0.144, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000566, train/loss_step=0.170, global_step=625.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   9%|▉         | 541/5971 [05:17<53:02,  1.71it/s, loss=0.151, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000457, train/loss_step=0.138, global_step=626.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   9%|▉         | 542/5971 [05:18<53:04,  1.70it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00851, train/loss_vlb_step=4.04e-5, train/loss_step=0.00851, global_step=626.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   9%|▉         | 543/5971 [05:19<53:07,  1.70it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00851, train/loss_vlb_step=4.04e-5, train/loss_step=0.00851, global_step=626.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   9%|▉         | 543/5971 [05:19<53:07,  1.70it/s, loss=0.151, v_num=0, train/loss_simple_step=0.321, train/loss_vlb_step=0.00133, train/loss_step=0.321, global_step=626.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:   9%|▉         | 544/5971 [05:21<53:22,  1.69it/s, loss=0.14, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00165, train/loss_step=0.348, global_step=626.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   9%|▉         | 545/5971 [05:22<53:24,  1.69it/s, loss=0.156, v_num=0, train/loss_simple_step=0.338, train/loss_vlb_step=0.0014, train/loss_step=0.338, global_step=627.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   9%|▉         | 546/5971 [05:23<53:27,  1.69it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00618, train/loss_vlb_step=2.98e-5, train/loss_step=0.00618, global_step=627.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   9%|▉         | 547/5971 [05:24<53:29,  1.69it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00618, train/loss_vlb_step=2.98e-5, train/loss_step=0.00618, global_step=627.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   9%|▉         | 547/5971 [05:24<53:29,  1.69it/s, loss=0.166, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00208, train/loss_step=0.426, global_step=627.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:   9%|▉         | 548/5971 [05:26<53:44,  1.68it/s, loss=0.185, v_num=0, train/loss_simple_step=0.420, train/loss_vlb_step=0.00315, train/loss_step=0.420, global_step=627.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   9%|▉         | 549/5971 [05:27<53:46,  1.68it/s, loss=0.18, v_num=0, train/loss_simple_step=0.00957, train/loss_vlb_step=4.31e-5, train/loss_step=0.00957, global_step=628.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   9%|▉         | 550/5971 [05:28<53:49,  1.68it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0759, train/loss_vlb_step=0.000255, train/loss_step=0.0759, global_step=628.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   9%|▉         | 551/5971 [05:29<53:51,  1.68it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0759, train/loss_vlb_step=0.000255, train/loss_step=0.0759, global_step=628.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   9%|▉         | 551/5971 [05:29<53:51,  1.68it/s, loss=0.175, v_num=0, train/loss_simple_step=0.00353, train/loss_vlb_step=1.9e-5, train/loss_step=0.00353, global_step=628.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   9%|▉         | 552/5971 [05:31<54:07,  1.67it/s, loss=0.19, v_num=0, train/loss_simple_step=0.323, train/loss_vlb_step=0.00179, train/loss_step=0.323, global_step=628.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:   9%|▉         | 553/5971 [05:32<54:10,  1.67it/s, loss=0.229, v_num=0, train/loss_simple_step=0.791, train/loss_vlb_step=0.0807, train/loss_step=0.791, global_step=629.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   9%|▉         | 554/5971 [05:33<54:12,  1.67it/s, loss=0.233, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000453, train/loss_step=0.137, global_step=629.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   9%|▉         | 555/5971 [05:34<54:14,  1.66it/s, loss=0.233, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000453, train/loss_step=0.137, global_step=629.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   9%|▉         | 555/5971 [05:34<54:14,  1.66it/s, loss=0.234, v_num=0, train/loss_simple_step=0.0505, train/loss_vlb_step=0.000174, train/loss_step=0.0505, global_step=629.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   9%|▉         | 556/5971 [05:36<54:28,  1.66it/s, loss=0.229, v_num=0, train/loss_simple_step=0.0055, train/loss_vlb_step=2.82e-5, train/loss_step=0.0055, global_step=629.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   9%|▉         | 557/5971 [05:37<54:31,  1.65it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0133, train/loss_vlb_step=5.84e-5, train/loss_step=0.0133, global_step=630.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   9%|▉         | 558/5971 [05:38<54:33,  1.65it/s, loss=0.199, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000757, train/loss_step=0.209, global_step=630.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   9%|▉         | 559/5971 [05:38<54:35,  1.65it/s, loss=0.199, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000757, train/loss_step=0.209, global_step=630.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   9%|▉         | 559/5971 [05:38<54:35,  1.65it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0235, train/loss_vlb_step=9.72e-5, train/loss_step=0.0235, global_step=630.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   9%|▉         | 560/5971 [05:41<54:53,  1.64it/s, loss=0.191, v_num=0, train/loss_simple_step=0.169, train/loss_vlb_step=0.000583, train/loss_step=0.169, global_step=630.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   9%|▉         | 561/5971 [05:42<54:55,  1.64it/s, loss=0.201, v_num=0, train/loss_simple_step=0.343, train/loss_vlb_step=0.00171, train/loss_step=0.343, global_step=631.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:   9%|▉         | 562/5971 [05:43<54:57,  1.64it/s, loss=0.203, v_num=0, train/loss_simple_step=0.0382, train/loss_vlb_step=0.000144, train/loss_step=0.0382, global_step=631.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   9%|▉         | 563/5971 [05:44<54:59,  1.64it/s, loss=0.203, v_num=0, train/loss_simple_step=0.0382, train/loss_vlb_step=0.000144, train/loss_step=0.0382, global_step=631.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   9%|▉         | 563/5971 [05:44<54:59,  1.64it/s, loss=0.195, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000571, train/loss_step=0.172, global_step=631.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:   9%|▉         | 564/5971 [05:46<55:12,  1.63it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0227, train/loss_vlb_step=9.11e-5, train/loss_step=0.0227, global_step=631.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   9%|▉         | 565/5971 [05:47<55:14,  1.63it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0016, train/loss_vlb_step=9.44e-6, train/loss_step=0.0016, global_step=632.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   9%|▉         | 566/5971 [05:47<55:16,  1.63it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0495, train/loss_vlb_step=0.000181, train/loss_step=0.0495, global_step=632.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   9%|▉         | 567/5971 [05:48<55:18,  1.63it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0495, train/loss_vlb_step=0.000181, train/loss_step=0.0495, global_step=632.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:   9%|▉         | 567/5971 [05:48<55:18,  1.63it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.56e-5, train/loss_step=0.00481, global_step=632.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|▉         | 568/5971 [05:50<55:32,  1.62it/s, loss=0.133, v_num=0, train/loss_simple_step=0.226, train/loss_vlb_step=0.000793, train/loss_step=0.226, global_step=632.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  10%|▉         | 569/5971 [05:51<55:34,  1.62it/s, loss=0.154, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00259, train/loss_step=0.427, global_step=633.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  10%|▉         | 570/5971 [05:52<55:36,  1.62it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0842, train/loss_vlb_step=0.000277, train/loss_step=0.0842, global_step=633.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|▉         | 571/5971 [05:53<55:37,  1.62it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0842, train/loss_vlb_step=0.000277, train/loss_step=0.0842, global_step=633.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|▉         | 571/5971 [05:53<55:37,  1.62it/s, loss=0.166, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.000908, train/loss_step=0.236, global_step=633.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  10%|▉         | 572/5971 [05:55<55:51,  1.61it/s, loss=0.169, v_num=0, train/loss_simple_step=0.377, train/loss_vlb_step=0.00246, train/loss_step=0.377, global_step=633.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  10%|▉         | 573/5971 [05:56<55:53,  1.61it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0528, train/loss_vlb_step=0.000184, train/loss_step=0.0528, global_step=634.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|▉         | 574/5971 [05:57<55:54,  1.61it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0204, train/loss_vlb_step=8.65e-5, train/loss_step=0.0204, global_step=634.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  10%|▉         | 575/5971 [05:58<55:56,  1.61it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0204, train/loss_vlb_step=8.65e-5, train/loss_step=0.0204, global_step=634.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|▉         | 575/5971 [05:58<55:56,  1.61it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0624, train/loss_vlb_step=0.000221, train/loss_step=0.0624, global_step=634.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|▉         | 576/5971 [06:00<56:10,  1.60it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0696, train/loss_vlb_step=0.000234, train/loss_step=0.0696, global_step=634.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  10%|▉         | 577/5971 [06:01<56:11,  1.60it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0372, train/loss_vlb_step=0.000136, train/loss_step=0.0372, global_step=635.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|▉         | 578/5971 [06:02<56:13,  1.60it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00637, train/loss_vlb_step=3.28e-5, train/loss_step=0.00637, global_step=635.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|▉         | 579/5971 [06:03<56:15,  1.60it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00637, train/loss_vlb_step=3.28e-5, train/loss_step=0.00637, global_step=635.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|▉         | 579/5971 [06:03<56:15,  1.60it/s, loss=0.121, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=5e-5, train/loss_step=0.011, global_step=635.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]       
Epoch 1:  10%|▉         | 580/5971 [06:05<56:28,  1.59it/s, loss=0.117, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000341, train/loss_step=0.103, global_step=635.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|▉         | 581/5971 [06:06<56:30,  1.59it/s, loss=0.1, v_num=0, train/loss_simple_step=0.00605, train/loss_vlb_step=2.92e-5, train/loss_step=0.00605, global_step=636.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|▉         | 582/5971 [06:06<56:32,  1.59it/s, loss=0.0996, v_num=0, train/loss_simple_step=0.0206, train/loss_vlb_step=8.4e-5, train/loss_step=0.0206, global_step=636.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|▉         | 583/5971 [06:07<56:33,  1.59it/s, loss=0.0996, v_num=0, train/loss_simple_step=0.0206, train/loss_vlb_step=8.4e-5, train/loss_step=0.0206, global_step=636.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|▉         | 583/5971 [06:07<56:34,  1.59it/s, loss=0.109, v_num=0, train/loss_simple_step=0.360, train/loss_vlb_step=0.00188, train/loss_step=0.360, global_step=636.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  10%|▉         | 584/5971 [06:10<56:47,  1.58it/s, loss=0.129, v_num=0, train/loss_simple_step=0.422, train/loss_vlb_step=0.00314, train/loss_step=0.422, global_step=636.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|▉         | 585/5971 [06:10<56:49,  1.58it/s, loss=0.135, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000401, train/loss_step=0.122, global_step=637.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|▉         | 586/5971 [06:11<56:50,  1.58it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00237, train/loss_vlb_step=1.33e-5, train/loss_step=0.00237, global_step=637.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|▉         | 587/5971 [06:12<56:52,  1.58it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00237, train/loss_vlb_step=1.33e-5, train/loss_step=0.00237, global_step=637.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|▉         | 587/5971 [06:12<56:52,  1.58it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0342, train/loss_vlb_step=0.000128, train/loss_step=0.0342, global_step=637.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  10%|▉         | 588/5971 [06:15<57:08,  1.57it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0838, train/loss_vlb_step=0.000276, train/loss_step=0.0838, global_step=637.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|▉         | 589/5971 [06:16<57:10,  1.57it/s, loss=0.118, v_num=0, train/loss_simple_step=0.246, train/loss_vlb_step=0.000954, train/loss_step=0.246, global_step=638.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  10%|▉         | 590/5971 [06:16<57:12,  1.57it/s, loss=0.124, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000756, train/loss_step=0.216, global_step=638.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|▉         | 591/5971 [06:17<57:13,  1.57it/s, loss=0.124, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000756, train/loss_step=0.216, global_step=638.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|▉         | 591/5971 [06:17<57:13,  1.57it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0902, train/loss_vlb_step=0.000296, train/loss_step=0.0902, global_step=638.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|▉         | 592/5971 [06:20<57:27,  1.56it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000145, train/loss_step=0.0406, global_step=638.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  10%|▉         | 593/5971 [06:20<57:28,  1.56it/s, loss=0.119, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00279, train/loss_step=0.427, global_step=639.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  10%|▉         | 594/5971 [06:21<57:30,  1.56it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.71e-5, train/loss_step=0.0108, global_step=639.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|▉         | 595/5971 [06:22<57:31,  1.56it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.71e-5, train/loss_step=0.0108, global_step=639.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|▉         | 595/5971 [06:22<57:31,  1.56it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00287, train/loss_vlb_step=1.63e-5, train/loss_step=0.00287, global_step=639.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|▉         | 596/5971 [06:25<57:48,  1.55it/s, loss=0.14, v_num=0, train/loss_simple_step=0.555, train/loss_vlb_step=0.0112, train/loss_step=0.555, global_step=639.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]      
Epoch 1:  10%|▉         | 597/5971 [06:26<57:50,  1.55it/s, loss=0.169, v_num=0, train/loss_simple_step=0.620, train/loss_vlb_step=0.0145, train/loss_step=0.620, global_step=640.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|█         | 598/5971 [06:27<57:51,  1.55it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.16e-5, train/loss_step=0.0122, global_step=640.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|█         | 599/5971 [06:27<57:53,  1.55it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.16e-5, train/loss_step=0.0122, global_step=640.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|█         | 599/5971 [06:27<57:53,  1.55it/s, loss=0.177, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000583, train/loss_step=0.174, global_step=640.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  10%|█         | 600/5971 [06:30<58:05,  1.54it/s, loss=0.197, v_num=0, train/loss_simple_step=0.486, train/loss_vlb_step=0.00491, train/loss_step=0.486, global_step=640.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  10%|█         | 601/5971 [06:30<58:07,  1.54it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0134, train/loss_vlb_step=6.01e-5, train/loss_step=0.0134, global_step=641.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|█         | 602/5971 [06:31<58:08,  1.54it/s, loss=0.219, v_num=0, train/loss_simple_step=0.456, train/loss_vlb_step=0.00342, train/loss_step=0.456, global_step=641.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  10%|█         | 603/5971 [06:32<58:09,  1.54it/s, loss=0.219, v_num=0, train/loss_simple_step=0.456, train/loss_vlb_step=0.00342, train/loss_step=0.456, global_step=641.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|█         | 603/5971 [06:32<58:09,  1.54it/s, loss=0.212, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.000843, train/loss_step=0.230, global_step=641.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|█         | 604/5971 [06:34<58:22,  1.53it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=6.09e-5, train/loss_step=0.0144, global_step=641.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|█         | 605/5971 [06:35<58:23,  1.53it/s, loss=0.205, v_num=0, train/loss_simple_step=0.386, train/loss_vlb_step=0.00237, train/loss_step=0.386, global_step=642.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  10%|█         | 606/5971 [06:36<58:25,  1.53it/s, loss=0.205, v_num=0, train/loss_simple_step=0.00826, train/loss_vlb_step=3.9e-5, train/loss_step=0.00826, global_step=642.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|█         | 607/5971 [06:37<58:26,  1.53it/s, loss=0.205, v_num=0, train/loss_simple_step=0.00826, train/loss_vlb_step=3.9e-5, train/loss_step=0.00826, global_step=642.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|█         | 607/5971 [06:37<58:26,  1.53it/s, loss=0.214, v_num=0, train/loss_simple_step=0.199, train/loss_vlb_step=0.00067, train/loss_step=0.199, global_step=642.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  10%|█         | 608/5971 [06:40<58:44,  1.52it/s, loss=0.212, v_num=0, train/loss_simple_step=0.0588, train/loss_vlb_step=0.000206, train/loss_step=0.0588, global_step=642.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|█         | 609/5971 [06:41<58:45,  1.52it/s, loss=0.203, v_num=0, train/loss_simple_step=0.049, train/loss_vlb_step=0.000171, train/loss_step=0.049, global_step=643.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  10%|█         | 610/5971 [06:41<58:46,  1.52it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0893, train/loss_vlb_step=0.000293, train/loss_step=0.0893, global_step=643.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|█         | 611/5971 [06:42<58:47,  1.52it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0893, train/loss_vlb_step=0.000293, train/loss_step=0.0893, global_step=643.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|█         | 611/5971 [06:42<58:48,  1.52it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00453, train/loss_vlb_step=2.38e-5, train/loss_step=0.00453, global_step=643.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|█         | 612/5971 [06:45<59:04,  1.51it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0167, train/loss_vlb_step=7.13e-5, train/loss_step=0.0167, global_step=643.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  10%|█         | 613/5971 [06:46<59:05,  1.51it/s, loss=0.186, v_num=0, train/loss_simple_step=0.328, train/loss_vlb_step=0.00184, train/loss_step=0.328, global_step=644.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  10%|█         | 614/5971 [06:47<59:06,  1.51it/s, loss=0.199, v_num=0, train/loss_simple_step=0.270, train/loss_vlb_step=0.0011, train/loss_step=0.270, global_step=644.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  10%|█         | 615/5971 [06:48<59:07,  1.51it/s, loss=0.199, v_num=0, train/loss_simple_step=0.270, train/loss_vlb_step=0.0011, train/loss_step=0.270, global_step=644.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|█         | 615/5971 [06:48<59:07,  1.51it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0447, train/loss_vlb_step=0.000167, train/loss_step=0.0447, global_step=644.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|█         | 616/5971 [06:50<59:22,  1.50it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.5e-5, train/loss_step=0.0121, global_step=644.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  10%|█         | 617/5971 [06:51<59:24,  1.50it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0324, train/loss_vlb_step=0.000124, train/loss_step=0.0324, global_step=645.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|█         | 618/5971 [06:52<59:25,  1.50it/s, loss=0.15, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000395, train/loss_step=0.120, global_step=645.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  10%|█         | 619/5971 [06:53<59:26,  1.50it/s, loss=0.15, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000395, train/loss_step=0.120, global_step=645.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|█         | 619/5971 [06:53<59:26,  1.50it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0738, train/loss_vlb_step=0.00025, train/loss_step=0.0738, global_step=645.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|█         | 620/5971 [06:55<59:38,  1.50it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0717, train/loss_vlb_step=0.000237, train/loss_step=0.0717, global_step=645.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|█         | 621/5971 [06:56<59:39,  1.49it/s, loss=0.13, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000446, train/loss_step=0.136, global_step=646.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  10%|█         | 622/5971 [06:57<59:40,  1.49it/s, loss=0.148, v_num=0, train/loss_simple_step=0.809, train/loss_vlb_step=0.0825, train/loss_step=0.809, global_step=646.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  10%|█         | 623/5971 [06:57<59:42,  1.49it/s, loss=0.148, v_num=0, train/loss_simple_step=0.809, train/loss_vlb_step=0.0825, train/loss_step=0.809, global_step=646.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|█         | 623/5971 [06:57<59:42,  1.49it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0857, train/loss_vlb_step=0.000284, train/loss_step=0.0857, global_step=646.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|█         | 624/5971 [07:00<59:54,  1.49it/s, loss=0.165, v_num=0, train/loss_simple_step=0.498, train/loss_vlb_step=0.00447, train/loss_step=0.498, global_step=646.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  10%|█         | 625/5971 [07:01<59:55,  1.49it/s, loss=0.164, v_num=0, train/loss_simple_step=0.378, train/loss_vlb_step=0.00197, train/loss_step=0.378, global_step=647.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  10%|█         | 626/5971 [07:01<59:56,  1.49it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00453, train/loss_vlb_step=2.38e-5, train/loss_step=0.00453, global_step=647.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  11%|█         | 627/5971 [07:02<59:57,  1.49it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00453, train/loss_vlb_step=2.38e-5, train/loss_step=0.00453, global_step=647.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  11%|█         | 627/5971 [07:02<59:57,  1.49it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00251, train/loss_vlb_step=1.4e-5, train/loss_step=0.00251, global_step=647.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  11%|█         | 628/5971 [07:04<1:00:09,  1.48it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00311, train/loss_vlb_step=1.73e-5, train/loss_step=0.00311, global_step=647.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  11%|█         | 629/5971 [07:05<1:00:10,  1.48it/s, loss=0.152, v_num=0, train/loss_simple_step=0.067, train/loss_vlb_step=0.000223, train/loss_step=0.067, global_step=648.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  11%|█         | 630/5971 [07:06<1:00:11,  1.48it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0026, train/loss_vlb_step=1.42e-5, train/loss_step=0.0026, global_step=648.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  11%|█         | 631/5971 [07:07<1:00:12,  1.48it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0026, train/loss_vlb_step=1.42e-5, train/loss_step=0.0026, global_step=648.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  11%|█         | 631/5971 [07:07<1:00:12,  1.48it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0224, train/loss_vlb_step=9.28e-5, train/loss_step=0.0224, global_step=648.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  11%|█         | 632/5971 [07:09<1:00:24,  1.47it/s, loss=0.17, v_num=0, train/loss_simple_step=0.429, train/loss_vlb_step=0.00205, train/loss_step=0.429, global_step=648.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  11%|█         | 633/5971 [07:10<1:00:26,  1.47it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00322, train/loss_vlb_step=1.69e-5, train/loss_step=0.00322, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  11%|█         | 634/5971 [07:11<1:00:27,  1.47it/s, loss=0.146, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000411, train/loss_step=0.125, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  11%|█         | 635/5971 [07:12<1:00:28,  1.47it/s, loss=0.146, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000411, train/loss_step=0.125, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  11%|█         | 635/5971 [07:12<1:00:28,  1.47it/s, loss=0.154, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000728, train/loss_step=0.203, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  11%|█         | 636/5971 [07:14<1:00:41,  1.46it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:07,  2.45it/s][A

Validating:   1%|          | 2/167 [00:00<00:53,  3.06it/s][A
Epoch 1:  11%|█         | 639/5971 [07:15<1:00:28,  1.47it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   3%|▎         | 5/167 [00:00<00:19,  8.29it/s][A
Epoch 1:  11%|█         | 643/5971 [07:15<1:00:04,  1.48it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   5%|▍         | 8/167 [00:00<00:12, 12.66it/s][A
Epoch 1:  11%|█         | 647/5971 [07:15<59:41,  1.49it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  

Validating:   7%|▋         | 11/167 [00:01<00:09, 16.05it/s][A

Validating:   8%|▊         | 14/167 [00:01<00:08, 18.65it/s][A
Epoch 1:  11%|█         | 651/5971 [07:16<59:17,  1.50it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  10%|█         | 17/167 [00:01<00:07, 20.15it/s][A
Epoch 1:  11%|█         | 655/5971 [07:16<58:54,  1.50it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 21.15it/s][A
Epoch 1:  11%|█         | 659/5971 [07:16<58:31,  1.51it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 22.68it/s][A

Validating:  16%|█▌        | 26/167 [00:01<00:05, 23.94it/s][A
Epoch 1:  11%|█         | 663/5971 [07:16<58:09,  1.52it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  17%|█▋        | 29/167 [00:01<00:06, 22.67it/s][A
Epoch 1:  11%|█         | 667/5971 [07:16<57:47,  1.53it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 24.40it/s][A
Epoch 1:  11%|█         | 671/5971 [07:16<57:25,  1.54it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  21%|██        | 35/167 [00:01<00:05, 24.10it/s][A

Validating:  23%|██▎       | 38/167 [00:02<00:05, 25.62it/s][A
Epoch 1:  11%|█▏        | 675/5971 [07:16<57:03,  1.55it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 25.98it/s][A
Epoch 1:  11%|█▏        | 679/5971 [07:17<56:41,  1.56it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 25.02it/s][A
Epoch 1:  11%|█▏        | 683/5971 [07:17<56:20,  1.56it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 25.81it/s][A

Validating:  30%|██▉       | 50/167 [00:02<00:04, 26.85it/s][A
Epoch 1:  12%|█▏        | 687/5971 [07:17<55:59,  1.57it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 27.16it/s][A
Epoch 1:  12%|█▏        | 691/5971 [07:17<55:38,  1.58it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 26.99it/s][A
Epoch 1:  12%|█▏        | 695/5971 [07:17<55:17,  1.59it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  36%|███▌      | 60/167 [00:02<00:03, 27.53it/s][A
Epoch 1:  12%|█▏        | 699/5971 [07:17<54:57,  1.60it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  38%|███▊      | 63/167 [00:03<00:03, 27.42it/s][A

Validating:  40%|███▉      | 66/167 [00:03<00:03, 26.90it/s][A
Epoch 1:  12%|█▏        | 703/5971 [07:18<54:37,  1.61it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 26.42it/s][A
Epoch 1:  12%|█▏        | 707/5971 [07:18<54:17,  1.62it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 25.29it/s][A
Epoch 1:  12%|█▏        | 711/5971 [07:18<53:58,  1.62it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 25.21it/s][A

Validating:  47%|████▋     | 78/167 [00:03<00:03, 22.79it/s][A
Epoch 1:  12%|█▏        | 715/5971 [07:18<53:39,  1.63it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 23.42it/s][A
Epoch 1:  12%|█▏        | 719/5971 [07:18<53:20,  1.64it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  50%|█████     | 84/167 [00:03<00:03, 23.81it/s][A
Epoch 1:  12%|█▏        | 723/5971 [07:18<53:01,  1.65it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  52%|█████▏    | 87/167 [00:04<00:03, 25.20it/s][A

Validating:  54%|█████▍    | 90/167 [00:04<00:02, 25.82it/s][A
Epoch 1:  12%|█▏        | 727/5971 [07:18<52:42,  1.66it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 26.15it/s][A
Epoch 1:  12%|█▏        | 731/5971 [07:19<52:23,  1.67it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 26.60it/s][A
Epoch 1:  12%|█▏        | 735/5971 [07:19<52:05,  1.68it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 26.92it/s][A

Validating:  61%|██████    | 102/167 [00:04<00:02, 26.99it/s][A
Epoch 1:  12%|█▏        | 739/5971 [07:19<51:46,  1.68it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 24.37it/s][A
Epoch 1:  12%|█▏        | 743/5971 [07:19<51:29,  1.69it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 25.21it/s][A
Epoch 1:  13%|█▎        | 747/5971 [07:19<51:11,  1.70it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 25.87it/s][A

Validating:  68%|██████▊   | 114/167 [00:05<00:02, 26.28it/s][A
Epoch 1:  13%|█▎        | 751/5971 [07:19<50:53,  1.71it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  70%|███████   | 117/167 [00:05<00:01, 26.99it/s][A
Epoch 1:  13%|█▎        | 755/5971 [07:20<50:36,  1.72it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 26.46it/s][A
Epoch 1:  13%|█▎        | 759/5971 [07:20<50:18,  1.73it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 26.44it/s][A

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 27.07it/s][A
Epoch 1:  13%|█▎        | 763/5971 [07:20<50:01,  1.73it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 26.69it/s][A
Epoch 1:  13%|█▎        | 767/5971 [07:20<49:44,  1.74it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 26.04it/s][A
Epoch 1:  13%|█▎        | 771/5971 [07:20<49:28,  1.75it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  81%|████████  | 135/167 [00:05<00:01, 26.59it/s][A

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 26.22it/s][A
Epoch 1:  13%|█▎        | 775/5971 [07:20<49:11,  1.76it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  84%|████████▍ | 141/167 [00:06<00:00, 26.26it/s][A
Epoch 1:  13%|█▎        | 779/5971 [07:20<48:55,  1.77it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 26.49it/s][A
Epoch 1:  13%|█▎        | 783/5971 [07:21<48:38,  1.78it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 26.98it/s][A

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 26.24it/s][A
Epoch 1:  13%|█▎        | 787/5971 [07:21<48:22,  1.79it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 25.39it/s][A
Epoch 1:  13%|█▎        | 791/5971 [07:21<48:07,  1.79it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 26.30it/s][A
Epoch 1:  13%|█▎        | 795/5971 [07:21<47:51,  1.80it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 25.21it/s][A

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 25.16it/s][A
Epoch 1:  13%|█▎        | 799/5971 [07:21<47:35,  1.81it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  99%|█████████▉| 165/167 [00:07<00:00, 22.86it/s][A
Epoch 1:  13%|█▎        | 803/5971 [07:21<47:20,  1.82it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  13%|█▎        | 804/5971 [07:22<47:19,  1.82it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.34e-5, train/loss_step=0.0202, global_step=649.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Epoch 1:  13%|█▎        | 805/5971 [07:23<47:22,  1.82it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00306, train/loss_vlb_step=1.67e-5, train/loss_step=0.00306, global_step=650.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  13%|█▎        | 806/5971 [07:24<47:24,  1.82it/s, loss=0.158, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.001, train/loss_step=0.230, global_step=650.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]      
Epoch 1:  14%|█▎        | 807/5971 [07:25<47:25,  1.81it/s, loss=0.158, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.001, train/loss_step=0.230, global_step=650.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▎        | 807/5971 [07:25<47:25,  1.81it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0875, train/loss_vlb_step=0.000292, train/loss_step=0.0875, global_step=650.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▎        | 808/5971 [07:27<47:34,  1.81it/s, loss=0.184, v_num=0, train/loss_simple_step=0.575, train/loss_vlb_step=0.00368, train/loss_step=0.575, global_step=650.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  14%|█▎        | 809/5971 [07:28<47:36,  1.81it/s, loss=0.185, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000461, train/loss_step=0.140, global_step=651.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▎        | 810/5971 [07:29<47:37,  1.81it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0234, train/loss_vlb_step=8.96e-5, train/loss_step=0.0234, global_step=651.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▎        | 811/5971 [07:29<47:39,  1.80it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0234, train/loss_vlb_step=8.96e-5, train/loss_step=0.0234, global_step=651.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▎        | 811/5971 [07:29<47:39,  1.80it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0167, train/loss_vlb_step=7.15e-5, train/loss_step=0.0167, global_step=651.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▎        | 812/5971 [07:32<47:50,  1.80it/s, loss=0.15, v_num=0, train/loss_simple_step=0.667, train/loss_vlb_step=0.0139, train/loss_step=0.667, global_step=651.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  14%|█▎        | 813/5971 [07:33<47:52,  1.80it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0307, train/loss_vlb_step=0.00012, train/loss_step=0.0307, global_step=652.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▎        | 814/5971 [07:34<47:53,  1.79it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0197, train/loss_vlb_step=8.26e-5, train/loss_step=0.0197, global_step=652.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▎        | 815/5971 [07:35<47:55,  1.79it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0197, train/loss_vlb_step=8.26e-5, train/loss_step=0.0197, global_step=652.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▎        | 815/5971 [07:35<47:55,  1.79it/s, loss=0.14, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000446, train/loss_step=0.134, global_step=652.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  14%|█▎        | 816/5971 [07:37<48:04,  1.79it/s, loss=0.147, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000475, train/loss_step=0.145, global_step=652.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▎        | 817/5971 [07:38<48:06,  1.79it/s, loss=0.173, v_num=0, train/loss_simple_step=0.587, train/loss_vlb_step=0.00477, train/loss_step=0.587, global_step=653.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  14%|█▎        | 818/5971 [07:38<48:07,  1.78it/s, loss=0.181, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000554, train/loss_step=0.160, global_step=653.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▎        | 819/5971 [07:39<48:09,  1.78it/s, loss=0.181, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000554, train/loss_step=0.160, global_step=653.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▎        | 819/5971 [07:39<48:09,  1.78it/s, loss=0.204, v_num=0, train/loss_simple_step=0.478, train/loss_vlb_step=0.00457, train/loss_step=0.478, global_step=653.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  14%|█▎        | 820/5971 [07:41<48:18,  1.78it/s, loss=0.204, v_num=0, train/loss_simple_step=0.434, train/loss_vlb_step=0.00268, train/loss_step=0.434, global_step=653.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▎        | 821/5971 [07:42<48:19,  1.78it/s, loss=0.207, v_num=0, train/loss_simple_step=0.0669, train/loss_vlb_step=0.000233, train/loss_step=0.0669, global_step=654.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 822/5971 [07:43<48:21,  1.77it/s, loss=0.201, v_num=0, train/loss_simple_step=0.00351, train/loss_vlb_step=1.91e-5, train/loss_step=0.00351, global_step=654.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 823/5971 [07:44<48:22,  1.77it/s, loss=0.201, v_num=0, train/loss_simple_step=0.00351, train/loss_vlb_step=1.91e-5, train/loss_step=0.00351, global_step=654.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 823/5971 [07:44<48:22,  1.77it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0376, train/loss_vlb_step=0.000137, train/loss_step=0.0376, global_step=654.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  14%|█▍        | 824/5971 [07:46<48:31,  1.77it/s, loss=0.22, v_num=0, train/loss_simple_step=0.553, train/loss_vlb_step=0.00559, train/loss_step=0.553, global_step=654.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  14%|█▍        | 825/5971 [07:47<48:33,  1.77it/s, loss=0.221, v_num=0, train/loss_simple_step=0.0246, train/loss_vlb_step=9.93e-5, train/loss_step=0.0246, global_step=655.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 826/5971 [07:48<48:34,  1.77it/s, loss=0.211, v_num=0, train/loss_simple_step=0.040, train/loss_vlb_step=0.000145, train/loss_step=0.040, global_step=655.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  14%|█▍        | 827/5971 [07:49<48:35,  1.76it/s, loss=0.211, v_num=0, train/loss_simple_step=0.040, train/loss_vlb_step=0.000145, train/loss_step=0.040, global_step=655.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 827/5971 [07:49<48:35,  1.76it/s, loss=0.222, v_num=0, train/loss_simple_step=0.295, train/loss_vlb_step=0.00132, train/loss_step=0.295, global_step=655.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  14%|█▍        | 828/5971 [07:51<48:45,  1.76it/s, loss=0.21, v_num=0, train/loss_simple_step=0.337, train/loss_vlb_step=0.0014, train/loss_step=0.337, global_step=655.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  14%|█▍        | 829/5971 [07:52<48:46,  1.76it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0378, train/loss_vlb_step=0.000134, train/loss_step=0.0378, global_step=656.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 830/5971 [07:53<48:48,  1.76it/s, loss=0.203, v_num=0, train/loss_simple_step=0.00248, train/loss_vlb_step=1.45e-5, train/loss_step=0.00248, global_step=656.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 831/5971 [07:54<48:49,  1.75it/s, loss=0.203, v_num=0, train/loss_simple_step=0.00248, train/loss_vlb_step=1.45e-5, train/loss_step=0.00248, global_step=656.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 831/5971 [07:54<48:49,  1.75it/s, loss=0.211, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000542, train/loss_step=0.162, global_step=656.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  14%|█▍        | 832/5971 [07:56<48:59,  1.75it/s, loss=0.205, v_num=0, train/loss_simple_step=0.543, train/loss_vlb_step=0.00643, train/loss_step=0.543, global_step=656.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  14%|█▍        | 833/5971 [07:57<49:01,  1.75it/s, loss=0.203, v_num=0, train/loss_simple_step=0.00222, train/loss_vlb_step=1.31e-5, train/loss_step=0.00222, global_step=657.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 834/5971 [07:58<49:02,  1.75it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0767, train/loss_vlb_step=0.000264, train/loss_step=0.0767, global_step=657.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  14%|█▍        | 835/5971 [07:59<49:04,  1.74it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0767, train/loss_vlb_step=0.000264, train/loss_step=0.0767, global_step=657.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 835/5971 [07:59<49:04,  1.74it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0223, train/loss_vlb_step=8.91e-5, train/loss_step=0.0223, global_step=657.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  14%|█▍        | 836/5971 [08:01<49:12,  1.74it/s, loss=0.193, v_num=0, train/loss_simple_step=0.00326, train/loss_vlb_step=1.77e-5, train/loss_step=0.00326, global_step=657.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 837/5971 [08:02<49:14,  1.74it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00568, train/loss_vlb_step=2.95e-5, train/loss_step=0.00568, global_step=658.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 838/5971 [08:03<49:15,  1.74it/s, loss=0.183, v_num=0, train/loss_simple_step=0.537, train/loss_vlb_step=0.00603, train/loss_step=0.537, global_step=658.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  14%|█▍        | 839/5971 [08:03<49:16,  1.74it/s, loss=0.183, v_num=0, train/loss_simple_step=0.537, train/loss_vlb_step=0.00603, train/loss_step=0.537, global_step=658.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 839/5971 [08:03<49:16,  1.74it/s, loss=0.167, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000584, train/loss_step=0.161, global_step=658.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 840/5971 [08:06<49:27,  1.73it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0628, train/loss_vlb_step=0.000216, train/loss_step=0.0628, global_step=658.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 841/5971 [08:07<49:28,  1.73it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00749, train/loss_vlb_step=3.7e-5, train/loss_step=0.00749, global_step=659.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 842/5971 [08:08<49:29,  1.73it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0413, train/loss_vlb_step=0.000147, train/loss_step=0.0413, global_step=659.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 843/5971 [08:08<49:30,  1.73it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0413, train/loss_vlb_step=0.000147, train/loss_step=0.0413, global_step=659.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 843/5971 [08:08<49:30,  1.73it/s, loss=0.152, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000401, train/loss_step=0.119, global_step=659.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  14%|█▍        | 844/5971 [08:11<49:39,  1.72it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.25e-5, train/loss_step=0.0125, global_step=659.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 845/5971 [08:11<49:40,  1.72it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0916, train/loss_vlb_step=0.000301, train/loss_step=0.0916, global_step=660.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 846/5971 [08:12<49:42,  1.72it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0207, train/loss_vlb_step=8.25e-5, train/loss_step=0.0207, global_step=660.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  14%|█▍        | 847/5971 [08:13<49:43,  1.72it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0207, train/loss_vlb_step=8.25e-5, train/loss_step=0.0207, global_step=660.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 847/5971 [08:13<49:43,  1.72it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0393, train/loss_vlb_step=0.000145, train/loss_step=0.0393, global_step=660.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 848/5971 [08:15<49:51,  1.71it/s, loss=0.11, v_num=0, train/loss_simple_step=0.261, train/loss_vlb_step=0.000984, train/loss_step=0.261, global_step=660.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  14%|█▍        | 849/5971 [08:16<49:53,  1.71it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00273, train/loss_vlb_step=1.51e-5, train/loss_step=0.00273, global_step=661.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 850/5971 [08:17<49:54,  1.71it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00818, train/loss_vlb_step=3.9e-5, train/loss_step=0.00818, global_step=661.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  14%|█▍        | 851/5971 [08:18<49:55,  1.71it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00818, train/loss_vlb_step=3.9e-5, train/loss_step=0.00818, global_step=661.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 851/5971 [08:18<49:55,  1.71it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0683, train/loss_vlb_step=0.000237, train/loss_step=0.0683, global_step=661.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 852/5971 [08:20<50:05,  1.70it/s, loss=0.0785, v_num=0, train/loss_simple_step=0.0263, train/loss_vlb_step=0.0001, train/loss_step=0.0263, global_step=661.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  14%|█▍        | 853/5971 [08:21<50:07,  1.70it/s, loss=0.079, v_num=0, train/loss_simple_step=0.0133, train/loss_vlb_step=5.76e-5, train/loss_step=0.0133, global_step=662.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 854/5971 [08:22<50:08,  1.70it/s, loss=0.0792, v_num=0, train/loss_simple_step=0.0806, train/loss_vlb_step=0.000267, train/loss_step=0.0806, global_step=662.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 855/5971 [08:23<50:09,  1.70it/s, loss=0.0792, v_num=0, train/loss_simple_step=0.0806, train/loss_vlb_step=0.000267, train/loss_step=0.0806, global_step=662.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 855/5971 [08:23<50:09,  1.70it/s, loss=0.0792, v_num=0, train/loss_simple_step=0.0217, train/loss_vlb_step=8.76e-5, train/loss_step=0.0217, global_step=662.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  14%|█▍        | 856/5971 [08:25<50:17,  1.69it/s, loss=0.114, v_num=0, train/loss_simple_step=0.692, train/loss_vlb_step=0.0278, train/loss_step=0.692, global_step=662.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  14%|█▍        | 857/5971 [08:26<50:19,  1.69it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0189, train/loss_vlb_step=7.98e-5, train/loss_step=0.0189, global_step=663.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 858/5971 [08:27<50:21,  1.69it/s, loss=0.0907, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000224, train/loss_step=0.0655, global_step=663.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 859/5971 [08:28<50:22,  1.69it/s, loss=0.0907, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000224, train/loss_step=0.0655, global_step=663.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 859/5971 [08:28<50:22,  1.69it/s, loss=0.084, v_num=0, train/loss_simple_step=0.0265, train/loss_vlb_step=0.000103, train/loss_step=0.0265, global_step=663.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  14%|█▍        | 860/5971 [08:30<50:32,  1.69it/s, loss=0.0819, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.45e-5, train/loss_step=0.021, global_step=663.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  14%|█▍        | 861/5971 [08:31<50:35,  1.68it/s, loss=0.082, v_num=0, train/loss_simple_step=0.00912, train/loss_vlb_step=4.22e-5, train/loss_step=0.00912, global_step=664.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 862/5971 [08:32<50:36,  1.68it/s, loss=0.0804, v_num=0, train/loss_simple_step=0.00961, train/loss_vlb_step=4.34e-5, train/loss_step=0.00961, global_step=664.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 863/5971 [08:33<50:37,  1.68it/s, loss=0.0804, v_num=0, train/loss_simple_step=0.00961, train/loss_vlb_step=4.34e-5, train/loss_step=0.00961, global_step=664.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 863/5971 [08:33<50:37,  1.68it/s, loss=0.0746, v_num=0, train/loss_simple_step=0.00327, train/loss_vlb_step=1.81e-5, train/loss_step=0.00327, global_step=664.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  14%|█▍        | 864/5971 [08:35<50:45,  1.68it/s, loss=0.0791, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000338, train/loss_step=0.102, global_step=664.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  14%|█▍        | 865/5971 [08:36<50:46,  1.68it/s, loss=0.0754, v_num=0, train/loss_simple_step=0.0176, train/loss_vlb_step=7.4e-5, train/loss_step=0.0176, global_step=665.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▍        | 866/5971 [08:37<50:47,  1.68it/s, loss=0.0747, v_num=0, train/loss_simple_step=0.00584, train/loss_vlb_step=3.04e-5, train/loss_step=0.00584, global_step=665.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▍        | 867/5971 [08:38<50:48,  1.67it/s, loss=0.0747, v_num=0, train/loss_simple_step=0.00584, train/loss_vlb_step=3.04e-5, train/loss_step=0.00584, global_step=665.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▍        | 867/5971 [08:38<50:48,  1.67it/s, loss=0.0731, v_num=0, train/loss_simple_step=0.00821, train/loss_vlb_step=3.82e-5, train/loss_step=0.00821, global_step=665.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▍        | 868/5971 [08:40<50:56,  1.67it/s, loss=0.0611, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.71e-5, train/loss_step=0.021, global_step=665.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  15%|█▍        | 869/5971 [08:41<50:58,  1.67it/s, loss=0.0672, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000415, train/loss_step=0.126, global_step=666.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▍        | 870/5971 [08:42<50:59,  1.67it/s, loss=0.076, v_num=0, train/loss_simple_step=0.183, train/loss_vlb_step=0.000627, train/loss_step=0.183, global_step=666.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  15%|█▍        | 871/5971 [08:43<51:00,  1.67it/s, loss=0.076, v_num=0, train/loss_simple_step=0.183, train/loss_vlb_step=0.000627, train/loss_step=0.183, global_step=666.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▍        | 871/5971 [08:43<51:00,  1.67it/s, loss=0.0995, v_num=0, train/loss_simple_step=0.540, train/loss_vlb_step=0.00617, train/loss_step=0.540, global_step=666.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▍        | 872/5971 [08:45<51:08,  1.66it/s, loss=0.11, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000936, train/loss_step=0.228, global_step=666.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  15%|█▍        | 873/5971 [08:46<51:09,  1.66it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0282, train/loss_vlb_step=0.000103, train/loss_step=0.0282, global_step=667.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▍        | 874/5971 [08:47<51:10,  1.66it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0382, train/loss_vlb_step=0.000139, train/loss_step=0.0382, global_step=667.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▍        | 875/5971 [08:47<51:11,  1.66it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0382, train/loss_vlb_step=0.000139, train/loss_step=0.0382, global_step=667.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▍        | 875/5971 [08:47<51:11,  1.66it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0748, train/loss_vlb_step=0.00025, train/loss_step=0.0748, global_step=667.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  15%|█▍        | 876/5971 [08:50<51:19,  1.65it/s, loss=0.0765, v_num=0, train/loss_simple_step=0.00331, train/loss_vlb_step=1.85e-5, train/loss_step=0.00331, global_step=667.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▍        | 877/5971 [08:50<51:20,  1.65it/s, loss=0.0802, v_num=0, train/loss_simple_step=0.0936, train/loss_vlb_step=0.000308, train/loss_step=0.0936, global_step=668.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  15%|█▍        | 878/5971 [08:51<51:21,  1.65it/s, loss=0.0908, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.0012, train/loss_step=0.278, global_step=668.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  15%|█▍        | 879/5971 [08:52<51:22,  1.65it/s, loss=0.0908, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.0012, train/loss_step=0.278, global_step=668.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▍        | 879/5971 [08:52<51:22,  1.65it/s, loss=0.108, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00214, train/loss_step=0.364, global_step=668.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▍        | 880/5971 [08:55<51:32,  1.65it/s, loss=0.121, v_num=0, train/loss_simple_step=0.293, train/loss_vlb_step=0.00112, train/loss_step=0.293, global_step=668.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▍        | 881/5971 [08:56<51:33,  1.65it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00733, train/loss_vlb_step=3.57e-5, train/loss_step=0.00733, global_step=669.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▍        | 882/5971 [08:56<51:34,  1.64it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00173, train/loss_vlb_step=1.04e-5, train/loss_step=0.00173, global_step=669.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▍        | 883/5971 [08:57<51:35,  1.64it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00173, train/loss_vlb_step=1.04e-5, train/loss_step=0.00173, global_step=669.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▍        | 883/5971 [08:57<51:35,  1.64it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00304, train/loss_vlb_step=1.6e-5, train/loss_step=0.00304, global_step=669.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  15%|█▍        | 884/5971 [09:00<51:46,  1.64it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00934, train/loss_vlb_step=4.25e-5, train/loss_step=0.00934, global_step=669.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▍        | 885/5971 [09:01<51:47,  1.64it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0515, train/loss_vlb_step=0.000176, train/loss_step=0.0515, global_step=670.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  15%|█▍        | 886/5971 [09:02<51:48,  1.64it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0505, train/loss_vlb_step=0.000176, train/loss_step=0.0505, global_step=670.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  15%|█▍        | 887/5971 [09:03<51:49,  1.64it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0505, train/loss_vlb_step=0.000176, train/loss_step=0.0505, global_step=670.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▍        | 887/5971 [09:03<51:49,  1.64it/s, loss=0.131, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00091, train/loss_step=0.234, global_step=670.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  15%|█▍        | 888/5971 [09:05<51:59,  1.63it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0134, train/loss_vlb_step=5.89e-5, train/loss_step=0.0134, global_step=670.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▍        | 889/5971 [09:06<52:00,  1.63it/s, loss=0.131, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000442, train/loss_step=0.129, global_step=671.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  15%|█▍        | 890/5971 [09:07<52:01,  1.63it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0357, train/loss_vlb_step=0.000132, train/loss_step=0.0357, global_step=671.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▍        | 891/5971 [09:08<52:02,  1.63it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0357, train/loss_vlb_step=0.000132, train/loss_step=0.0357, global_step=671.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▍        | 891/5971 [09:08<52:02,  1.63it/s, loss=0.108, v_num=0, train/loss_simple_step=0.215, train/loss_vlb_step=0.00087, train/loss_step=0.215, global_step=671.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  15%|█▍        | 892/5971 [09:10<52:11,  1.62it/s, loss=0.0968, v_num=0, train/loss_simple_step=0.013, train/loss_vlb_step=5.63e-5, train/loss_step=0.013, global_step=671.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▍        | 893/5971 [09:11<52:12,  1.62it/s, loss=0.107, v_num=0, train/loss_simple_step=0.223, train/loss_vlb_step=0.000869, train/loss_step=0.223, global_step=672.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▍        | 894/5971 [09:12<52:13,  1.62it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0303, train/loss_vlb_step=0.000113, train/loss_step=0.0303, global_step=672.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▍        | 895/5971 [09:13<52:15,  1.62it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0303, train/loss_vlb_step=0.000113, train/loss_step=0.0303, global_step=672.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▍        | 895/5971 [09:13<52:15,  1.62it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0697, train/loss_vlb_step=0.000231, train/loss_step=0.0697, global_step=672.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▌        | 896/5971 [09:15<52:23,  1.61it/s, loss=0.116, v_num=0, train/loss_simple_step=0.198, train/loss_vlb_step=0.000757, train/loss_step=0.198, global_step=672.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  15%|█▌        | 897/5971 [09:16<52:24,  1.61it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0638, train/loss_vlb_step=0.000226, train/loss_step=0.0638, global_step=673.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▌        | 898/5971 [09:17<52:25,  1.61it/s, loss=0.113, v_num=0, train/loss_simple_step=0.263, train/loss_vlb_step=0.00137, train/loss_step=0.263, global_step=673.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  15%|█▌        | 899/5971 [09:18<52:26,  1.61it/s, loss=0.113, v_num=0, train/loss_simple_step=0.263, train/loss_vlb_step=0.00137, train/loss_step=0.263, global_step=673.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▌        | 899/5971 [09:18<52:26,  1.61it/s, loss=0.125, v_num=0, train/loss_simple_step=0.595, train/loss_vlb_step=0.0109, train/loss_step=0.595, global_step=673.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  15%|█▌        | 900/5971 [09:20<52:35,  1.61it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0189, train/loss_vlb_step=6.91e-5, train/loss_step=0.0189, global_step=673.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▌        | 901/5971 [09:21<52:36,  1.61it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0326, train/loss_vlb_step=0.000128, train/loss_step=0.0326, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▌        | 902/5971 [09:22<52:37,  1.61it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  15%|█▌        | 903/5971 [09:23<52:38,  1.60it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▌        | 903/5971 [09:23<52:38,  1.60it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00573, train/loss_vlb_step=2.85e-5, train/loss_step=0.00573, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  15%|█▌        | 904/5971 [09:25<52:46,  1.60it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:12,  2.30it/s]

Validating:   1%|          | 2/167 [00:00<00:43,  3.77it/s]
Epoch 1:  15%|█▌        | 907/5971 [09:26<52:37,  1.60it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   3%|▎         | 5/167 [00:00<00:16,  9.82it/s]
Epoch 1:  15%|█▌        | 911/5971 [09:26<52:22,  1.61it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   5%|▍         | 8/167 [00:00<00:10, 14.66it/s]
Epoch 1:  15%|█▌        | 915/5971 [09:26<52:06,  1.62it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   7%|▋         | 12/167 [00:00<00:07, 20.10it/s]
Epoch 1:  15%|█▌        | 919/5971 [09:26<51:51,  1.62it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   9%|▉         | 15/167 [00:01<00:06, 21.77it/s]
Epoch 1:  15%|█▌        | 923/5971 [09:26<51:36,  1.63it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  11%|█▏        | 19/167 [00:01<00:05, 24.82it/s]

Validating:  13%|█▎        | 22/167 [00:01<00:05, 24.82it/s]
Epoch 1:  16%|█▌        | 927/5971 [09:26<51:21,  1.64it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  15%|█▍        | 25/167 [00:01<00:05, 25.81it/s]
Epoch 1:  16%|█▌        | 931/5971 [09:27<51:06,  1.64it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  17%|█▋        | 28/167 [00:01<00:05, 26.79it/s]
Epoch 1:  16%|█▌        | 935/5971 [09:27<50:51,  1.65it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  19%|█▊        | 31/167 [00:01<00:05, 27.14it/s]

Validating:  20%|██        | 34/167 [00:01<00:05, 26.13it/s]
Epoch 1:  16%|█▌        | 939/5971 [09:27<50:37,  1.66it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  22%|██▏       | 37/167 [00:01<00:04, 26.07it/s]
Epoch 1:  16%|█▌        | 943/5971 [09:27<50:22,  1.66it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  24%|██▍       | 40/167 [00:01<00:04, 27.14it/s]
Epoch 1:  16%|█▌        | 947/5971 [09:27<50:08,  1.67it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  26%|██▌       | 43/167 [00:02<00:04, 27.89it/s][A

Validating:  28%|██▊       | 46/167 [00:02<00:04, 27.12it/s][A
Epoch 1:  16%|█▌        | 951/5971 [09:27<49:54,  1.68it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  29%|██▉       | 49/167 [00:02<00:04, 26.74it/s][A
Epoch 1:  16%|█▌        | 955/5971 [09:27<49:39,  1.68it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  31%|███       | 52/167 [00:02<00:04, 26.17it/s][A
Epoch 1:  16%|█▌        | 959/5971 [09:28<49:25,  1.69it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  33%|███▎      | 55/167 [00:02<00:04, 25.92it/s][A

Validating:  35%|███▍      | 58/167 [00:02<00:04, 26.90it/s][A
Epoch 1:  16%|█▌        | 963/5971 [09:28<49:12,  1.70it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  37%|███▋      | 62/167 [00:02<00:03, 28.29it/s][A
Epoch 1:  16%|█▌        | 967/5971 [09:28<48:58,  1.70it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  39%|███▉      | 65/167 [00:02<00:03, 27.88it/s][A
Epoch 1:  16%|█▋        | 971/5971 [09:28<48:44,  1.71it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  41%|████      | 68/167 [00:02<00:03, 28.32it/s][A
Epoch 1:  16%|█▋        | 975/5971 [09:28<48:31,  1.72it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 26.37it/s][A
Epoch 1:  16%|█▋        | 979/5971 [09:28<48:17,  1.72it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 27.79it/s][A

Validating:  47%|████▋     | 78/167 [00:03<00:03, 27.41it/s][A
Epoch 1:  16%|█▋        | 983/5971 [09:28<48:04,  1.73it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 26.94it/s][A
Epoch 1:  17%|█▋        | 987/5971 [09:29<47:50,  1.74it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  50%|█████     | 84/167 [00:03<00:03, 27.61it/s][A
Epoch 1:  17%|█▋        | 991/5971 [09:29<47:37,  1.74it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  52%|█████▏    | 87/167 [00:03<00:02, 27.32it/s][A

Validating:  54%|█████▍    | 90/167 [00:03<00:02, 26.81it/s][A
Epoch 1:  17%|█▋        | 995/5971 [09:29<47:24,  1.75it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  56%|█████▌    | 93/167 [00:03<00:02, 26.99it/s][A
Epoch 1:  17%|█▋        | 999/5971 [09:29<47:11,  1.76it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 26.39it/s][A
Epoch 1:  17%|█▋        | 1003/5971 [09:29<46:59,  1.76it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 27.50it/s][A
Epoch 1:  17%|█▋        | 1007/5971 [09:29<46:46,  1.77it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 27.00it/s][A

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 27.30it/s][A
Epoch 1:  17%|█▋        | 1011/5971 [09:29<46:33,  1.78it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 27.88it/s][A
Epoch 1:  17%|█▋        | 1015/5971 [09:30<46:21,  1.78it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  68%|██████▊   | 114/167 [00:04<00:01, 28.92it/s][A
Epoch 1:  17%|█▋        | 1019/5971 [09:30<46:08,  1.79it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  70%|███████   | 117/167 [00:04<00:01, 27.96it/s][A
Epoch 1:  17%|█▋        | 1023/5971 [09:30<45:56,  1.80it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  72%|███████▏  | 120/167 [00:04<00:01, 27.91it/s][A
Epoch 1:  17%|█▋        | 1027/5971 [09:30<45:44,  1.80it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  74%|███████▎  | 123/167 [00:04<00:01, 28.27it/s][A

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 27.82it/s][A
Epoch 1:  17%|█▋        | 1031/5971 [09:30<45:31,  1.81it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 28.08it/s][A
Epoch 1:  17%|█▋        | 1035/5971 [09:30<45:19,  1.81it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 27.29it/s][A
Epoch 1:  17%|█▋        | 1039/5971 [09:30<45:07,  1.82it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 26.38it/s][A
Epoch 1:  17%|█▋        | 1043/5971 [09:31<44:56,  1.83it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 26.09it/s][A

Validating:  85%|████████▌ | 142/167 [00:05<00:00, 26.15it/s][A
Epoch 1:  18%|█▊        | 1047/5971 [09:31<44:44,  1.83it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  87%|████████▋ | 145/167 [00:05<00:00, 26.66it/s][A
Epoch 1:  18%|█▊        | 1051/5971 [09:31<44:32,  1.84it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  89%|████████▊ | 148/167 [00:05<00:00, 26.71it/s][A
Epoch 1:  18%|█▊        | 1055/5971 [09:31<44:21,  1.85it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 26.52it/s][A

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 26.12it/s][A
Epoch 1:  18%|█▊        | 1059/5971 [09:31<44:09,  1.85it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 26.57it/s][A
Epoch 1:  18%|█▊        | 1063/5971 [09:31<43:58,  1.86it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 26.52it/s][A
Epoch 1:  18%|█▊        | 1067/5971 [09:32<43:46,  1.87it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 27.22it/s][A
Epoch 1:  18%|█▊        | 1071/5971 [09:32<43:35,  1.87it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating: 100%|██████████| 167/167 [00:06<00:00, 28.45it/s][A
Epoch 1:  18%|█▊        | 1072/5971 [09:32<43:34,  1.87it/s, loss=0.12, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000488, train/loss_step=0.149, global_step=674.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Epoch 1:  18%|█▊        | 1073/5971 [09:33<43:37,  1.87it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0425, train/loss_vlb_step=0.000153, train/loss_step=0.0425, global_step=675.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  18%|█▊        | 1074/5971 [09:34<43:38,  1.87it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0826, train/loss_vlb_step=0.000275, train/loss_step=0.0826, global_step=675.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  18%|█▊        | 1075/5971 [09:35<43:39,  1.87it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0826, train/loss_vlb_step=0.000275, train/loss_step=0.0826, global_step=675.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  18%|█▊        | 1075/5971 [09:35<43:39,  1.87it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.38e-5, train/loss_step=0.0149, global_step=675.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  18%|█▊        | 1076/5971 [09:38<43:47,  1.86it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000325, train/loss_step=0.0971, global_step=675.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  18%|█▊        | 1077/5971 [09:39<43:50,  1.86it/s, loss=0.137, v_num=0, train/loss_simple_step=0.581, train/loss_vlb_step=0.00727, train/loss_step=0.581, global_step=676.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  18%|█▊        | 1078/5971 [09:40<43:52,  1.86it/s, loss=0.136, v_num=0, train/loss_simple_step=0.013, train/loss_vlb_step=5.55e-5, train/loss_step=0.013, global_step=676.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  18%|█▊        | 1079/5971 [09:41<43:53,  1.86it/s, loss=0.136, v_num=0, train/loss_simple_step=0.013, train/loss_vlb_step=5.55e-5, train/loss_step=0.013, global_step=676.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  18%|█▊        | 1079/5971 [09:41<43:53,  1.86it/s, loss=0.152, v_num=0, train/loss_simple_step=0.547, train/loss_vlb_step=0.00741, train/loss_step=0.547, global_step=676.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  18%|█▊        | 1080/5971 [09:43<43:59,  1.85it/s, loss=0.161, v_num=0, train/loss_simple_step=0.195, train/loss_vlb_step=0.000733, train/loss_step=0.195, global_step=676.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  18%|█▊        | 1081/5971 [09:44<44:01,  1.85it/s, loss=0.166, v_num=0, train/loss_simple_step=0.324, train/loss_vlb_step=0.00161, train/loss_step=0.324, global_step=677.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  18%|█▊        | 1082/5971 [09:45<44:02,  1.85it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000163, train/loss_step=0.0453, global_step=677.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  18%|█▊        | 1083/5971 [09:46<44:03,  1.85it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000163, train/loss_step=0.0453, global_step=677.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  18%|█▊        | 1083/5971 [09:46<44:03,  1.85it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00122, train/loss_vlb_step=7.31e-6, train/loss_step=0.00122, global_step=677.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  18%|█▊        | 1084/5971 [09:49<44:12,  1.84it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0165, train/loss_vlb_step=7.2e-5, train/loss_step=0.0165, global_step=677.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  18%|█▊        | 1085/5971 [09:49<44:14,  1.84it/s, loss=0.163, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000981, train/loss_step=0.240, global_step=678.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  18%|█▊        | 1086/5971 [09:50<44:15,  1.84it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00361, train/loss_vlb_step=2.01e-5, train/loss_step=0.00361, global_step=678.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  18%|█▊        | 1087/5971 [09:51<44:16,  1.84it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00361, train/loss_vlb_step=2.01e-5, train/loss_step=0.00361, global_step=678.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  18%|█▊        | 1087/5971 [09:51<44:16,  1.84it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0424, train/loss_vlb_step=0.000156, train/loss_step=0.0424, global_step=678.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  18%|█▊        | 1088/5971 [09:54<44:24,  1.83it/s, loss=0.137, v_num=0, train/loss_simple_step=0.295, train/loss_vlb_step=0.00164, train/loss_step=0.295, global_step=678.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  18%|█▊        | 1089/5971 [09:55<44:25,  1.83it/s, loss=0.154, v_num=0, train/loss_simple_step=0.380, train/loss_vlb_step=0.00241, train/loss_step=0.380, global_step=679.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  18%|█▊        | 1090/5971 [09:56<44:26,  1.83it/s, loss=0.178, v_num=0, train/loss_simple_step=0.481, train/loss_vlb_step=0.00355, train/loss_step=0.481, global_step=679.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  18%|█▊        | 1091/5971 [09:56<44:27,  1.83it/s, loss=0.178, v_num=0, train/loss_simple_step=0.481, train/loss_vlb_step=0.00355, train/loss_step=0.481, global_step=679.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  18%|█▊        | 1091/5971 [09:56<44:27,  1.83it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00829, train/loss_vlb_step=3.92e-5, train/loss_step=0.00829, global_step=679.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  18%|█▊        | 1092/5971 [09:59<44:34,  1.82it/s, loss=0.188, v_num=0, train/loss_simple_step=0.356, train/loss_vlb_step=0.00203, train/loss_step=0.356, global_step=679.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  18%|█▊        | 1093/5971 [09:59<44:35,  1.82it/s, loss=0.197, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000791, train/loss_step=0.217, global_step=680.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  18%|█▊        | 1094/5971 [10:00<44:36,  1.82it/s, loss=0.206, v_num=0, train/loss_simple_step=0.263, train/loss_vlb_step=0.000976, train/loss_step=0.263, global_step=680.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  18%|█▊        | 1095/5971 [10:01<44:37,  1.82it/s, loss=0.206, v_num=0, train/loss_simple_step=0.263, train/loss_vlb_step=0.000976, train/loss_step=0.263, global_step=680.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  18%|█▊        | 1095/5971 [10:01<44:37,  1.82it/s, loss=0.213, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000499, train/loss_step=0.150, global_step=680.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  18%|█▊        | 1096/5971 [10:03<44:43,  1.82it/s, loss=0.212, v_num=0, train/loss_simple_step=0.0822, train/loss_vlb_step=0.000271, train/loss_step=0.0822, global_step=680.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  18%|█▊        | 1097/5971 [10:04<44:44,  1.82it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00274, train/loss_vlb_step=1.54e-5, train/loss_step=0.00274, global_step=681.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  18%|█▊        | 1098/5971 [10:05<44:45,  1.81it/s, loss=0.197, v_num=0, train/loss_simple_step=0.281, train/loss_vlb_step=0.00124, train/loss_step=0.281, global_step=681.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  18%|█▊        | 1099/5971 [10:06<44:46,  1.81it/s, loss=0.197, v_num=0, train/loss_simple_step=0.281, train/loss_vlb_step=0.00124, train/loss_step=0.281, global_step=681.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  18%|█▊        | 1099/5971 [10:06<44:46,  1.81it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0027, train/loss_vlb_step=1.5e-5, train/loss_step=0.0027, global_step=681.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  18%|█▊        | 1100/5971 [10:08<44:53,  1.81it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0248, train/loss_vlb_step=9.37e-5, train/loss_step=0.0248, global_step=681.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  18%|█▊        | 1101/5971 [10:09<44:54,  1.81it/s, loss=0.173, v_num=0, train/loss_simple_step=0.567, train/loss_vlb_step=0.00979, train/loss_step=0.567, global_step=682.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  18%|█▊        | 1102/5971 [10:10<44:55,  1.81it/s, loss=0.171, v_num=0, train/loss_simple_step=0.00586, train/loss_vlb_step=2.95e-5, train/loss_step=0.00586, global_step=682.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  18%|█▊        | 1103/5971 [10:11<44:56,  1.81it/s, loss=0.171, v_num=0, train/loss_simple_step=0.00586, train/loss_vlb_step=2.95e-5, train/loss_step=0.00586, global_step=682.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  18%|█▊        | 1103/5971 [10:11<44:56,  1.81it/s, loss=0.18, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000683, train/loss_step=0.181, global_step=682.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  18%|█▊        | 1104/5971 [10:13<45:02,  1.80it/s, loss=0.206, v_num=0, train/loss_simple_step=0.539, train/loss_vlb_step=0.00638, train/loss_step=0.539, global_step=682.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▊        | 1105/5971 [10:14<45:03,  1.80it/s, loss=0.194, v_num=0, train/loss_simple_step=0.00305, train/loss_vlb_step=1.71e-5, train/loss_step=0.00305, global_step=683.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▊        | 1106/5971 [10:15<45:04,  1.80it/s, loss=0.226, v_num=0, train/loss_simple_step=0.648, train/loss_vlb_step=0.00959, train/loss_step=0.648, global_step=683.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  19%|█▊        | 1107/5971 [10:16<45:05,  1.80it/s, loss=0.226, v_num=0, train/loss_simple_step=0.648, train/loss_vlb_step=0.00959, train/loss_step=0.648, global_step=683.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▊        | 1107/5971 [10:16<45:05,  1.80it/s, loss=0.256, v_num=0, train/loss_simple_step=0.630, train/loss_vlb_step=0.0154, train/loss_step=0.630, global_step=683.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  19%|█▊        | 1108/5971 [10:18<45:12,  1.79it/s, loss=0.242, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=6.11e-5, train/loss_step=0.0143, global_step=683.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▊        | 1109/5971 [10:19<45:13,  1.79it/s, loss=0.223, v_num=0, train/loss_simple_step=0.00825, train/loss_vlb_step=3.95e-5, train/loss_step=0.00825, global_step=684.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▊        | 1110/5971 [10:20<45:14,  1.79it/s, loss=0.203, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000281, train/loss_step=0.0844, global_step=684.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  19%|█▊        | 1111/5971 [10:21<45:16,  1.79it/s, loss=0.203, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000281, train/loss_step=0.0844, global_step=684.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▊        | 1111/5971 [10:21<45:16,  1.79it/s, loss=0.203, v_num=0, train/loss_simple_step=0.0094, train/loss_vlb_step=4.46e-5, train/loss_step=0.0094, global_step=684.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  19%|█▊        | 1112/5971 [10:23<45:23,  1.78it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0213, train/loss_vlb_step=8.94e-5, train/loss_step=0.0213, global_step=684.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▊        | 1113/5971 [10:24<45:23,  1.78it/s, loss=0.189, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.000981, train/loss_step=0.268, global_step=685.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  19%|█▊        | 1114/5971 [10:25<45:24,  1.78it/s, loss=0.185, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000573, train/loss_step=0.173, global_step=685.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▊        | 1115/5971 [10:26<45:25,  1.78it/s, loss=0.185, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000573, train/loss_step=0.173, global_step=685.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▊        | 1115/5971 [10:26<45:25,  1.78it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0513, train/loss_vlb_step=0.000177, train/loss_step=0.0513, global_step=685.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▊        | 1116/5971 [10:29<45:34,  1.78it/s, loss=0.196, v_num=0, train/loss_simple_step=0.403, train/loss_vlb_step=0.00282, train/loss_step=0.403, global_step=685.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  19%|█▊        | 1117/5971 [10:29<45:35,  1.77it/s, loss=0.196, v_num=0, train/loss_simple_step=0.00819, train/loss_vlb_step=3.79e-5, train/loss_step=0.00819, global_step=686.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▊        | 1118/5971 [10:30<45:35,  1.77it/s, loss=0.221, v_num=0, train/loss_simple_step=0.783, train/loss_vlb_step=0.0209, train/loss_step=0.783, global_step=686.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  19%|█▊        | 1119/5971 [10:31<45:36,  1.77it/s, loss=0.221, v_num=0, train/loss_simple_step=0.783, train/loss_vlb_step=0.0209, train/loss_step=0.783, global_step=686.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▊        | 1119/5971 [10:31<45:36,  1.77it/s, loss=0.238, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00264, train/loss_step=0.329, global_step=686.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1120/5971 [10:33<45:43,  1.77it/s, loss=0.237, v_num=0, train/loss_simple_step=0.00358, train/loss_vlb_step=1.96e-5, train/loss_step=0.00358, global_step=686.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1121/5971 [10:34<45:43,  1.77it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0481, train/loss_vlb_step=0.000174, train/loss_step=0.0481, global_step=687.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  19%|█▉        | 1122/5971 [10:35<45:44,  1.77it/s, loss=0.213, v_num=0, train/loss_simple_step=0.053, train/loss_vlb_step=0.000177, train/loss_step=0.053, global_step=687.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  19%|█▉        | 1123/5971 [10:36<45:45,  1.77it/s, loss=0.213, v_num=0, train/loss_simple_step=0.053, train/loss_vlb_step=0.000177, train/loss_step=0.053, global_step=687.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1123/5971 [10:36<45:45,  1.77it/s, loss=0.204, v_num=0, train/loss_simple_step=0.00461, train/loss_vlb_step=2.46e-5, train/loss_step=0.00461, global_step=687.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1124/5971 [10:38<45:51,  1.76it/s, loss=0.187, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000781, train/loss_step=0.202, global_step=687.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  19%|█▉        | 1125/5971 [10:39<45:52,  1.76it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.5e-5, train/loss_step=0.0128, global_step=688.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1126/5971 [10:40<45:53,  1.76it/s, loss=0.162, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000474, train/loss_step=0.143, global_step=688.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1127/5971 [10:41<45:54,  1.76it/s, loss=0.162, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000474, train/loss_step=0.143, global_step=688.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1127/5971 [10:41<45:54,  1.76it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0137, train/loss_vlb_step=5.88e-5, train/loss_step=0.0137, global_step=688.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1128/5971 [10:43<46:01,  1.75it/s, loss=0.139, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.0006, train/loss_step=0.170, global_step=688.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  19%|█▉        | 1129/5971 [10:44<46:02,  1.75it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0609, train/loss_vlb_step=0.000215, train/loss_step=0.0609, global_step=689.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1130/5971 [10:45<46:02,  1.75it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00568, train/loss_vlb_step=2.87e-5, train/loss_step=0.00568, global_step=689.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1131/5971 [10:46<46:04,  1.75it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00568, train/loss_vlb_step=2.87e-5, train/loss_step=0.00568, global_step=689.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1131/5971 [10:46<46:04,  1.75it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0389, train/loss_vlb_step=0.000141, train/loss_step=0.0389, global_step=689.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  19%|█▉        | 1132/5971 [10:48<46:10,  1.75it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00472, train/loss_vlb_step=2.53e-5, train/loss_step=0.00472, global_step=689.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1133/5971 [10:49<46:11,  1.75it/s, loss=0.144, v_num=0, train/loss_simple_step=0.374, train/loss_vlb_step=0.00206, train/loss_step=0.374, global_step=690.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  19%|█▉        | 1134/5971 [10:50<46:11,  1.75it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00868, train/loss_vlb_step=4.24e-5, train/loss_step=0.00868, global_step=690.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1135/5971 [10:51<46:12,  1.74it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00868, train/loss_vlb_step=4.24e-5, train/loss_step=0.00868, global_step=690.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1135/5971 [10:51<46:12,  1.74it/s, loss=0.154, v_num=0, train/loss_simple_step=0.419, train/loss_vlb_step=0.00216, train/loss_step=0.419, global_step=690.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  19%|█▉        | 1136/5971 [10:53<46:19,  1.74it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0151, train/loss_vlb_step=6.57e-5, train/loss_step=0.0151, global_step=690.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1137/5971 [10:54<46:20,  1.74it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0228, train/loss_vlb_step=8.96e-5, train/loss_step=0.0228, global_step=691.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1138/5971 [10:55<46:21,  1.74it/s, loss=0.105, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000743, train/loss_step=0.166, global_step=691.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  19%|█▉        | 1139/5971 [10:56<46:22,  1.74it/s, loss=0.105, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000743, train/loss_step=0.166, global_step=691.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1139/5971 [10:56<46:22,  1.74it/s, loss=0.0888, v_num=0, train/loss_simple_step=0.00959, train/loss_vlb_step=4.49e-5, train/loss_step=0.00959, global_step=691.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1140/5971 [10:58<46:28,  1.73it/s, loss=0.122, v_num=0, train/loss_simple_step=0.666, train/loss_vlb_step=0.0134, train/loss_step=0.666, global_step=691.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]      
Epoch 1:  19%|█▉        | 1141/5971 [10:59<46:29,  1.73it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00453, train/loss_vlb_step=2.46e-5, train/loss_step=0.00453, global_step=692.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1142/5971 [11:00<46:29,  1.73it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0352, train/loss_vlb_step=0.000125, train/loss_step=0.0352, global_step=692.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1143/5971 [11:01<46:30,  1.73it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0352, train/loss_vlb_step=0.000125, train/loss_step=0.0352, global_step=692.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1143/5971 [11:01<46:30,  1.73it/s, loss=0.152, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.015, train/loss_step=0.668, global_step=692.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  19%|█▉        | 1144/5971 [11:03<46:36,  1.73it/s, loss=0.165, v_num=0, train/loss_simple_step=0.457, train/loss_vlb_step=0.00281, train/loss_step=0.457, global_step=692.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1145/5971 [11:04<46:37,  1.73it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0709, train/loss_vlb_step=0.000241, train/loss_step=0.0709, global_step=693.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1146/5971 [11:05<46:37,  1.72it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00626, train/loss_vlb_step=2.95e-5, train/loss_step=0.00626, global_step=693.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1147/5971 [11:05<46:38,  1.72it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00626, train/loss_vlb_step=2.95e-5, train/loss_step=0.00626, global_step=693.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1147/5971 [11:05<46:38,  1.72it/s, loss=0.167, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000486, train/loss_step=0.145, global_step=693.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  19%|█▉        | 1148/5971 [11:08<46:45,  1.72it/s, loss=0.167, v_num=0, train/loss_simple_step=0.167, train/loss_vlb_step=0.000568, train/loss_step=0.167, global_step=693.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1149/5971 [11:09<46:46,  1.72it/s, loss=0.178, v_num=0, train/loss_simple_step=0.270, train/loss_vlb_step=0.000942, train/loss_step=0.270, global_step=694.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1150/5971 [11:10<46:46,  1.72it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0181, train/loss_vlb_step=6.89e-5, train/loss_step=0.0181, global_step=694.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1151/5971 [11:10<46:47,  1.72it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0181, train/loss_vlb_step=6.89e-5, train/loss_step=0.0181, global_step=694.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1151/5971 [11:10<46:47,  1.72it/s, loss=0.185, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000615, train/loss_step=0.177, global_step=694.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  19%|█▉        | 1152/5971 [11:13<46:53,  1.71it/s, loss=0.206, v_num=0, train/loss_simple_step=0.421, train/loss_vlb_step=0.00216, train/loss_step=0.421, global_step=694.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  19%|█▉        | 1153/5971 [11:13<46:53,  1.71it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00157, train/loss_vlb_step=8.94e-6, train/loss_step=0.00157, global_step=695.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1154/5971 [11:14<46:54,  1.71it/s, loss=0.195, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000506, train/loss_step=0.153, global_step=695.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  19%|█▉        | 1155/5971 [11:15<46:55,  1.71it/s, loss=0.195, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000506, train/loss_step=0.153, global_step=695.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1155/5971 [11:15<46:55,  1.71it/s, loss=0.185, v_num=0, train/loss_simple_step=0.223, train/loss_vlb_step=0.000777, train/loss_step=0.223, global_step=695.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1156/5971 [11:18<47:02,  1.71it/s, loss=0.19, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000413, train/loss_step=0.123, global_step=695.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  19%|█▉        | 1157/5971 [11:19<47:03,  1.71it/s, loss=0.189, v_num=0, train/loss_simple_step=0.00273, train/loss_vlb_step=1.6e-5, train/loss_step=0.00273, global_step=696.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1158/5971 [11:20<47:03,  1.70it/s, loss=0.199, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00166, train/loss_step=0.353, global_step=696.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  19%|█▉        | 1159/5971 [11:20<47:04,  1.70it/s, loss=0.199, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00166, train/loss_step=0.353, global_step=696.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1159/5971 [11:20<47:04,  1.70it/s, loss=0.199, v_num=0, train/loss_simple_step=0.0129, train/loss_vlb_step=5.6e-5, train/loss_step=0.0129, global_step=696.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1160/5971 [11:23<47:10,  1.70it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0647, train/loss_vlb_step=0.000222, train/loss_step=0.0647, global_step=696.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1161/5971 [11:23<47:11,  1.70it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.44e-5, train/loss_step=0.0104, global_step=697.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  19%|█▉        | 1162/5971 [11:24<47:11,  1.70it/s, loss=0.21, v_num=0, train/loss_simple_step=0.859, train/loss_vlb_step=0.0732, train/loss_step=0.859, global_step=697.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  19%|█▉        | 1163/5971 [11:25<47:12,  1.70it/s, loss=0.21, v_num=0, train/loss_simple_step=0.859, train/loss_vlb_step=0.0732, train/loss_step=0.859, global_step=697.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1163/5971 [11:25<47:12,  1.70it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0137, train/loss_vlb_step=6.03e-5, train/loss_step=0.0137, global_step=697.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  19%|█▉        | 1164/5971 [11:27<47:18,  1.69it/s, loss=0.174, v_num=0, train/loss_simple_step=0.392, train/loss_vlb_step=0.00237, train/loss_step=0.392, global_step=697.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  20%|█▉        | 1165/5971 [11:28<47:18,  1.69it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0203, train/loss_vlb_step=7.9e-5, train/loss_step=0.0203, global_step=698.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  20%|█▉        | 1166/5971 [11:29<47:19,  1.69it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0083, train/loss_vlb_step=3.59e-5, train/loss_step=0.0083, global_step=698.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  20%|█▉        | 1167/5971 [11:30<47:20,  1.69it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0083, train/loss_vlb_step=3.59e-5, train/loss_step=0.0083, global_step=698.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  20%|█▉        | 1167/5971 [11:30<47:20,  1.69it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=7.78e-5, train/loss_step=0.0192, global_step=698.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  20%|█▉        | 1168/5971 [11:33<47:28,  1.69it/s, loss=0.172, v_num=0, train/loss_simple_step=0.293, train/loss_vlb_step=0.00113, train/loss_step=0.293, global_step=698.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  20%|█▉        | 1169/5971 [11:34<47:29,  1.69it/s, loss=0.164, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000513, train/loss_step=0.122, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  20%|█▉        | 1170/5971 [11:35<47:30,  1.68it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00703, train/loss_vlb_step=3.34e-5, train/loss_step=0.00703, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  20%|█▉        | 1171/5971 [11:36<47:30,  1.68it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00703, train/loss_vlb_step=3.34e-5, train/loss_step=0.00703, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  20%|█▉        | 1171/5971 [11:36<47:30,  1.68it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00378, train/loss_vlb_step=2.06e-5, train/loss_step=0.00378, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  20%|█▉        | 1172/5971 [11:38<47:36,  1.68it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:01<03:16,  1.18s/it]

Validating:   1%|          | 2/167 [00:01<01:46,  1.55it/s]
Epoch 1:  20%|█▉        | 1175/5971 [11:39<47:33,  1.68it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   3%|▎         | 5/167 [00:01<00:34,  4.76it/s]
Epoch 1:  20%|█▉        | 1179/5971 [11:39<47:21,  1.69it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   5%|▍         | 8/167 [00:01<00:19,  8.10it/s]
Epoch 1:  20%|█▉        | 1183/5971 [11:39<47:10,  1.69it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   7%|▋         | 11/167 [00:01<00:13, 11.26it/s]

Validating:   8%|▊         | 14/167 [00:01<00:10, 14.55it/s]
Epoch 1:  20%|█▉        | 1187/5971 [11:40<46:59,  1.70it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  10%|█         | 17/167 [00:02<00:08, 17.15it/s]
Epoch 1:  20%|█▉        | 1191/5971 [11:40<46:48,  1.70it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  12%|█▏        | 20/167 [00:02<00:07, 19.43it/s]
Epoch 1:  20%|██        | 1195/5971 [11:40<46:36,  1.71it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  14%|█▍        | 23/167 [00:02<00:06, 21.74it/s]

Validating:  16%|█▌        | 26/167 [00:02<00:05, 23.65it/s]
Epoch 1:  20%|██        | 1199/5971 [11:40<46:25,  1.71it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  17%|█▋        | 29/167 [00:02<00:05, 23.92it/s]
Epoch 1:  20%|██        | 1203/5971 [11:40<46:14,  1.72it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  19%|█▉        | 32/167 [00:02<00:05, 24.63it/s]
Epoch 1:  20%|██        | 1207/5971 [11:40<46:03,  1.72it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  21%|██        | 35/167 [00:02<00:05, 24.18it/s]

Validating:  23%|██▎       | 38/167 [00:02<00:05, 24.58it/s]
Epoch 1:  20%|██        | 1211/5971 [11:41<45:53,  1.73it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 25.49it/s]
Epoch 1:  20%|██        | 1215/5971 [11:41<45:42,  1.73it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  26%|██▋       | 44/167 [00:03<00:04, 26.56it/s]
Epoch 1:  20%|██        | 1219/5971 [11:41<45:31,  1.74it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  28%|██▊       | 47/167 [00:03<00:04, 26.49it/s]

Validating:  30%|██▉       | 50/167 [00:03<00:04, 26.09it/s]
Epoch 1:  20%|██        | 1223/5971 [11:41<45:21,  1.74it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  32%|███▏      | 53/167 [00:03<00:04, 26.01it/s]
Epoch 1:  21%|██        | 1227/5971 [11:41<45:10,  1.75it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  34%|███▎      | 56/167 [00:03<00:04, 26.88it/s]
Epoch 1:  21%|██        | 1231/5971 [11:41<44:59,  1.76it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  35%|███▌      | 59/167 [00:03<00:04, 26.90it/s]

Validating:  37%|███▋      | 62/167 [00:03<00:03, 26.54it/s]
Epoch 1:  21%|██        | 1235/5971 [11:41<44:49,  1.76it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  40%|███▉      | 66/167 [00:03<00:03, 27.81it/s]
Epoch 1:  21%|██        | 1239/5971 [11:42<44:39,  1.77it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 26.97it/s]
Epoch 1:  21%|██        | 1243/5971 [11:42<44:28,  1.77it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  43%|████▎     | 72/167 [00:04<00:03, 26.23it/s]
Epoch 1:  21%|██        | 1247/5971 [11:42<44:18,  1.78it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  45%|████▍     | 75/167 [00:04<00:03, 25.91it/s]

Validating:  47%|████▋     | 78/167 [00:04<00:03, 24.84it/s]
Epoch 1:  21%|██        | 1251/5971 [11:42<44:08,  1.78it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  49%|████▊     | 81/167 [00:04<00:03, 25.66it/s]
Epoch 1:  21%|██        | 1255/5971 [11:42<43:58,  1.79it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  50%|█████     | 84/167 [00:04<00:03, 25.89it/s]
Epoch 1:  21%|██        | 1259/5971 [11:42<43:48,  1.79it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  53%|█████▎    | 88/167 [00:04<00:02, 27.55it/s]
Epoch 1:  21%|██        | 1263/5971 [11:42<43:38,  1.80it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  54%|█████▍    | 91/167 [00:04<00:02, 27.26it/s]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 27.98it/s]
Epoch 1:  21%|██        | 1267/5971 [11:43<43:28,  1.80it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  58%|█████▊    | 97/167 [00:05<00:02, 27.43it/s]
Epoch 1:  21%|██▏       | 1271/5971 [11:43<43:18,  1.81it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  60%|█████▉    | 100/167 [00:05<00:02, 27.16it/s]
Epoch 1:  21%|██▏       | 1275/5971 [11:43<43:08,  1.81it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  62%|██████▏   | 104/167 [00:05<00:02, 28.54it/s]
Epoch 1:  21%|██▏       | 1279/5971 [11:43<42:58,  1.82it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  64%|██████▍   | 107/167 [00:05<00:02, 27.10it/s]

Validating:  66%|██████▌   | 110/167 [00:05<00:02, 26.70it/s]
Epoch 1:  21%|██▏       | 1283/5971 [11:43<42:49,  1.82it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  68%|██████▊   | 114/167 [00:05<00:01, 28.24it/s]
Epoch 1:  22%|██▏       | 1287/5971 [11:43<42:39,  1.83it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  70%|███████   | 117/167 [00:05<00:01, 28.65it/s]
Epoch 1:  22%|██▏       | 1291/5971 [11:43<42:29,  1.84it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 27.94it/s]
Epoch 1:  22%|██▏       | 1295/5971 [11:44<42:20,  1.84it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 26.93it/s]

Validating:  75%|███████▌  | 126/167 [00:06<00:01, 27.28it/s]
Epoch 1:  22%|██▏       | 1299/5971 [11:44<42:11,  1.85it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  77%|███████▋  | 129/167 [00:06<00:01, 27.85it/s]
Epoch 1:  22%|██▏       | 1303/5971 [11:44<42:01,  1.85it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  79%|███████▉  | 132/167 [00:06<00:01, 26.60it/s]
Epoch 1:  22%|██▏       | 1307/5971 [11:44<41:52,  1.86it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  81%|████████  | 135/167 [00:06<00:01, 27.34it/s]

Validating:  83%|████████▎ | 138/167 [00:06<00:01, 26.87it/s]
Epoch 1:  22%|██▏       | 1311/5971 [11:44<41:42,  1.86it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  84%|████████▍ | 141/167 [00:06<00:01, 25.65it/s]
Epoch 1:  22%|██▏       | 1315/5971 [11:44<41:33,  1.87it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 25.76it/s]
Epoch 1:  22%|██▏       | 1319/5971 [11:45<41:24,  1.87it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 25.07it/s]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 26.00it/s]
Epoch 1:  22%|██▏       | 1323/5971 [11:45<41:15,  1.88it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  92%|█████████▏| 153/167 [00:07<00:00, 26.80it/s]
Epoch 1:  22%|██▏       | 1327/5971 [11:45<41:06,  1.88it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  93%|█████████▎| 156/167 [00:07<00:00, 27.09it/s]
Epoch 1:  22%|██▏       | 1331/5971 [11:45<40:57,  1.89it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  95%|█████████▌| 159/167 [00:07<00:00, 26.78it/s]

Validating:  97%|█████████▋| 162/167 [00:07<00:00, 26.94it/s]
Epoch 1:  22%|██▏       | 1335/5971 [11:45<40:48,  1.89it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  99%|█████████▉| 165/167 [00:07<00:00, 27.43it/s]
Epoch 1:  22%|██▏       | 1339/5971 [11:45<40:39,  1.90it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  22%|██▏       | 1340/5971 [11:46<40:38,  1.90it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.81e-6, train/loss_step=0.00164, global_step=699.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s]

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s]

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.40it/s]

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.25it/s]

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.87it/s]

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.36it/s]

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.72it/s]

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.99it/s]

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.19it/s]

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.33it/s]

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.37it/s]

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.34it/s]

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.29it/s]

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:07,  5.26it/s]

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.36it/s]

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.40it/s]

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.47it/s]

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.29it/s]

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:06,  5.25it/s]

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.23it/s]

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.30it/s]

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.34it/s]

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.33it/s]

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.35it/s]

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.35it/s]

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.36it/s]

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.40it/s]

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.46it/s]

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.52it/s]

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.56it/s]

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.57it/s]

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.59it/s]

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.63it/s]

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.62it/s]

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.63it/s]

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.64it/s]

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.66it/s]

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.67it/s]

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.67it/s]

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.61it/s]

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.55it/s]

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.56it/s]

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.59it/s]

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.60it/s]

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.56it/s]

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.56it/s]

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.57it/s]

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.59it/s]

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.60it/s]

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.60it/s]

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.58it/s]
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.19it/s]

Epoch 1:  22%|██▏       | 1341/5971 [11:58<41:18,  1.87it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.96e-5, train/loss_step=0.0157, global_step=700.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Epoch 1:  22%|██▏       | 1341/5971 [12:03<41:34,  1.86it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.96e-5, train/loss_step=0.0157, global_step=700.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.16it/s]

Epoch 1:  22%|██▏       | 1342/5971 [12:10<41:57,  1.84it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.96e-5, train/loss_step=0.0157, global_step=700.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  22%|██▏       | 1342/5971 [12:10<41:57,  1.84it/s, loss=0.132, v_num=0, train/loss_simple_step=0.091, train/loss_vlb_step=0.000313, train/loss_step=0.091, global_step=700.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.06it/s]

Epoch 1:  22%|██▏       | 1343/5971 [12:22<42:36,  1.81it/s, loss=0.132, v_num=0, train/loss_simple_step=0.091, train/loss_vlb_step=0.000313, train/loss_step=0.091, global_step=700.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  22%|██▏       | 1343/5971 [12:22<42:36,  1.81it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0522, train/loss_vlb_step=0.000176, train/loss_step=0.0522, global_step=700.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.13it/s]

Epoch 1:  23%|██▎       | 1344/5971 [12:36<43:21,  1.78it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0522, train/loss_vlb_step=0.000176, train/loss_step=0.0522, global_step=700.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1344/5971 [12:36<43:21,  1.78it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00285, train/loss_vlb_step=1.58e-5, train/loss_step=0.00285, global_step=700.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1345/5971 [12:37<43:22,  1.78it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00285, train/loss_vlb_step=1.58e-5, train/loss_step=0.00285, global_step=700.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1345/5971 [12:37<43:22,  1.78it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00396, train/loss_vlb_step=2.15e-5, train/loss_step=0.00396, global_step=701.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1346/5971 [12:38<43:22,  1.78it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00396, train/loss_vlb_step=2.15e-5, train/loss_step=0.00396, global_step=701.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1346/5971 [12:38<43:22,  1.78it/s, loss=0.1, v_num=0, train/loss_simple_step=0.00868, train/loss_vlb_step=4.19e-5, train/loss_step=0.00868, global_step=701.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  23%|██▎       | 1347/5971 [12:38<43:23,  1.78it/s, loss=0.1, v_num=0, train/loss_simple_step=0.00868, train/loss_vlb_step=4.19e-5, train/loss_step=0.00868, global_step=701.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1347/5971 [12:38<43:23,  1.78it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0199, train/loss_vlb_step=8.01e-5, train/loss_step=0.0199, global_step=701.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  23%|██▎       | 1348/5971 [12:41<43:29,  1.77it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0199, train/loss_vlb_step=8.01e-5, train/loss_step=0.0199, global_step=701.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1348/5971 [12:41<43:29,  1.77it/s, loss=0.115, v_num=0, train/loss_simple_step=0.359, train/loss_vlb_step=0.00157, train/loss_step=0.359, global_step=701.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1349/5971 [12:42<43:29,  1.77it/s, loss=0.115, v_num=0, train/loss_simple_step=0.359, train/loss_vlb_step=0.00157, train/loss_step=0.359, global_step=701.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1349/5971 [12:42<43:29,  1.77it/s, loss=0.126, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000787, train/loss_step=0.222, global_step=702.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1350/5971 [12:43<43:30,  1.77it/s, loss=0.126, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000787, train/loss_step=0.222, global_step=702.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1350/5971 [12:43<43:30,  1.77it/s, loss=0.0875, v_num=0, train/loss_simple_step=0.0936, train/loss_vlb_step=0.000308, train/loss_step=0.0936, global_step=702.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1351/5971 [12:43<43:30,  1.77it/s, loss=0.0875, v_num=0, train/loss_simple_step=0.0936, train/loss_vlb_step=0.000308, train/loss_step=0.0936, global_step=702.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1351/5971 [12:43<43:30,  1.77it/s, loss=0.0887, v_num=0, train/loss_simple_step=0.0376, train/loss_vlb_step=0.000137, train/loss_step=0.0376, global_step=702.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1352/5971 [12:46<43:35,  1.77it/s, loss=0.0887, v_num=0, train/loss_simple_step=0.0376, train/loss_vlb_step=0.000137, train/loss_step=0.0376, global_step=702.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1352/5971 [12:46<43:35,  1.77it/s, loss=0.0722, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=702.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1353/5971 [12:47<43:36,  1.77it/s, loss=0.0722, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=702.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1353/5971 [12:47<43:36,  1.77it/s, loss=0.0743, v_num=0, train/loss_simple_step=0.0611, train/loss_vlb_step=0.000212, train/loss_step=0.0611, global_step=703.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1354/5971 [12:47<43:36,  1.76it/s, loss=0.0743, v_num=0, train/loss_simple_step=0.0611, train/loss_vlb_step=0.000212, train/loss_step=0.0611, global_step=703.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1354/5971 [12:47<43:36,  1.76it/s, loss=0.0745, v_num=0, train/loss_simple_step=0.0123, train/loss_vlb_step=5.72e-5, train/loss_step=0.0123, global_step=703.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  23%|██▎       | 1355/5971 [12:48<43:37,  1.76it/s, loss=0.0745, v_num=0, train/loss_simple_step=0.0123, train/loss_vlb_step=5.72e-5, train/loss_step=0.0123, global_step=703.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1355/5971 [12:48<43:37,  1.76it/s, loss=0.0864, v_num=0, train/loss_simple_step=0.258, train/loss_vlb_step=0.00101, train/loss_step=0.258, global_step=703.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  23%|██▎       | 1356/5971 [12:51<43:42,  1.76it/s, loss=0.0864, v_num=0, train/loss_simple_step=0.258, train/loss_vlb_step=0.00101, train/loss_step=0.258, global_step=703.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1356/5971 [12:51<43:42,  1.76it/s, loss=0.072, v_num=0, train/loss_simple_step=0.0055, train/loss_vlb_step=2.88e-5, train/loss_step=0.0055, global_step=703.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1357/5971 [12:51<43:42,  1.76it/s, loss=0.072, v_num=0, train/loss_simple_step=0.0055, train/loss_vlb_step=2.88e-5, train/loss_step=0.0055, global_step=703.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1357/5971 [12:51<43:42,  1.76it/s, loss=0.0663, v_num=0, train/loss_simple_step=0.00676, train/loss_vlb_step=3.38e-5, train/loss_step=0.00676, global_step=704.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1358/5971 [12:52<43:43,  1.76it/s, loss=0.0663, v_num=0, train/loss_simple_step=0.00676, train/loss_vlb_step=3.38e-5, train/loss_step=0.00676, global_step=704.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1358/5971 [12:52<43:43,  1.76it/s, loss=0.066, v_num=0, train/loss_simple_step=0.00269, train/loss_vlb_step=1.51e-5, train/loss_step=0.00269, global_step=704.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  23%|██▎       | 1359/5971 [12:53<43:43,  1.76it/s, loss=0.066, v_num=0, train/loss_simple_step=0.00269, train/loss_vlb_step=1.51e-5, train/loss_step=0.00269, global_step=704.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1359/5971 [12:53<43:43,  1.76it/s, loss=0.0752, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.000629, train/loss_step=0.187, global_step=704.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  23%|██▎       | 1360/5971 [12:55<43:48,  1.75it/s, loss=0.0752, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.000629, train/loss_step=0.187, global_step=704.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1360/5971 [12:55<43:48,  1.75it/s, loss=0.0761, v_num=0, train/loss_simple_step=0.0193, train/loss_vlb_step=7.98e-5, train/loss_step=0.0193, global_step=704.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1361/5971 [12:56<43:49,  1.75it/s, loss=0.0761, v_num=0, train/loss_simple_step=0.0193, train/loss_vlb_step=7.98e-5, train/loss_step=0.0193, global_step=704.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1361/5971 [12:56<43:49,  1.75it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.0995, train/loss_vlb_step=0.00033, train/loss_step=0.0995, global_step=705.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1362/5971 [12:57<43:49,  1.75it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.0995, train/loss_vlb_step=0.00033, train/loss_step=0.0995, global_step=705.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1362/5971 [12:57<43:49,  1.75it/s, loss=0.0765, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.59e-5, train/loss_step=0.0154, global_step=705.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1363/5971 [12:58<43:50,  1.75it/s, loss=0.0765, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.59e-5, train/loss_step=0.0154, global_step=705.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1363/5971 [12:58<43:50,  1.75it/s, loss=0.0748, v_num=0, train/loss_simple_step=0.0186, train/loss_vlb_step=7.69e-5, train/loss_step=0.0186, global_step=705.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1364/5971 [13:00<43:55,  1.75it/s, loss=0.0748, v_num=0, train/loss_simple_step=0.0186, train/loss_vlb_step=7.69e-5, train/loss_step=0.0186, global_step=705.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1364/5971 [13:00<43:55,  1.75it/s, loss=0.0778, v_num=0, train/loss_simple_step=0.0624, train/loss_vlb_step=0.000207, train/loss_step=0.0624, global_step=705.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1365/5971 [13:01<43:55,  1.75it/s, loss=0.0778, v_num=0, train/loss_simple_step=0.0624, train/loss_vlb_step=0.000207, train/loss_step=0.0624, global_step=705.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1365/5971 [13:01<43:55,  1.75it/s, loss=0.0777, v_num=0, train/loss_simple_step=0.00228, train/loss_vlb_step=1.36e-5, train/loss_step=0.00228, global_step=706.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1366/5971 [13:02<43:56,  1.75it/s, loss=0.0777, v_num=0, train/loss_simple_step=0.00228, train/loss_vlb_step=1.36e-5, train/loss_step=0.00228, global_step=706.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1366/5971 [13:02<43:56,  1.75it/s, loss=0.0928, v_num=0, train/loss_simple_step=0.311, train/loss_vlb_step=0.0012, train/loss_step=0.311, global_step=706.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  23%|██▎       | 1367/5971 [13:03<43:56,  1.75it/s, loss=0.0928, v_num=0, train/loss_simple_step=0.311, train/loss_vlb_step=0.0012, train/loss_step=0.311, global_step=706.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1367/5971 [13:03<43:56,  1.75it/s, loss=0.0924, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.62e-5, train/loss_step=0.0126, global_step=706.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1368/5971 [13:05<44:02,  1.74it/s, loss=0.0924, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.62e-5, train/loss_step=0.0126, global_step=706.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1368/5971 [13:05<44:02,  1.74it/s, loss=0.0767, v_num=0, train/loss_simple_step=0.0437, train/loss_vlb_step=0.000151, train/loss_step=0.0437, global_step=706.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1369/5971 [13:06<44:02,  1.74it/s, loss=0.0767, v_num=0, train/loss_simple_step=0.0437, train/loss_vlb_step=0.000151, train/loss_step=0.0437, global_step=706.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1369/5971 [13:06<44:02,  1.74it/s, loss=0.0674, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.00014, train/loss_step=0.0359, global_step=707.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  23%|██▎       | 1370/5971 [13:07<44:03,  1.74it/s, loss=0.0674, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.00014, train/loss_step=0.0359, global_step=707.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1370/5971 [13:07<44:03,  1.74it/s, loss=0.0662, v_num=0, train/loss_simple_step=0.0711, train/loss_vlb_step=0.000243, train/loss_step=0.0711, global_step=707.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1371/5971 [13:08<44:03,  1.74it/s, loss=0.0662, v_num=0, train/loss_simple_step=0.0711, train/loss_vlb_step=0.000243, train/loss_step=0.0711, global_step=707.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1371/5971 [13:08<44:03,  1.74it/s, loss=0.0645, v_num=0, train/loss_simple_step=0.00339, train/loss_vlb_step=1.83e-5, train/loss_step=0.00339, global_step=707.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1372/5971 [13:10<44:08,  1.74it/s, loss=0.0645, v_num=0, train/loss_simple_step=0.00339, train/loss_vlb_step=1.83e-5, train/loss_step=0.00339, global_step=707.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1372/5971 [13:10<44:08,  1.74it/s, loss=0.0684, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000471, train/loss_step=0.141, global_step=707.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  23%|██▎       | 1373/5971 [13:11<44:08,  1.74it/s, loss=0.0684, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000471, train/loss_step=0.141, global_step=707.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1373/5971 [13:11<44:08,  1.74it/s, loss=0.0661, v_num=0, train/loss_simple_step=0.0146, train/loss_vlb_step=6.27e-5, train/loss_step=0.0146, global_step=708.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1374/5971 [13:12<44:09,  1.74it/s, loss=0.0661, v_num=0, train/loss_simple_step=0.0146, train/loss_vlb_step=6.27e-5, train/loss_step=0.0146, global_step=708.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1374/5971 [13:12<44:09,  1.74it/s, loss=0.0951, v_num=0, train/loss_simple_step=0.592, train/loss_vlb_step=0.0102, train/loss_step=0.592, global_step=708.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  23%|██▎       | 1375/5971 [13:13<44:09,  1.73it/s, loss=0.0951, v_num=0, train/loss_simple_step=0.592, train/loss_vlb_step=0.0102, train/loss_step=0.592, global_step=708.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1375/5971 [13:13<44:09,  1.73it/s, loss=0.0823, v_num=0, train/loss_simple_step=0.00181, train/loss_vlb_step=1.09e-5, train/loss_step=0.00181, global_step=708.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1376/5971 [13:15<44:15,  1.73it/s, loss=0.0823, v_num=0, train/loss_simple_step=0.00181, train/loss_vlb_step=1.09e-5, train/loss_step=0.00181, global_step=708.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1376/5971 [13:15<44:15,  1.73it/s, loss=0.0831, v_num=0, train/loss_simple_step=0.0222, train/loss_vlb_step=8.77e-5, train/loss_step=0.0222, global_step=708.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  23%|██▎       | 1377/5971 [13:16<44:16,  1.73it/s, loss=0.0831, v_num=0, train/loss_simple_step=0.0222, train/loss_vlb_step=8.77e-5, train/loss_step=0.0222, global_step=708.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1377/5971 [13:16<44:16,  1.73it/s, loss=0.0834, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.93e-5, train/loss_step=0.0118, global_step=709.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1378/5971 [13:17<44:16,  1.73it/s, loss=0.0834, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.93e-5, train/loss_step=0.0118, global_step=709.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1378/5971 [13:17<44:16,  1.73it/s, loss=0.0834, v_num=0, train/loss_simple_step=0.00324, train/loss_vlb_step=1.79e-5, train/loss_step=0.00324, global_step=709.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1379/5971 [13:18<44:17,  1.73it/s, loss=0.0834, v_num=0, train/loss_simple_step=0.00324, train/loss_vlb_step=1.79e-5, train/loss_step=0.00324, global_step=709.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1379/5971 [13:18<44:17,  1.73it/s, loss=0.0772, v_num=0, train/loss_simple_step=0.0621, train/loss_vlb_step=0.000206, train/loss_step=0.0621, global_step=709.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  23%|██▎       | 1380/5971 [13:20<44:21,  1.72it/s, loss=0.0772, v_num=0, train/loss_simple_step=0.0621, train/loss_vlb_step=0.000206, train/loss_step=0.0621, global_step=709.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1380/5971 [13:20<44:21,  1.72it/s, loss=0.0763, v_num=0, train/loss_simple_step=0.00262, train/loss_vlb_step=1.56e-5, train/loss_step=0.00262, global_step=709.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1381/5971 [13:21<44:22,  1.72it/s, loss=0.0763, v_num=0, train/loss_simple_step=0.00262, train/loss_vlb_step=1.56e-5, train/loss_step=0.00262, global_step=709.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1381/5971 [13:21<44:22,  1.72it/s, loss=0.0884, v_num=0, train/loss_simple_step=0.342, train/loss_vlb_step=0.00177, train/loss_step=0.342, global_step=710.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  23%|██▎       | 1382/5971 [13:22<44:22,  1.72it/s, loss=0.0884, v_num=0, train/loss_simple_step=0.342, train/loss_vlb_step=0.00177, train/loss_step=0.342, global_step=710.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1382/5971 [13:22<44:22,  1.72it/s, loss=0.0931, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000362, train/loss_step=0.109, global_step=710.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1383/5971 [13:23<44:22,  1.72it/s, loss=0.0931, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000362, train/loss_step=0.109, global_step=710.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1383/5971 [13:23<44:22,  1.72it/s, loss=0.0923, v_num=0, train/loss_simple_step=0.0025, train/loss_vlb_step=1.45e-5, train/loss_step=0.0025, global_step=710.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1384/5971 [13:26<44:31,  1.72it/s, loss=0.0923, v_num=0, train/loss_simple_step=0.0025, train/loss_vlb_step=1.45e-5, train/loss_step=0.0025, global_step=710.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1384/5971 [13:26<44:31,  1.72it/s, loss=0.0894, v_num=0, train/loss_simple_step=0.00405, train/loss_vlb_step=1.95e-5, train/loss_step=0.00405, global_step=710.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1385/5971 [13:27<44:32,  1.72it/s, loss=0.0894, v_num=0, train/loss_simple_step=0.00405, train/loss_vlb_step=1.95e-5, train/loss_step=0.00405, global_step=710.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1385/5971 [13:27<44:32,  1.72it/s, loss=0.108, v_num=0, train/loss_simple_step=0.372, train/loss_vlb_step=0.00285, train/loss_step=0.372, global_step=711.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  23%|██▎       | 1386/5971 [13:28<44:32,  1.72it/s, loss=0.108, v_num=0, train/loss_simple_step=0.372, train/loss_vlb_step=0.00285, train/loss_step=0.372, global_step=711.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1386/5971 [13:28<44:32,  1.72it/s, loss=0.0932, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=6.5e-5, train/loss_step=0.0156, global_step=711.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1387/5971 [13:29<44:33,  1.71it/s, loss=0.0932, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=6.5e-5, train/loss_step=0.0156, global_step=711.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1387/5971 [13:29<44:33,  1.71it/s, loss=0.0984, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000386, train/loss_step=0.117, global_step=711.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1388/5971 [13:31<44:39,  1.71it/s, loss=0.0984, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000386, train/loss_step=0.117, global_step=711.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1388/5971 [13:31<44:39,  1.71it/s, loss=0.0966, v_num=0, train/loss_simple_step=0.00759, train/loss_vlb_step=3.51e-5, train/loss_step=0.00759, global_step=711.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1389/5971 [13:32<44:39,  1.71it/s, loss=0.0966, v_num=0, train/loss_simple_step=0.00759, train/loss_vlb_step=3.51e-5, train/loss_step=0.00759, global_step=711.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1389/5971 [13:32<44:39,  1.71it/s, loss=0.113, v_num=0, train/loss_simple_step=0.371, train/loss_vlb_step=0.00206, train/loss_step=0.371, global_step=712.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  23%|██▎       | 1390/5971 [13:33<44:39,  1.71it/s, loss=0.113, v_num=0, train/loss_simple_step=0.371, train/loss_vlb_step=0.00206, train/loss_step=0.371, global_step=712.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1390/5971 [13:33<44:39,  1.71it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00291, train/loss_vlb_step=1.58e-5, train/loss_step=0.00291, global_step=712.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1391/5971 [13:34<44:40,  1.71it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00291, train/loss_vlb_step=1.58e-5, train/loss_step=0.00291, global_step=712.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1391/5971 [13:34<44:40,  1.71it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0216, train/loss_vlb_step=8.33e-5, train/loss_step=0.0216, global_step=712.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  23%|██▎       | 1392/5971 [13:36<44:44,  1.71it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0216, train/loss_vlb_step=8.33e-5, train/loss_step=0.0216, global_step=712.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1392/5971 [13:36<44:44,  1.71it/s, loss=0.11, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.00041, train/loss_step=0.125, global_step=712.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  23%|██▎       | 1393/5971 [13:37<44:45,  1.70it/s, loss=0.11, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.00041, train/loss_step=0.125, global_step=712.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1393/5971 [13:37<44:45,  1.70it/s, loss=0.11, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.32e-5, train/loss_step=0.017, global_step=713.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1394/5971 [13:38<44:45,  1.70it/s, loss=0.11, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.32e-5, train/loss_step=0.017, global_step=713.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1394/5971 [13:38<44:45,  1.70it/s, loss=0.0902, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000689, train/loss_step=0.192, global_step=713.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1395/5971 [13:39<44:45,  1.70it/s, loss=0.0902, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000689, train/loss_step=0.192, global_step=713.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1395/5971 [13:39<44:45,  1.70it/s, loss=0.0907, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.42e-5, train/loss_step=0.0115, global_step=713.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1396/5971 [13:41<44:50,  1.70it/s, loss=0.0907, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.42e-5, train/loss_step=0.0115, global_step=713.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1396/5971 [13:41<44:50,  1.70it/s, loss=0.095, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000361, train/loss_step=0.109, global_step=713.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  23%|██▎       | 1397/5971 [13:42<44:50,  1.70it/s, loss=0.095, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000361, train/loss_step=0.109, global_step=713.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1397/5971 [13:42<44:50,  1.70it/s, loss=0.0991, v_num=0, train/loss_simple_step=0.0943, train/loss_vlb_step=0.000313, train/loss_step=0.0943, global_step=714.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1398/5971 [13:43<44:51,  1.70it/s, loss=0.0991, v_num=0, train/loss_simple_step=0.0943, train/loss_vlb_step=0.000313, train/loss_step=0.0943, global_step=714.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1398/5971 [13:43<44:51,  1.70it/s, loss=0.105, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000412, train/loss_step=0.121, global_step=714.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  23%|██▎       | 1399/5971 [13:44<44:51,  1.70it/s, loss=0.105, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000412, train/loss_step=0.121, global_step=714.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1399/5971 [13:44<44:51,  1.70it/s, loss=0.125, v_num=0, train/loss_simple_step=0.455, train/loss_vlb_step=0.00382, train/loss_step=0.455, global_step=714.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  23%|██▎       | 1400/5971 [13:46<44:56,  1.70it/s, loss=0.125, v_num=0, train/loss_simple_step=0.455, train/loss_vlb_step=0.00382, train/loss_step=0.455, global_step=714.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1400/5971 [13:46<44:56,  1.70it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0614, train/loss_vlb_step=0.000206, train/loss_step=0.0614, global_step=714.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1401/5971 [13:47<44:56,  1.69it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0614, train/loss_vlb_step=0.000206, train/loss_step=0.0614, global_step=714.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1401/5971 [13:47<44:56,  1.69it/s, loss=0.149, v_num=0, train/loss_simple_step=0.772, train/loss_vlb_step=0.0565, train/loss_step=0.772, global_step=715.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  23%|██▎       | 1402/5971 [13:48<44:56,  1.69it/s, loss=0.149, v_num=0, train/loss_simple_step=0.772, train/loss_vlb_step=0.0565, train/loss_step=0.772, global_step=715.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1402/5971 [13:48<44:56,  1.69it/s, loss=0.149, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000347, train/loss_step=0.106, global_step=715.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1403/5971 [13:49<44:57,  1.69it/s, loss=0.149, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000347, train/loss_step=0.106, global_step=715.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  23%|██▎       | 1403/5971 [13:49<44:57,  1.69it/s, loss=0.164, v_num=0, train/loss_simple_step=0.299, train/loss_vlb_step=0.00116, train/loss_step=0.299, global_step=715.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  24%|██▎       | 1404/5971 [13:51<45:01,  1.69it/s, loss=0.164, v_num=0, train/loss_simple_step=0.299, train/loss_vlb_step=0.00116, train/loss_step=0.299, global_step=715.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▎       | 1404/5971 [13:51<45:01,  1.69it/s, loss=0.185, v_num=0, train/loss_simple_step=0.429, train/loss_vlb_step=0.00238, train/loss_step=0.429, global_step=715.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▎       | 1405/5971 [13:52<45:02,  1.69it/s, loss=0.185, v_num=0, train/loss_simple_step=0.429, train/loss_vlb_step=0.00238, train/loss_step=0.429, global_step=715.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▎       | 1405/5971 [13:52<45:02,  1.69it/s, loss=0.195, v_num=0, train/loss_simple_step=0.570, train/loss_vlb_step=0.00576, train/loss_step=0.570, global_step=716.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▎       | 1406/5971 [13:52<45:02,  1.69it/s, loss=0.195, v_num=0, train/loss_simple_step=0.570, train/loss_vlb_step=0.00576, train/loss_step=0.570, global_step=716.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▎       | 1406/5971 [13:52<45:02,  1.69it/s, loss=0.194, v_num=0, train/loss_simple_step=0.00424, train/loss_vlb_step=2.27e-5, train/loss_step=0.00424, global_step=716.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▎       | 1407/5971 [13:53<45:02,  1.69it/s, loss=0.194, v_num=0, train/loss_simple_step=0.00424, train/loss_vlb_step=2.27e-5, train/loss_step=0.00424, global_step=716.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▎       | 1407/5971 [13:53<45:02,  1.69it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=7.55e-5, train/loss_step=0.0173, global_step=716.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  24%|██▎       | 1408/5971 [13:56<45:07,  1.69it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=7.55e-5, train/loss_step=0.0173, global_step=716.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▎       | 1408/5971 [13:56<45:07,  1.69it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=6.7e-5, train/loss_step=0.0156, global_step=716.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  24%|██▎       | 1409/5971 [13:57<45:08,  1.68it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=6.7e-5, train/loss_step=0.0156, global_step=716.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▎       | 1409/5971 [13:57<45:08,  1.68it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0601, train/loss_vlb_step=0.000218, train/loss_step=0.0601, global_step=717.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▎       | 1410/5971 [13:57<45:08,  1.68it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0601, train/loss_vlb_step=0.000218, train/loss_step=0.0601, global_step=717.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▎       | 1410/5971 [13:57<45:08,  1.68it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0644, train/loss_vlb_step=0.000217, train/loss_step=0.0644, global_step=717.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▎       | 1411/5971 [13:58<45:08,  1.68it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0644, train/loss_vlb_step=0.000217, train/loss_step=0.0644, global_step=717.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▎       | 1411/5971 [13:58<45:08,  1.68it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0694, train/loss_vlb_step=0.000228, train/loss_step=0.0694, global_step=717.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  24%|██▎       | 1412/5971 [14:01<45:14,  1.68it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0694, train/loss_vlb_step=0.000228, train/loss_step=0.0694, global_step=717.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▎       | 1412/5971 [14:01<45:14,  1.68it/s, loss=0.193, v_num=0, train/loss_simple_step=0.392, train/loss_vlb_step=0.00212, train/loss_step=0.392, global_step=717.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  24%|██▎       | 1413/5971 [14:02<45:15,  1.68it/s, loss=0.193, v_num=0, train/loss_simple_step=0.392, train/loss_vlb_step=0.00212, train/loss_step=0.392, global_step=717.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▎       | 1413/5971 [14:02<45:15,  1.68it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0636, train/loss_vlb_step=0.000216, train/loss_step=0.0636, global_step=718.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▎       | 1414/5971 [14:03<45:15,  1.68it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0636, train/loss_vlb_step=0.000216, train/loss_step=0.0636, global_step=718.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▎       | 1414/5971 [14:03<45:15,  1.68it/s, loss=0.197, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.000829, train/loss_step=0.230, global_step=718.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  24%|██▎       | 1415/5971 [14:03<45:15,  1.68it/s, loss=0.197, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.000829, train/loss_step=0.230, global_step=718.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▎       | 1415/5971 [14:03<45:15,  1.68it/s, loss=0.209, v_num=0, train/loss_simple_step=0.244, train/loss_vlb_step=0.00095, train/loss_step=0.244, global_step=718.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  24%|██▎       | 1416/5971 [14:06<45:21,  1.67it/s, loss=0.209, v_num=0, train/loss_simple_step=0.244, train/loss_vlb_step=0.00095, train/loss_step=0.244, global_step=718.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▎       | 1416/5971 [14:06<45:21,  1.67it/s, loss=0.24, v_num=0, train/loss_simple_step=0.742, train/loss_vlb_step=0.0298, train/loss_step=0.742, global_step=718.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  24%|██▎       | 1417/5971 [14:07<45:22,  1.67it/s, loss=0.24, v_num=0, train/loss_simple_step=0.742, train/loss_vlb_step=0.0298, train/loss_step=0.742, global_step=718.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▎       | 1417/5971 [14:07<45:22,  1.67it/s, loss=0.257, v_num=0, train/loss_simple_step=0.432, train/loss_vlb_step=0.00225, train/loss_step=0.432, global_step=719.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▎       | 1418/5971 [14:08<45:22,  1.67it/s, loss=0.257, v_num=0, train/loss_simple_step=0.432, train/loss_vlb_step=0.00225, train/loss_step=0.432, global_step=719.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▎       | 1418/5971 [14:08<45:22,  1.67it/s, loss=0.252, v_num=0, train/loss_simple_step=0.0129, train/loss_vlb_step=5.58e-5, train/loss_step=0.0129, global_step=719.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1419/5971 [14:09<45:22,  1.67it/s, loss=0.252, v_num=0, train/loss_simple_step=0.0129, train/loss_vlb_step=5.58e-5, train/loss_step=0.0129, global_step=719.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1419/5971 [14:09<45:22,  1.67it/s, loss=0.229, v_num=0, train/loss_simple_step=0.00314, train/loss_vlb_step=1.76e-5, train/loss_step=0.00314, global_step=719.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1420/5971 [14:11<45:27,  1.67it/s, loss=0.229, v_num=0, train/loss_simple_step=0.00314, train/loss_vlb_step=1.76e-5, train/loss_step=0.00314, global_step=719.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1420/5971 [14:11<45:27,  1.67it/s, loss=0.231, v_num=0, train/loss_simple_step=0.0968, train/loss_vlb_step=0.000337, train/loss_step=0.0968, global_step=719.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  24%|██▍       | 1421/5971 [14:12<45:27,  1.67it/s, loss=0.231, v_num=0, train/loss_simple_step=0.0968, train/loss_vlb_step=0.000337, train/loss_step=0.0968, global_step=719.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1421/5971 [14:12<45:27,  1.67it/s, loss=0.201, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000595, train/loss_step=0.173, global_step=720.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  24%|██▍       | 1422/5971 [14:13<45:27,  1.67it/s, loss=0.201, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000595, train/loss_step=0.173, global_step=720.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1422/5971 [14:13<45:27,  1.67it/s, loss=0.203, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000468, train/loss_step=0.142, global_step=720.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1423/5971 [14:14<45:27,  1.67it/s, loss=0.203, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000468, train/loss_step=0.142, global_step=720.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1423/5971 [14:14<45:28,  1.67it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.12e-5, train/loss_step=0.0128, global_step=720.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1424/5971 [14:16<45:32,  1.66it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.12e-5, train/loss_step=0.0128, global_step=720.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1424/5971 [14:16<45:32,  1.66it/s, loss=0.181, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00112, train/loss_step=0.269, global_step=720.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  24%|██▍       | 1425/5971 [14:17<45:32,  1.66it/s, loss=0.181, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00112, train/loss_step=0.269, global_step=720.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1425/5971 [14:17<45:32,  1.66it/s, loss=0.155, v_num=0, train/loss_simple_step=0.051, train/loss_vlb_step=0.000176, train/loss_step=0.051, global_step=721.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1426/5971 [14:18<45:32,  1.66it/s, loss=0.155, v_num=0, train/loss_simple_step=0.051, train/loss_vlb_step=0.000176, train/loss_step=0.051, global_step=721.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1426/5971 [14:18<45:32,  1.66it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00858, train/loss_vlb_step=4.08e-5, train/loss_step=0.00858, global_step=721.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1427/5971 [14:18<45:33,  1.66it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00858, train/loss_vlb_step=4.08e-5, train/loss_step=0.00858, global_step=721.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1427/5971 [14:18<45:33,  1.66it/s, loss=0.162, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000524, train/loss_step=0.151, global_step=721.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  24%|██▍       | 1428/5971 [14:21<45:38,  1.66it/s, loss=0.162, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000524, train/loss_step=0.151, global_step=721.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1428/5971 [14:21<45:38,  1.66it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0544, train/loss_vlb_step=0.000186, train/loss_step=0.0544, global_step=721.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1429/5971 [14:22<45:38,  1.66it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0544, train/loss_vlb_step=0.000186, train/loss_step=0.0544, global_step=721.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1429/5971 [14:22<45:38,  1.66it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0277, train/loss_vlb_step=0.000104, train/loss_step=0.0277, global_step=722.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1430/5971 [14:23<45:38,  1.66it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0277, train/loss_vlb_step=0.000104, train/loss_step=0.0277, global_step=722.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1430/5971 [14:23<45:38,  1.66it/s, loss=0.168, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000671, train/loss_step=0.194, global_step=722.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  24%|██▍       | 1431/5971 [14:23<45:38,  1.66it/s, loss=0.168, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000671, train/loss_step=0.194, global_step=722.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1431/5971 [14:23<45:38,  1.66it/s, loss=0.173, v_num=0, train/loss_simple_step=0.159, train/loss_vlb_step=0.00055, train/loss_step=0.159, global_step=722.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  24%|██▍       | 1432/5971 [14:26<45:44,  1.65it/s, loss=0.173, v_num=0, train/loss_simple_step=0.159, train/loss_vlb_step=0.00055, train/loss_step=0.159, global_step=722.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1432/5971 [14:26<45:44,  1.65it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0158, train/loss_vlb_step=6.66e-5, train/loss_step=0.0158, global_step=722.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1433/5971 [14:27<45:44,  1.65it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0158, train/loss_vlb_step=6.66e-5, train/loss_step=0.0158, global_step=722.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1433/5971 [14:27<45:44,  1.65it/s, loss=0.164, v_num=0, train/loss_simple_step=0.258, train/loss_vlb_step=0.00105, train/loss_step=0.258, global_step=723.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  24%|██▍       | 1434/5971 [14:28<45:45,  1.65it/s, loss=0.164, v_num=0, train/loss_simple_step=0.258, train/loss_vlb_step=0.00105, train/loss_step=0.258, global_step=723.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1434/5971 [14:28<45:45,  1.65it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0273, train/loss_vlb_step=0.000115, train/loss_step=0.0273, global_step=723.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1435/5971 [14:29<45:45,  1.65it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0273, train/loss_vlb_step=0.000115, train/loss_step=0.0273, global_step=723.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1435/5971 [14:29<45:45,  1.65it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0524, train/loss_vlb_step=0.000178, train/loss_step=0.0524, global_step=723.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1436/5971 [14:31<45:50,  1.65it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0524, train/loss_vlb_step=0.000178, train/loss_step=0.0524, global_step=723.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1436/5971 [14:31<45:50,  1.65it/s, loss=0.119, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000954, train/loss_step=0.240, global_step=723.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  24%|██▍       | 1437/5971 [14:32<45:50,  1.65it/s, loss=0.119, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000954, train/loss_step=0.240, global_step=723.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1437/5971 [14:32<45:50,  1.65it/s, loss=0.0977, v_num=0, train/loss_simple_step=0.00472, train/loss_vlb_step=2.36e-5, train/loss_step=0.00472, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1438/5971 [14:33<45:50,  1.65it/s, loss=0.0977, v_num=0, train/loss_simple_step=0.00472, train/loss_vlb_step=2.36e-5, train/loss_step=0.00472, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1438/5971 [14:33<45:50,  1.65it/s, loss=0.0985, v_num=0, train/loss_simple_step=0.0298, train/loss_vlb_step=0.000112, train/loss_step=0.0298, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  24%|██▍       | 1439/5971 [14:34<45:51,  1.65it/s, loss=0.0985, v_num=0, train/loss_simple_step=0.0298, train/loss_vlb_step=0.000112, train/loss_step=0.0298, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1439/5971 [14:34<45:51,  1.65it/s, loss=0.126, v_num=0, train/loss_simple_step=0.552, train/loss_vlb_step=0.00764, train/loss_step=0.552, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  24%|██▍       | 1440/5971 [14:36<45:55,  1.64it/s, loss=0.126, v_num=0, train/loss_simple_step=0.552, train/loss_vlb_step=0.00764, train/loss_step=0.552, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  24%|██▍       | 1440/5971 [14:36<45:55,  1.64it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:24,  1.96it/s]
Epoch 1:  24%|██▍       | 1442/5971 [14:36<45:52,  1.65it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   2%|▏         | 3/167 [00:00<00:30,  5.46it/s]
Epoch 1:  24%|██▍       | 1444/5971 [14:37<45:47,  1.65it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   4%|▎         | 6/167 [00:00<00:15, 10.50it/s]
Epoch 1:  24%|██▍       | 1447/5971 [14:37<45:40,  1.65it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   5%|▌         | 9/167 [00:00<00:10, 14.63it/s]
Epoch 1:  24%|██▍       | 1450/5971 [14:37<45:33,  1.65it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   7%|▋         | 12/167 [00:00<00:08, 17.94it/s]
Epoch 1:  24%|██▍       | 1453/5971 [14:37<45:26,  1.66it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  10%|▉         | 16/167 [00:01<00:07, 21.37it/s]
Epoch 1:  24%|██▍       | 1457/5971 [14:37<45:16,  1.66it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  11%|█▏        | 19/167 [00:01<00:06, 21.92it/s]
Epoch 1:  24%|██▍       | 1461/5971 [14:37<45:07,  1.67it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  14%|█▍        | 23/167 [00:01<00:05, 24.31it/s]
Epoch 1:  25%|██▍       | 1465/5971 [14:37<44:58,  1.67it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.02it/s]
Epoch 1:  25%|██▍       | 1469/5971 [14:37<44:48,  1.67it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 24.64it/s]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 25.96it/s]
Epoch 1:  25%|██▍       | 1473/5971 [14:38<44:39,  1.68it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  22%|██▏       | 36/167 [00:01<00:04, 27.44it/s]
Epoch 1:  25%|██▍       | 1477/5971 [14:38<44:30,  1.68it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  23%|██▎       | 39/167 [00:01<00:04, 26.69it/s]
Epoch 1:  25%|██▍       | 1481/5971 [14:38<44:21,  1.69it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 25.77it/s]
Epoch 1:  25%|██▍       | 1485/5971 [14:38<44:12,  1.69it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 25.86it/s]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 24.64it/s]
Epoch 1:  25%|██▍       | 1489/5971 [14:38<44:03,  1.70it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  31%|███       | 51/167 [00:02<00:04, 25.70it/s]
Epoch 1:  25%|██▌       | 1493/5971 [14:38<43:54,  1.70it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 25.80it/s]
Epoch 1:  25%|██▌       | 1497/5971 [14:39<43:45,  1.70it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 26.18it/s]

Validating:  36%|███▌      | 60/167 [00:02<00:04, 26.06it/s]
Epoch 1:  25%|██▌       | 1501/5971 [14:39<43:36,  1.71it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  38%|███▊      | 63/167 [00:02<00:03, 26.05it/s]
Epoch 1:  25%|██▌       | 1505/5971 [14:39<43:27,  1.71it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  40%|████      | 67/167 [00:03<00:03, 26.86it/s]
Epoch 1:  25%|██▌       | 1509/5971 [14:39<43:18,  1.72it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 26.89it/s]
Epoch 1:  25%|██▌       | 1513/5971 [14:39<43:10,  1.72it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 28.01it/s]
Epoch 1:  25%|██▌       | 1517/5971 [14:39<43:01,  1.73it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 26.36it/s]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 26.95it/s]
Epoch 1:  25%|██▌       | 1521/5971 [14:39<42:52,  1.73it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 26.99it/s]
Epoch 1:  26%|██▌       | 1525/5971 [14:40<42:44,  1.73it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  51%|█████▏    | 86/167 [00:03<00:03, 26.37it/s]
Epoch 1:  26%|██▌       | 1529/5971 [14:40<42:35,  1.74it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  53%|█████▎    | 89/167 [00:03<00:02, 26.85it/s]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 25.54it/s]
Epoch 1:  26%|██▌       | 1533/5971 [14:40<42:27,  1.74it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 25.56it/s]
Epoch 1:  26%|██▌       | 1537/5971 [14:40<42:18,  1.75it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 25.84it/s]
Epoch 1:  26%|██▌       | 1541/5971 [14:40<42:10,  1.75it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  60%|██████    | 101/167 [00:04<00:02, 26.21it/s]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 25.96it/s]
Epoch 1:  26%|██▌       | 1545/5971 [14:40<42:01,  1.76it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 26.52it/s]
Epoch 1:  26%|██▌       | 1549/5971 [14:41<41:53,  1.76it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 27.30it/s]
Epoch 1:  26%|██▌       | 1553/5971 [14:41<41:45,  1.76it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  68%|██████▊   | 113/167 [00:04<00:01, 27.55it/s]

Validating:  69%|██████▉   | 116/167 [00:04<00:01, 27.63it/s]
Epoch 1:  26%|██▌       | 1557/5971 [14:41<41:36,  1.77it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 26.99it/s]
Epoch 1:  26%|██▌       | 1561/5971 [14:41<41:28,  1.77it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 27.56it/s]
Epoch 1:  26%|██▌       | 1565/5971 [14:41<41:20,  1.78it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 28.36it/s]
Epoch 1:  26%|██▋       | 1569/5971 [14:41<41:12,  1.78it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 27.50it/s]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 25.54it/s]
Epoch 1:  26%|██▋       | 1573/5971 [14:41<41:04,  1.78it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  81%|████████  | 135/167 [00:05<00:01, 26.40it/s]
Epoch 1:  26%|██▋       | 1577/5971 [14:42<40:56,  1.79it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 27.09it/s]
Epoch 1:  26%|██▋       | 1581/5971 [14:42<40:48,  1.79it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  84%|████████▍ | 141/167 [00:05<00:00, 26.64it/s]

Validating:  86%|████████▌ | 144/167 [00:05<00:00, 26.95it/s]
Epoch 1:  27%|██▋       | 1585/5971 [14:42<40:40,  1.80it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 25.36it/s]
Epoch 1:  27%|██▋       | 1589/5971 [14:42<40:32,  1.80it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 23.37it/s]
Epoch 1:  27%|██▋       | 1593/5971 [14:42<40:24,  1.81it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 24.20it/s]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 24.25it/s]
Epoch 1:  27%|██▋       | 1597/5971 [14:42<40:16,  1.81it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 25.10it/s]
Epoch 1:  27%|██▋       | 1601/5971 [14:43<40:08,  1.81it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 25.42it/s]
Epoch 1:  27%|██▋       | 1605/5971 [14:43<40:00,  1.82it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  99%|█████████▉| 165/167 [00:06<00:00, 25.74it/s]
Epoch 1:  27%|██▋       | 1608/5971 [14:43<39:56,  1.82it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Epoch 1:  27%|██▋       | 1609/5971 [14:44<39:57,  1.82it/s, loss=0.138, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00221, train/loss_step=0.345, global_step=724.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  27%|██▋       | 1609/5971 [14:44<39:57,  1.82it/s, loss=0.142, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.000886, train/loss_step=0.249, global_step=725.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  27%|██▋       | 1610/5971 [14:45<39:57,  1.82it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00973, train/loss_vlb_step=4.53e-5, train/loss_step=0.00973, global_step=725.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  27%|██▋       | 1611/5971 [14:46<39:58,  1.82it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00261, train/loss_vlb_step=1.48e-5, train/loss_step=0.00261, global_step=725.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  27%|██▋       | 1612/5971 [14:48<40:02,  1.81it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00208, train/loss_vlb_step=1.26e-5, train/loss_step=0.00208, global_step=725.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  27%|██▋       | 1613/5971 [14:49<40:02,  1.81it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00208, train/loss_vlb_step=1.26e-5, train/loss_step=0.00208, global_step=725.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  27%|██▋       | 1613/5971 [14:49<40:02,  1.81it/s, loss=0.141, v_num=0, train/loss_simple_step=0.435, train/loss_vlb_step=0.00382, train/loss_step=0.435, global_step=726.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  27%|██▋       | 1614/5971 [14:50<40:02,  1.81it/s, loss=0.184, v_num=0, train/loss_simple_step=0.863, train/loss_vlb_step=0.0447, train/loss_step=0.863, global_step=726.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  27%|██▋       | 1615/5971 [14:51<40:03,  1.81it/s, loss=0.185, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000603, train/loss_step=0.172, global_step=726.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  27%|██▋       | 1616/5971 [14:53<40:06,  1.81it/s, loss=0.189, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000485, train/loss_step=0.143, global_step=726.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  27%|██▋       | 1617/5971 [14:54<40:07,  1.81it/s, loss=0.189, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000485, train/loss_step=0.143, global_step=726.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  27%|██▋       | 1617/5971 [14:54<40:07,  1.81it/s, loss=0.195, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000485, train/loss_step=0.142, global_step=727.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  27%|██▋       | 1618/5971 [14:55<40:07,  1.81it/s, loss=0.206, v_num=0, train/loss_simple_step=0.409, train/loss_vlb_step=0.00261, train/loss_step=0.409, global_step=727.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  27%|██▋       | 1619/5971 [14:56<40:07,  1.81it/s, loss=0.198, v_num=0, train/loss_simple_step=0.00171, train/loss_vlb_step=1.04e-5, train/loss_step=0.00171, global_step=727.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  27%|██▋       | 1620/5971 [14:58<40:11,  1.80it/s, loss=0.207, v_num=0, train/loss_simple_step=0.199, train/loss_vlb_step=0.000757, train/loss_step=0.199, global_step=727.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  27%|██▋       | 1621/5971 [14:59<40:12,  1.80it/s, loss=0.207, v_num=0, train/loss_simple_step=0.199, train/loss_vlb_step=0.000757, train/loss_step=0.199, global_step=727.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  27%|██▋       | 1621/5971 [14:59<40:12,  1.80it/s, loss=0.2, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000413, train/loss_step=0.126, global_step=728.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  27%|██▋       | 1622/5971 [15:00<40:12,  1.80it/s, loss=0.231, v_num=0, train/loss_simple_step=0.648, train/loss_vlb_step=0.00754, train/loss_step=0.648, global_step=728.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  27%|██▋       | 1623/5971 [15:01<40:12,  1.80it/s, loss=0.229, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=4.95e-5, train/loss_step=0.0114, global_step=728.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  27%|██▋       | 1624/5971 [15:03<40:17,  1.80it/s, loss=0.242, v_num=0, train/loss_simple_step=0.486, train/loss_vlb_step=0.00477, train/loss_step=0.486, global_step=728.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  27%|██▋       | 1625/5971 [15:04<40:17,  1.80it/s, loss=0.242, v_num=0, train/loss_simple_step=0.486, train/loss_vlb_step=0.00477, train/loss_step=0.486, global_step=728.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  27%|██▋       | 1625/5971 [15:04<40:17,  1.80it/s, loss=0.249, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000517, train/loss_step=0.147, global_step=729.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  27%|██▋       | 1626/5971 [15:05<40:17,  1.80it/s, loss=0.282, v_num=0, train/loss_simple_step=0.693, train/loss_vlb_step=0.0108, train/loss_step=0.693, global_step=729.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  27%|██▋       | 1627/5971 [15:06<40:18,  1.80it/s, loss=0.26, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000413, train/loss_step=0.125, global_step=729.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  27%|██▋       | 1628/5971 [15:08<40:22,  1.79it/s, loss=0.247, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.000257, train/loss_step=0.0771, global_step=729.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  27%|██▋       | 1629/5971 [15:09<40:22,  1.79it/s, loss=0.247, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.000257, train/loss_step=0.0771, global_step=729.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  27%|██▋       | 1629/5971 [15:09<40:22,  1.79it/s, loss=0.239, v_num=0, train/loss_simple_step=0.0816, train/loss_vlb_step=0.000271, train/loss_step=0.0816, global_step=730.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  27%|██▋       | 1630/5971 [15:10<40:22,  1.79it/s, loss=0.241, v_num=0, train/loss_simple_step=0.0471, train/loss_vlb_step=0.000172, train/loss_step=0.0471, global_step=730.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  27%|██▋       | 1631/5971 [15:11<40:23,  1.79it/s, loss=0.245, v_num=0, train/loss_simple_step=0.0968, train/loss_vlb_step=0.000323, train/loss_step=0.0968, global_step=730.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  27%|██▋       | 1632/5971 [15:13<40:27,  1.79it/s, loss=0.252, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.000528, train/loss_step=0.146, global_step=730.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  27%|██▋       | 1633/5971 [15:14<40:27,  1.79it/s, loss=0.252, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.000528, train/loss_step=0.146, global_step=730.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  27%|██▋       | 1633/5971 [15:14<40:27,  1.79it/s, loss=0.232, v_num=0, train/loss_simple_step=0.0289, train/loss_vlb_step=0.000114, train/loss_step=0.0289, global_step=731.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  27%|██▋       | 1634/5971 [15:15<40:27,  1.79it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0267, train/loss_vlb_step=9.83e-5, train/loss_step=0.0267, global_step=731.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  27%|██▋       | 1635/5971 [15:16<40:28,  1.79it/s, loss=0.202, v_num=0, train/loss_simple_step=0.397, train/loss_vlb_step=0.00236, train/loss_step=0.397, global_step=731.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  27%|██▋       | 1636/5971 [15:18<40:31,  1.78it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0733, train/loss_vlb_step=0.000249, train/loss_step=0.0733, global_step=731.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  27%|██▋       | 1637/5971 [15:19<40:32,  1.78it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0733, train/loss_vlb_step=0.000249, train/loss_step=0.0733, global_step=731.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  27%|██▋       | 1637/5971 [15:19<40:32,  1.78it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0635, train/loss_vlb_step=0.000214, train/loss_step=0.0635, global_step=732.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  27%|██▋       | 1638/5971 [15:20<40:32,  1.78it/s, loss=0.185, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000931, train/loss_step=0.227, global_step=732.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  27%|██▋       | 1639/5971 [15:21<40:32,  1.78it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0309, train/loss_vlb_step=0.000117, train/loss_step=0.0309, global_step=732.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  27%|██▋       | 1640/5971 [15:23<40:36,  1.78it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0368, train/loss_vlb_step=0.000129, train/loss_step=0.0368, global_step=732.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  27%|██▋       | 1641/5971 [15:24<40:36,  1.78it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0368, train/loss_vlb_step=0.000129, train/loss_step=0.0368, global_step=732.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  27%|██▋       | 1641/5971 [15:24<40:36,  1.78it/s, loss=0.181, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.000589, train/loss_step=0.171, global_step=733.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  27%|██▋       | 1642/5971 [15:24<40:37,  1.78it/s, loss=0.161, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.00104, train/loss_step=0.250, global_step=733.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  28%|██▊       | 1643/5971 [15:25<40:37,  1.78it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0107, train/loss_vlb_step=4.63e-5, train/loss_step=0.0107, global_step=733.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1644/5971 [15:27<40:40,  1.77it/s, loss=0.146, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000714, train/loss_step=0.193, global_step=733.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  28%|██▊       | 1645/5971 [15:28<40:41,  1.77it/s, loss=0.146, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000714, train/loss_step=0.193, global_step=733.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1645/5971 [15:28<40:41,  1.77it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00471, train/loss_vlb_step=2.37e-5, train/loss_step=0.00471, global_step=734.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1646/5971 [15:29<40:41,  1.77it/s, loss=0.114, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.00068, train/loss_step=0.186, global_step=734.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  28%|██▊       | 1647/5971 [15:30<40:41,  1.77it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00621, train/loss_vlb_step=3.2e-5, train/loss_step=0.00621, global_step=734.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1648/5971 [15:32<40:45,  1.77it/s, loss=0.148, v_num=0, train/loss_simple_step=0.877, train/loss_vlb_step=0.0894, train/loss_step=0.877, global_step=734.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  28%|██▊       | 1649/5971 [15:33<40:45,  1.77it/s, loss=0.148, v_num=0, train/loss_simple_step=0.877, train/loss_vlb_step=0.0894, train/loss_step=0.877, global_step=734.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1649/5971 [15:33<40:45,  1.77it/s, loss=0.154, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000678, train/loss_step=0.203, global_step=735.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1650/5971 [15:34<40:45,  1.77it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00559, train/loss_vlb_step=2.79e-5, train/loss_step=0.00559, global_step=735.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1651/5971 [15:35<40:46,  1.77it/s, loss=0.16, v_num=0, train/loss_simple_step=0.257, train/loss_vlb_step=0.000981, train/loss_step=0.257, global_step=735.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  28%|██▊       | 1652/5971 [15:37<40:50,  1.76it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0163, train/loss_vlb_step=6.62e-5, train/loss_step=0.0163, global_step=735.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1653/5971 [15:38<40:50,  1.76it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0163, train/loss_vlb_step=6.62e-5, train/loss_step=0.0163, global_step=735.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1653/5971 [15:38<40:50,  1.76it/s, loss=0.159, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000479, train/loss_step=0.142, global_step=736.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  28%|██▊       | 1654/5971 [15:39<40:51,  1.76it/s, loss=0.189, v_num=0, train/loss_simple_step=0.638, train/loss_vlb_step=0.00717, train/loss_step=0.638, global_step=736.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  28%|██▊       | 1655/5971 [15:40<40:51,  1.76it/s, loss=0.17, v_num=0, train/loss_simple_step=0.003, train/loss_vlb_step=1.68e-5, train/loss_step=0.003, global_step=736.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  28%|██▊       | 1656/5971 [15:42<40:54,  1.76it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0213, train/loss_vlb_step=8.04e-5, train/loss_step=0.0213, global_step=736.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1657/5971 [15:43<40:55,  1.76it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0213, train/loss_vlb_step=8.04e-5, train/loss_step=0.0213, global_step=736.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1657/5971 [15:43<40:55,  1.76it/s, loss=0.171, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000464, train/loss_step=0.140, global_step=737.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  28%|██▊       | 1658/5971 [15:44<40:55,  1.76it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00401, train/loss_vlb_step=2.2e-5, train/loss_step=0.00401, global_step=737.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1659/5971 [15:45<40:55,  1.76it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0301, train/loss_vlb_step=0.000107, train/loss_step=0.0301, global_step=737.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1660/5971 [15:47<40:59,  1.75it/s, loss=0.183, v_num=0, train/loss_simple_step=0.500, train/loss_vlb_step=0.00466, train/loss_step=0.500, global_step=737.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  28%|██▊       | 1661/5971 [15:48<40:59,  1.75it/s, loss=0.183, v_num=0, train/loss_simple_step=0.500, train/loss_vlb_step=0.00466, train/loss_step=0.500, global_step=737.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1661/5971 [15:48<40:59,  1.75it/s, loss=0.175, v_num=0, train/loss_simple_step=0.00828, train/loss_vlb_step=3.87e-5, train/loss_step=0.00828, global_step=738.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1662/5971 [15:49<40:59,  1.75it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0727, train/loss_vlb_step=0.000239, train/loss_step=0.0727, global_step=738.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  28%|██▊       | 1663/5971 [15:50<41:00,  1.75it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0799, train/loss_vlb_step=0.000269, train/loss_step=0.0799, global_step=738.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1664/5971 [15:52<41:03,  1.75it/s, loss=0.168, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000547, train/loss_step=0.163, global_step=738.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  28%|██▊       | 1665/5971 [15:53<41:03,  1.75it/s, loss=0.168, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000547, train/loss_step=0.163, global_step=738.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1665/5971 [15:53<41:03,  1.75it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0333, train/loss_vlb_step=0.000126, train/loss_step=0.0333, global_step=739.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1666/5971 [15:54<41:03,  1.75it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0618, train/loss_vlb_step=0.00021, train/loss_step=0.0618, global_step=739.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  28%|██▊       | 1667/5971 [15:54<41:04,  1.75it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00891, train/loss_vlb_step=4.44e-5, train/loss_step=0.00891, global_step=739.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1668/5971 [15:57<41:07,  1.74it/s, loss=0.126, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000449, train/loss_step=0.136, global_step=739.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  28%|██▊       | 1669/5971 [15:57<41:07,  1.74it/s, loss=0.126, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000449, train/loss_step=0.136, global_step=739.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1669/5971 [15:57<41:07,  1.74it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0346, train/loss_vlb_step=0.000128, train/loss_step=0.0346, global_step=740.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1670/5971 [15:58<41:08,  1.74it/s, loss=0.132, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.0013, train/loss_step=0.289, global_step=740.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  28%|██▊       | 1671/5971 [15:59<41:08,  1.74it/s, loss=0.149, v_num=0, train/loss_simple_step=0.592, train/loss_vlb_step=0.00689, train/loss_step=0.592, global_step=740.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1672/5971 [16:02<41:14,  1.74it/s, loss=0.163, v_num=0, train/loss_simple_step=0.302, train/loss_vlb_step=0.00121, train/loss_step=0.302, global_step=740.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1673/5971 [16:03<41:14,  1.74it/s, loss=0.163, v_num=0, train/loss_simple_step=0.302, train/loss_vlb_step=0.00121, train/loss_step=0.302, global_step=740.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1673/5971 [16:03<41:14,  1.74it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0663, train/loss_vlb_step=0.000222, train/loss_step=0.0663, global_step=741.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1674/5971 [16:04<41:14,  1.74it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00336, train/loss_vlb_step=1.86e-5, train/loss_step=0.00336, global_step=741.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1675/5971 [16:05<41:15,  1.74it/s, loss=0.134, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000427, train/loss_step=0.129, global_step=741.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  28%|██▊       | 1676/5971 [16:08<41:19,  1.73it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00671, train/loss_vlb_step=3.18e-5, train/loss_step=0.00671, global_step=741.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1677/5971 [16:09<41:20,  1.73it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00671, train/loss_vlb_step=3.18e-5, train/loss_step=0.00671, global_step=741.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1677/5971 [16:09<41:20,  1.73it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0241, train/loss_vlb_step=9.19e-5, train/loss_step=0.0241, global_step=742.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  28%|██▊       | 1678/5971 [16:10<41:20,  1.73it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00252, train/loss_vlb_step=1.48e-5, train/loss_step=0.00252, global_step=742.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1679/5971 [16:10<41:20,  1.73it/s, loss=0.157, v_num=0, train/loss_simple_step=0.630, train/loss_vlb_step=0.0106, train/loss_step=0.630, global_step=742.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  28%|██▊       | 1680/5971 [16:13<41:23,  1.73it/s, loss=0.149, v_num=0, train/loss_simple_step=0.337, train/loss_vlb_step=0.0014, train/loss_step=0.337, global_step=742.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1681/5971 [16:14<41:24,  1.73it/s, loss=0.149, v_num=0, train/loss_simple_step=0.337, train/loss_vlb_step=0.0014, train/loss_step=0.337, global_step=742.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1681/5971 [16:14<41:24,  1.73it/s, loss=0.165, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00152, train/loss_step=0.329, global_step=743.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1682/5971 [16:14<41:24,  1.73it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0464, train/loss_vlb_step=0.000164, train/loss_step=0.0464, global_step=743.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1683/5971 [16:15<41:24,  1.73it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00325, train/loss_vlb_step=1.89e-5, train/loss_step=0.00325, global_step=743.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1684/5971 [16:17<41:27,  1.72it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00449, train/loss_vlb_step=2.35e-5, train/loss_step=0.00449, global_step=743.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1685/5971 [16:18<41:28,  1.72it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00449, train/loss_vlb_step=2.35e-5, train/loss_step=0.00449, global_step=743.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1685/5971 [16:18<41:28,  1.72it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0266, train/loss_vlb_step=0.000107, train/loss_step=0.0266, global_step=744.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  28%|██▊       | 1686/5971 [16:19<41:28,  1.72it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00378, train/loss_vlb_step=1.95e-5, train/loss_step=0.00378, global_step=744.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1687/5971 [16:20<41:28,  1.72it/s, loss=0.155, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000469, train/loss_step=0.142, global_step=744.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  28%|██▊       | 1688/5971 [16:22<41:32,  1.72it/s, loss=0.157, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000525, train/loss_step=0.160, global_step=744.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1689/5971 [16:23<41:32,  1.72it/s, loss=0.157, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000525, train/loss_step=0.160, global_step=744.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1689/5971 [16:23<41:32,  1.72it/s, loss=0.176, v_num=0, train/loss_simple_step=0.431, train/loss_vlb_step=0.00211, train/loss_step=0.431, global_step=745.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  28%|██▊       | 1690/5971 [16:24<41:32,  1.72it/s, loss=0.174, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.000895, train/loss_step=0.251, global_step=745.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1691/5971 [16:25<41:32,  1.72it/s, loss=0.174, v_num=0, train/loss_simple_step=0.581, train/loss_vlb_step=0.010, train/loss_step=0.581, global_step=745.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  28%|██▊       | 1692/5971 [16:28<41:37,  1.71it/s, loss=0.176, v_num=0, train/loss_simple_step=0.343, train/loss_vlb_step=0.00182, train/loss_step=0.343, global_step=745.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1693/5971 [16:28<41:37,  1.71it/s, loss=0.176, v_num=0, train/loss_simple_step=0.343, train/loss_vlb_step=0.00182, train/loss_step=0.343, global_step=745.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1693/5971 [16:28<41:37,  1.71it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0031, train/loss_vlb_step=1.69e-5, train/loss_step=0.0031, global_step=746.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1694/5971 [16:29<41:37,  1.71it/s, loss=0.182, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000652, train/loss_step=0.181, global_step=746.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  28%|██▊       | 1695/5971 [16:30<41:37,  1.71it/s, loss=0.196, v_num=0, train/loss_simple_step=0.417, train/loss_vlb_step=0.00243, train/loss_step=0.417, global_step=746.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  28%|██▊       | 1696/5971 [16:32<41:41,  1.71it/s, loss=0.203, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000488, train/loss_step=0.141, global_step=746.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1697/5971 [16:33<41:41,  1.71it/s, loss=0.203, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000488, train/loss_step=0.141, global_step=746.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1697/5971 [16:33<41:41,  1.71it/s, loss=0.202, v_num=0, train/loss_simple_step=0.00261, train/loss_vlb_step=1.46e-5, train/loss_step=0.00261, global_step=747.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1698/5971 [16:34<41:41,  1.71it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=4.55e-5, train/loss_step=0.0109, global_step=747.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  28%|██▊       | 1699/5971 [16:35<41:41,  1.71it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0757, train/loss_vlb_step=0.000249, train/loss_step=0.0757, global_step=747.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1700/5971 [16:37<41:45,  1.70it/s, loss=0.17, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.000902, train/loss_step=0.253, global_step=747.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  28%|██▊       | 1701/5971 [16:38<41:45,  1.70it/s, loss=0.17, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.000902, train/loss_step=0.253, global_step=747.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  28%|██▊       | 1701/5971 [16:38<41:45,  1.70it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00146, train/loss_vlb_step=8.88e-6, train/loss_step=0.00146, global_step=748.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  29%|██▊       | 1702/5971 [16:39<41:45,  1.70it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0266, train/loss_vlb_step=0.000108, train/loss_step=0.0266, global_step=748.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  29%|██▊       | 1703/5971 [16:40<41:45,  1.70it/s, loss=0.167, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.00119, train/loss_step=0.278, global_step=748.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  29%|██▊       | 1704/5971 [16:42<41:48,  1.70it/s, loss=0.198, v_num=0, train/loss_simple_step=0.628, train/loss_vlb_step=0.00635, train/loss_step=0.628, global_step=748.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  29%|██▊       | 1705/5971 [16:43<41:49,  1.70it/s, loss=0.198, v_num=0, train/loss_simple_step=0.628, train/loss_vlb_step=0.00635, train/loss_step=0.628, global_step=748.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  29%|██▊       | 1705/5971 [16:43<41:49,  1.70it/s, loss=0.221, v_num=0, train/loss_simple_step=0.500, train/loss_vlb_step=0.00428, train/loss_step=0.500, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  29%|██▊       | 1706/5971 [16:44<41:49,  1.70it/s, loss=0.222, v_num=0, train/loss_simple_step=0.00541, train/loss_vlb_step=2.73e-5, train/loss_step=0.00541, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  29%|██▊       | 1707/5971 [16:45<41:49,  1.70it/s, loss=0.239, v_num=0, train/loss_simple_step=0.484, train/loss_vlb_step=0.00268, train/loss_step=0.484, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  29%|██▊       | 1708/5971 [16:47<41:53,  1.70it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  29%|██▊       | 1709/5971 [16:47<41:51,  1.70it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<02:25,  1.14it/s][A

Validating:   2%|▏         | 3/167 [00:00<00:44,  3.70it/s][A
Epoch 1:  29%|██▊       | 1713/5971 [16:48<41:45,  1.70it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   4%|▎         | 6/167 [00:01<00:20,  7.74it/s][A
Epoch 1:  29%|██▉       | 1717/5971 [16:48<41:37,  1.70it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   5%|▌         | 9/167 [00:01<00:13, 11.65it/s][A

Validating:   7%|▋         | 12/167 [00:01<00:10, 15.13it/s][A
Epoch 1:  29%|██▉       | 1721/5971 [16:48<41:30,  1.71it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   9%|▉         | 15/167 [00:01<00:08, 18.00it/s][A
Epoch 1:  29%|██▉       | 1725/5971 [16:49<41:22,  1.71it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  11%|█         | 18/167 [00:01<00:07, 20.55it/s][A
Epoch 1:  29%|██▉       | 1729/5971 [16:49<41:14,  1.71it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 21.23it/s][A

Validating:  14%|█▍        | 24/167 [00:01<00:06, 23.26it/s][A
Epoch 1:  29%|██▉       | 1733/5971 [16:49<41:06,  1.72it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 23.36it/s][A
Epoch 1:  29%|██▉       | 1737/5971 [16:49<40:59,  1.72it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  18%|█▊        | 30/167 [00:02<00:05, 24.26it/s][A
Epoch 1:  29%|██▉       | 1741/5971 [16:49<40:51,  1.73it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  20%|█▉        | 33/167 [00:02<00:05, 24.12it/s][A

Validating:  22%|██▏       | 36/167 [00:02<00:05, 25.39it/s][A
Epoch 1:  29%|██▉       | 1745/5971 [16:49<40:44,  1.73it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  23%|██▎       | 39/167 [00:02<00:04, 25.72it/s][A
Epoch 1:  29%|██▉       | 1749/5971 [16:49<40:36,  1.73it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  25%|██▌       | 42/167 [00:02<00:05, 24.78it/s][A
Epoch 1:  29%|██▉       | 1753/5971 [16:50<40:29,  1.74it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 25.21it/s][A

Validating:  29%|██▊       | 48/167 [00:02<00:04, 25.99it/s][A
Epoch 1:  29%|██▉       | 1757/5971 [16:50<40:21,  1.74it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  31%|███       | 51/167 [00:02<00:04, 26.69it/s][A
Epoch 1:  29%|██▉       | 1761/5971 [16:50<40:14,  1.74it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 26.50it/s][A
Epoch 1:  30%|██▉       | 1765/5971 [16:50<40:06,  1.75it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  34%|███▍      | 57/167 [00:03<00:04, 26.65it/s][A

Validating:  36%|███▌      | 60/167 [00:03<00:04, 25.62it/s][A
Epoch 1:  30%|██▉       | 1769/5971 [16:50<39:59,  1.75it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  38%|███▊      | 63/167 [00:03<00:04, 25.28it/s][A
Epoch 1:  30%|██▉       | 1773/5971 [16:50<39:52,  1.75it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  40%|███▉      | 66/167 [00:03<00:03, 25.27it/s][A
Epoch 1:  30%|██▉       | 1777/5971 [16:51<39:44,  1.76it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 25.92it/s][A

Validating:  43%|████▎     | 72/167 [00:03<00:03, 26.41it/s][A
Epoch 1:  30%|██▉       | 1781/5971 [16:51<39:37,  1.76it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 26.19it/s][A
Epoch 1:  30%|██▉       | 1785/5971 [16:51<39:30,  1.77it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 26.79it/s][A
Epoch 1:  30%|██▉       | 1789/5971 [16:51<39:23,  1.77it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 27.05it/s][A

Validating:  50%|█████     | 84/167 [00:04<00:03, 27.35it/s][A
Epoch 1:  30%|███       | 1793/5971 [16:51<39:16,  1.77it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  52%|█████▏    | 87/167 [00:04<00:02, 27.46it/s][A
Epoch 1:  30%|███       | 1797/5971 [16:51<39:08,  1.78it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  54%|█████▍    | 91/167 [00:04<00:02, 27.31it/s][A
Epoch 1:  30%|███       | 1801/5971 [16:51<39:01,  1.78it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 27.81it/s][A
Epoch 1:  30%|███       | 1805/5971 [16:52<38:54,  1.78it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 27.96it/s][A

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 25.24it/s][A
Epoch 1:  30%|███       | 1809/5971 [16:52<38:47,  1.79it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 26.02it/s][A
Epoch 1:  30%|███       | 1813/5971 [16:52<38:40,  1.79it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 26.77it/s][A
Epoch 1:  30%|███       | 1817/5971 [16:52<38:33,  1.80it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  65%|██████▌   | 109/167 [00:05<00:02, 25.71it/s][A

Validating:  67%|██████▋   | 112/167 [00:05<00:02, 25.17it/s][A
Epoch 1:  30%|███       | 1821/5971 [16:52<38:26,  1.80it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  69%|██████▉   | 115/167 [00:05<00:02, 25.62it/s][A
Epoch 1:  31%|███       | 1825/5971 [16:52<38:19,  1.80it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  71%|███████   | 118/167 [00:05<00:01, 25.79it/s][A
Epoch 1:  31%|███       | 1829/5971 [16:53<38:12,  1.81it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 25.23it/s][A

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 26.01it/s][A
Epoch 1:  31%|███       | 1833/5971 [16:53<38:06,  1.81it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 26.53it/s][A
Epoch 1:  31%|███       | 1837/5971 [16:53<37:59,  1.81it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 28.05it/s][A
Epoch 1:  31%|███       | 1841/5971 [16:53<37:52,  1.82it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  80%|████████  | 134/167 [00:05<00:01, 27.81it/s][A
Epoch 1:  31%|███       | 1845/5971 [16:53<37:45,  1.82it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  82%|████████▏ | 137/167 [00:06<00:01, 26.95it/s][A

Validating:  84%|████████▍ | 140/167 [00:06<00:01, 26.77it/s][A
Epoch 1:  31%|███       | 1849/5971 [16:53<37:38,  1.82it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 27.46it/s][A
Epoch 1:  31%|███       | 1853/5971 [16:53<37:32,  1.83it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 28.31it/s][A
Epoch 1:  31%|███       | 1857/5971 [16:54<37:25,  1.83it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 28.68it/s][A
Epoch 1:  31%|███       | 1861/5971 [16:54<37:18,  1.84it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 28.64it/s]
Epoch 1:  31%|███       | 1865/5971 [16:54<37:11,  1.84it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 28.85it/s]
Epoch 1:  31%|███▏      | 1869/5971 [16:54<37:05,  1.84it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 29.12it/s]

Validating:  98%|█████████▊| 164/167 [00:07<00:00, 29.04it/s]
Epoch 1:  31%|███▏      | 1873/5971 [16:54<36:58,  1.85it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  31%|███▏      | 1876/5971 [16:54<36:54,  1.85it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Epoch 1:  31%|███▏      | 1877/5971 [16:55<36:54,  1.85it/s, loss=0.243, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000851, train/loss_step=0.242, global_step=749.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  31%|███▏      | 1877/5971 [16:55<36:54,  1.85it/s, loss=0.226, v_num=0, train/loss_simple_step=0.0891, train/loss_vlb_step=0.000293, train/loss_step=0.0891, global_step=750.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  31%|███▏      | 1878/5971 [16:56<36:54,  1.85it/s, loss=0.223, v_num=0, train/loss_simple_step=0.191, train/loss_vlb_step=0.000678, train/loss_step=0.191, global_step=750.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  31%|███▏      | 1879/5971 [16:57<36:55,  1.85it/s, loss=0.194, v_num=0, train/loss_simple_step=0.00181, train/loss_vlb_step=1.1e-5, train/loss_step=0.00181, global_step=750.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  31%|███▏      | 1880/5971 [17:00<36:58,  1.84it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000329, train/loss_step=0.0971, global_step=750.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1881/5971 [17:01<36:59,  1.84it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000329, train/loss_step=0.0971, global_step=750.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1881/5971 [17:01<36:59,  1.84it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0181, train/loss_vlb_step=7.81e-5, train/loss_step=0.0181, global_step=751.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  32%|███▏      | 1882/5971 [17:01<36:59,  1.84it/s, loss=0.188, v_num=0, train/loss_simple_step=0.297, train/loss_vlb_step=0.00112, train/loss_step=0.297, global_step=751.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  32%|███▏      | 1883/5971 [17:02<36:59,  1.84it/s, loss=0.185, v_num=0, train/loss_simple_step=0.356, train/loss_vlb_step=0.00181, train/loss_step=0.356, global_step=751.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1884/5971 [17:04<37:02,  1.84it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00178, train/loss_vlb_step=1.04e-5, train/loss_step=0.00178, global_step=751.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1885/5971 [17:05<37:02,  1.84it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00178, train/loss_vlb_step=1.04e-5, train/loss_step=0.00178, global_step=751.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1885/5971 [17:05<37:02,  1.84it/s, loss=0.185, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000463, train/loss_step=0.136, global_step=752.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  32%|███▏      | 1886/5971 [17:06<37:02,  1.84it/s, loss=0.198, v_num=0, train/loss_simple_step=0.285, train/loss_vlb_step=0.00117, train/loss_step=0.285, global_step=752.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  32%|███▏      | 1887/5971 [17:07<37:02,  1.84it/s, loss=0.2, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000331, train/loss_step=0.101, global_step=752.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  32%|███▏      | 1888/5971 [17:09<37:05,  1.83it/s, loss=0.193, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000387, train/loss_step=0.118, global_step=752.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1889/5971 [17:10<37:06,  1.83it/s, loss=0.193, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000387, train/loss_step=0.118, global_step=752.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1889/5971 [17:10<37:06,  1.83it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0276, train/loss_vlb_step=0.000105, train/loss_step=0.0276, global_step=753.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1890/5971 [17:11<37:06,  1.83it/s, loss=0.216, v_num=0, train/loss_simple_step=0.464, train/loss_vlb_step=0.00274, train/loss_step=0.464, global_step=753.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  32%|███▏      | 1891/5971 [17:12<37:06,  1.83it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0658, train/loss_vlb_step=0.000216, train/loss_step=0.0658, global_step=753.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1892/5971 [17:14<37:09,  1.83it/s, loss=0.191, v_num=0, train/loss_simple_step=0.343, train/loss_vlb_step=0.00183, train/loss_step=0.343, global_step=753.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  32%|███▏      | 1893/5971 [17:15<37:09,  1.83it/s, loss=0.191, v_num=0, train/loss_simple_step=0.343, train/loss_vlb_step=0.00183, train/loss_step=0.343, global_step=753.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1893/5971 [17:15<37:09,  1.83it/s, loss=0.171, v_num=0, train/loss_simple_step=0.097, train/loss_vlb_step=0.000326, train/loss_step=0.097, global_step=754.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1894/5971 [17:16<37:09,  1.83it/s, loss=0.174, v_num=0, train/loss_simple_step=0.066, train/loss_vlb_step=0.000219, train/loss_step=0.066, global_step=754.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1895/5971 [17:17<37:09,  1.83it/s, loss=0.156, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000453, train/loss_step=0.135, global_step=754.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1896/5971 [17:19<37:13,  1.82it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0182, train/loss_vlb_step=7.59e-5, train/loss_step=0.0182, global_step=754.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1897/5971 [17:20<37:13,  1.82it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0182, train/loss_vlb_step=7.59e-5, train/loss_step=0.0182, global_step=754.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1897/5971 [17:20<37:13,  1.82it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00579, train/loss_vlb_step=2.91e-5, train/loss_step=0.00579, global_step=755.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1898/5971 [17:21<37:13,  1.82it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0331, train/loss_vlb_step=0.000123, train/loss_step=0.0331, global_step=755.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  32%|███▏      | 1899/5971 [17:22<37:13,  1.82it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00252, train/loss_vlb_step=1.43e-5, train/loss_step=0.00252, global_step=755.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1900/5971 [17:24<37:16,  1.82it/s, loss=0.144, v_num=0, train/loss_simple_step=0.311, train/loss_vlb_step=0.00132, train/loss_step=0.311, global_step=755.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  32%|███▏      | 1901/5971 [17:25<37:16,  1.82it/s, loss=0.144, v_num=0, train/loss_simple_step=0.311, train/loss_vlb_step=0.00132, train/loss_step=0.311, global_step=755.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1901/5971 [17:25<37:16,  1.82it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.29e-5, train/loss_step=0.0202, global_step=756.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1902/5971 [17:26<37:17,  1.82it/s, loss=0.145, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.00131, train/loss_step=0.322, global_step=756.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  32%|███▏      | 1903/5971 [17:27<37:17,  1.82it/s, loss=0.136, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.00057, train/loss_step=0.168, global_step=756.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1904/5971 [17:29<37:20,  1.82it/s, loss=0.16, v_num=0, train/loss_simple_step=0.488, train/loss_vlb_step=0.00449, train/loss_step=0.488, global_step=756.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  32%|███▏      | 1905/5971 [17:30<37:20,  1.81it/s, loss=0.16, v_num=0, train/loss_simple_step=0.488, train/loss_vlb_step=0.00449, train/loss_step=0.488, global_step=756.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1905/5971 [17:30<37:20,  1.81it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0159, train/loss_vlb_step=7.03e-5, train/loss_step=0.0159, global_step=757.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1906/5971 [17:31<37:20,  1.81it/s, loss=0.185, v_num=0, train/loss_simple_step=0.891, train/loss_vlb_step=0.113, train/loss_step=0.891, global_step=757.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  32%|███▏      | 1907/5971 [17:32<37:20,  1.81it/s, loss=0.185, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000355, train/loss_step=0.108, global_step=757.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1908/5971 [17:34<37:23,  1.81it/s, loss=0.202, v_num=0, train/loss_simple_step=0.465, train/loss_vlb_step=0.00374, train/loss_step=0.465, global_step=757.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  32%|███▏      | 1909/5971 [17:35<37:23,  1.81it/s, loss=0.202, v_num=0, train/loss_simple_step=0.465, train/loss_vlb_step=0.00374, train/loss_step=0.465, global_step=757.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1909/5971 [17:35<37:23,  1.81it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0233, train/loss_vlb_step=9.42e-5, train/loss_step=0.0233, global_step=758.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1910/5971 [17:35<37:24,  1.81it/s, loss=0.179, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.57e-5, train/loss_step=0.00966, global_step=758.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1911/5971 [17:36<37:24,  1.81it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00337, train/loss_vlb_step=1.8e-5, train/loss_step=0.00337, global_step=758.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  32%|███▏      | 1912/5971 [17:39<37:26,  1.81it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00889, train/loss_vlb_step=4.13e-5, train/loss_step=0.00889, global_step=758.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1913/5971 [17:39<37:27,  1.81it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00889, train/loss_vlb_step=4.13e-5, train/loss_step=0.00889, global_step=758.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1913/5971 [17:39<37:27,  1.81it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0148, train/loss_vlb_step=6.2e-5, train/loss_step=0.0148, global_step=759.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  32%|███▏      | 1914/5971 [17:40<37:27,  1.81it/s, loss=0.153, v_num=0, train/loss_simple_step=0.014, train/loss_vlb_step=6.22e-5, train/loss_step=0.014, global_step=759.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  32%|███▏      | 1915/5971 [17:41<37:27,  1.80it/s, loss=0.152, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000385, train/loss_step=0.115, global_step=759.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1916/5971 [17:44<37:32,  1.80it/s, loss=0.17, v_num=0, train/loss_simple_step=0.386, train/loss_vlb_step=0.0024, train/loss_step=0.386, global_step=759.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  32%|███▏      | 1917/5971 [17:45<37:32,  1.80it/s, loss=0.17, v_num=0, train/loss_simple_step=0.386, train/loss_vlb_step=0.0024, train/loss_step=0.386, global_step=759.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1917/5971 [17:45<37:32,  1.80it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00441, train/loss_vlb_step=2.42e-5, train/loss_step=0.00441, global_step=760.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1918/5971 [17:46<37:32,  1.80it/s, loss=0.189, v_num=0, train/loss_simple_step=0.413, train/loss_vlb_step=0.00252, train/loss_step=0.413, global_step=760.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  32%|███▏      | 1919/5971 [17:47<37:32,  1.80it/s, loss=0.199, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000745, train/loss_step=0.204, global_step=760.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1920/5971 [17:50<37:36,  1.80it/s, loss=0.184, v_num=0, train/loss_simple_step=0.00265, train/loss_vlb_step=1.53e-5, train/loss_step=0.00265, global_step=760.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1921/5971 [17:51<37:37,  1.79it/s, loss=0.184, v_num=0, train/loss_simple_step=0.00265, train/loss_vlb_step=1.53e-5, train/loss_step=0.00265, global_step=760.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1921/5971 [17:51<37:37,  1.79it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0269, train/loss_vlb_step=0.000106, train/loss_step=0.0269, global_step=761.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  32%|███▏      | 1922/5971 [17:51<37:37,  1.79it/s, loss=0.188, v_num=0, train/loss_simple_step=0.402, train/loss_vlb_step=0.00213, train/loss_step=0.402, global_step=761.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  32%|███▏      | 1923/5971 [17:52<37:37,  1.79it/s, loss=0.207, v_num=0, train/loss_simple_step=0.541, train/loss_vlb_step=0.00677, train/loss_step=0.541, global_step=761.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1924/5971 [17:54<37:39,  1.79it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00636, train/loss_vlb_step=2.95e-5, train/loss_step=0.00636, global_step=761.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1925/5971 [17:55<37:40,  1.79it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00636, train/loss_vlb_step=2.95e-5, train/loss_step=0.00636, global_step=761.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1925/5971 [17:55<37:40,  1.79it/s, loss=0.188, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.00042, train/loss_step=0.128, global_step=762.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  32%|███▏      | 1926/5971 [17:56<37:40,  1.79it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00227, train/loss_vlb_step=1.36e-5, train/loss_step=0.00227, global_step=762.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1927/5971 [17:57<37:40,  1.79it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0781, train/loss_vlb_step=0.000263, train/loss_step=0.0781, global_step=762.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  32%|███▏      | 1928/5971 [18:00<37:44,  1.79it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0856, train/loss_vlb_step=0.000282, train/loss_step=0.0856, global_step=762.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1929/5971 [18:01<37:44,  1.79it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0856, train/loss_vlb_step=0.000282, train/loss_step=0.0856, global_step=762.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1929/5971 [18:01<37:44,  1.79it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0106, train/loss_vlb_step=4.76e-5, train/loss_step=0.0106, global_step=763.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  32%|███▏      | 1930/5971 [18:02<37:44,  1.78it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00375, train/loss_vlb_step=1.96e-5, train/loss_step=0.00375, global_step=763.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1931/5971 [18:02<37:44,  1.78it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0163, train/loss_vlb_step=6.83e-5, train/loss_step=0.0163, global_step=763.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  32%|███▏      | 1932/5971 [18:05<37:47,  1.78it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00262, train/loss_vlb_step=1.47e-5, train/loss_step=0.00262, global_step=763.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1933/5971 [18:06<37:48,  1.78it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00262, train/loss_vlb_step=1.47e-5, train/loss_step=0.00262, global_step=763.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1933/5971 [18:06<37:48,  1.78it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0165, train/loss_vlb_step=6.69e-5, train/loss_step=0.0165, global_step=764.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  32%|███▏      | 1934/5971 [18:07<37:48,  1.78it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00139, train/loss_vlb_step=8.41e-6, train/loss_step=0.00139, global_step=764.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1935/5971 [18:08<37:48,  1.78it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0289, train/loss_vlb_step=0.000109, train/loss_step=0.0289, global_step=764.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  32%|███▏      | 1936/5971 [18:10<37:51,  1.78it/s, loss=0.0994, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.64e-5, train/loss_step=0.0126, global_step=764.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1937/5971 [18:11<37:51,  1.78it/s, loss=0.0994, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.64e-5, train/loss_step=0.0126, global_step=764.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1937/5971 [18:11<37:51,  1.78it/s, loss=0.109, v_num=0, train/loss_simple_step=0.198, train/loss_vlb_step=0.000655, train/loss_step=0.198, global_step=765.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  32%|███▏      | 1938/5971 [18:12<37:51,  1.78it/s, loss=0.0893, v_num=0, train/loss_simple_step=0.0188, train/loss_vlb_step=7.81e-5, train/loss_step=0.0188, global_step=765.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  32%|███▏      | 1939/5971 [18:13<37:51,  1.77it/s, loss=0.0873, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000536, train/loss_step=0.163, global_step=765.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  32%|███▏      | 1940/5971 [18:15<37:55,  1.77it/s, loss=0.0881, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=7.85e-5, train/loss_step=0.0198, global_step=765.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  33%|███▎      | 1941/5971 [18:16<37:55,  1.77it/s, loss=0.0881, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=7.85e-5, train/loss_step=0.0198, global_step=765.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  33%|███▎      | 1941/5971 [18:16<37:55,  1.77it/s, loss=0.0973, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.000753, train/loss_step=0.210, global_step=766.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  33%|███▎      | 1942/5971 [18:17<37:55,  1.77it/s, loss=0.0833, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000402, train/loss_step=0.122, global_step=766.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  33%|███▎      | 1943/5971 [18:18<37:55,  1.77it/s, loss=0.0566, v_num=0, train/loss_simple_step=0.00799, train/loss_vlb_step=4.02e-5, train/loss_step=0.00799, global_step=766.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  33%|███▎      | 1944/5971 [18:20<37:58,  1.77it/s, loss=0.0586, v_num=0, train/loss_simple_step=0.0467, train/loss_vlb_step=0.000165, train/loss_step=0.0467, global_step=766.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  33%|███▎      | 1945/5971 [18:21<37:58,  1.77it/s, loss=0.0586, v_num=0, train/loss_simple_step=0.0467, train/loss_vlb_step=0.000165, train/loss_step=0.0467, global_step=766.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  33%|███▎      | 1945/5971 [18:21<37:58,  1.77it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.432, train/loss_vlb_step=0.00246, train/loss_step=0.432, global_step=767.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  33%|███▎      | 1946/5971 [18:22<37:58,  1.77it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.00201, train/loss_vlb_step=1.19e-5, train/loss_step=0.00201, global_step=767.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  33%|███▎      | 1947/5971 [18:22<37:58,  1.77it/s, loss=0.0717, v_num=0, train/loss_simple_step=0.0356, train/loss_vlb_step=0.000135, train/loss_step=0.0356, global_step=767.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  33%|███▎      | 1948/5971 [18:25<38:01,  1.76it/s, loss=0.0676, v_num=0, train/loss_simple_step=0.00294, train/loss_vlb_step=1.69e-5, train/loss_step=0.00294, global_step=767.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  33%|███▎      | 1949/5971 [18:26<38:01,  1.76it/s, loss=0.0676, v_num=0, train/loss_simple_step=0.00294, train/loss_vlb_step=1.69e-5, train/loss_step=0.00294, global_step=767.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  33%|███▎      | 1949/5971 [18:26<38:01,  1.76it/s, loss=0.0859, v_num=0, train/loss_simple_step=0.378, train/loss_vlb_step=0.00174, train/loss_step=0.378, global_step=768.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  33%|███▎      | 1950/5971 [18:27<38:01,  1.76it/s, loss=0.0858, v_num=0, train/loss_simple_step=0.00171, train/loss_vlb_step=1.03e-5, train/loss_step=0.00171, global_step=768.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  33%|███▎      | 1951/5971 [18:27<38:01,  1.76it/s, loss=0.0852, v_num=0, train/loss_simple_step=0.00385, train/loss_vlb_step=2.03e-5, train/loss_step=0.00385, global_step=768.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  33%|███▎      | 1952/5971 [18:30<38:04,  1.76it/s, loss=0.102, v_num=0, train/loss_simple_step=0.344, train/loss_vlb_step=0.0014, train/loss_step=0.344, global_step=768.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]      
Epoch 1:  33%|███▎      | 1953/5971 [18:30<38:04,  1.76it/s, loss=0.102, v_num=0, train/loss_simple_step=0.344, train/loss_vlb_step=0.0014, train/loss_step=0.344, global_step=768.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  33%|███▎      | 1953/5971 [18:30<38:04,  1.76it/s, loss=0.111, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000677, train/loss_step=0.192, global_step=769.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  33%|███▎      | 1954/5971 [18:31<38:04,  1.76it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0239, train/loss_vlb_step=9.2e-5, train/loss_step=0.0239, global_step=769.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  33%|███▎      | 1955/5971 [18:32<38:04,  1.76it/s, loss=0.148, v_num=0, train/loss_simple_step=0.745, train/loss_vlb_step=0.0323, train/loss_step=0.745, global_step=769.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  33%|███▎      | 1956/5971 [18:35<38:07,  1.75it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00251, train/loss_vlb_step=1.36e-5, train/loss_step=0.00251, global_step=769.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  33%|███▎      | 1957/5971 [18:36<38:08,  1.75it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00251, train/loss_vlb_step=1.36e-5, train/loss_step=0.00251, global_step=769.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  33%|███▎      | 1957/5971 [18:36<38:08,  1.75it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.83e-5, train/loss_step=0.0105, global_step=770.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  33%|███▎      | 1958/5971 [18:36<38:08,  1.75it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0129, train/loss_vlb_step=5.66e-5, train/loss_step=0.0129, global_step=770.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  33%|███▎      | 1959/5971 [18:37<38:08,  1.75it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=6.84e-5, train/loss_step=0.0166, global_step=770.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  33%|███▎      | 1960/5971 [18:40<38:11,  1.75it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00684, train/loss_vlb_step=3.36e-5, train/loss_step=0.00684, global_step=770.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  33%|███▎      | 1961/5971 [18:41<38:12,  1.75it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00684, train/loss_vlb_step=3.36e-5, train/loss_step=0.00684, global_step=770.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  33%|███▎      | 1961/5971 [18:41<38:12,  1.75it/s, loss=0.142, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00279, train/loss_step=0.452, global_step=771.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  33%|███▎      | 1962/5971 [18:42<38:12,  1.75it/s, loss=0.153, v_num=0, train/loss_simple_step=0.350, train/loss_vlb_step=0.00165, train/loss_step=0.350, global_step=771.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  33%|███▎      | 1963/5971 [18:43<38:12,  1.75it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0539, train/loss_vlb_step=0.000189, train/loss_step=0.0539, global_step=771.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  33%|███▎      | 1964/5971 [18:45<38:15,  1.75it/s, loss=0.176, v_num=0, train/loss_simple_step=0.450, train/loss_vlb_step=0.00303, train/loss_step=0.450, global_step=771.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  33%|███▎      | 1965/5971 [18:46<38:15,  1.75it/s, loss=0.176, v_num=0, train/loss_simple_step=0.450, train/loss_vlb_step=0.00303, train/loss_step=0.450, global_step=771.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  33%|███▎      | 1965/5971 [18:46<38:15,  1.75it/s, loss=0.172, v_num=0, train/loss_simple_step=0.360, train/loss_vlb_step=0.0023, train/loss_step=0.360, global_step=772.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  33%|███▎      | 1966/5971 [18:47<38:15,  1.74it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0233, train/loss_vlb_step=9.58e-5, train/loss_step=0.0233, global_step=772.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  33%|███▎      | 1967/5971 [18:48<38:15,  1.74it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00243, train/loss_vlb_step=1.4e-5, train/loss_step=0.00243, global_step=772.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  33%|███▎      | 1968/5971 [18:50<38:18,  1.74it/s, loss=0.188, v_num=0, train/loss_simple_step=0.325, train/loss_vlb_step=0.00175, train/loss_step=0.325, global_step=772.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  33%|███▎      | 1969/5971 [18:51<38:19,  1.74it/s, loss=0.188, v_num=0, train/loss_simple_step=0.325, train/loss_vlb_step=0.00175, train/loss_step=0.325, global_step=772.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  33%|███▎      | 1969/5971 [18:51<38:19,  1.74it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00832, train/loss_vlb_step=3.9e-5, train/loss_step=0.00832, global_step=773.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  33%|███▎      | 1970/5971 [18:52<38:19,  1.74it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00531, train/loss_vlb_step=2.8e-5, train/loss_step=0.00531, global_step=773.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  33%|███▎      | 1971/5971 [18:53<38:19,  1.74it/s, loss=0.175, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000389, train/loss_step=0.118, global_step=773.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  33%|███▎      | 1972/5971 [18:56<38:23,  1.74it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00329, train/loss_vlb_step=1.74e-5, train/loss_step=0.00329, global_step=773.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  33%|███▎      | 1973/5971 [18:57<38:23,  1.74it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00329, train/loss_vlb_step=1.74e-5, train/loss_step=0.00329, global_step=773.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  33%|███▎      | 1973/5971 [18:57<38:23,  1.74it/s, loss=0.173, v_num=0, train/loss_simple_step=0.488, train/loss_vlb_step=0.00558, train/loss_step=0.488, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  33%|███▎      | 1974/5971 [18:58<38:23,  1.73it/s, loss=0.204, v_num=0, train/loss_simple_step=0.647, train/loss_vlb_step=0.0145, train/loss_step=0.647, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  33%|███▎      | 1975/5971 [18:59<38:23,  1.73it/s, loss=0.173, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000417, train/loss_step=0.127, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  33%|███▎      | 1976/5971 [19:01<38:27,  1.73it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  33%|███▎      | 1977/5971 [19:01<38:25,  1.73it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:18,  2.12it/s][A

Validating:   1%|          | 2/167 [00:00<00:56,  2.90it/s][A
Epoch 1:  33%|███▎      | 1981/5971 [19:02<38:20,  1.73it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   3%|▎         | 5/167 [00:00<00:20,  7.83it/s][A

Validating:   5%|▍         | 8/167 [00:00<00:13, 11.96it/s][A
Epoch 1:  33%|███▎      | 1985/5971 [19:02<38:13,  1.74it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   7%|▋         | 11/167 [00:01<00:10, 15.44it/s][A
Epoch 1:  33%|███▎      | 1989/5971 [19:03<38:07,  1.74it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   8%|▊         | 14/167 [00:01<00:08, 18.74it/s][A
Epoch 1:  33%|███▎      | 1993/5971 [19:03<38:00,  1.74it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  10%|█         | 17/167 [00:01<00:07, 20.73it/s][A

Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.53it/s][A
Epoch 1:  33%|███▎      | 1997/5971 [19:03<37:54,  1.75it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 22.75it/s][A
Epoch 1:  34%|███▎      | 2001/5971 [19:03<37:47,  1.75it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  16%|█▌        | 26/167 [00:01<00:06, 22.76it/s][A
Epoch 1:  34%|███▎      | 2005/5971 [19:03<37:41,  1.75it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 24.94it/s][A
Epoch 1:  34%|███▎      | 2009/5971 [19:03<37:34,  1.76it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 23.74it/s][A

Validating:  22%|██▏       | 36/167 [00:02<00:05, 24.25it/s][A
Epoch 1:  34%|███▎      | 2013/5971 [19:03<37:28,  1.76it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  23%|██▎       | 39/167 [00:02<00:05, 24.92it/s][A
Epoch 1:  34%|███▍      | 2017/5971 [19:04<37:21,  1.76it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 25.48it/s][A
Epoch 1:  34%|███▍      | 2021/5971 [19:04<37:15,  1.77it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 26.35it/s][A

Validating:  29%|██▊       | 48/167 [00:02<00:04, 25.70it/s][A
Epoch 1:  34%|███▍      | 2025/5971 [19:04<37:09,  1.77it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  31%|███       | 51/167 [00:02<00:04, 23.83it/s][A
Epoch 1:  34%|███▍      | 2029/5971 [19:04<37:02,  1.77it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  32%|███▏      | 54/167 [00:02<00:05, 22.39it/s][A
Epoch 1:  34%|███▍      | 2033/5971 [19:04<36:56,  1.78it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 22.63it/s][A

Validating:  36%|███▌      | 60/167 [00:03<00:04, 23.27it/s][A
Epoch 1:  34%|███▍      | 2037/5971 [19:04<36:50,  1.78it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  38%|███▊      | 63/167 [00:03<00:04, 23.46it/s][A
Epoch 1:  34%|███▍      | 2041/5971 [19:05<36:43,  1.78it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  40%|███▉      | 66/167 [00:03<00:04, 24.39it/s][A
Epoch 1:  34%|███▍      | 2045/5971 [19:05<36:37,  1.79it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  41%|████▏     | 69/167 [00:03<00:04, 24.27it/s][A

Validating:  43%|████▎     | 72/167 [00:03<00:03, 24.52it/s][A
Epoch 1:  34%|███▍      | 2049/5971 [19:05<36:31,  1.79it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 25.77it/s][A
Epoch 1:  34%|███▍      | 2053/5971 [19:05<36:25,  1.79it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 26.70it/s][A
Epoch 1:  34%|███▍      | 2057/5971 [19:05<36:19,  1.80it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 25.23it/s][A

Validating:  50%|█████     | 84/167 [00:04<00:03, 24.98it/s][A
Epoch 1:  35%|███▍      | 2061/5971 [19:05<36:12,  1.80it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  52%|█████▏    | 87/167 [00:04<00:03, 25.76it/s][A
Epoch 1:  35%|███▍      | 2065/5971 [19:06<36:06,  1.80it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  54%|█████▍    | 90/167 [00:04<00:03, 21.20it/s][A
Epoch 1:  35%|███▍      | 2069/5971 [19:06<36:00,  1.81it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  56%|█████▌    | 93/167 [00:04<00:03, 23.12it/s][A
Epoch 1:  35%|███▍      | 2073/5971 [19:06<35:54,  1.81it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  58%|█████▊    | 97/167 [00:04<00:04, 15.01it/s][A

Validating:  60%|█████▉    | 100/167 [00:04<00:03, 16.94it/s][A
Epoch 1:  35%|███▍      | 2077/5971 [19:06<35:49,  1.81it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  62%|██████▏   | 103/167 [00:05<00:03, 18.63it/s][A
Epoch 1:  35%|███▍      | 2081/5971 [19:07<35:43,  1.82it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  63%|██████▎   | 106/167 [00:05<00:02, 20.49it/s][A
Epoch 1:  35%|███▍      | 2085/5971 [19:07<35:37,  1.82it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  65%|██████▌   | 109/167 [00:05<00:02, 22.02it/s][A

Validating:  67%|██████▋   | 112/167 [00:05<00:02, 23.03it/s][A
Epoch 1:  35%|███▍      | 2089/5971 [19:07<35:31,  1.82it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  69%|██████▉   | 115/167 [00:05<00:02, 23.92it/s][A
Epoch 1:  35%|███▌      | 2093/5971 [19:07<35:25,  1.82it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  71%|███████   | 118/167 [00:05<00:01, 24.99it/s][A
Epoch 1:  35%|███▌      | 2097/5971 [19:07<35:19,  1.83it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 23.78it/s][A

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 25.04it/s][A
Epoch 1:  35%|███▌      | 2101/5971 [19:07<35:13,  1.83it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  76%|███████▌  | 127/167 [00:06<00:01, 25.17it/s][A
Epoch 1:  35%|███▌      | 2105/5971 [19:07<35:07,  1.83it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  78%|███████▊  | 130/167 [00:06<00:01, 23.67it/s][A
Epoch 1:  35%|███▌      | 2109/5971 [19:08<35:01,  1.84it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  80%|███████▉  | 133/167 [00:06<00:01, 25.05it/s][A

Validating:  81%|████████▏ | 136/167 [00:06<00:01, 25.90it/s][A
Epoch 1:  35%|███▌      | 2113/5971 [19:08<34:55,  1.84it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  83%|████████▎ | 139/167 [00:06<00:01, 26.25it/s][A
Epoch 1:  35%|███▌      | 2117/5971 [19:08<34:49,  1.84it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  85%|████████▌ | 142/167 [00:06<00:00, 26.19it/s][A
Epoch 1:  36%|███▌      | 2121/5971 [19:08<34:43,  1.85it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 26.68it/s][A

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 26.57it/s][A
Epoch 1:  36%|███▌      | 2125/5971 [19:08<34:38,  1.85it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 27.47it/s][A
Epoch 1:  36%|███▌      | 2129/5971 [19:08<34:32,  1.85it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  92%|█████████▏| 154/167 [00:07<00:00, 27.63it/s][A
Epoch 1:  36%|███▌      | 2133/5971 [19:09<34:26,  1.86it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  94%|█████████▍| 157/167 [00:07<00:00, 28.17it/s][A

Validating:  96%|█████████▌| 160/167 [00:07<00:00, 27.44it/s][A
Epoch 1:  36%|███▌      | 2137/5971 [19:09<34:20,  1.86it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  98%|█████████▊| 163/167 [00:07<00:00, 26.48it/s][A
Epoch 1:  36%|███▌      | 2141/5971 [19:09<34:15,  1.86it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  99%|█████████▉| 166/167 [00:07<00:00, 26.58it/s][A
Epoch 1:  36%|███▌      | 2144/5971 [19:09<34:11,  1.87it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Epoch 1:  36%|███▌      | 2145/5971 [19:10<34:11,  1.86it/s, loss=0.194, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00314, train/loss_step=0.426, global_step=774.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▌      | 2145/5971 [19:10<34:11,  1.86it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0493, train/loss_vlb_step=0.00017, train/loss_step=0.0493, global_step=775.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▌      | 2146/5971 [19:11<34:11,  1.86it/s, loss=0.196, v_num=0, train/loss_simple_step=0.00357, train/loss_vlb_step=1.97e-5, train/loss_step=0.00357, global_step=775.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▌      | 2147/5971 [19:12<34:11,  1.86it/s, loss=0.195, v_num=0, train/loss_simple_step=0.00919, train/loss_vlb_step=4.45e-5, train/loss_step=0.00919, global_step=775.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▌      | 2148/5971 [19:14<34:14,  1.86it/s, loss=0.195, v_num=0, train/loss_simple_step=0.00192, train/loss_vlb_step=1.17e-5, train/loss_step=0.00192, global_step=775.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▌      | 2149/5971 [19:15<34:14,  1.86it/s, loss=0.195, v_num=0, train/loss_simple_step=0.00192, train/loss_vlb_step=1.17e-5, train/loss_step=0.00192, global_step=775.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▌      | 2149/5971 [19:15<34:14,  1.86it/s, loss=0.179, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.00041, train/loss_step=0.124, global_step=776.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  36%|███▌      | 2150/5971 [19:16<34:14,  1.86it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0179, train/loss_vlb_step=7.21e-5, train/loss_step=0.0179, global_step=776.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▌      | 2151/5971 [19:17<34:14,  1.86it/s, loss=0.184, v_num=0, train/loss_simple_step=0.490, train/loss_vlb_step=0.00708, train/loss_step=0.490, global_step=776.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  36%|███▌      | 2152/5971 [19:19<34:17,  1.86it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0665, train/loss_vlb_step=0.000226, train/loss_step=0.0665, global_step=776.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▌      | 2153/5971 [19:20<34:17,  1.86it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0665, train/loss_vlb_step=0.000226, train/loss_step=0.0665, global_step=776.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▌      | 2153/5971 [19:20<34:17,  1.86it/s, loss=0.173, v_num=0, train/loss_simple_step=0.518, train/loss_vlb_step=0.00603, train/loss_step=0.518, global_step=777.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  36%|███▌      | 2154/5971 [19:21<34:17,  1.85it/s, loss=0.193, v_num=0, train/loss_simple_step=0.436, train/loss_vlb_step=0.00358, train/loss_step=0.436, global_step=777.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▌      | 2155/5971 [19:22<34:17,  1.85it/s, loss=0.199, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000363, train/loss_step=0.109, global_step=777.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▌      | 2156/5971 [19:25<34:21,  1.85it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0239, train/loss_vlb_step=9.09e-5, train/loss_step=0.0239, global_step=777.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▌      | 2157/5971 [19:26<34:21,  1.85it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0239, train/loss_vlb_step=9.09e-5, train/loss_step=0.0239, global_step=777.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▌      | 2157/5971 [19:26<34:21,  1.85it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00247, train/loss_vlb_step=1.39e-5, train/loss_step=0.00247, global_step=778.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▌      | 2158/5971 [19:27<34:21,  1.85it/s, loss=0.189, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000421, train/loss_step=0.128, global_step=778.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  36%|███▌      | 2159/5971 [19:28<34:21,  1.85it/s, loss=0.19, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000453, train/loss_step=0.138, global_step=778.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  36%|███▌      | 2160/5971 [19:30<34:23,  1.85it/s, loss=0.19, v_num=0, train/loss_simple_step=0.00227, train/loss_vlb_step=1.36e-5, train/loss_step=0.00227, global_step=778.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▌      | 2161/5971 [19:31<34:24,  1.85it/s, loss=0.19, v_num=0, train/loss_simple_step=0.00227, train/loss_vlb_step=1.36e-5, train/loss_step=0.00227, global_step=778.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▌      | 2161/5971 [19:31<34:24,  1.85it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0041, train/loss_vlb_step=2.13e-5, train/loss_step=0.0041, global_step=779.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  36%|███▌      | 2162/5971 [19:32<34:24,  1.85it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0032, train/loss_vlb_step=1.77e-5, train/loss_step=0.0032, global_step=779.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▌      | 2163/5971 [19:33<34:24,  1.84it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0277, train/loss_vlb_step=0.000115, train/loss_step=0.0277, global_step=779.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▌      | 2164/5971 [19:35<34:26,  1.84it/s, loss=0.115, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000456, train/loss_step=0.138, global_step=779.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  36%|███▋      | 2165/5971 [19:36<34:26,  1.84it/s, loss=0.115, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000456, train/loss_step=0.138, global_step=779.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▋      | 2165/5971 [19:36<34:26,  1.84it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0765, train/loss_vlb_step=0.000257, train/loss_step=0.0765, global_step=780.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▋      | 2166/5971 [19:37<34:26,  1.84it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00259, train/loss_vlb_step=1.43e-5, train/loss_step=0.00259, global_step=780.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▋      | 2167/5971 [19:37<34:26,  1.84it/s, loss=0.133, v_num=0, train/loss_simple_step=0.355, train/loss_vlb_step=0.00203, train/loss_step=0.355, global_step=780.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  36%|███▋      | 2168/5971 [19:40<34:29,  1.84it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0399, train/loss_vlb_step=0.000145, train/loss_step=0.0399, global_step=780.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▋      | 2169/5971 [19:41<34:29,  1.84it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0399, train/loss_vlb_step=0.000145, train/loss_step=0.0399, global_step=780.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▋      | 2169/5971 [19:41<34:29,  1.84it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0213, train/loss_vlb_step=8.94e-5, train/loss_step=0.0213, global_step=781.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  36%|███▋      | 2170/5971 [19:42<34:29,  1.84it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=5.92e-5, train/loss_step=0.0136, global_step=781.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▋      | 2171/5971 [19:43<34:29,  1.84it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=6.24e-5, train/loss_step=0.0143, global_step=781.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▋      | 2172/5971 [19:45<34:32,  1.83it/s, loss=0.127, v_num=0, train/loss_simple_step=0.488, train/loss_vlb_step=0.00331, train/loss_step=0.488, global_step=781.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  36%|███▋      | 2173/5971 [19:46<34:32,  1.83it/s, loss=0.127, v_num=0, train/loss_simple_step=0.488, train/loss_vlb_step=0.00331, train/loss_step=0.488, global_step=781.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▋      | 2173/5971 [19:46<34:32,  1.83it/s, loss=0.107, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000383, train/loss_step=0.116, global_step=782.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▋      | 2174/5971 [19:47<34:32,  1.83it/s, loss=0.096, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000941, train/loss_step=0.217, global_step=782.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▋      | 2175/5971 [19:47<34:32,  1.83it/s, loss=0.0907, v_num=0, train/loss_simple_step=0.0041, train/loss_vlb_step=2.26e-5, train/loss_step=0.0041, global_step=782.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▋      | 2176/5971 [19:50<34:35,  1.83it/s, loss=0.102, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.00115, train/loss_step=0.247, global_step=782.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  36%|███▋      | 2177/5971 [19:51<34:35,  1.83it/s, loss=0.102, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.00115, train/loss_step=0.247, global_step=782.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▋      | 2177/5971 [19:51<34:35,  1.83it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0797, train/loss_vlb_step=0.000281, train/loss_step=0.0797, global_step=783.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▋      | 2178/5971 [19:52<34:35,  1.83it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0928, train/loss_vlb_step=0.000307, train/loss_step=0.0928, global_step=783.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  36%|███▋      | 2179/5971 [19:52<34:35,  1.83it/s, loss=0.136, v_num=0, train/loss_simple_step=0.769, train/loss_vlb_step=0.0309, train/loss_step=0.769, global_step=783.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  37%|███▋      | 2180/5971 [19:55<34:37,  1.82it/s, loss=0.148, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.000957, train/loss_step=0.251, global_step=783.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2181/5971 [19:56<34:37,  1.82it/s, loss=0.148, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.000957, train/loss_step=0.251, global_step=783.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2181/5971 [19:56<34:37,  1.82it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=5.8e-5, train/loss_step=0.0136, global_step=784.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2182/5971 [19:56<34:37,  1.82it/s, loss=0.155, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000429, train/loss_step=0.128, global_step=784.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2183/5971 [19:57<34:37,  1.82it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00961, train/loss_vlb_step=4.56e-5, train/loss_step=0.00961, global_step=784.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2184/5971 [19:59<34:39,  1.82it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00212, train/loss_vlb_step=1.28e-5, train/loss_step=0.00212, global_step=784.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2185/5971 [20:00<34:39,  1.82it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00212, train/loss_vlb_step=1.28e-5, train/loss_step=0.00212, global_step=784.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2185/5971 [20:00<34:39,  1.82it/s, loss=0.183, v_num=0, train/loss_simple_step=0.803, train/loss_vlb_step=0.0416, train/loss_step=0.803, global_step=785.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  37%|███▋      | 2186/5971 [20:01<34:39,  1.82it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0256, train/loss_vlb_step=0.0001, train/loss_step=0.0256, global_step=785.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2187/5971 [20:02<34:39,  1.82it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00536, train/loss_vlb_step=2.57e-5, train/loss_step=0.00536, global_step=785.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2188/5971 [20:04<34:41,  1.82it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0074, train/loss_vlb_step=3.69e-5, train/loss_step=0.0074, global_step=785.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  37%|███▋      | 2189/5971 [20:05<34:42,  1.82it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0074, train/loss_vlb_step=3.69e-5, train/loss_step=0.0074, global_step=785.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2189/5971 [20:05<34:42,  1.82it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00955, train/loss_vlb_step=4.46e-5, train/loss_step=0.00955, global_step=786.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2190/5971 [20:06<34:42,  1.82it/s, loss=0.176, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.000891, train/loss_step=0.231, global_step=786.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  37%|███▋      | 2191/5971 [20:07<34:42,  1.82it/s, loss=0.18, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000345, train/loss_step=0.105, global_step=786.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  37%|███▋      | 2192/5971 [20:09<34:44,  1.81it/s, loss=0.161, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000362, train/loss_step=0.108, global_step=786.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2193/5971 [20:10<34:44,  1.81it/s, loss=0.161, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000362, train/loss_step=0.108, global_step=786.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2193/5971 [20:10<34:44,  1.81it/s, loss=0.162, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000443, train/loss_step=0.134, global_step=787.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2194/5971 [20:11<34:44,  1.81it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0148, train/loss_vlb_step=6.54e-5, train/loss_step=0.0148, global_step=787.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2195/5971 [20:12<34:44,  1.81it/s, loss=0.158, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000391, train/loss_step=0.119, global_step=787.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  37%|███▋      | 2196/5971 [20:14<34:47,  1.81it/s, loss=0.155, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000617, train/loss_step=0.184, global_step=787.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2197/5971 [20:15<34:47,  1.81it/s, loss=0.155, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000617, train/loss_step=0.184, global_step=787.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2197/5971 [20:15<34:47,  1.81it/s, loss=0.158, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.000488, train/loss_step=0.146, global_step=788.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2198/5971 [20:16<34:47,  1.81it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.33e-5, train/loss_step=0.0124, global_step=788.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2199/5971 [20:17<34:47,  1.81it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00612, train/loss_vlb_step=3.2e-5, train/loss_step=0.00612, global_step=788.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2200/5971 [20:19<34:49,  1.80it/s, loss=0.128, v_num=0, train/loss_simple_step=0.504, train/loss_vlb_step=0.00409, train/loss_step=0.504, global_step=788.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  37%|███▋      | 2201/5971 [20:20<34:49,  1.80it/s, loss=0.128, v_num=0, train/loss_simple_step=0.504, train/loss_vlb_step=0.00409, train/loss_step=0.504, global_step=788.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2201/5971 [20:20<34:49,  1.80it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0266, train/loss_vlb_step=0.000109, train/loss_step=0.0266, global_step=789.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2202/5971 [20:21<34:49,  1.80it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00392, train/loss_vlb_step=2.11e-5, train/loss_step=0.00392, global_step=789.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2203/5971 [20:22<34:49,  1.80it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0739, train/loss_vlb_step=0.000243, train/loss_step=0.0739, global_step=789.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  37%|███▋      | 2204/5971 [20:24<34:51,  1.80it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0172, train/loss_vlb_step=7.07e-5, train/loss_step=0.0172, global_step=789.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  37%|███▋      | 2205/5971 [20:25<34:51,  1.80it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0172, train/loss_vlb_step=7.07e-5, train/loss_step=0.0172, global_step=789.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2205/5971 [20:25<34:51,  1.80it/s, loss=0.0926, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000389, train/loss_step=0.118, global_step=790.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2206/5971 [20:26<34:51,  1.80it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.00354, train/loss_vlb_step=1.94e-5, train/loss_step=0.00354, global_step=790.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2207/5971 [20:26<34:51,  1.80it/s, loss=0.0921, v_num=0, train/loss_simple_step=0.0182, train/loss_vlb_step=7.8e-5, train/loss_step=0.0182, global_step=790.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  37%|███▋      | 2208/5971 [20:29<34:53,  1.80it/s, loss=0.0943, v_num=0, train/loss_simple_step=0.0505, train/loss_vlb_step=0.000181, train/loss_step=0.0505, global_step=790.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2209/5971 [20:30<34:53,  1.80it/s, loss=0.0943, v_num=0, train/loss_simple_step=0.0505, train/loss_vlb_step=0.000181, train/loss_step=0.0505, global_step=790.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2209/5971 [20:30<34:53,  1.80it/s, loss=0.0939, v_num=0, train/loss_simple_step=0.00237, train/loss_vlb_step=1.37e-5, train/loss_step=0.00237, global_step=791.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2210/5971 [20:30<34:53,  1.80it/s, loss=0.0896, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.00048, train/loss_step=0.145, global_step=791.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  37%|███▋      | 2211/5971 [20:31<34:53,  1.80it/s, loss=0.0901, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000374, train/loss_step=0.113, global_step=791.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2212/5971 [20:34<34:56,  1.79it/s, loss=0.0998, v_num=0, train/loss_simple_step=0.303, train/loss_vlb_step=0.00131, train/loss_step=0.303, global_step=791.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  37%|███▋      | 2213/5971 [20:35<34:56,  1.79it/s, loss=0.0998, v_num=0, train/loss_simple_step=0.303, train/loss_vlb_step=0.00131, train/loss_step=0.303, global_step=791.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2213/5971 [20:35<34:56,  1.79it/s, loss=0.107, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00135, train/loss_step=0.269, global_step=792.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  37%|███▋      | 2214/5971 [20:36<34:56,  1.79it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0233, train/loss_vlb_step=9.05e-5, train/loss_step=0.0233, global_step=792.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2215/5971 [20:37<34:56,  1.79it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0322, train/loss_vlb_step=0.000125, train/loss_step=0.0322, global_step=792.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2216/5971 [20:39<34:59,  1.79it/s, loss=0.0961, v_num=0, train/loss_simple_step=0.0531, train/loss_vlb_step=0.000183, train/loss_step=0.0531, global_step=792.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2217/5971 [20:40<34:59,  1.79it/s, loss=0.0961, v_num=0, train/loss_simple_step=0.0531, train/loss_vlb_step=0.000183, train/loss_step=0.0531, global_step=792.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2217/5971 [20:40<34:59,  1.79it/s, loss=0.0894, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=5.09e-5, train/loss_step=0.011, global_step=793.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  37%|███▋      | 2218/5971 [20:41<34:59,  1.79it/s, loss=0.106, v_num=0, train/loss_simple_step=0.355, train/loss_vlb_step=0.00174, train/loss_step=0.355, global_step=793.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  37%|███▋      | 2219/5971 [20:42<34:59,  1.79it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0314, train/loss_vlb_step=0.000115, train/loss_step=0.0314, global_step=793.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2220/5971 [20:44<35:02,  1.78it/s, loss=0.117, v_num=0, train/loss_simple_step=0.691, train/loss_vlb_step=0.0176, train/loss_step=0.691, global_step=793.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  37%|███▋      | 2221/5971 [20:45<35:02,  1.78it/s, loss=0.117, v_num=0, train/loss_simple_step=0.691, train/loss_vlb_step=0.0176, train/loss_step=0.691, global_step=793.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2221/5971 [20:45<35:02,  1.78it/s, loss=0.13, v_num=0, train/loss_simple_step=0.280, train/loss_vlb_step=0.0014, train/loss_step=0.280, global_step=794.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  37%|███▋      | 2222/5971 [20:46<35:02,  1.78it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0396, train/loss_vlb_step=0.000138, train/loss_step=0.0396, global_step=794.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2223/5971 [20:47<35:02,  1.78it/s, loss=0.145, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00202, train/loss_step=0.349, global_step=794.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  37%|███▋      | 2224/5971 [20:49<35:04,  1.78it/s, loss=0.15, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000353, train/loss_step=0.106, global_step=794.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2225/5971 [20:50<35:04,  1.78it/s, loss=0.15, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000353, train/loss_step=0.106, global_step=794.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2225/5971 [20:50<35:04,  1.78it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0725, train/loss_vlb_step=0.000241, train/loss_step=0.0725, global_step=795.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2226/5971 [20:51<35:04,  1.78it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0496, train/loss_vlb_step=0.00018, train/loss_step=0.0496, global_step=795.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  37%|███▋      | 2227/5971 [20:52<35:04,  1.78it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00306, train/loss_vlb_step=1.69e-5, train/loss_step=0.00306, global_step=795.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2228/5971 [20:54<35:06,  1.78it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0186, train/loss_vlb_step=7.43e-5, train/loss_step=0.0186, global_step=795.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  37%|███▋      | 2229/5971 [20:55<35:06,  1.78it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0186, train/loss_vlb_step=7.43e-5, train/loss_step=0.0186, global_step=795.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2229/5971 [20:55<35:06,  1.78it/s, loss=0.158, v_num=0, train/loss_simple_step=0.221, train/loss_vlb_step=0.000863, train/loss_step=0.221, global_step=796.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  37%|███▋      | 2230/5971 [20:56<35:06,  1.78it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00231, train/loss_vlb_step=1.36e-5, train/loss_step=0.00231, global_step=796.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2231/5971 [20:57<35:06,  1.78it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00763, train/loss_vlb_step=3.7e-5, train/loss_step=0.00763, global_step=796.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  37%|███▋      | 2232/5971 [20:59<35:08,  1.77it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00501, train/loss_vlb_step=2.63e-5, train/loss_step=0.00501, global_step=796.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2233/5971 [21:00<35:08,  1.77it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00501, train/loss_vlb_step=2.63e-5, train/loss_step=0.00501, global_step=796.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2233/5971 [21:00<35:08,  1.77it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0478, train/loss_vlb_step=0.00017, train/loss_step=0.0478, global_step=797.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  37%|███▋      | 2234/5971 [21:01<35:08,  1.77it/s, loss=0.128, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000696, train/loss_step=0.188, global_step=797.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2235/5971 [21:02<35:08,  1.77it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00764, train/loss_vlb_step=3.68e-5, train/loss_step=0.00764, global_step=797.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2236/5971 [21:04<35:11,  1.77it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00182, train/loss_vlb_step=1.1e-5, train/loss_step=0.00182, global_step=797.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  37%|███▋      | 2237/5971 [21:05<35:11,  1.77it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00182, train/loss_vlb_step=1.1e-5, train/loss_step=0.00182, global_step=797.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2237/5971 [21:05<35:11,  1.77it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000294, train/loss_step=0.0883, global_step=798.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  37%|███▋      | 2238/5971 [21:06<35:11,  1.77it/s, loss=0.12, v_num=0, train/loss_simple_step=0.198, train/loss_vlb_step=0.00075, train/loss_step=0.198, global_step=798.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  37%|███▋      | 2239/5971 [21:07<35:11,  1.77it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0678, train/loss_vlb_step=0.000229, train/loss_step=0.0678, global_step=798.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  38%|███▊      | 2240/5971 [21:09<35:13,  1.77it/s, loss=0.088, v_num=0, train/loss_simple_step=0.00588, train/loss_vlb_step=2.96e-5, train/loss_step=0.00588, global_step=798.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  38%|███▊      | 2241/5971 [21:10<35:13,  1.77it/s, loss=0.088, v_num=0, train/loss_simple_step=0.00588, train/loss_vlb_step=2.96e-5, train/loss_step=0.00588, global_step=798.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  38%|███▊      | 2241/5971 [21:10<35:13,  1.77it/s, loss=0.0862, v_num=0, train/loss_simple_step=0.245, train/loss_vlb_step=0.000948, train/loss_step=0.245, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  38%|███▊      | 2242/5971 [21:11<35:13,  1.76it/s, loss=0.0845, v_num=0, train/loss_simple_step=0.00447, train/loss_vlb_step=2.37e-5, train/loss_step=0.00447, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  38%|███▊      | 2243/5971 [21:11<35:13,  1.76it/s, loss=0.0721, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000333, train/loss_step=0.101, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  38%|███▊      | 2244/5971 [21:14<35:16,  1.76it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  38%|███▊      | 2245/5971 [21:15<35:15,  1.76it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating: 0it [00:00, ?it/s]
Validating:   0%|          | 0/167 [00:00<?, ?it/s]
Validating:   1%|          | 1/167 [00:00<01:19,  2.09it/s]
Validating:   1%|          | 2/167 [00:00<00:58,  2.83it/s]
Epoch 1:  38%|███▊      | 2249/5971 [21:15<35:10,  1.76it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   3%|▎         | 5/167 [00:00<00:21,  7.62it/s]
Validating:   5%|▍         | 8/167 [00:00<00:13, 11.66it/s]
Epoch 1:  38%|███▊      | 2253/5971 [21:16<35:04,  1.77it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   7%|▋         | 11/167 [00:01<00:10, 14.94it/s]
Epoch 1:  38%|███▊      | 2257/5971 [21:16<34:59,  1.77it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   8%|▊         | 14/167 [00:01<00:08, 18.20it/s]
Epoch 1:  38%|███▊      | 2261/5971 [21:16<34:53,  1.77it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  10%|█         | 17/167 [00:01<00:07, 19.87it/s]
Validating:  12%|█▏        | 20/167 [00:01<00:06, 21.88it/s]
Epoch 1:  38%|███▊      | 2265/5971 [21:16<34:47,  1.78it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 22.91it/s]
Epoch 1:  38%|███▊      | 2269/5971 [21:16<34:41,  1.78it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.36it/s]
Epoch 1:  38%|███▊      | 2273/5971 [21:16<34:36,  1.78it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 25.67it/s]
Validating:  19%|█▉        | 32/167 [00:01<00:05, 26.08it/s]
Epoch 1:  38%|███▊      | 2277/5971 [21:16<34:30,  1.78it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  21%|██        | 35/167 [00:02<00:05, 25.61it/s]
Epoch 1:  38%|███▊      | 2281/5971 [21:17<34:25,  1.79it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  23%|██▎       | 38/167 [00:02<00:04, 26.19it/s]
Epoch 1:  38%|███▊      | 2285/5971 [21:17<34:19,  1.79it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 26.20it/s]
Validating:  26%|██▋       | 44/167 [00:02<00:04, 27.02it/s]
Epoch 1:  38%|███▊      | 2289/5971 [21:17<34:13,  1.79it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 27.61it/s]
Epoch 1:  38%|███▊      | 2293/5971 [21:17<34:08,  1.80it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 26.66it/s]
Epoch 1:  38%|███▊      | 2297/5971 [21:17<34:02,  1.80it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 27.30it/s]
Validating:  34%|███▎      | 56/167 [00:02<00:03, 27.83it/s]
Epoch 1:  39%|███▊      | 2301/5971 [21:17<33:57,  1.80it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  35%|███▌      | 59/167 [00:03<00:05, 20.92it/s]
Epoch 1:  39%|███▊      | 2305/5971 [21:18<33:51,  1.80it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  37%|███▋      | 62/167 [00:03<00:04, 22.65it/s]
Epoch 1:  39%|███▊      | 2309/5971 [21:18<33:46,  1.81it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  39%|███▉      | 65/167 [00:03<00:04, 23.81it/s]
Validating:  41%|████      | 68/167 [00:03<00:03, 25.12it/s]
Epoch 1:  39%|███▊      | 2313/5971 [21:18<33:40,  1.81it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 24.95it/s]
Epoch 1:  39%|███▉      | 2317/5971 [21:18<33:35,  1.81it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 25.51it/s]
Epoch 1:  39%|███▉      | 2321/5971 [21:18<33:29,  1.82it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 25.72it/s]
Validating:  48%|████▊     | 80/167 [00:03<00:03, 25.78it/s]
Epoch 1:  39%|███▉      | 2325/5971 [21:18<33:24,  1.82it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 26.19it/s]
Epoch 1:  39%|███▉      | 2329/5971 [21:18<33:19,  1.82it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  51%|█████▏    | 86/167 [00:04<00:03, 26.42it/s]
Epoch 1:  39%|███▉      | 2333/5971 [21:19<33:13,  1.82it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  53%|█████▎    | 89/167 [00:04<00:02, 26.40it/s]
Validating:  55%|█████▌    | 92/167 [00:04<00:02, 26.75it/s]
Epoch 1:  39%|███▉      | 2337/5971 [21:19<33:08,  1.83it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 25.22it/s]
Epoch 1:  39%|███▉      | 2341/5971 [21:19<33:03,  1.83it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 26.21it/s]
Epoch 1:  39%|███▉      | 2345/5971 [21:19<32:57,  1.83it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  60%|██████    | 101/167 [00:04<00:02, 25.60it/s]
Validating:  62%|██████▏   | 104/167 [00:04<00:02, 25.53it/s]
Epoch 1:  39%|███▉      | 2349/5971 [21:19<32:52,  1.84it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 25.59it/s]
Epoch 1:  39%|███▉      | 2353/5971 [21:19<32:47,  1.84it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 25.10it/s]
Epoch 1:  39%|███▉      | 2357/5971 [21:20<32:41,  1.84it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  68%|██████▊   | 113/167 [00:05<00:02, 25.82it/s]
Validating:  69%|██████▉   | 116/167 [00:05<00:02, 25.24it/s]
Epoch 1:  40%|███▉      | 2361/5971 [21:20<32:36,  1.85it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 24.47it/s]
Epoch 1:  40%|███▉      | 2365/5971 [21:20<32:31,  1.85it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 23.44it/s]
Epoch 1:  40%|███▉      | 2369/5971 [21:20<32:26,  1.85it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 24.35it/s]
Validating:  77%|███████▋  | 128/167 [00:05<00:01, 24.34it/s]
Epoch 1:  40%|███▉      | 2373/5971 [21:20<32:21,  1.85it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 24.68it/s]
Epoch 1:  40%|███▉      | 2377/5971 [21:20<32:15,  1.86it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  80%|████████  | 134/167 [00:05<00:01, 23.43it/s][A
Epoch 1:  40%|███▉      | 2381/5971 [21:21<32:10,  1.86it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  82%|████████▏ | 137/167 [00:06<00:01, 23.92it/s][A

Validating:  84%|████████▍ | 140/167 [00:06<00:01, 24.51it/s][A
Epoch 1:  40%|███▉      | 2385/5971 [21:21<32:05,  1.86it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 24.22it/s][A
Epoch 1:  40%|████      | 2389/5971 [21:21<32:00,  1.87it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 24.92it/s][A
Epoch 1:  40%|████      | 2393/5971 [21:21<31:55,  1.87it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 26.03it/s][A

Validating:  91%|█████████ | 152/167 [00:06<00:00, 26.09it/s][A
Epoch 1:  40%|████      | 2397/5971 [21:21<31:50,  1.87it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 26.55it/s][A
Epoch 1:  40%|████      | 2401/5971 [21:21<31:45,  1.87it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 26.63it/s][A
Epoch 1:  40%|████      | 2405/5971 [21:21<31:40,  1.88it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 25.85it/s][A

Validating:  98%|█████████▊| 164/167 [00:07<00:00, 25.18it/s][A
Epoch 1:  40%|████      | 2409/5971 [21:22<31:35,  1.88it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating: 100%|██████████| 167/167 [00:07<00:00, 24.84it/s][A
Epoch 1:  40%|████      | 2412/5971 [21:22<31:31,  1.88it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.36it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.44it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.28it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.92it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.38it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.71it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.96it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.14it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.26it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.33it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.38it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.45it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.47it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.50it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.51it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.47it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.45it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.44it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.43it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.44it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.43it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.43it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.43it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.43it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.42it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.41it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.36it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.35it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.35it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.35it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.35it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.21it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.25it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:03,  5.29it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.31it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.33it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.34it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.32it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.30it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.26it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.27it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.30it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.21it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.15it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.12it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.03it/s][A
Epoch 1:  40%|████      | 2412/5971 [21:33<31:47,  1.87it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.11it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.16it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.18it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.24it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.06it/s]

Epoch 1:  40%|████      | 2413/5971 [21:34<31:48,  1.86it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000598, train/loss_step=0.180, global_step=799.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  40%|████      | 2413/5971 [21:34<31:48,  1.86it/s, loss=0.0724, v_num=0, train/loss_simple_step=0.00396, train/loss_vlb_step=2.17e-5, train/loss_step=0.00396, global_step=800.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.30it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.33it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:15,  3.10it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.70it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.15it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.51it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.81it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.94it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.06it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.17it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.24it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.30it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:06,  5.37it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.30it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.35it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.38it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.42it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.46it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.49it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.49it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.46it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.45it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.49it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.43it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.43it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.43it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.45it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.48it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.47it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.40it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.41it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.44it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.45it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.50it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.54it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.52it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.51it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.55it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.56it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.59it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.59it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.58it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.56it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.58it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.55it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.52it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.44it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.41it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.38it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.35it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.12it/s]

Epoch 1:  40%|████      | 2414/5971 [21:47<32:05,  1.85it/s, loss=0.0724, v_num=0, train/loss_simple_step=0.00396, train/loss_vlb_step=2.17e-5, train/loss_step=0.00396, global_step=800.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  40%|████      | 2414/5971 [21:47<32:05,  1.85it/s, loss=0.0706, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=6.31e-5, train/loss_step=0.0142, global_step=800.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.32it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.38it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.21it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.86it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.35it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.70it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.95it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.16it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.31it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.44it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.52it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.57it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.43it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.50it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.56it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.60it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.63it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.61it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.59it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.50it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.42it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.38it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.32it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.25it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.23it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.21it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.15it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.22it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.27it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.27it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.29it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.37it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.30it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:03,  5.26it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.24it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.31it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.37it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.46it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.47it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.53it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.58it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.62it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.62it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.60it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.57it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.53it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.51it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.51it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.50it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.51it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.14it/s]

Epoch 1:  40%|████      | 2415/5971 [21:59<32:21,  1.83it/s, loss=0.0706, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=6.31e-5, train/loss_step=0.0142, global_step=800.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  40%|████      | 2415/5971 [21:59<32:21,  1.83it/s, loss=0.0806, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000707, train/loss_step=0.202, global_step=800.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.43it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.22it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.87it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.28it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.62it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.88it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.07it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.17it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.27it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.35it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.40it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.44it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.45it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.47it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.47it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.48it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.48it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.49it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.48it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.47it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.38it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.38it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.40it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.42it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.41it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.41it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.32it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.31it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.32it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.32it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.35it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.32it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.35it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.39it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.47it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.47it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.53it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.57it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.61it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.63it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.64it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.60it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.60it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.60it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.58it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.55it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.56it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.57it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.60it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.16it/s]

Epoch 1:  40%|████      | 2416/5971 [22:12<32:39,  1.81it/s, loss=0.0806, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000707, train/loss_step=0.202, global_step=800.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  40%|████      | 2416/5971 [22:12<32:39,  1.81it/s, loss=0.115, v_num=0, train/loss_simple_step=0.708, train/loss_vlb_step=0.0143, train/loss_step=0.708, global_step=800.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  40%|████      | 2417/5971 [22:13<32:39,  1.81it/s, loss=0.115, v_num=0, train/loss_simple_step=0.708, train/loss_vlb_step=0.0143, train/loss_step=0.708, global_step=800.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  40%|████      | 2417/5971 [22:13<32:39,  1.81it/s, loss=0.123, v_num=0, train/loss_simple_step=0.378, train/loss_vlb_step=0.00262, train/loss_step=0.378, global_step=801.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  40%|████      | 2418/5971 [22:14<32:39,  1.81it/s, loss=0.123, v_num=0, train/loss_simple_step=0.378, train/loss_vlb_step=0.00262, train/loss_step=0.378, global_step=801.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  40%|████      | 2418/5971 [22:14<32:39,  1.81it/s, loss=0.139, v_num=0, train/loss_simple_step=0.326, train/loss_vlb_step=0.00152, train/loss_step=0.326, global_step=801.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2419/5971 [22:15<32:39,  1.81it/s, loss=0.139, v_num=0, train/loss_simple_step=0.326, train/loss_vlb_step=0.00152, train/loss_step=0.326, global_step=801.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2419/5971 [22:15<32:39,  1.81it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0017, train/loss_vlb_step=1.02e-5, train/loss_step=0.0017, global_step=801.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2420/5971 [22:18<32:43,  1.81it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0017, train/loss_vlb_step=1.02e-5, train/loss_step=0.0017, global_step=801.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2420/5971 [22:18<32:43,  1.81it/s, loss=0.162, v_num=0, train/loss_simple_step=0.463, train/loss_vlb_step=0.00372, train/loss_step=0.463, global_step=801.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  41%|████      | 2421/5971 [22:19<32:43,  1.81it/s, loss=0.162, v_num=0, train/loss_simple_step=0.463, train/loss_vlb_step=0.00372, train/loss_step=0.463, global_step=801.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2421/5971 [22:19<32:43,  1.81it/s, loss=0.164, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.00034, train/loss_step=0.103, global_step=802.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2422/5971 [22:20<32:43,  1.81it/s, loss=0.164, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.00034, train/loss_step=0.103, global_step=802.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2422/5971 [22:20<32:43,  1.81it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0238, train/loss_vlb_step=9.14e-5, train/loss_step=0.0238, global_step=802.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2423/5971 [22:21<32:43,  1.81it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0238, train/loss_vlb_step=9.14e-5, train/loss_step=0.0238, global_step=802.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2423/5971 [22:21<32:43,  1.81it/s, loss=0.184, v_num=0, train/loss_simple_step=0.574, train/loss_vlb_step=0.0112, train/loss_step=0.574, global_step=802.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  41%|████      | 2424/5971 [22:23<32:45,  1.80it/s, loss=0.184, v_num=0, train/loss_simple_step=0.574, train/loss_vlb_step=0.0112, train/loss_step=0.574, global_step=802.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2424/5971 [22:23<32:45,  1.80it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0368, train/loss_vlb_step=0.000129, train/loss_step=0.0368, global_step=802.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2425/5971 [22:24<32:45,  1.80it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0368, train/loss_vlb_step=0.000129, train/loss_step=0.0368, global_step=802.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2425/5971 [22:24<32:45,  1.80it/s, loss=0.182, v_num=0, train/loss_simple_step=0.005, train/loss_vlb_step=2.62e-5, train/loss_step=0.005, global_step=803.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  41%|████      | 2426/5971 [22:25<32:45,  1.80it/s, loss=0.182, v_num=0, train/loss_simple_step=0.005, train/loss_vlb_step=2.62e-5, train/loss_step=0.005, global_step=803.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2426/5971 [22:25<32:45,  1.80it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0223, train/loss_vlb_step=8.61e-5, train/loss_step=0.0223, global_step=803.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2427/5971 [22:26<32:45,  1.80it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0223, train/loss_vlb_step=8.61e-5, train/loss_step=0.0223, global_step=803.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2427/5971 [22:26<32:45,  1.80it/s, loss=0.188, v_num=0, train/loss_simple_step=0.361, train/loss_vlb_step=0.00215, train/loss_step=0.361, global_step=803.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  41%|████      | 2428/5971 [22:29<32:47,  1.80it/s, loss=0.188, v_num=0, train/loss_simple_step=0.361, train/loss_vlb_step=0.00215, train/loss_step=0.361, global_step=803.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2428/5971 [22:29<32:47,  1.80it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00224, train/loss_vlb_step=1.3e-5, train/loss_step=0.00224, global_step=803.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2429/5971 [22:30<32:47,  1.80it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00224, train/loss_vlb_step=1.3e-5, train/loss_step=0.00224, global_step=803.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2429/5971 [22:30<32:47,  1.80it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0234, train/loss_vlb_step=9.22e-5, train/loss_step=0.0234, global_step=804.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  41%|████      | 2430/5971 [22:31<32:47,  1.80it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0234, train/loss_vlb_step=9.22e-5, train/loss_step=0.0234, global_step=804.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2430/5971 [22:31<32:47,  1.80it/s, loss=0.183, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.00049, train/loss_step=0.140, global_step=804.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  41%|████      | 2431/5971 [22:31<32:47,  1.80it/s, loss=0.183, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.00049, train/loss_step=0.140, global_step=804.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2431/5971 [22:31<32:47,  1.80it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0542, train/loss_vlb_step=0.000185, train/loss_step=0.0542, global_step=804.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2432/5971 [22:34<32:49,  1.80it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0542, train/loss_vlb_step=0.000185, train/loss_step=0.0542, global_step=804.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2432/5971 [22:34<32:49,  1.80it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00605, train/loss_vlb_step=2.84e-5, train/loss_step=0.00605, global_step=804.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2433/5971 [22:35<32:49,  1.80it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00605, train/loss_vlb_step=2.84e-5, train/loss_step=0.00605, global_step=804.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2433/5971 [22:35<32:49,  1.80it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00327, train/loss_vlb_step=1.77e-5, train/loss_step=0.00327, global_step=805.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2434/5971 [22:36<32:49,  1.80it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00327, train/loss_vlb_step=1.77e-5, train/loss_step=0.00327, global_step=805.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2434/5971 [22:36<32:49,  1.80it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0146, train/loss_vlb_step=6.39e-5, train/loss_step=0.0146, global_step=805.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  41%|████      | 2435/5971 [22:36<32:49,  1.80it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0146, train/loss_vlb_step=6.39e-5, train/loss_step=0.0146, global_step=805.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2435/5971 [22:36<32:49,  1.80it/s, loss=0.184, v_num=0, train/loss_simple_step=0.444, train/loss_vlb_step=0.00403, train/loss_step=0.444, global_step=805.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  41%|████      | 2436/5971 [22:39<32:52,  1.79it/s, loss=0.184, v_num=0, train/loss_simple_step=0.444, train/loss_vlb_step=0.00403, train/loss_step=0.444, global_step=805.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2436/5971 [22:39<32:52,  1.79it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00257, train/loss_vlb_step=1.49e-5, train/loss_step=0.00257, global_step=805.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2437/5971 [22:40<32:52,  1.79it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00257, train/loss_vlb_step=1.49e-5, train/loss_step=0.00257, global_step=805.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2437/5971 [22:40<32:52,  1.79it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0821, train/loss_vlb_step=0.000275, train/loss_step=0.0821, global_step=806.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  41%|████      | 2438/5971 [22:41<32:52,  1.79it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0821, train/loss_vlb_step=0.000275, train/loss_step=0.0821, global_step=806.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2438/5971 [22:41<32:52,  1.79it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=0.000109, train/loss_step=0.0272, global_step=806.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2439/5971 [22:42<32:52,  1.79it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=0.000109, train/loss_step=0.0272, global_step=806.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2439/5971 [22:42<32:52,  1.79it/s, loss=0.134, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00103, train/loss_step=0.282, global_step=806.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  41%|████      | 2440/5971 [22:45<32:54,  1.79it/s, loss=0.134, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00103, train/loss_step=0.282, global_step=806.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2440/5971 [22:45<32:54,  1.79it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00991, train/loss_vlb_step=4.56e-5, train/loss_step=0.00991, global_step=806.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2441/5971 [22:45<32:54,  1.79it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00991, train/loss_vlb_step=4.56e-5, train/loss_step=0.00991, global_step=806.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2441/5971 [22:45<32:54,  1.79it/s, loss=0.151, v_num=0, train/loss_simple_step=0.915, train/loss_vlb_step=0.0933, train/loss_step=0.915, global_step=807.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  41%|████      | 2442/5971 [22:46<32:54,  1.79it/s, loss=0.151, v_num=0, train/loss_simple_step=0.915, train/loss_vlb_step=0.0933, train/loss_step=0.915, global_step=807.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2442/5971 [22:46<32:54,  1.79it/s, loss=0.156, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000396, train/loss_step=0.117, global_step=807.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2443/5971 [22:47<32:54,  1.79it/s, loss=0.156, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000396, train/loss_step=0.117, global_step=807.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2443/5971 [22:47<32:54,  1.79it/s, loss=0.128, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.92e-5, train/loss_step=0.016, global_step=807.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  41%|████      | 2444/5971 [22:49<32:56,  1.78it/s, loss=0.128, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.92e-5, train/loss_step=0.016, global_step=807.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2444/5971 [22:49<32:56,  1.78it/s, loss=0.152, v_num=0, train/loss_simple_step=0.507, train/loss_vlb_step=0.00421, train/loss_step=0.507, global_step=807.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2445/5971 [22:50<32:56,  1.78it/s, loss=0.152, v_num=0, train/loss_simple_step=0.507, train/loss_vlb_step=0.00421, train/loss_step=0.507, global_step=807.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2445/5971 [22:50<32:56,  1.78it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00179, train/loss_vlb_step=1.09e-5, train/loss_step=0.00179, global_step=808.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2446/5971 [22:51<32:55,  1.78it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00179, train/loss_vlb_step=1.09e-5, train/loss_step=0.00179, global_step=808.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2446/5971 [22:51<32:55,  1.78it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.61e-5, train/loss_step=0.00289, global_step=808.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2447/5971 [22:52<32:55,  1.78it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.61e-5, train/loss_step=0.00289, global_step=808.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2447/5971 [22:52<32:55,  1.78it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0535, train/loss_vlb_step=0.000185, train/loss_step=0.0535, global_step=808.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  41%|████      | 2448/5971 [22:54<32:57,  1.78it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0535, train/loss_vlb_step=0.000185, train/loss_step=0.0535, global_step=808.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2448/5971 [22:54<32:57,  1.78it/s, loss=0.138, v_num=0, train/loss_simple_step=0.065, train/loss_vlb_step=0.000219, train/loss_step=0.065, global_step=808.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  41%|████      | 2449/5971 [22:55<32:57,  1.78it/s, loss=0.138, v_num=0, train/loss_simple_step=0.065, train/loss_vlb_step=0.000219, train/loss_step=0.065, global_step=808.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2449/5971 [22:55<32:57,  1.78it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0757, train/loss_vlb_step=0.000253, train/loss_step=0.0757, global_step=809.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2450/5971 [22:56<32:57,  1.78it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0757, train/loss_vlb_step=0.000253, train/loss_step=0.0757, global_step=809.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2450/5971 [22:56<32:57,  1.78it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00242, train/loss_vlb_step=1.43e-5, train/loss_step=0.00242, global_step=809.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2451/5971 [22:57<32:57,  1.78it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00242, train/loss_vlb_step=1.43e-5, train/loss_step=0.00242, global_step=809.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2451/5971 [22:57<32:57,  1.78it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0448, train/loss_vlb_step=0.000157, train/loss_step=0.0448, global_step=809.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  41%|████      | 2452/5971 [23:00<32:59,  1.78it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0448, train/loss_vlb_step=0.000157, train/loss_step=0.0448, global_step=809.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2452/5971 [23:00<32:59,  1.78it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0131, train/loss_vlb_step=5.75e-5, train/loss_step=0.0131, global_step=809.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  41%|████      | 2453/5971 [23:01<33:00,  1.78it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0131, train/loss_vlb_step=5.75e-5, train/loss_step=0.0131, global_step=809.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2453/5971 [23:01<33:00,  1.78it/s, loss=0.141, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000487, train/loss_step=0.148, global_step=810.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  41%|████      | 2454/5971 [23:02<33:00,  1.78it/s, loss=0.141, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000487, train/loss_step=0.148, global_step=810.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2454/5971 [23:02<33:00,  1.78it/s, loss=0.154, v_num=0, train/loss_simple_step=0.275, train/loss_vlb_step=0.00122, train/loss_step=0.275, global_step=810.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  41%|████      | 2455/5971 [23:02<32:59,  1.78it/s, loss=0.154, v_num=0, train/loss_simple_step=0.275, train/loss_vlb_step=0.00122, train/loss_step=0.275, global_step=810.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2455/5971 [23:02<32:59,  1.78it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0864, train/loss_vlb_step=0.000284, train/loss_step=0.0864, global_step=810.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2456/5971 [23:05<33:01,  1.77it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0864, train/loss_vlb_step=0.000284, train/loss_step=0.0864, global_step=810.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2456/5971 [23:05<33:01,  1.77it/s, loss=0.162, v_num=0, train/loss_simple_step=0.524, train/loss_vlb_step=0.00599, train/loss_step=0.524, global_step=810.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  41%|████      | 2457/5971 [23:06<33:01,  1.77it/s, loss=0.162, v_num=0, train/loss_simple_step=0.524, train/loss_vlb_step=0.00599, train/loss_step=0.524, global_step=810.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2457/5971 [23:06<33:01,  1.77it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00463, train/loss_vlb_step=2.5e-5, train/loss_step=0.00463, global_step=811.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2458/5971 [23:06<33:01,  1.77it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00463, train/loss_vlb_step=2.5e-5, train/loss_step=0.00463, global_step=811.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2458/5971 [23:06<33:01,  1.77it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0891, train/loss_vlb_step=0.000293, train/loss_step=0.0891, global_step=811.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2459/5971 [23:07<33:01,  1.77it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0891, train/loss_vlb_step=0.000293, train/loss_step=0.0891, global_step=811.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2459/5971 [23:07<33:01,  1.77it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0098, train/loss_vlb_step=4.6e-5, train/loss_step=0.0098, global_step=811.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  41%|████      | 2460/5971 [23:10<33:03,  1.77it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0098, train/loss_vlb_step=4.6e-5, train/loss_step=0.0098, global_step=811.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2460/5971 [23:10<33:03,  1.77it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00473, train/loss_vlb_step=2.47e-5, train/loss_step=0.00473, global_step=811.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2461/5971 [23:11<33:03,  1.77it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00473, train/loss_vlb_step=2.47e-5, train/loss_step=0.00473, global_step=811.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2461/5971 [23:11<33:03,  1.77it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0972, train/loss_vlb_step=0.00032, train/loss_step=0.0972, global_step=812.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  41%|████      | 2462/5971 [23:12<33:03,  1.77it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0972, train/loss_vlb_step=0.00032, train/loss_step=0.0972, global_step=812.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2462/5971 [23:12<33:03,  1.77it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0244, train/loss_vlb_step=9.37e-5, train/loss_step=0.0244, global_step=812.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2463/5971 [23:12<33:03,  1.77it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0244, train/loss_vlb_step=9.37e-5, train/loss_step=0.0244, global_step=812.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████      | 2463/5971 [23:12<33:03,  1.77it/s, loss=0.105, v_num=0, train/loss_simple_step=0.079, train/loss_vlb_step=0.00026, train/loss_step=0.079, global_step=812.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  41%|████▏     | 2464/5971 [23:15<33:04,  1.77it/s, loss=0.105, v_num=0, train/loss_simple_step=0.079, train/loss_vlb_step=0.00026, train/loss_step=0.079, global_step=812.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████▏     | 2464/5971 [23:15<33:04,  1.77it/s, loss=0.0807, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.45e-5, train/loss_step=0.0119, global_step=812.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████▏     | 2465/5971 [23:15<33:04,  1.77it/s, loss=0.0807, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.45e-5, train/loss_step=0.0119, global_step=812.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████▏     | 2465/5971 [23:15<33:04,  1.77it/s, loss=0.0824, v_num=0, train/loss_simple_step=0.0369, train/loss_vlb_step=0.00013, train/loss_step=0.0369, global_step=813.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████▏     | 2466/5971 [23:16<33:04,  1.77it/s, loss=0.0824, v_num=0, train/loss_simple_step=0.0369, train/loss_vlb_step=0.00013, train/loss_step=0.0369, global_step=813.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████▏     | 2466/5971 [23:16<33:04,  1.77it/s, loss=0.0828, v_num=0, train/loss_simple_step=0.00917, train/loss_vlb_step=3.93e-5, train/loss_step=0.00917, global_step=813.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████▏     | 2467/5971 [23:17<33:04,  1.77it/s, loss=0.0828, v_num=0, train/loss_simple_step=0.00917, train/loss_vlb_step=3.93e-5, train/loss_step=0.00917, global_step=813.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████▏     | 2467/5971 [23:17<33:04,  1.77it/s, loss=0.098, v_num=0, train/loss_simple_step=0.358, train/loss_vlb_step=0.00207, train/loss_step=0.358, global_step=813.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  41%|████▏     | 2468/5971 [23:20<33:06,  1.76it/s, loss=0.098, v_num=0, train/loss_simple_step=0.358, train/loss_vlb_step=0.00207, train/loss_step=0.358, global_step=813.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████▏     | 2468/5971 [23:20<33:06,  1.76it/s, loss=0.0961, v_num=0, train/loss_simple_step=0.0283, train/loss_vlb_step=0.000108, train/loss_step=0.0283, global_step=813.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████▏     | 2469/5971 [23:21<33:06,  1.76it/s, loss=0.0961, v_num=0, train/loss_simple_step=0.0283, train/loss_vlb_step=0.000108, train/loss_step=0.0283, global_step=813.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████▏     | 2469/5971 [23:21<33:06,  1.76it/s, loss=0.094, v_num=0, train/loss_simple_step=0.0329, train/loss_vlb_step=0.000124, train/loss_step=0.0329, global_step=814.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  41%|████▏     | 2470/5971 [23:21<33:06,  1.76it/s, loss=0.094, v_num=0, train/loss_simple_step=0.0329, train/loss_vlb_step=0.000124, train/loss_step=0.0329, global_step=814.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████▏     | 2470/5971 [23:21<33:06,  1.76it/s, loss=0.0982, v_num=0, train/loss_simple_step=0.0866, train/loss_vlb_step=0.000291, train/loss_step=0.0866, global_step=814.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████▏     | 2471/5971 [23:22<33:06,  1.76it/s, loss=0.0982, v_num=0, train/loss_simple_step=0.0866, train/loss_vlb_step=0.000291, train/loss_step=0.0866, global_step=814.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████▏     | 2471/5971 [23:22<33:06,  1.76it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0944, train/loss_vlb_step=0.000313, train/loss_step=0.0944, global_step=814.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  41%|████▏     | 2472/5971 [23:25<33:08,  1.76it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0944, train/loss_vlb_step=0.000313, train/loss_step=0.0944, global_step=814.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████▏     | 2472/5971 [23:25<33:08,  1.76it/s, loss=0.116, v_num=0, train/loss_simple_step=0.324, train/loss_vlb_step=0.00143, train/loss_step=0.324, global_step=814.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  41%|████▏     | 2473/5971 [23:26<33:08,  1.76it/s, loss=0.116, v_num=0, train/loss_simple_step=0.324, train/loss_vlb_step=0.00143, train/loss_step=0.324, global_step=814.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████▏     | 2473/5971 [23:26<33:08,  1.76it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00434, train/loss_vlb_step=2.32e-5, train/loss_step=0.00434, global_step=815.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████▏     | 2474/5971 [23:27<33:08,  1.76it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00434, train/loss_vlb_step=2.32e-5, train/loss_step=0.00434, global_step=815.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████▏     | 2474/5971 [23:27<33:08,  1.76it/s, loss=0.096, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=6.12e-5, train/loss_step=0.0144, global_step=815.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  41%|████▏     | 2475/5971 [23:28<33:08,  1.76it/s, loss=0.096, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=6.12e-5, train/loss_step=0.0144, global_step=815.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████▏     | 2475/5971 [23:28<33:08,  1.76it/s, loss=0.1, v_num=0, train/loss_simple_step=0.169, train/loss_vlb_step=0.000578, train/loss_step=0.169, global_step=815.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  41%|████▏     | 2476/5971 [23:30<33:10,  1.76it/s, loss=0.1, v_num=0, train/loss_simple_step=0.169, train/loss_vlb_step=0.000578, train/loss_step=0.169, global_step=815.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████▏     | 2476/5971 [23:30<33:10,  1.76it/s, loss=0.101, v_num=0, train/loss_simple_step=0.538, train/loss_vlb_step=0.00536, train/loss_step=0.538, global_step=815.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████▏     | 2477/5971 [23:31<33:10,  1.76it/s, loss=0.101, v_num=0, train/loss_simple_step=0.538, train/loss_vlb_step=0.00536, train/loss_step=0.538, global_step=815.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  41%|████▏     | 2477/5971 [23:31<33:10,  1.76it/s, loss=0.117, v_num=0, train/loss_simple_step=0.332, train/loss_vlb_step=0.00177, train/loss_step=0.332, global_step=816.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2478/5971 [23:32<33:10,  1.76it/s, loss=0.117, v_num=0, train/loss_simple_step=0.332, train/loss_vlb_step=0.00177, train/loss_step=0.332, global_step=816.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2478/5971 [23:32<33:10,  1.76it/s, loss=0.124, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000774, train/loss_step=0.216, global_step=816.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2479/5971 [23:33<33:10,  1.75it/s, loss=0.124, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000774, train/loss_step=0.216, global_step=816.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2479/5971 [23:33<33:10,  1.75it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0155, train/loss_vlb_step=6.55e-5, train/loss_step=0.0155, global_step=816.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2480/5971 [23:35<33:12,  1.75it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0155, train/loss_vlb_step=6.55e-5, train/loss_step=0.0155, global_step=816.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2480/5971 [23:35<33:12,  1.75it/s, loss=0.129, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000367, train/loss_step=0.112, global_step=816.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  42%|████▏     | 2481/5971 [23:36<33:12,  1.75it/s, loss=0.129, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000367, train/loss_step=0.112, global_step=816.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2481/5971 [23:36<33:12,  1.75it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00823, train/loss_vlb_step=3.73e-5, train/loss_step=0.00823, global_step=817.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2482/5971 [23:37<33:11,  1.75it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00823, train/loss_vlb_step=3.73e-5, train/loss_step=0.00823, global_step=817.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2482/5971 [23:37<33:11,  1.75it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=5.26e-5, train/loss_step=0.0116, global_step=817.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  42%|████▏     | 2483/5971 [23:38<33:11,  1.75it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=5.26e-5, train/loss_step=0.0116, global_step=817.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2483/5971 [23:38<33:11,  1.75it/s, loss=0.12, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.14e-5, train/loss_step=0.004, global_step=817.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  42%|████▏     | 2484/5971 [23:40<33:13,  1.75it/s, loss=0.12, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.14e-5, train/loss_step=0.004, global_step=817.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2484/5971 [23:40<33:13,  1.75it/s, loss=0.135, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00136, train/loss_step=0.308, global_step=817.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2485/5971 [23:41<33:13,  1.75it/s, loss=0.135, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00136, train/loss_step=0.308, global_step=817.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2485/5971 [23:41<33:13,  1.75it/s, loss=0.144, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.00086, train/loss_step=0.220, global_step=818.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2486/5971 [23:42<33:13,  1.75it/s, loss=0.144, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.00086, train/loss_step=0.220, global_step=818.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2486/5971 [23:42<33:13,  1.75it/s, loss=0.168, v_num=0, train/loss_simple_step=0.475, train/loss_vlb_step=0.00466, train/loss_step=0.475, global_step=818.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2487/5971 [23:43<33:13,  1.75it/s, loss=0.168, v_num=0, train/loss_simple_step=0.475, train/loss_vlb_step=0.00466, train/loss_step=0.475, global_step=818.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2487/5971 [23:43<33:13,  1.75it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00735, train/loss_vlb_step=3.59e-5, train/loss_step=0.00735, global_step=818.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2488/5971 [23:45<33:15,  1.75it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00735, train/loss_vlb_step=3.59e-5, train/loss_step=0.00735, global_step=818.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2488/5971 [23:45<33:15,  1.75it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00335, train/loss_vlb_step=1.85e-5, train/loss_step=0.00335, global_step=818.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2489/5971 [23:46<33:15,  1.75it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00335, train/loss_vlb_step=1.85e-5, train/loss_step=0.00335, global_step=818.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2489/5971 [23:46<33:15,  1.75it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0034, train/loss_vlb_step=1.82e-5, train/loss_step=0.0034, global_step=819.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  42%|████▏     | 2490/5971 [23:47<33:15,  1.74it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0034, train/loss_vlb_step=1.82e-5, train/loss_step=0.0034, global_step=819.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2490/5971 [23:47<33:15,  1.74it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0335, train/loss_vlb_step=0.000131, train/loss_step=0.0335, global_step=819.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2491/5971 [23:48<33:14,  1.74it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0335, train/loss_vlb_step=0.000131, train/loss_step=0.0335, global_step=819.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2491/5971 [23:48<33:14,  1.74it/s, loss=0.156, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.00157, train/loss_step=0.313, global_step=819.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  42%|████▏     | 2492/5971 [23:50<33:16,  1.74it/s, loss=0.156, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.00157, train/loss_step=0.313, global_step=819.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2492/5971 [23:50<33:16,  1.74it/s, loss=0.148, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000609, train/loss_step=0.180, global_step=819.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2493/5971 [23:51<33:16,  1.74it/s, loss=0.148, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000609, train/loss_step=0.180, global_step=819.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2493/5971 [23:51<33:16,  1.74it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0181, train/loss_vlb_step=7.71e-5, train/loss_step=0.0181, global_step=820.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2494/5971 [23:52<33:16,  1.74it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0181, train/loss_vlb_step=7.71e-5, train/loss_step=0.0181, global_step=820.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2494/5971 [23:52<33:16,  1.74it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.66e-5, train/loss_step=0.0101, global_step=820.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2495/5971 [23:53<33:16,  1.74it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.66e-5, train/loss_step=0.0101, global_step=820.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2495/5971 [23:53<33:16,  1.74it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0819, train/loss_vlb_step=0.000271, train/loss_step=0.0819, global_step=820.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2496/5971 [23:55<33:17,  1.74it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0819, train/loss_vlb_step=0.000271, train/loss_step=0.0819, global_step=820.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2496/5971 [23:55<33:17,  1.74it/s, loss=0.152, v_num=0, train/loss_simple_step=0.694, train/loss_vlb_step=0.0328, train/loss_step=0.694, global_step=820.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  42%|████▏     | 2497/5971 [23:56<33:17,  1.74it/s, loss=0.152, v_num=0, train/loss_simple_step=0.694, train/loss_vlb_step=0.0328, train/loss_step=0.694, global_step=820.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2497/5971 [23:56<33:17,  1.74it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.26e-5, train/loss_step=0.0138, global_step=821.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2498/5971 [23:57<33:17,  1.74it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.26e-5, train/loss_step=0.0138, global_step=821.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2498/5971 [23:57<33:17,  1.74it/s, loss=0.147, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00304, train/loss_step=0.427, global_step=821.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  42%|████▏     | 2499/5971 [23:58<33:17,  1.74it/s, loss=0.147, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00304, train/loss_step=0.427, global_step=821.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2499/5971 [23:58<33:17,  1.74it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0714, train/loss_vlb_step=0.000249, train/loss_step=0.0714, global_step=821.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2500/5971 [24:00<33:19,  1.74it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0714, train/loss_vlb_step=0.000249, train/loss_step=0.0714, global_step=821.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2500/5971 [24:00<33:19,  1.74it/s, loss=0.175, v_num=0, train/loss_simple_step=0.617, train/loss_vlb_step=0.00802, train/loss_step=0.617, global_step=821.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  42%|████▏     | 2501/5971 [24:01<33:19,  1.74it/s, loss=0.175, v_num=0, train/loss_simple_step=0.617, train/loss_vlb_step=0.00802, train/loss_step=0.617, global_step=821.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2501/5971 [24:01<33:19,  1.74it/s, loss=0.197, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.00306, train/loss_step=0.454, global_step=822.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2502/5971 [24:02<33:19,  1.74it/s, loss=0.197, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.00306, train/loss_step=0.454, global_step=822.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2502/5971 [24:02<33:19,  1.74it/s, loss=0.197, v_num=0, train/loss_simple_step=0.00337, train/loss_vlb_step=1.89e-5, train/loss_step=0.00337, global_step=822.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2503/5971 [24:03<33:19,  1.73it/s, loss=0.197, v_num=0, train/loss_simple_step=0.00337, train/loss_vlb_step=1.89e-5, train/loss_step=0.00337, global_step=822.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2503/5971 [24:03<33:19,  1.73it/s, loss=0.206, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000672, train/loss_step=0.192, global_step=822.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  42%|████▏     | 2504/5971 [24:05<33:20,  1.73it/s, loss=0.206, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000672, train/loss_step=0.192, global_step=822.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2504/5971 [24:05<33:20,  1.73it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00744, train/loss_vlb_step=3.59e-5, train/loss_step=0.00744, global_step=822.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2505/5971 [24:06<33:20,  1.73it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00744, train/loss_vlb_step=3.59e-5, train/loss_step=0.00744, global_step=822.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2505/5971 [24:06<33:20,  1.73it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0938, train/loss_vlb_step=0.000309, train/loss_step=0.0938, global_step=823.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  42%|████▏     | 2506/5971 [24:07<33:20,  1.73it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0938, train/loss_vlb_step=0.000309, train/loss_step=0.0938, global_step=823.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2506/5971 [24:07<33:20,  1.73it/s, loss=0.179, v_num=0, train/loss_simple_step=0.357, train/loss_vlb_step=0.00162, train/loss_step=0.357, global_step=823.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  42%|████▏     | 2507/5971 [24:08<33:20,  1.73it/s, loss=0.179, v_num=0, train/loss_simple_step=0.357, train/loss_vlb_step=0.00162, train/loss_step=0.357, global_step=823.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2507/5971 [24:08<33:20,  1.73it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0444, train/loss_vlb_step=0.000152, train/loss_step=0.0444, global_step=823.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2508/5971 [24:10<33:22,  1.73it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0444, train/loss_vlb_step=0.000152, train/loss_step=0.0444, global_step=823.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2508/5971 [24:10<33:22,  1.73it/s, loss=0.181, v_num=0, train/loss_simple_step=0.00389, train/loss_vlb_step=2.12e-5, train/loss_step=0.00389, global_step=823.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2509/5971 [24:11<33:21,  1.73it/s, loss=0.181, v_num=0, train/loss_simple_step=0.00389, train/loss_vlb_step=2.12e-5, train/loss_step=0.00389, global_step=823.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2509/5971 [24:11<33:21,  1.73it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0211, train/loss_vlb_step=8.32e-5, train/loss_step=0.0211, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  42%|████▏     | 2510/5971 [24:12<33:21,  1.73it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0211, train/loss_vlb_step=8.32e-5, train/loss_step=0.0211, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2510/5971 [24:12<33:21,  1.73it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0702, train/loss_vlb_step=0.000235, train/loss_step=0.0702, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2511/5971 [24:13<33:21,  1.73it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0702, train/loss_vlb_step=0.000235, train/loss_step=0.0702, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2511/5971 [24:13<33:21,  1.73it/s, loss=0.182, v_num=0, train/loss_simple_step=0.279, train/loss_vlb_step=0.00113, train/loss_step=0.279, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  42%|████▏     | 2512/5971 [24:15<33:23,  1.73it/s, loss=0.182, v_num=0, train/loss_simple_step=0.279, train/loss_vlb_step=0.00113, train/loss_step=0.279, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  42%|████▏     | 2512/5971 [24:15<33:23,  1.73it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:11,  2.33it/s][A
Epoch 1:  42%|████▏     | 2514/5971 [24:15<33:20,  1.73it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   1%|          | 2/167 [00:01<01:24,  1.95it/s][A
Epoch 1:  42%|████▏     | 2516/5971 [24:16<33:19,  1.73it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   3%|▎         | 5/167 [00:01<00:28,  5.74it/s][A
Epoch 1:  42%|████▏     | 2519/5971 [24:16<33:15,  1.73it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   5%|▍         | 8/167 [00:01<00:16,  9.40it/s][A
Epoch 1:  42%|████▏     | 2522/5971 [24:16<33:11,  1.73it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   7%|▋         | 11/167 [00:01<00:12, 12.55it/s][A
Epoch 1:  42%|████▏     | 2525/5971 [24:16<33:07,  1.73it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   8%|▊         | 14/167 [00:01<00:09, 15.87it/s][A
Epoch 1:  42%|████▏     | 2528/5971 [24:16<33:03,  1.74it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  10%|█         | 17/167 [00:01<00:07, 18.89it/s][A
Epoch 1:  42%|████▏     | 2531/5971 [24:16<32:59,  1.74it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  12%|█▏        | 20/167 [00:01<00:07, 20.94it/s][A
Epoch 1:  42%|████▏     | 2534/5971 [24:17<32:55,  1.74it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 21.72it/s][A
Epoch 1:  42%|████▏     | 2537/5971 [24:17<32:51,  1.74it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  16%|█▌        | 26/167 [00:01<00:06, 22.83it/s][A
Epoch 1:  43%|████▎     | 2540/5971 [24:17<32:47,  1.74it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  17%|█▋        | 29/167 [00:02<00:06, 22.42it/s][A
Epoch 1:  43%|████▎     | 2543/5971 [24:17<32:43,  1.75it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  19%|█▉        | 32/167 [00:02<00:06, 22.14it/s][A
Epoch 1:  43%|████▎     | 2546/5971 [24:17<32:40,  1.75it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  21%|██        | 35/167 [00:02<00:06, 21.48it/s][A
Epoch 1:  43%|████▎     | 2549/5971 [24:17<32:36,  1.75it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  23%|██▎       | 38/167 [00:02<00:05, 22.18it/s][A
Epoch 1:  43%|████▎     | 2552/5971 [24:17<32:32,  1.75it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  25%|██▍       | 41/167 [00:02<00:05, 23.49it/s][A
Epoch 1:  43%|████▎     | 2555/5971 [24:17<32:28,  1.75it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  26%|██▋       | 44/167 [00:02<00:05, 24.02it/s][A
Epoch 1:  43%|████▎     | 2558/5971 [24:18<32:24,  1.76it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  28%|██▊       | 47/167 [00:02<00:05, 23.80it/s][A
Epoch 1:  43%|████▎     | 2561/5971 [24:18<32:20,  1.76it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 25.03it/s][A
Epoch 1:  43%|████▎     | 2564/5971 [24:18<32:16,  1.76it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  32%|███▏      | 53/167 [00:03<00:04, 25.30it/s][A
Epoch 1:  43%|████▎     | 2567/5971 [24:18<32:13,  1.76it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  34%|███▎      | 56/167 [00:03<00:04, 25.69it/s][A
Epoch 1:  43%|████▎     | 2570/5971 [24:18<32:09,  1.76it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  35%|███▌      | 59/167 [00:03<00:04, 26.35it/s][A
Epoch 1:  43%|████▎     | 2573/5971 [24:18<32:05,  1.76it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  37%|███▋      | 62/167 [00:03<00:03, 26.70it/s][A
Epoch 1:  43%|████▎     | 2576/5971 [24:18<32:01,  1.77it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  39%|███▉      | 65/167 [00:03<00:03, 26.65it/s][A
Epoch 1:  43%|████▎     | 2579/5971 [24:18<31:57,  1.77it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  41%|████      | 68/167 [00:03<00:03, 27.55it/s][A
Epoch 1:  43%|████▎     | 2582/5971 [24:18<31:54,  1.77it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 26.79it/s][A
Epoch 1:  43%|████▎     | 2585/5971 [24:19<31:50,  1.77it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 27.32it/s][A
Epoch 1:  43%|████▎     | 2588/5971 [24:19<31:46,  1.77it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 26.98it/s][A
Epoch 1:  43%|████▎     | 2591/5971 [24:19<31:42,  1.78it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  48%|████▊     | 80/167 [00:04<00:03, 26.56it/s][A
Epoch 1:  43%|████▎     | 2595/5971 [24:19<31:37,  1.78it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  50%|█████     | 84/167 [00:04<00:03, 27.58it/s][A
Epoch 1:  44%|████▎     | 2599/5971 [24:19<31:32,  1.78it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  52%|█████▏    | 87/167 [00:04<00:03, 25.63it/s][A

Validating:  54%|█████▍    | 90/167 [00:04<00:02, 26.57it/s][A
Epoch 1:  44%|████▎     | 2603/5971 [24:19<31:28,  1.78it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 26.60it/s][A
Epoch 1:  44%|████▎     | 2607/5971 [24:19<31:23,  1.79it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 25.92it/s][A
Epoch 1:  44%|████▎     | 2611/5971 [24:20<31:18,  1.79it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 26.70it/s][A

Validating:  61%|██████    | 102/167 [00:04<00:02, 27.55it/s][A
Epoch 1:  44%|████▍     | 2615/5971 [24:20<31:13,  1.79it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  63%|██████▎   | 106/167 [00:05<00:02, 27.13it/s][A
Epoch 1:  44%|████▍     | 2619/5971 [24:20<31:08,  1.79it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  66%|██████▌   | 110/167 [00:05<00:01, 28.53it/s][A
Epoch 1:  44%|████▍     | 2623/5971 [24:20<31:03,  1.80it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  68%|██████▊   | 114/167 [00:05<00:01, 29.40it/s][A
Epoch 1:  44%|████▍     | 2627/5971 [24:20<30:58,  1.80it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  70%|███████   | 117/167 [00:05<00:01, 29.15it/s][A
Epoch 1:  44%|████▍     | 2631/5971 [24:20<30:53,  1.80it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 28.19it/s][A
Epoch 1:  44%|████▍     | 2635/5971 [24:20<30:48,  1.80it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 27.42it/s][A

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 27.94it/s][A
Epoch 1:  44%|████▍     | 2639/5971 [24:21<30:43,  1.81it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 28.08it/s][A
Epoch 1:  44%|████▍     | 2643/5971 [24:21<30:39,  1.81it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 27.34it/s][A
Epoch 1:  44%|████▍     | 2647/5971 [24:21<30:34,  1.81it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  81%|████████  | 135/167 [00:06<00:01, 27.43it/s][A

Validating:  83%|████████▎ | 138/167 [00:06<00:01, 27.99it/s][A
Epoch 1:  44%|████▍     | 2651/5971 [24:21<30:29,  1.81it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  85%|████████▌ | 142/167 [00:06<00:00, 27.53it/s][A
Epoch 1:  44%|████▍     | 2655/5971 [24:21<30:24,  1.82it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 27.90it/s][A
Epoch 1:  45%|████▍     | 2659/5971 [24:21<30:20,  1.82it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 26.69it/s][A
Epoch 1:  45%|████▍     | 2663/5971 [24:21<30:15,  1.82it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 25.72it/s][A

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 26.61it/s][A
Epoch 1:  45%|████▍     | 2667/5971 [24:22<30:10,  1.82it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 26.67it/s][A
Epoch 1:  45%|████▍     | 2671/5971 [24:22<30:05,  1.83it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  96%|█████████▌| 160/167 [00:07<00:00, 25.13it/s][A
Epoch 1:  45%|████▍     | 2675/5971 [24:22<30:01,  1.83it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  98%|█████████▊| 163/167 [00:07<00:00, 25.61it/s][A

Validating:  99%|█████████▉| 166/167 [00:07<00:00, 25.64it/s][A
Epoch 1:  45%|████▍     | 2679/5971 [24:22<29:56,  1.83it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▍     | 2680/5971 [24:22<29:55,  1.83it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.00027, train/loss_step=0.0817, global_step=824.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Epoch 1:  45%|████▍     | 2681/5971 [24:23<29:55,  1.83it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00144, train/loss_vlb_step=8.82e-6, train/loss_step=0.00144, global_step=825.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▍     | 2682/5971 [24:24<29:55,  1.83it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00415, train/loss_vlb_step=2.19e-5, train/loss_step=0.00415, global_step=825.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▍     | 2683/5971 [24:25<29:55,  1.83it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00415, train/loss_vlb_step=2.19e-5, train/loss_step=0.00415, global_step=825.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▍     | 2683/5971 [24:25<29:55,  1.83it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00422, train/loss_vlb_step=2.31e-5, train/loss_step=0.00422, global_step=825.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▍     | 2684/5971 [24:27<29:56,  1.83it/s, loss=0.152, v_num=0, train/loss_simple_step=0.299, train/loss_vlb_step=0.00132, train/loss_step=0.299, global_step=825.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  45%|████▍     | 2685/5971 [24:28<29:56,  1.83it/s, loss=0.157, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.00033, train/loss_step=0.100, global_step=826.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▍     | 2686/5971 [24:29<29:56,  1.83it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0375, train/loss_vlb_step=0.000136, train/loss_step=0.0375, global_step=826.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▌     | 2687/5971 [24:30<29:56,  1.83it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0375, train/loss_vlb_step=0.000136, train/loss_step=0.0375, global_step=826.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▌     | 2687/5971 [24:30<29:56,  1.83it/s, loss=0.144, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000763, train/loss_step=0.209, global_step=826.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  45%|████▌     | 2688/5971 [24:32<29:58,  1.83it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.9e-5, train/loss_step=0.0128, global_step=826.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▌     | 2689/5971 [24:33<29:57,  1.83it/s, loss=0.0914, v_num=0, train/loss_simple_step=0.00395, train/loss_vlb_step=2.04e-5, train/loss_step=0.00395, global_step=827.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▌     | 2690/5971 [24:34<29:57,  1.82it/s, loss=0.105, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00119, train/loss_step=0.269, global_step=827.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  45%|████▌     | 2691/5971 [24:35<29:57,  1.82it/s, loss=0.105, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00119, train/loss_step=0.269, global_step=827.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▌     | 2691/5971 [24:35<29:57,  1.82it/s, loss=0.122, v_num=0, train/loss_simple_step=0.531, train/loss_vlb_step=0.0064, train/loss_step=0.531, global_step=827.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  45%|████▌     | 2692/5971 [24:37<29:59,  1.82it/s, loss=0.15, v_num=0, train/loss_simple_step=0.581, train/loss_vlb_step=0.00664, train/loss_step=0.581, global_step=827.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▌     | 2693/5971 [24:38<29:59,  1.82it/s, loss=0.164, v_num=0, train/loss_simple_step=0.371, train/loss_vlb_step=0.00242, train/loss_step=0.371, global_step=828.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▌     | 2694/5971 [24:39<29:58,  1.82it/s, loss=0.152, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000389, train/loss_step=0.118, global_step=828.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▌     | 2695/5971 [24:40<29:58,  1.82it/s, loss=0.152, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000389, train/loss_step=0.118, global_step=828.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▌     | 2695/5971 [24:40<29:58,  1.82it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00274, train/loss_vlb_step=1.55e-5, train/loss_step=0.00274, global_step=828.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▌     | 2696/5971 [24:42<30:00,  1.82it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0106, train/loss_vlb_step=4.81e-5, train/loss_step=0.0106, global_step=828.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  45%|████▌     | 2697/5971 [24:43<30:00,  1.82it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0411, train/loss_vlb_step=0.000147, train/loss_step=0.0411, global_step=829.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▌     | 2698/5971 [24:44<30:00,  1.82it/s, loss=0.151, v_num=0, train/loss_simple_step=0.069, train/loss_vlb_step=0.000227, train/loss_step=0.069, global_step=829.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  45%|████▌     | 2699/5971 [24:45<29:59,  1.82it/s, loss=0.151, v_num=0, train/loss_simple_step=0.069, train/loss_vlb_step=0.000227, train/loss_step=0.069, global_step=829.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▌     | 2699/5971 [24:45<29:59,  1.82it/s, loss=0.148, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000926, train/loss_step=0.222, global_step=829.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▌     | 2700/5971 [24:47<30:01,  1.82it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.65e-5, train/loss_step=0.0124, global_step=829.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▌     | 2701/5971 [24:48<30:01,  1.82it/s, loss=0.164, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00205, train/loss_step=0.382, global_step=830.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  45%|████▌     | 2702/5971 [24:49<30:01,  1.82it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0736, train/loss_vlb_step=0.000242, train/loss_step=0.0736, global_step=830.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▌     | 2703/5971 [24:50<30:00,  1.81it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0736, train/loss_vlb_step=0.000242, train/loss_step=0.0736, global_step=830.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▌     | 2703/5971 [24:50<30:00,  1.81it/s, loss=0.174, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000423, train/loss_step=0.127, global_step=830.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  45%|████▌     | 2704/5971 [24:52<30:02,  1.81it/s, loss=0.171, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.000924, train/loss_step=0.250, global_step=830.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▌     | 2705/5971 [24:53<30:02,  1.81it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00362, train/loss_vlb_step=1.95e-5, train/loss_step=0.00362, global_step=831.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▌     | 2706/5971 [24:53<30:01,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00725, train/loss_vlb_step=3.42e-5, train/loss_step=0.00725, global_step=831.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▌     | 2707/5971 [24:54<30:01,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00725, train/loss_vlb_step=3.42e-5, train/loss_step=0.00725, global_step=831.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▌     | 2707/5971 [24:54<30:01,  1.81it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00269, train/loss_vlb_step=1.49e-5, train/loss_step=0.00269, global_step=831.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▌     | 2708/5971 [24:56<30:03,  1.81it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0158, train/loss_vlb_step=6.58e-5, train/loss_step=0.0158, global_step=831.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  45%|████▌     | 2709/5971 [24:57<30:02,  1.81it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00358, train/loss_vlb_step=1.94e-5, train/loss_step=0.00358, global_step=832.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▌     | 2710/5971 [24:58<30:02,  1.81it/s, loss=0.158, v_num=0, train/loss_simple_step=0.342, train/loss_vlb_step=0.0016, train/loss_step=0.342, global_step=832.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  45%|████▌     | 2711/5971 [24:59<30:02,  1.81it/s, loss=0.158, v_num=0, train/loss_simple_step=0.342, train/loss_vlb_step=0.0016, train/loss_step=0.342, global_step=832.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▌     | 2711/5971 [24:59<30:02,  1.81it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00385, train/loss_vlb_step=1.88e-5, train/loss_step=0.00385, global_step=832.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▌     | 2712/5971 [25:01<30:03,  1.81it/s, loss=0.12, v_num=0, train/loss_simple_step=0.336, train/loss_vlb_step=0.00157, train/loss_step=0.336, global_step=832.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  45%|████▌     | 2713/5971 [25:02<30:03,  1.81it/s, loss=0.128, v_num=0, train/loss_simple_step=0.538, train/loss_vlb_step=0.00568, train/loss_step=0.538, global_step=833.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▌     | 2714/5971 [25:03<30:03,  1.81it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00644, train/loss_vlb_step=3.1e-5, train/loss_step=0.00644, global_step=833.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▌     | 2715/5971 [25:04<30:03,  1.81it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00644, train/loss_vlb_step=3.1e-5, train/loss_step=0.00644, global_step=833.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▌     | 2715/5971 [25:04<30:03,  1.81it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00656, train/loss_vlb_step=3.23e-5, train/loss_step=0.00656, global_step=833.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  45%|████▌     | 2716/5971 [25:06<30:05,  1.80it/s, loss=0.13, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000591, train/loss_step=0.164, global_step=833.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  46%|████▌     | 2717/5971 [25:07<30:04,  1.80it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0538, train/loss_vlb_step=0.000185, train/loss_step=0.0538, global_step=834.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2718/5971 [25:08<30:04,  1.80it/s, loss=0.165, v_num=0, train/loss_simple_step=0.755, train/loss_vlb_step=0.0249, train/loss_step=0.755, global_step=834.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  46%|████▌     | 2719/5971 [25:09<30:04,  1.80it/s, loss=0.165, v_num=0, train/loss_simple_step=0.755, train/loss_vlb_step=0.0249, train/loss_step=0.755, global_step=834.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2719/5971 [25:09<30:04,  1.80it/s, loss=0.173, v_num=0, train/loss_simple_step=0.378, train/loss_vlb_step=0.00182, train/loss_step=0.378, global_step=834.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2720/5971 [25:11<30:05,  1.80it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00385, train/loss_vlb_step=2.07e-5, train/loss_step=0.00385, global_step=834.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2721/5971 [25:12<30:05,  1.80it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00921, train/loss_vlb_step=4.25e-5, train/loss_step=0.00921, global_step=835.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2722/5971 [25:13<30:05,  1.80it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00887, train/loss_vlb_step=4.19e-5, train/loss_step=0.00887, global_step=835.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2723/5971 [25:14<30:05,  1.80it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00887, train/loss_vlb_step=4.19e-5, train/loss_step=0.00887, global_step=835.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2723/5971 [25:14<30:05,  1.80it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0776, train/loss_vlb_step=0.00026, train/loss_step=0.0776, global_step=835.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  46%|████▌     | 2724/5971 [25:16<30:07,  1.80it/s, loss=0.173, v_num=0, train/loss_simple_step=0.748, train/loss_vlb_step=0.0246, train/loss_step=0.748, global_step=835.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  46%|████▌     | 2725/5971 [25:17<30:07,  1.80it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00925, train/loss_vlb_step=4.17e-5, train/loss_step=0.00925, global_step=836.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2726/5971 [25:18<30:07,  1.80it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000329, train/loss_step=0.0967, global_step=836.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  46%|████▌     | 2727/5971 [25:19<30:06,  1.80it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000329, train/loss_step=0.0967, global_step=836.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2727/5971 [25:19<30:06,  1.80it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000323, train/loss_step=0.0971, global_step=836.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2728/5971 [25:21<30:08,  1.79it/s, loss=0.189, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000469, train/loss_step=0.137, global_step=836.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  46%|████▌     | 2729/5971 [25:22<30:08,  1.79it/s, loss=0.195, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000395, train/loss_step=0.120, global_step=837.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2730/5971 [25:23<30:07,  1.79it/s, loss=0.186, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000606, train/loss_step=0.172, global_step=837.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2731/5971 [25:24<30:07,  1.79it/s, loss=0.186, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000606, train/loss_step=0.172, global_step=837.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2731/5971 [25:24<30:07,  1.79it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0229, train/loss_vlb_step=9.3e-5, train/loss_step=0.0229, global_step=837.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2732/5971 [25:26<30:08,  1.79it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0614, train/loss_vlb_step=0.000209, train/loss_step=0.0614, global_step=837.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2733/5971 [25:27<30:08,  1.79it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0489, train/loss_vlb_step=0.000175, train/loss_step=0.0489, global_step=838.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2734/5971 [25:28<30:08,  1.79it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00284, train/loss_vlb_step=1.57e-5, train/loss_step=0.00284, global_step=838.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2735/5971 [25:28<30:08,  1.79it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00284, train/loss_vlb_step=1.57e-5, train/loss_step=0.00284, global_step=838.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2735/5971 [25:28<30:08,  1.79it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0247, train/loss_vlb_step=9.52e-5, train/loss_step=0.0247, global_step=838.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  46%|████▌     | 2736/5971 [25:31<30:09,  1.79it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.01e-5, train/loss_step=0.0138, global_step=838.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2737/5971 [25:32<30:09,  1.79it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00707, train/loss_vlb_step=3.35e-5, train/loss_step=0.00707, global_step=839.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2738/5971 [25:33<30:09,  1.79it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00995, train/loss_vlb_step=4.64e-5, train/loss_step=0.00995, global_step=839.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2739/5971 [25:33<30:09,  1.79it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00995, train/loss_vlb_step=4.64e-5, train/loss_step=0.00995, global_step=839.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2739/5971 [25:33<30:09,  1.79it/s, loss=0.105, v_num=0, train/loss_simple_step=0.436, train/loss_vlb_step=0.00243, train/loss_step=0.436, global_step=839.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  46%|████▌     | 2740/5971 [25:36<30:10,  1.78it/s, loss=0.12, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.00108, train/loss_step=0.289, global_step=839.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  46%|████▌     | 2741/5971 [25:37<30:10,  1.78it/s, loss=0.128, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.0006, train/loss_step=0.176, global_step=840.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2742/5971 [25:37<30:10,  1.78it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0581, train/loss_vlb_step=0.0002, train/loss_step=0.0581, global_step=840.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2743/5971 [25:38<30:10,  1.78it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0581, train/loss_vlb_step=0.0002, train/loss_step=0.0581, global_step=840.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2743/5971 [25:38<30:10,  1.78it/s, loss=0.138, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00102, train/loss_step=0.236, global_step=840.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2744/5971 [25:41<30:11,  1.78it/s, loss=0.119, v_num=0, train/loss_simple_step=0.367, train/loss_vlb_step=0.00249, train/loss_step=0.367, global_step=840.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2745/5971 [25:42<30:11,  1.78it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000487, train/loss_step=0.148, global_step=841.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2746/5971 [25:43<30:11,  1.78it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0315, train/loss_vlb_step=0.000119, train/loss_step=0.0315, global_step=841.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2747/5971 [25:44<30:11,  1.78it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0315, train/loss_vlb_step=0.000119, train/loss_step=0.0315, global_step=841.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2747/5971 [25:44<30:11,  1.78it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0234, train/loss_vlb_step=9.77e-5, train/loss_step=0.0234, global_step=841.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  46%|████▌     | 2748/5971 [25:46<30:12,  1.78it/s, loss=0.131, v_num=0, train/loss_simple_step=0.366, train/loss_vlb_step=0.00234, train/loss_step=0.366, global_step=841.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  46%|████▌     | 2749/5971 [25:47<30:12,  1.78it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0215, train/loss_vlb_step=8.79e-5, train/loss_step=0.0215, global_step=842.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2750/5971 [25:47<30:12,  1.78it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0131, train/loss_vlb_step=5.39e-5, train/loss_step=0.0131, global_step=842.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2751/5971 [25:48<30:12,  1.78it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0131, train/loss_vlb_step=5.39e-5, train/loss_step=0.0131, global_step=842.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2751/5971 [25:48<30:12,  1.78it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0709, train/loss_vlb_step=0.000249, train/loss_step=0.0709, global_step=842.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2752/5971 [25:51<30:13,  1.77it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000117, train/loss_step=0.0318, global_step=842.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2753/5971 [25:51<30:13,  1.77it/s, loss=0.121, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000334, train/loss_step=0.101, global_step=843.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  46%|████▌     | 2754/5971 [25:52<30:13,  1.77it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0133, train/loss_vlb_step=5.88e-5, train/loss_step=0.0133, global_step=843.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2755/5971 [25:53<30:13,  1.77it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0133, train/loss_vlb_step=5.88e-5, train/loss_step=0.0133, global_step=843.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2755/5971 [25:53<30:13,  1.77it/s, loss=0.122, v_num=0, train/loss_simple_step=0.028, train/loss_vlb_step=0.000108, train/loss_step=0.028, global_step=843.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  46%|████▌     | 2756/5971 [25:56<30:14,  1.77it/s, loss=0.144, v_num=0, train/loss_simple_step=0.457, train/loss_vlb_step=0.00331, train/loss_step=0.457, global_step=843.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  46%|████▌     | 2757/5971 [25:57<30:14,  1.77it/s, loss=0.152, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000634, train/loss_step=0.174, global_step=844.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2758/5971 [25:57<30:14,  1.77it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.82e-5, train/loss_step=0.0154, global_step=844.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2759/5971 [25:58<30:14,  1.77it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.82e-5, train/loss_step=0.0154, global_step=844.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2759/5971 [25:58<30:14,  1.77it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00191, train/loss_vlb_step=1.15e-5, train/loss_step=0.00191, global_step=844.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▌     | 2760/5971 [26:00<30:15,  1.77it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00791, train/loss_vlb_step=3.9e-5, train/loss_step=0.00791, global_step=844.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  46%|████▌     | 2761/5971 [26:01<30:15,  1.77it/s, loss=0.119, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000763, train/loss_step=0.217, global_step=845.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  46%|████▋     | 2762/5971 [26:02<30:14,  1.77it/s, loss=0.128, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.000923, train/loss_step=0.237, global_step=845.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▋     | 2763/5971 [26:03<30:14,  1.77it/s, loss=0.128, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.000923, train/loss_step=0.237, global_step=845.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▋     | 2763/5971 [26:03<30:14,  1.77it/s, loss=0.125, v_num=0, train/loss_simple_step=0.169, train/loss_vlb_step=0.000569, train/loss_step=0.169, global_step=845.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▋     | 2764/5971 [26:06<30:16,  1.77it/s, loss=0.12, v_num=0, train/loss_simple_step=0.276, train/loss_vlb_step=0.00103, train/loss_step=0.276, global_step=845.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  46%|████▋     | 2765/5971 [26:06<30:16,  1.77it/s, loss=0.135, v_num=0, train/loss_simple_step=0.446, train/loss_vlb_step=0.00215, train/loss_step=0.446, global_step=846.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▋     | 2766/5971 [26:07<30:15,  1.76it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0614, train/loss_vlb_step=0.000205, train/loss_step=0.0614, global_step=846.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▋     | 2767/5971 [26:08<30:15,  1.76it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0614, train/loss_vlb_step=0.000205, train/loss_step=0.0614, global_step=846.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▋     | 2767/5971 [26:08<30:15,  1.76it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00401, train/loss_vlb_step=2.22e-5, train/loss_step=0.00401, global_step=846.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▋     | 2768/5971 [26:10<30:17,  1.76it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00767, train/loss_vlb_step=3.65e-5, train/loss_step=0.00767, global_step=846.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▋     | 2769/5971 [26:11<30:16,  1.76it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0167, train/loss_vlb_step=6.82e-5, train/loss_step=0.0167, global_step=847.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  46%|████▋     | 2770/5971 [26:12<30:16,  1.76it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00342, train/loss_vlb_step=1.95e-5, train/loss_step=0.00342, global_step=847.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▋     | 2771/5971 [26:13<30:16,  1.76it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00342, train/loss_vlb_step=1.95e-5, train/loss_step=0.00342, global_step=847.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▋     | 2771/5971 [26:13<30:16,  1.76it/s, loss=0.139, v_num=0, train/loss_simple_step=0.502, train/loss_vlb_step=0.0044, train/loss_step=0.502, global_step=847.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  46%|████▋     | 2772/5971 [26:15<30:17,  1.76it/s, loss=0.142, v_num=0, train/loss_simple_step=0.093, train/loss_vlb_step=0.000306, train/loss_step=0.093, global_step=847.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▋     | 2773/5971 [26:16<30:17,  1.76it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0336, train/loss_vlb_step=0.000126, train/loss_step=0.0336, global_step=848.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▋     | 2774/5971 [26:17<30:17,  1.76it/s, loss=0.163, v_num=0, train/loss_simple_step=0.506, train/loss_vlb_step=0.00376, train/loss_step=0.506, global_step=848.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  46%|████▋     | 2775/5971 [26:18<30:17,  1.76it/s, loss=0.163, v_num=0, train/loss_simple_step=0.506, train/loss_vlb_step=0.00376, train/loss_step=0.506, global_step=848.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▋     | 2775/5971 [26:18<30:17,  1.76it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0439, train/loss_vlb_step=0.000164, train/loss_step=0.0439, global_step=848.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  46%|████▋     | 2776/5971 [26:20<30:18,  1.76it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0657, train/loss_vlb_step=0.000227, train/loss_step=0.0657, global_step=848.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  47%|████▋     | 2777/5971 [26:21<30:18,  1.76it/s, loss=0.149, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.000971, train/loss_step=0.264, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  47%|████▋     | 2778/5971 [26:22<30:18,  1.76it/s, loss=0.154, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000421, train/loss_step=0.128, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  47%|████▋     | 2779/5971 [26:23<30:17,  1.76it/s, loss=0.154, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000421, train/loss_step=0.128, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  47%|████▋     | 2779/5971 [26:23<30:17,  1.76it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0323, train/loss_vlb_step=0.000114, train/loss_step=0.0323, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  47%|████▋     | 2780/5971 [26:25<30:19,  1.75it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<02:22,  1.16it/s]

Validating:   1%|          | 2/167 [00:01<01:22,  2.01it/s]
Epoch 1:  47%|████▋     | 2783/5971 [26:26<30:16,  1.75it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   3%|▎         | 5/167 [00:01<00:27,  5.88it/s]
Epoch 1:  47%|████▋     | 2787/5971 [26:26<30:12,  1.76it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   5%|▍         | 8/167 [00:01<00:16,  9.85it/s]
Epoch 1:  47%|████▋     | 2791/5971 [26:26<30:07,  1.76it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   7%|▋         | 12/167 [00:01<00:10, 14.62it/s]
Epoch 1:  47%|████▋     | 2795/5971 [26:26<30:02,  1.76it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   9%|▉         | 15/167 [00:01<00:08, 17.13it/s]

Validating:  11%|█         | 18/167 [00:01<00:08, 18.50it/s]
Epoch 1:  47%|████▋     | 2799/5971 [26:27<29:58,  1.76it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  13%|█▎        | 22/167 [00:01<00:06, 21.75it/s]
Epoch 1:  47%|████▋     | 2803/5971 [26:27<29:53,  1.77it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  15%|█▍        | 25/167 [00:01<00:06, 23.40it/s]
Epoch 1:  47%|████▋     | 2807/5971 [26:27<29:48,  1.77it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  17%|█▋        | 28/167 [00:02<00:05, 24.53it/s]
Epoch 1:  47%|████▋     | 2811/5971 [26:27<29:44,  1.77it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  19%|█▊        | 31/167 [00:02<00:05, 24.03it/s]

Validating:  20%|██        | 34/167 [00:02<00:05, 24.36it/s]
Epoch 1:  47%|████▋     | 2815/5971 [26:27<29:39,  1.77it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  22%|██▏       | 37/167 [00:02<00:05, 25.40it/s]
Epoch 1:  47%|████▋     | 2819/5971 [26:27<29:34,  1.78it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 27.25it/s]
Epoch 1:  47%|████▋     | 2823/5971 [26:28<29:30,  1.78it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 26.23it/s]
Epoch 1:  47%|████▋     | 2827/5971 [26:28<29:25,  1.78it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 26.16it/s]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 27.01it/s]
Epoch 1:  47%|████▋     | 2831/5971 [26:28<29:21,  1.78it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  32%|███▏      | 53/167 [00:03<00:04, 27.12it/s][A
Epoch 1:  47%|████▋     | 2835/5971 [26:28<29:16,  1.79it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  34%|███▎      | 56/167 [00:03<00:04, 26.88it/s][A
Epoch 1:  48%|████▊     | 2839/5971 [26:28<29:11,  1.79it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  35%|███▌      | 59/167 [00:03<00:04, 26.50it/s][A

Validating:  37%|███▋      | 62/167 [00:03<00:03, 26.58it/s][A
Epoch 1:  48%|████▊     | 2843/5971 [26:28<29:07,  1.79it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  39%|███▉      | 65/167 [00:03<00:03, 26.24it/s][A
Epoch 1:  48%|████▊     | 2847/5971 [26:28<29:02,  1.79it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  41%|████      | 68/167 [00:03<00:03, 26.80it/s][A
Epoch 1:  48%|████▊     | 2851/5971 [26:29<28:58,  1.79it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 27.14it/s][A

Validating:  44%|████▍     | 74/167 [00:03<00:03, 25.58it/s][A
Epoch 1:  48%|████▊     | 2855/5971 [26:29<28:53,  1.80it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 25.69it/s][A
Epoch 1:  48%|████▊     | 2859/5971 [26:29<28:49,  1.80it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  48%|████▊     | 80/167 [00:04<00:03, 26.75it/s][A
Epoch 1:  48%|████▊     | 2863/5971 [26:29<28:44,  1.80it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  50%|████▉     | 83/167 [00:04<00:03, 26.80it/s][A

Validating:  51%|█████▏    | 86/167 [00:04<00:04, 19.82it/s][A
Epoch 1:  48%|████▊     | 2867/5971 [26:29<28:40,  1.80it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  53%|█████▎    | 89/167 [00:04<00:03, 21.66it/s][A
Epoch 1:  48%|████▊     | 2871/5971 [26:29<28:36,  1.81it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  55%|█████▌    | 92/167 [00:04<00:03, 22.42it/s][A
Epoch 1:  48%|████▊     | 2875/5971 [26:30<28:31,  1.81it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  57%|█████▋    | 95/167 [00:04<00:03, 23.16it/s][A

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 24.11it/s][A
Epoch 1:  48%|████▊     | 2879/5971 [26:30<28:27,  1.81it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  60%|██████    | 101/167 [00:04<00:02, 24.40it/s][A
Epoch 1:  48%|████▊     | 2883/5971 [26:30<28:22,  1.81it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  62%|██████▏   | 104/167 [00:05<00:02, 25.33it/s][A
Epoch 1:  48%|████▊     | 2887/5971 [26:30<28:18,  1.82it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  64%|██████▍   | 107/167 [00:05<00:02, 24.97it/s][A

Validating:  66%|██████▌   | 110/167 [00:05<00:02, 24.96it/s][A
Epoch 1:  48%|████▊     | 2891/5971 [26:30<28:14,  1.82it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  68%|██████▊   | 113/167 [00:05<00:02, 25.21it/s][A
Epoch 1:  48%|████▊     | 2895/5971 [26:30<28:09,  1.82it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  69%|██████▉   | 116/167 [00:05<00:02, 25.36it/s][A
Epoch 1:  49%|████▊     | 2899/5971 [26:31<28:05,  1.82it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 25.25it/s][A

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 24.69it/s][A
Epoch 1:  49%|████▊     | 2903/5971 [26:31<28:01,  1.82it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 24.88it/s][A
Epoch 1:  49%|████▊     | 2907/5971 [26:31<27:56,  1.83it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  77%|███████▋  | 128/167 [00:06<00:01, 24.01it/s][A
Epoch 1:  49%|████▉     | 2911/5971 [26:31<27:52,  1.83it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  78%|███████▊  | 131/167 [00:06<00:01, 24.68it/s][A

Validating:  80%|████████  | 134/167 [00:06<00:01, 25.75it/s][A
Epoch 1:  49%|████▉     | 2915/5971 [26:31<27:48,  1.83it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  82%|████████▏ | 137/167 [00:06<00:01, 24.20it/s][A
Epoch 1:  49%|████▉     | 2919/5971 [26:31<27:43,  1.83it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  84%|████████▍ | 140/167 [00:06<00:01, 24.83it/s][A
Epoch 1:  49%|████▉     | 2923/5971 [26:32<27:39,  1.84it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 24.26it/s][A

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 23.51it/s][A
Epoch 1:  49%|████▉     | 2927/5971 [26:32<27:35,  1.84it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 24.45it/s][A
Epoch 1:  49%|████▉     | 2931/5971 [26:32<27:31,  1.84it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  91%|█████████ | 152/167 [00:07<00:00, 24.80it/s][A
Epoch 1:  49%|████▉     | 2935/5971 [26:32<27:26,  1.84it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  93%|█████████▎| 155/167 [00:07<00:00, 24.63it/s][A

Validating:  95%|█████████▍| 158/167 [00:07<00:00, 22.78it/s][A
Epoch 1:  49%|████▉     | 2939/5971 [26:32<27:22,  1.85it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  96%|█████████▋| 161/167 [00:07<00:00, 24.11it/s][A
Epoch 1:  49%|████▉     | 2943/5971 [26:32<27:18,  1.85it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  98%|█████████▊| 164/167 [00:07<00:00, 23.75it/s][A
Epoch 1:  49%|████▉     | 2947/5971 [26:33<27:14,  1.85it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating: 100%|██████████| 167/167 [00:07<00:00, 25.19it/s][A
Epoch 1:  49%|████▉     | 2948/5971 [26:33<27:13,  1.85it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000323, train/loss_step=0.0983, global_step=849.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Epoch 1:  49%|████▉     | 2949/5971 [26:34<27:13,  1.85it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0016, train/loss_vlb_step=9.75e-6, train/loss_step=0.0016, global_step=850.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  49%|████▉     | 2950/5971 [26:35<27:13,  1.85it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.87e-5, train/loss_step=0.0105, global_step=850.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  49%|████▉     | 2951/5971 [26:36<27:12,  1.85it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.87e-5, train/loss_step=0.0105, global_step=850.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  49%|████▉     | 2951/5971 [26:36<27:12,  1.85it/s, loss=0.138, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000549, train/loss_step=0.162, global_step=850.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  49%|████▉     | 2952/5971 [26:38<27:14,  1.85it/s, loss=0.129, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000362, train/loss_step=0.106, global_step=850.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  49%|████▉     | 2953/5971 [26:39<27:14,  1.85it/s, loss=0.133, v_num=0, train/loss_simple_step=0.523, train/loss_vlb_step=0.00989, train/loss_step=0.523, global_step=851.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  49%|████▉     | 2954/5971 [26:40<27:13,  1.85it/s, loss=0.143, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.000996, train/loss_step=0.268, global_step=851.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  49%|████▉     | 2955/5971 [26:41<27:13,  1.85it/s, loss=0.143, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.000996, train/loss_step=0.268, global_step=851.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  49%|████▉     | 2955/5971 [26:41<27:13,  1.85it/s, loss=0.179, v_num=0, train/loss_simple_step=0.708, train/loss_vlb_step=0.022, train/loss_step=0.708, global_step=851.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  50%|████▉     | 2956/5971 [26:43<27:14,  1.84it/s, loss=0.193, v_num=0, train/loss_simple_step=0.285, train/loss_vlb_step=0.00106, train/loss_step=0.285, global_step=851.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|████▉     | 2957/5971 [26:44<27:14,  1.84it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00341, train/loss_vlb_step=1.8e-5, train/loss_step=0.00341, global_step=852.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|████▉     | 2958/5971 [26:45<27:14,  1.84it/s, loss=0.212, v_num=0, train/loss_simple_step=0.404, train/loss_vlb_step=0.00292, train/loss_step=0.404, global_step=852.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  50%|████▉     | 2959/5971 [26:45<27:14,  1.84it/s, loss=0.212, v_num=0, train/loss_simple_step=0.404, train/loss_vlb_step=0.00292, train/loss_step=0.404, global_step=852.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|████▉     | 2959/5971 [26:45<27:14,  1.84it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0491, train/loss_vlb_step=0.000168, train/loss_step=0.0491, global_step=852.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|████▉     | 2960/5971 [26:48<27:15,  1.84it/s, loss=0.191, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000437, train/loss_step=0.132, global_step=852.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  50%|████▉     | 2961/5971 [26:49<27:15,  1.84it/s, loss=0.235, v_num=0, train/loss_simple_step=0.902, train/loss_vlb_step=0.0768, train/loss_step=0.902, global_step=853.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  50%|████▉     | 2962/5971 [26:49<27:14,  1.84it/s, loss=0.209, v_num=0, train/loss_simple_step=0.00193, train/loss_vlb_step=1.12e-5, train/loss_step=0.00193, global_step=853.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|████▉     | 2963/5971 [26:50<27:14,  1.84it/s, loss=0.209, v_num=0, train/loss_simple_step=0.00193, train/loss_vlb_step=1.12e-5, train/loss_step=0.00193, global_step=853.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|████▉     | 2963/5971 [26:50<27:14,  1.84it/s, loss=0.228, v_num=0, train/loss_simple_step=0.412, train/loss_vlb_step=0.00242, train/loss_step=0.412, global_step=853.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  50%|████▉     | 2964/5971 [26:52<27:15,  1.84it/s, loss=0.225, v_num=0, train/loss_simple_step=0.00412, train/loss_vlb_step=2.2e-5, train/loss_step=0.00412, global_step=853.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|████▉     | 2965/5971 [26:53<27:15,  1.84it/s, loss=0.246, v_num=0, train/loss_simple_step=0.689, train/loss_vlb_step=0.0176, train/loss_step=0.689, global_step=854.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  50%|████▉     | 2966/5971 [26:54<27:15,  1.84it/s, loss=0.24, v_num=0, train/loss_simple_step=0.00401, train/loss_vlb_step=2.19e-5, train/loss_step=0.00401, global_step=854.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|████▉     | 2967/5971 [26:55<27:15,  1.84it/s, loss=0.24, v_num=0, train/loss_simple_step=0.00401, train/loss_vlb_step=2.19e-5, train/loss_step=0.00401, global_step=854.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|████▉     | 2967/5971 [26:55<27:15,  1.84it/s, loss=0.245, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000477, train/loss_step=0.144, global_step=854.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  50%|████▉     | 2968/5971 [26:57<27:16,  1.84it/s, loss=0.242, v_num=0, train/loss_simple_step=0.0393, train/loss_vlb_step=0.000138, train/loss_step=0.0393, global_step=854.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|████▉     | 2969/5971 [26:58<27:16,  1.83it/s, loss=0.245, v_num=0, train/loss_simple_step=0.0583, train/loss_vlb_step=0.000197, train/loss_step=0.0583, global_step=855.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|████▉     | 2970/5971 [26:59<27:15,  1.83it/s, loss=0.245, v_num=0, train/loss_simple_step=0.00301, train/loss_vlb_step=1.64e-5, train/loss_step=0.00301, global_step=855.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|████▉     | 2971/5971 [27:00<27:15,  1.83it/s, loss=0.245, v_num=0, train/loss_simple_step=0.00301, train/loss_vlb_step=1.64e-5, train/loss_step=0.00301, global_step=855.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|████▉     | 2971/5971 [27:00<27:15,  1.83it/s, loss=0.24, v_num=0, train/loss_simple_step=0.0663, train/loss_vlb_step=0.000224, train/loss_step=0.0663, global_step=855.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  50%|████▉     | 2972/5971 [27:02<27:16,  1.83it/s, loss=0.239, v_num=0, train/loss_simple_step=0.0728, train/loss_vlb_step=0.000242, train/loss_step=0.0728, global_step=855.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|████▉     | 2973/5971 [27:03<27:16,  1.83it/s, loss=0.218, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000362, train/loss_step=0.107, global_step=856.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  50%|████▉     | 2974/5971 [27:04<27:16,  1.83it/s, loss=0.205, v_num=0, train/loss_simple_step=0.00888, train/loss_vlb_step=3.9e-5, train/loss_step=0.00888, global_step=856.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|████▉     | 2975/5971 [27:05<27:16,  1.83it/s, loss=0.205, v_num=0, train/loss_simple_step=0.00888, train/loss_vlb_step=3.9e-5, train/loss_step=0.00888, global_step=856.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|████▉     | 2975/5971 [27:05<27:16,  1.83it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00223, train/loss_vlb_step=1.33e-5, train/loss_step=0.00223, global_step=856.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|████▉     | 2976/5971 [27:07<27:17,  1.83it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0193, train/loss_vlb_step=7.5e-5, train/loss_step=0.0193, global_step=856.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  50%|████▉     | 2977/5971 [27:08<27:17,  1.83it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000178, train/loss_step=0.0497, global_step=857.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|████▉     | 2978/5971 [27:09<27:16,  1.83it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.00192, train/loss_step=0.391, global_step=857.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  50%|████▉     | 2979/5971 [27:10<27:16,  1.83it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.00192, train/loss_step=0.391, global_step=857.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|████▉     | 2979/5971 [27:10<27:16,  1.83it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=5.08e-5, train/loss_step=0.0114, global_step=857.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|████▉     | 2980/5971 [27:12<27:18,  1.83it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00648, train/loss_vlb_step=3.1e-5, train/loss_step=0.00648, global_step=857.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|████▉     | 2981/5971 [27:13<27:17,  1.83it/s, loss=0.112, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000478, train/loss_step=0.142, global_step=858.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  50%|████▉     | 2982/5971 [27:14<27:17,  1.83it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0311, train/loss_vlb_step=0.000112, train/loss_step=0.0311, global_step=858.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|████▉     | 2983/5971 [27:15<27:17,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0311, train/loss_vlb_step=0.000112, train/loss_step=0.0311, global_step=858.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|████▉     | 2983/5971 [27:15<27:17,  1.82it/s, loss=0.0943, v_num=0, train/loss_simple_step=0.0361, train/loss_vlb_step=0.000134, train/loss_step=0.0361, global_step=858.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|████▉     | 2984/5971 [27:18<27:19,  1.82it/s, loss=0.0952, v_num=0, train/loss_simple_step=0.0214, train/loss_vlb_step=9.27e-5, train/loss_step=0.0214, global_step=858.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  50%|████▉     | 2985/5971 [27:19<27:19,  1.82it/s, loss=0.0614, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=6.16e-5, train/loss_step=0.0143, global_step=859.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|█████     | 2986/5971 [27:20<27:19,  1.82it/s, loss=0.0615, v_num=0, train/loss_simple_step=0.00477, train/loss_vlb_step=2.5e-5, train/loss_step=0.00477, global_step=859.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|█████     | 2987/5971 [27:21<27:19,  1.82it/s, loss=0.0615, v_num=0, train/loss_simple_step=0.00477, train/loss_vlb_step=2.5e-5, train/loss_step=0.00477, global_step=859.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|█████     | 2987/5971 [27:21<27:19,  1.82it/s, loss=0.0595, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=859.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  50%|█████     | 2988/5971 [27:24<27:20,  1.82it/s, loss=0.0576, v_num=0, train/loss_simple_step=0.00257, train/loss_vlb_step=1.43e-5, train/loss_step=0.00257, global_step=859.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|█████     | 2989/5971 [27:25<27:20,  1.82it/s, loss=0.0558, v_num=0, train/loss_simple_step=0.0209, train/loss_vlb_step=8.46e-5, train/loss_step=0.0209, global_step=860.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  50%|█████     | 2990/5971 [27:26<27:20,  1.82it/s, loss=0.057, v_num=0, train/loss_simple_step=0.0275, train/loss_vlb_step=0.000107, train/loss_step=0.0275, global_step=860.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|█████     | 2991/5971 [27:26<27:20,  1.82it/s, loss=0.057, v_num=0, train/loss_simple_step=0.0275, train/loss_vlb_step=0.000107, train/loss_step=0.0275, global_step=860.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|█████     | 2991/5971 [27:26<27:20,  1.82it/s, loss=0.0566, v_num=0, train/loss_simple_step=0.0576, train/loss_vlb_step=0.000196, train/loss_step=0.0576, global_step=860.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|█████     | 2992/5971 [27:29<27:22,  1.81it/s, loss=0.0566, v_num=0, train/loss_simple_step=0.0736, train/loss_vlb_step=0.000248, train/loss_step=0.0736, global_step=860.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|█████     | 2993/5971 [27:30<27:22,  1.81it/s, loss=0.0514, v_num=0, train/loss_simple_step=0.00291, train/loss_vlb_step=1.59e-5, train/loss_step=0.00291, global_step=861.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|█████     | 2994/5971 [27:31<27:21,  1.81it/s, loss=0.0575, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.00043, train/loss_step=0.131, global_step=861.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  50%|█████     | 2995/5971 [27:32<27:21,  1.81it/s, loss=0.0575, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.00043, train/loss_step=0.131, global_step=861.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|█████     | 2995/5971 [27:32<27:21,  1.81it/s, loss=0.0637, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000416, train/loss_step=0.126, global_step=861.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|█████     | 2996/5971 [27:35<27:23,  1.81it/s, loss=0.0628, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.8e-6, train/loss_step=0.00164, global_step=861.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|█████     | 2997/5971 [27:36<27:22,  1.81it/s, loss=0.105, v_num=0, train/loss_simple_step=0.902, train/loss_vlb_step=0.092, train/loss_step=0.902, global_step=862.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]      
Epoch 1:  50%|█████     | 2998/5971 [27:37<27:22,  1.81it/s, loss=0.0934, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000536, train/loss_step=0.151, global_step=862.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|█████     | 2999/5971 [27:38<27:22,  1.81it/s, loss=0.0934, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000536, train/loss_step=0.151, global_step=862.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|█████     | 2999/5971 [27:38<27:22,  1.81it/s, loss=0.0996, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000462, train/loss_step=0.135, global_step=862.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|█████     | 3000/5971 [27:41<27:24,  1.81it/s, loss=0.107, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000513, train/loss_step=0.156, global_step=862.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  50%|█████     | 3001/5971 [27:41<27:24,  1.81it/s, loss=0.108, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000511, train/loss_step=0.151, global_step=863.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|█████     | 3002/5971 [27:42<27:24,  1.81it/s, loss=0.118, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.000948, train/loss_step=0.236, global_step=863.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|█████     | 3003/5971 [27:43<27:23,  1.81it/s, loss=0.118, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.000948, train/loss_step=0.236, global_step=863.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|█████     | 3003/5971 [27:43<27:23,  1.81it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=0.0001, train/loss_step=0.0272, global_step=863.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|█████     | 3004/5971 [27:46<27:25,  1.80it/s, loss=0.12, v_num=0, train/loss_simple_step=0.071, train/loss_vlb_step=0.000247, train/loss_step=0.071, global_step=863.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  50%|█████     | 3005/5971 [27:47<27:24,  1.80it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0855, train/loss_vlb_step=0.000281, train/loss_step=0.0855, global_step=864.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|█████     | 3006/5971 [27:47<27:24,  1.80it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0358, train/loss_vlb_step=0.000126, train/loss_step=0.0358, global_step=864.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|█████     | 3007/5971 [27:48<27:24,  1.80it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0358, train/loss_vlb_step=0.000126, train/loss_step=0.0358, global_step=864.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|█████     | 3007/5971 [27:48<27:24,  1.80it/s, loss=0.124, v_num=0, train/loss_simple_step=0.089, train/loss_vlb_step=0.0003, train/loss_step=0.089, global_step=864.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  50%|█████     | 3008/5971 [27:50<27:25,  1.80it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00762, train/loss_vlb_step=3.7e-5, train/loss_step=0.00762, global_step=864.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|█████     | 3009/5971 [27:51<27:25,  1.80it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0081, train/loss_vlb_step=3.87e-5, train/loss_step=0.0081, global_step=865.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  50%|█████     | 3010/5971 [27:52<27:24,  1.80it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0411, train/loss_vlb_step=0.000147, train/loss_step=0.0411, global_step=865.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|█████     | 3011/5971 [27:53<27:24,  1.80it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0411, train/loss_vlb_step=0.000147, train/loss_step=0.0411, global_step=865.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|█████     | 3011/5971 [27:53<27:24,  1.80it/s, loss=0.142, v_num=0, train/loss_simple_step=0.414, train/loss_vlb_step=0.00277, train/loss_step=0.414, global_step=865.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  50%|█████     | 3012/5971 [27:56<27:26,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00391, train/loss_vlb_step=2.11e-5, train/loss_step=0.00391, global_step=865.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|█████     | 3013/5971 [27:57<27:26,  1.80it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0328, train/loss_vlb_step=0.000131, train/loss_step=0.0328, global_step=866.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  50%|█████     | 3014/5971 [27:58<27:26,  1.80it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0378, train/loss_vlb_step=0.000135, train/loss_step=0.0378, global_step=866.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|█████     | 3015/5971 [27:59<27:25,  1.80it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0378, train/loss_vlb_step=0.000135, train/loss_step=0.0378, global_step=866.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  50%|█████     | 3015/5971 [27:59<27:25,  1.80it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00574, train/loss_vlb_step=2.75e-5, train/loss_step=0.00574, global_step=866.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  51%|█████     | 3016/5971 [28:01<27:27,  1.79it/s, loss=0.139, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.00081, train/loss_step=0.190, global_step=866.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  51%|█████     | 3017/5971 [28:02<27:26,  1.79it/s, loss=0.102, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000508, train/loss_step=0.154, global_step=867.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  51%|█████     | 3018/5971 [28:03<27:26,  1.79it/s, loss=0.0952, v_num=0, train/loss_simple_step=0.0212, train/loss_vlb_step=8.43e-5, train/loss_step=0.0212, global_step=867.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  51%|█████     | 3019/5971 [28:04<27:26,  1.79it/s, loss=0.0952, v_num=0, train/loss_simple_step=0.0212, train/loss_vlb_step=8.43e-5, train/loss_step=0.0212, global_step=867.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  51%|█████     | 3019/5971 [28:04<27:26,  1.79it/s, loss=0.101, v_num=0, train/loss_simple_step=0.254, train/loss_vlb_step=0.00105, train/loss_step=0.254, global_step=867.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  51%|█████     | 3020/5971 [28:06<27:27,  1.79it/s, loss=0.094, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.58e-5, train/loss_step=0.0124, global_step=867.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  51%|█████     | 3021/5971 [28:07<27:27,  1.79it/s, loss=0.0865, v_num=0, train/loss_simple_step=0.00182, train/loss_vlb_step=1.08e-5, train/loss_step=0.00182, global_step=868.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  51%|█████     | 3022/5971 [28:08<27:26,  1.79it/s, loss=0.0753, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.7e-5, train/loss_step=0.0132, global_step=868.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  51%|█████     | 3023/5971 [28:09<27:26,  1.79it/s, loss=0.0753, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.7e-5, train/loss_step=0.0132, global_step=868.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  51%|█████     | 3023/5971 [28:09<27:26,  1.79it/s, loss=0.0797, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000382, train/loss_step=0.115, global_step=868.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  51%|█████     | 3024/5971 [28:11<27:27,  1.79it/s, loss=0.0785, v_num=0, train/loss_simple_step=0.0455, train/loss_vlb_step=0.00016, train/loss_step=0.0455, global_step=868.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  51%|█████     | 3025/5971 [28:12<27:27,  1.79it/s, loss=0.0743, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.05e-5, train/loss_step=0.00174, global_step=869.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  51%|█████     | 3026/5971 [28:13<27:27,  1.79it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00131, train/loss_step=0.287, global_step=869.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  51%|█████     | 3027/5971 [28:14<27:27,  1.79it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00131, train/loss_step=0.287, global_step=869.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  51%|█████     | 3027/5971 [28:14<27:27,  1.79it/s, loss=0.083, v_num=0, train/loss_simple_step=0.012, train/loss_vlb_step=5.36e-5, train/loss_step=0.012, global_step=869.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  51%|█████     | 3028/5971 [28:16<27:28,  1.79it/s, loss=0.0834, v_num=0, train/loss_simple_step=0.0172, train/loss_vlb_step=6.78e-5, train/loss_step=0.0172, global_step=869.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  51%|█████     | 3029/5971 [28:17<27:28,  1.78it/s, loss=0.0831, v_num=0, train/loss_simple_step=0.00178, train/loss_vlb_step=1.07e-5, train/loss_step=0.00178, global_step=870.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  51%|█████     | 3030/5971 [28:18<27:28,  1.78it/s, loss=0.0812, v_num=0, train/loss_simple_step=0.00256, train/loss_vlb_step=1.41e-5, train/loss_step=0.00256, global_step=870.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  51%|█████     | 3031/5971 [28:19<27:27,  1.78it/s, loss=0.0812, v_num=0, train/loss_simple_step=0.00256, train/loss_vlb_step=1.41e-5, train/loss_step=0.00256, global_step=870.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  51%|█████     | 3031/5971 [28:19<27:27,  1.78it/s, loss=0.0709, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000716, train/loss_step=0.209, global_step=870.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  51%|█████     | 3032/5971 [28:21<27:28,  1.78it/s, loss=0.0709, v_num=0, train/loss_simple_step=0.00249, train/loss_vlb_step=1.43e-5, train/loss_step=0.00249, global_step=870.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  51%|█████     | 3033/5971 [28:22<27:28,  1.78it/s, loss=0.0778, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000591, train/loss_step=0.172, global_step=871.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  51%|█████     | 3034/5971 [28:23<27:28,  1.78it/s, loss=0.0793, v_num=0, train/loss_simple_step=0.0668, train/loss_vlb_step=0.000228, train/loss_step=0.0668, global_step=871.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  51%|█████     | 3035/5971 [28:24<27:27,  1.78it/s, loss=0.0793, v_num=0, train/loss_simple_step=0.0668, train/loss_vlb_step=0.000228, train/loss_step=0.0668, global_step=871.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  51%|█████     | 3035/5971 [28:24<27:27,  1.78it/s, loss=0.0817, v_num=0, train/loss_simple_step=0.0551, train/loss_vlb_step=0.000202, train/loss_step=0.0551, global_step=871.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  51%|█████     | 3036/5971 [28:26<27:28,  1.78it/s, loss=0.0779, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000376, train/loss_step=0.114, global_step=871.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  51%|█████     | 3037/5971 [28:27<27:28,  1.78it/s, loss=0.0776, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000506, train/loss_step=0.147, global_step=872.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  51%|█████     | 3038/5971 [28:28<27:28,  1.78it/s, loss=0.0792, v_num=0, train/loss_simple_step=0.0538, train/loss_vlb_step=0.000186, train/loss_step=0.0538, global_step=872.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  51%|█████     | 3039/5971 [28:28<27:28,  1.78it/s, loss=0.0792, v_num=0, train/loss_simple_step=0.0538, train/loss_vlb_step=0.000186, train/loss_step=0.0538, global_step=872.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  51%|█████     | 3039/5971 [28:28<27:28,  1.78it/s, loss=0.0669, v_num=0, train/loss_simple_step=0.0088, train/loss_vlb_step=4.32e-5, train/loss_step=0.0088, global_step=872.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  51%|█████     | 3040/5971 [28:31<27:29,  1.78it/s, loss=0.07, v_num=0, train/loss_simple_step=0.0738, train/loss_vlb_step=0.000242, train/loss_step=0.0738, global_step=872.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  51%|█████     | 3041/5971 [28:31<27:28,  1.78it/s, loss=0.0765, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000447, train/loss_step=0.132, global_step=873.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  51%|█████     | 3042/5971 [28:32<27:28,  1.78it/s, loss=0.0777, v_num=0, train/loss_simple_step=0.0367, train/loss_vlb_step=0.000137, train/loss_step=0.0367, global_step=873.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  51%|█████     | 3043/5971 [28:33<27:28,  1.78it/s, loss=0.0777, v_num=0, train/loss_simple_step=0.0367, train/loss_vlb_step=0.000137, train/loss_step=0.0367, global_step=873.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  51%|█████     | 3043/5971 [28:33<27:28,  1.78it/s, loss=0.0768, v_num=0, train/loss_simple_step=0.0962, train/loss_vlb_step=0.000317, train/loss_step=0.0962, global_step=873.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  51%|█████     | 3044/5971 [28:36<27:29,  1.77it/s, loss=0.0967, v_num=0, train/loss_simple_step=0.445, train/loss_vlb_step=0.00245, train/loss_step=0.445, global_step=873.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  51%|█████     | 3045/5971 [28:37<27:29,  1.77it/s, loss=0.102, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000358, train/loss_step=0.109, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  51%|█████     | 3046/5971 [28:38<27:29,  1.77it/s, loss=0.0881, v_num=0, train/loss_simple_step=0.00646, train/loss_vlb_step=3.39e-5, train/loss_step=0.00646, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  51%|█████     | 3047/5971 [28:39<27:29,  1.77it/s, loss=0.0881, v_num=0, train/loss_simple_step=0.00646, train/loss_vlb_step=3.39e-5, train/loss_step=0.00646, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  51%|█████     | 3047/5971 [28:39<27:29,  1.77it/s, loss=0.0876, v_num=0, train/loss_simple_step=0.0023, train/loss_vlb_step=1.36e-5, train/loss_step=0.0023, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  51%|█████     | 3048/5971 [28:41<27:30,  1.77it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:13,  2.24it/s][A

Validating:   1%|          | 2/167 [00:00<00:42,  3.87it/s][A
Epoch 1:  51%|█████     | 3051/5971 [28:42<27:27,  1.77it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   3%|▎         | 5/167 [00:00<00:16,  9.70it/s][A
Epoch 1:  51%|█████     | 3055/5971 [28:42<27:23,  1.77it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   5%|▍         | 8/167 [00:00<00:11, 14.43it/s][A
Epoch 1:  51%|█████     | 3059/5971 [28:42<27:19,  1.78it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   7%|▋         | 11/167 [00:00<00:08, 18.20it/s][A

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.35it/s][A
Epoch 1:  51%|█████▏    | 3063/5971 [28:42<27:14,  1.78it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  10%|█         | 17/167 [00:01<00:07, 20.56it/s][A
Epoch 1:  51%|█████▏    | 3067/5971 [28:42<27:10,  1.78it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.56it/s][A
Epoch 1:  51%|█████▏    | 3071/5971 [28:42<27:06,  1.78it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 23.51it/s][A

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.95it/s][A
Epoch 1:  51%|█████▏    | 3075/5971 [28:42<27:02,  1.79it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 26.25it/s][A
Epoch 1:  52%|█████▏    | 3079/5971 [28:43<26:57,  1.79it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 26.08it/s][A
Epoch 1:  52%|█████▏    | 3083/5971 [28:43<26:53,  1.79it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  21%|██        | 35/167 [00:01<00:05, 25.73it/s][A

Validating:  23%|██▎       | 38/167 [00:01<00:04, 25.81it/s][A
Epoch 1:  52%|█████▏    | 3087/5971 [28:43<26:49,  1.79it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 25.95it/s][A
Epoch 1:  52%|█████▏    | 3091/5971 [28:43<26:45,  1.79it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 25.74it/s][A
Epoch 1:  52%|█████▏    | 3095/5971 [28:43<26:41,  1.80it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 26.70it/s][A

Validating:  30%|██▉       | 50/167 [00:02<00:04, 26.86it/s][A
Epoch 1:  52%|█████▏    | 3099/5971 [28:43<26:37,  1.80it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 28.22it/s][A
Epoch 1:  52%|█████▏    | 3103/5971 [28:44<26:32,  1.80it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 26.78it/s][A
Epoch 1:  52%|█████▏    | 3107/5971 [28:44<26:28,  1.80it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  36%|███▌      | 60/167 [00:02<00:04, 26.24it/s][A
Epoch 1:  52%|█████▏    | 3111/5971 [28:44<26:24,  1.80it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  38%|███▊      | 63/167 [00:02<00:04, 25.78it/s][A

Validating:  40%|███▉      | 66/167 [00:03<00:03, 26.52it/s][A
Epoch 1:  52%|█████▏    | 3115/5971 [28:44<26:20,  1.81it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 26.75it/s][A
Epoch 1:  52%|█████▏    | 3119/5971 [28:44<26:16,  1.81it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 27.01it/s][A
Epoch 1:  52%|█████▏    | 3123/5971 [28:44<26:12,  1.81it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 27.20it/s][A

Validating:  47%|████▋     | 78/167 [00:03<00:03, 25.14it/s][A
Epoch 1:  52%|█████▏    | 3127/5971 [28:44<26:08,  1.81it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 25.38it/s][A
Epoch 1:  52%|█████▏    | 3131/5971 [28:45<26:04,  1.82it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  51%|█████     | 85/167 [00:03<00:03, 27.02it/s][A
Epoch 1:  53%|█████▎    | 3135/5971 [28:45<26:00,  1.82it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  53%|█████▎    | 88/167 [00:03<00:02, 27.19it/s][A
Epoch 1:  53%|█████▎    | 3139/5971 [28:45<25:56,  1.82it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  54%|█████▍    | 91/167 [00:03<00:02, 26.43it/s][A

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 25.91it/s][A
Epoch 1:  53%|█████▎    | 3143/5971 [28:45<25:52,  1.82it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 25.91it/s][A
Epoch 1:  53%|█████▎    | 3147/5971 [28:45<25:48,  1.82it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 26.88it/s][A
Epoch 1:  53%|█████▎    | 3151/5971 [28:45<25:44,  1.83it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 27.23it/s][A

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 26.70it/s][A
Epoch 1:  53%|█████▎    | 3155/5971 [28:46<25:40,  1.83it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 24.50it/s][A
Epoch 1:  53%|█████▎    | 3159/5971 [28:46<25:36,  1.83it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  67%|██████▋   | 112/167 [00:04<00:02, 25.22it/s][A
Epoch 1:  53%|█████▎    | 3163/5971 [28:46<25:32,  1.83it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  69%|██████▉   | 115/167 [00:04<00:02, 25.49it/s][A

Validating:  71%|███████   | 118/167 [00:05<00:01, 25.42it/s][A
Epoch 1:  53%|█████▎    | 3167/5971 [28:46<25:28,  1.83it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 25.56it/s][A
Epoch 1:  53%|█████▎    | 3171/5971 [28:46<25:24,  1.84it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 25.71it/s][A
Epoch 1:  53%|█████▎    | 3175/5971 [28:46<25:20,  1.84it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 26.29it/s][A

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 26.52it/s][A
Epoch 1:  53%|█████▎    | 3179/5971 [28:46<25:16,  1.84it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 26.84it/s][A
Epoch 1:  53%|█████▎    | 3183/5971 [28:47<25:12,  1.84it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 27.60it/s][A
Epoch 1:  53%|█████▎    | 3187/5971 [28:47<25:08,  1.85it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 26.23it/s][A

Validating:  85%|████████▌ | 142/167 [00:05<00:00, 25.73it/s][A
Epoch 1:  53%|█████▎    | 3191/5971 [28:47<25:04,  1.85it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 26.06it/s][A
Epoch 1:  54%|█████▎    | 3195/5971 [28:47<25:00,  1.85it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 26.61it/s][A
Epoch 1:  54%|█████▎    | 3199/5971 [28:47<24:56,  1.85it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 26.93it/s][A

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 26.79it/s][A
Epoch 1:  54%|█████▎    | 3203/5971 [28:47<24:52,  1.85it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 26.36it/s][A
Epoch 1:  54%|█████▎    | 3207/5971 [28:47<24:48,  1.86it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 25.72it/s][A
Epoch 1:  54%|█████▍    | 3211/5971 [28:48<24:44,  1.86it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 26.15it/s][A

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 25.87it/s][A
Epoch 1:  54%|█████▍    | 3215/5971 [28:48<24:41,  1.86it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3216/5971 [28:48<24:40,  1.86it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=874.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Epoch 1:  54%|█████▍    | 3217/5971 [28:49<24:40,  1.86it/s, loss=0.127, v_num=0, train/loss_simple_step=0.813, train/loss_vlb_step=0.0595, train/loss_step=0.813, global_step=875.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]      
Epoch 1:  54%|█████▍    | 3218/5971 [28:50<24:40,  1.86it/s, loss=0.145, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00205, train/loss_step=0.364, global_step=875.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3219/5971 [28:51<24:39,  1.86it/s, loss=0.145, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00205, train/loss_step=0.364, global_step=875.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3219/5971 [28:51<24:39,  1.86it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00699, train/loss_vlb_step=3.35e-5, train/loss_step=0.00699, global_step=875.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3220/5971 [28:53<24:40,  1.86it/s, loss=0.146, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.000897, train/loss_step=0.219, global_step=875.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  54%|█████▍    | 3221/5971 [28:54<24:40,  1.86it/s, loss=0.147, v_num=0, train/loss_simple_step=0.195, train/loss_vlb_step=0.000645, train/loss_step=0.195, global_step=876.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3222/5971 [28:55<24:40,  1.86it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00724, train/loss_vlb_step=3.53e-5, train/loss_step=0.00724, global_step=876.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3223/5971 [28:56<24:40,  1.86it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00724, train/loss_vlb_step=3.53e-5, train/loss_step=0.00724, global_step=876.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3223/5971 [28:56<24:40,  1.86it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0221, train/loss_vlb_step=8.7e-5, train/loss_step=0.0221, global_step=876.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  54%|█████▍    | 3224/5971 [28:58<24:40,  1.85it/s, loss=0.158, v_num=0, train/loss_simple_step=0.413, train/loss_vlb_step=0.0023, train/loss_step=0.413, global_step=876.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  54%|█████▍    | 3225/5971 [28:59<24:40,  1.85it/s, loss=0.161, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000855, train/loss_step=0.222, global_step=877.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3226/5971 [29:00<24:40,  1.85it/s, loss=0.164, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=877.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3227/5971 [29:01<24:40,  1.85it/s, loss=0.164, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=877.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3227/5971 [29:01<24:40,  1.85it/s, loss=0.169, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000362, train/loss_step=0.110, global_step=877.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3228/5971 [29:03<24:40,  1.85it/s, loss=0.176, v_num=0, train/loss_simple_step=0.212, train/loss_vlb_step=0.000738, train/loss_step=0.212, global_step=877.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3229/5971 [29:04<24:40,  1.85it/s, loss=0.183, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.000938, train/loss_step=0.264, global_step=878.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3230/5971 [29:05<24:40,  1.85it/s, loss=0.188, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=878.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3231/5971 [29:05<24:40,  1.85it/s, loss=0.188, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=878.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3231/5971 [29:05<24:40,  1.85it/s, loss=0.205, v_num=0, train/loss_simple_step=0.440, train/loss_vlb_step=0.00334, train/loss_step=0.440, global_step=878.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  54%|█████▍    | 3232/5971 [29:08<24:40,  1.85it/s, loss=0.213, v_num=0, train/loss_simple_step=0.603, train/loss_vlb_step=0.0118, train/loss_step=0.603, global_step=878.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  54%|█████▍    | 3233/5971 [29:08<24:40,  1.85it/s, loss=0.208, v_num=0, train/loss_simple_step=0.00989, train/loss_vlb_step=4.61e-5, train/loss_step=0.00989, global_step=879.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3234/5971 [29:09<24:40,  1.85it/s, loss=0.218, v_num=0, train/loss_simple_step=0.199, train/loss_vlb_step=0.000724, train/loss_step=0.199, global_step=879.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  54%|█████▍    | 3235/5971 [29:10<24:40,  1.85it/s, loss=0.218, v_num=0, train/loss_simple_step=0.199, train/loss_vlb_step=0.000724, train/loss_step=0.199, global_step=879.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3235/5971 [29:10<24:40,  1.85it/s, loss=0.218, v_num=0, train/loss_simple_step=0.00373, train/loss_vlb_step=1.99e-5, train/loss_step=0.00373, global_step=879.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3236/5971 [29:12<24:41,  1.85it/s, loss=0.227, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.00061, train/loss_step=0.181, global_step=879.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  54%|█████▍    | 3237/5971 [29:13<24:40,  1.85it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0248, train/loss_vlb_step=9.62e-5, train/loss_step=0.0248, global_step=880.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3238/5971 [29:14<24:40,  1.85it/s, loss=0.179, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000761, train/loss_step=0.203, global_step=880.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  54%|█████▍    | 3239/5971 [29:15<24:40,  1.85it/s, loss=0.179, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000761, train/loss_step=0.203, global_step=880.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3239/5971 [29:15<24:40,  1.85it/s, loss=0.189, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000718, train/loss_step=0.209, global_step=880.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3240/5971 [29:17<24:41,  1.84it/s, loss=0.179, v_num=0, train/loss_simple_step=0.00962, train/loss_vlb_step=4.33e-5, train/loss_step=0.00962, global_step=880.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3241/5971 [29:18<24:41,  1.84it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0778, train/loss_vlb_step=0.000267, train/loss_step=0.0778, global_step=881.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  54%|█████▍    | 3242/5971 [29:19<24:40,  1.84it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0171, train/loss_vlb_step=7.09e-5, train/loss_step=0.0171, global_step=881.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  54%|█████▍    | 3243/5971 [29:20<24:40,  1.84it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0171, train/loss_vlb_step=7.09e-5, train/loss_step=0.0171, global_step=881.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3243/5971 [29:20<24:40,  1.84it/s, loss=0.197, v_num=0, train/loss_simple_step=0.497, train/loss_vlb_step=0.00412, train/loss_step=0.497, global_step=881.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  54%|█████▍    | 3244/5971 [29:22<24:41,  1.84it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00204, train/loss_vlb_step=1.1e-5, train/loss_step=0.00204, global_step=881.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3245/5971 [29:23<24:41,  1.84it/s, loss=0.175, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.000639, train/loss_step=0.186, global_step=882.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  54%|█████▍    | 3246/5971 [29:24<24:40,  1.84it/s, loss=0.205, v_num=0, train/loss_simple_step=0.719, train/loss_vlb_step=0.0339, train/loss_step=0.719, global_step=882.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  54%|█████▍    | 3247/5971 [29:25<24:40,  1.84it/s, loss=0.205, v_num=0, train/loss_simple_step=0.719, train/loss_vlb_step=0.0339, train/loss_step=0.719, global_step=882.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3247/5971 [29:25<24:40,  1.84it/s, loss=0.221, v_num=0, train/loss_simple_step=0.428, train/loss_vlb_step=0.00199, train/loss_step=0.428, global_step=882.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3248/5971 [29:28<24:41,  1.84it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.18e-5, train/loss_step=0.0119, global_step=882.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3249/5971 [29:28<24:41,  1.84it/s, loss=0.205, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000443, train/loss_step=0.135, global_step=883.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  54%|█████▍    | 3250/5971 [29:30<24:41,  1.84it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0796, train/loss_vlb_step=0.000277, train/loss_step=0.0796, global_step=883.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3251/5971 [29:30<24:41,  1.84it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0796, train/loss_vlb_step=0.000277, train/loss_step=0.0796, global_step=883.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3251/5971 [29:30<24:41,  1.84it/s, loss=0.193, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.000944, train/loss_step=0.262, global_step=883.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  54%|█████▍    | 3252/5971 [29:32<24:41,  1.83it/s, loss=0.169, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000404, train/loss_step=0.122, global_step=883.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3253/5971 [29:33<24:41,  1.83it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0046, train/loss_vlb_step=2.45e-5, train/loss_step=0.0046, global_step=884.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  54%|█████▍    | 3254/5971 [29:34<24:41,  1.83it/s, loss=0.164, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000345, train/loss_step=0.105, global_step=884.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  55%|█████▍    | 3255/5971 [29:35<24:41,  1.83it/s, loss=0.164, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000345, train/loss_step=0.105, global_step=884.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▍    | 3255/5971 [29:35<24:41,  1.83it/s, loss=0.174, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000709, train/loss_step=0.196, global_step=884.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▍    | 3256/5971 [29:37<24:41,  1.83it/s, loss=0.17, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000339, train/loss_step=0.103, global_step=884.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  55%|█████▍    | 3257/5971 [29:38<24:41,  1.83it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0661, train/loss_vlb_step=0.000221, train/loss_step=0.0661, global_step=885.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▍    | 3258/5971 [29:39<24:41,  1.83it/s, loss=0.198, v_num=0, train/loss_simple_step=0.730, train/loss_vlb_step=0.0142, train/loss_step=0.730, global_step=885.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  55%|█████▍    | 3259/5971 [29:40<24:41,  1.83it/s, loss=0.198, v_num=0, train/loss_simple_step=0.730, train/loss_vlb_step=0.0142, train/loss_step=0.730, global_step=885.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▍    | 3259/5971 [29:40<24:41,  1.83it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00784, train/loss_vlb_step=3.87e-5, train/loss_step=0.00784, global_step=885.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▍    | 3260/5971 [29:42<24:42,  1.83it/s, loss=0.196, v_num=0, train/loss_simple_step=0.169, train/loss_vlb_step=0.000588, train/loss_step=0.169, global_step=885.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  55%|█████▍    | 3261/5971 [29:43<24:41,  1.83it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0622, train/loss_vlb_step=0.00021, train/loss_step=0.0622, global_step=886.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▍    | 3262/5971 [29:44<24:41,  1.83it/s, loss=0.219, v_num=0, train/loss_simple_step=0.485, train/loss_vlb_step=0.0042, train/loss_step=0.485, global_step=886.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  55%|█████▍    | 3263/5971 [29:45<24:41,  1.83it/s, loss=0.219, v_num=0, train/loss_simple_step=0.485, train/loss_vlb_step=0.0042, train/loss_step=0.485, global_step=886.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▍    | 3263/5971 [29:45<24:41,  1.83it/s, loss=0.201, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000692, train/loss_step=0.153, global_step=886.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▍    | 3264/5971 [29:47<24:42,  1.83it/s, loss=0.201, v_num=0, train/loss_simple_step=0.00366, train/loss_vlb_step=1.94e-5, train/loss_step=0.00366, global_step=886.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▍    | 3265/5971 [29:48<24:41,  1.83it/s, loss=0.214, v_num=0, train/loss_simple_step=0.432, train/loss_vlb_step=0.00233, train/loss_step=0.432, global_step=887.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  55%|█████▍    | 3266/5971 [29:49<24:41,  1.83it/s, loss=0.191, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.000969, train/loss_step=0.262, global_step=887.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▍    | 3267/5971 [29:50<24:41,  1.83it/s, loss=0.191, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.000969, train/loss_step=0.262, global_step=887.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▍    | 3267/5971 [29:50<24:41,  1.83it/s, loss=0.195, v_num=0, train/loss_simple_step=0.512, train/loss_vlb_step=0.00454, train/loss_step=0.512, global_step=887.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  55%|█████▍    | 3268/5971 [29:52<24:42,  1.82it/s, loss=0.205, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000683, train/loss_step=0.203, global_step=887.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▍    | 3269/5971 [29:53<24:41,  1.82it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0578, train/loss_vlb_step=0.000205, train/loss_step=0.0578, global_step=888.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▍    | 3270/5971 [29:54<24:41,  1.82it/s, loss=0.203, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000424, train/loss_step=0.129, global_step=888.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  55%|█████▍    | 3271/5971 [29:55<24:41,  1.82it/s, loss=0.203, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000424, train/loss_step=0.129, global_step=888.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▍    | 3271/5971 [29:55<24:41,  1.82it/s, loss=0.203, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00108, train/loss_step=0.267, global_step=888.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  55%|█████▍    | 3272/5971 [29:57<24:42,  1.82it/s, loss=0.197, v_num=0, train/loss_simple_step=0.00222, train/loss_vlb_step=1.29e-5, train/loss_step=0.00222, global_step=888.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▍    | 3273/5971 [29:58<24:41,  1.82it/s, loss=0.198, v_num=0, train/loss_simple_step=0.00651, train/loss_vlb_step=3.2e-5, train/loss_step=0.00651, global_step=889.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  55%|█████▍    | 3274/5971 [29:59<24:41,  1.82it/s, loss=0.207, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.0011, train/loss_step=0.287, global_step=889.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  55%|█████▍    | 3275/5971 [30:00<24:41,  1.82it/s, loss=0.207, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.0011, train/loss_step=0.287, global_step=889.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▍    | 3275/5971 [30:00<24:41,  1.82it/s, loss=0.197, v_num=0, train/loss_simple_step=0.00699, train/loss_vlb_step=3.44e-5, train/loss_step=0.00699, global_step=889.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▍    | 3276/5971 [30:02<24:42,  1.82it/s, loss=0.198, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000371, train/loss_step=0.113, global_step=889.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  55%|█████▍    | 3277/5971 [30:03<24:41,  1.82it/s, loss=0.207, v_num=0, train/loss_simple_step=0.246, train/loss_vlb_step=0.000866, train/loss_step=0.246, global_step=890.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▍    | 3278/5971 [30:03<24:41,  1.82it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0475, train/loss_vlb_step=0.00017, train/loss_step=0.0475, global_step=890.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▍    | 3279/5971 [30:04<24:41,  1.82it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0475, train/loss_vlb_step=0.00017, train/loss_step=0.0475, global_step=890.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▍    | 3279/5971 [30:04<24:41,  1.82it/s, loss=0.181, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000716, train/loss_step=0.185, global_step=890.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  55%|█████▍    | 3280/5971 [30:07<24:42,  1.82it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00246, train/loss_vlb_step=1.37e-5, train/loss_step=0.00246, global_step=890.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▍    | 3281/5971 [30:07<24:41,  1.82it/s, loss=0.197, v_num=0, train/loss_simple_step=0.536, train/loss_vlb_step=0.0034, train/loss_step=0.536, global_step=891.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  55%|█████▍    | 3282/5971 [30:08<24:41,  1.81it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00223, train/loss_vlb_step=1.29e-5, train/loss_step=0.00223, global_step=891.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▍    | 3283/5971 [30:09<24:41,  1.81it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00223, train/loss_vlb_step=1.29e-5, train/loss_step=0.00223, global_step=891.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▍    | 3283/5971 [30:09<24:41,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.61e-5, train/loss_step=0.00281, global_step=891.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▍    | 3284/5971 [30:11<24:42,  1.81it/s, loss=0.177, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.000917, train/loss_step=0.247, global_step=891.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  55%|█████▌    | 3285/5971 [30:12<24:41,  1.81it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0159, train/loss_vlb_step=6.66e-5, train/loss_step=0.0159, global_step=892.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▌    | 3286/5971 [30:13<24:41,  1.81it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=4.93e-5, train/loss_step=0.0109, global_step=892.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▌    | 3287/5971 [30:14<24:41,  1.81it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=4.93e-5, train/loss_step=0.0109, global_step=892.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▌    | 3287/5971 [30:14<24:41,  1.81it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000121, train/loss_step=0.0318, global_step=892.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▌    | 3288/5971 [30:16<24:42,  1.81it/s, loss=0.122, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000946, train/loss_step=0.233, global_step=892.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  55%|█████▌    | 3289/5971 [30:17<24:41,  1.81it/s, loss=0.125, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000393, train/loss_step=0.119, global_step=893.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▌    | 3290/5971 [30:18<24:41,  1.81it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=5.77e-5, train/loss_step=0.0136, global_step=893.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▌    | 3291/5971 [30:19<24:41,  1.81it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=5.77e-5, train/loss_step=0.0136, global_step=893.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▌    | 3291/5971 [30:19<24:41,  1.81it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0257, train/loss_vlb_step=0.000106, train/loss_step=0.0257, global_step=893.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▌    | 3292/5971 [30:21<24:42,  1.81it/s, loss=0.107, v_num=0, train/loss_simple_step=0.00207, train/loss_vlb_step=1.25e-5, train/loss_step=0.00207, global_step=893.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▌    | 3293/5971 [30:22<24:41,  1.81it/s, loss=0.112, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000375, train/loss_step=0.114, global_step=894.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  55%|█████▌    | 3294/5971 [30:23<24:41,  1.81it/s, loss=0.112, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.00153, train/loss_step=0.289, global_step=894.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  55%|█████▌    | 3295/5971 [30:24<24:41,  1.81it/s, loss=0.112, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.00153, train/loss_step=0.289, global_step=894.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▌    | 3295/5971 [30:24<24:41,  1.81it/s, loss=0.149, v_num=0, train/loss_simple_step=0.740, train/loss_vlb_step=0.0259, train/loss_step=0.740, global_step=894.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  55%|█████▌    | 3296/5971 [30:26<24:41,  1.81it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0869, train/loss_vlb_step=0.000286, train/loss_step=0.0869, global_step=894.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▌    | 3297/5971 [30:27<24:41,  1.80it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00258, train/loss_vlb_step=1.42e-5, train/loss_step=0.00258, global_step=895.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▌    | 3298/5971 [30:28<24:41,  1.80it/s, loss=0.147, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.00217, train/loss_step=0.289, global_step=895.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  55%|█████▌    | 3299/5971 [30:29<24:40,  1.80it/s, loss=0.147, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.00217, train/loss_step=0.289, global_step=895.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▌    | 3299/5971 [30:29<24:40,  1.80it/s, loss=0.145, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000455, train/loss_step=0.128, global_step=895.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▌    | 3300/5971 [30:31<24:41,  1.80it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00172, train/loss_vlb_step=1.04e-5, train/loss_step=0.00172, global_step=895.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▌    | 3301/5971 [30:32<24:41,  1.80it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0636, train/loss_vlb_step=0.000218, train/loss_step=0.0636, global_step=896.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  55%|█████▌    | 3302/5971 [30:33<24:41,  1.80it/s, loss=0.142, v_num=0, train/loss_simple_step=0.423, train/loss_vlb_step=0.00657, train/loss_step=0.423, global_step=896.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  55%|█████▌    | 3303/5971 [30:33<24:40,  1.80it/s, loss=0.142, v_num=0, train/loss_simple_step=0.423, train/loss_vlb_step=0.00657, train/loss_step=0.423, global_step=896.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▌    | 3303/5971 [30:33<24:40,  1.80it/s, loss=0.151, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.000668, train/loss_step=0.187, global_step=896.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▌    | 3304/5971 [30:35<24:41,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00235, train/loss_vlb_step=1.37e-5, train/loss_step=0.00235, global_step=896.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▌    | 3305/5971 [30:36<24:41,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0103, train/loss_vlb_step=4.49e-5, train/loss_step=0.0103, global_step=897.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  55%|█████▌    | 3306/5971 [30:37<24:40,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0207, train/loss_vlb_step=8.2e-5, train/loss_step=0.0207, global_step=897.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  55%|█████▌    | 3307/5971 [30:38<24:40,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0207, train/loss_vlb_step=8.2e-5, train/loss_step=0.0207, global_step=897.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▌    | 3307/5971 [30:38<24:40,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.038, train/loss_vlb_step=0.000131, train/loss_step=0.038, global_step=897.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▌    | 3308/5971 [30:40<24:41,  1.80it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00717, train/loss_vlb_step=3.43e-5, train/loss_step=0.00717, global_step=897.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▌    | 3309/5971 [30:41<24:41,  1.80it/s, loss=0.143, v_num=0, train/loss_simple_step=0.423, train/loss_vlb_step=0.0022, train/loss_step=0.423, global_step=898.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  55%|█████▌    | 3310/5971 [30:42<24:40,  1.80it/s, loss=0.152, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000628, train/loss_step=0.189, global_step=898.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▌    | 3311/5971 [30:43<24:40,  1.80it/s, loss=0.152, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000628, train/loss_step=0.189, global_step=898.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▌    | 3311/5971 [30:43<24:40,  1.80it/s, loss=0.158, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000522, train/loss_step=0.152, global_step=898.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▌    | 3312/5971 [30:45<24:41,  1.79it/s, loss=0.168, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000756, train/loss_step=0.201, global_step=898.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  55%|█████▌    | 3313/5971 [30:46<24:41,  1.79it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00506, train/loss_vlb_step=2.57e-5, train/loss_step=0.00506, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  56%|█████▌    | 3314/5971 [30:47<24:40,  1.79it/s, loss=0.163, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00108, train/loss_step=0.292, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  56%|█████▌    | 3315/5971 [30:48<24:40,  1.79it/s, loss=0.163, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00108, train/loss_step=0.292, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  56%|█████▌    | 3315/5971 [30:48<24:40,  1.79it/s, loss=0.127, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.82e-5, train/loss_step=0.018, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  56%|█████▌    | 3316/5971 [30:50<24:41,  1.79it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:15,  2.20it/s]
Epoch 1:  56%|█████▌    | 3319/5971 [30:51<24:38,  1.79it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   2%|▏         | 3/167 [00:00<00:28,  5.85it/s]

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.12it/s]
Epoch 1:  56%|█████▌    | 3323/5971 [30:51<24:34,  1.80it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.47it/s]
Epoch 1:  56%|█████▌    | 3327/5971 [30:51<24:31,  1.80it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   7%|▋         | 11/167 [00:00<00:09, 17.04it/s]

Validating:   8%|▊         | 14/167 [00:01<00:07, 20.12it/s]
Epoch 1:  56%|█████▌    | 3331/5971 [30:51<24:27,  1.80it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  10%|█         | 17/167 [00:01<00:06, 22.70it/s]
Epoch 1:  56%|█████▌    | 3335/5971 [30:51<24:23,  1.80it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  13%|█▎        | 21/167 [00:01<00:05, 25.20it/s]
Epoch 1:  56%|█████▌    | 3339/5971 [30:52<24:19,  1.80it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  14%|█▍        | 24/167 [00:01<00:05, 25.65it/s]
Epoch 1:  56%|█████▌    | 3343/5971 [30:52<24:15,  1.81it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 25.98it/s][A

Validating:  18%|█▊        | 30/167 [00:01<00:05, 27.02it/s][A
Epoch 1:  56%|█████▌    | 3347/5971 [30:52<24:11,  1.81it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 25.17it/s][A
Epoch 1:  56%|█████▌    | 3351/5971 [30:52<24:07,  1.81it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  22%|██▏       | 36/167 [00:01<00:05, 25.43it/s][A
Epoch 1:  56%|█████▌    | 3355/5971 [30:52<24:04,  1.81it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  23%|██▎       | 39/167 [00:01<00:05, 25.49it/s][A

Validating:  25%|██▌       | 42/167 [00:02<00:04, 26.01it/s][A
Epoch 1:  56%|█████▋    | 3359/5971 [30:52<24:00,  1.81it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 25.68it/s][A
Epoch 1:  56%|█████▋    | 3363/5971 [30:53<23:56,  1.82it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 25.28it/s][A
Epoch 1:  56%|█████▋    | 3367/5971 [30:53<23:52,  1.82it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  31%|███       | 51/167 [00:02<00:04, 25.51it/s][A

Validating:  32%|███▏      | 54/167 [00:02<00:04, 25.51it/s][A
Epoch 1:  56%|█████▋    | 3371/5971 [30:53<23:49,  1.82it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 26.70it/s][A
Epoch 1:  57%|█████▋    | 3375/5971 [30:53<23:45,  1.82it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  36%|███▌      | 60/167 [00:02<00:03, 26.81it/s][A
Epoch 1:  57%|█████▋    | 3379/5971 [30:53<23:41,  1.82it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  38%|███▊      | 63/167 [00:02<00:03, 26.19it/s][A

Validating:  40%|███▉      | 66/167 [00:02<00:03, 26.94it/s][A
Epoch 1:  57%|█████▋    | 3383/5971 [30:53<23:37,  1.83it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  41%|████▏     | 69/167 [00:03<00:04, 24.44it/s][A
Epoch 1:  57%|█████▋    | 3387/5971 [30:53<23:33,  1.83it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 25.15it/s][A
Epoch 1:  57%|█████▋    | 3391/5971 [30:54<23:30,  1.83it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 25.32it/s][A

Validating:  47%|████▋     | 78/167 [00:03<00:03, 25.67it/s][A
Epoch 1:  57%|█████▋    | 3395/5971 [30:54<23:26,  1.83it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 26.06it/s][A
Epoch 1:  57%|█████▋    | 3399/5971 [30:54<23:22,  1.83it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  50%|█████     | 84/167 [00:03<00:03, 24.59it/s][A
Epoch 1:  57%|█████▋    | 3403/5971 [30:54<23:19,  1.84it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  52%|█████▏    | 87/167 [00:03<00:03, 24.89it/s][A

Validating:  54%|█████▍    | 90/167 [00:03<00:02, 26.15it/s][A
Epoch 1:  57%|█████▋    | 3407/5971 [30:54<23:15,  1.84it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 25.74it/s][A
Epoch 1:  57%|█████▋    | 3411/5971 [30:54<23:11,  1.84it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 25.15it/s][A
Epoch 1:  57%|█████▋    | 3415/5971 [30:55<23:08,  1.84it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 25.60it/s][A

Validating:  61%|██████    | 102/167 [00:04<00:02, 25.81it/s][A
Epoch 1:  57%|█████▋    | 3419/5971 [30:55<23:04,  1.84it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 25.42it/s][A
Epoch 1:  57%|█████▋    | 3423/5971 [30:55<23:00,  1.85it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 24.03it/s][A
Epoch 1:  57%|█████▋    | 3427/5971 [30:55<22:57,  1.85it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 24.73it/s][A
Epoch 1:  57%|█████▋    | 3431/5971 [30:55<22:53,  1.85it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  69%|██████▉   | 115/167 [00:04<00:01, 26.32it/s][A

Validating:  71%|███████   | 118/167 [00:05<00:01, 27.14it/s][A
Epoch 1:  58%|█████▊    | 3435/5971 [30:55<22:49,  1.85it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 26.29it/s][A
Epoch 1:  58%|█████▊    | 3439/5971 [30:55<22:46,  1.85it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 26.22it/s][A
Epoch 1:  58%|█████▊    | 3443/5971 [30:56<22:42,  1.86it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 25.61it/s][A

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 26.62it/s][A
Epoch 1:  58%|█████▊    | 3447/5971 [30:56<22:38,  1.86it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 27.32it/s][A
Epoch 1:  58%|█████▊    | 3451/5971 [30:56<22:35,  1.86it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 27.64it/s][A
Epoch 1:  58%|█████▊    | 3455/5971 [30:56<22:31,  1.86it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  84%|████████▍ | 140/167 [00:05<00:00, 27.51it/s][A
Epoch 1:  58%|█████▊    | 3459/5971 [30:56<22:27,  1.86it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  86%|████████▌ | 143/167 [00:05<00:00, 27.43it/s][A

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 28.06it/s][A
Epoch 1:  58%|█████▊    | 3463/5971 [30:56<22:24,  1.87it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 28.14it/s][A
Epoch 1:  58%|█████▊    | 3467/5971 [30:56<22:20,  1.87it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 27.75it/s][A
Epoch 1:  58%|█████▊    | 3471/5971 [30:57<22:17,  1.87it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 27.10it/s][A

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 25.86it/s][A
Epoch 1:  58%|█████▊    | 3475/5971 [30:57<22:13,  1.87it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 24.63it/s][A
Epoch 1:  58%|█████▊    | 3479/5971 [30:57<22:10,  1.87it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 25.69it/s][A
Epoch 1:  58%|█████▊    | 3483/5971 [30:57<22:06,  1.88it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating: 100%|██████████| 167/167 [00:06<00:00, 25.23it/s][A
Epoch 1:  58%|█████▊    | 3484/5971 [30:58<22:05,  1.88it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=899.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:35,  1.37it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.43it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.18it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.77it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.18it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.55it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.79it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.94it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.07it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.14it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.19it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.26it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.30it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.32it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.36it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.41it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.28it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:06,  5.31it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.35it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.39it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.43it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.47it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.44it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.46it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.34it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.30it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.28it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.21it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.32it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.31it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.25it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.17it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.24it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:03,  5.31it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.34it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.35it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.38it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.45it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.33it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.22it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.20it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.30it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.35it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.24it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:00,  5.17it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.19it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.20it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.22it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.16it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.26it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.02it/s]

Epoch 1:  58%|█████▊    | 3485/5971 [31:10<22:13,  1.86it/s, loss=0.135, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000588, train/loss_step=0.175, global_step=900.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.35it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.40it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.18it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.83it/s][A
Epoch 1:  58%|█████▊    | 3485/5971 [31:13<22:15,  1.86it/s, loss=0.135, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000588, train/loss_step=0.175, global_step=900.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.21it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.54it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.77it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.99it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.17it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.30it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.39it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.48it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.53it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.35it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.35it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.37it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.39it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.47it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.49it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.51it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.42it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.47it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.36it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.35it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.40it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.42it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.46it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.27it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:04,  5.19it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.16it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.15it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.14it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.12it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:03,  5.10it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:03,  4.99it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.03it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.09it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.14it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.21it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.23it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.12it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.18it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.24it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.22it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:00,  5.24it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.33it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.33it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.40it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.40it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.43it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.02it/s]

Epoch 1:  58%|█████▊    | 3486/5971 [31:22<22:21,  1.85it/s, loss=0.135, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000588, train/loss_step=0.175, global_step=900.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  58%|█████▊    | 3486/5971 [31:22<22:21,  1.85it/s, loss=0.133, v_num=0, train/loss_simple_step=0.241, train/loss_vlb_step=0.000876, train/loss_step=0.241, global_step=900.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:26,  1.84it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:15,  3.05it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:00<00:12,  3.79it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:13,  3.31it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:11,  3.87it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:10,  4.29it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.63it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  4.89it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.09it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.25it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.37it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.42it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.34it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.34it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.37it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.39it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.40it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.39it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.23it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.20it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.15it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.18it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.12it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:05,  5.09it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.13it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.23it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.32it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.29it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.32it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.35it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.41it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.45it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.49it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.50it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.53it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.46it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.45it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.46it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.46it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.44it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.47it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.51it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.56it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.58it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.53it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.55it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.57it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.58it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.59it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.48it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.13it/s]

Epoch 1:  58%|█████▊    | 3487/5971 [31:35<22:29,  1.84it/s, loss=0.133, v_num=0, train/loss_simple_step=0.241, train/loss_vlb_step=0.000876, train/loss_step=0.241, global_step=900.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  58%|█████▊    | 3487/5971 [31:35<22:29,  1.84it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0103, train/loss_vlb_step=4.64e-5, train/loss_step=0.0103, global_step=900.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.31it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.32it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:15,  3.10it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.65it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:11,  4.03it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:10,  4.32it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.51it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:09,  4.56it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  4.68it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:08,  4.84it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  4.94it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.01it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:07,  5.05it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:07,  5.07it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.03it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.04it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.04it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:04<00:06,  5.12it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.17it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.21it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.06it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.13it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:05<00:05,  5.17it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:05,  5.18it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.14it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.08it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.06it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:06<00:04,  4.99it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:06<00:04,  4.95it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:04,  4.92it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.00it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.02it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:07<00:03,  5.11it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:07<00:03,  5.22it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.33it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.40it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.48it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.53it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:08<00:01,  5.57it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.60it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.50it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.38it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.32it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:09<00:01,  5.27it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:00,  5.16it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.15it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.27it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.37it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:10<00:00,  5.43it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  5.50it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  4.90it/s]

Epoch 1:  58%|█████▊    | 3488/5971 [31:49<22:38,  1.83it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0103, train/loss_vlb_step=4.64e-5, train/loss_step=0.0103, global_step=900.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  58%|█████▊    | 3488/5971 [31:49<22:38,  1.83it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0185, train/loss_vlb_step=8.08e-5, train/loss_step=0.0185, global_step=900.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  58%|█████▊    | 3489/5971 [31:50<22:38,  1.83it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0185, train/loss_vlb_step=8.08e-5, train/loss_step=0.0185, global_step=900.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  58%|█████▊    | 3489/5971 [31:50<22:38,  1.83it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00181, train/loss_vlb_step=1.08e-5, train/loss_step=0.00181, global_step=901.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  58%|█████▊    | 3490/5971 [31:51<22:38,  1.83it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00181, train/loss_vlb_step=1.08e-5, train/loss_step=0.00181, global_step=901.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  58%|█████▊    | 3490/5971 [31:51<22:38,  1.83it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0293, train/loss_vlb_step=0.000113, train/loss_step=0.0293, global_step=901.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  58%|█████▊    | 3491/5971 [31:51<22:37,  1.83it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0293, train/loss_vlb_step=0.000113, train/loss_step=0.0293, global_step=901.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  58%|█████▊    | 3491/5971 [31:51<22:37,  1.83it/s, loss=0.0958, v_num=0, train/loss_simple_step=0.0043, train/loss_vlb_step=2.24e-5, train/loss_step=0.0043, global_step=901.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  58%|█████▊    | 3492/5971 [31:54<22:38,  1.82it/s, loss=0.0958, v_num=0, train/loss_simple_step=0.0043, train/loss_vlb_step=2.24e-5, train/loss_step=0.0043, global_step=901.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  58%|█████▊    | 3492/5971 [31:54<22:38,  1.82it/s, loss=0.097, v_num=0, train/loss_simple_step=0.0252, train/loss_vlb_step=0.000101, train/loss_step=0.0252, global_step=901.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  58%|█████▊    | 3493/5971 [31:54<22:38,  1.82it/s, loss=0.097, v_num=0, train/loss_simple_step=0.0252, train/loss_vlb_step=0.000101, train/loss_step=0.0252, global_step=901.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  58%|█████▊    | 3493/5971 [31:54<22:38,  1.82it/s, loss=0.0977, v_num=0, train/loss_simple_step=0.025, train/loss_vlb_step=9.53e-5, train/loss_step=0.025, global_step=902.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  59%|█████▊    | 3494/5971 [31:55<22:37,  1.82it/s, loss=0.0977, v_num=0, train/loss_simple_step=0.025, train/loss_vlb_step=9.53e-5, train/loss_step=0.025, global_step=902.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▊    | 3494/5971 [31:55<22:37,  1.82it/s, loss=0.0975, v_num=0, train/loss_simple_step=0.0176, train/loss_vlb_step=7.6e-5, train/loss_step=0.0176, global_step=902.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▊    | 3495/5971 [31:56<22:37,  1.82it/s, loss=0.0975, v_num=0, train/loss_simple_step=0.0176, train/loss_vlb_step=7.6e-5, train/loss_step=0.0176, global_step=902.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▊    | 3495/5971 [31:56<22:37,  1.82it/s, loss=0.0958, v_num=0, train/loss_simple_step=0.00331, train/loss_vlb_step=1.82e-5, train/loss_step=0.00331, global_step=902.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▊    | 3496/5971 [31:58<22:38,  1.82it/s, loss=0.0958, v_num=0, train/loss_simple_step=0.00331, train/loss_vlb_step=1.82e-5, train/loss_step=0.00331, global_step=902.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▊    | 3496/5971 [31:58<22:38,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00216, train/loss_step=0.349, global_step=902.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  59%|█████▊    | 3497/5971 [31:59<22:37,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00216, train/loss_step=0.349, global_step=902.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▊    | 3497/5971 [31:59<22:37,  1.82it/s, loss=0.0923, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.41e-5, train/loss_step=0.0104, global_step=903.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▊    | 3498/5971 [32:00<22:37,  1.82it/s, loss=0.0923, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.41e-5, train/loss_step=0.0104, global_step=903.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▊    | 3498/5971 [32:00<22:37,  1.82it/s, loss=0.0907, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000543, train/loss_step=0.157, global_step=903.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  59%|█████▊    | 3499/5971 [32:01<22:37,  1.82it/s, loss=0.0907, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000543, train/loss_step=0.157, global_step=903.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▊    | 3499/5971 [32:01<22:37,  1.82it/s, loss=0.0883, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000339, train/loss_step=0.103, global_step=903.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▊    | 3500/5971 [32:03<22:37,  1.82it/s, loss=0.0883, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000339, train/loss_step=0.103, global_step=903.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▊    | 3500/5971 [32:03<22:37,  1.82it/s, loss=0.0858, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000505, train/loss_step=0.152, global_step=903.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▊    | 3501/5971 [32:04<22:37,  1.82it/s, loss=0.0858, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000505, train/loss_step=0.152, global_step=903.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▊    | 3501/5971 [32:04<22:37,  1.82it/s, loss=0.0965, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000821, train/loss_step=0.218, global_step=904.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▊    | 3502/5971 [32:05<22:37,  1.82it/s, loss=0.0965, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000821, train/loss_step=0.218, global_step=904.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▊    | 3502/5971 [32:05<22:37,  1.82it/s, loss=0.0837, v_num=0, train/loss_simple_step=0.0374, train/loss_vlb_step=0.00014, train/loss_step=0.0374, global_step=904.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▊    | 3503/5971 [32:06<22:36,  1.82it/s, loss=0.0837, v_num=0, train/loss_simple_step=0.0374, train/loss_vlb_step=0.00014, train/loss_step=0.0374, global_step=904.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▊    | 3503/5971 [32:06<22:36,  1.82it/s, loss=0.0831, v_num=0, train/loss_simple_step=0.00515, train/loss_vlb_step=2.56e-5, train/loss_step=0.00515, global_step=904.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▊    | 3504/5971 [32:08<22:37,  1.82it/s, loss=0.0831, v_num=0, train/loss_simple_step=0.00515, train/loss_vlb_step=2.56e-5, train/loss_step=0.00515, global_step=904.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▊    | 3504/5971 [32:08<22:37,  1.82it/s, loss=0.0834, v_num=0, train/loss_simple_step=0.084, train/loss_vlb_step=0.000276, train/loss_step=0.084, global_step=904.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  59%|█████▊    | 3505/5971 [32:09<22:37,  1.82it/s, loss=0.0834, v_num=0, train/loss_simple_step=0.084, train/loss_vlb_step=0.000276, train/loss_step=0.084, global_step=904.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▊    | 3505/5971 [32:09<22:37,  1.82it/s, loss=0.0936, v_num=0, train/loss_simple_step=0.380, train/loss_vlb_step=0.00231, train/loss_step=0.380, global_step=905.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  59%|█████▊    | 3506/5971 [32:10<22:36,  1.82it/s, loss=0.0936, v_num=0, train/loss_simple_step=0.380, train/loss_vlb_step=0.00231, train/loss_step=0.380, global_step=905.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▊    | 3506/5971 [32:10<22:36,  1.82it/s, loss=0.0951, v_num=0, train/loss_simple_step=0.270, train/loss_vlb_step=0.00107, train/loss_step=0.270, global_step=905.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▊    | 3507/5971 [32:11<22:36,  1.82it/s, loss=0.0951, v_num=0, train/loss_simple_step=0.270, train/loss_vlb_step=0.00107, train/loss_step=0.270, global_step=905.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▊    | 3507/5971 [32:11<22:36,  1.82it/s, loss=0.112, v_num=0, train/loss_simple_step=0.344, train/loss_vlb_step=0.00182, train/loss_step=0.344, global_step=905.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  59%|█████▉    | 3508/5971 [32:13<22:37,  1.81it/s, loss=0.112, v_num=0, train/loss_simple_step=0.344, train/loss_vlb_step=0.00182, train/loss_step=0.344, global_step=905.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3508/5971 [32:13<22:37,  1.81it/s, loss=0.147, v_num=0, train/loss_simple_step=0.728, train/loss_vlb_step=0.024, train/loss_step=0.728, global_step=905.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  59%|█████▉    | 3509/5971 [32:14<22:36,  1.81it/s, loss=0.147, v_num=0, train/loss_simple_step=0.728, train/loss_vlb_step=0.024, train/loss_step=0.728, global_step=905.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3509/5971 [32:14<22:36,  1.81it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0476, train/loss_vlb_step=0.000171, train/loss_step=0.0476, global_step=906.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3510/5971 [32:15<22:36,  1.81it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0476, train/loss_vlb_step=0.000171, train/loss_step=0.0476, global_step=906.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3510/5971 [32:15<22:36,  1.81it/s, loss=0.162, v_num=0, train/loss_simple_step=0.284, train/loss_vlb_step=0.00167, train/loss_step=0.284, global_step=906.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  59%|█████▉    | 3511/5971 [32:16<22:36,  1.81it/s, loss=0.162, v_num=0, train/loss_simple_step=0.284, train/loss_vlb_step=0.00167, train/loss_step=0.284, global_step=906.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3511/5971 [32:16<22:36,  1.81it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0944, train/loss_vlb_step=0.000313, train/loss_step=0.0944, global_step=906.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3512/5971 [32:18<22:36,  1.81it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0944, train/loss_vlb_step=0.000313, train/loss_step=0.0944, global_step=906.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3512/5971 [32:18<22:36,  1.81it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00503, train/loss_vlb_step=2.45e-5, train/loss_step=0.00503, global_step=906.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3513/5971 [32:19<22:36,  1.81it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00503, train/loss_vlb_step=2.45e-5, train/loss_step=0.00503, global_step=906.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3513/5971 [32:19<22:36,  1.81it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000152, train/loss_step=0.0406, global_step=907.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  59%|█████▉    | 3514/5971 [32:20<22:36,  1.81it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000152, train/loss_step=0.0406, global_step=907.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3514/5971 [32:20<22:36,  1.81it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.00012, train/loss_step=0.0318, global_step=907.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  59%|█████▉    | 3515/5971 [32:20<22:35,  1.81it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.00012, train/loss_step=0.0318, global_step=907.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3515/5971 [32:20<22:35,  1.81it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=5.1e-5, train/loss_step=0.0111, global_step=907.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  59%|█████▉    | 3516/5971 [32:23<22:36,  1.81it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=5.1e-5, train/loss_step=0.0111, global_step=907.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3516/5971 [32:23<22:36,  1.81it/s, loss=0.159, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000683, train/loss_step=0.184, global_step=907.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3517/5971 [32:24<22:36,  1.81it/s, loss=0.159, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000683, train/loss_step=0.184, global_step=907.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3517/5971 [32:24<22:36,  1.81it/s, loss=0.17, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.000824, train/loss_step=0.220, global_step=908.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  59%|█████▉    | 3518/5971 [32:25<22:36,  1.81it/s, loss=0.17, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.000824, train/loss_step=0.220, global_step=908.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3518/5971 [32:25<22:36,  1.81it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0234, train/loss_vlb_step=9.46e-5, train/loss_step=0.0234, global_step=908.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3519/5971 [32:26<22:35,  1.81it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0234, train/loss_vlb_step=9.46e-5, train/loss_step=0.0234, global_step=908.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3519/5971 [32:26<22:35,  1.81it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0496, train/loss_vlb_step=0.000173, train/loss_step=0.0496, global_step=908.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3520/5971 [32:28<22:36,  1.81it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0496, train/loss_vlb_step=0.000173, train/loss_step=0.0496, global_step=908.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3520/5971 [32:28<22:36,  1.81it/s, loss=0.173, v_num=0, train/loss_simple_step=0.400, train/loss_vlb_step=0.00318, train/loss_step=0.400, global_step=908.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  59%|█████▉    | 3521/5971 [32:29<22:36,  1.81it/s, loss=0.173, v_num=0, train/loss_simple_step=0.400, train/loss_vlb_step=0.00318, train/loss_step=0.400, global_step=908.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3521/5971 [32:29<22:36,  1.81it/s, loss=0.203, v_num=0, train/loss_simple_step=0.811, train/loss_vlb_step=0.0522, train/loss_step=0.811, global_step=909.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  59%|█████▉    | 3522/5971 [32:30<22:35,  1.81it/s, loss=0.203, v_num=0, train/loss_simple_step=0.811, train/loss_vlb_step=0.0522, train/loss_step=0.811, global_step=909.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3522/5971 [32:30<22:35,  1.81it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0695, train/loss_vlb_step=0.000236, train/loss_step=0.0695, global_step=909.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3523/5971 [32:31<22:35,  1.81it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0695, train/loss_vlb_step=0.000236, train/loss_step=0.0695, global_step=909.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3523/5971 [32:31<22:35,  1.81it/s, loss=0.21, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000382, train/loss_step=0.116, global_step=909.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  59%|█████▉    | 3524/5971 [32:33<22:35,  1.80it/s, loss=0.21, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000382, train/loss_step=0.116, global_step=909.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3524/5971 [32:33<22:35,  1.80it/s, loss=0.222, v_num=0, train/loss_simple_step=0.326, train/loss_vlb_step=0.00163, train/loss_step=0.326, global_step=909.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3525/5971 [32:34<22:35,  1.80it/s, loss=0.222, v_num=0, train/loss_simple_step=0.326, train/loss_vlb_step=0.00163, train/loss_step=0.326, global_step=909.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3525/5971 [32:34<22:35,  1.80it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0183, train/loss_vlb_step=7.4e-5, train/loss_step=0.0183, global_step=910.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3526/5971 [32:34<22:35,  1.80it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0183, train/loss_vlb_step=7.4e-5, train/loss_step=0.0183, global_step=910.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3526/5971 [32:34<22:35,  1.80it/s, loss=0.201, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.000826, train/loss_step=0.220, global_step=910.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3527/5971 [32:35<22:34,  1.80it/s, loss=0.201, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.000826, train/loss_step=0.220, global_step=910.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3527/5971 [32:35<22:34,  1.80it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.26e-5, train/loss_step=0.0112, global_step=910.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3528/5971 [32:38<22:36,  1.80it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.26e-5, train/loss_step=0.0112, global_step=910.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3528/5971 [32:38<22:36,  1.80it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00677, train/loss_vlb_step=3.26e-5, train/loss_step=0.00677, global_step=910.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3529/5971 [32:39<22:35,  1.80it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00677, train/loss_vlb_step=3.26e-5, train/loss_step=0.00677, global_step=910.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3529/5971 [32:39<22:35,  1.80it/s, loss=0.157, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000841, train/loss_step=0.216, global_step=911.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  59%|█████▉    | 3530/5971 [32:40<22:35,  1.80it/s, loss=0.157, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000841, train/loss_step=0.216, global_step=911.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3530/5971 [32:40<22:35,  1.80it/s, loss=0.166, v_num=0, train/loss_simple_step=0.468, train/loss_vlb_step=0.00406, train/loss_step=0.468, global_step=911.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  59%|█████▉    | 3531/5971 [32:41<22:35,  1.80it/s, loss=0.166, v_num=0, train/loss_simple_step=0.468, train/loss_vlb_step=0.00406, train/loss_step=0.468, global_step=911.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3531/5971 [32:41<22:35,  1.80it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0302, train/loss_vlb_step=0.000118, train/loss_step=0.0302, global_step=911.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3532/5971 [32:43<22:35,  1.80it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0302, train/loss_vlb_step=0.000118, train/loss_step=0.0302, global_step=911.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3532/5971 [32:43<22:35,  1.80it/s, loss=0.173, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000722, train/loss_step=0.202, global_step=911.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  59%|█████▉    | 3533/5971 [32:44<22:35,  1.80it/s, loss=0.173, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000722, train/loss_step=0.202, global_step=911.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3533/5971 [32:44<22:35,  1.80it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000445, train/loss_step=0.130, global_step=912.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3534/5971 [32:45<22:34,  1.80it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000445, train/loss_step=0.130, global_step=912.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3534/5971 [32:45<22:34,  1.80it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00276, train/loss_vlb_step=1.58e-5, train/loss_step=0.00276, global_step=912.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3535/5971 [32:46<22:34,  1.80it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00276, train/loss_vlb_step=1.58e-5, train/loss_step=0.00276, global_step=912.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3535/5971 [32:46<22:34,  1.80it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0514, train/loss_vlb_step=0.000191, train/loss_step=0.0514, global_step=912.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  59%|█████▉    | 3536/5971 [32:48<22:35,  1.80it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0514, train/loss_vlb_step=0.000191, train/loss_step=0.0514, global_step=912.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3536/5971 [32:48<22:35,  1.80it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0193, train/loss_vlb_step=7.95e-5, train/loss_step=0.0193, global_step=912.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  59%|█████▉    | 3537/5971 [32:49<22:35,  1.80it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0193, train/loss_vlb_step=7.95e-5, train/loss_step=0.0193, global_step=912.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3537/5971 [32:49<22:35,  1.80it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0966, train/loss_vlb_step=0.000318, train/loss_step=0.0966, global_step=913.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3538/5971 [32:50<22:34,  1.80it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0966, train/loss_vlb_step=0.000318, train/loss_step=0.0966, global_step=913.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3538/5971 [32:50<22:34,  1.80it/s, loss=0.18, v_num=0, train/loss_simple_step=0.359, train/loss_vlb_step=0.0021, train/loss_step=0.359, global_step=913.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  59%|█████▉    | 3539/5971 [32:51<22:34,  1.80it/s, loss=0.18, v_num=0, train/loss_simple_step=0.359, train/loss_vlb_step=0.0021, train/loss_step=0.359, global_step=913.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3539/5971 [32:51<22:34,  1.80it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00247, train/loss_vlb_step=1.36e-5, train/loss_step=0.00247, global_step=913.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3540/5971 [32:54<22:35,  1.79it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00247, train/loss_vlb_step=1.36e-5, train/loss_step=0.00247, global_step=913.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3540/5971 [32:54<22:35,  1.79it/s, loss=0.174, v_num=0, train/loss_simple_step=0.328, train/loss_vlb_step=0.00171, train/loss_step=0.328, global_step=913.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  59%|█████▉    | 3541/5971 [32:54<22:34,  1.79it/s, loss=0.174, v_num=0, train/loss_simple_step=0.328, train/loss_vlb_step=0.00171, train/loss_step=0.328, global_step=913.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3541/5971 [32:54<22:34,  1.79it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0211, train/loss_vlb_step=8.63e-5, train/loss_step=0.0211, global_step=914.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3542/5971 [32:55<22:34,  1.79it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0211, train/loss_vlb_step=8.63e-5, train/loss_step=0.0211, global_step=914.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3542/5971 [32:55<22:34,  1.79it/s, loss=0.156, v_num=0, train/loss_simple_step=0.491, train/loss_vlb_step=0.00496, train/loss_step=0.491, global_step=914.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  59%|█████▉    | 3543/5971 [32:56<22:34,  1.79it/s, loss=0.156, v_num=0, train/loss_simple_step=0.491, train/loss_vlb_step=0.00496, train/loss_step=0.491, global_step=914.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3543/5971 [32:56<22:34,  1.79it/s, loss=0.152, v_num=0, train/loss_simple_step=0.035, train/loss_vlb_step=0.000123, train/loss_step=0.035, global_step=914.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3544/5971 [32:58<22:34,  1.79it/s, loss=0.152, v_num=0, train/loss_simple_step=0.035, train/loss_vlb_step=0.000123, train/loss_step=0.035, global_step=914.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3544/5971 [32:58<22:34,  1.79it/s, loss=0.144, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.00055, train/loss_step=0.162, global_step=914.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  59%|█████▉    | 3545/5971 [32:59<22:34,  1.79it/s, loss=0.144, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.00055, train/loss_step=0.162, global_step=914.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3545/5971 [32:59<22:34,  1.79it/s, loss=0.144, v_num=0, train/loss_simple_step=0.020, train/loss_vlb_step=8.02e-5, train/loss_step=0.020, global_step=915.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3546/5971 [33:00<22:34,  1.79it/s, loss=0.144, v_num=0, train/loss_simple_step=0.020, train/loss_vlb_step=8.02e-5, train/loss_step=0.020, global_step=915.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3546/5971 [33:00<22:34,  1.79it/s, loss=0.142, v_num=0, train/loss_simple_step=0.178, train/loss_vlb_step=0.000633, train/loss_step=0.178, global_step=915.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3547/5971 [33:01<22:33,  1.79it/s, loss=0.142, v_num=0, train/loss_simple_step=0.178, train/loss_vlb_step=0.000633, train/loss_step=0.178, global_step=915.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3547/5971 [33:01<22:33,  1.79it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00868, train/loss_vlb_step=4.2e-5, train/loss_step=0.00868, global_step=915.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3548/5971 [33:04<22:34,  1.79it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00868, train/loss_vlb_step=4.2e-5, train/loss_step=0.00868, global_step=915.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3548/5971 [33:04<22:34,  1.79it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00945, train/loss_vlb_step=4.46e-5, train/loss_step=0.00945, global_step=915.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3549/5971 [33:05<22:34,  1.79it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00945, train/loss_vlb_step=4.46e-5, train/loss_step=0.00945, global_step=915.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3549/5971 [33:05<22:34,  1.79it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0303, train/loss_vlb_step=0.000115, train/loss_step=0.0303, global_step=916.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  59%|█████▉    | 3550/5971 [33:06<22:34,  1.79it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0303, train/loss_vlb_step=0.000115, train/loss_step=0.0303, global_step=916.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3550/5971 [33:06<22:34,  1.79it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00543, train/loss_vlb_step=2.82e-5, train/loss_step=0.00543, global_step=916.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3551/5971 [33:07<22:33,  1.79it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00543, train/loss_vlb_step=2.82e-5, train/loss_step=0.00543, global_step=916.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3551/5971 [33:07<22:33,  1.79it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0306, train/loss_vlb_step=0.000116, train/loss_step=0.0306, global_step=916.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  59%|█████▉    | 3552/5971 [33:09<22:34,  1.79it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0306, train/loss_vlb_step=0.000116, train/loss_step=0.0306, global_step=916.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  59%|█████▉    | 3552/5971 [33:09<22:34,  1.79it/s, loss=0.111, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.000894, train/loss_step=0.236, global_step=916.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  60%|█████▉    | 3553/5971 [33:10<22:34,  1.79it/s, loss=0.111, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.000894, train/loss_step=0.236, global_step=916.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3553/5971 [33:10<22:34,  1.79it/s, loss=0.11, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.00037, train/loss_step=0.112, global_step=917.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  60%|█████▉    | 3554/5971 [33:11<22:33,  1.79it/s, loss=0.11, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.00037, train/loss_step=0.112, global_step=917.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3554/5971 [33:11<22:33,  1.79it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00967, train/loss_vlb_step=4.4e-5, train/loss_step=0.00967, global_step=917.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3555/5971 [33:11<22:33,  1.79it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00967, train/loss_vlb_step=4.4e-5, train/loss_step=0.00967, global_step=917.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3555/5971 [33:11<22:33,  1.79it/s, loss=0.115, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000474, train/loss_step=0.144, global_step=917.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  60%|█████▉    | 3556/5971 [33:14<22:34,  1.78it/s, loss=0.115, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000474, train/loss_step=0.144, global_step=917.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3556/5971 [33:14<22:34,  1.78it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0284, train/loss_vlb_step=0.000108, train/loss_step=0.0284, global_step=917.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3557/5971 [33:15<22:33,  1.78it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0284, train/loss_vlb_step=0.000108, train/loss_step=0.0284, global_step=917.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3557/5971 [33:15<22:33,  1.78it/s, loss=0.119, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000633, train/loss_step=0.174, global_step=918.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  60%|█████▉    | 3558/5971 [33:16<22:33,  1.78it/s, loss=0.119, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000633, train/loss_step=0.174, global_step=918.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3558/5971 [33:16<22:33,  1.78it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0834, train/loss_vlb_step=0.000274, train/loss_step=0.0834, global_step=918.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3559/5971 [33:17<22:33,  1.78it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0834, train/loss_vlb_step=0.000274, train/loss_step=0.0834, global_step=918.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3559/5971 [33:17<22:33,  1.78it/s, loss=0.107, v_num=0, train/loss_simple_step=0.025, train/loss_vlb_step=9.95e-5, train/loss_step=0.025, global_step=918.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  60%|█████▉    | 3560/5971 [33:19<22:33,  1.78it/s, loss=0.107, v_num=0, train/loss_simple_step=0.025, train/loss_vlb_step=9.95e-5, train/loss_step=0.025, global_step=918.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3560/5971 [33:19<22:33,  1.78it/s, loss=0.0923, v_num=0, train/loss_simple_step=0.0413, train/loss_vlb_step=0.000156, train/loss_step=0.0413, global_step=918.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3561/5971 [33:20<22:33,  1.78it/s, loss=0.0923, v_num=0, train/loss_simple_step=0.0413, train/loss_vlb_step=0.000156, train/loss_step=0.0413, global_step=918.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3561/5971 [33:20<22:33,  1.78it/s, loss=0.0956, v_num=0, train/loss_simple_step=0.087, train/loss_vlb_step=0.000287, train/loss_step=0.087, global_step=919.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  60%|█████▉    | 3562/5971 [33:21<22:32,  1.78it/s, loss=0.0956, v_num=0, train/loss_simple_step=0.087, train/loss_vlb_step=0.000287, train/loss_step=0.087, global_step=919.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3562/5971 [33:21<22:32,  1.78it/s, loss=0.0725, v_num=0, train/loss_simple_step=0.0288, train/loss_vlb_step=0.000114, train/loss_step=0.0288, global_step=919.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3563/5971 [33:21<22:32,  1.78it/s, loss=0.0725, v_num=0, train/loss_simple_step=0.0288, train/loss_vlb_step=0.000114, train/loss_step=0.0288, global_step=919.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3563/5971 [33:21<22:32,  1.78it/s, loss=0.0794, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000609, train/loss_step=0.175, global_step=919.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  60%|█████▉    | 3564/5971 [33:24<22:33,  1.78it/s, loss=0.0794, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000609, train/loss_step=0.175, global_step=919.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3564/5971 [33:24<22:33,  1.78it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.309, train/loss_vlb_step=0.00122, train/loss_step=0.309, global_step=919.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  60%|█████▉    | 3565/5971 [33:25<22:33,  1.78it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.309, train/loss_vlb_step=0.00122, train/loss_step=0.309, global_step=919.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3565/5971 [33:25<22:33,  1.78it/s, loss=0.0885, v_num=0, train/loss_simple_step=0.0536, train/loss_vlb_step=0.000186, train/loss_step=0.0536, global_step=920.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3566/5971 [33:26<22:32,  1.78it/s, loss=0.0885, v_num=0, train/loss_simple_step=0.0536, train/loss_vlb_step=0.000186, train/loss_step=0.0536, global_step=920.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3566/5971 [33:26<22:32,  1.78it/s, loss=0.0954, v_num=0, train/loss_simple_step=0.317, train/loss_vlb_step=0.00131, train/loss_step=0.317, global_step=920.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  60%|█████▉    | 3567/5971 [33:27<22:32,  1.78it/s, loss=0.0954, v_num=0, train/loss_simple_step=0.317, train/loss_vlb_step=0.00131, train/loss_step=0.317, global_step=920.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3567/5971 [33:27<22:32,  1.78it/s, loss=0.111, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.0017, train/loss_step=0.313, global_step=920.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  60%|█████▉    | 3568/5971 [33:29<22:32,  1.78it/s, loss=0.111, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.0017, train/loss_step=0.313, global_step=920.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3568/5971 [33:29<22:32,  1.78it/s, loss=0.13, v_num=0, train/loss_simple_step=0.388, train/loss_vlb_step=0.00241, train/loss_step=0.388, global_step=920.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3569/5971 [33:30<22:32,  1.78it/s, loss=0.13, v_num=0, train/loss_simple_step=0.388, train/loss_vlb_step=0.00241, train/loss_step=0.388, global_step=920.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3569/5971 [33:30<22:32,  1.78it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0368, train/loss_vlb_step=0.000129, train/loss_step=0.0368, global_step=921.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3570/5971 [33:31<22:32,  1.78it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0368, train/loss_vlb_step=0.000129, train/loss_step=0.0368, global_step=921.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3570/5971 [33:31<22:32,  1.78it/s, loss=0.143, v_num=0, train/loss_simple_step=0.265, train/loss_vlb_step=0.00113, train/loss_step=0.265, global_step=921.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  60%|█████▉    | 3571/5971 [33:32<22:31,  1.78it/s, loss=0.143, v_num=0, train/loss_simple_step=0.265, train/loss_vlb_step=0.00113, train/loss_step=0.265, global_step=921.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3571/5971 [33:32<22:31,  1.78it/s, loss=0.146, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000335, train/loss_step=0.102, global_step=921.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3572/5971 [33:34<22:32,  1.77it/s, loss=0.146, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000335, train/loss_step=0.102, global_step=921.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3572/5971 [33:34<22:32,  1.77it/s, loss=0.157, v_num=0, train/loss_simple_step=0.449, train/loss_vlb_step=0.00277, train/loss_step=0.449, global_step=921.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  60%|█████▉    | 3573/5971 [33:34<22:31,  1.77it/s, loss=0.157, v_num=0, train/loss_simple_step=0.449, train/loss_vlb_step=0.00277, train/loss_step=0.449, global_step=921.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3573/5971 [33:34<22:31,  1.77it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0542, train/loss_vlb_step=0.000197, train/loss_step=0.0542, global_step=922.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3574/5971 [33:35<22:31,  1.77it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0542, train/loss_vlb_step=0.000197, train/loss_step=0.0542, global_step=922.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3574/5971 [33:35<22:31,  1.77it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0188, train/loss_vlb_step=7.65e-5, train/loss_step=0.0188, global_step=922.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  60%|█████▉    | 3575/5971 [33:36<22:31,  1.77it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0188, train/loss_vlb_step=7.65e-5, train/loss_step=0.0188, global_step=922.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3575/5971 [33:36<22:31,  1.77it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0641, train/loss_vlb_step=0.000227, train/loss_step=0.0641, global_step=922.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3576/5971 [33:39<22:32,  1.77it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0641, train/loss_vlb_step=0.000227, train/loss_step=0.0641, global_step=922.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3576/5971 [33:39<22:32,  1.77it/s, loss=0.158, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.000612, train/loss_step=0.171, global_step=922.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  60%|█████▉    | 3577/5971 [33:40<22:31,  1.77it/s, loss=0.158, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.000612, train/loss_step=0.171, global_step=922.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3577/5971 [33:40<22:31,  1.77it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0459, train/loss_vlb_step=0.000162, train/loss_step=0.0459, global_step=923.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3578/5971 [33:41<22:31,  1.77it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0459, train/loss_vlb_step=0.000162, train/loss_step=0.0459, global_step=923.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3578/5971 [33:41<22:31,  1.77it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0234, train/loss_vlb_step=9.04e-5, train/loss_step=0.0234, global_step=923.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  60%|█████▉    | 3579/5971 [33:42<22:31,  1.77it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0234, train/loss_vlb_step=9.04e-5, train/loss_step=0.0234, global_step=923.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3579/5971 [33:42<22:31,  1.77it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000171, train/loss_step=0.0453, global_step=923.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3580/5971 [33:44<22:31,  1.77it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000171, train/loss_step=0.0453, global_step=923.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3580/5971 [33:44<22:31,  1.77it/s, loss=0.163, v_num=0, train/loss_simple_step=0.307, train/loss_vlb_step=0.00125, train/loss_step=0.307, global_step=923.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  60%|█████▉    | 3581/5971 [33:45<22:31,  1.77it/s, loss=0.163, v_num=0, train/loss_simple_step=0.307, train/loss_vlb_step=0.00125, train/loss_step=0.307, global_step=923.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3581/5971 [33:45<22:31,  1.77it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0321, train/loss_vlb_step=0.000122, train/loss_step=0.0321, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3582/5971 [33:46<22:31,  1.77it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0321, train/loss_vlb_step=0.000122, train/loss_step=0.0321, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|█████▉    | 3582/5971 [33:46<22:31,  1.77it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=7.1e-5, train/loss_step=0.0166, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  60%|██████    | 3583/5971 [33:47<22:30,  1.77it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=7.1e-5, train/loss_step=0.0166, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|██████    | 3583/5971 [33:47<22:30,  1.77it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.85e-5, train/loss_step=0.0105, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|██████    | 3584/5971 [33:49<22:31,  1.77it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.85e-5, train/loss_step=0.0105, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  60%|██████    | 3584/5971 [33:49<22:31,  1.77it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:13,  2.25it/s][A
Epoch 1:  60%|██████    | 3586/5971 [33:50<22:29,  1.77it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   1%|          | 2/167 [00:00<00:42,  3.85it/s][A
Epoch 1:  60%|██████    | 3588/5971 [33:50<22:28,  1.77it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   3%|▎         | 5/167 [00:00<00:16,  9.75it/s][A
Epoch 1:  60%|██████    | 3591/5971 [33:50<22:25,  1.77it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   5%|▍         | 8/167 [00:00<00:10, 14.63it/s][A
Epoch 1:  60%|██████    | 3594/5971 [33:50<22:22,  1.77it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   7%|▋         | 11/167 [00:00<00:08, 18.21it/s][A
Epoch 1:  60%|██████    | 3597/5971 [33:50<22:19,  1.77it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   8%|▊         | 14/167 [00:01<00:07, 20.86it/s][A
Epoch 1:  60%|██████    | 3600/5971 [33:50<22:17,  1.77it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  10%|█         | 17/167 [00:01<00:06, 21.53it/s][A
Epoch 1:  60%|██████    | 3603/5971 [33:50<22:14,  1.77it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 21.59it/s][A
Epoch 1:  60%|██████    | 3606/5971 [33:50<22:11,  1.78it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 22.80it/s][A
Epoch 1:  60%|██████    | 3609/5971 [33:51<22:08,  1.78it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  16%|█▌        | 26/167 [00:01<00:06, 22.34it/s][A
Epoch 1:  60%|██████    | 3612/5971 [33:51<22:06,  1.78it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 23.99it/s][A
Epoch 1:  61%|██████    | 3615/5971 [33:51<22:03,  1.78it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 23.68it/s][A
Epoch 1:  61%|██████    | 3618/5971 [33:51<22:00,  1.78it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  21%|██        | 35/167 [00:01<00:05, 24.12it/s][A
Epoch 1:  61%|██████    | 3621/5971 [33:51<21:58,  1.78it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  23%|██▎       | 38/167 [00:02<00:05, 23.54it/s][A
Epoch 1:  61%|██████    | 3624/5971 [33:51<21:55,  1.78it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  25%|██▍       | 41/167 [00:02<00:05, 24.45it/s][A
Epoch 1:  61%|██████    | 3627/5971 [33:51<21:52,  1.79it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 25.10it/s][A
Epoch 1:  61%|██████    | 3630/5971 [33:51<21:50,  1.79it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 24.34it/s][A
Epoch 1:  61%|██████    | 3633/5971 [33:52<21:47,  1.79it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 23.78it/s][A
Epoch 1:  61%|██████    | 3636/5971 [33:52<21:44,  1.79it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 23.69it/s][A
Epoch 1:  61%|██████    | 3639/5971 [33:52<21:42,  1.79it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 24.11it/s][A
Epoch 1:  61%|██████    | 3642/5971 [33:52<21:39,  1.79it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  35%|███▌      | 59/167 [00:02<00:04, 24.98it/s][A
Epoch 1:  61%|██████    | 3645/5971 [33:52<21:36,  1.79it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  37%|███▋      | 62/167 [00:02<00:04, 26.18it/s][A
Epoch 1:  61%|██████    | 3648/5971 [33:52<21:34,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  39%|███▉      | 65/167 [00:03<00:04, 25.25it/s][A
Epoch 1:  61%|██████    | 3651/5971 [33:52<21:31,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  41%|████      | 68/167 [00:03<00:03, 24.87it/s][A
Epoch 1:  61%|██████    | 3654/5971 [33:52<21:28,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 26.13it/s][A
Epoch 1:  61%|██████    | 3657/5971 [33:52<21:26,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 25.54it/s][A
Epoch 1:  61%|██████▏   | 3660/5971 [33:53<21:23,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 26.25it/s][A
Epoch 1:  61%|██████▏   | 3663/5971 [33:53<21:20,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 26.42it/s][A
Epoch 1:  61%|██████▏   | 3666/5971 [33:53<21:18,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 25.63it/s][A
Epoch 1:  61%|██████▏   | 3669/5971 [33:53<21:15,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  51%|█████▏    | 86/167 [00:03<00:03, 25.52it/s][A
Epoch 1:  61%|██████▏   | 3672/5971 [33:53<21:12,  1.81it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  53%|█████▎    | 89/167 [00:04<00:03, 24.64it/s][A
Epoch 1:  62%|██████▏   | 3675/5971 [33:53<21:10,  1.81it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  55%|█████▌    | 92/167 [00:04<00:03, 24.98it/s][A
Epoch 1:  62%|██████▏   | 3678/5971 [33:53<21:07,  1.81it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 25.34it/s][A
Epoch 1:  62%|██████▏   | 3681/5971 [33:53<21:05,  1.81it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 25.04it/s][A
Epoch 1:  62%|██████▏   | 3684/5971 [33:54<21:02,  1.81it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  60%|██████    | 101/167 [00:04<00:02, 25.26it/s][A
Epoch 1:  62%|██████▏   | 3687/5971 [33:54<20:59,  1.81it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 25.59it/s][A
Epoch 1:  62%|██████▏   | 3690/5971 [33:54<20:57,  1.81it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 26.95it/s][A
Epoch 1:  62%|██████▏   | 3694/5971 [33:54<20:53,  1.82it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  67%|██████▋   | 112/167 [00:04<00:01, 27.75it/s][A
Epoch 1:  62%|██████▏   | 3698/5971 [33:54<20:50,  1.82it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  69%|██████▉   | 115/167 [00:05<00:01, 27.32it/s][A
Epoch 1:  62%|██████▏   | 3702/5971 [33:54<20:46,  1.82it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  71%|███████   | 118/167 [00:05<00:01, 27.78it/s][A

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 26.92it/s][A
Epoch 1:  62%|██████▏   | 3706/5971 [33:54<20:43,  1.82it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 27.22it/s][A
Epoch 1:  62%|██████▏   | 3710/5971 [33:55<20:39,  1.82it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 26.18it/s][A
Epoch 1:  62%|██████▏   | 3714/5971 [33:55<20:36,  1.83it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 26.21it/s][A

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 25.84it/s][A
Epoch 1:  62%|██████▏   | 3718/5971 [33:55<20:33,  1.83it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 25.53it/s][A
Epoch 1:  62%|██████▏   | 3722/5971 [33:55<20:29,  1.83it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 24.37it/s][A
Epoch 1:  62%|██████▏   | 3726/5971 [33:55<20:26,  1.83it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  85%|████████▌ | 142/167 [00:06<00:01, 24.48it/s][A

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 24.15it/s][A
Epoch 1:  62%|██████▏   | 3730/5971 [33:55<20:22,  1.83it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 23.53it/s][A
Epoch 1:  63%|██████▎   | 3734/5971 [33:56<20:19,  1.83it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 23.62it/s][A
Epoch 1:  63%|██████▎   | 3738/5971 [33:56<20:16,  1.84it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 23.92it/s][A

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 22.94it/s][A
Epoch 1:  63%|██████▎   | 3742/5971 [33:56<20:12,  1.84it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 24.53it/s][A
Epoch 1:  63%|██████▎   | 3746/5971 [33:56<20:09,  1.84it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 24.91it/s][A
Epoch 1:  63%|██████▎   | 3750/5971 [33:56<20:05,  1.84it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  99%|█████████▉| 166/167 [00:07<00:00, 25.14it/s][A
Epoch 1:  63%|██████▎   | 3752/5971 [33:57<20:04,  1.84it/s, loss=0.156, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00311, train/loss_step=0.406, global_step=924.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Epoch 1:  63%|██████▎   | 3753/5971 [33:57<20:04,  1.84it/s, loss=0.159, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000394, train/loss_step=0.120, global_step=925.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3754/5971 [33:58<20:03,  1.84it/s, loss=0.159, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000394, train/loss_step=0.120, global_step=925.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3754/5971 [33:58<20:03,  1.84it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0146, train/loss_vlb_step=6.45e-5, train/loss_step=0.0146, global_step=925.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3755/5971 [33:59<20:03,  1.84it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00253, train/loss_vlb_step=1.5e-5, train/loss_step=0.00253, global_step=925.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3756/5971 [34:01<20:03,  1.84it/s, loss=0.116, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000471, train/loss_step=0.136, global_step=925.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  63%|██████▎   | 3757/5971 [34:02<20:03,  1.84it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0387, train/loss_vlb_step=0.000136, train/loss_step=0.0387, global_step=926.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3758/5971 [34:03<20:03,  1.84it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0387, train/loss_vlb_step=0.000136, train/loss_step=0.0387, global_step=926.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3758/5971 [34:03<20:03,  1.84it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0554, train/loss_vlb_step=0.000201, train/loss_step=0.0554, global_step=926.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3759/5971 [34:04<20:02,  1.84it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0742, train/loss_vlb_step=0.000252, train/loss_step=0.0742, global_step=926.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3760/5971 [34:06<20:03,  1.84it/s, loss=0.0966, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00177, train/loss_step=0.296, global_step=926.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  63%|██████▎   | 3761/5971 [34:07<20:03,  1.84it/s, loss=0.1, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000413, train/loss_step=0.125, global_step=927.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  63%|██████▎   | 3762/5971 [34:08<20:02,  1.84it/s, loss=0.1, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000413, train/loss_step=0.125, global_step=927.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3762/5971 [34:08<20:02,  1.84it/s, loss=0.106, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000479, train/loss_step=0.143, global_step=927.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3763/5971 [34:09<20:02,  1.84it/s, loss=0.109, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.00038, train/loss_step=0.115, global_step=927.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  63%|██████▎   | 3764/5971 [34:11<20:02,  1.83it/s, loss=0.108, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000476, train/loss_step=0.145, global_step=927.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3765/5971 [34:12<20:02,  1.83it/s, loss=0.111, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000355, train/loss_step=0.108, global_step=928.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3766/5971 [34:13<20:02,  1.83it/s, loss=0.111, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000355, train/loss_step=0.108, global_step=928.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3766/5971 [34:13<20:02,  1.83it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0444, train/loss_vlb_step=0.000153, train/loss_step=0.0444, global_step=928.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3767/5971 [34:14<20:01,  1.83it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0181, train/loss_vlb_step=7.4e-5, train/loss_step=0.0181, global_step=928.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  63%|██████▎   | 3768/5971 [34:16<20:02,  1.83it/s, loss=0.111, v_num=0, train/loss_simple_step=0.325, train/loss_vlb_step=0.00142, train/loss_step=0.325, global_step=928.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3769/5971 [34:17<20:01,  1.83it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0206, train/loss_vlb_step=8.09e-5, train/loss_step=0.0206, global_step=929.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3770/5971 [34:18<20:01,  1.83it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0206, train/loss_vlb_step=8.09e-5, train/loss_step=0.0206, global_step=929.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3770/5971 [34:18<20:01,  1.83it/s, loss=0.134, v_num=0, train/loss_simple_step=0.475, train/loss_vlb_step=0.00341, train/loss_step=0.475, global_step=929.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  63%|██████▎   | 3771/5971 [34:19<20:01,  1.83it/s, loss=0.14, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000428, train/loss_step=0.130, global_step=929.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3772/5971 [34:22<20:01,  1.83it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0922, train/loss_vlb_step=0.000306, train/loss_step=0.0922, global_step=929.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3773/5971 [34:23<20:01,  1.83it/s, loss=0.123, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000355, train/loss_step=0.108, global_step=930.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  63%|██████▎   | 3774/5971 [34:23<20:01,  1.83it/s, loss=0.123, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000355, train/loss_step=0.108, global_step=930.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3774/5971 [34:23<20:01,  1.83it/s, loss=0.139, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00139, train/loss_step=0.320, global_step=930.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  63%|██████▎   | 3775/5971 [34:24<20:00,  1.83it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00915, train/loss_vlb_step=4.22e-5, train/loss_step=0.00915, global_step=930.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3776/5971 [34:27<20:01,  1.83it/s, loss=0.16, v_num=0, train/loss_simple_step=0.547, train/loss_vlb_step=0.00587, train/loss_step=0.547, global_step=930.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  63%|██████▎   | 3777/5971 [34:28<20:01,  1.83it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00351, train/loss_vlb_step=1.93e-5, train/loss_step=0.00351, global_step=931.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3778/5971 [34:29<20:00,  1.83it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00351, train/loss_vlb_step=1.93e-5, train/loss_step=0.00351, global_step=931.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3778/5971 [34:29<20:00,  1.83it/s, loss=0.16, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.00035, train/loss_step=0.106, global_step=931.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  63%|██████▎   | 3779/5971 [34:30<20:00,  1.83it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0021, train/loss_vlb_step=1.21e-5, train/loss_step=0.0021, global_step=931.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3780/5971 [34:32<20:00,  1.82it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0218, train/loss_vlb_step=8.72e-5, train/loss_step=0.0218, global_step=931.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3781/5971 [34:33<20:00,  1.82it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00346, train/loss_vlb_step=1.85e-5, train/loss_step=0.00346, global_step=932.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3782/5971 [34:34<20:00,  1.82it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00346, train/loss_vlb_step=1.85e-5, train/loss_step=0.00346, global_step=932.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3782/5971 [34:34<20:00,  1.82it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0368, train/loss_vlb_step=0.000133, train/loss_step=0.0368, global_step=932.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  63%|██████▎   | 3783/5971 [34:35<19:59,  1.82it/s, loss=0.137, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.000997, train/loss_step=0.230, global_step=932.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  63%|██████▎   | 3784/5971 [34:37<20:00,  1.82it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0315, train/loss_vlb_step=0.000122, train/loss_step=0.0315, global_step=932.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3785/5971 [34:38<19:59,  1.82it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0393, train/loss_vlb_step=0.000143, train/loss_step=0.0393, global_step=933.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3786/5971 [34:39<19:59,  1.82it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0393, train/loss_vlb_step=0.000143, train/loss_step=0.0393, global_step=933.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3786/5971 [34:39<19:59,  1.82it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0292, train/loss_vlb_step=0.000112, train/loss_step=0.0292, global_step=933.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3787/5971 [34:40<19:59,  1.82it/s, loss=0.132, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000371, train/loss_step=0.113, global_step=933.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  63%|██████▎   | 3788/5971 [34:42<19:59,  1.82it/s, loss=0.14, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00474, train/loss_step=0.476, global_step=933.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  63%|██████▎   | 3789/5971 [34:43<19:59,  1.82it/s, loss=0.169, v_num=0, train/loss_simple_step=0.603, train/loss_vlb_step=0.00529, train/loss_step=0.603, global_step=934.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3790/5971 [34:43<19:58,  1.82it/s, loss=0.169, v_num=0, train/loss_simple_step=0.603, train/loss_vlb_step=0.00529, train/loss_step=0.603, global_step=934.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3790/5971 [34:43<19:58,  1.82it/s, loss=0.154, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000621, train/loss_step=0.174, global_step=934.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  63%|██████▎   | 3791/5971 [34:44<19:58,  1.82it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00402, train/loss_vlb_step=1.97e-5, train/loss_step=0.00402, global_step=934.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▎   | 3792/5971 [34:47<19:59,  1.82it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0152, train/loss_vlb_step=6.25e-5, train/loss_step=0.0152, global_step=934.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  64%|██████▎   | 3793/5971 [34:48<19:58,  1.82it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=935.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▎   | 3794/5971 [34:48<19:58,  1.82it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=935.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▎   | 3794/5971 [34:48<19:58,  1.82it/s, loss=0.133, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000517, train/loss_step=0.140, global_step=935.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  64%|██████▎   | 3795/5971 [34:49<19:57,  1.82it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0539, train/loss_vlb_step=0.000186, train/loss_step=0.0539, global_step=935.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▎   | 3796/5971 [34:52<19:58,  1.81it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0659, train/loss_vlb_step=0.00023, train/loss_step=0.0659, global_step=935.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  64%|██████▎   | 3797/5971 [34:53<19:58,  1.81it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0915, train/loss_vlb_step=0.000309, train/loss_step=0.0915, global_step=936.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▎   | 3798/5971 [34:54<19:57,  1.81it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0915, train/loss_vlb_step=0.000309, train/loss_step=0.0915, global_step=936.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▎   | 3798/5971 [34:54<19:57,  1.81it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00459, train/loss_vlb_step=2.29e-5, train/loss_step=0.00459, global_step=936.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▎   | 3799/5971 [34:55<19:57,  1.81it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00177, train/loss_vlb_step=1.03e-5, train/loss_step=0.00177, global_step=936.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▎   | 3800/5971 [34:57<19:58,  1.81it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00333, train/loss_vlb_step=1.84e-5, train/loss_step=0.00333, global_step=936.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  64%|██████▎   | 3801/5971 [34:58<19:57,  1.81it/s, loss=0.125, v_num=0, train/loss_simple_step=0.315, train/loss_vlb_step=0.00159, train/loss_step=0.315, global_step=937.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  64%|██████▎   | 3802/5971 [34:59<19:57,  1.81it/s, loss=0.125, v_num=0, train/loss_simple_step=0.315, train/loss_vlb_step=0.00159, train/loss_step=0.315, global_step=937.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▎   | 3802/5971 [34:59<19:57,  1.81it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0439, train/loss_vlb_step=0.000164, train/loss_step=0.0439, global_step=937.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▎   | 3803/5971 [35:00<19:57,  1.81it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00185, train/loss_vlb_step=1.13e-5, train/loss_step=0.00185, global_step=937.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▎   | 3804/5971 [35:03<19:57,  1.81it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0123, train/loss_vlb_step=5.21e-5, train/loss_step=0.0123, global_step=937.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  64%|██████▎   | 3805/5971 [35:03<19:57,  1.81it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00219, train/loss_vlb_step=1.23e-5, train/loss_step=0.00219, global_step=938.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▎   | 3806/5971 [35:04<19:56,  1.81it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00219, train/loss_vlb_step=1.23e-5, train/loss_step=0.00219, global_step=938.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▎   | 3806/5971 [35:04<19:56,  1.81it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0151, train/loss_vlb_step=6.69e-5, train/loss_step=0.0151, global_step=938.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  64%|██████▍   | 3807/5971 [35:05<19:56,  1.81it/s, loss=0.112, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.00044, train/loss_step=0.133, global_step=938.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  64%|██████▍   | 3808/5971 [35:07<19:56,  1.81it/s, loss=0.0943, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000415, train/loss_step=0.126, global_step=938.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3809/5971 [35:08<19:56,  1.81it/s, loss=0.0643, v_num=0, train/loss_simple_step=0.00349, train/loss_vlb_step=1.8e-5, train/loss_step=0.00349, global_step=939.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3810/5971 [35:09<19:56,  1.81it/s, loss=0.0643, v_num=0, train/loss_simple_step=0.00349, train/loss_vlb_step=1.8e-5, train/loss_step=0.00349, global_step=939.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3810/5971 [35:09<19:56,  1.81it/s, loss=0.0557, v_num=0, train/loss_simple_step=0.00219, train/loss_vlb_step=1.29e-5, train/loss_step=0.00219, global_step=939.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3811/5971 [35:10<19:55,  1.81it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.498, train/loss_vlb_step=0.00436, train/loss_step=0.498, global_step=939.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  64%|██████▍   | 3812/5971 [35:13<19:56,  1.80it/s, loss=0.0845, v_num=0, train/loss_simple_step=0.0985, train/loss_vlb_step=0.000325, train/loss_step=0.0985, global_step=939.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3813/5971 [35:14<19:56,  1.80it/s, loss=0.0864, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000378, train/loss_step=0.115, global_step=940.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  64%|██████▍   | 3814/5971 [35:14<19:55,  1.80it/s, loss=0.0864, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000378, train/loss_step=0.115, global_step=940.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3814/5971 [35:14<19:55,  1.80it/s, loss=0.0853, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000391, train/loss_step=0.118, global_step=940.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3815/5971 [35:15<19:55,  1.80it/s, loss=0.0836, v_num=0, train/loss_simple_step=0.0215, train/loss_vlb_step=8.48e-5, train/loss_step=0.0215, global_step=940.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3816/5971 [35:17<19:55,  1.80it/s, loss=0.0855, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000336, train/loss_step=0.102, global_step=940.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  64%|██████▍   | 3817/5971 [35:19<19:55,  1.80it/s, loss=0.0858, v_num=0, train/loss_simple_step=0.0985, train/loss_vlb_step=0.000324, train/loss_step=0.0985, global_step=941.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3818/5971 [35:20<19:55,  1.80it/s, loss=0.0858, v_num=0, train/loss_simple_step=0.0985, train/loss_vlb_step=0.000324, train/loss_step=0.0985, global_step=941.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3818/5971 [35:20<19:55,  1.80it/s, loss=0.087, v_num=0, train/loss_simple_step=0.0287, train/loss_vlb_step=0.000115, train/loss_step=0.0287, global_step=941.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  64%|██████▍   | 3819/5971 [35:20<19:54,  1.80it/s, loss=0.113, v_num=0, train/loss_simple_step=0.522, train/loss_vlb_step=0.00551, train/loss_step=0.522, global_step=941.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  64%|██████▍   | 3820/5971 [35:23<19:55,  1.80it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0639, train/loss_vlb_step=0.000219, train/loss_step=0.0639, global_step=941.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3821/5971 [35:23<19:54,  1.80it/s, loss=0.114, v_num=0, train/loss_simple_step=0.284, train/loss_vlb_step=0.00118, train/loss_step=0.284, global_step=942.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  64%|██████▍   | 3822/5971 [35:24<19:54,  1.80it/s, loss=0.114, v_num=0, train/loss_simple_step=0.284, train/loss_vlb_step=0.00118, train/loss_step=0.284, global_step=942.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3822/5971 [35:24<19:54,  1.80it/s, loss=0.121, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.000579, train/loss_step=0.171, global_step=942.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3823/5971 [35:25<19:54,  1.80it/s, loss=0.122, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.0001, train/loss_step=0.026, global_step=942.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  64%|██████▍   | 3824/5971 [35:28<19:54,  1.80it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0274, train/loss_vlb_step=0.000112, train/loss_step=0.0274, global_step=942.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3825/5971 [35:29<19:54,  1.80it/s, loss=0.128, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000387, train/loss_step=0.117, global_step=943.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  64%|██████▍   | 3826/5971 [35:30<19:54,  1.80it/s, loss=0.128, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000387, train/loss_step=0.117, global_step=943.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3826/5971 [35:30<19:54,  1.80it/s, loss=0.133, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000368, train/loss_step=0.111, global_step=943.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3827/5971 [35:31<19:53,  1.80it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000137, train/loss_step=0.0359, global_step=943.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3828/5971 [35:34<19:54,  1.79it/s, loss=0.143, v_num=0, train/loss_simple_step=0.422, train/loss_vlb_step=0.00314, train/loss_step=0.422, global_step=943.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  64%|██████▍   | 3829/5971 [35:35<19:54,  1.79it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0172, train/loss_vlb_step=7.38e-5, train/loss_step=0.0172, global_step=944.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3830/5971 [35:36<19:53,  1.79it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0172, train/loss_vlb_step=7.38e-5, train/loss_step=0.0172, global_step=944.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3830/5971 [35:36<19:53,  1.79it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0027, train/loss_vlb_step=1.44e-5, train/loss_step=0.0027, global_step=944.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3831/5971 [35:37<19:53,  1.79it/s, loss=0.132, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00105, train/loss_step=0.267, global_step=944.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  64%|██████▍   | 3832/5971 [35:39<19:53,  1.79it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00863, train/loss_vlb_step=3.85e-5, train/loss_step=0.00863, global_step=944.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3833/5971 [35:40<19:53,  1.79it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00182, train/loss_vlb_step=1.11e-5, train/loss_step=0.00182, global_step=945.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3834/5971 [35:41<19:53,  1.79it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00182, train/loss_vlb_step=1.11e-5, train/loss_step=0.00182, global_step=945.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3834/5971 [35:41<19:53,  1.79it/s, loss=0.127, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000696, train/loss_step=0.202, global_step=945.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  64%|██████▍   | 3835/5971 [35:41<19:52,  1.79it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00626, train/loss_vlb_step=3.07e-5, train/loss_step=0.00626, global_step=945.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3836/5971 [35:45<19:53,  1.79it/s, loss=0.136, v_num=0, train/loss_simple_step=0.306, train/loss_vlb_step=0.00165, train/loss_step=0.306, global_step=945.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  64%|██████▍   | 3837/5971 [35:45<19:53,  1.79it/s, loss=0.17, v_num=0, train/loss_simple_step=0.786, train/loss_vlb_step=0.0341, train/loss_step=0.786, global_step=946.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  64%|██████▍   | 3838/5971 [35:46<19:52,  1.79it/s, loss=0.17, v_num=0, train/loss_simple_step=0.786, train/loss_vlb_step=0.0341, train/loss_step=0.786, global_step=946.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3838/5971 [35:46<19:52,  1.79it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00827, train/loss_vlb_step=3.87e-5, train/loss_step=0.00827, global_step=946.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3839/5971 [35:47<19:52,  1.79it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0341, train/loss_vlb_step=0.000119, train/loss_step=0.0341, global_step=946.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  64%|██████▍   | 3840/5971 [35:49<19:52,  1.79it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00336, train/loss_vlb_step=1.83e-5, train/loss_step=0.00336, global_step=946.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3841/5971 [35:50<19:52,  1.79it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00172, train/loss_vlb_step=9.44e-6, train/loss_step=0.00172, global_step=947.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3842/5971 [35:51<19:52,  1.79it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00172, train/loss_vlb_step=9.44e-6, train/loss_step=0.00172, global_step=947.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3842/5971 [35:51<19:52,  1.79it/s, loss=0.142, v_num=0, train/loss_simple_step=0.464, train/loss_vlb_step=0.00358, train/loss_step=0.464, global_step=947.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  64%|██████▍   | 3843/5971 [35:52<19:51,  1.79it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0595, train/loss_vlb_step=0.000198, train/loss_step=0.0595, global_step=947.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3844/5971 [35:54<19:52,  1.78it/s, loss=0.157, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00174, train/loss_step=0.287, global_step=947.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  64%|██████▍   | 3845/5971 [35:55<19:51,  1.78it/s, loss=0.157, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.00037, train/loss_step=0.111, global_step=948.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3846/5971 [35:56<19:51,  1.78it/s, loss=0.157, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.00037, train/loss_step=0.111, global_step=948.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3846/5971 [35:56<19:51,  1.78it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00701, train/loss_vlb_step=3.51e-5, train/loss_step=0.00701, global_step=948.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3847/5971 [35:57<19:50,  1.78it/s, loss=0.167, v_num=0, train/loss_simple_step=0.342, train/loss_vlb_step=0.00146, train/loss_step=0.342, global_step=948.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  64%|██████▍   | 3848/5971 [35:59<19:51,  1.78it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0117, train/loss_vlb_step=5.01e-5, train/loss_step=0.0117, global_step=948.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3849/5971 [36:00<19:50,  1.78it/s, loss=0.162, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00138, train/loss_step=0.329, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  64%|██████▍   | 3850/5971 [36:01<19:50,  1.78it/s, loss=0.162, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00138, train/loss_step=0.329, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3850/5971 [36:01<19:50,  1.78it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0581, train/loss_vlb_step=0.000198, train/loss_step=0.0581, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  64%|██████▍   | 3851/5971 [36:02<19:50,  1.78it/s, loss=0.162, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000773, train/loss_step=0.207, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  65%|██████▍   | 3852/5971 [36:04<19:50,  1.78it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:15,  2.19it/s][A
Epoch 1:  65%|██████▍   | 3854/5971 [36:05<19:48,  1.78it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   1%|          | 2/167 [00:00<00:53,  3.08it/s][A

Validating:   3%|▎         | 5/167 [00:00<00:19,  8.46it/s][A
Epoch 1:  65%|██████▍   | 3858/5971 [36:05<19:45,  1.78it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   5%|▍         | 8/167 [00:00<00:12, 12.93it/s][A
Epoch 1:  65%|██████▍   | 3862/5971 [36:05<19:42,  1.78it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   7%|▋         | 11/167 [00:01<00:09, 16.27it/s][A
Epoch 1:  65%|██████▍   | 3866/5971 [36:05<19:38,  1.79it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   8%|▊         | 14/167 [00:01<00:08, 18.49it/s][A

Validating:  10%|█         | 17/167 [00:01<00:07, 20.45it/s][A
Epoch 1:  65%|██████▍   | 3870/5971 [36:05<19:35,  1.79it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.11it/s][A
Epoch 1:  65%|██████▍   | 3874/5971 [36:06<19:32,  1.79it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 23.26it/s][A
Epoch 1:  65%|██████▍   | 3878/5971 [36:06<19:28,  1.79it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 25.51it/s][A
Epoch 1:  65%|██████▌   | 3882/5971 [36:06<19:25,  1.79it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  19%|█▊        | 31/167 [00:01<00:05, 26.49it/s][A
Epoch 1:  65%|██████▌   | 3886/5971 [36:06<19:22,  1.79it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  20%|██        | 34/167 [00:01<00:05, 26.19it/s][A

Validating:  22%|██▏       | 37/167 [00:01<00:04, 27.14it/s][A
Epoch 1:  65%|██████▌   | 3890/5971 [36:06<19:18,  1.80it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  24%|██▍       | 40/167 [00:02<00:04, 27.71it/s][A
Epoch 1:  65%|██████▌   | 3894/5971 [36:06<19:15,  1.80it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  26%|██▌       | 43/167 [00:02<00:04, 27.98it/s][A
Epoch 1:  65%|██████▌   | 3898/5971 [36:06<19:12,  1.80it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  28%|██▊       | 46/167 [00:02<00:04, 28.28it/s][A

Validating:  29%|██▉       | 49/167 [00:02<00:04, 28.03it/s][A
Epoch 1:  65%|██████▌   | 3902/5971 [36:07<19:08,  1.80it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  31%|███       | 52/167 [00:02<00:04, 28.06it/s][A
Epoch 1:  65%|██████▌   | 3906/5971 [36:07<19:05,  1.80it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  33%|███▎      | 55/167 [00:02<00:04, 27.16it/s][A
Epoch 1:  65%|██████▌   | 3910/5971 [36:07<19:02,  1.80it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  35%|███▌      | 59/167 [00:02<00:03, 27.71it/s][A
Epoch 1:  66%|██████▌   | 3914/5971 [36:07<18:58,  1.81it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  37%|███▋      | 62/167 [00:02<00:03, 27.85it/s][A

Validating:  39%|███▉      | 65/167 [00:02<00:03, 27.48it/s][A
Epoch 1:  66%|██████▌   | 3918/5971 [36:07<18:55,  1.81it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  41%|████      | 68/167 [00:03<00:03, 27.42it/s][A
Epoch 1:  66%|██████▌   | 3922/5971 [36:07<18:52,  1.81it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 28.49it/s][A
Epoch 1:  66%|██████▌   | 3926/5971 [36:07<18:48,  1.81it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 28.79it/s][A
Epoch 1:  66%|██████▌   | 3930/5971 [36:08<18:45,  1.81it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 28.34it/s][A

Validating:  49%|████▊     | 81/167 [00:03<00:03, 27.88it/s][A
Epoch 1:  66%|██████▌   | 3934/5971 [36:08<18:42,  1.81it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  50%|█████     | 84/167 [00:03<00:03, 26.80it/s][A
Epoch 1:  66%|██████▌   | 3938/5971 [36:08<18:39,  1.82it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  52%|█████▏    | 87/167 [00:03<00:02, 26.96it/s][A
Epoch 1:  66%|██████▌   | 3942/5971 [36:08<18:35,  1.82it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  54%|█████▍    | 90/167 [00:03<00:02, 25.95it/s][A

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 25.68it/s][A
Epoch 1:  66%|██████▌   | 3946/5971 [36:08<18:32,  1.82it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 25.46it/s][A
Epoch 1:  66%|██████▌   | 3950/5971 [36:08<18:29,  1.82it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 26.44it/s][A
Epoch 1:  66%|██████▌   | 3954/5971 [36:08<18:26,  1.82it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 27.44it/s][A
Epoch 1:  66%|██████▋   | 3958/5971 [36:09<18:22,  1.83it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 27.31it/s][A

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 26.91it/s][A
Epoch 1:  66%|██████▋   | 3962/5971 [36:09<18:19,  1.83it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  67%|██████▋   | 112/167 [00:04<00:02, 26.94it/s][A
Epoch 1:  66%|██████▋   | 3966/5971 [36:09<18:16,  1.83it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  69%|██████▉   | 115/167 [00:04<00:02, 25.48it/s][A
Epoch 1:  66%|██████▋   | 3970/5971 [36:09<18:13,  1.83it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 26.85it/s][A
Epoch 1:  67%|██████▋   | 3974/5971 [36:09<18:10,  1.83it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 26.31it/s][A
Epoch 1:  67%|██████▋   | 3978/5971 [36:09<18:06,  1.83it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 26.39it/s][A

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 23.96it/s][A
Epoch 1:  67%|██████▋   | 3982/5971 [36:10<18:03,  1.84it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 24.94it/s][A
Epoch 1:  67%|██████▋   | 3986/5971 [36:10<18:00,  1.84it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  81%|████████  | 135/167 [00:05<00:01, 23.81it/s][A
Epoch 1:  67%|██████▋   | 3990/5971 [36:10<17:57,  1.84it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 24.13it/s][A

Validating:  84%|████████▍ | 141/167 [00:05<00:01, 24.43it/s][A
Epoch 1:  67%|██████▋   | 3994/5971 [36:10<17:54,  1.84it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 25.84it/s][A
Epoch 1:  67%|██████▋   | 3998/5971 [36:10<17:50,  1.84it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 27.32it/s][A
Epoch 1:  67%|██████▋   | 4002/5971 [36:10<17:47,  1.84it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 26.62it/s][A
Epoch 1:  67%|██████▋   | 4006/5971 [36:11<17:44,  1.85it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 25.07it/s][A

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 25.76it/s][A
Epoch 1:  67%|██████▋   | 4010/5971 [36:11<17:41,  1.85it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 26.83it/s][A
Epoch 1:  67%|██████▋   | 4014/5971 [36:11<17:38,  1.85it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 26.66it/s][A
Epoch 1:  67%|██████▋   | 4018/5971 [36:11<17:35,  1.85it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating: 100%|██████████| 167/167 [00:06<00:00, 28.01it/s][A
Epoch 1:  67%|██████▋   | 4020/5971 [36:11<17:33,  1.85it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.04e-5, train/loss_step=0.0112, global_step=949.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Epoch 1:  67%|██████▋   | 4021/5971 [36:12<17:33,  1.85it/s, loss=0.178, v_num=0, train/loss_simple_step=0.324, train/loss_vlb_step=0.00152, train/loss_step=0.324, global_step=950.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  67%|██████▋   | 4022/5971 [36:13<17:33,  1.85it/s, loss=0.178, v_num=0, train/loss_simple_step=0.324, train/loss_vlb_step=0.00152, train/loss_step=0.324, global_step=950.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  67%|██████▋   | 4022/5971 [36:13<17:33,  1.85it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0032, train/loss_vlb_step=1.79e-5, train/loss_step=0.0032, global_step=950.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  67%|██████▋   | 4023/5971 [36:14<17:32,  1.85it/s, loss=0.187, v_num=0, train/loss_simple_step=0.387, train/loss_vlb_step=0.00218, train/loss_step=0.387, global_step=950.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  67%|██████▋   | 4024/5971 [36:16<17:32,  1.85it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0534, train/loss_vlb_step=0.000177, train/loss_step=0.0534, global_step=950.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  67%|██████▋   | 4025/5971 [36:17<17:32,  1.85it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00751, train/loss_vlb_step=3.67e-5, train/loss_step=0.00751, global_step=951.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  67%|██████▋   | 4026/5971 [36:18<17:32,  1.85it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00751, train/loss_vlb_step=3.67e-5, train/loss_step=0.00751, global_step=951.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  67%|██████▋   | 4026/5971 [36:18<17:32,  1.85it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00136, train/loss_vlb_step=8e-6, train/loss_step=0.00136, global_step=951.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  67%|██████▋   | 4027/5971 [36:19<17:31,  1.85it/s, loss=0.147, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.00119, train/loss_step=0.278, global_step=951.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  67%|██████▋   | 4028/5971 [36:21<17:32,  1.85it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0167, train/loss_vlb_step=7.23e-5, train/loss_step=0.0167, global_step=951.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  67%|██████▋   | 4029/5971 [36:22<17:31,  1.85it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0024, train/loss_vlb_step=1.44e-5, train/loss_step=0.0024, global_step=952.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  67%|██████▋   | 4030/5971 [36:23<17:31,  1.85it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0024, train/loss_vlb_step=1.44e-5, train/loss_step=0.0024, global_step=952.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  67%|██████▋   | 4030/5971 [36:23<17:31,  1.85it/s, loss=0.135, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000748, train/loss_step=0.207, global_step=952.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  68%|██████▊   | 4031/5971 [36:24<17:31,  1.85it/s, loss=0.138, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.00038, train/loss_step=0.115, global_step=952.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  68%|██████▊   | 4032/5971 [36:26<17:31,  1.84it/s, loss=0.147, v_num=0, train/loss_simple_step=0.467, train/loss_vlb_step=0.00441, train/loss_step=0.467, global_step=952.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4033/5971 [36:27<17:30,  1.84it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00801, train/loss_vlb_step=3.91e-5, train/loss_step=0.00801, global_step=953.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4034/5971 [36:28<17:30,  1.84it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00801, train/loss_vlb_step=3.91e-5, train/loss_step=0.00801, global_step=953.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4034/5971 [36:28<17:30,  1.84it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0054, train/loss_vlb_step=2.82e-5, train/loss_step=0.0054, global_step=953.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  68%|██████▊   | 4035/5971 [36:29<17:30,  1.84it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00491, train/loss_vlb_step=2.64e-5, train/loss_step=0.00491, global_step=953.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4036/5971 [36:31<17:30,  1.84it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00414, train/loss_vlb_step=2.08e-5, train/loss_step=0.00414, global_step=953.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4037/5971 [36:32<17:30,  1.84it/s, loss=0.125, v_num=0, train/loss_simple_step=0.334, train/loss_vlb_step=0.00156, train/loss_step=0.334, global_step=954.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  68%|██████▊   | 4038/5971 [36:33<17:29,  1.84it/s, loss=0.125, v_num=0, train/loss_simple_step=0.334, train/loss_vlb_step=0.00156, train/loss_step=0.334, global_step=954.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4038/5971 [36:33<17:29,  1.84it/s, loss=0.139, v_num=0, train/loss_simple_step=0.341, train/loss_vlb_step=0.00175, train/loss_step=0.341, global_step=954.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4039/5971 [36:34<17:29,  1.84it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00734, train/loss_vlb_step=3.63e-5, train/loss_step=0.00734, global_step=954.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4040/5971 [36:36<17:29,  1.84it/s, loss=0.168, v_num=0, train/loss_simple_step=0.784, train/loss_vlb_step=0.0406, train/loss_step=0.784, global_step=954.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  68%|██████▊   | 4041/5971 [36:37<17:29,  1.84it/s, loss=0.152, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.43e-5, train/loss_step=0.017, global_step=955.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4042/5971 [36:38<17:28,  1.84it/s, loss=0.152, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.43e-5, train/loss_step=0.017, global_step=955.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4042/5971 [36:38<17:28,  1.84it/s, loss=0.16, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000554, train/loss_step=0.155, global_step=955.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4043/5971 [36:39<17:28,  1.84it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0191, train/loss_vlb_step=7.6e-5, train/loss_step=0.0191, global_step=955.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4044/5971 [36:41<17:28,  1.84it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0258, train/loss_vlb_step=0.000102, train/loss_step=0.0258, global_step=955.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4045/5971 [36:42<17:28,  1.84it/s, loss=0.149, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000638, train/loss_step=0.188, global_step=956.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  68%|██████▊   | 4046/5971 [36:43<17:27,  1.84it/s, loss=0.149, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000638, train/loss_step=0.188, global_step=956.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4046/5971 [36:43<17:27,  1.84it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0658, train/loss_vlb_step=0.000221, train/loss_step=0.0658, global_step=956.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4047/5971 [36:43<17:27,  1.84it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0372, train/loss_vlb_step=0.000138, train/loss_step=0.0372, global_step=956.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  68%|██████▊   | 4048/5971 [36:46<17:27,  1.84it/s, loss=0.145, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000373, train/loss_step=0.113, global_step=956.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  68%|██████▊   | 4049/5971 [36:47<17:27,  1.84it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0308, train/loss_vlb_step=0.000124, train/loss_step=0.0308, global_step=957.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4050/5971 [36:47<17:27,  1.83it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0308, train/loss_vlb_step=0.000124, train/loss_step=0.0308, global_step=957.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4050/5971 [36:47<17:27,  1.83it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00582, train/loss_vlb_step=2.95e-5, train/loss_step=0.00582, global_step=957.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4051/5971 [36:48<17:26,  1.83it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0159, train/loss_vlb_step=6.36e-5, train/loss_step=0.0159, global_step=957.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  68%|██████▊   | 4052/5971 [36:51<17:26,  1.83it/s, loss=0.116, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000501, train/loss_step=0.149, global_step=957.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  68%|██████▊   | 4053/5971 [36:51<17:26,  1.83it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0865, train/loss_vlb_step=0.000291, train/loss_step=0.0865, global_step=958.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4054/5971 [36:52<17:26,  1.83it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0865, train/loss_vlb_step=0.000291, train/loss_step=0.0865, global_step=958.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4054/5971 [36:52<17:26,  1.83it/s, loss=0.131, v_num=0, train/loss_simple_step=0.232, train/loss_vlb_step=0.000892, train/loss_step=0.232, global_step=958.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  68%|██████▊   | 4055/5971 [36:53<17:25,  1.83it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0297, train/loss_vlb_step=0.000112, train/loss_step=0.0297, global_step=958.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4056/5971 [36:55<17:25,  1.83it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00138, train/loss_vlb_step=8.37e-6, train/loss_step=0.00138, global_step=958.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4057/5971 [36:56<17:25,  1.83it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00923, train/loss_vlb_step=4.24e-5, train/loss_step=0.00923, global_step=959.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4058/5971 [36:57<17:25,  1.83it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00923, train/loss_vlb_step=4.24e-5, train/loss_step=0.00923, global_step=959.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4058/5971 [36:57<17:25,  1.83it/s, loss=0.108, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000672, train/loss_step=0.189, global_step=959.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  68%|██████▊   | 4059/5971 [36:58<17:24,  1.83it/s, loss=0.114, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000446, train/loss_step=0.135, global_step=959.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4060/5971 [37:01<17:25,  1.83it/s, loss=0.0982, v_num=0, train/loss_simple_step=0.458, train/loss_vlb_step=0.00476, train/loss_step=0.458, global_step=959.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4061/5971 [37:01<17:24,  1.83it/s, loss=0.101, v_num=0, train/loss_simple_step=0.071, train/loss_vlb_step=0.000233, train/loss_step=0.071, global_step=960.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4062/5971 [37:02<17:24,  1.83it/s, loss=0.101, v_num=0, train/loss_simple_step=0.071, train/loss_vlb_step=0.000233, train/loss_step=0.071, global_step=960.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4062/5971 [37:02<17:24,  1.83it/s, loss=0.0937, v_num=0, train/loss_simple_step=0.0106, train/loss_vlb_step=4.94e-5, train/loss_step=0.0106, global_step=960.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4063/5971 [37:03<17:24,  1.83it/s, loss=0.0947, v_num=0, train/loss_simple_step=0.0394, train/loss_vlb_step=0.000143, train/loss_step=0.0394, global_step=960.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4064/5971 [37:05<17:24,  1.83it/s, loss=0.108, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.00118, train/loss_step=0.289, global_step=960.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  68%|██████▊   | 4065/5971 [37:06<17:23,  1.83it/s, loss=0.0986, v_num=0, train/loss_simple_step=0.00385, train/loss_vlb_step=2.11e-5, train/loss_step=0.00385, global_step=961.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4066/5971 [37:07<17:23,  1.83it/s, loss=0.0986, v_num=0, train/loss_simple_step=0.00385, train/loss_vlb_step=2.11e-5, train/loss_step=0.00385, global_step=961.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4066/5971 [37:07<17:23,  1.83it/s, loss=0.0958, v_num=0, train/loss_simple_step=0.00848, train/loss_vlb_step=4.08e-5, train/loss_step=0.00848, global_step=961.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4067/5971 [37:08<17:23,  1.83it/s, loss=0.0999, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000391, train/loss_step=0.119, global_step=961.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  68%|██████▊   | 4068/5971 [37:10<17:23,  1.82it/s, loss=0.107, v_num=0, train/loss_simple_step=0.246, train/loss_vlb_step=0.000959, train/loss_step=0.246, global_step=961.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  68%|██████▊   | 4069/5971 [37:11<17:22,  1.82it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0796, train/loss_vlb_step=0.000268, train/loss_step=0.0796, global_step=962.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4070/5971 [37:12<17:22,  1.82it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0796, train/loss_vlb_step=0.000268, train/loss_step=0.0796, global_step=962.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4070/5971 [37:12<17:22,  1.82it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00504, train/loss_vlb_step=2.56e-5, train/loss_step=0.00504, global_step=962.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4071/5971 [37:13<17:22,  1.82it/s, loss=0.146, v_num=0, train/loss_simple_step=0.768, train/loss_vlb_step=0.0215, train/loss_step=0.768, global_step=962.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  68%|██████▊   | 4072/5971 [37:15<17:22,  1.82it/s, loss=0.14, v_num=0, train/loss_simple_step=0.019, train/loss_vlb_step=7.92e-5, train/loss_step=0.019, global_step=962.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4073/5971 [37:16<17:21,  1.82it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0573, train/loss_vlb_step=0.000202, train/loss_step=0.0573, global_step=963.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4074/5971 [37:17<17:21,  1.82it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0573, train/loss_vlb_step=0.000202, train/loss_step=0.0573, global_step=963.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4074/5971 [37:17<17:21,  1.82it/s, loss=0.133, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000423, train/loss_step=0.127, global_step=963.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  68%|██████▊   | 4075/5971 [37:18<17:21,  1.82it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000149, train/loss_step=0.0392, global_step=963.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4076/5971 [37:20<17:21,  1.82it/s, loss=0.14, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=963.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  68%|██████▊   | 4077/5971 [37:21<17:20,  1.82it/s, loss=0.141, v_num=0, train/loss_simple_step=0.042, train/loss_vlb_step=0.000147, train/loss_step=0.042, global_step=964.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4078/5971 [37:22<17:20,  1.82it/s, loss=0.141, v_num=0, train/loss_simple_step=0.042, train/loss_vlb_step=0.000147, train/loss_step=0.042, global_step=964.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4078/5971 [37:22<17:20,  1.82it/s, loss=0.144, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.000881, train/loss_step=0.230, global_step=964.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4079/5971 [37:22<17:20,  1.82it/s, loss=0.141, v_num=0, train/loss_simple_step=0.089, train/loss_vlb_step=0.000296, train/loss_step=0.089, global_step=964.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4080/5971 [37:25<17:20,  1.82it/s, loss=0.132, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.00133, train/loss_step=0.278, global_step=964.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  68%|██████▊   | 4081/5971 [37:26<17:19,  1.82it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0797, train/loss_vlb_step=0.000264, train/loss_step=0.0797, global_step=965.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4082/5971 [37:27<17:19,  1.82it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0797, train/loss_vlb_step=0.000264, train/loss_step=0.0797, global_step=965.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4082/5971 [37:27<17:19,  1.82it/s, loss=0.143, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.000756, train/loss_step=0.210, global_step=965.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  68%|██████▊   | 4083/5971 [37:27<17:19,  1.82it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0638, train/loss_vlb_step=0.000215, train/loss_step=0.0638, global_step=965.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4084/5971 [37:30<17:19,  1.82it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0818, train/loss_vlb_step=0.000274, train/loss_step=0.0818, global_step=965.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4085/5971 [37:30<17:18,  1.82it/s, loss=0.141, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000515, train/loss_step=0.157, global_step=966.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  68%|██████▊   | 4086/5971 [37:31<17:18,  1.81it/s, loss=0.141, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000515, train/loss_step=0.157, global_step=966.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4086/5971 [37:31<17:18,  1.81it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00421, train/loss_vlb_step=2.26e-5, train/loss_step=0.00421, global_step=966.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4087/5971 [37:32<17:18,  1.81it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00237, train/loss_vlb_step=1.35e-5, train/loss_step=0.00237, global_step=966.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4088/5971 [37:35<17:18,  1.81it/s, loss=0.161, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0214, train/loss_step=0.764, global_step=966.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  68%|██████▊   | 4089/5971 [37:35<17:18,  1.81it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00282, train/loss_vlb_step=1.61e-5, train/loss_step=0.00282, global_step=967.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4090/5971 [37:36<17:17,  1.81it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00282, train/loss_vlb_step=1.61e-5, train/loss_step=0.00282, global_step=967.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  68%|██████▊   | 4090/5971 [37:36<17:17,  1.81it/s, loss=0.158, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.11e-5, train/loss_step=0.021, global_step=967.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  69%|██████▊   | 4091/5971 [37:37<17:17,  1.81it/s, loss=0.134, v_num=0, train/loss_simple_step=0.280, train/loss_vlb_step=0.00109, train/loss_step=0.280, global_step=967.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  69%|██████▊   | 4092/5971 [37:39<17:17,  1.81it/s, loss=0.141, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000537, train/loss_step=0.162, global_step=967.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  69%|██████▊   | 4093/5971 [37:41<17:17,  1.81it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00643, train/loss_vlb_step=3.14e-5, train/loss_step=0.00643, global_step=968.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  69%|██████▊   | 4094/5971 [37:41<17:16,  1.81it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00643, train/loss_vlb_step=3.14e-5, train/loss_step=0.00643, global_step=968.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  69%|██████▊   | 4094/5971 [37:41<17:16,  1.81it/s, loss=0.137, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.00035, train/loss_step=0.107, global_step=968.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  69%|██████▊   | 4095/5971 [37:42<17:16,  1.81it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.38e-5, train/loss_step=0.0119, global_step=968.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  69%|██████▊   | 4096/5971 [37:44<17:16,  1.81it/s, loss=0.135, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000342, train/loss_step=0.104, global_step=968.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  69%|██████▊   | 4097/5971 [37:45<17:16,  1.81it/s, loss=0.148, v_num=0, train/loss_simple_step=0.299, train/loss_vlb_step=0.0012, train/loss_step=0.299, global_step=969.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  69%|██████▊   | 4098/5971 [37:46<17:15,  1.81it/s, loss=0.148, v_num=0, train/loss_simple_step=0.299, train/loss_vlb_step=0.0012, train/loss_step=0.299, global_step=969.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  69%|██████▊   | 4098/5971 [37:46<17:15,  1.81it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0236, train/loss_vlb_step=8.77e-5, train/loss_step=0.0236, global_step=969.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  69%|██████▊   | 4099/5971 [37:47<17:15,  1.81it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00726, train/loss_vlb_step=3.47e-5, train/loss_step=0.00726, global_step=969.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  69%|██████▊   | 4100/5971 [37:49<17:15,  1.81it/s, loss=0.121, v_num=0, train/loss_simple_step=0.031, train/loss_vlb_step=0.000119, train/loss_step=0.031, global_step=969.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  69%|██████▊   | 4101/5971 [37:50<17:15,  1.81it/s, loss=0.141, v_num=0, train/loss_simple_step=0.486, train/loss_vlb_step=0.003, train/loss_step=0.486, global_step=970.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  69%|██████▊   | 4102/5971 [37:51<17:14,  1.81it/s, loss=0.141, v_num=0, train/loss_simple_step=0.486, train/loss_vlb_step=0.003, train/loss_step=0.486, global_step=970.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  69%|██████▊   | 4102/5971 [37:51<17:14,  1.81it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0187, train/loss_vlb_step=7.63e-5, train/loss_step=0.0187, global_step=970.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  69%|██████▊   | 4103/5971 [37:52<17:14,  1.81it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00792, train/loss_vlb_step=3.89e-5, train/loss_step=0.00792, global_step=970.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  69%|██████▊   | 4104/5971 [37:54<17:14,  1.80it/s, loss=0.149, v_num=0, train/loss_simple_step=0.493, train/loss_vlb_step=0.00513, train/loss_step=0.493, global_step=970.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  69%|██████▊   | 4105/5971 [37:55<17:14,  1.80it/s, loss=0.154, v_num=0, train/loss_simple_step=0.252, train/loss_vlb_step=0.000923, train/loss_step=0.252, global_step=971.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  69%|██████▉   | 4106/5971 [37:56<17:13,  1.80it/s, loss=0.154, v_num=0, train/loss_simple_step=0.252, train/loss_vlb_step=0.000923, train/loss_step=0.252, global_step=971.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  69%|██████▉   | 4106/5971 [37:56<17:13,  1.80it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0569, train/loss_vlb_step=0.000194, train/loss_step=0.0569, global_step=971.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  69%|██████▉   | 4107/5971 [37:57<17:13,  1.80it/s, loss=0.162, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000349, train/loss_step=0.106, global_step=971.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  69%|██████▉   | 4108/5971 [37:59<17:13,  1.80it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00441, train/loss_vlb_step=2.25e-5, train/loss_step=0.00441, global_step=971.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  69%|██████▉   | 4109/5971 [38:00<17:13,  1.80it/s, loss=0.13, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000406, train/loss_step=0.115, global_step=972.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  69%|██████▉   | 4110/5971 [38:01<17:12,  1.80it/s, loss=0.13, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000406, train/loss_step=0.115, global_step=972.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  69%|██████▉   | 4110/5971 [38:01<17:12,  1.80it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00465, train/loss_vlb_step=2.36e-5, train/loss_step=0.00465, global_step=972.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  69%|██████▉   | 4111/5971 [38:02<17:12,  1.80it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0501, train/loss_vlb_step=0.000169, train/loss_step=0.0501, global_step=972.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  69%|██████▉   | 4112/5971 [38:04<17:12,  1.80it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0296, train/loss_vlb_step=0.000114, train/loss_step=0.0296, global_step=972.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  69%|██████▉   | 4113/5971 [38:05<17:12,  1.80it/s, loss=0.126, v_num=0, train/loss_simple_step=0.317, train/loss_vlb_step=0.00129, train/loss_step=0.317, global_step=973.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  69%|██████▉   | 4114/5971 [38:06<17:11,  1.80it/s, loss=0.126, v_num=0, train/loss_simple_step=0.317, train/loss_vlb_step=0.00129, train/loss_step=0.317, global_step=973.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  69%|██████▉   | 4114/5971 [38:06<17:11,  1.80it/s, loss=0.143, v_num=0, train/loss_simple_step=0.449, train/loss_vlb_step=0.003, train/loss_step=0.449, global_step=973.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  69%|██████▉   | 4115/5971 [38:07<17:11,  1.80it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0274, train/loss_vlb_step=0.000103, train/loss_step=0.0274, global_step=973.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  69%|██████▉   | 4116/5971 [38:09<17:11,  1.80it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0539, train/loss_vlb_step=0.000191, train/loss_step=0.0539, global_step=973.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  69%|██████▉   | 4117/5971 [38:10<17:11,  1.80it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00507, train/loss_vlb_step=2.57e-5, train/loss_step=0.00507, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  69%|██████▉   | 4118/5971 [38:11<17:10,  1.80it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00507, train/loss_vlb_step=2.57e-5, train/loss_step=0.00507, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  69%|██████▉   | 4118/5971 [38:11<17:10,  1.80it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00398, train/loss_vlb_step=2.16e-5, train/loss_step=0.00398, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  69%|██████▉   | 4119/5971 [38:12<17:10,  1.80it/s, loss=0.153, v_num=0, train/loss_simple_step=0.556, train/loss_vlb_step=0.00708, train/loss_step=0.556, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  69%|██████▉   | 4120/5971 [38:14<17:10,  1.80it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:36,  1.71it/s]
Epoch 1:  69%|██████▉   | 4122/5971 [38:15<17:09,  1.80it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   1%|          | 2/167 [00:00<00:52,  3.13it/s]

Validating:   3%|▎         | 5/167 [00:00<00:19,  8.16it/s]
Epoch 1:  69%|██████▉   | 4126/5971 [38:15<17:06,  1.80it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   5%|▍         | 8/167 [00:00<00:12, 12.86it/s]
Epoch 1:  69%|██████▉   | 4130/5971 [38:15<17:03,  1.80it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   7%|▋         | 11/167 [00:01<00:09, 16.42it/s]
Epoch 1:  69%|██████▉   | 4134/5971 [38:15<16:59,  1.80it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.29it/s]

Validating:  10%|█         | 17/167 [00:01<00:07, 21.06it/s]
Epoch 1:  69%|██████▉   | 4138/5971 [38:15<16:56,  1.80it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 23.84it/s]
Epoch 1:  69%|██████▉   | 4142/5971 [38:15<16:53,  1.80it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  14%|█▍        | 24/167 [00:01<00:05, 25.34it/s]
Epoch 1:  69%|██████▉   | 4146/5971 [38:16<16:50,  1.81it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 25.48it/s]
Epoch 1:  70%|██████▉   | 4150/5971 [38:16<16:47,  1.81it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 26.36it/s]

Validating:  20%|█▉        | 33/167 [00:01<00:04, 26.88it/s]
Epoch 1:  70%|██████▉   | 4154/5971 [38:16<16:44,  1.81it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  22%|██▏       | 36/167 [00:01<00:04, 27.63it/s]
Epoch 1:  70%|██████▉   | 4158/5971 [38:16<16:41,  1.81it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  23%|██▎       | 39/167 [00:02<00:04, 27.02it/s]
Epoch 1:  70%|██████▉   | 4162/5971 [38:16<16:37,  1.81it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 27.38it/s]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 27.98it/s]
Epoch 1:  70%|██████▉   | 4166/5971 [38:16<16:34,  1.81it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 27.88it/s]
Epoch 1:  70%|██████▉   | 4170/5971 [38:16<16:31,  1.82it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  31%|███       | 51/167 [00:02<00:04, 28.23it/s]
Epoch 1:  70%|██████▉   | 4174/5971 [38:17<16:28,  1.82it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  32%|███▏      | 54/167 [00:02<00:03, 28.30it/s]

Validating:  34%|███▍      | 57/167 [00:02<00:03, 27.96it/s]
Epoch 1:  70%|██████▉   | 4178/5971 [38:17<16:25,  1.82it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  36%|███▌      | 60/167 [00:02<00:04, 26.60it/s]
Epoch 1:  70%|███████   | 4182/5971 [38:17<16:22,  1.82it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  38%|███▊      | 63/167 [00:02<00:03, 27.19it/s]
Epoch 1:  70%|███████   | 4186/5971 [38:17<16:19,  1.82it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  40%|███▉      | 66/167 [00:03<00:03, 27.77it/s]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 26.56it/s]
Epoch 1:  70%|███████   | 4190/5971 [38:17<16:16,  1.82it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 27.46it/s]
Epoch 1:  70%|███████   | 4194/5971 [38:17<16:13,  1.83it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 26.15it/s]
Epoch 1:  70%|███████   | 4198/5971 [38:17<16:10,  1.83it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 26.59it/s]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 26.93it/s]
Epoch 1:  70%|███████   | 4202/5971 [38:18<16:07,  1.83it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  51%|█████     | 85/167 [00:03<00:02, 27.68it/s]
Epoch 1:  70%|███████   | 4206/5971 [38:18<16:04,  1.83it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  53%|█████▎    | 88/167 [00:03<00:02, 26.41it/s]
Epoch 1:  71%|███████   | 4210/5971 [38:18<16:01,  1.83it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  54%|█████▍    | 91/167 [00:04<00:02, 25.77it/s]
Epoch 1:  71%|███████   | 4214/5971 [38:18<15:58,  1.83it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 25.62it/s]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 24.91it/s]
Epoch 1:  71%|███████   | 4218/5971 [38:18<15:55,  1.84it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 25.56it/s]
Epoch 1:  71%|███████   | 4222/5971 [38:18<15:52,  1.84it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 26.18it/s]
Epoch 1:  71%|███████   | 4226/5971 [38:19<15:49,  1.84it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 26.91it/s]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 27.44it/s]
Epoch 1:  71%|███████   | 4230/5971 [38:19<15:46,  1.84it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  67%|██████▋   | 112/167 [00:04<00:01, 27.81it/s]
Epoch 1:  71%|███████   | 4234/5971 [38:19<15:43,  1.84it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  69%|██████▉   | 115/167 [00:04<00:01, 28.03it/s]
Epoch 1:  71%|███████   | 4238/5971 [38:19<15:40,  1.84it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  71%|███████   | 118/167 [00:04<00:01, 28.24it/s]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 27.57it/s]
Epoch 1:  71%|███████   | 4242/5971 [38:19<15:37,  1.85it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 23.83it/s]
Epoch 1:  71%|███████   | 4246/5971 [38:19<15:34,  1.85it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 24.84it/s]
Epoch 1:  71%|███████   | 4250/5971 [38:19<15:31,  1.85it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 24.84it/s]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 25.68it/s]
Epoch 1:  71%|███████   | 4254/5971 [38:20<15:28,  1.85it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 26.33it/s]
Epoch 1:  71%|███████▏  | 4258/5971 [38:20<15:25,  1.85it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 27.03it/s]
Epoch 1:  71%|███████▏  | 4262/5971 [38:20<15:22,  1.85it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  86%|████████▌ | 143/167 [00:05<00:00, 26.59it/s]
Epoch 1:  71%|███████▏  | 4266/5971 [38:20<15:19,  1.85it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 26.93it/s]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 25.91it/s]
Epoch 1:  72%|███████▏  | 4270/5971 [38:20<15:16,  1.86it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 26.38it/s]
Epoch 1:  72%|███████▏  | 4274/5971 [38:20<15:13,  1.86it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 25.93it/s]
Epoch 1:  72%|███████▏  | 4278/5971 [38:21<15:10,  1.86it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 25.85it/s]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 26.06it/s]
Epoch 1:  72%|███████▏  | 4282/5971 [38:21<15:07,  1.86it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 25.14it/s]
Epoch 1:  72%|███████▏  | 4286/5971 [38:21<15:04,  1.86it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4288/5971 [38:21<15:03,  1.86it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000211, train/loss_step=0.0626, global_step=974.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Epoch 1:  72%|███████▏  | 4289/5971 [38:22<15:02,  1.86it/s, loss=0.158, v_num=0, train/loss_simple_step=0.554, train/loss_vlb_step=0.0044, train/loss_step=0.554, global_step=975.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  72%|███████▏  | 4290/5971 [38:23<15:02,  1.86it/s, loss=0.158, v_num=0, train/loss_simple_step=0.554, train/loss_vlb_step=0.0044, train/loss_step=0.554, global_step=975.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4290/5971 [38:23<15:02,  1.86it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0065, train/loss_vlb_step=3.2e-5, train/loss_step=0.0065, global_step=975.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4291/5971 [38:24<15:02,  1.86it/s, loss=0.166, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000633, train/loss_step=0.180, global_step=975.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4292/5971 [38:26<15:02,  1.86it/s, loss=0.172, v_num=0, train/loss_simple_step=0.608, train/loss_vlb_step=0.00662, train/loss_step=0.608, global_step=975.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  72%|███████▏  | 4293/5971 [38:27<15:01,  1.86it/s, loss=0.181, v_num=0, train/loss_simple_step=0.429, train/loss_vlb_step=0.00335, train/loss_step=0.429, global_step=976.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4294/5971 [38:28<15:01,  1.86it/s, loss=0.181, v_num=0, train/loss_simple_step=0.429, train/loss_vlb_step=0.00335, train/loss_step=0.429, global_step=976.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4294/5971 [38:28<15:01,  1.86it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00256, train/loss_vlb_step=1.55e-5, train/loss_step=0.00256, global_step=976.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4295/5971 [38:29<15:00,  1.86it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0707, train/loss_vlb_step=0.000242, train/loss_step=0.0707, global_step=976.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  72%|███████▏  | 4296/5971 [38:31<15:01,  1.86it/s, loss=0.205, v_num=0, train/loss_simple_step=0.571, train/loss_vlb_step=0.00421, train/loss_step=0.571, global_step=976.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  72%|███████▏  | 4297/5971 [38:32<15:00,  1.86it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0448, train/loss_vlb_step=0.00016, train/loss_step=0.0448, global_step=977.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4298/5971 [38:33<15:00,  1.86it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0448, train/loss_vlb_step=0.00016, train/loss_step=0.0448, global_step=977.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4298/5971 [38:33<15:00,  1.86it/s, loss=0.223, v_num=0, train/loss_simple_step=0.447, train/loss_vlb_step=0.00445, train/loss_step=0.447, global_step=977.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  72%|███████▏  | 4299/5971 [38:34<14:59,  1.86it/s, loss=0.225, v_num=0, train/loss_simple_step=0.0907, train/loss_vlb_step=0.000306, train/loss_step=0.0907, global_step=977.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4300/5971 [38:36<14:59,  1.86it/s, loss=0.227, v_num=0, train/loss_simple_step=0.0592, train/loss_vlb_step=0.000206, train/loss_step=0.0592, global_step=977.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4301/5971 [38:37<14:59,  1.86it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0283, train/loss_vlb_step=0.00011, train/loss_step=0.0283, global_step=978.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  72%|███████▏  | 4302/5971 [38:38<14:59,  1.86it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0283, train/loss_vlb_step=0.00011, train/loss_step=0.0283, global_step=978.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4302/5971 [38:38<14:59,  1.86it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0411, train/loss_vlb_step=0.000151, train/loss_step=0.0411, global_step=978.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4303/5971 [38:39<14:58,  1.86it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0338, train/loss_vlb_step=0.000128, train/loss_step=0.0338, global_step=978.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4304/5971 [38:41<14:58,  1.85it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0426, train/loss_vlb_step=0.00016, train/loss_step=0.0426, global_step=978.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  72%|███████▏  | 4305/5971 [38:42<14:58,  1.85it/s, loss=0.207, v_num=0, train/loss_simple_step=0.304, train/loss_vlb_step=0.00129, train/loss_step=0.304, global_step=979.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  72%|███████▏  | 4306/5971 [38:43<14:58,  1.85it/s, loss=0.207, v_num=0, train/loss_simple_step=0.304, train/loss_vlb_step=0.00129, train/loss_step=0.304, global_step=979.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4306/5971 [38:43<14:58,  1.85it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0895, train/loss_vlb_step=0.000294, train/loss_step=0.0895, global_step=979.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4307/5971 [38:44<14:57,  1.85it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0629, train/loss_vlb_step=0.000217, train/loss_step=0.0629, global_step=979.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4308/5971 [38:46<14:57,  1.85it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0914, train/loss_vlb_step=0.000303, train/loss_step=0.0914, global_step=979.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4309/5971 [38:47<14:57,  1.85it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00318, train/loss_vlb_step=1.76e-5, train/loss_step=0.00318, global_step=980.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4310/5971 [38:48<14:56,  1.85it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00318, train/loss_vlb_step=1.76e-5, train/loss_step=0.00318, global_step=980.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4310/5971 [38:48<14:56,  1.85it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0022, train/loss_vlb_step=1.25e-5, train/loss_step=0.0022, global_step=980.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  72%|███████▏  | 4311/5971 [38:48<14:56,  1.85it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0587, train/loss_vlb_step=0.000207, train/loss_step=0.0587, global_step=980.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4312/5971 [38:51<14:56,  1.85it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00169, train/loss_vlb_step=1.03e-5, train/loss_step=0.00169, global_step=980.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4313/5971 [38:52<14:56,  1.85it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00489, train/loss_vlb_step=2.46e-5, train/loss_step=0.00489, global_step=981.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4314/5971 [38:52<14:55,  1.85it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00489, train/loss_vlb_step=2.46e-5, train/loss_step=0.00489, global_step=981.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4314/5971 [38:52<14:55,  1.85it/s, loss=0.111, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000605, train/loss_step=0.165, global_step=981.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  72%|███████▏  | 4315/5971 [38:53<14:55,  1.85it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0703, train/loss_vlb_step=0.000241, train/loss_step=0.0703, global_step=981.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4316/5971 [38:56<14:55,  1.85it/s, loss=0.0823, v_num=0, train/loss_simple_step=0.00404, train/loss_vlb_step=2.13e-5, train/loss_step=0.00404, global_step=981.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4317/5971 [38:56<14:55,  1.85it/s, loss=0.0961, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.00133, train/loss_step=0.322, global_step=982.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  72%|███████▏  | 4318/5971 [38:57<14:54,  1.85it/s, loss=0.0961, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.00133, train/loss_step=0.322, global_step=982.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4318/5971 [38:57<14:54,  1.85it/s, loss=0.0788, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000327, train/loss_step=0.0991, global_step=982.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4319/5971 [38:58<14:54,  1.85it/s, loss=0.0743, v_num=0, train/loss_simple_step=0.00165, train/loss_vlb_step=9.9e-6, train/loss_step=0.00165, global_step=982.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4320/5971 [39:00<14:54,  1.85it/s, loss=0.0797, v_num=0, train/loss_simple_step=0.167, train/loss_vlb_step=0.000559, train/loss_step=0.167, global_step=982.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  72%|███████▏  | 4321/5971 [39:01<14:53,  1.85it/s, loss=0.0824, v_num=0, train/loss_simple_step=0.0836, train/loss_vlb_step=0.00028, train/loss_step=0.0836, global_step=983.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4322/5971 [39:02<14:53,  1.85it/s, loss=0.0824, v_num=0, train/loss_simple_step=0.0836, train/loss_vlb_step=0.00028, train/loss_step=0.0836, global_step=983.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4322/5971 [39:02<14:53,  1.85it/s, loss=0.0916, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000838, train/loss_step=0.225, global_step=983.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  72%|███████▏  | 4323/5971 [39:03<14:53,  1.85it/s, loss=0.0932, v_num=0, train/loss_simple_step=0.0643, train/loss_vlb_step=0.000227, train/loss_step=0.0643, global_step=983.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4324/5971 [39:05<14:53,  1.84it/s, loss=0.0923, v_num=0, train/loss_simple_step=0.0254, train/loss_vlb_step=9.73e-5, train/loss_step=0.0254, global_step=983.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  72%|███████▏  | 4325/5971 [39:06<14:52,  1.84it/s, loss=0.0919, v_num=0, train/loss_simple_step=0.297, train/loss_vlb_step=0.00153, train/loss_step=0.297, global_step=984.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  72%|███████▏  | 4326/5971 [39:07<14:52,  1.84it/s, loss=0.0919, v_num=0, train/loss_simple_step=0.297, train/loss_vlb_step=0.00153, train/loss_step=0.297, global_step=984.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4326/5971 [39:07<14:52,  1.84it/s, loss=0.116, v_num=0, train/loss_simple_step=0.561, train/loss_vlb_step=0.0083, train/loss_step=0.561, global_step=984.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  72%|███████▏  | 4327/5971 [39:08<14:52,  1.84it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00891, train/loss_vlb_step=4.06e-5, train/loss_step=0.00891, global_step=984.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  72%|███████▏  | 4328/5971 [39:10<14:52,  1.84it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00843, train/loss_vlb_step=4.23e-5, train/loss_step=0.00843, global_step=984.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4329/5971 [39:11<14:51,  1.84it/s, loss=0.122, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.00119, train/loss_step=0.262, global_step=985.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  73%|███████▎  | 4330/5971 [39:12<14:51,  1.84it/s, loss=0.122, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.00119, train/loss_step=0.262, global_step=985.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4330/5971 [39:12<14:51,  1.84it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00709, train/loss_vlb_step=3.36e-5, train/loss_step=0.00709, global_step=985.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4331/5971 [39:13<14:50,  1.84it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0033, train/loss_vlb_step=1.84e-5, train/loss_step=0.0033, global_step=985.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  73%|███████▎  | 4332/5971 [39:15<14:51,  1.84it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0223, train/loss_vlb_step=9.13e-5, train/loss_step=0.0223, global_step=985.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  73%|███████▎  | 4333/5971 [39:16<14:50,  1.84it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.41e-5, train/loss_step=0.0122, global_step=986.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4334/5971 [39:17<14:50,  1.84it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.41e-5, train/loss_step=0.0122, global_step=986.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4334/5971 [39:17<14:50,  1.84it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0148, train/loss_vlb_step=6.25e-5, train/loss_step=0.0148, global_step=986.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4335/5971 [39:18<14:49,  1.84it/s, loss=0.11, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.38e-5, train/loss_step=0.018, global_step=986.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  73%|███████▎  | 4336/5971 [39:20<14:49,  1.84it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=5.07e-5, train/loss_step=0.0116, global_step=986.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4337/5971 [39:21<14:49,  1.84it/s, loss=0.095, v_num=0, train/loss_simple_step=0.00698, train/loss_vlb_step=3.38e-5, train/loss_step=0.00698, global_step=987.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4338/5971 [39:22<14:49,  1.84it/s, loss=0.095, v_num=0, train/loss_simple_step=0.00698, train/loss_vlb_step=3.38e-5, train/loss_step=0.00698, global_step=987.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4338/5971 [39:22<14:49,  1.84it/s, loss=0.0902, v_num=0, train/loss_simple_step=0.00333, train/loss_vlb_step=1.9e-5, train/loss_step=0.00333, global_step=987.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4339/5971 [39:23<14:48,  1.84it/s, loss=0.0935, v_num=0, train/loss_simple_step=0.0673, train/loss_vlb_step=0.000228, train/loss_step=0.0673, global_step=987.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4340/5971 [39:25<14:48,  1.84it/s, loss=0.131, v_num=0, train/loss_simple_step=0.909, train/loss_vlb_step=0.457, train/loss_step=0.909, global_step=987.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]      
Epoch 1:  73%|███████▎  | 4341/5971 [39:26<14:48,  1.83it/s, loss=0.14, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.000971, train/loss_step=0.266, global_step=988.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4342/5971 [39:27<14:47,  1.83it/s, loss=0.14, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.000971, train/loss_step=0.266, global_step=988.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4342/5971 [39:27<14:47,  1.83it/s, loss=0.155, v_num=0, train/loss_simple_step=0.530, train/loss_vlb_step=0.00543, train/loss_step=0.530, global_step=988.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4343/5971 [39:28<14:47,  1.83it/s, loss=0.163, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000737, train/loss_step=0.217, global_step=988.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4344/5971 [39:30<14:47,  1.83it/s, loss=0.17, v_num=0, train/loss_simple_step=0.169, train/loss_vlb_step=0.000581, train/loss_step=0.169, global_step=988.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  73%|███████▎  | 4345/5971 [39:31<14:47,  1.83it/s, loss=0.16, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=989.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4346/5971 [39:32<14:46,  1.83it/s, loss=0.16, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=989.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4346/5971 [39:32<14:46,  1.83it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00239, train/loss_vlb_step=1.41e-5, train/loss_step=0.00239, global_step=989.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4347/5971 [39:33<14:46,  1.83it/s, loss=0.146, v_num=0, train/loss_simple_step=0.286, train/loss_vlb_step=0.0014, train/loss_step=0.286, global_step=989.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  73%|███████▎  | 4348/5971 [39:36<14:46,  1.83it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00755, train/loss_vlb_step=3.49e-5, train/loss_step=0.00755, global_step=989.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4349/5971 [39:36<14:46,  1.83it/s, loss=0.158, v_num=0, train/loss_simple_step=0.511, train/loss_vlb_step=0.00665, train/loss_step=0.511, global_step=990.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  73%|███████▎  | 4350/5971 [39:37<14:45,  1.83it/s, loss=0.158, v_num=0, train/loss_simple_step=0.511, train/loss_vlb_step=0.00665, train/loss_step=0.511, global_step=990.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4350/5971 [39:37<14:45,  1.83it/s, loss=0.195, v_num=0, train/loss_simple_step=0.735, train/loss_vlb_step=0.0165, train/loss_step=0.735, global_step=990.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  73%|███████▎  | 4351/5971 [39:38<14:45,  1.83it/s, loss=0.203, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000533, train/loss_step=0.161, global_step=990.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4352/5971 [39:41<14:45,  1.83it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.33e-5, train/loss_step=0.0147, global_step=990.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4353/5971 [39:42<14:45,  1.83it/s, loss=0.202, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.54e-5, train/loss_step=0.00487, global_step=991.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4354/5971 [39:43<14:44,  1.83it/s, loss=0.202, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.54e-5, train/loss_step=0.00487, global_step=991.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4354/5971 [39:43<14:44,  1.83it/s, loss=0.212, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000816, train/loss_step=0.211, global_step=991.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  73%|███████▎  | 4355/5971 [39:43<14:44,  1.83it/s, loss=0.225, v_num=0, train/loss_simple_step=0.281, train/loss_vlb_step=0.00114, train/loss_step=0.281, global_step=991.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  73%|███████▎  | 4356/5971 [39:46<14:44,  1.83it/s, loss=0.247, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.0029, train/loss_step=0.454, global_step=991.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  73%|███████▎  | 4357/5971 [39:47<14:44,  1.83it/s, loss=0.255, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000533, train/loss_step=0.162, global_step=992.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4358/5971 [39:48<14:43,  1.83it/s, loss=0.255, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000533, train/loss_step=0.162, global_step=992.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4358/5971 [39:48<14:43,  1.83it/s, loss=0.263, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000552, train/loss_step=0.163, global_step=992.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4359/5971 [39:49<14:43,  1.83it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0838, train/loss_vlb_step=0.000279, train/loss_step=0.0838, global_step=992.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4360/5971 [39:51<14:43,  1.82it/s, loss=0.234, v_num=0, train/loss_simple_step=0.315, train/loss_vlb_step=0.00161, train/loss_step=0.315, global_step=992.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  73%|███████▎  | 4361/5971 [39:52<14:43,  1.82it/s, loss=0.25, v_num=0, train/loss_simple_step=0.580, train/loss_vlb_step=0.0113, train/loss_step=0.580, global_step=993.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  73%|███████▎  | 4362/5971 [39:53<14:42,  1.82it/s, loss=0.25, v_num=0, train/loss_simple_step=0.580, train/loss_vlb_step=0.0113, train/loss_step=0.580, global_step=993.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4362/5971 [39:53<14:42,  1.82it/s, loss=0.224, v_num=0, train/loss_simple_step=0.00903, train/loss_vlb_step=4.36e-5, train/loss_step=0.00903, global_step=993.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4363/5971 [39:54<14:42,  1.82it/s, loss=0.248, v_num=0, train/loss_simple_step=0.709, train/loss_vlb_step=0.0189, train/loss_step=0.709, global_step=993.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  73%|███████▎  | 4364/5971 [39:57<14:42,  1.82it/s, loss=0.241, v_num=0, train/loss_simple_step=0.0213, train/loss_vlb_step=8.62e-5, train/loss_step=0.0213, global_step=993.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4365/5971 [39:57<14:42,  1.82it/s, loss=0.236, v_num=0, train/loss_simple_step=0.010, train/loss_vlb_step=4.62e-5, train/loss_step=0.010, global_step=994.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  73%|███████▎  | 4366/5971 [39:58<14:41,  1.82it/s, loss=0.236, v_num=0, train/loss_simple_step=0.010, train/loss_vlb_step=4.62e-5, train/loss_step=0.010, global_step=994.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4366/5971 [39:58<14:41,  1.82it/s, loss=0.236, v_num=0, train/loss_simple_step=0.00632, train/loss_vlb_step=3.23e-5, train/loss_step=0.00632, global_step=994.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4367/5971 [39:59<14:41,  1.82it/s, loss=0.223, v_num=0, train/loss_simple_step=0.0262, train/loss_vlb_step=0.000101, train/loss_step=0.0262, global_step=994.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  73%|███████▎  | 4368/5971 [40:01<14:41,  1.82it/s, loss=0.225, v_num=0, train/loss_simple_step=0.0309, train/loss_vlb_step=0.000125, train/loss_step=0.0309, global_step=994.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4369/5971 [40:02<14:40,  1.82it/s, loss=0.226, v_num=0, train/loss_simple_step=0.544, train/loss_vlb_step=0.00574, train/loss_step=0.544, global_step=995.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  73%|███████▎  | 4370/5971 [40:03<14:40,  1.82it/s, loss=0.226, v_num=0, train/loss_simple_step=0.544, train/loss_vlb_step=0.00574, train/loss_step=0.544, global_step=995.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4370/5971 [40:03<14:40,  1.82it/s, loss=0.201, v_num=0, train/loss_simple_step=0.241, train/loss_vlb_step=0.00094, train/loss_step=0.241, global_step=995.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4371/5971 [40:04<14:39,  1.82it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0849, train/loss_vlb_step=0.000285, train/loss_step=0.0849, global_step=995.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4372/5971 [40:06<14:40,  1.82it/s, loss=0.203, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000415, train/loss_step=0.125, global_step=995.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  73%|███████▎  | 4373/5971 [40:07<14:39,  1.82it/s, loss=0.203, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.11e-5, train/loss_step=0.0115, global_step=996.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4374/5971 [40:08<14:39,  1.82it/s, loss=0.203, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.11e-5, train/loss_step=0.0115, global_step=996.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4374/5971 [40:08<14:39,  1.82it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0978, train/loss_vlb_step=0.000322, train/loss_step=0.0978, global_step=996.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4375/5971 [40:09<14:38,  1.82it/s, loss=0.199, v_num=0, train/loss_simple_step=0.306, train/loss_vlb_step=0.00121, train/loss_step=0.306, global_step=996.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  73%|███████▎  | 4376/5971 [40:11<14:38,  1.81it/s, loss=0.189, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.00128, train/loss_step=0.255, global_step=996.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4377/5971 [40:12<14:38,  1.81it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0866, train/loss_vlb_step=0.000284, train/loss_step=0.0866, global_step=997.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4378/5971 [40:13<14:37,  1.81it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0866, train/loss_vlb_step=0.000284, train/loss_step=0.0866, global_step=997.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4378/5971 [40:13<14:37,  1.81it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0897, train/loss_vlb_step=0.000297, train/loss_step=0.0897, global_step=997.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4379/5971 [40:14<14:37,  1.81it/s, loss=0.186, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000591, train/loss_step=0.172, global_step=997.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  73%|███████▎  | 4380/5971 [40:16<14:37,  1.81it/s, loss=0.177, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000477, train/loss_step=0.142, global_step=997.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4381/5971 [40:17<14:37,  1.81it/s, loss=0.159, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000867, train/loss_step=0.222, global_step=998.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4382/5971 [40:18<14:36,  1.81it/s, loss=0.159, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000867, train/loss_step=0.222, global_step=998.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4382/5971 [40:18<14:36,  1.81it/s, loss=0.203, v_num=0, train/loss_simple_step=0.875, train/loss_vlb_step=0.0641, train/loss_step=0.875, global_step=998.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  73%|███████▎  | 4383/5971 [40:19<14:36,  1.81it/s, loss=0.186, v_num=0, train/loss_simple_step=0.380, train/loss_vlb_step=0.00185, train/loss_step=0.380, global_step=998.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4384/5971 [40:21<14:36,  1.81it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0401, train/loss_vlb_step=0.000143, train/loss_step=0.0401, global_step=998.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4385/5971 [40:22<14:35,  1.81it/s, loss=0.203, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.00172, train/loss_step=0.322, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  73%|███████▎  | 4385/5971 [40:33<14:39,  1.80it/s, loss=0.203, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.00172, train/loss_step=0.322, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4386/5971 [41:11<14:52,  1.78it/s, loss=0.203, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.00172, train/loss_step=0.322, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4386/5971 [41:11<14:52,  1.78it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0425, train/loss_vlb_step=0.000146, train/loss_step=0.0425, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4387/5971 [41:12<14:52,  1.77it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0425, train/loss_vlb_step=0.000146, train/loss_step=0.0425, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4387/5971 [41:12<14:52,  1.77it/s, loss=0.204, v_num=0, train/loss_simple_step=0.00818, train/loss_vlb_step=3.84e-5, train/loss_step=0.00818, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4388/5971 [41:14<14:52,  1.77it/s, loss=0.204, v_num=0, train/loss_simple_step=0.00818, train/loss_vlb_step=3.84e-5, train/loss_step=0.00818, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  73%|███████▎  | 4388/5971 [41:14<14:52,  1.77it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:11,  2.32it/s][A
Epoch 1:  74%|███████▎  | 4390/5971 [41:14<14:51,  1.77it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   1%|          | 2/167 [00:01<01:50,  1.49it/s][A
Epoch 1:  74%|███████▎  | 4392/5971 [41:15<14:49,  1.77it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   3%|▎         | 5/167 [00:01<00:35,  4.62it/s][A
Epoch 1:  74%|███████▎  | 4395/5971 [41:15<14:47,  1.78it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   5%|▍         | 8/167 [00:01<00:20,  7.67it/s][A
Epoch 1:  74%|███████▎  | 4398/5971 [41:15<14:45,  1.78it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   7%|▋         | 11/167 [00:01<00:14, 10.96it/s][A
Epoch 1:  74%|███████▎  | 4401/5971 [41:15<14:43,  1.78it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   8%|▊         | 14/167 [00:01<00:11, 13.72it/s][A
Epoch 1:  74%|███████▍  | 4404/5971 [41:16<14:40,  1.78it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  11%|█         | 18/167 [00:01<00:08, 17.67it/s][A
Epoch 1:  74%|███████▍  | 4407/5971 [41:16<14:38,  1.78it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  13%|█▎        | 21/167 [00:02<00:07, 19.79it/s][A
Epoch 1:  74%|███████▍  | 4410/5971 [41:16<14:36,  1.78it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  14%|█▍        | 24/167 [00:02<00:06, 20.81it/s][A
Epoch 1:  74%|███████▍  | 4413/5971 [41:16<14:34,  1.78it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  16%|█▌        | 27/167 [00:02<00:06, 22.36it/s][A
Epoch 1:  74%|███████▍  | 4416/5971 [41:16<14:31,  1.78it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  18%|█▊        | 30/167 [00:02<00:05, 23.37it/s][A
Epoch 1:  74%|███████▍  | 4419/5971 [41:16<14:29,  1.78it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  20%|█▉        | 33/167 [00:02<00:05, 24.78it/s][A
Epoch 1:  74%|███████▍  | 4422/5971 [41:16<14:27,  1.79it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  22%|██▏       | 36/167 [00:02<00:05, 25.13it/s][A
Epoch 1:  74%|███████▍  | 4425/5971 [41:16<14:25,  1.79it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  24%|██▍       | 40/167 [00:02<00:04, 26.55it/s][A
Epoch 1:  74%|███████▍  | 4429/5971 [41:17<14:22,  1.79it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  26%|██▌       | 43/167 [00:02<00:04, 26.20it/s][A
Epoch 1:  74%|███████▍  | 4433/5971 [41:17<14:19,  1.79it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  28%|██▊       | 46/167 [00:02<00:04, 26.41it/s][A
Epoch 1:  74%|███████▍  | 4437/5971 [41:17<14:16,  1.79it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  29%|██▉       | 49/167 [00:03<00:04, 26.14it/s][A

Validating:  31%|███       | 52/167 [00:03<00:04, 25.00it/s][A
Epoch 1:  74%|███████▍  | 4441/5971 [41:17<14:13,  1.79it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  33%|███▎      | 55/167 [00:03<00:04, 24.33it/s][A
Epoch 1:  74%|███████▍  | 4445/5971 [41:17<14:10,  1.79it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  35%|███▍      | 58/167 [00:03<00:04, 24.55it/s][A
Epoch 1:  75%|███████▍  | 4449/5971 [41:17<14:07,  1.80it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  37%|███▋      | 61/167 [00:03<00:04, 25.27it/s][A

Validating:  38%|███▊      | 64/167 [00:03<00:04, 25.32it/s][A
Epoch 1:  75%|███████▍  | 4453/5971 [41:17<14:04,  1.80it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  40%|████      | 67/167 [00:03<00:03, 25.36it/s][A
Epoch 1:  75%|███████▍  | 4457/5971 [41:18<14:01,  1.80it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 25.74it/s][A
Epoch 1:  75%|███████▍  | 4461/5971 [41:18<13:58,  1.80it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  44%|████▎     | 73/167 [00:04<00:03, 25.84it/s][A

Validating:  46%|████▌     | 76/167 [00:04<00:03, 26.71it/s][A
Epoch 1:  75%|███████▍  | 4465/5971 [41:18<13:55,  1.80it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  47%|████▋     | 79/167 [00:04<00:03, 27.44it/s][A
Epoch 1:  75%|███████▍  | 4469/5971 [41:18<13:52,  1.80it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  49%|████▉     | 82/167 [00:04<00:03, 27.24it/s][A
Epoch 1:  75%|███████▍  | 4473/5971 [41:18<13:49,  1.80it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  51%|█████     | 85/167 [00:04<00:03, 27.11it/s][A

Validating:  53%|█████▎    | 88/167 [00:04<00:02, 26.89it/s][A
Epoch 1:  75%|███████▍  | 4477/5971 [41:18<13:47,  1.81it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  54%|█████▍    | 91/167 [00:04<00:02, 27.25it/s][A
Epoch 1:  75%|███████▌  | 4481/5971 [41:19<13:44,  1.81it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 26.59it/s][A
Epoch 1:  75%|███████▌  | 4485/5971 [41:19<13:41,  1.81it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 26.30it/s][A

Validating:  60%|█████▉    | 100/167 [00:05<00:02, 25.41it/s][A
Epoch 1:  75%|███████▌  | 4489/5971 [41:19<13:38,  1.81it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  62%|██████▏   | 103/167 [00:05<00:02, 25.11it/s][A
Epoch 1:  75%|███████▌  | 4493/5971 [41:19<13:35,  1.81it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  63%|██████▎   | 106/167 [00:05<00:02, 24.85it/s][A
Epoch 1:  75%|███████▌  | 4497/5971 [41:19<13:32,  1.81it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  65%|██████▌   | 109/167 [00:05<00:02, 25.29it/s][A

Validating:  67%|██████▋   | 112/167 [00:05<00:02, 24.09it/s][A
Epoch 1:  75%|███████▌  | 4501/5971 [41:19<13:29,  1.82it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  69%|██████▉   | 115/167 [00:05<00:02, 24.66it/s][A
Epoch 1:  75%|███████▌  | 4505/5971 [41:19<13:26,  1.82it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  71%|███████   | 118/167 [00:05<00:01, 25.29it/s][A
Epoch 1:  76%|███████▌  | 4509/5971 [41:20<13:23,  1.82it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 24.43it/s][A

Validating:  74%|███████▍  | 124/167 [00:06<00:01, 24.57it/s][A
Epoch 1:  76%|███████▌  | 4513/5971 [41:20<13:21,  1.82it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  76%|███████▌  | 127/167 [00:06<00:01, 25.95it/s][A
Epoch 1:  76%|███████▌  | 4517/5971 [41:20<13:18,  1.82it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  78%|███████▊  | 130/167 [00:06<00:01, 25.79it/s][A
Epoch 1:  76%|███████▌  | 4521/5971 [41:20<13:15,  1.82it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  80%|███████▉  | 133/167 [00:06<00:01, 25.31it/s][A

Validating:  81%|████████▏ | 136/167 [00:06<00:01, 25.43it/s][A
Epoch 1:  76%|███████▌  | 4525/5971 [41:20<13:12,  1.82it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  83%|████████▎ | 139/167 [00:06<00:01, 25.82it/s][A
Epoch 1:  76%|███████▌  | 4529/5971 [41:20<13:09,  1.83it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  85%|████████▌ | 142/167 [00:06<00:00, 25.55it/s][A
Epoch 1:  76%|███████▌  | 4533/5971 [41:21<13:06,  1.83it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 25.87it/s][A

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 25.43it/s][A
Epoch 1:  76%|███████▌  | 4537/5971 [41:21<13:04,  1.83it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  90%|█████████ | 151/167 [00:07<00:00, 26.08it/s][A
Epoch 1:  76%|███████▌  | 4541/5971 [41:21<13:01,  1.83it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  92%|█████████▏| 154/167 [00:07<00:00, 24.53it/s][A
Epoch 1:  76%|███████▌  | 4545/5971 [41:21<12:58,  1.83it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  94%|█████████▍| 157/167 [00:07<00:00, 23.80it/s][A

Validating:  96%|█████████▌| 160/167 [00:07<00:00, 24.83it/s][A
Epoch 1:  76%|███████▌  | 4549/5971 [41:21<12:55,  1.83it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  98%|█████████▊| 163/167 [00:07<00:00, 25.98it/s][A
Epoch 1:  76%|███████▋  | 4553/5971 [41:21<12:52,  1.83it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  99%|█████████▉| 166/167 [00:07<00:00, 25.76it/s][A
Epoch 1:  76%|███████▋  | 4556/5971 [41:22<12:50,  1.84it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:33,  1.47it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:18,  2.60it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.31it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.86it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.22it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.49it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.79it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.03it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.21it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.33it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.41it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.50it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.56it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.48it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.31it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.32it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.30it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:06,  5.30it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.28it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.16it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.16it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.27it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.36it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.43it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.49it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.33it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.28it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.18it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:04,  5.11it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.16it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.19it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.22it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.21it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:03,  5.29it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.29it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.23it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.22it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.19it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.17it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.22it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.20it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.26it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.19it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.21it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:00,  5.23it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.25it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.28it/s][A
Epoch 1:  76%|███████▋  | 4556/5971 [41:33<12:54,  1.83it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.28it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.16it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.06it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.01it/s]

Epoch 1:  76%|███████▋  | 4557/5971 [41:34<12:53,  1.83it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000117, train/loss_step=0.0305, global_step=999.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  76%|███████▋  | 4557/5971 [41:34<12:53,  1.83it/s, loss=0.183, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000454, train/loss_step=0.132, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.36it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.46it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.30it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.95it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.34it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.65it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.84it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.96it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.01it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.09it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  4.97it/s]

Epoch 1:  76%|███████▋  | 4558/5971 [41:47<12:57,  1.82it/s, loss=0.183, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000454, train/loss_step=0.132, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  76%|███████▋  | 4558/5971 [41:47<12:57,  1.82it/s, loss=0.187, v_num=0, train/loss_simple_step=0.311, train/loss_vlb_step=0.00142, train/loss_step=0.311, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.04it/s]

Epoch 1:  76%|███████▋  | 4559/5971 [41:59<13:00,  1.81it/s, loss=0.187, v_num=0, train/loss_simple_step=0.311, train/loss_vlb_step=0.00142, train/loss_step=0.311, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  76%|███████▋  | 4559/5971 [41:59<13:00,  1.81it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000281, train/loss_step=0.0854, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.17it/s]

Epoch 1:  76%|███████▋  | 4560/5971 [42:13<13:03,  1.80it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000281, train/loss_step=0.0854, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  76%|███████▋  | 4560/5971 [42:13<13:03,  1.80it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.57e-5, train/loss_step=0.0128, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  76%|███████▋  | 4561/5971 [42:14<13:03,  1.80it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.57e-5, train/loss_step=0.0128, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  76%|███████▋  | 4561/5971 [42:14<13:03,  1.80it/s, loss=0.198, v_num=0, train/loss_simple_step=0.350, train/loss_vlb_step=0.00148, train/loss_step=0.350, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  76%|███████▋  | 4562/5971 [42:14<13:02,  1.80it/s, loss=0.198, v_num=0, train/loss_simple_step=0.350, train/loss_vlb_step=0.00148, train/loss_step=0.350, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  76%|███████▋  | 4562/5971 [42:14<13:02,  1.80it/s, loss=0.197, v_num=0, train/loss_simple_step=0.075, train/loss_vlb_step=0.000254, train/loss_step=0.075, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  76%|███████▋  | 4563/5971 [42:15<13:02,  1.80it/s, loss=0.197, v_num=0, train/loss_simple_step=0.075, train/loss_vlb_step=0.000254, train/loss_step=0.075, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  76%|███████▋  | 4563/5971 [42:15<13:02,  1.80it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0561, train/loss_vlb_step=0.0002, train/loss_step=0.0561, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  76%|███████▋  | 4564/5971 [42:18<13:02,  1.80it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0561, train/loss_vlb_step=0.0002, train/loss_step=0.0561, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  76%|███████▋  | 4564/5971 [42:18<13:02,  1.80it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.48e-5, train/loss_step=0.0104, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  76%|███████▋  | 4565/5971 [42:19<13:02,  1.80it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.48e-5, train/loss_step=0.0104, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  76%|███████▋  | 4565/5971 [42:19<13:02,  1.80it/s, loss=0.179, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.00104, train/loss_step=0.224, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  76%|███████▋  | 4566/5971 [42:20<13:01,  1.80it/s, loss=0.179, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.00104, train/loss_step=0.224, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  76%|███████▋  | 4566/5971 [42:20<13:01,  1.80it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0945, train/loss_vlb_step=0.000313, train/loss_step=0.0945, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  76%|███████▋  | 4567/5971 [42:21<13:01,  1.80it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0945, train/loss_vlb_step=0.000313, train/loss_step=0.0945, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  76%|███████▋  | 4567/5971 [42:21<13:01,  1.80it/s, loss=0.184, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.00103, train/loss_step=0.266, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  77%|███████▋  | 4568/5971 [42:23<13:01,  1.80it/s, loss=0.184, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.00103, train/loss_step=0.266, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4568/5971 [42:23<13:01,  1.80it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00517, train/loss_vlb_step=2.48e-5, train/loss_step=0.00517, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4569/5971 [42:24<13:00,  1.80it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00517, train/loss_vlb_step=2.48e-5, train/loss_step=0.00517, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4569/5971 [42:24<13:00,  1.80it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0133, train/loss_vlb_step=5.78e-5, train/loss_step=0.0133, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  77%|███████▋  | 4570/5971 [42:25<13:00,  1.80it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0133, train/loss_vlb_step=5.78e-5, train/loss_step=0.0133, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4570/5971 [42:25<13:00,  1.80it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0092, train/loss_vlb_step=4.34e-5, train/loss_step=0.0092, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4571/5971 [42:26<12:59,  1.80it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0092, train/loss_vlb_step=4.34e-5, train/loss_step=0.0092, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4571/5971 [42:26<12:59,  1.80it/s, loss=0.111, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000436, train/loss_step=0.132, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  77%|███████▋  | 4572/5971 [42:28<12:59,  1.79it/s, loss=0.111, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000436, train/loss_step=0.132, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4572/5971 [42:28<12:59,  1.79it/s, loss=0.133, v_num=0, train/loss_simple_step=0.481, train/loss_vlb_step=0.00454, train/loss_step=0.481, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  77%|███████▋  | 4573/5971 [42:29<12:59,  1.79it/s, loss=0.133, v_num=0, train/loss_simple_step=0.481, train/loss_vlb_step=0.00454, train/loss_step=0.481, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4573/5971 [42:29<12:59,  1.79it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0043, train/loss_vlb_step=2.29e-5, train/loss_step=0.0043, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4574/5971 [42:30<12:58,  1.79it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0043, train/loss_vlb_step=2.29e-5, train/loss_step=0.0043, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4574/5971 [42:30<12:58,  1.79it/s, loss=0.146, v_num=0, train/loss_simple_step=0.621, train/loss_vlb_step=0.0117, train/loss_step=0.621, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  77%|███████▋  | 4575/5971 [42:31<12:58,  1.79it/s, loss=0.146, v_num=0, train/loss_simple_step=0.621, train/loss_vlb_step=0.0117, train/loss_step=0.621, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4575/5971 [42:31<12:58,  1.79it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.27e-5, train/loss_step=0.0147, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4576/5971 [42:33<12:58,  1.79it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.27e-5, train/loss_step=0.0147, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4576/5971 [42:33<12:58,  1.79it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0447, train/loss_vlb_step=0.000163, train/loss_step=0.0447, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4577/5971 [42:34<12:57,  1.79it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0447, train/loss_vlb_step=0.000163, train/loss_step=0.0447, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4577/5971 [42:34<12:57,  1.79it/s, loss=0.17, v_num=0, train/loss_simple_step=0.587, train/loss_vlb_step=0.0236, train/loss_step=0.587, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  77%|███████▋  | 4578/5971 [42:35<12:57,  1.79it/s, loss=0.17, v_num=0, train/loss_simple_step=0.587, train/loss_vlb_step=0.0236, train/loss_step=0.587, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4578/5971 [42:35<12:57,  1.79it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0685, train/loss_vlb_step=0.000233, train/loss_step=0.0685, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4579/5971 [42:36<12:57,  1.79it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0685, train/loss_vlb_step=0.000233, train/loss_step=0.0685, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4579/5971 [42:36<12:57,  1.79it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00205, train/loss_vlb_step=1.24e-5, train/loss_step=0.00205, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4580/5971 [42:38<12:56,  1.79it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00205, train/loss_vlb_step=1.24e-5, train/loss_step=0.00205, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4580/5971 [42:38<12:56,  1.79it/s, loss=0.201, v_num=0, train/loss_simple_step=0.962, train/loss_vlb_step=0.484, train/loss_step=0.962, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]      
Epoch 1:  77%|███████▋  | 4581/5971 [42:39<12:56,  1.79it/s, loss=0.201, v_num=0, train/loss_simple_step=0.962, train/loss_vlb_step=0.484, train/loss_step=0.962, global_step=1e+3, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4581/5971 [42:39<12:56,  1.79it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0227, train/loss_vlb_step=8.88e-5, train/loss_step=0.0227, global_step=1006.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4582/5971 [42:40<12:56,  1.79it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0227, train/loss_vlb_step=8.88e-5, train/loss_step=0.0227, global_step=1006.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4582/5971 [42:40<12:56,  1.79it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0432, train/loss_vlb_step=0.000147, train/loss_step=0.0432, global_step=1006.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4583/5971 [42:41<12:55,  1.79it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0432, train/loss_vlb_step=0.000147, train/loss_step=0.0432, global_step=1006.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4583/5971 [42:41<12:55,  1.79it/s, loss=0.18, v_num=0, train/loss_simple_step=0.00207, train/loss_vlb_step=1.19e-5, train/loss_step=0.00207, global_step=1006.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4584/5971 [42:44<12:55,  1.79it/s, loss=0.18, v_num=0, train/loss_simple_step=0.00207, train/loss_vlb_step=1.19e-5, train/loss_step=0.00207, global_step=1006.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4584/5971 [42:44<12:55,  1.79it/s, loss=0.189, v_num=0, train/loss_simple_step=0.183, train/loss_vlb_step=0.000683, train/loss_step=0.183, global_step=1006.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  77%|███████▋  | 4585/5971 [42:45<12:55,  1.79it/s, loss=0.189, v_num=0, train/loss_simple_step=0.183, train/loss_vlb_step=0.000683, train/loss_step=0.183, global_step=1006.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4585/5971 [42:45<12:55,  1.79it/s, loss=0.204, v_num=0, train/loss_simple_step=0.526, train/loss_vlb_step=0.00427, train/loss_step=0.526, global_step=1007.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  77%|███████▋  | 4586/5971 [42:45<12:54,  1.79it/s, loss=0.204, v_num=0, train/loss_simple_step=0.526, train/loss_vlb_step=0.00427, train/loss_step=0.526, global_step=1007.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4586/5971 [42:45<12:54,  1.79it/s, loss=0.214, v_num=0, train/loss_simple_step=0.297, train/loss_vlb_step=0.0014, train/loss_step=0.297, global_step=1007.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  77%|███████▋  | 4587/5971 [42:46<12:54,  1.79it/s, loss=0.214, v_num=0, train/loss_simple_step=0.297, train/loss_vlb_step=0.0014, train/loss_step=0.297, global_step=1007.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4587/5971 [42:46<12:54,  1.79it/s, loss=0.208, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000484, train/loss_step=0.144, global_step=1007.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4588/5971 [42:48<12:54,  1.79it/s, loss=0.208, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000484, train/loss_step=0.144, global_step=1007.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4588/5971 [42:48<12:54,  1.79it/s, loss=0.209, v_num=0, train/loss_simple_step=0.0183, train/loss_vlb_step=7.53e-5, train/loss_step=0.0183, global_step=1007.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4589/5971 [42:49<12:53,  1.79it/s, loss=0.209, v_num=0, train/loss_simple_step=0.0183, train/loss_vlb_step=7.53e-5, train/loss_step=0.0183, global_step=1007.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4589/5971 [42:49<12:53,  1.79it/s, loss=0.21, v_num=0, train/loss_simple_step=0.0319, train/loss_vlb_step=0.000119, train/loss_step=0.0319, global_step=1008.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4590/5971 [42:50<12:53,  1.79it/s, loss=0.21, v_num=0, train/loss_simple_step=0.0319, train/loss_vlb_step=0.000119, train/loss_step=0.0319, global_step=1008.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4590/5971 [42:50<12:53,  1.79it/s, loss=0.21, v_num=0, train/loss_simple_step=0.00907, train/loss_vlb_step=3.98e-5, train/loss_step=0.00907, global_step=1008.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4591/5971 [42:51<12:52,  1.79it/s, loss=0.21, v_num=0, train/loss_simple_step=0.00907, train/loss_vlb_step=3.98e-5, train/loss_step=0.00907, global_step=1008.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4591/5971 [42:51<12:52,  1.79it/s, loss=0.221, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.0021, train/loss_step=0.365, global_step=1008.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  77%|███████▋  | 4592/5971 [42:54<12:52,  1.78it/s, loss=0.221, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.0021, train/loss_step=0.365, global_step=1008.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4592/5971 [42:54<12:52,  1.78it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0654, train/loss_vlb_step=0.000225, train/loss_step=0.0654, global_step=1008.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4593/5971 [42:55<12:52,  1.78it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0654, train/loss_vlb_step=0.000225, train/loss_step=0.0654, global_step=1008.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4593/5971 [42:55<12:52,  1.78it/s, loss=0.224, v_num=0, train/loss_simple_step=0.481, train/loss_vlb_step=0.00386, train/loss_step=0.481, global_step=1009.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  77%|███████▋  | 4594/5971 [42:56<12:51,  1.78it/s, loss=0.224, v_num=0, train/loss_simple_step=0.481, train/loss_vlb_step=0.00386, train/loss_step=0.481, global_step=1009.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4594/5971 [42:56<12:51,  1.78it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0178, train/loss_vlb_step=7.87e-5, train/loss_step=0.0178, global_step=1009.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4595/5971 [42:57<12:51,  1.78it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0178, train/loss_vlb_step=7.87e-5, train/loss_step=0.0178, global_step=1009.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4595/5971 [42:57<12:51,  1.78it/s, loss=0.194, v_num=0, train/loss_simple_step=0.00237, train/loss_vlb_step=1.38e-5, train/loss_step=0.00237, global_step=1009.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4596/5971 [42:59<12:51,  1.78it/s, loss=0.194, v_num=0, train/loss_simple_step=0.00237, train/loss_vlb_step=1.38e-5, train/loss_step=0.00237, global_step=1009.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4596/5971 [42:59<12:51,  1.78it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0847, train/loss_vlb_step=0.000279, train/loss_step=0.0847, global_step=1009.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  77%|███████▋  | 4597/5971 [43:00<12:51,  1.78it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0847, train/loss_vlb_step=0.000279, train/loss_step=0.0847, global_step=1009.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4597/5971 [43:00<12:51,  1.78it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00971, train/loss_vlb_step=4.49e-5, train/loss_step=0.00971, global_step=1010.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4598/5971 [43:01<12:50,  1.78it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00971, train/loss_vlb_step=4.49e-5, train/loss_step=0.00971, global_step=1010.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4598/5971 [43:01<12:50,  1.78it/s, loss=0.164, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.12e-5, train/loss_step=0.004, global_step=1010.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  77%|███████▋  | 4599/5971 [43:02<12:50,  1.78it/s, loss=0.164, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.12e-5, train/loss_step=0.004, global_step=1010.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4599/5971 [43:02<12:50,  1.78it/s, loss=0.175, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000804, train/loss_step=0.227, global_step=1010.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4600/5971 [43:04<12:50,  1.78it/s, loss=0.175, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000804, train/loss_step=0.227, global_step=1010.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4600/5971 [43:04<12:50,  1.78it/s, loss=0.15, v_num=0, train/loss_simple_step=0.467, train/loss_vlb_step=0.00291, train/loss_step=0.467, global_step=1010.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  77%|███████▋  | 4601/5971 [43:05<12:49,  1.78it/s, loss=0.15, v_num=0, train/loss_simple_step=0.467, train/loss_vlb_step=0.00291, train/loss_step=0.467, global_step=1010.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4601/5971 [43:05<12:49,  1.78it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0449, train/loss_vlb_step=0.000168, train/loss_step=0.0449, global_step=1011.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4602/5971 [43:06<12:49,  1.78it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0449, train/loss_vlb_step=0.000168, train/loss_step=0.0449, global_step=1011.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4602/5971 [43:06<12:49,  1.78it/s, loss=0.195, v_num=0, train/loss_simple_step=0.919, train/loss_vlb_step=0.462, train/loss_step=0.919, global_step=1011.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  77%|███████▋  | 4603/5971 [43:07<12:48,  1.78it/s, loss=0.195, v_num=0, train/loss_simple_step=0.919, train/loss_vlb_step=0.462, train/loss_step=0.919, global_step=1011.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4603/5971 [43:07<12:48,  1.78it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0376, train/loss_vlb_step=0.000126, train/loss_step=0.0376, global_step=1011.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4604/5971 [43:09<12:48,  1.78it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0376, train/loss_vlb_step=0.000126, train/loss_step=0.0376, global_step=1011.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4604/5971 [43:09<12:48,  1.78it/s, loss=0.193, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000358, train/loss_step=0.109, global_step=1011.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  77%|███████▋  | 4605/5971 [43:10<12:48,  1.78it/s, loss=0.193, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000358, train/loss_step=0.109, global_step=1011.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4605/5971 [43:10<12:48,  1.78it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0494, train/loss_vlb_step=0.000177, train/loss_step=0.0494, global_step=1012.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4606/5971 [43:11<12:47,  1.78it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0494, train/loss_vlb_step=0.000177, train/loss_step=0.0494, global_step=1012.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4606/5971 [43:11<12:47,  1.78it/s, loss=0.176, v_num=0, train/loss_simple_step=0.429, train/loss_vlb_step=0.00239, train/loss_step=0.429, global_step=1012.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  77%|███████▋  | 4607/5971 [43:12<12:47,  1.78it/s, loss=0.176, v_num=0, train/loss_simple_step=0.429, train/loss_vlb_step=0.00239, train/loss_step=0.429, global_step=1012.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4607/5971 [43:12<12:47,  1.78it/s, loss=0.186, v_num=0, train/loss_simple_step=0.357, train/loss_vlb_step=0.0024, train/loss_step=0.357, global_step=1012.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  77%|███████▋  | 4608/5971 [43:14<12:47,  1.78it/s, loss=0.186, v_num=0, train/loss_simple_step=0.357, train/loss_vlb_step=0.0024, train/loss_step=0.357, global_step=1012.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4608/5971 [43:14<12:47,  1.78it/s, loss=0.192, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000441, train/loss_step=0.134, global_step=1012.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4609/5971 [43:15<12:46,  1.78it/s, loss=0.192, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000441, train/loss_step=0.134, global_step=1012.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4609/5971 [43:15<12:46,  1.78it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0998, train/loss_vlb_step=0.000328, train/loss_step=0.0998, global_step=1013.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4610/5971 [43:16<12:46,  1.78it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0998, train/loss_vlb_step=0.000328, train/loss_step=0.0998, global_step=1013.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4610/5971 [43:16<12:46,  1.78it/s, loss=0.195, v_num=0, train/loss_simple_step=0.00523, train/loss_vlb_step=2.76e-5, train/loss_step=0.00523, global_step=1013.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4611/5971 [43:17<12:45,  1.78it/s, loss=0.195, v_num=0, train/loss_simple_step=0.00523, train/loss_vlb_step=2.76e-5, train/loss_step=0.00523, global_step=1013.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4611/5971 [43:17<12:45,  1.78it/s, loss=0.193, v_num=0, train/loss_simple_step=0.316, train/loss_vlb_step=0.00144, train/loss_step=0.316, global_step=1013.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  77%|███████▋  | 4612/5971 [43:19<12:45,  1.77it/s, loss=0.193, v_num=0, train/loss_simple_step=0.316, train/loss_vlb_step=0.00144, train/loss_step=0.316, global_step=1013.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4612/5971 [43:19<12:45,  1.77it/s, loss=0.19, v_num=0, train/loss_simple_step=0.00927, train/loss_vlb_step=4.53e-5, train/loss_step=0.00927, global_step=1013.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4613/5971 [43:20<12:45,  1.77it/s, loss=0.19, v_num=0, train/loss_simple_step=0.00927, train/loss_vlb_step=4.53e-5, train/loss_step=0.00927, global_step=1013.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4613/5971 [43:20<12:45,  1.77it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0178, train/loss_vlb_step=7.11e-5, train/loss_step=0.0178, global_step=1014.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  77%|███████▋  | 4614/5971 [43:21<12:44,  1.77it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0178, train/loss_vlb_step=7.11e-5, train/loss_step=0.0178, global_step=1014.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4614/5971 [43:21<12:44,  1.77it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00437, train/loss_vlb_step=2.23e-5, train/loss_step=0.00437, global_step=1014.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4615/5971 [43:22<12:44,  1.77it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00437, train/loss_vlb_step=2.23e-5, train/loss_step=0.00437, global_step=1014.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4615/5971 [43:22<12:44,  1.77it/s, loss=0.181, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00144, train/loss_step=0.292, global_step=1014.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  77%|███████▋  | 4616/5971 [43:24<12:44,  1.77it/s, loss=0.181, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00144, train/loss_step=0.292, global_step=1014.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4616/5971 [43:24<12:44,  1.77it/s, loss=0.19, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.00105, train/loss_step=0.262, global_step=1014.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  77%|███████▋  | 4617/5971 [43:25<12:43,  1.77it/s, loss=0.19, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.00105, train/loss_step=0.262, global_step=1014.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4617/5971 [43:25<12:43,  1.77it/s, loss=0.189, v_num=0, train/loss_simple_step=0.00226, train/loss_vlb_step=1.35e-5, train/loss_step=0.00226, global_step=1015.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4618/5971 [43:26<12:43,  1.77it/s, loss=0.189, v_num=0, train/loss_simple_step=0.00226, train/loss_vlb_step=1.35e-5, train/loss_step=0.00226, global_step=1015.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4618/5971 [43:26<12:43,  1.77it/s, loss=0.215, v_num=0, train/loss_simple_step=0.516, train/loss_vlb_step=0.00443, train/loss_step=0.516, global_step=1015.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  77%|███████▋  | 4619/5971 [43:27<12:43,  1.77it/s, loss=0.215, v_num=0, train/loss_simple_step=0.516, train/loss_vlb_step=0.00443, train/loss_step=0.516, global_step=1015.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4619/5971 [43:27<12:43,  1.77it/s, loss=0.217, v_num=0, train/loss_simple_step=0.273, train/loss_vlb_step=0.00111, train/loss_step=0.273, global_step=1015.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4620/5971 [43:29<12:42,  1.77it/s, loss=0.217, v_num=0, train/loss_simple_step=0.273, train/loss_vlb_step=0.00111, train/loss_step=0.273, global_step=1015.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4620/5971 [43:29<12:42,  1.77it/s, loss=0.194, v_num=0, train/loss_simple_step=0.00379, train/loss_vlb_step=2.07e-5, train/loss_step=0.00379, global_step=1015.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4621/5971 [43:30<12:42,  1.77it/s, loss=0.194, v_num=0, train/loss_simple_step=0.00379, train/loss_vlb_step=2.07e-5, train/loss_step=0.00379, global_step=1015.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4621/5971 [43:30<12:42,  1.77it/s, loss=0.228, v_num=0, train/loss_simple_step=0.714, train/loss_vlb_step=0.0267, train/loss_step=0.714, global_step=1016.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  77%|███████▋  | 4622/5971 [43:31<12:41,  1.77it/s, loss=0.228, v_num=0, train/loss_simple_step=0.714, train/loss_vlb_step=0.0267, train/loss_step=0.714, global_step=1016.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4622/5971 [43:31<12:41,  1.77it/s, loss=0.185, v_num=0, train/loss_simple_step=0.066, train/loss_vlb_step=0.000224, train/loss_step=0.066, global_step=1016.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4623/5971 [43:32<12:41,  1.77it/s, loss=0.185, v_num=0, train/loss_simple_step=0.066, train/loss_vlb_step=0.000224, train/loss_step=0.066, global_step=1016.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4623/5971 [43:32<12:41,  1.77it/s, loss=0.193, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000677, train/loss_step=0.192, global_step=1016.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4624/5971 [43:34<12:41,  1.77it/s, loss=0.193, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000677, train/loss_step=0.192, global_step=1016.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4624/5971 [43:34<12:41,  1.77it/s, loss=0.2, v_num=0, train/loss_simple_step=0.259, train/loss_vlb_step=0.00109, train/loss_step=0.259, global_step=1016.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  77%|███████▋  | 4625/5971 [43:35<12:41,  1.77it/s, loss=0.2, v_num=0, train/loss_simple_step=0.259, train/loss_vlb_step=0.00109, train/loss_step=0.259, global_step=1016.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4625/5971 [43:35<12:41,  1.77it/s, loss=0.21, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.000884, train/loss_step=0.247, global_step=1017.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4626/5971 [43:36<12:40,  1.77it/s, loss=0.21, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.000884, train/loss_step=0.247, global_step=1017.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4626/5971 [43:36<12:40,  1.77it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.5e-5, train/loss_step=0.0102, global_step=1017.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4627/5971 [43:37<12:40,  1.77it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.5e-5, train/loss_step=0.0102, global_step=1017.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  77%|███████▋  | 4627/5971 [43:37<12:40,  1.77it/s, loss=0.171, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.29e-5, train/loss_step=0.00225, global_step=1017.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4628/5971 [43:39<12:40,  1.77it/s, loss=0.171, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.29e-5, train/loss_step=0.00225, global_step=1017.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4628/5971 [43:39<12:40,  1.77it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0267, train/loss_vlb_step=9.71e-5, train/loss_step=0.0267, global_step=1017.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  78%|███████▊  | 4629/5971 [43:40<12:39,  1.77it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0267, train/loss_vlb_step=9.71e-5, train/loss_step=0.0267, global_step=1017.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4629/5971 [43:40<12:39,  1.77it/s, loss=0.173, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.000983, train/loss_step=0.250, global_step=1018.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  78%|███████▊  | 4630/5971 [43:41<12:39,  1.77it/s, loss=0.173, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.000983, train/loss_step=0.250, global_step=1018.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4630/5971 [43:41<12:39,  1.77it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0185, train/loss_vlb_step=7.28e-5, train/loss_step=0.0185, global_step=1018.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4631/5971 [43:42<12:38,  1.77it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0185, train/loss_vlb_step=7.28e-5, train/loss_step=0.0185, global_step=1018.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4631/5971 [43:42<12:38,  1.77it/s, loss=0.171, v_num=0, train/loss_simple_step=0.254, train/loss_vlb_step=0.000973, train/loss_step=0.254, global_step=1018.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  78%|███████▊  | 4632/5971 [43:44<12:38,  1.77it/s, loss=0.171, v_num=0, train/loss_simple_step=0.254, train/loss_vlb_step=0.000973, train/loss_step=0.254, global_step=1018.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4632/5971 [43:44<12:38,  1.77it/s, loss=0.184, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.00103, train/loss_step=0.262, global_step=1018.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  78%|███████▊  | 4633/5971 [43:45<12:38,  1.76it/s, loss=0.184, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.00103, train/loss_step=0.262, global_step=1018.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4633/5971 [43:45<12:38,  1.76it/s, loss=0.188, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000369, train/loss_step=0.112, global_step=1019.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4634/5971 [43:46<12:37,  1.76it/s, loss=0.188, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000369, train/loss_step=0.112, global_step=1019.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4634/5971 [43:46<12:37,  1.76it/s, loss=0.199, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000763, train/loss_step=0.217, global_step=1019.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4635/5971 [43:47<12:37,  1.76it/s, loss=0.199, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000763, train/loss_step=0.217, global_step=1019.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4635/5971 [43:47<12:37,  1.76it/s, loss=0.209, v_num=0, train/loss_simple_step=0.498, train/loss_vlb_step=0.00551, train/loss_step=0.498, global_step=1019.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  78%|███████▊  | 4636/5971 [43:49<12:37,  1.76it/s, loss=0.209, v_num=0, train/loss_simple_step=0.498, train/loss_vlb_step=0.00551, train/loss_step=0.498, global_step=1019.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4636/5971 [43:49<12:37,  1.76it/s, loss=0.196, v_num=0, train/loss_simple_step=0.00208, train/loss_vlb_step=1.21e-5, train/loss_step=0.00208, global_step=1019.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4637/5971 [43:50<12:36,  1.76it/s, loss=0.196, v_num=0, train/loss_simple_step=0.00208, train/loss_vlb_step=1.21e-5, train/loss_step=0.00208, global_step=1019.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4637/5971 [43:50<12:36,  1.76it/s, loss=0.196, v_num=0, train/loss_simple_step=0.00462, train/loss_vlb_step=2.3e-5, train/loss_step=0.00462, global_step=1020.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  78%|███████▊  | 4638/5971 [43:51<12:36,  1.76it/s, loss=0.196, v_num=0, train/loss_simple_step=0.00462, train/loss_vlb_step=2.3e-5, train/loss_step=0.00462, global_step=1020.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4638/5971 [43:51<12:36,  1.76it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0257, train/loss_vlb_step=0.000102, train/loss_step=0.0257, global_step=1020.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4639/5971 [43:52<12:35,  1.76it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0257, train/loss_vlb_step=0.000102, train/loss_step=0.0257, global_step=1020.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4639/5971 [43:52<12:35,  1.76it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00209, train/loss_vlb_step=1.16e-5, train/loss_step=0.00209, global_step=1020.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4640/5971 [43:54<12:35,  1.76it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00209, train/loss_vlb_step=1.16e-5, train/loss_step=0.00209, global_step=1020.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4640/5971 [43:54<12:35,  1.76it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0065, train/loss_vlb_step=3.24e-5, train/loss_step=0.0065, global_step=1020.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  78%|███████▊  | 4641/5971 [43:55<12:35,  1.76it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0065, train/loss_vlb_step=3.24e-5, train/loss_step=0.0065, global_step=1020.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4641/5971 [43:55<12:35,  1.76it/s, loss=0.146, v_num=0, train/loss_simple_step=0.464, train/loss_vlb_step=0.00342, train/loss_step=0.464, global_step=1021.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  78%|███████▊  | 4642/5971 [43:56<12:34,  1.76it/s, loss=0.146, v_num=0, train/loss_simple_step=0.464, train/loss_vlb_step=0.00342, train/loss_step=0.464, global_step=1021.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4642/5971 [43:56<12:34,  1.76it/s, loss=0.178, v_num=0, train/loss_simple_step=0.715, train/loss_vlb_step=0.0167, train/loss_step=0.715, global_step=1021.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  78%|███████▊  | 4643/5971 [43:57<12:34,  1.76it/s, loss=0.178, v_num=0, train/loss_simple_step=0.715, train/loss_vlb_step=0.0167, train/loss_step=0.715, global_step=1021.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4643/5971 [43:57<12:34,  1.76it/s, loss=0.173, v_num=0, train/loss_simple_step=0.074, train/loss_vlb_step=0.000249, train/loss_step=0.074, global_step=1021.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4644/5971 [43:59<12:34,  1.76it/s, loss=0.173, v_num=0, train/loss_simple_step=0.074, train/loss_vlb_step=0.000249, train/loss_step=0.074, global_step=1021.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4644/5971 [43:59<12:34,  1.76it/s, loss=0.172, v_num=0, train/loss_simple_step=0.257, train/loss_vlb_step=0.00108, train/loss_step=0.257, global_step=1021.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  78%|███████▊  | 4645/5971 [44:00<12:33,  1.76it/s, loss=0.172, v_num=0, train/loss_simple_step=0.257, train/loss_vlb_step=0.00108, train/loss_step=0.257, global_step=1021.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4645/5971 [44:00<12:33,  1.76it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0624, train/loss_vlb_step=0.000213, train/loss_step=0.0624, global_step=1022.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4646/5971 [44:01<12:33,  1.76it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0624, train/loss_vlb_step=0.000213, train/loss_step=0.0624, global_step=1022.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4646/5971 [44:01<12:33,  1.76it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.52e-5, train/loss_step=0.0147, global_step=1022.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  78%|███████▊  | 4647/5971 [44:02<12:32,  1.76it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.52e-5, train/loss_step=0.0147, global_step=1022.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4647/5971 [44:02<12:32,  1.76it/s, loss=0.2, v_num=0, train/loss_simple_step=0.737, train/loss_vlb_step=0.0229, train/loss_step=0.737, global_step=1022.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  78%|███████▊  | 4648/5971 [44:04<12:32,  1.76it/s, loss=0.2, v_num=0, train/loss_simple_step=0.737, train/loss_vlb_step=0.0229, train/loss_step=0.737, global_step=1022.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4648/5971 [44:04<12:32,  1.76it/s, loss=0.208, v_num=0, train/loss_simple_step=0.182, train/loss_vlb_step=0.000623, train/loss_step=0.182, global_step=1022.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4649/5971 [44:05<12:32,  1.76it/s, loss=0.208, v_num=0, train/loss_simple_step=0.182, train/loss_vlb_step=0.000623, train/loss_step=0.182, global_step=1022.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4649/5971 [44:05<12:32,  1.76it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0146, train/loss_vlb_step=6.39e-5, train/loss_step=0.0146, global_step=1023.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4650/5971 [44:06<12:31,  1.76it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0146, train/loss_vlb_step=6.39e-5, train/loss_step=0.0146, global_step=1023.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4650/5971 [44:06<12:31,  1.76it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0252, train/loss_vlb_step=9.66e-5, train/loss_step=0.0252, global_step=1023.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4651/5971 [44:07<12:31,  1.76it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0252, train/loss_vlb_step=9.66e-5, train/loss_step=0.0252, global_step=1023.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4651/5971 [44:07<12:31,  1.76it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0349, train/loss_vlb_step=0.000133, train/loss_step=0.0349, global_step=1023.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4652/5971 [44:09<12:31,  1.76it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0349, train/loss_vlb_step=0.000133, train/loss_step=0.0349, global_step=1023.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4652/5971 [44:09<12:31,  1.76it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0637, train/loss_vlb_step=0.000215, train/loss_step=0.0637, global_step=1023.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4653/5971 [44:10<12:30,  1.76it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0637, train/loss_vlb_step=0.000215, train/loss_step=0.0637, global_step=1023.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4653/5971 [44:10<12:30,  1.76it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00239, train/loss_vlb_step=1.34e-5, train/loss_step=0.00239, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4654/5971 [44:11<12:30,  1.76it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00239, train/loss_vlb_step=1.34e-5, train/loss_step=0.00239, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4654/5971 [44:11<12:30,  1.76it/s, loss=0.185, v_num=0, train/loss_simple_step=0.510, train/loss_vlb_step=0.00453, train/loss_step=0.510, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  78%|███████▊  | 4655/5971 [44:12<12:29,  1.76it/s, loss=0.185, v_num=0, train/loss_simple_step=0.510, train/loss_vlb_step=0.00453, train/loss_step=0.510, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4655/5971 [44:12<12:29,  1.76it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0281, train/loss_vlb_step=0.000108, train/loss_step=0.0281, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4656/5971 [44:14<12:29,  1.75it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0281, train/loss_vlb_step=0.000108, train/loss_step=0.0281, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  78%|███████▊  | 4656/5971 [44:14<12:29,  1.75it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:01,  2.72it/s]
Epoch 1:  78%|███████▊  | 4658/5971 [44:14<12:28,  1.75it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   1%|          | 2/167 [00:00<00:57,  2.89it/s]
Epoch 1:  78%|███████▊  | 4660/5971 [44:15<12:26,  1.76it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   3%|▎         | 5/167 [00:00<00:20,  7.93it/s]
Epoch 1:  78%|███████▊  | 4663/5971 [44:15<12:24,  1.76it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   5%|▍         | 8/167 [00:00<00:12, 12.43it/s]
Epoch 1:  78%|███████▊  | 4666/5971 [44:15<12:22,  1.76it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   7%|▋         | 11/167 [00:01<00:09, 15.72it/s]
Epoch 1:  78%|███████▊  | 4669/5971 [44:15<12:20,  1.76it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   8%|▊         | 14/167 [00:01<00:08, 17.92it/s]
Epoch 1:  78%|███████▊  | 4672/5971 [44:15<12:18,  1.76it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  10%|█         | 17/167 [00:01<00:07, 19.48it/s]
Epoch 1:  78%|███████▊  | 4675/5971 [44:15<12:16,  1.76it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 21.46it/s][A
Epoch 1:  78%|███████▊  | 4678/5971 [44:15<12:13,  1.76it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 21.45it/s][A
Epoch 1:  78%|███████▊  | 4681/5971 [44:16<12:11,  1.76it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  16%|█▌        | 26/167 [00:01<00:06, 22.02it/s][A
Epoch 1:  78%|███████▊  | 4684/5971 [44:16<12:09,  1.76it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  17%|█▋        | 29/167 [00:01<00:06, 22.28it/s][A
Epoch 1:  78%|███████▊  | 4687/5971 [44:16<12:07,  1.76it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 23.70it/s][A
Epoch 1:  79%|███████▊  | 4690/5971 [44:16<12:05,  1.77it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  21%|██        | 35/167 [00:02<00:05, 24.86it/s][A
Epoch 1:  79%|███████▊  | 4693/5971 [44:16<12:03,  1.77it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  23%|██▎       | 38/167 [00:02<00:05, 24.94it/s][A
Epoch 1:  79%|███████▊  | 4696/5971 [44:16<12:01,  1.77it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 25.39it/s][A
Epoch 1:  79%|███████▊  | 4699/5971 [44:16<11:59,  1.77it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 27.08it/s][A
Epoch 1:  79%|███████▉  | 4703/5971 [44:16<11:56,  1.77it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 26.62it/s][A
Epoch 1:  79%|███████▉  | 4707/5971 [44:16<11:53,  1.77it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  31%|███       | 51/167 [00:02<00:04, 26.68it/s][A

Validating:  32%|███▏      | 54/167 [00:02<00:04, 27.25it/s][A
Epoch 1:  79%|███████▉  | 4711/5971 [44:17<11:50,  1.77it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 27.29it/s][A
Epoch 1:  79%|███████▉  | 4715/5971 [44:17<11:47,  1.77it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  37%|███▋      | 61/167 [00:02<00:03, 28.00it/s][A
Epoch 1:  79%|███████▉  | 4719/5971 [44:17<11:44,  1.78it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  38%|███▊      | 64/167 [00:03<00:03, 27.19it/s][A
Epoch 1:  79%|███████▉  | 4723/5971 [44:17<11:42,  1.78it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  40%|████      | 67/167 [00:03<00:04, 23.72it/s][A

Validating:  42%|████▏     | 70/167 [00:03<00:04, 21.28it/s][A
Epoch 1:  79%|███████▉  | 4727/5971 [44:17<11:39,  1.78it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  44%|████▎     | 73/167 [00:03<00:04, 23.12it/s][A
Epoch 1:  79%|███████▉  | 4731/5971 [44:17<11:36,  1.78it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 24.69it/s][A
Epoch 1:  79%|███████▉  | 4735/5971 [44:18<11:33,  1.78it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 26.72it/s][A
Epoch 1:  79%|███████▉  | 4739/5971 [44:18<11:30,  1.78it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 27.04it/s][A

Validating:  51%|█████▏    | 86/167 [00:03<00:02, 27.11it/s][A
Epoch 1:  79%|███████▉  | 4743/5971 [44:18<11:28,  1.78it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  53%|█████▎    | 89/167 [00:04<00:02, 26.33it/s][A
Epoch 1:  80%|███████▉  | 4747/5971 [44:18<11:25,  1.79it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 25.93it/s][A
Epoch 1:  80%|███████▉  | 4751/5971 [44:18<11:22,  1.79it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 26.38it/s][A

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 25.44it/s][A
Epoch 1:  80%|███████▉  | 4755/5971 [44:18<11:19,  1.79it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  60%|██████    | 101/167 [00:04<00:02, 25.03it/s][A
Epoch 1:  80%|███████▉  | 4759/5971 [44:19<11:17,  1.79it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 25.47it/s][A
Epoch 1:  80%|███████▉  | 4763/5971 [44:19<11:14,  1.79it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 25.93it/s][A

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 26.83it/s][A
Epoch 1:  80%|███████▉  | 4767/5971 [44:19<11:11,  1.79it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  68%|██████▊   | 113/167 [00:05<00:01, 27.28it/s][A
Epoch 1:  80%|███████▉  | 4771/5971 [44:19<11:08,  1.79it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  69%|██████▉   | 116/167 [00:05<00:02, 25.06it/s][A
Epoch 1:  80%|███████▉  | 4775/5971 [44:19<11:06,  1.80it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 25.78it/s][A

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 26.42it/s][A
Epoch 1:  80%|████████  | 4779/5971 [44:19<11:03,  1.80it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 24.33it/s][A
Epoch 1:  80%|████████  | 4783/5971 [44:19<11:00,  1.80it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 24.84it/s][A
Epoch 1:  80%|████████  | 4787/5971 [44:20<10:57,  1.80it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 25.54it/s][A

Validating:  80%|████████  | 134/167 [00:05<00:01, 24.82it/s][A
Epoch 1:  80%|████████  | 4791/5971 [44:20<10:55,  1.80it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 25.89it/s][A
Epoch 1:  80%|████████  | 4795/5971 [44:20<10:52,  1.80it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  84%|████████▍ | 140/167 [00:06<00:01, 26.46it/s][A
Epoch 1:  80%|████████  | 4799/5971 [44:20<10:49,  1.80it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 27.44it/s][A
Epoch 1:  80%|████████  | 4803/5971 [44:20<10:46,  1.81it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 27.08it/s][A

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 25.46it/s][A
Epoch 1:  81%|████████  | 4807/5971 [44:20<10:44,  1.81it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 27.07it/s][A
Epoch 1:  81%|████████  | 4811/5971 [44:21<10:41,  1.81it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 27.30it/s][A
Epoch 1:  81%|████████  | 4815/5971 [44:21<10:38,  1.81it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 26.97it/s][A
Epoch 1:  81%|████████  | 4819/5971 [44:21<10:36,  1.81it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 27.19it/s][A

Validating:  99%|█████████▉| 166/167 [00:07<00:00, 26.30it/s][A
Epoch 1:  81%|████████  | 4823/5971 [44:21<10:33,  1.81it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████  | 4824/5971 [44:21<10:32,  1.81it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000212, train/loss_step=0.0625, global_step=1024.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Epoch 1:  81%|████████  | 4825/5971 [44:22<10:32,  1.81it/s, loss=0.166, v_num=0, train/loss_simple_step=0.040, train/loss_vlb_step=0.000148, train/loss_step=0.040, global_step=1025.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  81%|████████  | 4826/5971 [44:23<10:31,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00972, train/loss_vlb_step=4.58e-5, train/loss_step=0.00972, global_step=1025.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████  | 4827/5971 [44:24<10:31,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00972, train/loss_vlb_step=4.58e-5, train/loss_step=0.00972, global_step=1025.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████  | 4827/5971 [44:24<10:31,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00211, train/loss_vlb_step=1.21e-5, train/loss_step=0.00211, global_step=1025.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████  | 4828/5971 [44:26<10:31,  1.81it/s, loss=0.17, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000352, train/loss_step=0.107, global_step=1025.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  81%|████████  | 4829/5971 [44:27<10:30,  1.81it/s, loss=0.166, v_num=0, train/loss_simple_step=0.380, train/loss_vlb_step=0.00363, train/loss_step=0.380, global_step=1026.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████  | 4830/5971 [44:28<10:30,  1.81it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0236, train/loss_vlb_step=9.88e-5, train/loss_step=0.0236, global_step=1026.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████  | 4831/5971 [44:29<10:29,  1.81it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0236, train/loss_vlb_step=9.88e-5, train/loss_step=0.0236, global_step=1026.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████  | 4831/5971 [44:29<10:29,  1.81it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00404, train/loss_vlb_step=2.05e-5, train/loss_step=0.00404, global_step=1026.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████  | 4832/5971 [44:31<10:29,  1.81it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00849, train/loss_vlb_step=3.91e-5, train/loss_step=0.00849, global_step=1026.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████  | 4833/5971 [44:32<10:29,  1.81it/s, loss=0.122, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000751, train/loss_step=0.188, global_step=1027.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  81%|████████  | 4834/5971 [44:33<10:28,  1.81it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0019, train/loss_vlb_step=1.1e-5, train/loss_step=0.0019, global_step=1027.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████  | 4835/5971 [44:34<10:28,  1.81it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0019, train/loss_vlb_step=1.1e-5, train/loss_step=0.0019, global_step=1027.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████  | 4835/5971 [44:34<10:28,  1.81it/s, loss=0.0853, v_num=0, train/loss_simple_step=0.0179, train/loss_vlb_step=7.21e-5, train/loss_step=0.0179, global_step=1027.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████  | 4836/5971 [44:36<10:28,  1.81it/s, loss=0.0952, v_num=0, train/loss_simple_step=0.381, train/loss_vlb_step=0.00238, train/loss_step=0.381, global_step=1027.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  81%|████████  | 4837/5971 [44:37<10:27,  1.81it/s, loss=0.0947, v_num=0, train/loss_simple_step=0.00415, train/loss_vlb_step=2.19e-5, train/loss_step=0.00415, global_step=1028.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████  | 4838/5971 [44:38<10:27,  1.81it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.0131, train/loss_vlb_step=5.72e-5, train/loss_step=0.0131, global_step=1028.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  81%|████████  | 4839/5971 [44:39<10:26,  1.81it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.0131, train/loss_vlb_step=5.72e-5, train/loss_step=0.0131, global_step=1028.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████  | 4839/5971 [44:39<10:26,  1.81it/s, loss=0.0928, v_num=0, train/loss_simple_step=0.00949, train/loss_vlb_step=4.52e-5, train/loss_step=0.00949, global_step=1028.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████  | 4840/5971 [44:41<10:26,  1.81it/s, loss=0.0897, v_num=0, train/loss_simple_step=0.00168, train/loss_vlb_step=1e-5, train/loss_step=0.00168, global_step=1028.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  81%|████████  | 4841/5971 [44:42<10:25,  1.81it/s, loss=0.0898, v_num=0, train/loss_simple_step=0.00425, train/loss_vlb_step=2.26e-5, train/loss_step=0.00425, global_step=1029.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████  | 4842/5971 [44:43<10:25,  1.80it/s, loss=0.0652, v_num=0, train/loss_simple_step=0.0171, train/loss_vlb_step=7.16e-5, train/loss_step=0.0171, global_step=1029.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  81%|████████  | 4843/5971 [44:44<10:25,  1.80it/s, loss=0.0652, v_num=0, train/loss_simple_step=0.0171, train/loss_vlb_step=7.16e-5, train/loss_step=0.0171, global_step=1029.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████  | 4843/5971 [44:44<10:25,  1.80it/s, loss=0.0806, v_num=0, train/loss_simple_step=0.336, train/loss_vlb_step=0.00169, train/loss_step=0.336, global_step=1029.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  81%|████████  | 4844/5971 [44:46<10:24,  1.80it/s, loss=0.0846, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000472, train/loss_step=0.143, global_step=1029.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████  | 4845/5971 [44:47<10:24,  1.80it/s, loss=0.0907, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000552, train/loss_step=0.161, global_step=1030.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████  | 4846/5971 [44:47<10:23,  1.80it/s, loss=0.0931, v_num=0, train/loss_simple_step=0.0581, train/loss_vlb_step=0.000204, train/loss_step=0.0581, global_step=1030.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████  | 4847/5971 [44:48<10:23,  1.80it/s, loss=0.0931, v_num=0, train/loss_simple_step=0.0581, train/loss_vlb_step=0.000204, train/loss_step=0.0581, global_step=1030.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████  | 4847/5971 [44:48<10:23,  1.80it/s, loss=0.102, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000611, train/loss_step=0.179, global_step=1030.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  81%|████████  | 4848/5971 [44:50<10:23,  1.80it/s, loss=0.0968, v_num=0, train/loss_simple_step=0.00424, train/loss_vlb_step=2.24e-5, train/loss_step=0.00424, global_step=1030.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████  | 4849/5971 [44:51<10:22,  1.80it/s, loss=0.0796, v_num=0, train/loss_simple_step=0.0369, train/loss_vlb_step=0.000136, train/loss_step=0.0369, global_step=1031.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  81%|████████  | 4850/5971 [44:52<10:22,  1.80it/s, loss=0.0786, v_num=0, train/loss_simple_step=0.00248, train/loss_vlb_step=1.38e-5, train/loss_step=0.00248, global_step=1031.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████  | 4851/5971 [44:53<10:21,  1.80it/s, loss=0.0786, v_num=0, train/loss_simple_step=0.00248, train/loss_vlb_step=1.38e-5, train/loss_step=0.00248, global_step=1031.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████  | 4851/5971 [44:53<10:21,  1.80it/s, loss=0.0856, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000487, train/loss_step=0.145, global_step=1031.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  81%|████████▏ | 4852/5971 [44:55<10:21,  1.80it/s, loss=0.0883, v_num=0, train/loss_simple_step=0.0628, train/loss_vlb_step=0.000215, train/loss_step=0.0628, global_step=1031.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████▏ | 4853/5971 [44:56<10:21,  1.80it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.0282, train/loss_vlb_step=0.000106, train/loss_step=0.0282, global_step=1032.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████▏ | 4854/5971 [44:57<10:20,  1.80it/s, loss=0.0804, v_num=0, train/loss_simple_step=0.00252, train/loss_vlb_step=1.36e-5, train/loss_step=0.00252, global_step=1032.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████▏ | 4855/5971 [44:58<10:20,  1.80it/s, loss=0.0804, v_num=0, train/loss_simple_step=0.00252, train/loss_vlb_step=1.36e-5, train/loss_step=0.00252, global_step=1032.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████▏ | 4855/5971 [44:58<10:20,  1.80it/s, loss=0.0806, v_num=0, train/loss_simple_step=0.0227, train/loss_vlb_step=9.48e-5, train/loss_step=0.0227, global_step=1032.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  81%|████████▏ | 4856/5971 [45:00<10:19,  1.80it/s, loss=0.0679, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.00042, train/loss_step=0.127, global_step=1032.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  81%|████████▏ | 4857/5971 [45:01<10:19,  1.80it/s, loss=0.0679, v_num=0, train/loss_simple_step=0.00494, train/loss_vlb_step=2.64e-5, train/loss_step=0.00494, global_step=1033.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████▏ | 4858/5971 [45:02<10:19,  1.80it/s, loss=0.0881, v_num=0, train/loss_simple_step=0.417, train/loss_vlb_step=0.00232, train/loss_step=0.417, global_step=1033.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  81%|████████▏ | 4859/5971 [45:03<10:18,  1.80it/s, loss=0.0881, v_num=0, train/loss_simple_step=0.417, train/loss_vlb_step=0.00232, train/loss_step=0.417, global_step=1033.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████▏ | 4859/5971 [45:03<10:18,  1.80it/s, loss=0.107, v_num=0, train/loss_simple_step=0.380, train/loss_vlb_step=0.00167, train/loss_step=0.380, global_step=1033.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  81%|████████▏ | 4860/5971 [45:05<10:18,  1.80it/s, loss=0.112, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000343, train/loss_step=0.103, global_step=1033.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████▏ | 4861/5971 [45:06<10:17,  1.80it/s, loss=0.12, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000609, train/loss_step=0.174, global_step=1034.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  81%|████████▏ | 4862/5971 [45:07<10:17,  1.80it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0379, train/loss_vlb_step=0.000138, train/loss_step=0.0379, global_step=1034.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████▏ | 4863/5971 [45:08<10:17,  1.80it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0379, train/loss_vlb_step=0.000138, train/loss_step=0.0379, global_step=1034.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████▏ | 4863/5971 [45:08<10:17,  1.80it/s, loss=0.11, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000367, train/loss_step=0.112, global_step=1034.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  81%|████████▏ | 4864/5971 [45:10<10:16,  1.79it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00638, train/loss_vlb_step=3.3e-5, train/loss_step=0.00638, global_step=1034.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████▏ | 4865/5971 [45:11<10:16,  1.79it/s, loss=0.0959, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=6.47e-5, train/loss_step=0.0156, global_step=1035.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  81%|████████▏ | 4866/5971 [45:12<10:15,  1.79it/s, loss=0.0964, v_num=0, train/loss_simple_step=0.0688, train/loss_vlb_step=0.000233, train/loss_step=0.0688, global_step=1035.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4867/5971 [45:13<10:15,  1.79it/s, loss=0.0964, v_num=0, train/loss_simple_step=0.0688, train/loss_vlb_step=0.000233, train/loss_step=0.0688, global_step=1035.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4867/5971 [45:13<10:15,  1.79it/s, loss=0.0883, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.26e-5, train/loss_step=0.017, global_step=1035.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  82%|████████▏ | 4868/5971 [45:15<10:15,  1.79it/s, loss=0.0983, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000715, train/loss_step=0.204, global_step=1035.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4869/5971 [45:16<10:14,  1.79it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0786, train/loss_vlb_step=0.00026, train/loss_step=0.0786, global_step=1036.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  82%|████████▏ | 4870/5971 [45:17<10:14,  1.79it/s, loss=0.107, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000527, train/loss_step=0.142, global_step=1036.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4871/5971 [45:18<10:13,  1.79it/s, loss=0.107, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000527, train/loss_step=0.142, global_step=1036.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4871/5971 [45:18<10:13,  1.79it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000289, train/loss_step=0.0868, global_step=1036.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4872/5971 [45:20<10:13,  1.79it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0892, train/loss_vlb_step=0.000295, train/loss_step=0.0892, global_step=1036.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4873/5971 [45:21<10:13,  1.79it/s, loss=0.115, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.00077, train/loss_step=0.208, global_step=1037.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  82%|████████▏ | 4874/5971 [45:22<10:12,  1.79it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0146, train/loss_vlb_step=6.33e-5, train/loss_step=0.0146, global_step=1037.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4875/5971 [45:23<10:12,  1.79it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0146, train/loss_vlb_step=6.33e-5, train/loss_step=0.0146, global_step=1037.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4875/5971 [45:23<10:12,  1.79it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0379, train/loss_vlb_step=0.000141, train/loss_step=0.0379, global_step=1037.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4876/5971 [45:25<10:11,  1.79it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=6.32e-5, train/loss_step=0.0144, global_step=1037.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  82%|████████▏ | 4877/5971 [45:26<10:11,  1.79it/s, loss=0.124, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00103, train/loss_step=0.282, global_step=1038.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  82%|████████▏ | 4878/5971 [45:27<10:10,  1.79it/s, loss=0.108, v_num=0, train/loss_simple_step=0.085, train/loss_vlb_step=0.000283, train/loss_step=0.085, global_step=1038.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4879/5971 [45:28<10:10,  1.79it/s, loss=0.108, v_num=0, train/loss_simple_step=0.085, train/loss_vlb_step=0.000283, train/loss_step=0.085, global_step=1038.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4879/5971 [45:28<10:10,  1.79it/s, loss=0.11, v_num=0, train/loss_simple_step=0.420, train/loss_vlb_step=0.00321, train/loss_step=0.420, global_step=1038.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  82%|████████▏ | 4880/5971 [45:30<10:10,  1.79it/s, loss=0.105, v_num=0, train/loss_simple_step=0.00365, train/loss_vlb_step=1.97e-5, train/loss_step=0.00365, global_step=1038.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4881/5971 [45:31<10:09,  1.79it/s, loss=0.0972, v_num=0, train/loss_simple_step=0.0216, train/loss_vlb_step=8.44e-5, train/loss_step=0.0216, global_step=1039.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  82%|████████▏ | 4882/5971 [45:32<10:09,  1.79it/s, loss=0.114, v_num=0, train/loss_simple_step=0.369, train/loss_vlb_step=0.00197, train/loss_step=0.369, global_step=1039.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  82%|████████▏ | 4883/5971 [45:33<10:08,  1.79it/s, loss=0.114, v_num=0, train/loss_simple_step=0.369, train/loss_vlb_step=0.00197, train/loss_step=0.369, global_step=1039.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4883/5971 [45:33<10:08,  1.79it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00445, train/loss_vlb_step=2.39e-5, train/loss_step=0.00445, global_step=1039.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4884/5971 [45:35<10:08,  1.79it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000216, train/loss_step=0.0655, global_step=1039.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  82%|████████▏ | 4885/5971 [45:36<10:08,  1.79it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0151, train/loss_vlb_step=6.39e-5, train/loss_step=0.0151, global_step=1040.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  82%|████████▏ | 4886/5971 [45:37<10:07,  1.79it/s, loss=0.114, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000386, train/loss_step=0.118, global_step=1040.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  82%|████████▏ | 4887/5971 [45:38<10:07,  1.79it/s, loss=0.114, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000386, train/loss_step=0.118, global_step=1040.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4887/5971 [45:38<10:07,  1.79it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00186, train/loss_vlb_step=1.1e-5, train/loss_step=0.00186, global_step=1040.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4888/5971 [45:40<10:07,  1.78it/s, loss=0.111, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000548, train/loss_step=0.164, global_step=1040.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  82%|████████▏ | 4889/5971 [45:41<10:06,  1.78it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0859, train/loss_vlb_step=0.000282, train/loss_step=0.0859, global_step=1041.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4890/5971 [45:42<10:06,  1.78it/s, loss=0.105, v_num=0, train/loss_simple_step=0.00768, train/loss_vlb_step=3.73e-5, train/loss_step=0.00768, global_step=1041.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4891/5971 [45:43<10:05,  1.78it/s, loss=0.105, v_num=0, train/loss_simple_step=0.00768, train/loss_vlb_step=3.73e-5, train/loss_step=0.00768, global_step=1041.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4891/5971 [45:43<10:05,  1.78it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0158, train/loss_vlb_step=7.03e-5, train/loss_step=0.0158, global_step=1041.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  82%|████████▏ | 4892/5971 [45:45<10:05,  1.78it/s, loss=0.142, v_num=0, train/loss_simple_step=0.916, train/loss_vlb_step=0.231, train/loss_step=0.916, global_step=1041.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  82%|████████▏ | 4893/5971 [45:46<10:04,  1.78it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0117, train/loss_vlb_step=5.01e-5, train/loss_step=0.0117, global_step=1042.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4894/5971 [45:47<10:04,  1.78it/s, loss=0.149, v_num=0, train/loss_simple_step=0.338, train/loss_vlb_step=0.00218, train/loss_step=0.338, global_step=1042.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  82%|████████▏ | 4895/5971 [45:48<10:03,  1.78it/s, loss=0.149, v_num=0, train/loss_simple_step=0.338, train/loss_vlb_step=0.00218, train/loss_step=0.338, global_step=1042.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4895/5971 [45:48<10:03,  1.78it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.64e-5, train/loss_step=0.0124, global_step=1042.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4896/5971 [45:50<10:03,  1.78it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00326, train/loss_vlb_step=1.74e-5, train/loss_step=0.00326, global_step=1042.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4897/5971 [45:51<10:03,  1.78it/s, loss=0.158, v_num=0, train/loss_simple_step=0.495, train/loss_vlb_step=0.00429, train/loss_step=0.495, global_step=1043.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  82%|████████▏ | 4898/5971 [45:51<10:02,  1.78it/s, loss=0.16, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000422, train/loss_step=0.124, global_step=1043.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4899/5971 [45:52<10:02,  1.78it/s, loss=0.16, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000422, train/loss_step=0.124, global_step=1043.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4899/5971 [45:52<10:02,  1.78it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00222, train/loss_vlb_step=1.25e-5, train/loss_step=0.00222, global_step=1043.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4900/5971 [45:55<10:02,  1.78it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00237, train/loss_vlb_step=1.32e-5, train/loss_step=0.00237, global_step=1043.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4901/5971 [45:56<10:01,  1.78it/s, loss=0.143, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000359, train/loss_step=0.109, global_step=1044.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  82%|████████▏ | 4902/5971 [45:56<10:01,  1.78it/s, loss=0.146, v_num=0, train/loss_simple_step=0.425, train/loss_vlb_step=0.00276, train/loss_step=0.425, global_step=1044.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  82%|████████▏ | 4903/5971 [45:57<10:00,  1.78it/s, loss=0.146, v_num=0, train/loss_simple_step=0.425, train/loss_vlb_step=0.00276, train/loss_step=0.425, global_step=1044.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4903/5971 [45:57<10:00,  1.78it/s, loss=0.155, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000671, train/loss_step=0.192, global_step=1044.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4904/5971 [45:59<10:00,  1.78it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0649, train/loss_vlb_step=0.000229, train/loss_step=0.0649, global_step=1044.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4905/5971 [46:00<09:59,  1.78it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00696, train/loss_vlb_step=3.53e-5, train/loss_step=0.00696, global_step=1045.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4906/5971 [46:01<09:59,  1.78it/s, loss=0.15, v_num=0, train/loss_simple_step=0.022, train/loss_vlb_step=8.63e-5, train/loss_step=0.022, global_step=1045.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  82%|████████▏ | 4907/5971 [46:02<09:58,  1.78it/s, loss=0.15, v_num=0, train/loss_simple_step=0.022, train/loss_vlb_step=8.63e-5, train/loss_step=0.022, global_step=1045.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4907/5971 [46:02<09:58,  1.78it/s, loss=0.162, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.00107, train/loss_step=0.251, global_step=1045.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4908/5971 [46:04<09:58,  1.78it/s, loss=0.161, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.00045, train/loss_step=0.133, global_step=1045.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4909/5971 [46:05<09:58,  1.78it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0046, train/loss_vlb_step=2.34e-5, train/loss_step=0.0046, global_step=1046.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4910/5971 [46:06<09:57,  1.78it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0256, train/loss_vlb_step=9.43e-5, train/loss_step=0.0256, global_step=1046.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4911/5971 [46:07<09:57,  1.77it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0256, train/loss_vlb_step=9.43e-5, train/loss_step=0.0256, global_step=1046.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4911/5971 [46:07<09:57,  1.77it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00238, train/loss_vlb_step=1.39e-5, train/loss_step=0.00238, global_step=1046.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4912/5971 [46:09<09:57,  1.77it/s, loss=0.128, v_num=0, train/loss_simple_step=0.334, train/loss_vlb_step=0.00208, train/loss_step=0.334, global_step=1046.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  82%|████████▏ | 4913/5971 [46:10<09:56,  1.77it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00466, train/loss_vlb_step=2.43e-5, train/loss_step=0.00466, global_step=1047.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4914/5971 [46:11<09:56,  1.77it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00853, train/loss_vlb_step=4e-5, train/loss_step=0.00853, global_step=1047.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  82%|████████▏ | 4915/5971 [46:12<09:55,  1.77it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00853, train/loss_vlb_step=4e-5, train/loss_step=0.00853, global_step=1047.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4915/5971 [46:12<09:55,  1.77it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00265, train/loss_vlb_step=1.46e-5, train/loss_step=0.00265, global_step=1047.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4916/5971 [46:14<09:55,  1.77it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0758, train/loss_vlb_step=0.00025, train/loss_step=0.0758, global_step=1047.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  82%|████████▏ | 4917/5971 [46:15<09:54,  1.77it/s, loss=0.107, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.0016, train/loss_step=0.348, global_step=1048.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  82%|████████▏ | 4918/5971 [46:16<09:54,  1.77it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00799, train/loss_vlb_step=3.82e-5, train/loss_step=0.00799, global_step=1048.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4919/5971 [46:17<09:53,  1.77it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00799, train/loss_vlb_step=3.82e-5, train/loss_step=0.00799, global_step=1048.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4919/5971 [46:17<09:53,  1.77it/s, loss=0.114, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00112, train/loss_step=0.269, global_step=1048.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  82%|████████▏ | 4920/5971 [46:19<09:53,  1.77it/s, loss=0.128, v_num=0, train/loss_simple_step=0.275, train/loss_vlb_step=0.00122, train/loss_step=0.275, global_step=1048.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4921/5971 [46:20<09:53,  1.77it/s, loss=0.134, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000834, train/loss_step=0.225, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4922/5971 [46:21<09:52,  1.77it/s, loss=0.121, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000577, train/loss_step=0.170, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4923/5971 [46:22<09:52,  1.77it/s, loss=0.121, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000577, train/loss_step=0.170, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4923/5971 [46:22<09:52,  1.77it/s, loss=0.124, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.000969, train/loss_step=0.243, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  82%|████████▏ | 4924/5971 [46:24<09:51,  1.77it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:08,  2.42it/s][A

Validating:   1%|          | 2/167 [00:00<00:45,  3.66it/s][A
Epoch 1:  83%|████████▎ | 4927/5971 [46:25<09:50,  1.77it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.32it/s][A
Epoch 1:  83%|████████▎ | 4931/5971 [46:25<09:47,  1.77it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.87it/s][A
Epoch 1:  83%|████████▎ | 4935/5971 [46:25<09:44,  1.77it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   7%|▋         | 11/167 [00:00<00:08, 17.59it/s][A

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.13it/s][A
Epoch 1:  83%|████████▎ | 4939/5971 [46:25<09:41,  1.77it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  10%|█         | 17/167 [00:01<00:06, 21.47it/s][A
Epoch 1:  83%|████████▎ | 4943/5971 [46:25<09:39,  1.77it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 23.30it/s][A
Epoch 1:  83%|████████▎ | 4947/5971 [46:26<09:36,  1.78it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  14%|█▍        | 23/167 [00:01<00:05, 24.16it/s][A

Validating:  16%|█▌        | 26/167 [00:01<00:05, 25.24it/s][A
Epoch 1:  83%|████████▎ | 4951/5971 [46:26<09:33,  1.78it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 24.93it/s][A
Epoch 1:  83%|████████▎ | 4955/5971 [46:26<09:31,  1.78it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 25.81it/s][A
Epoch 1:  83%|████████▎ | 4959/5971 [46:26<09:28,  1.78it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  21%|██        | 35/167 [00:01<00:04, 26.65it/s][A

Validating:  23%|██▎       | 38/167 [00:01<00:05, 24.54it/s][A
Epoch 1:  83%|████████▎ | 4963/5971 [46:26<09:25,  1.78it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  25%|██▍       | 41/167 [00:02<00:05, 24.02it/s][A
Epoch 1:  83%|████████▎ | 4967/5971 [46:26<09:23,  1.78it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 24.67it/s][A
Epoch 1:  83%|████████▎ | 4971/5971 [46:27<09:20,  1.78it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  28%|██▊       | 47/167 [00:02<00:05, 23.56it/s][A

Validating:  30%|██▉       | 50/167 [00:02<00:04, 24.16it/s][A
Epoch 1:  83%|████████▎ | 4975/5971 [46:27<09:17,  1.79it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 23.37it/s][A
Epoch 1:  83%|████████▎ | 4979/5971 [46:27<09:15,  1.79it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 24.48it/s][A
Epoch 1:  83%|████████▎ | 4983/5971 [46:27<09:12,  1.79it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  35%|███▌      | 59/167 [00:02<00:04, 25.13it/s][A
Epoch 1:  84%|████████▎ | 4987/5971 [46:27<09:09,  1.79it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  38%|███▊      | 63/167 [00:02<00:03, 26.77it/s][A

Validating:  40%|███▉      | 66/167 [00:03<00:04, 25.01it/s][A
Epoch 1:  84%|████████▎ | 4991/5971 [46:27<09:07,  1.79it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 25.61it/s][A
Epoch 1:  84%|████████▎ | 4995/5971 [46:28<09:04,  1.79it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 27.23it/s][A
Epoch 1:  84%|████████▎ | 4999/5971 [46:28<09:02,  1.79it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 27.28it/s][A
Epoch 1:  84%|████████▍ | 5003/5971 [46:28<08:59,  1.79it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 27.27it/s][A

Validating:  49%|████▉     | 82/167 [00:03<00:03, 26.61it/s][A
Epoch 1:  84%|████████▍ | 5007/5971 [46:28<08:56,  1.80it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  51%|█████     | 85/167 [00:03<00:03, 25.43it/s][A
Epoch 1:  84%|████████▍ | 5011/5971 [46:28<08:54,  1.80it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  53%|█████▎    | 88/167 [00:03<00:03, 25.41it/s][A
Epoch 1:  84%|████████▍ | 5015/5971 [46:28<08:51,  1.80it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  54%|█████▍    | 91/167 [00:04<00:02, 25.91it/s][A

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 25.38it/s][A
Epoch 1:  84%|████████▍ | 5019/5971 [46:28<08:48,  1.80it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 24.67it/s][A
Epoch 1:  84%|████████▍ | 5023/5971 [46:29<08:46,  1.80it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 25.87it/s][A
Epoch 1:  84%|████████▍ | 5027/5971 [46:29<08:43,  1.80it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 26.82it/s][A
Epoch 1:  84%|████████▍ | 5031/5971 [46:29<08:41,  1.80it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 27.27it/s][A

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 27.57it/s][A
Epoch 1:  84%|████████▍ | 5035/5971 [46:29<08:38,  1.81it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  68%|██████▊   | 113/167 [00:04<00:01, 28.18it/s][A
Epoch 1:  84%|████████▍ | 5039/5971 [46:29<08:35,  1.81it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  69%|██████▉   | 116/167 [00:04<00:01, 28.41it/s][A
Epoch 1:  84%|████████▍ | 5043/5971 [46:29<08:33,  1.81it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 27.12it/s][A

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 26.80it/s][A
Epoch 1:  85%|████████▍ | 5047/5971 [46:29<08:30,  1.81it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 27.06it/s][A
Epoch 1:  85%|████████▍ | 5051/5971 [46:30<08:28,  1.81it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 26.69it/s][A
Epoch 1:  85%|████████▍ | 5055/5971 [46:30<08:25,  1.81it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 26.51it/s][A

Validating:  80%|████████  | 134/167 [00:05<00:01, 26.63it/s][A
Epoch 1:  85%|████████▍ | 5059/5971 [46:30<08:22,  1.81it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 27.23it/s][A
Epoch 1:  85%|████████▍ | 5063/5971 [46:30<08:20,  1.81it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  84%|████████▍ | 140/167 [00:05<00:00, 27.09it/s][A
Epoch 1:  85%|████████▍ | 5067/5971 [46:30<08:17,  1.82it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  86%|████████▌ | 143/167 [00:05<00:00, 26.24it/s][A

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 26.04it/s][A
Epoch 1:  85%|████████▍ | 5071/5971 [46:30<08:15,  1.82it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 26.10it/s][A
Epoch 1:  85%|████████▍ | 5075/5971 [46:31<08:12,  1.82it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 27.07it/s][A
Epoch 1:  85%|████████▌ | 5079/5971 [46:31<08:10,  1.82it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 26.99it/s][A

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 27.27it/s][A
Epoch 1:  85%|████████▌ | 5083/5971 [46:31<08:07,  1.82it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 27.42it/s][A
Epoch 1:  85%|████████▌ | 5087/5971 [46:31<08:04,  1.82it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  99%|█████████▉| 165/167 [00:06<00:00, 28.14it/s][A
Epoch 1:  85%|████████▌ | 5091/5971 [46:31<08:02,  1.82it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  85%|████████▌ | 5092/5971 [46:31<08:01,  1.82it/s, loss=0.142, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00384, train/loss_step=0.438, global_step=1049.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Epoch 1:  85%|████████▌ | 5093/5971 [46:32<08:01,  1.82it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00887, train/loss_vlb_step=4.03e-5, train/loss_step=0.00887, global_step=1050.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  85%|████████▌ | 5094/5971 [46:33<08:00,  1.82it/s, loss=0.151, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000633, train/loss_step=0.189, global_step=1050.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  85%|████████▌ | 5095/5971 [46:34<08:00,  1.82it/s, loss=0.151, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000633, train/loss_step=0.189, global_step=1050.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  85%|████████▌ | 5095/5971 [46:34<08:00,  1.82it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00272, train/loss_vlb_step=1.47e-5, train/loss_step=0.00272, global_step=1050.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  85%|████████▌ | 5096/5971 [46:37<08:00,  1.82it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0145, train/loss_vlb_step=6.2e-5, train/loss_step=0.0145, global_step=1050.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  85%|████████▌ | 5097/5971 [46:38<07:59,  1.82it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0478, train/loss_vlb_step=0.000167, train/loss_step=0.0478, global_step=1051.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  85%|████████▌ | 5098/5971 [46:39<07:59,  1.82it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.58e-5, train/loss_step=0.0121, global_step=1051.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  85%|████████▌ | 5099/5971 [46:39<07:58,  1.82it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.58e-5, train/loss_step=0.0121, global_step=1051.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  85%|████████▌ | 5099/5971 [46:39<07:58,  1.82it/s, loss=0.149, v_num=0, train/loss_simple_step=0.299, train/loss_vlb_step=0.00128, train/loss_step=0.299, global_step=1051.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  85%|████████▌ | 5100/5971 [46:42<07:58,  1.82it/s, loss=0.158, v_num=0, train/loss_simple_step=0.513, train/loss_vlb_step=0.00269, train/loss_step=0.513, global_step=1051.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  85%|████████▌ | 5101/5971 [46:43<07:58,  1.82it/s, loss=0.166, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000603, train/loss_step=0.172, global_step=1052.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  85%|████████▌ | 5102/5971 [46:44<07:57,  1.82it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0447, train/loss_vlb_step=0.000159, train/loss_step=0.0447, global_step=1052.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  85%|████████▌ | 5103/5971 [46:45<07:57,  1.82it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0447, train/loss_vlb_step=0.000159, train/loss_step=0.0447, global_step=1052.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  85%|████████▌ | 5103/5971 [46:45<07:57,  1.82it/s, loss=0.195, v_num=0, train/loss_simple_step=0.550, train/loss_vlb_step=0.00548, train/loss_step=0.550, global_step=1052.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  85%|████████▌ | 5104/5971 [46:47<07:56,  1.82it/s, loss=0.199, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.00055, train/loss_step=0.160, global_step=1052.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  85%|████████▌ | 5105/5971 [46:48<07:56,  1.82it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0338, train/loss_vlb_step=0.000124, train/loss_step=0.0338, global_step=1053.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5106/5971 [46:49<07:55,  1.82it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0242, train/loss_vlb_step=9.9e-5, train/loss_step=0.0242, global_step=1053.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  86%|████████▌ | 5107/5971 [46:50<07:55,  1.82it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0242, train/loss_vlb_step=9.9e-5, train/loss_step=0.0242, global_step=1053.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5107/5971 [46:50<07:55,  1.82it/s, loss=0.182, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000747, train/loss_step=0.224, global_step=1053.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5108/5971 [46:52<07:55,  1.82it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.94e-5, train/loss_step=0.0105, global_step=1053.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5109/5971 [46:53<07:54,  1.82it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.93e-5, train/loss_step=0.0128, global_step=1054.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5110/5971 [46:54<07:54,  1.82it/s, loss=0.159, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.000673, train/loss_step=0.187, global_step=1054.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  86%|████████▌ | 5111/5971 [46:55<07:53,  1.82it/s, loss=0.159, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.000673, train/loss_step=0.187, global_step=1054.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5111/5971 [46:55<07:53,  1.82it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.00012, train/loss_step=0.0305, global_step=1054.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5112/5971 [46:57<07:53,  1.81it/s, loss=0.136, v_num=0, train/loss_simple_step=0.178, train/loss_vlb_step=0.000642, train/loss_step=0.178, global_step=1054.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  86%|████████▌ | 5113/5971 [46:58<07:52,  1.81it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.51e-5, train/loss_step=0.0104, global_step=1055.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5114/5971 [46:59<07:52,  1.81it/s, loss=0.143, v_num=0, train/loss_simple_step=0.326, train/loss_vlb_step=0.00159, train/loss_step=0.326, global_step=1055.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  86%|████████▌ | 5115/5971 [47:00<07:51,  1.81it/s, loss=0.143, v_num=0, train/loss_simple_step=0.326, train/loss_vlb_step=0.00159, train/loss_step=0.326, global_step=1055.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5115/5971 [47:00<07:51,  1.81it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0511, train/loss_vlb_step=0.000174, train/loss_step=0.0511, global_step=1055.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5116/5971 [47:02<07:51,  1.81it/s, loss=0.183, v_num=0, train/loss_simple_step=0.766, train/loss_vlb_step=0.0268, train/loss_step=0.766, global_step=1055.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  86%|████████▌ | 5117/5971 [47:03<07:51,  1.81it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=7.06e-5, train/loss_step=0.0173, global_step=1056.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5118/5971 [47:04<07:50,  1.81it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0278, train/loss_vlb_step=0.000112, train/loss_step=0.0278, global_step=1056.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5119/5971 [47:05<07:50,  1.81it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0278, train/loss_vlb_step=0.000112, train/loss_step=0.0278, global_step=1056.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5119/5971 [47:05<07:50,  1.81it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.1e-5, train/loss_step=0.0149, global_step=1056.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  86%|████████▌ | 5120/5971 [47:07<07:49,  1.81it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00206, train/loss_vlb_step=1.25e-5, train/loss_step=0.00206, global_step=1056.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5121/5971 [47:08<07:49,  1.81it/s, loss=0.166, v_num=0, train/loss_simple_step=0.646, train/loss_vlb_step=0.010, train/loss_step=0.646, global_step=1057.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]      
Epoch 1:  86%|████████▌ | 5122/5971 [47:09<07:48,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0221, train/loss_vlb_step=8.87e-5, train/loss_step=0.0221, global_step=1057.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5123/5971 [47:10<07:48,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0221, train/loss_vlb_step=8.87e-5, train/loss_step=0.0221, global_step=1057.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5123/5971 [47:10<07:48,  1.81it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00182, train/loss_vlb_step=1.03e-5, train/loss_step=0.00182, global_step=1057.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5124/5971 [47:12<07:48,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.723, train/loss_vlb_step=0.0151, train/loss_step=0.723, global_step=1057.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  86%|████████▌ | 5125/5971 [47:13<07:47,  1.81it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0801, train/loss_vlb_step=0.000263, train/loss_step=0.0801, global_step=1058.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5126/5971 [47:14<07:47,  1.81it/s, loss=0.183, v_num=0, train/loss_simple_step=0.338, train/loss_vlb_step=0.00261, train/loss_step=0.338, global_step=1058.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  86%|████████▌ | 5127/5971 [47:15<07:46,  1.81it/s, loss=0.183, v_num=0, train/loss_simple_step=0.338, train/loss_vlb_step=0.00261, train/loss_step=0.338, global_step=1058.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5127/5971 [47:15<07:46,  1.81it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0578, train/loss_vlb_step=0.000201, train/loss_step=0.0578, global_step=1058.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5128/5971 [47:17<07:46,  1.81it/s, loss=0.175, v_num=0, train/loss_simple_step=0.00656, train/loss_vlb_step=3.25e-5, train/loss_step=0.00656, global_step=1058.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5129/5971 [47:18<07:45,  1.81it/s, loss=0.175, v_num=0, train/loss_simple_step=0.00342, train/loss_vlb_step=1.85e-5, train/loss_step=0.00342, global_step=1059.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5130/5971 [47:19<07:45,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00336, train/loss_vlb_step=1.91e-5, train/loss_step=0.00336, global_step=1059.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5131/5971 [47:19<07:44,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00336, train/loss_vlb_step=1.91e-5, train/loss_step=0.00336, global_step=1059.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5131/5971 [47:19<07:44,  1.81it/s, loss=0.173, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000668, train/loss_step=0.193, global_step=1059.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  86%|████████▌ | 5132/5971 [47:22<07:44,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00579, train/loss_vlb_step=2.95e-5, train/loss_step=0.00579, global_step=1059.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5133/5971 [47:22<07:44,  1.81it/s, loss=0.172, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000503, train/loss_step=0.151, global_step=1060.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  86%|████████▌ | 5134/5971 [47:23<07:43,  1.81it/s, loss=0.175, v_num=0, train/loss_simple_step=0.395, train/loss_vlb_step=0.00268, train/loss_step=0.395, global_step=1060.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  86%|████████▌ | 5135/5971 [47:24<07:43,  1.81it/s, loss=0.175, v_num=0, train/loss_simple_step=0.395, train/loss_vlb_step=0.00268, train/loss_step=0.395, global_step=1060.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5135/5971 [47:24<07:43,  1.81it/s, loss=0.193, v_num=0, train/loss_simple_step=0.402, train/loss_vlb_step=0.00219, train/loss_step=0.402, global_step=1060.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5136/5971 [47:27<07:42,  1.80it/s, loss=0.171, v_num=0, train/loss_simple_step=0.334, train/loss_vlb_step=0.00148, train/loss_step=0.334, global_step=1060.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5137/5971 [47:27<07:42,  1.80it/s, loss=0.171, v_num=0, train/loss_simple_step=0.00244, train/loss_vlb_step=1.41e-5, train/loss_step=0.00244, global_step=1061.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5138/5971 [47:28<07:41,  1.80it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0148, train/loss_vlb_step=6.45e-5, train/loss_step=0.0148, global_step=1061.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  86%|████████▌ | 5139/5971 [47:29<07:41,  1.80it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0148, train/loss_vlb_step=6.45e-5, train/loss_step=0.0148, global_step=1061.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5139/5971 [47:29<07:41,  1.80it/s, loss=0.186, v_num=0, train/loss_simple_step=0.328, train/loss_vlb_step=0.00151, train/loss_step=0.328, global_step=1061.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  86%|████████▌ | 5140/5971 [47:31<07:40,  1.80it/s, loss=0.221, v_num=0, train/loss_simple_step=0.712, train/loss_vlb_step=0.021, train/loss_step=0.712, global_step=1061.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  86%|████████▌ | 5141/5971 [47:32<07:40,  1.80it/s, loss=0.217, v_num=0, train/loss_simple_step=0.573, train/loss_vlb_step=0.0112, train/loss_step=0.573, global_step=1062.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5142/5971 [47:33<07:39,  1.80it/s, loss=0.224, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000536, train/loss_step=0.160, global_step=1062.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5143/5971 [47:34<07:39,  1.80it/s, loss=0.224, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000536, train/loss_step=0.160, global_step=1062.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5143/5971 [47:34<07:39,  1.80it/s, loss=0.239, v_num=0, train/loss_simple_step=0.290, train/loss_vlb_step=0.00121, train/loss_step=0.290, global_step=1062.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  86%|████████▌ | 5144/5971 [47:36<07:39,  1.80it/s, loss=0.203, v_num=0, train/loss_simple_step=0.00671, train/loss_vlb_step=3.38e-5, train/loss_step=0.00671, global_step=1062.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5145/5971 [47:37<07:38,  1.80it/s, loss=0.218, v_num=0, train/loss_simple_step=0.381, train/loss_vlb_step=0.00218, train/loss_step=0.381, global_step=1063.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  86%|████████▌ | 5146/5971 [47:38<07:38,  1.80it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0949, train/loss_vlb_step=0.000313, train/loss_step=0.0949, global_step=1063.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5147/5971 [47:39<07:37,  1.80it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0949, train/loss_vlb_step=0.000313, train/loss_step=0.0949, global_step=1063.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5147/5971 [47:39<07:37,  1.80it/s, loss=0.203, v_num=0, train/loss_simple_step=0.00667, train/loss_vlb_step=3.41e-5, train/loss_step=0.00667, global_step=1063.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▌ | 5148/5971 [47:41<07:37,  1.80it/s, loss=0.218, v_num=0, train/loss_simple_step=0.311, train/loss_vlb_step=0.00139, train/loss_step=0.311, global_step=1063.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  86%|████████▌ | 5149/5971 [47:42<07:36,  1.80it/s, loss=0.218, v_num=0, train/loss_simple_step=0.00345, train/loss_vlb_step=1.86e-5, train/loss_step=0.00345, global_step=1064.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▋ | 5150/5971 [47:43<07:36,  1.80it/s, loss=0.221, v_num=0, train/loss_simple_step=0.0543, train/loss_vlb_step=0.000185, train/loss_step=0.0543, global_step=1064.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  86%|████████▋ | 5151/5971 [47:44<07:35,  1.80it/s, loss=0.221, v_num=0, train/loss_simple_step=0.0543, train/loss_vlb_step=0.000185, train/loss_step=0.0543, global_step=1064.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▋ | 5151/5971 [47:44<07:35,  1.80it/s, loss=0.224, v_num=0, train/loss_simple_step=0.244, train/loss_vlb_step=0.000944, train/loss_step=0.244, global_step=1064.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  86%|████████▋ | 5152/5971 [47:46<07:35,  1.80it/s, loss=0.231, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000494, train/loss_step=0.150, global_step=1064.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▋ | 5153/5971 [47:47<07:35,  1.80it/s, loss=0.224, v_num=0, train/loss_simple_step=0.00704, train/loss_vlb_step=3.44e-5, train/loss_step=0.00704, global_step=1065.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▋ | 5154/5971 [47:48<07:34,  1.80it/s, loss=0.215, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000909, train/loss_step=0.233, global_step=1065.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  86%|████████▋ | 5155/5971 [47:49<07:34,  1.80it/s, loss=0.215, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000909, train/loss_step=0.233, global_step=1065.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▋ | 5155/5971 [47:49<07:34,  1.80it/s, loss=0.195, v_num=0, train/loss_simple_step=0.00218, train/loss_vlb_step=1.32e-5, train/loss_step=0.00218, global_step=1065.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▋ | 5156/5971 [47:51<07:33,  1.80it/s, loss=0.186, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000492, train/loss_step=0.149, global_step=1065.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  86%|████████▋ | 5157/5971 [47:52<07:33,  1.80it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0949, train/loss_vlb_step=0.000313, train/loss_step=0.0949, global_step=1066.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▋ | 5158/5971 [47:53<07:32,  1.80it/s, loss=0.193, v_num=0, train/loss_simple_step=0.059, train/loss_vlb_step=0.0002, train/loss_step=0.059, global_step=1066.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  86%|████████▋ | 5159/5971 [47:54<07:32,  1.80it/s, loss=0.193, v_num=0, train/loss_simple_step=0.059, train/loss_vlb_step=0.0002, train/loss_step=0.059, global_step=1066.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▋ | 5159/5971 [47:54<07:32,  1.80it/s, loss=0.192, v_num=0, train/loss_simple_step=0.301, train/loss_vlb_step=0.00148, train/loss_step=0.301, global_step=1066.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▋ | 5160/5971 [47:56<07:31,  1.79it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00999, train/loss_vlb_step=4.62e-5, train/loss_step=0.00999, global_step=1066.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▋ | 5161/5971 [47:57<07:31,  1.79it/s, loss=0.14, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.000913, train/loss_step=0.236, global_step=1067.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  86%|████████▋ | 5162/5971 [47:57<07:30,  1.79it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0859, train/loss_vlb_step=0.000285, train/loss_step=0.0859, global_step=1067.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▋ | 5163/5971 [47:58<07:30,  1.79it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0859, train/loss_vlb_step=0.000285, train/loss_step=0.0859, global_step=1067.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  86%|████████▋ | 5163/5971 [47:58<07:30,  1.79it/s, loss=0.13, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000571, train/loss_step=0.164, global_step=1067.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  86%|████████▋ | 5164/5971 [48:01<07:30,  1.79it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0276, train/loss_vlb_step=0.000112, train/loss_step=0.0276, global_step=1067.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  87%|████████▋ | 5165/5971 [48:02<07:29,  1.79it/s, loss=0.129, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00194, train/loss_step=0.353, global_step=1068.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  87%|████████▋ | 5166/5971 [48:02<07:29,  1.79it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0407, train/loss_vlb_step=0.000147, train/loss_step=0.0407, global_step=1068.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  87%|████████▋ | 5167/5971 [48:03<07:28,  1.79it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0407, train/loss_vlb_step=0.000147, train/loss_step=0.0407, global_step=1068.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  87%|████████▋ | 5167/5971 [48:03<07:28,  1.79it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00403, train/loss_vlb_step=2.13e-5, train/loss_step=0.00403, global_step=1068.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  87%|████████▋ | 5168/5971 [48:05<07:28,  1.79it/s, loss=0.118, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000445, train/loss_step=0.135, global_step=1068.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  87%|████████▋ | 5169/5971 [48:06<07:27,  1.79it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00305, train/loss_vlb_step=1.65e-5, train/loss_step=0.00305, global_step=1069.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  87%|████████▋ | 5170/5971 [48:07<07:27,  1.79it/s, loss=0.12, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000335, train/loss_step=0.101, global_step=1069.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  87%|████████▋ | 5171/5971 [48:08<07:26,  1.79it/s, loss=0.12, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000335, train/loss_step=0.101, global_step=1069.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  87%|████████▋ | 5171/5971 [48:08<07:26,  1.79it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0278, train/loss_vlb_step=0.000106, train/loss_step=0.0278, global_step=1069.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  87%|████████▋ | 5172/5971 [48:10<07:26,  1.79it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00911, train/loss_vlb_step=4.32e-5, train/loss_step=0.00911, global_step=1069.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  87%|████████▋ | 5173/5971 [48:11<07:25,  1.79it/s, loss=0.114, v_num=0, train/loss_simple_step=0.244, train/loss_vlb_step=0.00108, train/loss_step=0.244, global_step=1070.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  87%|████████▋ | 5174/5971 [48:12<07:25,  1.79it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00408, train/loss_vlb_step=2.18e-5, train/loss_step=0.00408, global_step=1070.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  87%|████████▋ | 5175/5971 [48:13<07:24,  1.79it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00408, train/loss_vlb_step=2.18e-5, train/loss_step=0.00408, global_step=1070.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  87%|████████▋ | 5175/5971 [48:13<07:24,  1.79it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00298, train/loss_vlb_step=1.65e-5, train/loss_step=0.00298, global_step=1070.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  87%|████████▋ | 5176/5971 [48:16<07:24,  1.79it/s, loss=0.108, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000892, train/loss_step=0.255, global_step=1070.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  87%|████████▋ | 5177/5971 [48:17<07:24,  1.79it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0828, train/loss_vlb_step=0.000273, train/loss_step=0.0828, global_step=1071.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  87%|████████▋ | 5178/5971 [48:17<07:23,  1.79it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0833, train/loss_vlb_step=0.000277, train/loss_step=0.0833, global_step=1071.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  87%|████████▋ | 5179/5971 [48:18<07:23,  1.79it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0833, train/loss_vlb_step=0.000277, train/loss_step=0.0833, global_step=1071.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  87%|████████▋ | 5179/5971 [48:18<07:23,  1.79it/s, loss=0.0952, v_num=0, train/loss_simple_step=0.035, train/loss_vlb_step=0.00013, train/loss_step=0.035, global_step=1071.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  87%|████████▋ | 5180/5971 [48:21<07:22,  1.79it/s, loss=0.0948, v_num=0, train/loss_simple_step=0.00163, train/loss_vlb_step=9.59e-6, train/loss_step=0.00163, global_step=1071.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  87%|████████▋ | 5181/5971 [48:22<07:22,  1.79it/s, loss=0.0832, v_num=0, train/loss_simple_step=0.00466, train/loss_vlb_step=2.32e-5, train/loss_step=0.00466, global_step=1072.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  87%|████████▋ | 5182/5971 [48:22<07:21,  1.79it/s, loss=0.0795, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.6e-5, train/loss_step=0.0102, global_step=1072.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  87%|████████▋ | 5183/5971 [48:23<07:21,  1.79it/s, loss=0.0795, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.6e-5, train/loss_step=0.0102, global_step=1072.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  87%|████████▋ | 5183/5971 [48:23<07:21,  1.79it/s, loss=0.0754, v_num=0, train/loss_simple_step=0.0815, train/loss_vlb_step=0.000273, train/loss_step=0.0815, global_step=1072.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  87%|████████▋ | 5184/5971 [48:26<07:21,  1.78it/s, loss=0.0873, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.00115, train/loss_step=0.266, global_step=1072.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  87%|████████▋ | 5185/5971 [48:27<07:20,  1.78it/s, loss=0.0842, v_num=0, train/loss_simple_step=0.290, train/loss_vlb_step=0.00127, train/loss_step=0.290, global_step=1073.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  87%|████████▋ | 5186/5971 [48:28<07:20,  1.78it/s, loss=0.0825, v_num=0, train/loss_simple_step=0.00708, train/loss_vlb_step=3.49e-5, train/loss_step=0.00708, global_step=1073.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  87%|████████▋ | 5187/5971 [48:28<07:19,  1.78it/s, loss=0.0825, v_num=0, train/loss_simple_step=0.00708, train/loss_vlb_step=3.49e-5, train/loss_step=0.00708, global_step=1073.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  87%|████████▋ | 5187/5971 [48:28<07:19,  1.78it/s, loss=0.093, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000838, train/loss_step=0.214, global_step=1073.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  87%|████████▋ | 5188/5971 [48:31<07:19,  1.78it/s, loss=0.113, v_num=0, train/loss_simple_step=0.528, train/loss_vlb_step=0.00492, train/loss_step=0.528, global_step=1073.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  87%|████████▋ | 5189/5971 [48:32<07:18,  1.78it/s, loss=0.134, v_num=0, train/loss_simple_step=0.433, train/loss_vlb_step=0.00311, train/loss_step=0.433, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  87%|████████▋ | 5190/5971 [48:33<07:18,  1.78it/s, loss=0.139, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000693, train/loss_step=0.203, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  87%|████████▋ | 5191/5971 [48:33<07:17,  1.78it/s, loss=0.139, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000693, train/loss_step=0.203, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  87%|████████▋ | 5191/5971 [48:33<07:17,  1.78it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0825, train/loss_vlb_step=0.000286, train/loss_step=0.0825, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  87%|████████▋ | 5192/5971 [48:36<07:17,  1.78it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:09,  2.41it/s][A

Validating:   1%|          | 2/167 [00:00<00:43,  3.81it/s][A
Epoch 1:  87%|████████▋ | 5195/5971 [48:36<07:15,  1.78it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   2%|▏         | 4/167 [00:00<00:21,  7.53it/s][A
Epoch 1:  87%|████████▋ | 5199/5971 [48:36<07:13,  1.78it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   4%|▍         | 7/167 [00:00<00:12, 12.95it/s][A

Validating:   6%|▌         | 10/167 [00:00<00:09, 16.02it/s][A
Epoch 1:  87%|████████▋ | 5203/5971 [48:36<07:10,  1.78it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   8%|▊         | 13/167 [00:01<00:08, 18.87it/s][A
Epoch 1:  87%|████████▋ | 5207/5971 [48:37<07:07,  1.79it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  10%|▉         | 16/167 [00:01<00:07, 21.31it/s][A
Epoch 1:  87%|████████▋ | 5211/5971 [48:37<07:05,  1.79it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  11%|█▏        | 19/167 [00:01<00:06, 22.48it/s][A

Validating:  13%|█▎        | 22/167 [00:01<00:06, 24.07it/s][A
Epoch 1:  87%|████████▋ | 5215/5971 [48:37<07:02,  1.79it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  15%|█▍        | 25/167 [00:01<00:05, 25.31it/s][A
Epoch 1:  87%|████████▋ | 5219/5971 [48:37<07:00,  1.79it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  17%|█▋        | 28/167 [00:01<00:05, 25.95it/s][A
Epoch 1:  87%|████████▋ | 5223/5971 [48:37<06:57,  1.79it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  19%|█▊        | 31/167 [00:01<00:05, 26.27it/s][A

Validating:  20%|██        | 34/167 [00:01<00:05, 26.52it/s][A
Epoch 1:  88%|████████▊ | 5227/5971 [48:37<06:55,  1.79it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  22%|██▏       | 37/167 [00:01<00:04, 26.94it/s][A
Epoch 1:  88%|████████▊ | 5231/5971 [48:38<06:52,  1.79it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  24%|██▍       | 40/167 [00:02<00:04, 26.30it/s][A
Epoch 1:  88%|████████▊ | 5235/5971 [48:38<06:50,  1.79it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  26%|██▌       | 43/167 [00:02<00:04, 25.55it/s][A

Validating:  28%|██▊       | 46/167 [00:02<00:04, 24.80it/s][A
Epoch 1:  88%|████████▊ | 5239/5971 [48:38<06:47,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 26.85it/s][A
Epoch 1:  88%|████████▊ | 5243/5971 [48:38<06:45,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 26.15it/s][A
Epoch 1:  88%|████████▊ | 5247/5971 [48:38<06:42,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 26.20it/s][A
Epoch 1:  88%|████████▊ | 5251/5971 [48:38<06:40,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  35%|███▌      | 59/167 [00:02<00:04, 25.51it/s][A

Validating:  37%|███▋      | 62/167 [00:02<00:04, 26.02it/s][A
Epoch 1:  88%|████████▊ | 5255/5971 [48:38<06:37,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  39%|███▉      | 65/167 [00:02<00:03, 26.48it/s][A
Epoch 1:  88%|████████▊ | 5259/5971 [48:39<06:35,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  41%|████      | 68/167 [00:03<00:03, 26.03it/s][A
Epoch 1:  88%|████████▊ | 5263/5971 [48:39<06:32,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 25.66it/s][A

Validating:  44%|████▍     | 74/167 [00:03<00:03, 25.82it/s][A
Epoch 1:  88%|████████▊ | 5267/5971 [48:39<06:30,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 24.86it/s][A
Epoch 1:  88%|████████▊ | 5271/5971 [48:39<06:27,  1.81it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 25.63it/s][A
Epoch 1:  88%|████████▊ | 5275/5971 [48:39<06:25,  1.81it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 25.18it/s][A

Validating:  51%|█████▏    | 86/167 [00:03<00:03, 25.21it/s][A
Epoch 1:  88%|████████▊ | 5279/5971 [48:39<06:22,  1.81it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  53%|█████▎    | 89/167 [00:03<00:02, 26.19it/s][A
Epoch 1:  88%|████████▊ | 5283/5971 [48:40<06:20,  1.81it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 26.92it/s][A
Epoch 1:  89%|████████▊ | 5287/5971 [48:40<06:17,  1.81it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 25.33it/s][A

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 24.00it/s][A
Epoch 1:  89%|████████▊ | 5291/5971 [48:40<06:15,  1.81it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  61%|██████    | 102/167 [00:04<00:02, 25.62it/s][A
Epoch 1:  89%|████████▊ | 5295/5971 [48:40<06:12,  1.81it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 26.40it/s][A
Epoch 1:  89%|████████▊ | 5299/5971 [48:40<06:10,  1.81it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 26.50it/s][A
Epoch 1:  89%|████████▉ | 5303/5971 [48:40<06:07,  1.82it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 26.10it/s][A
Epoch 1:  89%|████████▉ | 5307/5971 [48:40<06:05,  1.82it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  69%|██████▉   | 115/167 [00:04<00:01, 27.86it/s][A

Validating:  71%|███████   | 118/167 [00:05<00:01, 28.12it/s][A
Epoch 1:  89%|████████▉ | 5311/5971 [48:41<06:02,  1.82it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 27.39it/s][A
Epoch 1:  89%|████████▉ | 5315/5971 [48:41<06:00,  1.82it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 27.77it/s][A
Epoch 1:  89%|████████▉ | 5319/5971 [48:41<05:58,  1.82it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 26.53it/s][A

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 27.28it/s][A
Epoch 1:  89%|████████▉ | 5323/5971 [48:41<05:55,  1.82it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 27.17it/s][A
Epoch 1:  89%|████████▉ | 5327/5971 [48:41<05:53,  1.82it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 28.18it/s][A
Epoch 1:  89%|████████▉ | 5331/5971 [48:41<05:50,  1.82it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  84%|████████▍ | 140/167 [00:05<00:01, 26.55it/s][A
Epoch 1:  89%|████████▉ | 5335/5971 [48:41<05:48,  1.83it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  86%|████████▌ | 143/167 [00:05<00:00, 26.36it/s][A

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 25.12it/s][A
Epoch 1:  89%|████████▉ | 5339/5971 [48:42<05:45,  1.83it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 25.70it/s][A
Epoch 1:  89%|████████▉ | 5343/5971 [48:42<05:43,  1.83it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 25.16it/s][A
Epoch 1:  90%|████████▉ | 5347/5971 [48:42<05:40,  1.83it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 25.39it/s][A

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 25.75it/s][A
Epoch 1:  90%|████████▉ | 5351/5971 [48:42<05:38,  1.83it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 25.94it/s][A
Epoch 1:  90%|████████▉ | 5355/5971 [48:42<05:36,  1.83it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 24.35it/s][A
Epoch 1:  90%|████████▉ | 5359/5971 [48:42<05:33,  1.83it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating: 100%|██████████| 167/167 [00:06<00:00, 25.10it/s][A
Epoch 1:  90%|████████▉ | 5360/5971 [48:43<05:33,  1.83it/s, loss=0.156, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00106, train/loss_step=0.292, global_step=1074.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Epoch 1:  90%|████████▉ | 5361/5971 [48:44<05:32,  1.83it/s, loss=0.153, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.00061, train/loss_step=0.176, global_step=1075.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|████████▉ | 5362/5971 [48:45<05:32,  1.83it/s, loss=0.195, v_num=0, train/loss_simple_step=0.851, train/loss_vlb_step=0.044, train/loss_step=0.851, global_step=1075.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  90%|████████▉ | 5363/5971 [48:45<05:31,  1.83it/s, loss=0.195, v_num=0, train/loss_simple_step=0.851, train/loss_vlb_step=0.044, train/loss_step=0.851, global_step=1075.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|████████▉ | 5363/5971 [48:45<05:31,  1.83it/s, loss=0.214, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00251, train/loss_step=0.376, global_step=1075.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|████████▉ | 5364/5971 [48:48<05:31,  1.83it/s, loss=0.216, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00143, train/loss_step=0.305, global_step=1075.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|████████▉ | 5365/5971 [48:49<05:30,  1.83it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.39e-5, train/loss_step=0.0126, global_step=1076.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|████████▉ | 5366/5971 [48:50<05:30,  1.83it/s, loss=0.209, v_num=0, train/loss_simple_step=0.0103, train/loss_vlb_step=4.8e-5, train/loss_step=0.0103, global_step=1076.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  90%|████████▉ | 5367/5971 [48:51<05:29,  1.83it/s, loss=0.209, v_num=0, train/loss_simple_step=0.0103, train/loss_vlb_step=4.8e-5, train/loss_step=0.0103, global_step=1076.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|████████▉ | 5367/5971 [48:51<05:29,  1.83it/s, loss=0.208, v_num=0, train/loss_simple_step=0.00868, train/loss_vlb_step=4.28e-5, train/loss_step=0.00868, global_step=1076.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|████████▉ | 5368/5971 [48:54<05:29,  1.83it/s, loss=0.208, v_num=0, train/loss_simple_step=0.00195, train/loss_vlb_step=1.15e-5, train/loss_step=0.00195, global_step=1076.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|████████▉ | 5369/5971 [48:55<05:29,  1.83it/s, loss=0.208, v_num=0, train/loss_simple_step=0.0168, train/loss_vlb_step=6.97e-5, train/loss_step=0.0168, global_step=1077.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  90%|████████▉ | 5370/5971 [48:56<05:28,  1.83it/s, loss=0.208, v_num=0, train/loss_simple_step=0.00405, train/loss_vlb_step=2.16e-5, train/loss_step=0.00405, global_step=1077.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|████████▉ | 5371/5971 [48:56<05:28,  1.83it/s, loss=0.208, v_num=0, train/loss_simple_step=0.00405, train/loss_vlb_step=2.16e-5, train/loss_step=0.00405, global_step=1077.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|████████▉ | 5371/5971 [48:56<05:28,  1.83it/s, loss=0.21, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000427, train/loss_step=0.129, global_step=1077.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  90%|████████▉ | 5372/5971 [48:59<05:27,  1.83it/s, loss=0.204, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000459, train/loss_step=0.138, global_step=1077.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|████████▉ | 5373/5971 [49:00<05:27,  1.83it/s, loss=0.19, v_num=0, train/loss_simple_step=0.00956, train/loss_vlb_step=4.44e-5, train/loss_step=0.00956, global_step=1078.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|█████████ | 5374/5971 [49:01<05:26,  1.83it/s, loss=0.195, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000357, train/loss_step=0.108, global_step=1078.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  90%|█████████ | 5375/5971 [49:02<05:26,  1.83it/s, loss=0.195, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000357, train/loss_step=0.108, global_step=1078.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|█████████ | 5375/5971 [49:02<05:26,  1.83it/s, loss=0.185, v_num=0, train/loss_simple_step=0.00399, train/loss_vlb_step=1.93e-5, train/loss_step=0.00399, global_step=1078.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|█████████ | 5376/5971 [49:04<05:25,  1.83it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0195, train/loss_vlb_step=8.24e-5, train/loss_step=0.0195, global_step=1078.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  90%|█████████ | 5377/5971 [49:05<05:25,  1.83it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0078, train/loss_vlb_step=3.74e-5, train/loss_step=0.0078, global_step=1079.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|█████████ | 5378/5971 [49:06<05:24,  1.83it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00604, train/loss_vlb_step=3e-5, train/loss_step=0.00604, global_step=1079.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  90%|█████████ | 5379/5971 [49:07<05:24,  1.83it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00604, train/loss_vlb_step=3e-5, train/loss_step=0.00604, global_step=1079.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|█████████ | 5379/5971 [49:07<05:24,  1.83it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00366, train/loss_vlb_step=1.95e-5, train/loss_step=0.00366, global_step=1079.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|█████████ | 5380/5971 [49:09<05:23,  1.82it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00376, train/loss_vlb_step=2.02e-5, train/loss_step=0.00376, global_step=1079.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  90%|█████████ | 5381/5971 [49:10<05:23,  1.82it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0517, train/loss_vlb_step=0.000181, train/loss_step=0.0517, global_step=1080.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|█████████ | 5382/5971 [49:11<05:22,  1.82it/s, loss=0.0625, v_num=0, train/loss_simple_step=0.0341, train/loss_vlb_step=0.000129, train/loss_step=0.0341, global_step=1080.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|█████████ | 5383/5971 [49:12<05:22,  1.82it/s, loss=0.0625, v_num=0, train/loss_simple_step=0.0341, train/loss_vlb_step=0.000129, train/loss_step=0.0341, global_step=1080.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|█████████ | 5383/5971 [49:12<05:22,  1.82it/s, loss=0.0626, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00164, train/loss_step=0.376, global_step=1080.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  90%|█████████ | 5384/5971 [49:15<05:22,  1.82it/s, loss=0.049, v_num=0, train/loss_simple_step=0.0328, train/loss_vlb_step=0.000129, train/loss_step=0.0328, global_step=1080.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|█████████ | 5385/5971 [49:15<05:21,  1.82it/s, loss=0.0546, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000414, train/loss_step=0.124, global_step=1081.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  90%|█████████ | 5386/5971 [49:16<05:21,  1.82it/s, loss=0.0599, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000393, train/loss_step=0.118, global_step=1081.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|█████████ | 5387/5971 [49:17<05:20,  1.82it/s, loss=0.0599, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000393, train/loss_step=0.118, global_step=1081.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|█████████ | 5387/5971 [49:17<05:20,  1.82it/s, loss=0.0665, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000462, train/loss_step=0.139, global_step=1081.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|█████████ | 5388/5971 [49:19<05:20,  1.82it/s, loss=0.0666, v_num=0, train/loss_simple_step=0.00398, train/loss_vlb_step=2.23e-5, train/loss_step=0.00398, global_step=1081.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|█████████ | 5389/5971 [49:20<05:19,  1.82it/s, loss=0.108, v_num=0, train/loss_simple_step=0.842, train/loss_vlb_step=0.0718, train/loss_step=0.842, global_step=1082.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]      
Epoch 1:  90%|█████████ | 5390/5971 [49:21<05:19,  1.82it/s, loss=0.122, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.00114, train/loss_step=0.283, global_step=1082.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|█████████ | 5391/5971 [49:22<05:18,  1.82it/s, loss=0.122, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.00114, train/loss_step=0.283, global_step=1082.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|█████████ | 5391/5971 [49:22<05:18,  1.82it/s, loss=0.124, v_num=0, train/loss_simple_step=0.167, train/loss_vlb_step=0.000605, train/loss_step=0.167, global_step=1082.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|█████████ | 5392/5971 [49:24<05:18,  1.82it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00429, train/loss_vlb_step=2.24e-5, train/loss_step=0.00429, global_step=1082.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|█████████ | 5393/5971 [49:25<05:17,  1.82it/s, loss=0.133, v_num=0, train/loss_simple_step=0.338, train/loss_vlb_step=0.00194, train/loss_step=0.338, global_step=1083.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  90%|█████████ | 5394/5971 [49:26<05:17,  1.82it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0398, train/loss_vlb_step=0.000141, train/loss_step=0.0398, global_step=1083.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|█████████ | 5395/5971 [49:27<05:16,  1.82it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0398, train/loss_vlb_step=0.000141, train/loss_step=0.0398, global_step=1083.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|█████████ | 5395/5971 [49:27<05:16,  1.82it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0354, train/loss_vlb_step=0.000138, train/loss_step=0.0354, global_step=1083.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|█████████ | 5396/5971 [49:29<05:16,  1.82it/s, loss=0.136, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000352, train/loss_step=0.107, global_step=1083.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  90%|█████████ | 5397/5971 [49:30<05:15,  1.82it/s, loss=0.148, v_num=0, train/loss_simple_step=0.246, train/loss_vlb_step=0.00125, train/loss_step=0.246, global_step=1084.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  90%|█████████ | 5398/5971 [49:31<05:15,  1.82it/s, loss=0.158, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000751, train/loss_step=0.205, global_step=1084.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|█████████ | 5399/5971 [49:32<05:14,  1.82it/s, loss=0.158, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000751, train/loss_step=0.205, global_step=1084.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|█████████ | 5399/5971 [49:32<05:14,  1.82it/s, loss=0.159, v_num=0, train/loss_simple_step=0.034, train/loss_vlb_step=0.000126, train/loss_step=0.034, global_step=1084.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|█████████ | 5400/5971 [49:34<05:14,  1.82it/s, loss=0.181, v_num=0, train/loss_simple_step=0.441, train/loss_vlb_step=0.00374, train/loss_step=0.441, global_step=1084.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  90%|█████████ | 5401/5971 [49:35<05:13,  1.82it/s, loss=0.191, v_num=0, train/loss_simple_step=0.244, train/loss_vlb_step=0.000896, train/loss_step=0.244, global_step=1085.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|█████████ | 5402/5971 [49:36<05:13,  1.82it/s, loss=0.196, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000482, train/loss_step=0.144, global_step=1085.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|█████████ | 5403/5971 [49:37<05:12,  1.82it/s, loss=0.196, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000482, train/loss_step=0.144, global_step=1085.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  90%|█████████ | 5403/5971 [49:37<05:12,  1.82it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00278, train/loss_vlb_step=1.62e-5, train/loss_step=0.00278, global_step=1085.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5404/5971 [49:39<05:12,  1.81it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00362, train/loss_vlb_step=1.85e-5, train/loss_step=0.00362, global_step=1085.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5405/5971 [49:40<05:12,  1.81it/s, loss=0.171, v_num=0, train/loss_simple_step=0.015, train/loss_vlb_step=6.44e-5, train/loss_step=0.015, global_step=1086.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  91%|█████████ | 5406/5971 [49:41<05:11,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00394, train/loss_vlb_step=2.12e-5, train/loss_step=0.00394, global_step=1086.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5407/5971 [49:42<05:11,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00394, train/loss_vlb_step=2.12e-5, train/loss_step=0.00394, global_step=1086.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5407/5971 [49:42<05:11,  1.81it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00379, train/loss_vlb_step=2.09e-5, train/loss_step=0.00379, global_step=1086.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5408/5971 [49:44<05:10,  1.81it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00836, train/loss_vlb_step=4e-5, train/loss_step=0.00836, global_step=1086.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  91%|█████████ | 5409/5971 [49:45<05:10,  1.81it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0152, train/loss_vlb_step=6.96e-5, train/loss_step=0.0152, global_step=1087.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5410/5971 [49:46<05:09,  1.81it/s, loss=0.104, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.59e-5, train/loss_step=0.021, global_step=1087.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  91%|█████████ | 5411/5971 [49:47<05:09,  1.81it/s, loss=0.104, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.59e-5, train/loss_step=0.021, global_step=1087.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5411/5971 [49:47<05:09,  1.81it/s, loss=0.101, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000367, train/loss_step=0.111, global_step=1087.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5412/5971 [49:49<05:08,  1.81it/s, loss=0.112, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.000843, train/loss_step=0.229, global_step=1087.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5413/5971 [49:50<05:08,  1.81it/s, loss=0.0991, v_num=0, train/loss_simple_step=0.0701, train/loss_vlb_step=0.00023, train/loss_step=0.0701, global_step=1088.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5414/5971 [49:51<05:07,  1.81it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0577, train/loss_vlb_step=0.000196, train/loss_step=0.0577, global_step=1088.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  91%|█████████ | 5415/5971 [49:52<05:07,  1.81it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0577, train/loss_vlb_step=0.000196, train/loss_step=0.0577, global_step=1088.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5415/5971 [49:52<05:07,  1.81it/s, loss=0.106, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000559, train/loss_step=0.166, global_step=1088.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5416/5971 [49:54<05:06,  1.81it/s, loss=0.109, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000516, train/loss_step=0.154, global_step=1088.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5417/5971 [49:55<05:06,  1.81it/s, loss=0.0977, v_num=0, train/loss_simple_step=0.0231, train/loss_vlb_step=8.74e-5, train/loss_step=0.0231, global_step=1089.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5418/5971 [49:56<05:05,  1.81it/s, loss=0.0987, v_num=0, train/loss_simple_step=0.226, train/loss_vlb_step=0.000784, train/loss_step=0.226, global_step=1089.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  91%|█████████ | 5419/5971 [49:56<05:05,  1.81it/s, loss=0.0987, v_num=0, train/loss_simple_step=0.226, train/loss_vlb_step=0.000784, train/loss_step=0.226, global_step=1089.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5419/5971 [49:56<05:05,  1.81it/s, loss=0.105, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000579, train/loss_step=0.168, global_step=1089.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  91%|█████████ | 5420/5971 [49:59<05:04,  1.81it/s, loss=0.0933, v_num=0, train/loss_simple_step=0.199, train/loss_vlb_step=0.000767, train/loss_step=0.199, global_step=1089.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5421/5971 [50:00<05:04,  1.81it/s, loss=0.0858, v_num=0, train/loss_simple_step=0.0939, train/loss_vlb_step=0.000309, train/loss_step=0.0939, global_step=1090.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5422/5971 [50:01<05:03,  1.81it/s, loss=0.079, v_num=0, train/loss_simple_step=0.00806, train/loss_vlb_step=3.77e-5, train/loss_step=0.00806, global_step=1090.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5423/5971 [50:01<05:03,  1.81it/s, loss=0.079, v_num=0, train/loss_simple_step=0.00806, train/loss_vlb_step=3.77e-5, train/loss_step=0.00806, global_step=1090.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5423/5971 [50:01<05:03,  1.81it/s, loss=0.079, v_num=0, train/loss_simple_step=0.00295, train/loss_vlb_step=1.68e-5, train/loss_step=0.00295, global_step=1090.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5424/5971 [50:04<05:02,  1.81it/s, loss=0.104, v_num=0, train/loss_simple_step=0.509, train/loss_vlb_step=0.00481, train/loss_step=0.509, global_step=1090.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  91%|█████████ | 5425/5971 [50:05<05:02,  1.81it/s, loss=0.109, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000341, train/loss_step=0.103, global_step=1091.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5426/5971 [50:06<05:01,  1.81it/s, loss=0.127, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00194, train/loss_step=0.365, global_step=1091.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  91%|█████████ | 5427/5971 [50:07<05:01,  1.80it/s, loss=0.127, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00194, train/loss_step=0.365, global_step=1091.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5427/5971 [50:07<05:01,  1.80it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0254, train/loss_vlb_step=9.71e-5, train/loss_step=0.0254, global_step=1091.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5428/5971 [50:09<05:00,  1.80it/s, loss=0.137, v_num=0, train/loss_simple_step=0.198, train/loss_vlb_step=0.000694, train/loss_step=0.198, global_step=1091.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  91%|█████████ | 5429/5971 [50:10<05:00,  1.80it/s, loss=0.138, v_num=0, train/loss_simple_step=0.025, train/loss_vlb_step=9.66e-5, train/loss_step=0.025, global_step=1092.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  91%|█████████ | 5430/5971 [50:11<04:59,  1.80it/s, loss=0.179, v_num=0, train/loss_simple_step=0.838, train/loss_vlb_step=0.0337, train/loss_step=0.838, global_step=1092.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  91%|█████████ | 5431/5971 [50:12<04:59,  1.80it/s, loss=0.179, v_num=0, train/loss_simple_step=0.838, train/loss_vlb_step=0.0337, train/loss_step=0.838, global_step=1092.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5431/5971 [50:12<04:59,  1.80it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0959, train/loss_vlb_step=0.000315, train/loss_step=0.0959, global_step=1092.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5432/5971 [50:14<04:59,  1.80it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0244, train/loss_vlb_step=9.79e-5, train/loss_step=0.0244, global_step=1092.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  91%|█████████ | 5433/5971 [50:15<04:58,  1.80it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00219, train/loss_vlb_step=1.27e-5, train/loss_step=0.00219, global_step=1093.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5434/5971 [50:16<04:58,  1.80it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0201, train/loss_vlb_step=7.77e-5, train/loss_step=0.0201, global_step=1093.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  91%|█████████ | 5435/5971 [50:17<04:57,  1.80it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0201, train/loss_vlb_step=7.77e-5, train/loss_step=0.0201, global_step=1093.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5435/5971 [50:17<04:57,  1.80it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=1.92e-5, train/loss_step=0.00382, global_step=1093.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5436/5971 [50:19<04:57,  1.80it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0472, train/loss_vlb_step=0.000176, train/loss_step=0.0472, global_step=1093.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  91%|█████████ | 5437/5971 [50:20<04:56,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000563, train/loss_step=0.163, global_step=1094.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  91%|█████████ | 5438/5971 [50:21<04:56,  1.80it/s, loss=0.163, v_num=0, train/loss_simple_step=0.373, train/loss_vlb_step=0.00266, train/loss_step=0.373, global_step=1094.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  91%|█████████ | 5439/5971 [50:22<04:55,  1.80it/s, loss=0.163, v_num=0, train/loss_simple_step=0.373, train/loss_vlb_step=0.00266, train/loss_step=0.373, global_step=1094.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5439/5971 [50:22<04:55,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0145, train/loss_vlb_step=6.15e-5, train/loss_step=0.0145, global_step=1094.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5440/5971 [50:24<04:55,  1.80it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0337, train/loss_vlb_step=0.000125, train/loss_step=0.0337, global_step=1094.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5441/5971 [50:25<04:54,  1.80it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00205, train/loss_vlb_step=1.23e-5, train/loss_step=0.00205, global_step=1095.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5442/5971 [50:26<04:54,  1.80it/s, loss=0.145, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000187, train/loss_step=0.055, global_step=1095.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  91%|█████████ | 5443/5971 [50:27<04:53,  1.80it/s, loss=0.145, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000187, train/loss_step=0.055, global_step=1095.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5443/5971 [50:27<04:53,  1.80it/s, loss=0.161, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00143, train/loss_step=0.329, global_step=1095.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  91%|█████████ | 5444/5971 [50:29<04:53,  1.80it/s, loss=0.154, v_num=0, train/loss_simple_step=0.359, train/loss_vlb_step=0.00145, train/loss_step=0.359, global_step=1095.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5445/5971 [50:30<04:52,  1.80it/s, loss=0.162, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.00132, train/loss_step=0.268, global_step=1096.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5446/5971 [50:31<04:52,  1.80it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0343, train/loss_vlb_step=0.000126, train/loss_step=0.0343, global_step=1096.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5447/5971 [50:32<04:51,  1.80it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0343, train/loss_vlb_step=0.000126, train/loss_step=0.0343, global_step=1096.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5447/5971 [50:32<04:51,  1.80it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0922, train/loss_vlb_step=0.000304, train/loss_step=0.0922, global_step=1096.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████ | 5448/5971 [50:35<04:51,  1.80it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0994, train/loss_vlb_step=0.000327, train/loss_step=0.0994, global_step=1096.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████▏| 5449/5971 [50:36<04:50,  1.79it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0254, train/loss_vlb_step=0.000101, train/loss_step=0.0254, global_step=1097.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████▏| 5450/5971 [50:37<04:50,  1.79it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00739, train/loss_vlb_step=3.57e-5, train/loss_step=0.00739, global_step=1097.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████▏| 5451/5971 [50:38<04:49,  1.79it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00739, train/loss_vlb_step=3.57e-5, train/loss_step=0.00739, global_step=1097.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████▏| 5451/5971 [50:38<04:49,  1.79it/s, loss=0.0997, v_num=0, train/loss_simple_step=0.0401, train/loss_vlb_step=0.000145, train/loss_step=0.0401, global_step=1097.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████▏| 5452/5971 [50:40<04:49,  1.79it/s, loss=0.104, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000403, train/loss_step=0.120, global_step=1097.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  91%|█████████▏| 5453/5971 [50:41<04:48,  1.79it/s, loss=0.132, v_num=0, train/loss_simple_step=0.549, train/loss_vlb_step=0.00579, train/loss_step=0.549, global_step=1098.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  91%|█████████▏| 5454/5971 [50:42<04:48,  1.79it/s, loss=0.139, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.00055, train/loss_step=0.166, global_step=1098.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████▏| 5455/5971 [50:43<04:47,  1.79it/s, loss=0.139, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.00055, train/loss_step=0.166, global_step=1098.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████▏| 5455/5971 [50:43<04:47,  1.79it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0206, train/loss_vlb_step=8.47e-5, train/loss_step=0.0206, global_step=1098.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████▏| 5456/5971 [50:45<04:47,  1.79it/s, loss=0.15, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.00113, train/loss_step=0.249, global_step=1098.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  91%|█████████▏| 5457/5971 [50:46<04:46,  1.79it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0661, train/loss_vlb_step=0.000224, train/loss_step=0.0661, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████▏| 5458/5971 [50:46<04:46,  1.79it/s, loss=0.133, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.00041, train/loss_step=0.124, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  91%|█████████▏| 5459/5971 [50:47<04:45,  1.79it/s, loss=0.133, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.00041, train/loss_step=0.124, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████▏| 5459/5971 [50:47<04:45,  1.79it/s, loss=0.157, v_num=0, train/loss_simple_step=0.508, train/loss_vlb_step=0.00959, train/loss_step=0.508, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  91%|█████████▏| 5460/5971 [50:49<04:45,  1.79it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:24,  1.95it/s]
Epoch 1:  91%|█████████▏| 5463/5971 [50:50<04:43,  1.79it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   2%|▏         | 3/167 [00:00<00:28,  5.76it/s]

Validating:   4%|▎         | 6/167 [00:00<00:14, 10.83it/s]
Epoch 1:  92%|█████████▏| 5467/5971 [50:50<04:41,  1.79it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   5%|▍         | 8/167 [00:00<00:12, 12.98it/s]
Epoch 1:  92%|█████████▏| 5471/5971 [50:50<04:38,  1.79it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.56it/s]

Validating:   8%|▊         | 14/167 [00:01<00:08, 18.90it/s]
Epoch 1:  92%|█████████▏| 5475/5971 [50:51<04:36,  1.79it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  10%|█         | 17/167 [00:01<00:07, 21.06it/s]
Epoch 1:  92%|█████████▏| 5479/5971 [50:51<04:33,  1.80it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 24.09it/s]
Epoch 1:  92%|█████████▏| 5483/5971 [50:51<04:31,  1.80it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  14%|█▍        | 24/167 [00:01<00:06, 23.57it/s]
Epoch 1:  92%|█████████▏| 5487/5971 [50:51<04:29,  1.80it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 24.28it/s]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 24.36it/s]
Epoch 1:  92%|█████████▏| 5491/5971 [50:51<04:26,  1.80it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 24.16it/s]
Epoch 1:  92%|█████████▏| 5495/5971 [50:51<04:24,  1.80it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  22%|██▏       | 36/167 [00:01<00:05, 24.76it/s]
Epoch 1:  92%|█████████▏| 5499/5971 [50:51<04:21,  1.80it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  23%|██▎       | 39/167 [00:02<00:05, 25.00it/s]

Validating:  25%|██▌       | 42/167 [00:02<00:05, 24.92it/s]
Epoch 1:  92%|█████████▏| 5503/5971 [50:52<04:19,  1.80it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 25.22it/s]
Epoch 1:  92%|█████████▏| 5507/5971 [50:52<04:17,  1.80it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 25.48it/s]
Epoch 1:  92%|█████████▏| 5511/5971 [50:52<04:14,  1.81it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  31%|███       | 51/167 [00:02<00:04, 26.11it/s]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 26.36it/s]
Epoch 1:  92%|█████████▏| 5515/5971 [50:52<04:12,  1.81it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 27.08it/s]
Epoch 1:  92%|█████████▏| 5519/5971 [50:52<04:09,  1.81it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  36%|███▌      | 60/167 [00:02<00:03, 27.81it/s]
Epoch 1:  92%|█████████▏| 5523/5971 [50:52<04:07,  1.81it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  38%|███▊      | 63/167 [00:02<00:03, 26.79it/s]

Validating:  40%|███▉      | 66/167 [00:03<00:03, 26.68it/s]
Epoch 1:  93%|█████████▎| 5527/5971 [50:53<04:05,  1.81it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 25.48it/s]
Epoch 1:  93%|█████████▎| 5531/5971 [50:53<04:02,  1.81it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 26.02it/s]
Epoch 1:  93%|█████████▎| 5535/5971 [50:53<04:00,  1.81it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 27.15it/s]
Epoch 1:  93%|█████████▎| 5539/5971 [50:53<03:58,  1.81it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 27.73it/s]
Epoch 1:  93%|█████████▎| 5543/5971 [50:53<03:55,  1.82it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 27.38it/s]

Validating:  51%|█████▏    | 86/167 [00:03<00:03, 26.73it/s]
Epoch 1:  93%|█████████▎| 5547/5971 [50:53<03:53,  1.82it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  53%|█████▎    | 89/167 [00:03<00:03, 25.69it/s]
Epoch 1:  93%|█████████▎| 5551/5971 [50:53<03:51,  1.82it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 25.95it/s]
Epoch 1:  93%|█████████▎| 5555/5971 [50:54<03:48,  1.82it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 26.26it/s]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 27.18it/s]
Epoch 1:  93%|█████████▎| 5559/5971 [50:54<03:46,  1.82it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  60%|██████    | 101/167 [00:04<00:02, 26.67it/s]
Epoch 1:  93%|█████████▎| 5563/5971 [50:54<03:43,  1.82it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 26.43it/s]
Epoch 1:  93%|█████████▎| 5567/5971 [50:54<03:41,  1.82it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 26.83it/s]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 26.30it/s]
Epoch 1:  93%|█████████▎| 5571/5971 [50:54<03:39,  1.82it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  68%|██████▊   | 113/167 [00:04<00:02, 26.80it/s]
Epoch 1:  93%|█████████▎| 5575/5971 [50:54<03:36,  1.83it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  69%|██████▉   | 116/167 [00:04<00:01, 26.54it/s]
Epoch 1:  93%|█████████▎| 5579/5971 [50:54<03:34,  1.83it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 25.81it/s]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 24.69it/s]
Epoch 1:  94%|█████████▎| 5583/5971 [50:55<03:32,  1.83it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 25.85it/s]
Epoch 1:  94%|█████████▎| 5587/5971 [50:55<03:29,  1.83it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 25.07it/s]
Epoch 1:  94%|█████████▎| 5591/5971 [50:55<03:27,  1.83it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 25.97it/s]

Validating:  80%|████████  | 134/167 [00:05<00:01, 24.40it/s]
Epoch 1:  94%|█████████▎| 5595/5971 [50:55<03:25,  1.83it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 23.00it/s]
Epoch 1:  94%|█████████▍| 5599/5971 [50:55<03:22,  1.83it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  84%|████████▍ | 140/167 [00:05<00:01, 22.95it/s]
Epoch 1:  94%|█████████▍| 5603/5971 [50:56<03:20,  1.83it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  86%|████████▌ | 143/167 [00:06<00:01, 23.14it/s]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 22.87it/s]
Epoch 1:  94%|█████████▍| 5607/5971 [50:56<03:18,  1.83it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 23.60it/s]
Epoch 1:  94%|█████████▍| 5611/5971 [50:56<03:16,  1.84it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 24.15it/s]
Epoch 1:  94%|█████████▍| 5615/5971 [50:56<03:13,  1.84it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 24.78it/s][A

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 24.95it/s][A
Epoch 1:  94%|█████████▍| 5619/5971 [50:56<03:11,  1.84it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 26.86it/s][A
Epoch 1:  94%|█████████▍| 5623/5971 [50:56<03:09,  1.84it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 28.18it/s][A
Epoch 1:  94%|█████████▍| 5627/5971 [50:56<03:06,  1.84it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  94%|█████████▍| 5628/5971 [50:57<03:06,  1.84it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000115, train/loss_step=0.0285, global_step=1099.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.44it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.30it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.95it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.44it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.80it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.04it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.24it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.39it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.45it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.50it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.46it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.44it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.40it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.38it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.43it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.36it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.34it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.30it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.31it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.39it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.44it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.35it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.32it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.35it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.38it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.38it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.40it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.42it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.30it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.24it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.28it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.36it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.41it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.41it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.40it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.37it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.42it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.50it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.54it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.58it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.53it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.52it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.39it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.35it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.34it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.32it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.30it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.38it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.46it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.13it/s]

Epoch 1:  94%|█████████▍| 5629/5971 [51:09<03:06,  1.83it/s, loss=0.165, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000539, train/loss_step=0.163, global_step=1100.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.31it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.34it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.17it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.81it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.28it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.55it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.71it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.77it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  4.93it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.10it/s][A
Epoch 1:  94%|█████████▍| 5629/5971 [51:13<03:06,  1.83it/s, loss=0.165, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000539, train/loss_step=0.163, global_step=1100.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.23it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.33it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:06,  5.41it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.46it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:07,  4.94it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.05it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.17it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:06,  5.20it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:06,  5.08it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.08it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.22it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.32it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.41it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.46it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.44it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.46it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.48it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.49it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:06<00:03,  5.52it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.51it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.50it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.36it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.38it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:03,  5.01it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:03,  4.93it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  4.91it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  4.88it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  4.82it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:08<00:02,  4.82it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:02,  4.86it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  4.81it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  4.78it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  4.84it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:09<00:01,  4.87it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:01,  4.89it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  4.92it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  4.92it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  4.90it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:10<00:00,  4.82it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  4.81it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  4.85it/s]

Epoch 1:  94%|█████████▍| 5630/5971 [51:22<03:06,  1.83it/s, loss=0.165, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000539, train/loss_step=0.163, global_step=1100.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  94%|█████████▍| 5630/5971 [51:22<03:06,  1.83it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0242, train/loss_vlb_step=9.67e-5, train/loss_step=0.0242, global_step=1100.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.31it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.31it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:15,  3.04it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.62it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.13it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.51it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.75it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.80it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  4.82it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:08,  4.89it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.06it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.12it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:07,  5.09it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:07,  5.14it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.24it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.19it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.27it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:04<00:06,  5.29it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.27it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.26it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.27it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.28it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.35it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.30it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.19it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.19it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.19it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.28it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:06<00:04,  5.17it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.22it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.18it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.16it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.08it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:07<00:03,  5.23it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.18it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.19it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.08it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  4.98it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:08<00:02,  5.00it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.08it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.21it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.31it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.36it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:09<00:01,  5.28it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:00,  5.18it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.15it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.08it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.06it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:10<00:00,  5.06it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  5.15it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  4.89it/s]

Epoch 1:  94%|█████████▍| 5631/5971 [51:34<03:06,  1.82it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0242, train/loss_vlb_step=9.67e-5, train/loss_step=0.0242, global_step=1100.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  94%|█████████▍| 5631/5971 [51:34<03:06,  1.82it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00859, train/loss_vlb_step=3.99e-5, train/loss_step=0.00859, global_step=1100.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:24,  2.04it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:14,  3.23it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:00<00:11,  4.00it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:10,  4.52it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:09,  4.84it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:08,  5.07it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.24it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:07,  5.36it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:01<00:07,  5.44it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.49it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.54it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.51it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.50it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:02<00:06,  5.50it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.51it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.52it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.52it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.53it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.53it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:03<00:05,  5.49it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.48it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.50it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.48it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.47it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:04<00:04,  5.27it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.13it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.11it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.07it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:04,  5.10it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.10it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.01it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.09it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.15it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:03,  5.19it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.14it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.13it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.18it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.21it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.22it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.26it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.37it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.43it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.50it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.55it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.58it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.60it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.61it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.48it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.38it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.33it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.21it/s]

Epoch 1:  94%|█████████▍| 5632/5971 [51:48<03:07,  1.81it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00859, train/loss_vlb_step=3.99e-5, train/loss_step=0.00859, global_step=1100.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  94%|█████████▍| 5632/5971 [51:48<03:07,  1.81it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0232, train/loss_vlb_step=8.75e-5, train/loss_step=0.0232, global_step=1100.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  94%|█████████▍| 5633/5971 [51:49<03:06,  1.81it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0232, train/loss_vlb_step=8.75e-5, train/loss_step=0.0232, global_step=1100.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  94%|█████████▍| 5633/5971 [51:49<03:06,  1.81it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0399, train/loss_vlb_step=0.000144, train/loss_step=0.0399, global_step=1101.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  94%|█████████▍| 5634/5971 [51:50<03:06,  1.81it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0399, train/loss_vlb_step=0.000144, train/loss_step=0.0399, global_step=1101.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  94%|█████████▍| 5634/5971 [51:50<03:06,  1.81it/s, loss=0.128, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000677, train/loss_step=0.200, global_step=1101.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  94%|█████████▍| 5635/5971 [51:51<03:05,  1.81it/s, loss=0.128, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000677, train/loss_step=0.200, global_step=1101.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  94%|█████████▍| 5635/5971 [51:51<03:05,  1.81it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00489, train/loss_vlb_step=2.66e-5, train/loss_step=0.00489, global_step=1101.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  94%|█████████▍| 5636/5971 [51:53<03:05,  1.81it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00489, train/loss_vlb_step=2.66e-5, train/loss_step=0.00489, global_step=1101.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  94%|█████████▍| 5636/5971 [51:53<03:05,  1.81it/s, loss=0.131, v_num=0, train/loss_simple_step=0.246, train/loss_vlb_step=0.000872, train/loss_step=0.246, global_step=1101.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  94%|█████████▍| 5637/5971 [51:54<03:04,  1.81it/s, loss=0.131, v_num=0, train/loss_simple_step=0.246, train/loss_vlb_step=0.000872, train/loss_step=0.246, global_step=1101.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  94%|█████████▍| 5637/5971 [51:54<03:04,  1.81it/s, loss=0.139, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000772, train/loss_step=0.193, global_step=1102.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  94%|█████████▍| 5638/5971 [51:55<03:03,  1.81it/s, loss=0.139, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000772, train/loss_step=0.193, global_step=1102.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  94%|█████████▍| 5638/5971 [51:55<03:03,  1.81it/s, loss=0.154, v_num=0, train/loss_simple_step=0.300, train/loss_vlb_step=0.00125, train/loss_step=0.300, global_step=1102.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  94%|█████████▍| 5639/5971 [51:56<03:03,  1.81it/s, loss=0.154, v_num=0, train/loss_simple_step=0.300, train/loss_vlb_step=0.00125, train/loss_step=0.300, global_step=1102.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  94%|█████████▍| 5639/5971 [51:56<03:03,  1.81it/s, loss=0.169, v_num=0, train/loss_simple_step=0.350, train/loss_vlb_step=0.00179, train/loss_step=0.350, global_step=1102.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  94%|█████████▍| 5640/5971 [51:58<03:02,  1.81it/s, loss=0.169, v_num=0, train/loss_simple_step=0.350, train/loss_vlb_step=0.00179, train/loss_step=0.350, global_step=1102.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  94%|█████████▍| 5640/5971 [51:58<03:02,  1.81it/s, loss=0.176, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.00101, train/loss_step=0.250, global_step=1102.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  94%|█████████▍| 5641/5971 [51:59<03:02,  1.81it/s, loss=0.176, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.00101, train/loss_step=0.250, global_step=1102.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  94%|█████████▍| 5641/5971 [51:59<03:02,  1.81it/s, loss=0.164, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00124, train/loss_step=0.318, global_step=1103.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  94%|█████████▍| 5642/5971 [52:00<03:01,  1.81it/s, loss=0.164, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00124, train/loss_step=0.318, global_step=1103.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  94%|█████████▍| 5642/5971 [52:00<03:01,  1.81it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.81e-5, train/loss_step=0.0105, global_step=1103.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5643/5971 [52:01<03:01,  1.81it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.81e-5, train/loss_step=0.0105, global_step=1103.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5643/5971 [52:01<03:01,  1.81it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0375, train/loss_vlb_step=0.000143, train/loss_step=0.0375, global_step=1103.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5644/5971 [52:03<03:00,  1.81it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0375, train/loss_vlb_step=0.000143, train/loss_step=0.0375, global_step=1103.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5644/5971 [52:03<03:00,  1.81it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00142, train/loss_vlb_step=8.55e-6, train/loss_step=0.00142, global_step=1103.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5645/5971 [52:04<03:00,  1.81it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00142, train/loss_vlb_step=8.55e-6, train/loss_step=0.00142, global_step=1103.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5645/5971 [52:04<03:00,  1.81it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0591, train/loss_vlb_step=0.000209, train/loss_step=0.0591, global_step=1104.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  95%|█████████▍| 5646/5971 [52:05<02:59,  1.81it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0591, train/loss_vlb_step=0.000209, train/loss_step=0.0591, global_step=1104.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5646/5971 [52:05<02:59,  1.81it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0057, train/loss_vlb_step=3.01e-5, train/loss_step=0.0057, global_step=1104.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  95%|█████████▍| 5647/5971 [52:06<02:59,  1.81it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0057, train/loss_vlb_step=3.01e-5, train/loss_step=0.0057, global_step=1104.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5647/5971 [52:06<02:59,  1.81it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000196, train/loss_step=0.0552, global_step=1104.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5648/5971 [52:08<02:58,  1.81it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000196, train/loss_step=0.0552, global_step=1104.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5648/5971 [52:08<02:58,  1.81it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0304, train/loss_vlb_step=0.000109, train/loss_step=0.0304, global_step=1104.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5649/5971 [52:09<02:58,  1.81it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0304, train/loss_vlb_step=0.000109, train/loss_step=0.0304, global_step=1104.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5649/5971 [52:09<02:58,  1.81it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0159, train/loss_vlb_step=6.38e-5, train/loss_step=0.0159, global_step=1105.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  95%|█████████▍| 5650/5971 [52:10<02:57,  1.81it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0159, train/loss_vlb_step=6.38e-5, train/loss_step=0.0159, global_step=1105.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5650/5971 [52:10<02:57,  1.81it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00237, train/loss_vlb_step=1.36e-5, train/loss_step=0.00237, global_step=1105.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5651/5971 [52:11<02:57,  1.81it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00237, train/loss_vlb_step=1.36e-5, train/loss_step=0.00237, global_step=1105.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5651/5971 [52:11<02:57,  1.81it/s, loss=0.131, v_num=0, train/loss_simple_step=0.467, train/loss_vlb_step=0.00368, train/loss_step=0.467, global_step=1105.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  95%|█████████▍| 5652/5971 [52:13<02:56,  1.80it/s, loss=0.131, v_num=0, train/loss_simple_step=0.467, train/loss_vlb_step=0.00368, train/loss_step=0.467, global_step=1105.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5652/5971 [52:13<02:56,  1.80it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0466, train/loss_vlb_step=0.000168, train/loss_step=0.0466, global_step=1105.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5653/5971 [52:14<02:56,  1.80it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0466, train/loss_vlb_step=0.000168, train/loss_step=0.0466, global_step=1105.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5653/5971 [52:14<02:56,  1.80it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0181, train/loss_vlb_step=7.31e-5, train/loss_step=0.0181, global_step=1106.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  95%|█████████▍| 5654/5971 [52:15<02:55,  1.80it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0181, train/loss_vlb_step=7.31e-5, train/loss_step=0.0181, global_step=1106.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5654/5971 [52:15<02:55,  1.80it/s, loss=0.124, v_num=0, train/loss_simple_step=0.063, train/loss_vlb_step=0.000218, train/loss_step=0.063, global_step=1106.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  95%|█████████▍| 5655/5971 [52:16<02:55,  1.80it/s, loss=0.124, v_num=0, train/loss_simple_step=0.063, train/loss_vlb_step=0.000218, train/loss_step=0.063, global_step=1106.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5655/5971 [52:16<02:55,  1.80it/s, loss=0.13, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000462, train/loss_step=0.138, global_step=1106.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  95%|█████████▍| 5656/5971 [52:18<02:54,  1.80it/s, loss=0.13, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000462, train/loss_step=0.138, global_step=1106.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5656/5971 [52:18<02:54,  1.80it/s, loss=0.129, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000742, train/loss_step=0.209, global_step=1106.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5657/5971 [52:19<02:54,  1.80it/s, loss=0.129, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000742, train/loss_step=0.209, global_step=1106.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5657/5971 [52:19<02:54,  1.80it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00916, train/loss_vlb_step=4.13e-5, train/loss_step=0.00916, global_step=1107.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5658/5971 [52:20<02:53,  1.80it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00916, train/loss_vlb_step=4.13e-5, train/loss_step=0.00916, global_step=1107.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5658/5971 [52:20<02:53,  1.80it/s, loss=0.112, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000486, train/loss_step=0.145, global_step=1107.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  95%|█████████▍| 5659/5971 [52:21<02:53,  1.80it/s, loss=0.112, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000486, train/loss_step=0.145, global_step=1107.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5659/5971 [52:21<02:53,  1.80it/s, loss=0.0988, v_num=0, train/loss_simple_step=0.0935, train/loss_vlb_step=0.000307, train/loss_step=0.0935, global_step=1107.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5660/5971 [52:23<02:52,  1.80it/s, loss=0.0988, v_num=0, train/loss_simple_step=0.0935, train/loss_vlb_step=0.000307, train/loss_step=0.0935, global_step=1107.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5660/5971 [52:23<02:52,  1.80it/s, loss=0.0882, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000144, train/loss_step=0.0392, global_step=1107.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5661/5971 [52:24<02:52,  1.80it/s, loss=0.0882, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000144, train/loss_step=0.0392, global_step=1107.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5661/5971 [52:24<02:52,  1.80it/s, loss=0.0753, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000206, train/loss_step=0.0596, global_step=1108.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5662/5971 [52:25<02:51,  1.80it/s, loss=0.0753, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000206, train/loss_step=0.0596, global_step=1108.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5662/5971 [52:25<02:51,  1.80it/s, loss=0.075, v_num=0, train/loss_simple_step=0.00386, train/loss_vlb_step=1.95e-5, train/loss_step=0.00386, global_step=1108.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5663/5971 [52:26<02:51,  1.80it/s, loss=0.075, v_num=0, train/loss_simple_step=0.00386, train/loss_vlb_step=1.95e-5, train/loss_step=0.00386, global_step=1108.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5663/5971 [52:26<02:51,  1.80it/s, loss=0.0967, v_num=0, train/loss_simple_step=0.472, train/loss_vlb_step=0.00404, train/loss_step=0.472, global_step=1108.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  95%|█████████▍| 5664/5971 [52:28<02:50,  1.80it/s, loss=0.0967, v_num=0, train/loss_simple_step=0.472, train/loss_vlb_step=0.00404, train/loss_step=0.472, global_step=1108.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5664/5971 [52:28<02:50,  1.80it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000289, train/loss_step=0.0868, global_step=1108.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5665/5971 [52:29<02:50,  1.80it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000289, train/loss_step=0.0868, global_step=1108.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5665/5971 [52:29<02:50,  1.80it/s, loss=0.0995, v_num=0, train/loss_simple_step=0.0303, train/loss_vlb_step=0.000123, train/loss_step=0.0303, global_step=1109.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5666/5971 [52:30<02:49,  1.80it/s, loss=0.0995, v_num=0, train/loss_simple_step=0.0303, train/loss_vlb_step=0.000123, train/loss_step=0.0303, global_step=1109.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5666/5971 [52:30<02:49,  1.80it/s, loss=0.104, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.000329, train/loss_step=0.100, global_step=1109.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  95%|█████████▍| 5667/5971 [52:30<02:49,  1.80it/s, loss=0.104, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.000329, train/loss_step=0.100, global_step=1109.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5667/5971 [52:30<02:49,  1.80it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00241, train/loss_vlb_step=1.36e-5, train/loss_step=0.00241, global_step=1109.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5668/5971 [52:33<02:48,  1.80it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00241, train/loss_vlb_step=1.36e-5, train/loss_step=0.00241, global_step=1109.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5668/5971 [52:33<02:48,  1.80it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0027, train/loss_vlb_step=1.58e-5, train/loss_step=0.0027, global_step=1109.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  95%|█████████▍| 5669/5971 [52:34<02:47,  1.80it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0027, train/loss_vlb_step=1.58e-5, train/loss_step=0.0027, global_step=1109.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5669/5971 [52:34<02:47,  1.80it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=7.94e-5, train/loss_step=0.0192, global_step=1110.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5670/5971 [52:34<02:47,  1.80it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=7.94e-5, train/loss_step=0.0192, global_step=1110.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5670/5971 [52:34<02:47,  1.80it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0688, train/loss_vlb_step=0.000229, train/loss_step=0.0688, global_step=1110.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5671/5971 [52:35<02:46,  1.80it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0688, train/loss_vlb_step=0.000229, train/loss_step=0.0688, global_step=1110.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5671/5971 [52:35<02:46,  1.80it/s, loss=0.0804, v_num=0, train/loss_simple_step=0.00171, train/loss_vlb_step=1.02e-5, train/loss_step=0.00171, global_step=1110.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5672/5971 [52:37<02:46,  1.80it/s, loss=0.0804, v_num=0, train/loss_simple_step=0.00171, train/loss_vlb_step=1.02e-5, train/loss_step=0.00171, global_step=1110.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▍| 5672/5971 [52:37<02:46,  1.80it/s, loss=0.0786, v_num=0, train/loss_simple_step=0.00961, train/loss_vlb_step=4.33e-5, train/loss_step=0.00961, global_step=1110.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5673/5971 [52:38<02:45,  1.80it/s, loss=0.0786, v_num=0, train/loss_simple_step=0.00961, train/loss_vlb_step=4.33e-5, train/loss_step=0.00961, global_step=1110.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5673/5971 [52:38<02:45,  1.80it/s, loss=0.0792, v_num=0, train/loss_simple_step=0.0298, train/loss_vlb_step=0.000109, train/loss_step=0.0298, global_step=1111.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  95%|█████████▌| 5674/5971 [52:39<02:45,  1.80it/s, loss=0.0792, v_num=0, train/loss_simple_step=0.0298, train/loss_vlb_step=0.000109, train/loss_step=0.0298, global_step=1111.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5674/5971 [52:39<02:45,  1.80it/s, loss=0.0775, v_num=0, train/loss_simple_step=0.0303, train/loss_vlb_step=0.000118, train/loss_step=0.0303, global_step=1111.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5675/5971 [52:40<02:44,  1.80it/s, loss=0.0775, v_num=0, train/loss_simple_step=0.0303, train/loss_vlb_step=0.000118, train/loss_step=0.0303, global_step=1111.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5675/5971 [52:40<02:44,  1.80it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=1111.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  95%|█████████▌| 5676/5971 [52:43<02:44,  1.79it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=1111.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5676/5971 [52:43<02:44,  1.79it/s, loss=0.0658, v_num=0, train/loss_simple_step=0.00947, train/loss_vlb_step=4.06e-5, train/loss_step=0.00947, global_step=1111.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5677/5971 [52:43<02:43,  1.79it/s, loss=0.0658, v_num=0, train/loss_simple_step=0.00947, train/loss_vlb_step=4.06e-5, train/loss_step=0.00947, global_step=1111.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5677/5971 [52:43<02:43,  1.79it/s, loss=0.0655, v_num=0, train/loss_simple_step=0.00419, train/loss_vlb_step=2.12e-5, train/loss_step=0.00419, global_step=1112.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5678/5971 [52:44<02:43,  1.79it/s, loss=0.0655, v_num=0, train/loss_simple_step=0.00419, train/loss_vlb_step=2.12e-5, train/loss_step=0.00419, global_step=1112.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5678/5971 [52:44<02:43,  1.79it/s, loss=0.0716, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.00106, train/loss_step=0.266, global_step=1112.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  95%|█████████▌| 5679/5971 [52:45<02:42,  1.79it/s, loss=0.0716, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.00106, train/loss_step=0.266, global_step=1112.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5679/5971 [52:45<02:42,  1.79it/s, loss=0.0675, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.51e-5, train/loss_step=0.0108, global_step=1112.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5680/5971 [52:47<02:42,  1.79it/s, loss=0.0675, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.51e-5, train/loss_step=0.0108, global_step=1112.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5680/5971 [52:47<02:42,  1.79it/s, loss=0.0657, v_num=0, train/loss_simple_step=0.00316, train/loss_vlb_step=1.74e-5, train/loss_step=0.00316, global_step=1112.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5681/5971 [52:48<02:41,  1.79it/s, loss=0.0657, v_num=0, train/loss_simple_step=0.00316, train/loss_vlb_step=1.74e-5, train/loss_step=0.00316, global_step=1112.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5681/5971 [52:48<02:41,  1.79it/s, loss=0.0844, v_num=0, train/loss_simple_step=0.434, train/loss_vlb_step=0.00311, train/loss_step=0.434, global_step=1113.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  95%|█████████▌| 5682/5971 [52:49<02:41,  1.79it/s, loss=0.0844, v_num=0, train/loss_simple_step=0.434, train/loss_vlb_step=0.00311, train/loss_step=0.434, global_step=1113.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5682/5971 [52:49<02:41,  1.79it/s, loss=0.0906, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000456, train/loss_step=0.128, global_step=1113.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5683/5971 [52:50<02:40,  1.79it/s, loss=0.0906, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000456, train/loss_step=0.128, global_step=1113.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5683/5971 [52:50<02:40,  1.79it/s, loss=0.0678, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=6.88e-5, train/loss_step=0.0166, global_step=1113.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5684/5971 [52:52<02:40,  1.79it/s, loss=0.0678, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=6.88e-5, train/loss_step=0.0166, global_step=1113.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5684/5971 [52:52<02:40,  1.79it/s, loss=0.0704, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000462, train/loss_step=0.138, global_step=1113.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  95%|█████████▌| 5685/5971 [52:53<02:39,  1.79it/s, loss=0.0704, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000462, train/loss_step=0.138, global_step=1113.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5685/5971 [52:53<02:39,  1.79it/s, loss=0.0971, v_num=0, train/loss_simple_step=0.564, train/loss_vlb_step=0.00624, train/loss_step=0.564, global_step=1114.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  95%|█████████▌| 5686/5971 [52:54<02:39,  1.79it/s, loss=0.0971, v_num=0, train/loss_simple_step=0.564, train/loss_vlb_step=0.00624, train/loss_step=0.564, global_step=1114.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5686/5971 [52:54<02:39,  1.79it/s, loss=0.0922, v_num=0, train/loss_simple_step=0.00269, train/loss_vlb_step=1.44e-5, train/loss_step=0.00269, global_step=1114.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5687/5971 [52:55<02:38,  1.79it/s, loss=0.0922, v_num=0, train/loss_simple_step=0.00269, train/loss_vlb_step=1.44e-5, train/loss_step=0.00269, global_step=1114.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5687/5971 [52:55<02:38,  1.79it/s, loss=0.115, v_num=0, train/loss_simple_step=0.460, train/loss_vlb_step=0.00408, train/loss_step=0.460, global_step=1114.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  95%|█████████▌| 5688/5971 [52:57<02:38,  1.79it/s, loss=0.115, v_num=0, train/loss_simple_step=0.460, train/loss_vlb_step=0.00408, train/loss_step=0.460, global_step=1114.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5688/5971 [52:57<02:38,  1.79it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0678, train/loss_vlb_step=0.000228, train/loss_step=0.0678, global_step=1114.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5689/5971 [52:58<02:37,  1.79it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0678, train/loss_vlb_step=0.000228, train/loss_step=0.0678, global_step=1114.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5689/5971 [52:58<02:37,  1.79it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00938, train/loss_vlb_step=4.01e-5, train/loss_step=0.00938, global_step=1115.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5690/5971 [52:59<02:36,  1.79it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00938, train/loss_vlb_step=4.01e-5, train/loss_step=0.00938, global_step=1115.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5690/5971 [52:59<02:36,  1.79it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0019, train/loss_vlb_step=1.11e-5, train/loss_step=0.0019, global_step=1115.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  95%|█████████▌| 5691/5971 [53:00<02:36,  1.79it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0019, train/loss_vlb_step=1.11e-5, train/loss_step=0.0019, global_step=1115.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5691/5971 [53:00<02:36,  1.79it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00399, train/loss_vlb_step=2.15e-5, train/loss_step=0.00399, global_step=1115.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5692/5971 [53:02<02:35,  1.79it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00399, train/loss_vlb_step=2.15e-5, train/loss_step=0.00399, global_step=1115.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5692/5971 [53:02<02:35,  1.79it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0258, train/loss_vlb_step=9.59e-5, train/loss_step=0.0258, global_step=1115.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  95%|█████████▌| 5693/5971 [53:03<02:35,  1.79it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0258, train/loss_vlb_step=9.59e-5, train/loss_step=0.0258, global_step=1115.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5693/5971 [53:03<02:35,  1.79it/s, loss=0.121, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000445, train/loss_step=0.135, global_step=1116.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  95%|█████████▌| 5694/5971 [53:04<02:34,  1.79it/s, loss=0.121, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000445, train/loss_step=0.135, global_step=1116.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5694/5971 [53:04<02:34,  1.79it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0784, train/loss_vlb_step=0.000259, train/loss_step=0.0784, global_step=1116.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5695/5971 [53:05<02:34,  1.79it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0784, train/loss_vlb_step=0.000259, train/loss_step=0.0784, global_step=1116.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5695/5971 [53:05<02:34,  1.79it/s, loss=0.127, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000676, train/loss_step=0.189, global_step=1116.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  95%|█████████▌| 5696/5971 [53:07<02:33,  1.79it/s, loss=0.127, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000676, train/loss_step=0.189, global_step=1116.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5696/5971 [53:07<02:33,  1.79it/s, loss=0.147, v_num=0, train/loss_simple_step=0.399, train/loss_vlb_step=0.00177, train/loss_step=0.399, global_step=1116.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  95%|█████████▌| 5697/5971 [53:08<02:33,  1.79it/s, loss=0.147, v_num=0, train/loss_simple_step=0.399, train/loss_vlb_step=0.00177, train/loss_step=0.399, global_step=1116.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5697/5971 [53:08<02:33,  1.79it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0205, train/loss_vlb_step=8.15e-5, train/loss_step=0.0205, global_step=1117.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5698/5971 [53:09<02:32,  1.79it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0205, train/loss_vlb_step=8.15e-5, train/loss_step=0.0205, global_step=1117.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5698/5971 [53:09<02:32,  1.79it/s, loss=0.143, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000612, train/loss_step=0.174, global_step=1117.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  95%|█████████▌| 5699/5971 [53:10<02:32,  1.79it/s, loss=0.143, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000612, train/loss_step=0.174, global_step=1117.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5699/5971 [53:10<02:32,  1.79it/s, loss=0.155, v_num=0, train/loss_simple_step=0.252, train/loss_vlb_step=0.000896, train/loss_step=0.252, global_step=1117.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5700/5971 [53:12<02:31,  1.79it/s, loss=0.155, v_num=0, train/loss_simple_step=0.252, train/loss_vlb_step=0.000896, train/loss_step=0.252, global_step=1117.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5700/5971 [53:12<02:31,  1.79it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00294, train/loss_vlb_step=1.59e-5, train/loss_step=0.00294, global_step=1117.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5701/5971 [53:13<02:31,  1.79it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00294, train/loss_vlb_step=1.59e-5, train/loss_step=0.00294, global_step=1117.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5701/5971 [53:13<02:31,  1.79it/s, loss=0.159, v_num=0, train/loss_simple_step=0.510, train/loss_vlb_step=0.00366, train/loss_step=0.510, global_step=1118.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  95%|█████████▌| 5702/5971 [53:14<02:30,  1.79it/s, loss=0.159, v_num=0, train/loss_simple_step=0.510, train/loss_vlb_step=0.00366, train/loss_step=0.510, global_step=1118.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  95%|█████████▌| 5702/5971 [53:14<02:30,  1.79it/s, loss=0.163, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000762, train/loss_step=0.208, global_step=1118.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5703/5971 [53:15<02:30,  1.79it/s, loss=0.163, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000762, train/loss_step=0.208, global_step=1118.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5703/5971 [53:15<02:30,  1.79it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00668, train/loss_vlb_step=3.13e-5, train/loss_step=0.00668, global_step=1118.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5704/5971 [53:17<02:29,  1.78it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00668, train/loss_vlb_step=3.13e-5, train/loss_step=0.00668, global_step=1118.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5704/5971 [53:17<02:29,  1.78it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0465, train/loss_vlb_step=0.000166, train/loss_step=0.0465, global_step=1118.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  96%|█████████▌| 5705/5971 [53:18<02:29,  1.78it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0465, train/loss_vlb_step=0.000166, train/loss_step=0.0465, global_step=1118.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5705/5971 [53:18<02:29,  1.78it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00904, train/loss_vlb_step=4.28e-5, train/loss_step=0.00904, global_step=1119.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5706/5971 [53:19<02:28,  1.78it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00904, train/loss_vlb_step=4.28e-5, train/loss_step=0.00904, global_step=1119.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5706/5971 [53:19<02:28,  1.78it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00332, train/loss_vlb_step=1.79e-5, train/loss_step=0.00332, global_step=1119.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5707/5971 [53:20<02:28,  1.78it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00332, train/loss_vlb_step=1.79e-5, train/loss_step=0.00332, global_step=1119.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5707/5971 [53:20<02:28,  1.78it/s, loss=0.12, v_num=0, train/loss_simple_step=0.258, train/loss_vlb_step=0.00113, train/loss_step=0.258, global_step=1119.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  96%|█████████▌| 5708/5971 [53:22<02:27,  1.78it/s, loss=0.12, v_num=0, train/loss_simple_step=0.258, train/loss_vlb_step=0.00113, train/loss_step=0.258, global_step=1119.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5708/5971 [53:22<02:27,  1.78it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0153, train/loss_vlb_step=6.55e-5, train/loss_step=0.0153, global_step=1119.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5709/5971 [53:23<02:26,  1.78it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0153, train/loss_vlb_step=6.55e-5, train/loss_step=0.0153, global_step=1119.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5709/5971 [53:23<02:26,  1.78it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0362, train/loss_vlb_step=0.000133, train/loss_step=0.0362, global_step=1120.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5710/5971 [53:24<02:26,  1.78it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0362, train/loss_vlb_step=0.000133, train/loss_step=0.0362, global_step=1120.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5710/5971 [53:24<02:26,  1.78it/s, loss=0.147, v_num=0, train/loss_simple_step=0.564, train/loss_vlb_step=0.00625, train/loss_step=0.564, global_step=1120.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  96%|█████████▌| 5711/5971 [53:24<02:25,  1.78it/s, loss=0.147, v_num=0, train/loss_simple_step=0.564, train/loss_vlb_step=0.00625, train/loss_step=0.564, global_step=1120.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5711/5971 [53:24<02:25,  1.78it/s, loss=0.163, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00132, train/loss_step=0.320, global_step=1120.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5712/5971 [53:27<02:25,  1.78it/s, loss=0.163, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00132, train/loss_step=0.320, global_step=1120.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5712/5971 [53:27<02:25,  1.78it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0088, train/loss_vlb_step=3.78e-5, train/loss_step=0.0088, global_step=1120.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5713/5971 [53:27<02:24,  1.78it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0088, train/loss_vlb_step=3.78e-5, train/loss_step=0.0088, global_step=1120.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5713/5971 [53:27<02:24,  1.78it/s, loss=0.173, v_num=0, train/loss_simple_step=0.358, train/loss_vlb_step=0.0019, train/loss_step=0.358, global_step=1121.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  96%|█████████▌| 5714/5971 [53:28<02:24,  1.78it/s, loss=0.173, v_num=0, train/loss_simple_step=0.358, train/loss_vlb_step=0.0019, train/loss_step=0.358, global_step=1121.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5714/5971 [53:28<02:24,  1.78it/s, loss=0.19, v_num=0, train/loss_simple_step=0.428, train/loss_vlb_step=0.0042, train/loss_step=0.428, global_step=1121.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  96%|█████████▌| 5715/5971 [53:29<02:23,  1.78it/s, loss=0.19, v_num=0, train/loss_simple_step=0.428, train/loss_vlb_step=0.0042, train/loss_step=0.428, global_step=1121.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5715/5971 [53:29<02:23,  1.78it/s, loss=0.19, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000651, train/loss_step=0.181, global_step=1121.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5716/5971 [53:31<02:23,  1.78it/s, loss=0.19, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000651, train/loss_step=0.181, global_step=1121.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5716/5971 [53:31<02:23,  1.78it/s, loss=0.175, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000351, train/loss_step=0.106, global_step=1121.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5717/5971 [53:32<02:22,  1.78it/s, loss=0.175, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000351, train/loss_step=0.106, global_step=1121.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5717/5971 [53:32<02:22,  1.78it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0609, train/loss_vlb_step=0.000214, train/loss_step=0.0609, global_step=1122.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5718/5971 [53:33<02:22,  1.78it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0609, train/loss_vlb_step=0.000214, train/loss_step=0.0609, global_step=1122.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5718/5971 [53:33<02:22,  1.78it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00865, train/loss_vlb_step=3.93e-5, train/loss_step=0.00865, global_step=1122.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5719/5971 [53:34<02:21,  1.78it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00865, train/loss_vlb_step=3.93e-5, train/loss_step=0.00865, global_step=1122.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5719/5971 [53:34<02:21,  1.78it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0797, train/loss_vlb_step=0.000271, train/loss_step=0.0797, global_step=1122.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  96%|█████████▌| 5720/5971 [53:36<02:21,  1.78it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0797, train/loss_vlb_step=0.000271, train/loss_step=0.0797, global_step=1122.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5720/5971 [53:36<02:21,  1.78it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00901, train/loss_vlb_step=4.21e-5, train/loss_step=0.00901, global_step=1122.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5721/5971 [53:37<02:20,  1.78it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00901, train/loss_vlb_step=4.21e-5, train/loss_step=0.00901, global_step=1122.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5721/5971 [53:37<02:20,  1.78it/s, loss=0.145, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000641, train/loss_step=0.188, global_step=1123.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  96%|█████████▌| 5722/5971 [53:38<02:20,  1.78it/s, loss=0.145, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000641, train/loss_step=0.188, global_step=1123.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5722/5971 [53:38<02:20,  1.78it/s, loss=0.147, v_num=0, train/loss_simple_step=0.244, train/loss_vlb_step=0.00087, train/loss_step=0.244, global_step=1123.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  96%|█████████▌| 5723/5971 [53:39<02:19,  1.78it/s, loss=0.147, v_num=0, train/loss_simple_step=0.244, train/loss_vlb_step=0.00087, train/loss_step=0.244, global_step=1123.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5723/5971 [53:39<02:19,  1.78it/s, loss=0.157, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000788, train/loss_step=0.211, global_step=1123.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5724/5971 [53:41<02:19,  1.78it/s, loss=0.157, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000788, train/loss_step=0.211, global_step=1123.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5724/5971 [53:41<02:19,  1.78it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00776, train/loss_vlb_step=3.54e-5, train/loss_step=0.00776, global_step=1123.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5725/5971 [53:42<02:18,  1.78it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00776, train/loss_vlb_step=3.54e-5, train/loss_step=0.00776, global_step=1123.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5725/5971 [53:42<02:18,  1.78it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00297, train/loss_vlb_step=1.64e-5, train/loss_step=0.00297, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5726/5971 [53:43<02:17,  1.78it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00297, train/loss_vlb_step=1.64e-5, train/loss_step=0.00297, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5726/5971 [53:43<02:17,  1.78it/s, loss=0.184, v_num=0, train/loss_simple_step=0.583, train/loss_vlb_step=0.0055, train/loss_step=0.583, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  96%|█████████▌| 5727/5971 [53:44<02:17,  1.78it/s, loss=0.184, v_num=0, train/loss_simple_step=0.583, train/loss_vlb_step=0.0055, train/loss_step=0.583, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5727/5971 [53:44<02:17,  1.78it/s, loss=0.185, v_num=0, train/loss_simple_step=0.293, train/loss_vlb_step=0.00123, train/loss_step=0.293, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5728/5971 [53:46<02:16,  1.78it/s, loss=0.185, v_num=0, train/loss_simple_step=0.293, train/loss_vlb_step=0.00123, train/loss_step=0.293, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  96%|█████████▌| 5728/5971 [53:46<02:16,  1.78it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:22,  2.01it/s][A
Epoch 1:  96%|█████████▌| 5730/5971 [53:47<02:15,  1.78it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   2%|▏         | 3/167 [00:00<00:28,  5.78it/s][A
Epoch 1:  96%|█████████▌| 5732/5971 [53:47<02:14,  1.78it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   4%|▎         | 6/167 [00:00<00:14, 10.84it/s][A
Epoch 1:  96%|█████████▌| 5735/5971 [53:47<02:12,  1.78it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   5%|▌         | 9/167 [00:00<00:10, 14.95it/s][A
Epoch 1:  96%|█████████▌| 5738/5971 [53:47<02:11,  1.78it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   7%|▋         | 12/167 [00:00<00:08, 18.13it/s][A
Epoch 1:  96%|█████████▌| 5741/5971 [53:47<02:09,  1.78it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:   9%|▉         | 15/167 [00:01<00:07, 20.51it/s][A
Epoch 1:  96%|█████████▌| 5744/5971 [53:47<02:07,  1.78it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  11%|█         | 18/167 [00:01<00:06, 22.14it/s][A
Epoch 1:  96%|█████████▌| 5747/5971 [53:47<02:05,  1.78it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 23.56it/s][A
Epoch 1:  96%|█████████▋| 5750/5971 [53:47<02:04,  1.78it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  14%|█▍        | 24/167 [00:01<00:05, 24.08it/s][A
Epoch 1:  96%|█████████▋| 5753/5971 [53:48<02:02,  1.78it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 24.97it/s][A
Epoch 1:  96%|█████████▋| 5756/5971 [53:48<02:00,  1.78it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 24.75it/s][A
Epoch 1:  96%|█████████▋| 5759/5971 [53:48<01:58,  1.78it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 24.32it/s][A
Epoch 1:  96%|█████████▋| 5762/5971 [53:48<01:57,  1.79it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  22%|██▏       | 36/167 [00:01<00:05, 25.60it/s][A
Epoch 1:  97%|█████████▋| 5765/5971 [53:48<01:55,  1.79it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  23%|██▎       | 39/167 [00:02<00:04, 25.93it/s][A
Epoch 1:  97%|█████████▋| 5768/5971 [53:48<01:53,  1.79it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 25.90it/s][A
Epoch 1:  97%|█████████▋| 5771/5971 [53:48<01:51,  1.79it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 26.52it/s][A
Epoch 1:  97%|█████████▋| 5774/5971 [53:48<01:50,  1.79it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 26.55it/s][A
Epoch 1:  97%|█████████▋| 5777/5971 [53:49<01:48,  1.79it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  31%|███       | 51/167 [00:02<00:04, 26.43it/s][A
Epoch 1:  97%|█████████▋| 5780/5971 [53:49<01:46,  1.79it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 25.38it/s][A
Epoch 1:  97%|█████████▋| 5783/5971 [53:49<01:44,  1.79it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 25.21it/s][A
Epoch 1:  97%|█████████▋| 5786/5971 [53:49<01:43,  1.79it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  36%|███▌      | 60/167 [00:02<00:04, 24.66it/s][A
Epoch 1:  97%|█████████▋| 5789/5971 [53:49<01:41,  1.79it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  38%|███▊      | 63/167 [00:02<00:04, 25.52it/s][A
Epoch 1:  97%|█████████▋| 5792/5971 [53:49<01:39,  1.79it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  40%|███▉      | 66/167 [00:03<00:03, 25.77it/s][A
Epoch 1:  97%|█████████▋| 5795/5971 [53:49<01:38,  1.79it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 26.45it/s][A
Epoch 1:  97%|█████████▋| 5798/5971 [53:49<01:36,  1.80it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 26.37it/s][A
Epoch 1:  97%|█████████▋| 5801/5971 [53:49<01:34,  1.80it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 25.97it/s][A
Epoch 1:  97%|█████████▋| 5804/5971 [53:50<01:32,  1.80it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 24.87it/s][A
Epoch 1:  97%|█████████▋| 5807/5971 [53:50<01:31,  1.80it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 25.38it/s][A
Epoch 1:  97%|█████████▋| 5810/5971 [53:50<01:29,  1.80it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  50%|█████     | 84/167 [00:03<00:03, 25.89it/s][A
Epoch 1:  97%|█████████▋| 5813/5971 [53:50<01:27,  1.80it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  52%|█████▏    | 87/167 [00:03<00:03, 26.18it/s][A
Epoch 1:  97%|█████████▋| 5816/5971 [53:50<01:26,  1.80it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  54%|█████▍    | 90/167 [00:03<00:03, 25.27it/s][A
Epoch 1:  97%|█████████▋| 5819/5971 [53:50<01:24,  1.80it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 25.21it/s][A
Epoch 1:  98%|█████████▊| 5822/5971 [53:50<01:22,  1.80it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 24.21it/s][A
Epoch 1:  98%|█████████▊| 5825/5971 [53:50<01:20,  1.80it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 24.82it/s][A
Epoch 1:  98%|█████████▊| 5828/5971 [53:51<01:19,  1.80it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 26.58it/s][A
Epoch 1:  98%|█████████▊| 5832/5971 [53:51<01:16,  1.81it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 25.93it/s][A
Epoch 1:  98%|█████████▊| 5836/5971 [53:51<01:14,  1.81it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 26.79it/s][A
Epoch 1:  98%|█████████▊| 5840/5971 [53:51<01:12,  1.81it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  68%|██████▊   | 113/167 [00:04<00:01, 27.43it/s][A
Epoch 1:  98%|█████████▊| 5844/5971 [53:51<01:10,  1.81it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  69%|██████▉   | 116/167 [00:04<00:01, 26.76it/s][A

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 27.16it/s][A
Epoch 1:  98%|█████████▊| 5848/5971 [53:51<01:07,  1.81it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 26.81it/s][A
Epoch 1:  98%|█████████▊| 5852/5971 [53:51<01:05,  1.81it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 27.78it/s][A
Epoch 1:  98%|█████████▊| 5856/5971 [53:52<01:03,  1.81it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 27.11it/s][A
Epoch 1:  98%|█████████▊| 5860/5971 [53:52<01:01,  1.81it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 27.05it/s][A

Validating:  81%|████████  | 135/167 [00:05<00:01, 27.14it/s][A
Epoch 1:  98%|█████████▊| 5864/5971 [53:52<00:58,  1.81it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 26.25it/s][A
Epoch 1:  98%|█████████▊| 5868/5971 [53:52<00:56,  1.82it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  84%|████████▍ | 141/167 [00:05<00:00, 26.35it/s][A
Epoch 1:  98%|█████████▊| 5872/5971 [53:52<00:54,  1.82it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 26.00it/s][A
Epoch 1:  98%|█████████▊| 5876/5971 [53:52<00:52,  1.82it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 27.67it/s][A

Validating:  90%|█████████ | 151/167 [00:06<00:00, 26.93it/s][A
Epoch 1:  98%|█████████▊| 5880/5971 [53:52<00:50,  1.82it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 28.18it/s][A
Epoch 1:  99%|█████████▊| 5884/5971 [53:53<00:47,  1.82it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 26.52it/s][A
Epoch 1:  99%|█████████▊| 5888/5971 [53:53<00:45,  1.82it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 25.60it/s][A
Epoch 1:  99%|█████████▊| 5892/5971 [53:53<00:43,  1.82it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 25.45it/s][A
Epoch 1:  99%|█████████▊| 5896/5971 [53:53<00:41,  1.82it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▊| 5896/5971 [53:54<00:41,  1.82it/s, loss=0.193, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=1124.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]

Epoch 1:  99%|█████████▉| 5897/5971 [53:55<00:40,  1.82it/s, loss=0.206, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.00111, train/loss_step=0.289, global_step=1125.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  99%|█████████▉| 5898/5971 [53:55<00:40,  1.82it/s, loss=0.188, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.0007, train/loss_step=0.200, global_step=1125.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  99%|█████████▉| 5899/5971 [53:56<00:39,  1.82it/s, loss=0.182, v_num=0, train/loss_simple_step=0.212, train/loss_vlb_step=0.000929, train/loss_step=0.212, global_step=1125.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5900/5971 [53:59<00:38,  1.82it/s, loss=0.182, v_num=0, train/loss_simple_step=0.212, train/loss_vlb_step=0.000929, train/loss_step=0.212, global_step=1125.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5900/5971 [53:59<00:38,  1.82it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0026, train/loss_vlb_step=1.49e-5, train/loss_step=0.0026, global_step=1125.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5901/5971 [54:00<00:38,  1.82it/s, loss=0.2, v_num=0, train/loss_simple_step=0.725, train/loss_vlb_step=0.0193, train/loss_step=0.725, global_step=1126.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  99%|█████████▉| 5902/5971 [54:01<00:37,  1.82it/s, loss=0.201, v_num=0, train/loss_simple_step=0.453, train/loss_vlb_step=0.00509, train/loss_step=0.453, global_step=1126.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5903/5971 [54:02<00:37,  1.82it/s, loss=0.21, v_num=0, train/loss_simple_step=0.359, train/loss_vlb_step=0.00149, train/loss_step=0.359, global_step=1126.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  99%|█████████▉| 5904/5971 [54:04<00:36,  1.82it/s, loss=0.21, v_num=0, train/loss_simple_step=0.359, train/loss_vlb_step=0.00149, train/loss_step=0.359, global_step=1126.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5904/5971 [54:04<00:36,  1.82it/s, loss=0.211, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000366, train/loss_step=0.111, global_step=1126.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5905/5971 [54:05<00:36,  1.82it/s, loss=0.208, v_num=0, train/loss_simple_step=0.00598, train/loss_vlb_step=3.02e-5, train/loss_step=0.00598, global_step=1127.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5906/5971 [54:05<00:35,  1.82it/s, loss=0.214, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000435, train/loss_step=0.129, global_step=1127.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  99%|█████████▉| 5907/5971 [54:06<00:35,  1.82it/s, loss=0.214, v_num=0, train/loss_simple_step=0.0852, train/loss_vlb_step=0.000286, train/loss_step=0.0852, global_step=1127.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5908/5971 [54:08<00:34,  1.82it/s, loss=0.214, v_num=0, train/loss_simple_step=0.0852, train/loss_vlb_step=0.000286, train/loss_step=0.0852, global_step=1127.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5908/5971 [54:08<00:34,  1.82it/s, loss=0.216, v_num=0, train/loss_simple_step=0.0509, train/loss_vlb_step=0.000183, train/loss_step=0.0509, global_step=1127.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5909/5971 [54:09<00:34,  1.82it/s, loss=0.207, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.11e-5, train/loss_step=0.00382, global_step=1128.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5910/5971 [54:10<00:33,  1.82it/s, loss=0.195, v_num=0, train/loss_simple_step=0.00251, train/loss_vlb_step=1.38e-5, train/loss_step=0.00251, global_step=1128.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5911/5971 [54:11<00:33,  1.82it/s, loss=0.196, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000945, train/loss_step=0.227, global_step=1128.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  99%|█████████▉| 5912/5971 [54:14<00:32,  1.82it/s, loss=0.196, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000945, train/loss_step=0.227, global_step=1128.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5912/5971 [54:14<00:32,  1.82it/s, loss=0.228, v_num=0, train/loss_simple_step=0.651, train/loss_vlb_step=0.0136, train/loss_step=0.651, global_step=1128.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  99%|█████████▉| 5913/5971 [54:14<00:31,  1.82it/s, loss=0.237, v_num=0, train/loss_simple_step=0.182, train/loss_vlb_step=0.000607, train/loss_step=0.182, global_step=1129.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5914/5971 [54:15<00:31,  1.82it/s, loss=0.213, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000364, train/loss_step=0.106, global_step=1129.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5915/5971 [54:16<00:30,  1.82it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0254, train/loss_vlb_step=9.73e-5, train/loss_step=0.0254, global_step=1129.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  99%|█████████▉| 5916/5971 [54:19<00:30,  1.82it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0254, train/loss_vlb_step=9.73e-5, train/loss_step=0.0254, global_step=1129.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5916/5971 [54:19<00:30,  1.82it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00254, train/loss_vlb_step=1.45e-5, train/loss_step=0.00254, global_step=1129.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5917/5971 [54:20<00:29,  1.82it/s, loss=0.198, v_num=0, train/loss_simple_step=0.429, train/loss_vlb_step=0.003, train/loss_step=0.429, global_step=1130.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]      
Epoch 1:  99%|█████████▉| 5918/5971 [54:21<00:29,  1.82it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=6.42e-5, train/loss_step=0.0142, global_step=1130.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5919/5971 [54:21<00:28,  1.81it/s, loss=0.2, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00195, train/loss_step=0.426, global_step=1130.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1:  99%|█████████▉| 5920/5971 [54:24<00:28,  1.81it/s, loss=0.2, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00195, train/loss_step=0.426, global_step=1130.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5920/5971 [54:24<00:28,  1.81it/s, loss=0.2, v_num=0, train/loss_simple_step=0.00741, train/loss_vlb_step=3.6e-5, train/loss_step=0.00741, global_step=1130.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5921/5971 [54:25<00:27,  1.81it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0675, train/loss_vlb_step=0.000234, train/loss_step=0.0675, global_step=1131.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5922/5971 [54:26<00:27,  1.81it/s, loss=0.192, v_num=0, train/loss_simple_step=0.945, train/loss_vlb_step=0.239, train/loss_step=0.945, global_step=1131.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1:  99%|█████████▉| 5923/5971 [54:26<00:26,  1.81it/s, loss=0.18, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.00047, train/loss_step=0.137, global_step=1131.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5924/5971 [54:29<00:25,  1.81it/s, loss=0.18, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.00047, train/loss_step=0.137, global_step=1131.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5924/5971 [54:29<00:25,  1.81it/s, loss=0.198, v_num=0, train/loss_simple_step=0.465, train/loss_vlb_step=0.00394, train/loss_step=0.465, global_step=1131.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5925/5971 [54:30<00:25,  1.81it/s, loss=0.198, v_num=0, train/loss_simple_step=0.00188, train/loss_vlb_step=1.1e-5, train/loss_step=0.00188, global_step=1132.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5926/5971 [54:30<00:24,  1.81it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0054, train/loss_vlb_step=2.71e-5, train/loss_step=0.0054, global_step=1132.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  99%|█████████▉| 5927/5971 [54:31<00:24,  1.81it/s, loss=0.203, v_num=0, train/loss_simple_step=0.316, train/loss_vlb_step=0.00167, train/loss_step=0.316, global_step=1132.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:  99%|█████████▉| 5928/5971 [54:34<00:23,  1.81it/s, loss=0.203, v_num=0, train/loss_simple_step=0.316, train/loss_vlb_step=0.00167, train/loss_step=0.316, global_step=1132.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5928/5971 [54:34<00:23,  1.81it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0279, train/loss_vlb_step=0.000113, train/loss_step=0.0279, global_step=1132.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5929/5971 [54:35<00:23,  1.81it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0424, train/loss_vlb_step=0.000156, train/loss_step=0.0424, global_step=1133.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5930/5971 [54:35<00:22,  1.81it/s, loss=0.223, v_num=0, train/loss_simple_step=0.381, train/loss_vlb_step=0.00189, train/loss_step=0.381, global_step=1133.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1:  99%|█████████▉| 5931/5971 [54:36<00:22,  1.81it/s, loss=0.224, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.000885, train/loss_step=0.250, global_step=1133.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5932/5971 [54:39<00:21,  1.81it/s, loss=0.224, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.000885, train/loss_step=0.250, global_step=1133.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5932/5971 [54:39<00:21,  1.81it/s, loss=0.198, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=1133.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5933/5971 [54:40<00:21,  1.81it/s, loss=0.212, v_num=0, train/loss_simple_step=0.469, train/loss_vlb_step=0.00346, train/loss_step=0.469, global_step=1134.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  99%|█████████▉| 5934/5971 [54:41<00:20,  1.81it/s, loss=0.207, v_num=0, train/loss_simple_step=0.00606, train/loss_vlb_step=2.91e-5, train/loss_step=0.00606, global_step=1134.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5935/5971 [54:42<00:19,  1.81it/s, loss=0.207, v_num=0, train/loss_simple_step=0.0311, train/loss_vlb_step=0.000116, train/loss_step=0.0311, global_step=1134.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  99%|█████████▉| 5936/5971 [54:44<00:19,  1.81it/s, loss=0.207, v_num=0, train/loss_simple_step=0.0311, train/loss_vlb_step=0.000116, train/loss_step=0.0311, global_step=1134.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5936/5971 [54:44<00:19,  1.81it/s, loss=0.209, v_num=0, train/loss_simple_step=0.0384, train/loss_vlb_step=0.000138, train/loss_step=0.0384, global_step=1134.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5937/5971 [54:45<00:18,  1.81it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0051, train/loss_vlb_step=2.62e-5, train/loss_step=0.0051, global_step=1135.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1:  99%|█████████▉| 5938/5971 [54:46<00:18,  1.81it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0586, train/loss_vlb_step=0.000202, train/loss_step=0.0586, global_step=1135.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5939/5971 [54:46<00:17,  1.81it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0671, train/loss_vlb_step=0.000235, train/loss_step=0.0671, global_step=1135.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5940/5971 [54:49<00:17,  1.81it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0671, train/loss_vlb_step=0.000235, train/loss_step=0.0671, global_step=1135.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5940/5971 [54:49<00:17,  1.81it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0481, train/loss_vlb_step=0.000167, train/loss_step=0.0481, global_step=1135.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1:  99%|█████████▉| 5941/5971 [54:50<00:16,  1.81it/s, loss=0.189, v_num=0, train/loss_simple_step=0.370, train/loss_vlb_step=0.00211, train/loss_step=0.370, global_step=1136.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1: 100%|█████████▉| 5942/5971 [54:51<00:16,  1.81it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0021, train/loss_vlb_step=1.23e-5, train/loss_step=0.0021, global_step=1136.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|█████████▉| 5943/5971 [54:52<00:15,  1.81it/s, loss=0.17, v_num=0, train/loss_simple_step=0.689, train/loss_vlb_step=0.00896, train/loss_step=0.689, global_step=1136.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1: 100%|█████████▉| 5944/5971 [54:54<00:14,  1.80it/s, loss=0.17, v_num=0, train/loss_simple_step=0.689, train/loss_vlb_step=0.00896, train/loss_step=0.689, global_step=1136.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|█████████▉| 5944/5971 [54:54<00:14,  1.80it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0464, train/loss_vlb_step=0.000163, train/loss_step=0.0464, global_step=1136.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|█████████▉| 5945/5971 [54:55<00:14,  1.80it/s, loss=0.159, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000708, train/loss_step=0.204, global_step=1137.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1: 100%|█████████▉| 5946/5971 [54:56<00:13,  1.80it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00366, train/loss_vlb_step=1.95e-5, train/loss_step=0.00366, global_step=1137.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|█████████▉| 5947/5971 [54:57<00:13,  1.80it/s, loss=0.172, v_num=0, train/loss_simple_step=0.567, train/loss_vlb_step=0.00737, train/loss_step=0.567, global_step=1137.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1: 100%|█████████▉| 5948/5971 [54:59<00:12,  1.80it/s, loss=0.172, v_num=0, train/loss_simple_step=0.567, train/loss_vlb_step=0.00737, train/loss_step=0.567, global_step=1137.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|█████████▉| 5948/5971 [54:59<00:12,  1.80it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00622, train/loss_vlb_step=3.15e-5, train/loss_step=0.00622, global_step=1137.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|█████████▉| 5949/5971 [55:00<00:12,  1.80it/s, loss=0.169, v_num=0, train/loss_simple_step=0.020, train/loss_vlb_step=8.34e-5, train/loss_step=0.020, global_step=1138.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1: 100%|█████████▉| 5950/5971 [55:01<00:11,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000378, train/loss_step=0.113, global_step=1138.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|█████████▉| 5951/5971 [55:02<00:11,  1.80it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0197, train/loss_vlb_step=7.91e-5, train/loss_step=0.0197, global_step=1138.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|█████████▉| 5952/5971 [55:04<00:10,  1.80it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0197, train/loss_vlb_step=7.91e-5, train/loss_step=0.0197, global_step=1138.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|█████████▉| 5952/5971 [55:04<00:10,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0103, train/loss_vlb_step=4.77e-5, train/loss_step=0.0103, global_step=1138.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|█████████▉| 5953/5971 [55:05<00:09,  1.80it/s, loss=0.118, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000193, train/loss_step=0.055, global_step=1139.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1: 100%|█████████▉| 5954/5971 [55:06<00:09,  1.80it/s, loss=0.13, v_num=0, train/loss_simple_step=0.239, train/loss_vlb_step=0.000865, train/loss_step=0.239, global_step=1139.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1: 100%|█████████▉| 5955/5971 [55:07<00:08,  1.80it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00229, train/loss_vlb_step=1.35e-5, train/loss_step=0.00229, global_step=1139.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|█████████▉| 5956/5971 [55:09<00:08,  1.80it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00229, train/loss_vlb_step=1.35e-5, train/loss_step=0.00229, global_step=1139.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|█████████▉| 5956/5971 [55:09<00:08,  1.80it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0329, train/loss_vlb_step=0.000118, train/loss_step=0.0329, global_step=1139.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1: 100%|█████████▉| 5957/5971 [55:10<00:07,  1.80it/s, loss=0.14, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.0009, train/loss_step=0.243, global_step=1140.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]     
Epoch 1: 100%|█████████▉| 5958/5971 [55:11<00:07,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.000115, train/loss_step=0.0325, global_step=1140.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|█████████▉| 5959/5971 [55:12<00:06,  1.80it/s, loss=0.143, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000525, train/loss_step=0.157, global_step=1140.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1: 100%|█████████▉| 5960/5971 [55:14<00:06,  1.80it/s, loss=0.143, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000525, train/loss_step=0.157, global_step=1140.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|█████████▉| 5960/5971 [55:14<00:06,  1.80it/s, loss=0.186, v_num=0, train/loss_simple_step=0.908, train/loss_vlb_step=0.153, train/loss_step=0.908, global_step=1140.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1: 100%|█████████▉| 5961/5971 [55:15<00:05,  1.80it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.00022, train/loss_step=0.0642, global_step=1141.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|█████████▉| 5962/5971 [55:16<00:05,  1.80it/s, loss=0.185, v_num=0, train/loss_simple_step=0.284, train/loss_vlb_step=0.00103, train/loss_step=0.284, global_step=1141.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1: 100%|█████████▉| 5963/5971 [55:17<00:04,  1.80it/s, loss=0.157, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=1141.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|█████████▉| 5964/5971 [55:19<00:03,  1.80it/s, loss=0.157, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=1141.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|█████████▉| 5964/5971 [55:19<00:03,  1.80it/s, loss=0.178, v_num=0, train/loss_simple_step=0.453, train/loss_vlb_step=0.00287, train/loss_step=0.453, global_step=1141.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1: 100%|█████████▉| 5965/5971 [55:20<00:03,  1.80it/s, loss=0.182, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00128, train/loss_step=0.292, global_step=1142.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|█████████▉| 5966/5971 [55:21<00:02,  1.80it/s, loss=0.188, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000377, train/loss_step=0.113, global_step=1142.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|█████████▉| 5967/5971 [55:21<00:02,  1.80it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0891, train/loss_vlb_step=0.000294, train/loss_step=0.0891, global_step=1142.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|█████████▉| 5968/5971 [55:24<00:01,  1.80it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0891, train/loss_vlb_step=0.000294, train/loss_step=0.0891, global_step=1142.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|█████████▉| 5968/5971 [55:24<00:01,  1.80it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00614, train/loss_vlb_step=3.16e-5, train/loss_step=0.00614, global_step=1142.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|█████████▉| 5969/5971 [55:25<00:01,  1.80it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0288, train/loss_vlb_step=0.000116, train/loss_step=0.0288, global_step=1143.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 1: 100%|█████████▉| 5970/5971 [55:26<00:00,  1.80it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0529, train/loss_vlb_step=0.000178, train/loss_step=0.0529, global_step=1143.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|██████████| 5971/5971 [55:26<00:00,  1.80it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00657, train/loss_vlb_step=3.28e-5, train/loss_step=0.00657, global_step=1143.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|██████████| 5971/5971 [55:29<00:00,  1.79it/s, loss=0.189, v_num=0, train/loss_simple_step=0.593, train/loss_vlb_step=0.00678, train/loss_step=0.593, global_step=1143.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1: 100%|██████████| 5971/5971 [55:30<00:00,  1.79it/s, loss=0.187, v_num=0, train/loss_simple_step=0.00367, train/loss_vlb_step=1.98e-5, train/loss_step=0.00367, global_step=1144.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|██████████| 5971/5971 [55:31<00:00,  1.79it/s, loss=0.221, v_num=0, train/loss_simple_step=0.911, train/loss_vlb_step=0.230, train/loss_step=0.911, global_step=1144.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]      
Epoch 1: 100%|██████████| 5971/5971 [55:32<00:00,  1.79it/s, loss=0.225, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.000329, train/loss_step=0.100, global_step=1144.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|██████████| 5971/5971 [55:34<00:00,  1.79it/s, loss=0.229, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000331, train/loss_step=0.101, global_step=1144.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|██████████| 5971/5971 [55:35<00:00,  1.79it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0077, train/loss_vlb_step=3.65e-5, train/loss_step=0.0077, global_step=1145.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|██████████| 5971/5971 [55:35<00:00,  1.79it/s, loss=0.238, v_num=0, train/loss_simple_step=0.451, train/loss_vlb_step=0.00253, train/loss_step=0.451, global_step=1145.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1: 100%|██████████| 5971/5971 [55:36<00:00,  1.79it/s, loss=0.248, v_num=0, train/loss_simple_step=0.367, train/loss_vlb_step=0.00259, train/loss_step=0.367, global_step=1145.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|██████████| 5971/5971 [55:39<00:00,  1.79it/s, loss=0.203, v_num=0, train/loss_simple_step=0.0084, train/loss_vlb_step=3.8e-5, train/loss_step=0.0084, global_step=1145.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|██████████| 5971/5971 [55:40<00:00,  1.79it/s, loss=0.201, v_num=0, train/loss_simple_step=0.00908, train/loss_vlb_step=4.21e-5, train/loss_step=0.00908, global_step=1146.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|██████████| 5971/5971 [55:41<00:00,  1.79it/s, loss=0.205, v_num=0, train/loss_simple_step=0.377, train/loss_vlb_step=0.00206, train/loss_step=0.377, global_step=1146.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]    
Epoch 1: 100%|██████████| 5971/5971 [55:41<00:00,  1.79it/s, loss=0.199, v_num=0, train/loss_simple_step=0.00646, train/loss_vlb_step=3.03e-5, train/loss_step=0.00646, global_step=1146.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|██████████| 5971/5971 [55:43<00:00,  1.79it/s, loss=0.199, v_num=0, train/loss_simple_step=0.00646, train/loss_vlb_step=3.03e-5, train/loss_step=0.00646, global_step=1146.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|██████████| 5971/5971 [55:44<00:00,  1.79it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0349, train/loss_vlb_step=0.00013, train/loss_step=0.0349, global_step=1146.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1: 100%|██████████| 5971/5971 [55:44<00:00,  1.79it/s, loss=0.187, v_num=0, train/loss_simple_step=0.472, train/loss_vlb_step=0.00682, train/loss_step=0.472, global_step=1147.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1: 100%|██████████| 5971/5971 [55:45<00:00,  1.78it/s, loss=0.2, v_num=0, train/loss_simple_step=0.366, train/loss_vlb_step=0.00189, train/loss_step=0.366, global_step=1147.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1: 100%|██████████| 5971/5971 [55:46<00:00,  1.78it/s, loss=0.195, v_num=0, train/loss_simple_step=0.00256, train/loss_vlb_step=1.42e-5, train/loss_step=0.00256, global_step=1147.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|██████████| 5971/5971 [55:49<00:00,  1.78it/s, loss=0.204, v_num=0, train/loss_simple_step=0.183, train/loss_vlb_step=0.000678, train/loss_step=0.183, global_step=1147.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]   
Epoch 1: 100%|██████████| 5971/5971 [55:50<00:00,  1.78it/s, loss=0.208, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000343, train/loss_step=0.104, global_step=1148.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|██████████| 5971/5971 [55:51<00:00,  1.78it/s, loss=0.205, v_num=0, train/loss_simple_step=0.00314, train/loss_vlb_step=1.74e-5, train/loss_step=0.00314, global_step=1148.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|██████████| 5971/5971 [55:52<00:00,  1.78it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0184, train/loss_vlb_step=7.47e-5, train/loss_step=0.0184, global_step=1148.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1: 100%|██████████| 5971/5971 [55:54<00:00,  1.78it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0336, train/loss_vlb_step=0.000129, train/loss_step=0.0336, global_step=1148.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 1: 100%|██████████| 5971/5971 [55:57<00:00,  1.78it/s, loss=0.183, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000333, train/loss_step=0.101, global_step=1149.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]  
Epoch 1:   0%|          | 0/5971 [00:00<00:00, 11335.96it/s, loss=0.183, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000333, train/loss_step=0.101, global_step=1149.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 2:   0%|          | 0/5971 [00:00<00:02, 2141.04it/s, loss=0.183, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000333, train/loss_step=0.101, global_step=1149.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158] 
Epoch 2:   0%|          | 1/5971 [00:02<1:58:23,  1.19s/it, loss=0.183, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000333, train/loss_step=0.101, global_step=1149.0, train/loss_simple_epoch=0.158, train/loss_vlb_epoch=0.00255, train/loss_epoch=0.158]
Epoch 2:   0%|          | 1/5971 [00:02<1:58:27,  1.19s/it, loss=0.138, v_num=0, train/loss_simple_step=0.00429, train/loss_vlb_step=2.31e-5, train/loss_step=0.00429, global_step=1150.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 2/5971 [00:03<1:48:54,  1.09s/it, loss=0.138, v_num=0, train/loss_simple_step=0.00429, train/loss_vlb_step=2.31e-5, train/loss_step=0.00429, global_step=1150.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 2/5971 [00:03<1:48:56,  1.10s/it, loss=0.154, v_num=0, train/loss_simple_step=0.421, train/loss_vlb_step=0.00322, train/loss_step=0.421, global_step=1150.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:   0%|          | 3/5971 [00:04<1:44:03,  1.05s/it, loss=0.154, v_num=0, train/loss_simple_step=0.421, train/loss_vlb_step=0.00322, train/loss_step=0.421, global_step=1150.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 3/5971 [00:04<1:44:04,  1.05s/it, loss=0.171, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00289, train/loss_step=0.452, global_step=1150.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 4/5971 [00:06<2:10:30,  1.31s/it, loss=0.171, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00289, train/loss_step=0.452, global_step=1150.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 4/5971 [00:06<2:10:32,  1.31s/it, loss=0.172, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.52e-5, train/loss_step=0.016, global_step=1150.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 5/5971 [00:07<2:04:05,  1.25s/it, loss=0.172, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.52e-5, train/loss_step=0.016, global_step=1150.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 5/5971 [00:07<2:04:06,  1.25s/it, loss=0.149, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.11e-5, train/loss_step=0.00184, global_step=1151.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 6/5971 [00:08<1:58:59,  1.20s/it, loss=0.149, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.11e-5, train/loss_step=0.00184, global_step=1151.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 6/5971 [00:08<1:59:00,  1.20s/it, loss=0.133, v_num=0, train/loss_simple_step=0.0475, train/loss_vlb_step=0.000167, train/loss_step=0.0475, global_step=1151.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   0%|          | 7/5971 [00:09<1:55:18,  1.16s/it, loss=0.133, v_num=0, train/loss_simple_step=0.0475, train/loss_vlb_step=0.000167, train/loss_step=0.0475, global_step=1151.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 7/5971 [00:09<1:55:18,  1.16s/it, loss=0.133, v_num=0, train/loss_simple_step=0.00698, train/loss_vlb_step=3.28e-5, train/loss_step=0.00698, global_step=1151.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 8/5971 [00:11<2:05:59,  1.27s/it, loss=0.133, v_num=0, train/loss_simple_step=0.00698, train/loss_vlb_step=3.28e-5, train/loss_step=0.00698, global_step=1151.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 8/5971 [00:11<2:05:59,  1.27s/it, loss=0.134, v_num=0, train/loss_simple_step=0.0317, train/loss_vlb_step=0.000117, train/loss_step=0.0317, global_step=1151.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   0%|          | 9/5971 [00:12<2:02:22,  1.23s/it, loss=0.134, v_num=0, train/loss_simple_step=0.0317, train/loss_vlb_step=0.000117, train/loss_step=0.0317, global_step=1151.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 9/5971 [00:12<2:02:22,  1.23s/it, loss=0.116, v_num=0, train/loss_simple_step=0.00335, train/loss_vlb_step=1.85e-5, train/loss_step=0.00335, global_step=1152.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 10/5971 [00:13<1:59:04,  1.20s/it, loss=0.116, v_num=0, train/loss_simple_step=0.00335, train/loss_vlb_step=1.85e-5, train/loss_step=0.00335, global_step=1152.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 10/5971 [00:13<1:59:05,  1.20s/it, loss=0.116, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=4.89e-5, train/loss_step=0.0112, global_step=1152.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:   0%|          | 11/5971 [00:14<1:56:40,  1.17s/it, loss=0.116, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=4.89e-5, train/loss_step=0.0112, global_step=1152.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 11/5971 [00:14<1:56:41,  1.17s/it, loss=0.114, v_num=0, train/loss_simple_step=0.0019, train/loss_vlb_step=1.12e-5, train/loss_step=0.0019, global_step=1152.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 12/5971 [00:16<2:05:38,  1.27s/it, loss=0.114, v_num=0, train/loss_simple_step=0.0019, train/loss_vlb_step=1.12e-5, train/loss_step=0.0019, global_step=1152.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 12/5971 [00:16<2:05:38,  1.27s/it, loss=0.091, v_num=0, train/loss_simple_step=0.00961, train/loss_vlb_step=4.45e-5, train/loss_step=0.00961, global_step=1152.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 13/5971 [00:17<2:03:08,  1.24s/it, loss=0.091, v_num=0, train/loss_simple_step=0.00961, train/loss_vlb_step=4.45e-5, train/loss_step=0.00961, global_step=1152.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 13/5971 [00:17<2:03:08,  1.24s/it, loss=0.0738, v_num=0, train/loss_simple_step=0.023, train/loss_vlb_step=9.2e-5, train/loss_step=0.023, global_step=1153.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:   0%|          | 14/5971 [00:18<2:00:52,  1.22s/it, loss=0.0738, v_num=0, train/loss_simple_step=0.023, train/loss_vlb_step=9.2e-5, train/loss_step=0.023, global_step=1153.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 14/5971 [00:18<2:00:53,  1.22s/it, loss=0.0744, v_num=0, train/loss_simple_step=0.0152, train/loss_vlb_step=6.55e-5, train/loss_step=0.0152, global_step=1153.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 15/5971 [00:19<1:58:49,  1.20s/it, loss=0.0744, v_num=0, train/loss_simple_step=0.0152, train/loss_vlb_step=6.55e-5, train/loss_step=0.0152, global_step=1153.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 15/5971 [00:19<1:58:49,  1.20s/it, loss=0.0688, v_num=0, train/loss_simple_step=0.0698, train/loss_vlb_step=0.000231, train/loss_step=0.0698, global_step=1153.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 16/5971 [00:21<2:04:41,  1.26s/it, loss=0.0688, v_num=0, train/loss_simple_step=0.0698, train/loss_vlb_step=0.000231, train/loss_step=0.0698, global_step=1153.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 16/5971 [00:21<2:04:42,  1.26s/it, loss=0.0638, v_num=0, train/loss_simple_step=0.00415, train/loss_vlb_step=2.22e-5, train/loss_step=0.00415, global_step=1153.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 17/5971 [00:22<2:02:50,  1.24s/it, loss=0.0638, v_num=0, train/loss_simple_step=0.00415, train/loss_vlb_step=2.22e-5, train/loss_step=0.00415, global_step=1153.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 17/5971 [00:22<2:02:51,  1.24s/it, loss=0.0637, v_num=0, train/loss_simple_step=0.00173, train/loss_vlb_step=1.04e-5, train/loss_step=0.00173, global_step=1154.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 18/5971 [00:23<2:00:57,  1.22s/it, loss=0.0637, v_num=0, train/loss_simple_step=0.00173, train/loss_vlb_step=1.04e-5, train/loss_step=0.00173, global_step=1154.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 18/5971 [00:23<2:00:58,  1.22s/it, loss=0.0968, v_num=0, train/loss_simple_step=0.681, train/loss_vlb_step=0.0201, train/loss_step=0.681, global_step=1154.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2:   0%|          | 19/5971 [00:24<2:00:59,  1.22s/it, loss=0.0968, v_num=0, train/loss_simple_step=0.681, train/loss_vlb_step=0.0201, train/loss_step=0.681, global_step=1154.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 19/5971 [00:24<2:01:00,  1.22s/it, loss=0.0993, v_num=0, train/loss_simple_step=0.0835, train/loss_vlb_step=0.000284, train/loss_step=0.0835, global_step=1154.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 20/5971 [00:26<2:05:08,  1.26s/it, loss=0.0993, v_num=0, train/loss_simple_step=0.0835, train/loss_vlb_step=0.000284, train/loss_step=0.0835, global_step=1154.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 20/5971 [00:26<2:05:08,  1.26s/it, loss=0.111, v_num=0, train/loss_simple_step=0.344, train/loss_vlb_step=0.00186, train/loss_step=0.344, global_step=1154.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:   0%|          | 21/5971 [00:27<2:03:31,  1.25s/it, loss=0.111, v_num=0, train/loss_simple_step=0.344, train/loss_vlb_step=0.00186, train/loss_step=0.344, global_step=1154.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 21/5971 [00:27<2:03:31,  1.25s/it, loss=0.123, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.00103, train/loss_step=0.229, global_step=1155.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 22/5971 [00:28<2:01:57,  1.23s/it, loss=0.123, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.00103, train/loss_step=0.229, global_step=1155.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 22/5971 [00:28<2:01:57,  1.23s/it, loss=0.132, v_num=0, train/loss_simple_step=0.611, train/loss_vlb_step=0.00736, train/loss_step=0.611, global_step=1155.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 23/5971 [00:29<2:00:35,  1.22s/it, loss=0.132, v_num=0, train/loss_simple_step=0.611, train/loss_vlb_step=0.00736, train/loss_step=0.611, global_step=1155.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 23/5971 [00:29<2:00:35,  1.22s/it, loss=0.124, v_num=0, train/loss_simple_step=0.297, train/loss_vlb_step=0.00127, train/loss_step=0.297, global_step=1155.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 24/5971 [00:31<2:05:07,  1.26s/it, loss=0.124, v_num=0, train/loss_simple_step=0.297, train/loss_vlb_step=0.00127, train/loss_step=0.297, global_step=1155.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 24/5971 [00:31<2:05:07,  1.26s/it, loss=0.125, v_num=0, train/loss_simple_step=0.0274, train/loss_vlb_step=0.000108, train/loss_step=0.0274, global_step=1155.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 25/5971 [00:32<2:03:45,  1.25s/it, loss=0.125, v_num=0, train/loss_simple_step=0.0274, train/loss_vlb_step=0.000108, train/loss_step=0.0274, global_step=1155.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 25/5971 [00:32<2:03:46,  1.25s/it, loss=0.126, v_num=0, train/loss_simple_step=0.0216, train/loss_vlb_step=8.82e-5, train/loss_step=0.0216, global_step=1156.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   0%|          | 26/5971 [00:33<2:02:23,  1.24s/it, loss=0.126, v_num=0, train/loss_simple_step=0.0216, train/loss_vlb_step=8.82e-5, train/loss_step=0.0216, global_step=1156.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 26/5971 [00:33<2:02:23,  1.24s/it, loss=0.126, v_num=0, train/loss_simple_step=0.0496, train/loss_vlb_step=0.00017, train/loss_step=0.0496, global_step=1156.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 27/5971 [00:34<2:01:10,  1.22s/it, loss=0.126, v_num=0, train/loss_simple_step=0.0496, train/loss_vlb_step=0.00017, train/loss_step=0.0496, global_step=1156.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 27/5971 [00:34<2:01:10,  1.22s/it, loss=0.127, v_num=0, train/loss_simple_step=0.028, train/loss_vlb_step=0.000105, train/loss_step=0.028, global_step=1156.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   0%|          | 28/5971 [00:36<2:05:46,  1.27s/it, loss=0.127, v_num=0, train/loss_simple_step=0.028, train/loss_vlb_step=0.000105, train/loss_step=0.028, global_step=1156.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 28/5971 [00:36<2:05:46,  1.27s/it, loss=0.126, v_num=0, train/loss_simple_step=0.00995, train/loss_vlb_step=4.49e-5, train/loss_step=0.00995, global_step=1156.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 29/5971 [00:37<2:04:37,  1.26s/it, loss=0.126, v_num=0, train/loss_simple_step=0.00995, train/loss_vlb_step=4.49e-5, train/loss_step=0.00995, global_step=1156.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   0%|          | 29/5971 [00:37<2:04:37,  1.26s/it, loss=0.129, v_num=0, train/loss_simple_step=0.0643, train/loss_vlb_step=0.000222, train/loss_step=0.0643, global_step=1157.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   1%|          | 30/5971 [00:38<2:03:25,  1.25s/it, loss=0.129, v_num=0, train/loss_simple_step=0.0643, train/loss_vlb_step=0.000222, train/loss_step=0.0643, global_step=1157.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 30/5971 [00:38<2:03:26,  1.25s/it, loss=0.133, v_num=0, train/loss_simple_step=0.0973, train/loss_vlb_step=0.000324, train/loss_step=0.0973, global_step=1157.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 31/5971 [00:39<2:02:21,  1.24s/it, loss=0.133, v_num=0, train/loss_simple_step=0.0973, train/loss_vlb_step=0.000324, train/loss_step=0.0973, global_step=1157.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 31/5971 [00:39<2:02:22,  1.24s/it, loss=0.143, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000653, train/loss_step=0.190, global_step=1157.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:   1%|          | 32/5971 [00:41<2:05:27,  1.27s/it, loss=0.143, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000653, train/loss_step=0.190, global_step=1157.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 32/5971 [00:41<2:05:27,  1.27s/it, loss=0.168, v_num=0, train/loss_simple_step=0.508, train/loss_vlb_step=0.00416, train/loss_step=0.508, global_step=1157.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   1%|          | 33/5971 [00:42<2:04:24,  1.26s/it, loss=0.168, v_num=0, train/loss_simple_step=0.508, train/loss_vlb_step=0.00416, train/loss_step=0.508, global_step=1157.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 33/5971 [00:42<2:04:24,  1.26s/it, loss=0.167, v_num=0, train/loss_simple_step=0.00175, train/loss_vlb_step=1.07e-5, train/loss_step=0.00175, global_step=1158.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 34/5971 [00:43<2:03:20,  1.25s/it, loss=0.167, v_num=0, train/loss_simple_step=0.00175, train/loss_vlb_step=1.07e-5, train/loss_step=0.00175, global_step=1158.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 34/5971 [00:43<2:03:20,  1.25s/it, loss=0.166, v_num=0, train/loss_simple_step=0.0016, train/loss_vlb_step=9.76e-6, train/loss_step=0.0016, global_step=1158.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:   1%|          | 35/5971 [00:44<2:02:17,  1.24s/it, loss=0.166, v_num=0, train/loss_simple_step=0.0016, train/loss_vlb_step=9.76e-6, train/loss_step=0.0016, global_step=1158.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 35/5971 [00:44<2:02:17,  1.24s/it, loss=0.172, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000673, train/loss_step=0.192, global_step=1158.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   1%|          | 36/5971 [00:46<2:05:00,  1.26s/it, loss=0.172, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000673, train/loss_step=0.192, global_step=1158.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 36/5971 [00:46<2:05:00,  1.26s/it, loss=0.178, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000402, train/loss_step=0.122, global_step=1158.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 37/5971 [00:47<2:04:00,  1.25s/it, loss=0.178, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000402, train/loss_step=0.122, global_step=1158.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 37/5971 [00:47<2:04:00,  1.25s/it, loss=0.179, v_num=0, train/loss_simple_step=0.030, train/loss_vlb_step=0.000114, train/loss_step=0.030, global_step=1159.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 38/5971 [00:48<2:03:00,  1.24s/it, loss=0.179, v_num=0, train/loss_simple_step=0.030, train/loss_vlb_step=0.000114, train/loss_step=0.030, global_step=1159.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 38/5971 [00:48<2:03:00,  1.24s/it, loss=0.146, v_num=0, train/loss_simple_step=0.00845, train/loss_vlb_step=4.15e-5, train/loss_step=0.00845, global_step=1159.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 39/5971 [00:49<2:02:05,  1.23s/it, loss=0.146, v_num=0, train/loss_simple_step=0.00845, train/loss_vlb_step=4.15e-5, train/loss_step=0.00845, global_step=1159.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 39/5971 [00:49<2:02:05,  1.23s/it, loss=0.162, v_num=0, train/loss_simple_step=0.411, train/loss_vlb_step=0.00209, train/loss_step=0.411, global_step=1159.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:   1%|          | 40/5971 [00:51<2:04:19,  1.26s/it, loss=0.162, v_num=0, train/loss_simple_step=0.411, train/loss_vlb_step=0.00209, train/loss_step=0.411, global_step=1159.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 40/5971 [00:51<2:04:19,  1.26s/it, loss=0.149, v_num=0, train/loss_simple_step=0.0751, train/loss_vlb_step=0.000251, train/loss_step=0.0751, global_step=1159.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 41/5971 [00:52<2:03:26,  1.25s/it, loss=0.149, v_num=0, train/loss_simple_step=0.0751, train/loss_vlb_step=0.000251, train/loss_step=0.0751, global_step=1159.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 41/5971 [00:52<2:03:26,  1.25s/it, loss=0.154, v_num=0, train/loss_simple_step=0.339, train/loss_vlb_step=0.00232, train/loss_step=0.339, global_step=1160.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:   1%|          | 42/5971 [00:53<2:02:31,  1.24s/it, loss=0.154, v_num=0, train/loss_simple_step=0.339, train/loss_vlb_step=0.00232, train/loss_step=0.339, global_step=1160.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 42/5971 [00:53<2:02:31,  1.24s/it, loss=0.124, v_num=0, train/loss_simple_step=0.00442, train/loss_vlb_step=2.38e-5, train/loss_step=0.00442, global_step=1160.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 43/5971 [00:54<2:01:39,  1.23s/it, loss=0.124, v_num=0, train/loss_simple_step=0.00442, train/loss_vlb_step=2.38e-5, train/loss_step=0.00442, global_step=1160.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 43/5971 [00:54<2:01:39,  1.23s/it, loss=0.11, v_num=0, train/loss_simple_step=0.00917, train/loss_vlb_step=4.13e-5, train/loss_step=0.00917, global_step=1160.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   1%|          | 44/5971 [00:56<2:04:09,  1.26s/it, loss=0.11, v_num=0, train/loss_simple_step=0.00917, train/loss_vlb_step=4.13e-5, train/loss_step=0.00917, global_step=1160.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 44/5971 [00:56<2:04:10,  1.26s/it, loss=0.109, v_num=0, train/loss_simple_step=0.00912, train/loss_vlb_step=4.36e-5, train/loss_step=0.00912, global_step=1160.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 45/5971 [00:57<2:03:22,  1.25s/it, loss=0.109, v_num=0, train/loss_simple_step=0.00912, train/loss_vlb_step=4.36e-5, train/loss_step=0.00912, global_step=1160.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 45/5971 [00:57<2:03:22,  1.25s/it, loss=0.108, v_num=0, train/loss_simple_step=0.00167, train/loss_vlb_step=9.56e-6, train/loss_step=0.00167, global_step=1161.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 46/5971 [00:58<2:02:34,  1.24s/it, loss=0.108, v_num=0, train/loss_simple_step=0.00167, train/loss_vlb_step=9.56e-6, train/loss_step=0.00167, global_step=1161.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 46/5971 [00:58<2:02:34,  1.24s/it, loss=0.113, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000518, train/loss_step=0.156, global_step=1161.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:   1%|          | 47/5971 [00:59<2:01:46,  1.23s/it, loss=0.113, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000518, train/loss_step=0.156, global_step=1161.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 47/5971 [00:59<2:01:46,  1.23s/it, loss=0.124, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.00121, train/loss_step=0.247, global_step=1161.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   1%|          | 48/5971 [01:01<2:03:34,  1.25s/it, loss=0.124, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.00121, train/loss_step=0.247, global_step=1161.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 48/5971 [01:01<2:03:34,  1.25s/it, loss=0.139, v_num=0, train/loss_simple_step=0.303, train/loss_vlb_step=0.00124, train/loss_step=0.303, global_step=1161.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 49/5971 [01:02<2:02:49,  1.24s/it, loss=0.139, v_num=0, train/loss_simple_step=0.303, train/loss_vlb_step=0.00124, train/loss_step=0.303, global_step=1161.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 49/5971 [01:02<2:02:49,  1.24s/it, loss=0.145, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000669, train/loss_step=0.193, global_step=1162.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 50/5971 [01:03<2:02:04,  1.24s/it, loss=0.145, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000669, train/loss_step=0.193, global_step=1162.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 50/5971 [01:03<2:02:04,  1.24s/it, loss=0.143, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000184, train/loss_step=0.0552, global_step=1162.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 51/5971 [01:03<2:01:22,  1.23s/it, loss=0.143, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000184, train/loss_step=0.0552, global_step=1162.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 51/5971 [01:03<2:01:22,  1.23s/it, loss=0.134, v_num=0, train/loss_simple_step=0.0207, train/loss_vlb_step=8.74e-5, train/loss_step=0.0207, global_step=1162.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   1%|          | 52/5971 [01:06<2:03:18,  1.25s/it, loss=0.134, v_num=0, train/loss_simple_step=0.0207, train/loss_vlb_step=8.74e-5, train/loss_step=0.0207, global_step=1162.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 52/5971 [01:06<2:03:18,  1.25s/it, loss=0.116, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000435, train/loss_step=0.132, global_step=1162.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   1%|          | 53/5971 [01:07<2:02:38,  1.24s/it, loss=0.116, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000435, train/loss_step=0.132, global_step=1162.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 53/5971 [01:07<2:02:39,  1.24s/it, loss=0.116, v_num=0, train/loss_simple_step=0.00581, train/loss_vlb_step=2.89e-5, train/loss_step=0.00581, global_step=1163.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 54/5971 [01:08<2:01:57,  1.24s/it, loss=0.116, v_num=0, train/loss_simple_step=0.00581, train/loss_vlb_step=2.89e-5, train/loss_step=0.00581, global_step=1163.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 54/5971 [01:08<2:01:57,  1.24s/it, loss=0.131, v_num=0, train/loss_simple_step=0.299, train/loss_vlb_step=0.00126, train/loss_step=0.299, global_step=1163.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:   1%|          | 55/5971 [01:08<2:01:16,  1.23s/it, loss=0.131, v_num=0, train/loss_simple_step=0.299, train/loss_vlb_step=0.00126, train/loss_step=0.299, global_step=1163.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 55/5971 [01:08<2:01:16,  1.23s/it, loss=0.122, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.44e-5, train/loss_step=0.0154, global_step=1163.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 56/5971 [01:11<2:02:49,  1.25s/it, loss=0.122, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.44e-5, train/loss_step=0.0154, global_step=1163.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 56/5971 [01:11<2:02:49,  1.25s/it, loss=0.116, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.3e-5, train/loss_step=0.00225, global_step=1163.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 57/5971 [01:11<2:02:10,  1.24s/it, loss=0.116, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.3e-5, train/loss_step=0.00225, global_step=1163.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 57/5971 [01:11<2:02:11,  1.24s/it, loss=0.115, v_num=0, train/loss_simple_step=0.0175, train/loss_vlb_step=7.55e-5, train/loss_step=0.0175, global_step=1164.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   1%|          | 58/5971 [01:12<2:01:35,  1.23s/it, loss=0.115, v_num=0, train/loss_simple_step=0.0175, train/loss_vlb_step=7.55e-5, train/loss_step=0.0175, global_step=1164.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 58/5971 [01:12<2:01:35,  1.23s/it, loss=0.127, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.00116, train/loss_step=0.242, global_step=1164.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:   1%|          | 59/5971 [01:13<2:01:13,  1.23s/it, loss=0.127, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.00116, train/loss_step=0.242, global_step=1164.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 59/5971 [01:13<2:01:13,  1.23s/it, loss=0.107, v_num=0, train/loss_simple_step=0.00332, train/loss_vlb_step=1.78e-5, train/loss_step=0.00332, global_step=1164.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 60/5971 [01:16<2:02:48,  1.25s/it, loss=0.107, v_num=0, train/loss_simple_step=0.00332, train/loss_vlb_step=1.78e-5, train/loss_step=0.00332, global_step=1164.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 60/5971 [01:16<2:02:48,  1.25s/it, loss=0.111, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000577, train/loss_step=0.170, global_step=1164.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:   1%|          | 61/5971 [01:16<2:02:14,  1.24s/it, loss=0.111, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000577, train/loss_step=0.170, global_step=1164.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 61/5971 [01:16<2:02:14,  1.24s/it, loss=0.0946, v_num=0, train/loss_simple_step=0.00416, train/loss_vlb_step=2.25e-5, train/loss_step=0.00416, global_step=1165.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 62/5971 [01:17<2:01:53,  1.24s/it, loss=0.0946, v_num=0, train/loss_simple_step=0.00416, train/loss_vlb_step=2.25e-5, train/loss_step=0.00416, global_step=1165.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 62/5971 [01:17<2:01:53,  1.24s/it, loss=0.0948, v_num=0, train/loss_simple_step=0.00934, train/loss_vlb_step=4.37e-5, train/loss_step=0.00934, global_step=1165.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 63/5971 [01:18<2:01:20,  1.23s/it, loss=0.0948, v_num=0, train/loss_simple_step=0.00934, train/loss_vlb_step=4.37e-5, train/loss_step=0.00934, global_step=1165.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 63/5971 [01:18<2:01:20,  1.23s/it, loss=0.106, v_num=0, train/loss_simple_step=0.241, train/loss_vlb_step=0.000834, train/loss_step=0.241, global_step=1165.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:   1%|          | 64/5971 [01:21<2:03:17,  1.25s/it, loss=0.106, v_num=0, train/loss_simple_step=0.241, train/loss_vlb_step=0.000834, train/loss_step=0.241, global_step=1165.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 64/5971 [01:21<2:03:17,  1.25s/it, loss=0.106, v_num=0, train/loss_simple_step=0.00804, train/loss_vlb_step=3.81e-5, train/loss_step=0.00804, global_step=1165.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 65/5971 [01:22<2:02:44,  1.25s/it, loss=0.106, v_num=0, train/loss_simple_step=0.00804, train/loss_vlb_step=3.81e-5, train/loss_step=0.00804, global_step=1165.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 65/5971 [01:22<2:02:44,  1.25s/it, loss=0.118, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.00109, train/loss_step=0.231, global_step=1166.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:   1%|          | 66/5971 [01:23<2:02:12,  1.24s/it, loss=0.118, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.00109, train/loss_step=0.231, global_step=1166.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 66/5971 [01:23<2:02:12,  1.24s/it, loss=0.11, v_num=0, train/loss_simple_step=0.00452, train/loss_vlb_step=2.28e-5, train/loss_step=0.00452, global_step=1166.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 67/5971 [01:24<2:01:40,  1.24s/it, loss=0.11, v_num=0, train/loss_simple_step=0.00452, train/loss_vlb_step=2.28e-5, train/loss_step=0.00452, global_step=1166.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 67/5971 [01:24<2:01:41,  1.24s/it, loss=0.133, v_num=0, train/loss_simple_step=0.712, train/loss_vlb_step=0.0139, train/loss_step=0.712, global_step=1166.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:   1%|          | 68/5971 [01:26<2:02:55,  1.25s/it, loss=0.133, v_num=0, train/loss_simple_step=0.712, train/loss_vlb_step=0.0139, train/loss_step=0.712, global_step=1166.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 68/5971 [01:26<2:02:56,  1.25s/it, loss=0.142, v_num=0, train/loss_simple_step=0.466, train/loss_vlb_step=0.00391, train/loss_step=0.466, global_step=1166.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 69/5971 [01:27<2:02:25,  1.24s/it, loss=0.142, v_num=0, train/loss_simple_step=0.466, train/loss_vlb_step=0.00391, train/loss_step=0.466, global_step=1166.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 69/5971 [01:27<2:02:25,  1.24s/it, loss=0.162, v_num=0, train/loss_simple_step=0.592, train/loss_vlb_step=0.00689, train/loss_step=0.592, global_step=1167.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 70/5971 [01:27<2:01:52,  1.24s/it, loss=0.162, v_num=0, train/loss_simple_step=0.592, train/loss_vlb_step=0.00689, train/loss_step=0.592, global_step=1167.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 70/5971 [01:27<2:01:52,  1.24s/it, loss=0.159, v_num=0, train/loss_simple_step=0.00287, train/loss_vlb_step=1.58e-5, train/loss_step=0.00287, global_step=1167.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 71/5971 [01:28<2:01:30,  1.24s/it, loss=0.159, v_num=0, train/loss_simple_step=0.00287, train/loss_vlb_step=1.58e-5, train/loss_step=0.00287, global_step=1167.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 71/5971 [01:28<2:01:31,  1.24s/it, loss=0.158, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=4.8e-5, train/loss_step=0.011, global_step=1167.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2:   1%|          | 72/5971 [01:31<2:02:48,  1.25s/it, loss=0.158, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=4.8e-5, train/loss_step=0.011, global_step=1167.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 72/5971 [01:31<2:02:48,  1.25s/it, loss=0.153, v_num=0, train/loss_simple_step=0.0306, train/loss_vlb_step=0.000108, train/loss_step=0.0306, global_step=1167.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 73/5971 [01:32<2:02:17,  1.24s/it, loss=0.153, v_num=0, train/loss_simple_step=0.0306, train/loss_vlb_step=0.000108, train/loss_step=0.0306, global_step=1167.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 73/5971 [01:32<2:02:18,  1.24s/it, loss=0.155, v_num=0, train/loss_simple_step=0.0317, train/loss_vlb_step=0.000125, train/loss_step=0.0317, global_step=1168.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 74/5971 [01:32<2:01:47,  1.24s/it, loss=0.155, v_num=0, train/loss_simple_step=0.0317, train/loss_vlb_step=0.000125, train/loss_step=0.0317, global_step=1168.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|          | 74/5971 [01:32<2:01:47,  1.24s/it, loss=0.14, v_num=0, train/loss_simple_step=0.00242, train/loss_vlb_step=1.38e-5, train/loss_step=0.00242, global_step=1168.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|▏         | 75/5971 [01:33<2:01:18,  1.23s/it, loss=0.14, v_num=0, train/loss_simple_step=0.00242, train/loss_vlb_step=1.38e-5, train/loss_step=0.00242, global_step=1168.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|▏         | 75/5971 [01:33<2:01:18,  1.23s/it, loss=0.143, v_num=0, train/loss_simple_step=0.071, train/loss_vlb_step=0.000247, train/loss_step=0.071, global_step=1168.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:   1%|▏         | 76/5971 [01:35<2:02:25,  1.25s/it, loss=0.143, v_num=0, train/loss_simple_step=0.071, train/loss_vlb_step=0.000247, train/loss_step=0.071, global_step=1168.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|▏         | 76/5971 [01:35<2:02:25,  1.25s/it, loss=0.166, v_num=0, train/loss_simple_step=0.462, train/loss_vlb_step=0.00245, train/loss_step=0.462, global_step=1168.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   1%|▏         | 77/5971 [01:36<2:01:58,  1.24s/it, loss=0.166, v_num=0, train/loss_simple_step=0.462, train/loss_vlb_step=0.00245, train/loss_step=0.462, global_step=1168.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|▏         | 77/5971 [01:36<2:01:59,  1.24s/it, loss=0.172, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000469, train/loss_step=0.139, global_step=1169.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|▏         | 78/5971 [01:37<2:01:31,  1.24s/it, loss=0.172, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000469, train/loss_step=0.139, global_step=1169.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|▏         | 78/5971 [01:37<2:01:31,  1.24s/it, loss=0.177, v_num=0, train/loss_simple_step=0.344, train/loss_vlb_step=0.00185, train/loss_step=0.344, global_step=1169.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   1%|▏         | 79/5971 [01:38<2:01:04,  1.23s/it, loss=0.177, v_num=0, train/loss_simple_step=0.344, train/loss_vlb_step=0.00185, train/loss_step=0.344, global_step=1169.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|▏         | 79/5971 [01:38<2:01:04,  1.23s/it, loss=0.177, v_num=0, train/loss_simple_step=0.00234, train/loss_vlb_step=1.37e-5, train/loss_step=0.00234, global_step=1169.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|▏         | 80/5971 [01:40<2:02:11,  1.24s/it, loss=0.177, v_num=0, train/loss_simple_step=0.00234, train/loss_vlb_step=1.37e-5, train/loss_step=0.00234, global_step=1169.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|▏         | 80/5971 [01:40<2:02:11,  1.24s/it, loss=0.175, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000412, train/loss_step=0.125, global_step=1169.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:   1%|▏         | 81/5971 [01:41<2:01:46,  1.24s/it, loss=0.175, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000412, train/loss_step=0.125, global_step=1169.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|▏         | 81/5971 [01:41<2:01:46,  1.24s/it, loss=0.175, v_num=0, train/loss_simple_step=0.00381, train/loss_vlb_step=2.1e-5, train/loss_step=0.00381, global_step=1170.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|▏         | 82/5971 [01:42<2:01:18,  1.24s/it, loss=0.175, v_num=0, train/loss_simple_step=0.00381, train/loss_vlb_step=2.1e-5, train/loss_step=0.00381, global_step=1170.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|▏         | 82/5971 [01:42<2:01:18,  1.24s/it, loss=0.176, v_num=0, train/loss_simple_step=0.0309, train/loss_vlb_step=0.000119, train/loss_step=0.0309, global_step=1170.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|▏         | 83/5971 [01:43<2:00:52,  1.23s/it, loss=0.176, v_num=0, train/loss_simple_step=0.0309, train/loss_vlb_step=0.000119, train/loss_step=0.0309, global_step=1170.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|▏         | 83/5971 [01:43<2:00:52,  1.23s/it, loss=0.169, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000353, train/loss_step=0.107, global_step=1170.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:   1%|▏         | 84/5971 [01:45<2:02:09,  1.25s/it, loss=0.169, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000353, train/loss_step=0.107, global_step=1170.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|▏         | 84/5971 [01:45<2:02:09,  1.25s/it, loss=0.169, v_num=0, train/loss_simple_step=0.00422, train/loss_vlb_step=2.29e-5, train/loss_step=0.00422, global_step=1170.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|▏         | 85/5971 [01:46<2:01:44,  1.24s/it, loss=0.169, v_num=0, train/loss_simple_step=0.00422, train/loss_vlb_step=2.29e-5, train/loss_step=0.00422, global_step=1170.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|▏         | 85/5971 [01:46<2:01:44,  1.24s/it, loss=0.158, v_num=0, train/loss_simple_step=0.00895, train/loss_vlb_step=4.2e-5, train/loss_step=0.00895, global_step=1171.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   1%|▏         | 86/5971 [01:47<2:01:19,  1.24s/it, loss=0.158, v_num=0, train/loss_simple_step=0.00895, train/loss_vlb_step=4.2e-5, train/loss_step=0.00895, global_step=1171.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|▏         | 86/5971 [01:47<2:01:19,  1.24s/it, loss=0.178, v_num=0, train/loss_simple_step=0.404, train/loss_vlb_step=0.00207, train/loss_step=0.404, global_step=1171.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:   1%|▏         | 87/5971 [01:48<2:00:53,  1.23s/it, loss=0.178, v_num=0, train/loss_simple_step=0.404, train/loss_vlb_step=0.00207, train/loss_step=0.404, global_step=1171.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|▏         | 87/5971 [01:48<2:00:53,  1.23s/it, loss=0.15, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.00051, train/loss_step=0.155, global_step=1171.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   1%|▏         | 88/5971 [01:50<2:01:51,  1.24s/it, loss=0.15, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.00051, train/loss_step=0.155, global_step=1171.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|▏         | 88/5971 [01:50<2:01:51,  1.24s/it, loss=0.127, v_num=0, train/loss_simple_step=0.00194, train/loss_vlb_step=1.12e-5, train/loss_step=0.00194, global_step=1171.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|▏         | 89/5971 [01:51<2:01:27,  1.24s/it, loss=0.127, v_num=0, train/loss_simple_step=0.00194, train/loss_vlb_step=1.12e-5, train/loss_step=0.00194, global_step=1171.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   1%|▏         | 89/5971 [01:51<2:01:27,  1.24s/it, loss=0.0974, v_num=0, train/loss_simple_step=0.00945, train/loss_vlb_step=4.45e-5, train/loss_step=0.00945, global_step=1172.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   2%|▏         | 90/5971 [01:52<2:01:02,  1.23s/it, loss=0.0974, v_num=0, train/loss_simple_step=0.00945, train/loss_vlb_step=4.45e-5, train/loss_step=0.00945, global_step=1172.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   2%|▏         | 90/5971 [01:52<2:01:02,  1.23s/it, loss=0.123, v_num=0, train/loss_simple_step=0.520, train/loss_vlb_step=0.00829, train/loss_step=0.520, global_step=1172.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2:   2%|▏         | 91/5971 [01:53<2:00:39,  1.23s/it, loss=0.123, v_num=0, train/loss_simple_step=0.520, train/loss_vlb_step=0.00829, train/loss_step=0.520, global_step=1172.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   2%|▏         | 91/5971 [01:53<2:00:39,  1.23s/it, loss=0.16, v_num=0, train/loss_simple_step=0.737, train/loss_vlb_step=0.0243, train/loss_step=0.737, global_step=1172.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:   2%|▏         | 92/5971 [01:55<2:01:45,  1.24s/it, loss=0.16, v_num=0, train/loss_simple_step=0.737, train/loss_vlb_step=0.0243, train/loss_step=0.737, global_step=1172.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   2%|▏         | 92/5971 [01:55<2:01:45,  1.24s/it, loss=0.161, v_num=0, train/loss_simple_step=0.0656, train/loss_vlb_step=0.000222, train/loss_step=0.0656, global_step=1172.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   2%|▏         | 93/5971 [01:56<2:01:23,  1.24s/it, loss=0.161, v_num=0, train/loss_simple_step=0.0656, train/loss_vlb_step=0.000222, train/loss_step=0.0656, global_step=1172.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   2%|▏         | 93/5971 [01:56<2:01:23,  1.24s/it, loss=0.16, v_num=0, train/loss_simple_step=0.00379, train/loss_vlb_step=2.07e-5, train/loss_step=0.00379, global_step=1173.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   2%|▏         | 94/5971 [01:57<2:00:59,  1.24s/it, loss=0.16, v_num=0, train/loss_simple_step=0.00379, train/loss_vlb_step=2.07e-5, train/loss_step=0.00379, global_step=1173.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   2%|▏         | 94/5971 [01:57<2:00:59,  1.24s/it, loss=0.16, v_num=0, train/loss_simple_step=0.00526, train/loss_vlb_step=2.7e-5, train/loss_step=0.00526, global_step=1173.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   2%|▏         | 95/5971 [01:58<2:00:37,  1.23s/it, loss=0.16, v_num=0, train/loss_simple_step=0.00526, train/loss_vlb_step=2.7e-5, train/loss_step=0.00526, global_step=1173.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   2%|▏         | 95/5971 [01:58<2:00:37,  1.23s/it, loss=0.16, v_num=0, train/loss_simple_step=0.0635, train/loss_vlb_step=0.000227, train/loss_step=0.0635, global_step=1173.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   2%|▏         | 96/5971 [02:00<2:01:27,  1.24s/it, loss=0.16, v_num=0, train/loss_simple_step=0.0635, train/loss_vlb_step=0.000227, train/loss_step=0.0635, global_step=1173.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   2%|▏         | 96/5971 [02:00<2:01:28,  1.24s/it, loss=0.137, v_num=0, train/loss_simple_step=0.00302, train/loss_vlb_step=1.71e-5, train/loss_step=0.00302, global_step=1173.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   2%|▏         | 97/5971 [02:01<2:01:05,  1.24s/it, loss=0.137, v_num=0, train/loss_simple_step=0.00302, train/loss_vlb_step=1.71e-5, train/loss_step=0.00302, global_step=1173.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   2%|▏         | 97/5971 [02:01<2:01:05,  1.24s/it, loss=0.136, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000384, train/loss_step=0.116, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:   2%|▏         | 98/5971 [02:02<2:00:44,  1.23s/it, loss=0.136, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000384, train/loss_step=0.116, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   2%|▏         | 98/5971 [02:02<2:00:44,  1.23s/it, loss=0.12, v_num=0, train/loss_simple_step=0.0403, train/loss_vlb_step=0.00014, train/loss_step=0.0403, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   2%|▏         | 99/5971 [02:03<2:00:23,  1.23s/it, loss=0.12, v_num=0, train/loss_simple_step=0.0403, train/loss_vlb_step=0.00014, train/loss_step=0.0403, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   2%|▏         | 99/5971 [02:03<2:00:23,  1.23s/it, loss=0.135, v_num=0, train/loss_simple_step=0.295, train/loss_vlb_step=0.0012, train/loss_step=0.295, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:   2%|▏         | 100/5971 [02:05<2:01:16,  1.24s/it, loss=0.135, v_num=0, train/loss_simple_step=0.295, train/loss_vlb_step=0.0012, train/loss_step=0.295, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   2%|▏         | 100/5971 [02:05<2:01:16,  1.24s/it, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:07,  2.46it/s][A
Epoch 2:   2%|▏         | 102/5971 [02:05<1:59:18,  1.22s/it, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   1%|          | 2/167 [00:00<00:56,  2.92it/s][A
Epoch 2:   2%|▏         | 104/5971 [02:05<1:57:17,  1.20s/it, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   3%|▎         | 5/167 [00:00<00:19,  8.12it/s][A
Epoch 2:   2%|▏         | 107/5971 [02:06<1:54:05,  1.17s/it, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:00<00:12, 12.23it/s][A
Epoch 2:   2%|▏         | 110/5971 [02:06<1:51:02,  1.14s/it, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   7%|▋         | 11/167 [00:01<00:09, 15.74it/s][A
Epoch 2:   2%|▏         | 113/5971 [02:06<1:48:10,  1.11s/it, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   8%|▊         | 14/167 [00:01<00:08, 17.89it/s][A
Epoch 2:   2%|▏         | 116/5971 [02:06<1:45:27,  1.08s/it, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  10%|█         | 17/167 [00:01<00:07, 19.98it/s][A
Epoch 2:   2%|▏         | 119/5971 [02:06<1:42:51,  1.05s/it, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.14it/s][A
Epoch 2:   2%|▏         | 122/5971 [02:06<1:40:22,  1.03s/it, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 23.51it/s][A
Epoch 2:   2%|▏         | 125/5971 [02:06<1:38:01,  1.01s/it, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 25.01it/s][A
Epoch 2:   2%|▏         | 129/5971 [02:06<1:35:02,  1.02it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 25.68it/s][A

Validating:  19%|█▉        | 32/167 [00:01<00:05, 25.71it/s][A
Epoch 2:   2%|▏         | 133/5971 [02:07<1:32:15,  1.05it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  22%|██▏       | 36/167 [00:01<00:04, 26.76it/s][A
Epoch 2:   2%|▏         | 137/5971 [02:07<1:29:37,  1.08it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  23%|██▎       | 39/167 [00:02<00:04, 27.60it/s][A
Epoch 2:   2%|▏         | 141/5971 [02:07<1:27:07,  1.12it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 27.44it/s][A
Epoch 2:   2%|▏         | 145/5971 [02:07<1:24:47,  1.15it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 27.23it/s][A

Validating:  29%|██▊       | 48/167 [00:02<00:04, 27.54it/s][A
Epoch 2:   2%|▏         | 149/5971 [02:07<1:22:33,  1.18it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  31%|███       | 51/167 [00:02<00:04, 27.33it/s][A
Epoch 2:   3%|▎         | 153/5971 [02:07<1:20:27,  1.21it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 25.79it/s][A
Epoch 2:   3%|▎         | 157/5971 [02:07<1:18:28,  1.23it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 25.32it/s][A

Validating:  36%|███▌      | 60/167 [00:02<00:04, 25.95it/s][A
Epoch 2:   3%|▎         | 161/5971 [02:08<1:16:34,  1.26it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  38%|███▊      | 63/167 [00:02<00:03, 26.39it/s][A
Epoch 2:   3%|▎         | 165/5971 [02:08<1:14:45,  1.29it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  40%|███▉      | 66/167 [00:03<00:03, 26.99it/s][A
Epoch 2:   3%|▎         | 169/5971 [02:08<1:13:02,  1.32it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 25.91it/s][A

Validating:  43%|████▎     | 72/167 [00:03<00:03, 25.13it/s][A
Epoch 2:   3%|▎         | 173/5971 [02:08<1:11:24,  1.35it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 25.80it/s][A
Epoch 2:   3%|▎         | 177/5971 [02:08<1:09:50,  1.38it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 25.86it/s][A
Epoch 2:   3%|▎         | 181/5971 [02:08<1:08:19,  1.41it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 26.58it/s][A

Validating:  50%|█████     | 84/167 [00:03<00:03, 27.25it/s][A
Epoch 2:   3%|▎         | 185/5971 [02:09<1:06:53,  1.44it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  52%|█████▏    | 87/167 [00:03<00:03, 25.87it/s][A
Epoch 2:   3%|▎         | 189/5971 [02:09<1:05:31,  1.47it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  54%|█████▍    | 90/167 [00:04<00:02, 25.93it/s][A
Epoch 2:   3%|▎         | 193/5971 [02:09<1:04:12,  1.50it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 25.70it/s][A
Epoch 2:   3%|▎         | 197/5971 [02:09<1:02:55,  1.53it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 27.59it/s][A

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 27.98it/s][A
Epoch 2:   3%|▎         | 201/5971 [02:09<1:01:42,  1.56it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 28.42it/s][A
Epoch 2:   3%|▎         | 205/5971 [02:09<1:00:31,  1.59it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 27.77it/s][A
Epoch 2:   4%|▎         | 209/5971 [02:09<59:23,  1.62it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 28.32it/s][A

Validating:  67%|██████▋   | 112/167 [00:04<00:01, 27.91it/s][A
Epoch 2:   4%|▎         | 213/5971 [02:10<58:18,  1.65it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  69%|██████▉   | 115/167 [00:04<00:01, 27.34it/s][A
Epoch 2:   4%|▎         | 217/5971 [02:10<57:16,  1.67it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  71%|███████   | 118/167 [00:05<00:01, 26.99it/s][A
Epoch 2:   4%|▎         | 221/5971 [02:10<56:15,  1.70it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 26.08it/s][A

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 25.03it/s][A
Epoch 2:   4%|▍         | 225/5971 [02:10<55:18,  1.73it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 26.13it/s][A
Epoch 2:   4%|▍         | 229/5971 [02:10<54:21,  1.76it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 24.78it/s][A
Epoch 2:   4%|▍         | 233/5971 [02:10<53:28,  1.79it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 24.99it/s][A

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 25.05it/s][A
Epoch 2:   4%|▍         | 237/5971 [02:10<52:35,  1.82it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 25.52it/s][A
Epoch 2:   4%|▍         | 241/5971 [02:11<51:45,  1.85it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 26.86it/s][A
Epoch 2:   4%|▍         | 245/5971 [02:11<50:55,  1.87it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 26.41it/s][A
Epoch 2:   4%|▍         | 249/5971 [02:11<50:08,  1.90it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 25.99it/s][A

Validating:  91%|█████████ | 152/167 [00:06<00:00, 26.69it/s][A
Epoch 2:   4%|▍         | 253/5971 [02:11<49:22,  1.93it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 24.78it/s][A
Epoch 2:   4%|▍         | 257/5971 [02:11<48:38,  1.96it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 24.51it/s][A
Epoch 2:   4%|▍         | 261/5971 [02:11<47:55,  1.99it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 25.72it/s][A

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 26.77it/s][A
Epoch 2:   4%|▍         | 265/5971 [02:12<47:12,  2.01it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   4%|▍         | 268/5971 [02:12<46:46,  2.03it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Epoch 2:   5%|▍         | 269/5971 [02:13<46:57,  2.02it/s, loss=0.136, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000472, train/loss_step=0.141, global_step=1174.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▍         | 269/5971 [02:13<46:57,  2.02it/s, loss=0.146, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000757, train/loss_step=0.213, global_step=1175.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▍         | 270/5971 [02:14<47:04,  2.02it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00273, train/loss_vlb_step=1.57e-5, train/loss_step=0.00273, global_step=1175.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▍         | 271/5971 [02:15<47:12,  2.01it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00412, train/loss_vlb_step=2.15e-5, train/loss_step=0.00412, global_step=1175.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   5%|▍         | 272/5971 [02:18<48:17,  1.97it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00316, train/loss_vlb_step=1.76e-5, train/loss_step=0.00316, global_step=1175.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▍         | 273/5971 [02:19<48:25,  1.96it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00316, train/loss_vlb_step=1.76e-5, train/loss_step=0.00316, global_step=1175.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▍         | 273/5971 [02:19<48:25,  1.96it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=5.93e-5, train/loss_step=0.0138, global_step=1176.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:   5%|▍         | 274/5971 [02:20<48:32,  1.96it/s, loss=0.128, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000568, train/loss_step=0.164, global_step=1176.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▍         | 275/5971 [02:21<48:39,  1.95it/s, loss=0.124, v_num=0, train/loss_simple_step=0.075, train/loss_vlb_step=0.000253, train/loss_step=0.075, global_step=1176.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▍         | 276/5971 [02:23<49:13,  1.93it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0366, train/loss_vlb_step=0.000137, train/loss_step=0.0366, global_step=1176.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▍         | 277/5971 [02:24<49:20,  1.92it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0366, train/loss_vlb_step=0.000137, train/loss_step=0.0366, global_step=1176.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▍         | 277/5971 [02:24<49:20,  1.92it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00336, train/loss_vlb_step=1.84e-5, train/loss_step=0.00336, global_step=1177.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▍         | 278/5971 [02:25<49:27,  1.92it/s, loss=0.107, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000489, train/loss_step=0.147, global_step=1177.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:   5%|▍         | 279/5971 [02:26<49:33,  1.91it/s, loss=0.0717, v_num=0, train/loss_simple_step=0.0375, train/loss_vlb_step=0.000144, train/loss_step=0.0375, global_step=1177.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▍         | 280/5971 [02:28<50:12,  1.89it/s, loss=0.0884, v_num=0, train/loss_simple_step=0.399, train/loss_vlb_step=0.00176, train/loss_step=0.399, global_step=1177.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:   5%|▍         | 281/5971 [02:29<50:19,  1.88it/s, loss=0.0884, v_num=0, train/loss_simple_step=0.399, train/loss_vlb_step=0.00176, train/loss_step=0.399, global_step=1177.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▍         | 281/5971 [02:29<50:19,  1.88it/s, loss=0.0904, v_num=0, train/loss_simple_step=0.0447, train/loss_vlb_step=0.000159, train/loss_step=0.0447, global_step=1178.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▍         | 282/5971 [02:30<50:25,  1.88it/s, loss=0.114, v_num=0, train/loss_simple_step=0.468, train/loss_vlb_step=0.00596, train/loss_step=0.468, global_step=1178.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:   5%|▍         | 283/5971 [02:31<50:31,  1.88it/s, loss=0.13, v_num=0, train/loss_simple_step=0.402, train/loss_vlb_step=0.00239, train/loss_step=0.402, global_step=1178.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   5%|▍         | 284/5971 [02:33<51:03,  1.86it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0636, train/loss_vlb_step=0.000209, train/loss_step=0.0636, global_step=1178.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▍         | 285/5971 [02:34<51:10,  1.85it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0636, train/loss_vlb_step=0.000209, train/loss_step=0.0636, global_step=1178.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▍         | 285/5971 [02:34<51:11,  1.85it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00883, train/loss_vlb_step=3.92e-5, train/loss_step=0.00883, global_step=1179.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▍         | 286/5971 [02:35<51:17,  1.85it/s, loss=0.152, v_num=0, train/loss_simple_step=0.510, train/loss_vlb_step=0.00574, train/loss_step=0.510, global_step=1179.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:   5%|▍         | 287/5971 [02:36<51:23,  1.84it/s, loss=0.158, v_num=0, train/loss_simple_step=0.417, train/loss_vlb_step=0.00263, train/loss_step=0.417, global_step=1179.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▍         | 288/5971 [02:38<51:58,  1.82it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0313, train/loss_vlb_step=0.000117, train/loss_step=0.0313, global_step=1179.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▍         | 289/5971 [02:39<52:05,  1.82it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0313, train/loss_vlb_step=0.000117, train/loss_step=0.0313, global_step=1179.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▍         | 289/5971 [02:39<52:05,  1.82it/s, loss=0.146, v_num=0, train/loss_simple_step=0.089, train/loss_vlb_step=0.0003, train/loss_step=0.089, global_step=1180.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:   5%|▍         | 290/5971 [02:40<52:11,  1.81it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00593, train/loss_vlb_step=2.9e-5, train/loss_step=0.00593, global_step=1180.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▍         | 291/5971 [02:41<52:17,  1.81it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00707, train/loss_vlb_step=3.46e-5, train/loss_step=0.00707, global_step=1180.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▍         | 292/5971 [02:43<52:47,  1.79it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00387, train/loss_vlb_step=2.04e-5, train/loss_step=0.00387, global_step=1180.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▍         | 293/5971 [02:44<52:53,  1.79it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00387, train/loss_vlb_step=2.04e-5, train/loss_step=0.00387, global_step=1180.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▍         | 293/5971 [02:44<52:53,  1.79it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=4.95e-5, train/loss_step=0.0115, global_step=1181.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:   5%|▍         | 294/5971 [02:45<52:59,  1.79it/s, loss=0.162, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00536, train/loss_step=0.477, global_step=1181.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:   5%|▍         | 295/5971 [02:46<53:05,  1.78it/s, loss=0.17, v_num=0, train/loss_simple_step=0.246, train/loss_vlb_step=0.000946, train/loss_step=0.246, global_step=1181.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▍         | 296/5971 [02:48<53:35,  1.77it/s, loss=0.175, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000416, train/loss_step=0.127, global_step=1181.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▍         | 297/5971 [02:49<53:41,  1.76it/s, loss=0.175, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000416, train/loss_step=0.127, global_step=1181.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▍         | 297/5971 [02:49<53:41,  1.76it/s, loss=0.205, v_num=0, train/loss_simple_step=0.602, train/loss_vlb_step=0.00646, train/loss_step=0.602, global_step=1182.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   5%|▍         | 298/5971 [02:50<53:46,  1.76it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0557, train/loss_vlb_step=0.000196, train/loss_step=0.0557, global_step=1182.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▌         | 299/5971 [02:50<53:52,  1.75it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0256, train/loss_vlb_step=9.45e-5, train/loss_step=0.0256, global_step=1182.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   5%|▌         | 300/5971 [02:53<54:21,  1.74it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.34e-5, train/loss_step=0.0124, global_step=1182.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▌         | 301/5971 [02:54<54:28,  1.73it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.34e-5, train/loss_step=0.0124, global_step=1182.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▌         | 301/5971 [02:54<54:28,  1.73it/s, loss=0.179, v_num=0, train/loss_simple_step=0.00942, train/loss_vlb_step=4.42e-5, train/loss_step=0.00942, global_step=1183.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▌         | 302/5971 [02:54<54:33,  1.73it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00913, train/loss_vlb_step=4.35e-5, train/loss_step=0.00913, global_step=1183.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▌         | 303/5971 [02:55<54:37,  1.73it/s, loss=0.148, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.000919, train/loss_step=0.250, global_step=1183.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:   5%|▌         | 304/5971 [02:57<55:06,  1.71it/s, loss=0.158, v_num=0, train/loss_simple_step=0.261, train/loss_vlb_step=0.000973, train/loss_step=0.261, global_step=1183.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▌         | 305/5971 [02:58<55:11,  1.71it/s, loss=0.158, v_num=0, train/loss_simple_step=0.261, train/loss_vlb_step=0.000973, train/loss_step=0.261, global_step=1183.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▌         | 305/5971 [02:58<55:11,  1.71it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00654, train/loss_vlb_step=3.17e-5, train/loss_step=0.00654, global_step=1184.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▌         | 306/5971 [02:59<55:16,  1.71it/s, loss=0.162, v_num=0, train/loss_simple_step=0.592, train/loss_vlb_step=0.00635, train/loss_step=0.592, global_step=1184.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:   5%|▌         | 307/5971 [03:00<55:20,  1.71it/s, loss=0.151, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.000799, train/loss_step=0.206, global_step=1184.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▌         | 308/5971 [03:03<55:56,  1.69it/s, loss=0.159, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.0007, train/loss_step=0.186, global_step=1184.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:   5%|▌         | 309/5971 [03:04<56:01,  1.68it/s, loss=0.159, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.0007, train/loss_step=0.186, global_step=1184.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▌         | 309/5971 [03:04<56:01,  1.68it/s, loss=0.166, v_num=0, train/loss_simple_step=0.221, train/loss_vlb_step=0.00075, train/loss_step=0.221, global_step=1185.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▌         | 310/5971 [03:04<56:05,  1.68it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0288, train/loss_vlb_step=0.000113, train/loss_step=0.0288, global_step=1185.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▌         | 311/5971 [03:05<56:10,  1.68it/s, loss=0.179, v_num=0, train/loss_simple_step=0.245, train/loss_vlb_step=0.00088, train/loss_step=0.245, global_step=1185.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:   5%|▌         | 312/5971 [03:08<56:39,  1.66it/s, loss=0.179, v_num=0, train/loss_simple_step=0.00269, train/loss_vlb_step=1.55e-5, train/loss_step=0.00269, global_step=1185.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▌         | 313/5971 [03:08<56:44,  1.66it/s, loss=0.179, v_num=0, train/loss_simple_step=0.00269, train/loss_vlb_step=1.55e-5, train/loss_step=0.00269, global_step=1185.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▌         | 313/5971 [03:08<56:44,  1.66it/s, loss=0.19, v_num=0, train/loss_simple_step=0.245, train/loss_vlb_step=0.00122, train/loss_step=0.245, global_step=1186.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2:   5%|▌         | 314/5971 [03:09<56:48,  1.66it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00396, train/loss_vlb_step=2.04e-5, train/loss_step=0.00396, global_step=1186.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▌         | 315/5971 [03:10<56:53,  1.66it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.55e-5, train/loss_step=0.0028, global_step=1186.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:   5%|▌         | 316/5971 [03:12<57:20,  1.64it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00498, train/loss_vlb_step=2.53e-5, train/loss_step=0.00498, global_step=1186.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▌         | 317/5971 [03:13<57:25,  1.64it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00498, train/loss_vlb_step=2.53e-5, train/loss_step=0.00498, global_step=1186.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▌         | 317/5971 [03:13<57:25,  1.64it/s, loss=0.133, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.00118, train/loss_step=0.283, global_step=1187.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:   5%|▌         | 318/5971 [03:14<57:30,  1.64it/s, loss=0.149, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.00206, train/loss_step=0.391, global_step=1187.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▌         | 319/5971 [03:15<57:34,  1.64it/s, loss=0.16, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.000913, train/loss_step=0.229, global_step=1187.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▌         | 320/5971 [03:17<57:59,  1.62it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0803, train/loss_vlb_step=0.000266, train/loss_step=0.0803, global_step=1187.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▌         | 321/5971 [03:18<58:04,  1.62it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0803, train/loss_vlb_step=0.000266, train/loss_step=0.0803, global_step=1187.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▌         | 321/5971 [03:18<58:04,  1.62it/s, loss=0.181, v_num=0, train/loss_simple_step=0.379, train/loss_vlb_step=0.0021, train/loss_step=0.379, global_step=1188.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:   5%|▌         | 322/5971 [03:19<58:08,  1.62it/s, loss=0.2, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00257, train/loss_step=0.375, global_step=1188.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   5%|▌         | 323/5971 [03:20<58:12,  1.62it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0682, train/loss_vlb_step=0.000235, train/loss_step=0.0682, global_step=1188.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▌         | 324/5971 [03:22<58:38,  1.60it/s, loss=0.183, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000362, train/loss_step=0.110, global_step=1188.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:   5%|▌         | 325/5971 [03:23<58:42,  1.60it/s, loss=0.183, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000362, train/loss_step=0.110, global_step=1188.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▌         | 325/5971 [03:23<58:42,  1.60it/s, loss=0.198, v_num=0, train/loss_simple_step=0.304, train/loss_vlb_step=0.00121, train/loss_step=0.304, global_step=1189.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   5%|▌         | 326/5971 [03:24<58:46,  1.60it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0568, train/loss_vlb_step=0.000204, train/loss_step=0.0568, global_step=1189.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▌         | 327/5971 [03:25<58:50,  1.60it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00879, train/loss_vlb_step=4.17e-5, train/loss_step=0.00879, global_step=1189.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   5%|▌         | 328/5971 [03:27<59:19,  1.59it/s, loss=0.154, v_num=0, train/loss_simple_step=0.034, train/loss_vlb_step=0.000121, train/loss_step=0.034, global_step=1189.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:   6%|▌         | 329/5971 [03:28<59:23,  1.58it/s, loss=0.154, v_num=0, train/loss_simple_step=0.034, train/loss_vlb_step=0.000121, train/loss_step=0.034, global_step=1189.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 329/5971 [03:28<59:23,  1.58it/s, loss=0.15, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000474, train/loss_step=0.144, global_step=1190.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   6%|▌         | 330/5971 [03:29<59:26,  1.58it/s, loss=0.16, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000869, train/loss_step=0.228, global_step=1190.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 331/5971 [03:30<59:29,  1.58it/s, loss=0.154, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=1190.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 332/5971 [03:32<59:57,  1.57it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=5.19e-5, train/loss_step=0.0111, global_step=1190.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 333/5971 [03:33<1:00:00,  1.57it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=5.19e-5, train/loss_step=0.0111, global_step=1190.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 333/5971 [03:33<1:00:00,  1.57it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00577, train/loss_vlb_step=2.86e-5, train/loss_step=0.00577, global_step=1191.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 334/5971 [03:34<1:00:04,  1.56it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0267, train/loss_vlb_step=0.000107, train/loss_step=0.0267, global_step=1191.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   6%|▌         | 335/5971 [03:35<1:00:07,  1.56it/s, loss=0.178, v_num=0, train/loss_simple_step=0.686, train/loss_vlb_step=0.0324, train/loss_step=0.686, global_step=1191.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:   6%|▌         | 336/5971 [03:37<1:00:35,  1.55it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.4e-5, train/loss_step=0.0126, global_step=1191.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 337/5971 [03:38<1:00:39,  1.55it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.4e-5, train/loss_step=0.0126, global_step=1191.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 337/5971 [03:38<1:00:39,  1.55it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00497, train/loss_vlb_step=2.62e-5, train/loss_step=0.00497, global_step=1192.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 338/5971 [03:39<1:00:42,  1.55it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00944, train/loss_vlb_step=4.27e-5, train/loss_step=0.00944, global_step=1192.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 339/5971 [03:40<1:00:45,  1.54it/s, loss=0.15, v_num=0, train/loss_simple_step=0.309, train/loss_vlb_step=0.00141, train/loss_step=0.309, global_step=1192.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2:   6%|▌         | 340/5971 [03:42<1:01:11,  1.53it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0245, train/loss_vlb_step=9.55e-5, train/loss_step=0.0245, global_step=1192.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 341/5971 [03:43<1:01:18,  1.53it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0245, train/loss_vlb_step=9.55e-5, train/loss_step=0.0245, global_step=1192.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 341/5971 [03:43<1:01:18,  1.53it/s, loss=0.134, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.0005, train/loss_step=0.129, global_step=1193.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:   6%|▌         | 342/5971 [03:44<1:01:21,  1.53it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0044, train/loss_vlb_step=2.4e-5, train/loss_step=0.0044, global_step=1193.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 343/5971 [03:45<1:01:25,  1.53it/s, loss=0.15, v_num=0, train/loss_simple_step=0.754, train/loss_vlb_step=0.0552, train/loss_step=0.754, global_step=1193.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:   6%|▌         | 344/5971 [03:47<1:01:52,  1.52it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0479, train/loss_vlb_step=0.000168, train/loss_step=0.0479, global_step=1193.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 345/5971 [03:48<1:01:55,  1.51it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0479, train/loss_vlb_step=0.000168, train/loss_step=0.0479, global_step=1193.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 345/5971 [03:48<1:01:55,  1.51it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=4.84e-5, train/loss_step=0.0114, global_step=1194.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   6%|▌         | 346/5971 [03:49<1:01:58,  1.51it/s, loss=0.14, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000833, train/loss_step=0.218, global_step=1194.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:   6%|▌         | 347/5971 [03:50<1:02:00,  1.51it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0187, train/loss_vlb_step=7.63e-5, train/loss_step=0.0187, global_step=1194.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 348/5971 [03:53<1:02:40,  1.50it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0999, train/loss_vlb_step=0.000329, train/loss_step=0.0999, global_step=1194.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 349/5971 [03:54<1:02:43,  1.49it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0999, train/loss_vlb_step=0.000329, train/loss_step=0.0999, global_step=1194.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 349/5971 [03:54<1:02:43,  1.49it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0627, train/loss_vlb_step=0.000214, train/loss_step=0.0627, global_step=1195.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   6%|▌         | 350/5971 [03:55<1:02:46,  1.49it/s, loss=0.146, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00175, train/loss_step=0.348, global_step=1195.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:   6%|▌         | 351/5971 [03:56<1:02:48,  1.49it/s, loss=0.14, v_num=0, train/loss_simple_step=0.013, train/loss_vlb_step=5.47e-5, train/loss_step=0.013, global_step=1195.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   6%|▌         | 352/5971 [03:58<1:03:18,  1.48it/s, loss=0.162, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.00423, train/loss_step=0.454, global_step=1195.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 353/5971 [03:59<1:03:21,  1.48it/s, loss=0.162, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.00423, train/loss_step=0.454, global_step=1195.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 353/5971 [03:59<1:03:21,  1.48it/s, loss=0.172, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000673, train/loss_step=0.201, global_step=1196.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 354/5971 [04:00<1:03:24,  1.48it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0381, train/loss_vlb_step=0.000132, train/loss_step=0.0381, global_step=1196.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 355/5971 [04:01<1:03:26,  1.48it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0604, train/loss_vlb_step=0.0002, train/loss_step=0.0604, global_step=1196.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:   6%|▌         | 356/5971 [04:03<1:03:48,  1.47it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00465, train/loss_vlb_step=2.49e-5, train/loss_step=0.00465, global_step=1196.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 357/5971 [04:04<1:03:51,  1.47it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00465, train/loss_vlb_step=2.49e-5, train/loss_step=0.00465, global_step=1196.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 357/5971 [04:04<1:03:51,  1.47it/s, loss=0.157, v_num=0, train/loss_simple_step=0.335, train/loss_vlb_step=0.00159, train/loss_step=0.335, global_step=1197.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:   6%|▌         | 358/5971 [04:05<1:03:54,  1.46it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00283, train/loss_vlb_step=1.52e-5, train/loss_step=0.00283, global_step=1197.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 359/5971 [04:06<1:03:56,  1.46it/s, loss=0.17, v_num=0, train/loss_simple_step=0.574, train/loss_vlb_step=0.00965, train/loss_step=0.574, global_step=1197.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2:   6%|▌         | 360/5971 [04:08<1:04:21,  1.45it/s, loss=0.175, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000415, train/loss_step=0.125, global_step=1197.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 361/5971 [04:09<1:04:24,  1.45it/s, loss=0.175, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000415, train/loss_step=0.125, global_step=1197.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 361/5971 [04:09<1:04:24,  1.45it/s, loss=0.2, v_num=0, train/loss_simple_step=0.633, train/loss_vlb_step=0.0063, train/loss_step=0.633, global_step=1198.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:   6%|▌         | 362/5971 [04:10<1:04:27,  1.45it/s, loss=0.201, v_num=0, train/loss_simple_step=0.00938, train/loss_vlb_step=4.22e-5, train/loss_step=0.00938, global_step=1198.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 363/5971 [04:11<1:04:29,  1.45it/s, loss=0.171, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000573, train/loss_step=0.160, global_step=1198.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:   6%|▌         | 364/5971 [04:13<1:04:54,  1.44it/s, loss=0.175, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000448, train/loss_step=0.136, global_step=1198.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 365/5971 [04:14<1:04:58,  1.44it/s, loss=0.175, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000448, train/loss_step=0.136, global_step=1198.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 365/5971 [04:14<1:04:58,  1.44it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0182, train/loss_vlb_step=7.79e-5, train/loss_step=0.0182, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 366/5971 [04:15<1:05:00,  1.44it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0014, train/loss_vlb_step=8.38e-6, train/loss_step=0.0014, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 367/5971 [04:16<1:05:03,  1.44it/s, loss=0.177, v_num=0, train/loss_simple_step=0.259, train/loss_vlb_step=0.00109, train/loss_step=0.259, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:   6%|▌         | 368/5971 [04:18<1:05:28,  1.43it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   6%|▌         | 369/5971 [04:18<1:05:17,  1.43it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:04,  2.56it/s][A

Validating:   1%|          | 2/167 [00:00<01:04,  2.57it/s][A
Epoch 2:   6%|▌         | 373/5971 [04:19<1:04:45,  1.44it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   3%|▎         | 5/167 [00:00<00:22,  7.18it/s][A

Validating:   5%|▍         | 8/167 [00:01<00:14, 11.02it/s][A
Epoch 2:   6%|▋         | 377/5971 [04:19<1:04:04,  1.46it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   7%|▋         | 11/167 [00:01<00:10, 14.86it/s][A
Epoch 2:   6%|▋         | 381/5971 [04:19<1:03:23,  1.47it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   8%|▊         | 14/167 [00:01<00:09, 16.52it/s][A
Epoch 2:   6%|▋         | 385/5971 [04:20<1:02:43,  1.48it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  10%|█         | 17/167 [00:01<00:07, 19.45it/s][A

Validating:  12%|█▏        | 20/167 [00:01<00:07, 20.86it/s][A
Epoch 2:   7%|▋         | 389/5971 [04:20<1:02:04,  1.50it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 21.70it/s][A
Epoch 2:   7%|▋         | 393/5971 [04:20<1:01:26,  1.51it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  16%|█▌        | 26/167 [00:01<00:06, 21.83it/s][A
Epoch 2:   7%|▋         | 397/5971 [04:20<1:00:49,  1.53it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 23.72it/s][A
Epoch 2:   7%|▋         | 401/5971 [04:20<1:00:12,  1.54it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  20%|██        | 34/167 [00:02<00:05, 25.72it/s][A
Epoch 2:   7%|▋         | 405/5971 [04:20<59:36,  1.56it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  

Validating:  22%|██▏       | 37/167 [00:02<00:04, 26.21it/s][A

Validating:  24%|██▍       | 40/167 [00:02<00:04, 25.65it/s][A
Epoch 2:   7%|▋         | 409/5971 [04:21<59:00,  1.57it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  26%|██▌       | 43/167 [00:02<00:04, 26.14it/s][A
Epoch 2:   7%|▋         | 413/5971 [04:21<58:26,  1.59it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  28%|██▊       | 46/167 [00:02<00:04, 25.70it/s][A
Epoch 2:   7%|▋         | 417/5971 [04:21<57:52,  1.60it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  29%|██▉       | 49/167 [00:02<00:04, 25.70it/s][A

Validating:  31%|███       | 52/167 [00:02<00:04, 25.76it/s][A
Epoch 2:   7%|▋         | 421/5971 [04:21<57:18,  1.61it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  33%|███▎      | 55/167 [00:02<00:04, 26.11it/s][A
Epoch 2:   7%|▋         | 425/5971 [04:21<56:46,  1.63it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  35%|███▍      | 58/167 [00:02<00:04, 26.40it/s][A
Epoch 2:   7%|▋         | 429/5971 [04:21<56:13,  1.64it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  37%|███▋      | 61/167 [00:03<00:03, 26.50it/s][A

Validating:  38%|███▊      | 64/167 [00:03<00:03, 26.38it/s][A
Epoch 2:   7%|▋         | 433/5971 [04:21<55:42,  1.66it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  40%|████      | 67/167 [00:03<00:03, 25.66it/s][A
Epoch 2:   7%|▋         | 437/5971 [04:22<55:11,  1.67it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 25.69it/s][A
Epoch 2:   7%|▋         | 441/5971 [04:22<54:41,  1.69it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 24.99it/s][A

Validating:  46%|████▌     | 76/167 [00:03<00:03, 24.87it/s][A
Epoch 2:   7%|▋         | 445/5971 [04:22<54:11,  1.70it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 25.03it/s][A
Epoch 2:   8%|▊         | 449/5971 [04:22<53:42,  1.71it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  49%|████▉     | 82/167 [00:03<00:03, 25.55it/s][A
Epoch 2:   8%|▊         | 453/5971 [04:22<53:13,  1.73it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  51%|█████▏    | 86/167 [00:04<00:02, 27.33it/s][A
Epoch 2:   8%|▊         | 457/5971 [04:22<52:44,  1.74it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  53%|█████▎    | 89/167 [00:04<00:02, 26.85it/s][A

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 27.02it/s][A
Epoch 2:   8%|▊         | 461/5971 [04:22<52:16,  1.76it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 27.05it/s][A
Epoch 2:   8%|▊         | 465/5971 [04:23<51:49,  1.77it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 26.00it/s][A
Epoch 2:   8%|▊         | 469/5971 [04:23<51:22,  1.79it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  60%|██████    | 101/167 [00:04<00:02, 26.97it/s][A

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 25.53it/s][A
Epoch 2:   8%|▊         | 473/5971 [04:23<50:55,  1.80it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 24.96it/s][A
Epoch 2:   8%|▊         | 477/5971 [04:23<50:30,  1.81it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 24.07it/s][A
Epoch 2:   8%|▊         | 481/5971 [04:23<50:04,  1.83it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  68%|██████▊   | 113/167 [00:05<00:02, 24.62it/s][A

Validating:  69%|██████▉   | 116/167 [00:05<00:02, 25.16it/s][A
Epoch 2:   8%|▊         | 485/5971 [04:23<49:39,  1.84it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 24.62it/s][A
Epoch 2:   8%|▊         | 489/5971 [04:24<49:14,  1.86it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 24.65it/s][A
Epoch 2:   8%|▊         | 493/5971 [04:24<48:50,  1.87it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 25.07it/s][A

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 24.36it/s][A
Epoch 2:   8%|▊         | 497/5971 [04:24<48:26,  1.88it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 25.93it/s][A
Epoch 2:   8%|▊         | 501/5971 [04:24<48:03,  1.90it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  81%|████████  | 135/167 [00:05<00:01, 25.85it/s][A
Epoch 2:   8%|▊         | 505/5971 [04:24<47:39,  1.91it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  83%|████████▎ | 138/167 [00:06<00:01, 25.98it/s][A
Epoch 2:   9%|▊         | 509/5971 [04:24<47:16,  1.93it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  84%|████████▍ | 141/167 [00:06<00:00, 26.38it/s][A

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 27.15it/s][A
Epoch 2:   9%|▊         | 513/5971 [04:25<46:54,  1.94it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 27.69it/s][A
Epoch 2:   9%|▊         | 517/5971 [04:25<46:31,  1.95it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 26.38it/s][A
Epoch 2:   9%|▊         | 521/5971 [04:25<46:10,  1.97it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 24.55it/s][A
Epoch 2:   9%|▉         | 525/5971 [04:25<45:49,  1.98it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 24.04it/s][A

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 23.59it/s][A
Epoch 2:   9%|▉         | 529/5971 [04:25<45:28,  1.99it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  98%|█████████▊| 164/167 [00:07<00:00, 26.21it/s][A
Epoch 2:   9%|▉         | 533/5971 [04:25<45:07,  2.01it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 536/5971 [04:26<45:01,  2.01it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:35,  1.36it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.45it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.24it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.89it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.31it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.55it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.61it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.78it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  4.94it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.08it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.14it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.19it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:07,  5.16it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:07,  5.09it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.21it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.28it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.31it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:06,  5.26it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.25it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.23it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.26it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.23it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.21it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.23it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.24it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.23it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.24it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.29it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:06<00:03,  5.31it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.31it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.32it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.39it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.46it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.51it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.56it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.54it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.48it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.39it/s][A
Epoch 2:   9%|▉         | 536/5971 [04:35<46:33,  1.95it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.37it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.40it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.40it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.45it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  4.93it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.10it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:00,  5.18it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.24it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.34it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.43it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.49it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.53it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.02it/s]

Epoch 2:   9%|▉         | 537/5971 [04:39<47:01,  1.93it/s, loss=0.189, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00159, train/loss_step=0.353, global_step=1199.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 537/5971 [04:39<47:01,  1.93it/s, loss=0.199, v_num=0, train/loss_simple_step=0.248, train/loss_vlb_step=0.000889, train/loss_step=0.248, global_step=1200.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.40it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.24it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.89it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.34it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.65it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.80it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.88it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  4.98it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.11it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.07it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.12it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:07,  5.16it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.19it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.13it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.19it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.22it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:06,  5.21it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.19it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.19it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.15it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.13it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.08it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:05,  5.14it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.22it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.24it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.32it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.41it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:06<00:03,  5.47it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.53it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.56it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.54it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.54it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.33it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.41it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.37it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.40it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.45it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.48it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.52it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.54it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.55it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.55it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.54it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.54it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.54it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.53it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.37it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.33it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.21it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.04it/s]

Epoch 2:   9%|▉         | 538/5971 [04:51<49:00,  1.85it/s, loss=0.199, v_num=0, train/loss_simple_step=0.248, train/loss_vlb_step=0.000889, train/loss_step=0.248, global_step=1200.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 538/5971 [04:51<49:00,  1.85it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0731, train/loss_vlb_step=0.000246, train/loss_step=0.0731, global_step=1200.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:28,  1.75it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:16,  2.94it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:00<00:12,  3.75it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:10,  4.30it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:09,  4.61it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.86it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.04it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.14it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.17it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.20it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.24it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.23it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:07,  5.23it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:02<00:06,  5.33it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.43it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.49it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.54it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.59it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.61it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.63it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.65it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.67it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.67it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.53it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:04<00:04,  5.43it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.37it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.12it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.24it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.32it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.41it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.49it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.53it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.58it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.55it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.46it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.40it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.35it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.32it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.29it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.28it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.24it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.26it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.27it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.18it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.26it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.31it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.30it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.28it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.31it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.35it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.19it/s]

Epoch 2:   9%|▉         | 539/5971 [05:03<50:56,  1.78it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0731, train/loss_vlb_step=0.000246, train/loss_step=0.0731, global_step=1200.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 539/5971 [05:03<50:56,  1.78it/s, loss=0.2, v_num=0, train/loss_simple_step=0.306, train/loss_vlb_step=0.00157, train/loss_step=0.306, global_step=1200.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.32it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.37it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.19it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.81it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.27it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.61it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.83it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.00it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.11it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.18it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.25it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.27it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:07,  5.23it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.18it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.20it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.19it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.18it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:06,  5.23it/s]
Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.31it/s]
Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.36it/s]
Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.39it/s]
Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.42it/s]
Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.46it/s]
Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.52it/s]
Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.54it/s]
Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.41it/s]
Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.44it/s]
Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.38it/s]
Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.34it/s]
Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.37it/s]
Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.43it/s]
Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.50it/s]
Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.50it/s]
Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.55it/s]
Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.58it/s]
Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.61it/s]
Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.61it/s]
Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.37it/s]
Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.30it/s]
Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.21it/s]
Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.31it/s]
Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.41it/s]
Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.39it/s]
Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.35it/s]
Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.31it/s]
Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.36it/s]
Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.37it/s]
Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.39it/s]
Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.40it/s]
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.41it/s]
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.08it/s]

Epoch 2:   9%|▉         | 540/5971 [05:17<53:05,  1.71it/s, loss=0.2, v_num=0, train/loss_simple_step=0.306, train/loss_vlb_step=0.00157, train/loss_step=0.306, global_step=1200.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 540/5971 [05:17<53:05,  1.71it/s, loss=0.194, v_num=0, train/loss_simple_step=0.333, train/loss_vlb_step=0.00176, train/loss_step=0.333, global_step=1200.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 541/5971 [05:18<53:07,  1.70it/s, loss=0.194, v_num=0, train/loss_simple_step=0.333, train/loss_vlb_step=0.00176, train/loss_step=0.333, global_step=1200.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 541/5971 [05:18<53:07,  1.70it/s, loss=0.197, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00112, train/loss_step=0.264, global_step=1201.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 542/5971 [05:19<53:09,  1.70it/s, loss=0.197, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00112, train/loss_step=0.264, global_step=1201.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 542/5971 [05:19<53:09,  1.70it/s, loss=0.2, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000375, train/loss_step=0.112, global_step=1201.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   9%|▉         | 543/5971 [05:19<53:12,  1.70it/s, loss=0.2, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000375, train/loss_step=0.112, global_step=1201.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 543/5971 [05:19<53:12,  1.70it/s, loss=0.204, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000425, train/loss_step=0.127, global_step=1201.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 544/5971 [05:22<53:27,  1.69it/s, loss=0.204, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000425, train/loss_step=0.127, global_step=1201.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 544/5971 [05:22<53:27,  1.69it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0461, train/loss_vlb_step=0.000156, train/loss_step=0.0461, global_step=1201.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 545/5971 [05:22<53:29,  1.69it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0461, train/loss_vlb_step=0.000156, train/loss_step=0.0461, global_step=1201.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 545/5971 [05:22<53:29,  1.69it/s, loss=0.201, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.000818, train/loss_step=0.234, global_step=1202.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:   9%|▉         | 546/5971 [05:23<53:32,  1.69it/s, loss=0.201, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.000818, train/loss_step=0.234, global_step=1202.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 546/5971 [05:23<53:32,  1.69it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000117, train/loss_step=0.0318, global_step=1202.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 547/5971 [05:24<53:35,  1.69it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000117, train/loss_step=0.0318, global_step=1202.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 547/5971 [05:24<53:35,  1.69it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0307, train/loss_vlb_step=0.000114, train/loss_step=0.0307, global_step=1202.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 548/5971 [05:27<53:50,  1.68it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0307, train/loss_vlb_step=0.000114, train/loss_step=0.0307, global_step=1202.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 548/5971 [05:27<53:50,  1.68it/s, loss=0.177, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000652, train/loss_step=0.170, global_step=1202.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:   9%|▉         | 549/5971 [05:27<53:53,  1.68it/s, loss=0.177, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000652, train/loss_step=0.170, global_step=1202.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 549/5971 [05:27<53:53,  1.68it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0937, train/loss_vlb_step=0.000308, train/loss_step=0.0937, global_step=1203.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 550/5971 [05:28<53:55,  1.68it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0937, train/loss_vlb_step=0.000308, train/loss_step=0.0937, global_step=1203.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 550/5971 [05:28<53:55,  1.68it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000212, train/loss_step=0.0596, global_step=1203.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 551/5971 [05:29<53:57,  1.67it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000212, train/loss_step=0.0596, global_step=1203.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 551/5971 [05:29<53:57,  1.67it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0677, train/loss_vlb_step=0.000231, train/loss_step=0.0677, global_step=1203.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 552/5971 [05:31<54:12,  1.67it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0677, train/loss_vlb_step=0.000231, train/loss_step=0.0677, global_step=1203.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 552/5971 [05:31<54:12,  1.67it/s, loss=0.154, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.000979, train/loss_step=0.251, global_step=1203.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:   9%|▉         | 553/5971 [05:32<54:15,  1.66it/s, loss=0.154, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.000979, train/loss_step=0.251, global_step=1203.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 553/5971 [05:32<54:15,  1.66it/s, loss=0.158, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.000329, train/loss_step=0.100, global_step=1204.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 554/5971 [05:33<54:17,  1.66it/s, loss=0.158, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.000329, train/loss_step=0.100, global_step=1204.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 554/5971 [05:33<54:17,  1.66it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0378, train/loss_vlb_step=0.000129, train/loss_step=0.0378, global_step=1204.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 555/5971 [05:34<54:19,  1.66it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0378, train/loss_vlb_step=0.000129, train/loss_step=0.0378, global_step=1204.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 555/5971 [05:34<54:19,  1.66it/s, loss=0.155, v_num=0, train/loss_simple_step=0.169, train/loss_vlb_step=0.00058, train/loss_step=0.169, global_step=1204.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:   9%|▉         | 556/5971 [05:36<54:34,  1.65it/s, loss=0.155, v_num=0, train/loss_simple_step=0.169, train/loss_vlb_step=0.00058, train/loss_step=0.169, global_step=1204.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 556/5971 [05:36<54:34,  1.65it/s, loss=0.148, v_num=0, train/loss_simple_step=0.215, train/loss_vlb_step=0.000776, train/loss_step=0.215, global_step=1204.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 557/5971 [05:37<54:36,  1.65it/s, loss=0.148, v_num=0, train/loss_simple_step=0.215, train/loss_vlb_step=0.000776, train/loss_step=0.215, global_step=1204.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 557/5971 [05:37<54:36,  1.65it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0196, train/loss_vlb_step=8.29e-5, train/loss_step=0.0196, global_step=1205.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 558/5971 [05:38<54:38,  1.65it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0196, train/loss_vlb_step=8.29e-5, train/loss_step=0.0196, global_step=1205.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 558/5971 [05:38<54:38,  1.65it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00417, train/loss_vlb_step=2.17e-5, train/loss_step=0.00417, global_step=1205.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 559/5971 [05:39<54:40,  1.65it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00417, train/loss_vlb_step=2.17e-5, train/loss_step=0.00417, global_step=1205.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 559/5971 [05:39<54:40,  1.65it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0277, train/loss_vlb_step=0.000105, train/loss_step=0.0277, global_step=1205.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:   9%|▉         | 560/5971 [05:41<54:57,  1.64it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0277, train/loss_vlb_step=0.000105, train/loss_step=0.0277, global_step=1205.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 560/5971 [05:41<54:57,  1.64it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0178, train/loss_vlb_step=7.38e-5, train/loss_step=0.0178, global_step=1205.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 561/5971 [05:42<54:59,  1.64it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0178, train/loss_vlb_step=7.38e-5, train/loss_step=0.0178, global_step=1205.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 561/5971 [05:42<54:59,  1.64it/s, loss=0.0948, v_num=0, train/loss_simple_step=0.0824, train/loss_vlb_step=0.000276, train/loss_step=0.0824, global_step=1206.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 562/5971 [05:43<55:01,  1.64it/s, loss=0.0948, v_num=0, train/loss_simple_step=0.0824, train/loss_vlb_step=0.000276, train/loss_step=0.0824, global_step=1206.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 562/5971 [05:43<55:01,  1.64it/s, loss=0.0912, v_num=0, train/loss_simple_step=0.0399, train/loss_vlb_step=0.000148, train/loss_step=0.0399, global_step=1206.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 563/5971 [05:44<55:03,  1.64it/s, loss=0.0912, v_num=0, train/loss_simple_step=0.0399, train/loss_vlb_step=0.000148, train/loss_step=0.0399, global_step=1206.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 563/5971 [05:44<55:03,  1.64it/s, loss=0.0858, v_num=0, train/loss_simple_step=0.0196, train/loss_vlb_step=7.79e-5, train/loss_step=0.0196, global_step=1206.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   9%|▉         | 564/5971 [05:46<55:17,  1.63it/s, loss=0.0858, v_num=0, train/loss_simple_step=0.0196, train/loss_vlb_step=7.79e-5, train/loss_step=0.0196, global_step=1206.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 564/5971 [05:46<55:17,  1.63it/s, loss=0.0937, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000738, train/loss_step=0.202, global_step=1206.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   9%|▉         | 565/5971 [05:47<55:19,  1.63it/s, loss=0.0937, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000738, train/loss_step=0.202, global_step=1206.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 565/5971 [05:47<55:19,  1.63it/s, loss=0.0821, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.19e-5, train/loss_step=0.00199, global_step=1207.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 566/5971 [05:48<55:21,  1.63it/s, loss=0.0821, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.19e-5, train/loss_step=0.00199, global_step=1207.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 566/5971 [05:48<55:21,  1.63it/s, loss=0.0874, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000457, train/loss_step=0.139, global_step=1207.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:   9%|▉         | 567/5971 [05:49<55:23,  1.63it/s, loss=0.0874, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000457, train/loss_step=0.139, global_step=1207.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:   9%|▉         | 567/5971 [05:49<55:23,  1.63it/s, loss=0.102, v_num=0, train/loss_simple_step=0.317, train/loss_vlb_step=0.00123, train/loss_step=0.317, global_step=1207.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  10%|▉         | 568/5971 [05:51<55:39,  1.62it/s, loss=0.102, v_num=0, train/loss_simple_step=0.317, train/loss_vlb_step=0.00123, train/loss_step=0.317, global_step=1207.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 568/5971 [05:51<55:39,  1.62it/s, loss=0.103, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000743, train/loss_step=0.205, global_step=1207.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 569/5971 [05:52<55:42,  1.62it/s, loss=0.103, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000743, train/loss_step=0.205, global_step=1207.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 569/5971 [05:52<55:42,  1.62it/s, loss=0.106, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000508, train/loss_step=0.152, global_step=1208.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 570/5971 [05:53<55:44,  1.61it/s, loss=0.106, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000508, train/loss_step=0.152, global_step=1208.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 570/5971 [05:53<55:44,  1.61it/s, loss=0.125, v_num=0, train/loss_simple_step=0.425, train/loss_vlb_step=0.00238, train/loss_step=0.425, global_step=1208.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  10%|▉         | 571/5971 [05:54<55:46,  1.61it/s, loss=0.125, v_num=0, train/loss_simple_step=0.425, train/loss_vlb_step=0.00238, train/loss_step=0.425, global_step=1208.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 571/5971 [05:54<55:46,  1.61it/s, loss=0.135, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.00104, train/loss_step=0.274, global_step=1208.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 572/5971 [05:56<56:01,  1.61it/s, loss=0.135, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.00104, train/loss_step=0.274, global_step=1208.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 572/5971 [05:56<56:01,  1.61it/s, loss=0.13, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.000491, train/loss_step=0.146, global_step=1208.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 573/5971 [05:57<56:04,  1.60it/s, loss=0.13, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.000491, train/loss_step=0.146, global_step=1208.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 573/5971 [05:57<56:04,  1.60it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00299, train/loss_vlb_step=1.65e-5, train/loss_step=0.00299, global_step=1209.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 574/5971 [05:58<56:06,  1.60it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00299, train/loss_vlb_step=1.65e-5, train/loss_step=0.00299, global_step=1209.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 574/5971 [05:58<56:06,  1.60it/s, loss=0.129, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=1209.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2:  10%|▉         | 575/5971 [05:59<56:08,  1.60it/s, loss=0.129, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=1209.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 575/5971 [05:59<56:08,  1.60it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0464, train/loss_vlb_step=0.000163, train/loss_step=0.0464, global_step=1209.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 576/5971 [06:01<56:24,  1.59it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0464, train/loss_vlb_step=0.000163, train/loss_step=0.0464, global_step=1209.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 576/5971 [06:01<56:24,  1.59it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0326, train/loss_vlb_step=0.000125, train/loss_step=0.0326, global_step=1209.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 577/5971 [06:02<56:26,  1.59it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0326, train/loss_vlb_step=0.000125, train/loss_step=0.0326, global_step=1209.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 577/5971 [06:02<56:26,  1.59it/s, loss=0.125, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.000861, train/loss_step=0.235, global_step=1210.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  10%|▉         | 578/5971 [06:03<56:28,  1.59it/s, loss=0.125, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.000861, train/loss_step=0.235, global_step=1210.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 578/5971 [06:03<56:28,  1.59it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.51e-5, train/loss_step=0.00731, global_step=1210.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 579/5971 [06:04<56:29,  1.59it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.51e-5, train/loss_step=0.00731, global_step=1210.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 579/5971 [06:04<56:29,  1.59it/s, loss=0.137, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00109, train/loss_step=0.267, global_step=1210.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  10%|▉         | 580/5971 [06:07<56:50,  1.58it/s, loss=0.137, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00109, train/loss_step=0.267, global_step=1210.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 580/5971 [06:07<56:50,  1.58it/s, loss=0.145, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000662, train/loss_step=0.184, global_step=1210.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 581/5971 [06:08<56:52,  1.58it/s, loss=0.145, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000662, train/loss_step=0.184, global_step=1210.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 581/5971 [06:08<56:52,  1.58it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0515, train/loss_vlb_step=0.000181, train/loss_step=0.0515, global_step=1211.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 582/5971 [06:09<56:54,  1.58it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0515, train/loss_vlb_step=0.000181, train/loss_step=0.0515, global_step=1211.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 582/5971 [06:09<56:54,  1.58it/s, loss=0.147, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000344, train/loss_step=0.105, global_step=1211.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  10%|▉         | 583/5971 [06:10<56:56,  1.58it/s, loss=0.147, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000344, train/loss_step=0.105, global_step=1211.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 583/5971 [06:10<56:56,  1.58it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0179, train/loss_vlb_step=7.79e-5, train/loss_step=0.0179, global_step=1211.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 584/5971 [06:13<57:19,  1.57it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0179, train/loss_vlb_step=7.79e-5, train/loss_step=0.0179, global_step=1211.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 584/5971 [06:13<57:19,  1.57it/s, loss=0.145, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.00057, train/loss_step=0.172, global_step=1211.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  10%|▉         | 585/5971 [06:14<57:21,  1.56it/s, loss=0.145, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.00057, train/loss_step=0.172, global_step=1211.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 585/5971 [06:14<57:21,  1.56it/s, loss=0.155, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000687, train/loss_step=0.202, global_step=1212.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 586/5971 [06:15<57:23,  1.56it/s, loss=0.155, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000687, train/loss_step=0.202, global_step=1212.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 586/5971 [06:15<57:23,  1.56it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00301, train/loss_vlb_step=1.7e-5, train/loss_step=0.00301, global_step=1212.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 587/5971 [06:16<57:25,  1.56it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00301, train/loss_vlb_step=1.7e-5, train/loss_step=0.00301, global_step=1212.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 587/5971 [06:16<57:25,  1.56it/s, loss=0.147, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00125, train/loss_step=0.292, global_step=1212.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  10%|▉         | 588/5971 [06:18<57:37,  1.56it/s, loss=0.147, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00125, train/loss_step=0.292, global_step=1212.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 588/5971 [06:18<57:37,  1.56it/s, loss=0.144, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000449, train/loss_step=0.137, global_step=1212.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 589/5971 [06:19<57:39,  1.56it/s, loss=0.144, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000449, train/loss_step=0.137, global_step=1212.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 589/5971 [06:19<57:39,  1.56it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.65e-5, train/loss_step=0.0121, global_step=1213.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 590/5971 [06:20<57:40,  1.55it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.65e-5, train/loss_step=0.0121, global_step=1213.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 590/5971 [06:20<57:40,  1.55it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00297, train/loss_vlb_step=1.67e-5, train/loss_step=0.00297, global_step=1213.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 591/5971 [06:20<57:42,  1.55it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00297, train/loss_vlb_step=1.67e-5, train/loss_step=0.00297, global_step=1213.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 591/5971 [06:20<57:42,  1.55it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0551, train/loss_vlb_step=0.000188, train/loss_step=0.0551, global_step=1213.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  10%|▉         | 592/5971 [06:23<57:55,  1.55it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0551, train/loss_vlb_step=0.000188, train/loss_step=0.0551, global_step=1213.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 592/5971 [06:23<57:55,  1.55it/s, loss=0.104, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000441, train/loss_step=0.134, global_step=1213.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  10%|▉         | 593/5971 [06:24<57:57,  1.55it/s, loss=0.104, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000441, train/loss_step=0.134, global_step=1213.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 593/5971 [06:24<57:57,  1.55it/s, loss=0.112, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000544, train/loss_step=0.162, global_step=1214.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 594/5971 [06:24<57:59,  1.55it/s, loss=0.112, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000544, train/loss_step=0.162, global_step=1214.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 594/5971 [06:24<57:59,  1.55it/s, loss=0.126, v_num=0, train/loss_simple_step=0.398, train/loss_vlb_step=0.0023, train/loss_step=0.398, global_step=1214.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  10%|▉         | 595/5971 [06:25<58:00,  1.54it/s, loss=0.126, v_num=0, train/loss_simple_step=0.398, train/loss_vlb_step=0.0023, train/loss_step=0.398, global_step=1214.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 595/5971 [06:25<58:00,  1.54it/s, loss=0.137, v_num=0, train/loss_simple_step=0.265, train/loss_vlb_step=0.00113, train/loss_step=0.265, global_step=1214.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 596/5971 [06:27<58:13,  1.54it/s, loss=0.137, v_num=0, train/loss_simple_step=0.265, train/loss_vlb_step=0.00113, train/loss_step=0.265, global_step=1214.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 596/5971 [06:27<58:13,  1.54it/s, loss=0.164, v_num=0, train/loss_simple_step=0.570, train/loss_vlb_step=0.0115, train/loss_step=0.570, global_step=1214.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  10%|▉         | 597/5971 [06:28<58:14,  1.54it/s, loss=0.164, v_num=0, train/loss_simple_step=0.570, train/loss_vlb_step=0.0115, train/loss_step=0.570, global_step=1214.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|▉         | 597/5971 [06:28<58:14,  1.54it/s, loss=0.188, v_num=0, train/loss_simple_step=0.727, train/loss_vlb_step=0.0468, train/loss_step=0.727, global_step=1215.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 598/5971 [06:29<58:15,  1.54it/s, loss=0.188, v_num=0, train/loss_simple_step=0.727, train/loss_vlb_step=0.0468, train/loss_step=0.727, global_step=1215.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 598/5971 [06:29<58:15,  1.54it/s, loss=0.193, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000349, train/loss_step=0.104, global_step=1215.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 599/5971 [06:30<58:17,  1.54it/s, loss=0.193, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000349, train/loss_step=0.104, global_step=1215.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 599/5971 [06:30<58:17,  1.54it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000149, train/loss_step=0.0406, global_step=1215.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 600/5971 [06:32<58:31,  1.53it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000149, train/loss_step=0.0406, global_step=1215.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 600/5971 [06:32<58:31,  1.53it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000324, train/loss_step=0.0983, global_step=1215.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 601/5971 [06:33<58:33,  1.53it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0983, train/loss_vlb_step=0.000324, train/loss_step=0.0983, global_step=1215.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 601/5971 [06:33<58:33,  1.53it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0722, train/loss_vlb_step=0.000251, train/loss_step=0.0722, global_step=1216.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 602/5971 [06:34<58:35,  1.53it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0722, train/loss_vlb_step=0.000251, train/loss_step=0.0722, global_step=1216.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 602/5971 [06:34<58:35,  1.53it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0224, train/loss_vlb_step=8.93e-5, train/loss_step=0.0224, global_step=1216.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  10%|█         | 603/5971 [06:35<58:37,  1.53it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0224, train/loss_vlb_step=8.93e-5, train/loss_step=0.0224, global_step=1216.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 603/5971 [06:35<58:37,  1.53it/s, loss=0.181, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000504, train/loss_step=0.149, global_step=1216.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  10%|█         | 604/5971 [06:37<58:49,  1.52it/s, loss=0.181, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000504, train/loss_step=0.149, global_step=1216.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 604/5971 [06:37<58:49,  1.52it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0094, train/loss_vlb_step=4.32e-5, train/loss_step=0.0094, global_step=1216.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 605/5971 [06:38<58:51,  1.52it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0094, train/loss_vlb_step=4.32e-5, train/loss_step=0.0094, global_step=1216.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 605/5971 [06:38<58:51,  1.52it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00265, train/loss_vlb_step=1.55e-5, train/loss_step=0.00265, global_step=1217.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 606/5971 [06:39<58:52,  1.52it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00265, train/loss_vlb_step=1.55e-5, train/loss_step=0.00265, global_step=1217.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 606/5971 [06:39<58:52,  1.52it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0055, train/loss_vlb_step=2.73e-5, train/loss_step=0.0055, global_step=1217.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  10%|█         | 607/5971 [06:40<58:54,  1.52it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0055, train/loss_vlb_step=2.73e-5, train/loss_step=0.0055, global_step=1217.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 607/5971 [06:40<58:54,  1.52it/s, loss=0.165, v_num=0, train/loss_simple_step=0.340, train/loss_vlb_step=0.00179, train/loss_step=0.340, global_step=1217.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  10%|█         | 608/5971 [06:42<59:08,  1.51it/s, loss=0.165, v_num=0, train/loss_simple_step=0.340, train/loss_vlb_step=0.00179, train/loss_step=0.340, global_step=1217.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 608/5971 [06:42<59:08,  1.51it/s, loss=0.177, v_num=0, train/loss_simple_step=0.361, train/loss_vlb_step=0.00226, train/loss_step=0.361, global_step=1217.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 609/5971 [06:43<59:09,  1.51it/s, loss=0.177, v_num=0, train/loss_simple_step=0.361, train/loss_vlb_step=0.00226, train/loss_step=0.361, global_step=1217.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 609/5971 [06:43<59:09,  1.51it/s, loss=0.192, v_num=0, train/loss_simple_step=0.328, train/loss_vlb_step=0.00138, train/loss_step=0.328, global_step=1218.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 610/5971 [06:44<59:11,  1.51it/s, loss=0.192, v_num=0, train/loss_simple_step=0.328, train/loss_vlb_step=0.00138, train/loss_step=0.328, global_step=1218.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 610/5971 [06:44<59:11,  1.51it/s, loss=0.193, v_num=0, train/loss_simple_step=0.00956, train/loss_vlb_step=4.32e-5, train/loss_step=0.00956, global_step=1218.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 611/5971 [06:45<59:12,  1.51it/s, loss=0.193, v_num=0, train/loss_simple_step=0.00956, train/loss_vlb_step=4.32e-5, train/loss_step=0.00956, global_step=1218.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 611/5971 [06:45<59:12,  1.51it/s, loss=0.198, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.00058, train/loss_step=0.168, global_step=1218.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  10%|█         | 612/5971 [06:47<59:24,  1.50it/s, loss=0.198, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.00058, train/loss_step=0.168, global_step=1218.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 612/5971 [06:47<59:24,  1.50it/s, loss=0.198, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000437, train/loss_step=0.132, global_step=1218.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 613/5971 [06:48<59:25,  1.50it/s, loss=0.198, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000437, train/loss_step=0.132, global_step=1218.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 613/5971 [06:48<59:25,  1.50it/s, loss=0.194, v_num=0, train/loss_simple_step=0.077, train/loss_vlb_step=0.000258, train/loss_step=0.077, global_step=1219.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 614/5971 [06:49<59:27,  1.50it/s, loss=0.194, v_num=0, train/loss_simple_step=0.077, train/loss_vlb_step=0.000258, train/loss_step=0.077, global_step=1219.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 614/5971 [06:49<59:27,  1.50it/s, loss=0.182, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000528, train/loss_step=0.157, global_step=1219.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 615/5971 [06:50<59:28,  1.50it/s, loss=0.182, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000528, train/loss_step=0.157, global_step=1219.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 615/5971 [06:50<59:28,  1.50it/s, loss=0.177, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000585, train/loss_step=0.165, global_step=1219.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 616/5971 [06:52<59:40,  1.50it/s, loss=0.177, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000585, train/loss_step=0.165, global_step=1219.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 616/5971 [06:52<59:40,  1.50it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0178, train/loss_vlb_step=7.47e-5, train/loss_step=0.0178, global_step=1219.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 617/5971 [06:53<59:41,  1.49it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0178, train/loss_vlb_step=7.47e-5, train/loss_step=0.0178, global_step=1219.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 617/5971 [06:53<59:41,  1.49it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00503, train/loss_vlb_step=2.44e-5, train/loss_step=0.00503, global_step=1220.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 618/5971 [06:54<59:42,  1.49it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00503, train/loss_vlb_step=2.44e-5, train/loss_step=0.00503, global_step=1220.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 618/5971 [06:54<59:42,  1.49it/s, loss=0.113, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=1220.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  10%|█         | 619/5971 [06:55<59:44,  1.49it/s, loss=0.113, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=1220.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 619/5971 [06:55<59:44,  1.49it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00538, train/loss_vlb_step=2.58e-5, train/loss_step=0.00538, global_step=1220.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 620/5971 [06:57<59:55,  1.49it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00538, train/loss_vlb_step=2.58e-5, train/loss_step=0.00538, global_step=1220.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 620/5971 [06:57<59:55,  1.49it/s, loss=0.112, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000376, train/loss_step=0.114, global_step=1220.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  10%|█         | 621/5971 [06:58<59:57,  1.49it/s, loss=0.112, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000376, train/loss_step=0.114, global_step=1220.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 621/5971 [06:58<59:57,  1.49it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00218, train/loss_vlb_step=1.24e-5, train/loss_step=0.00218, global_step=1221.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 622/5971 [06:59<59:58,  1.49it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00218, train/loss_vlb_step=1.24e-5, train/loss_step=0.00218, global_step=1221.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 622/5971 [06:59<59:58,  1.49it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00309, train/loss_vlb_step=1.75e-5, train/loss_step=0.00309, global_step=1221.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 623/5971 [06:59<59:59,  1.49it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00309, train/loss_vlb_step=1.75e-5, train/loss_step=0.00309, global_step=1221.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 623/5971 [06:59<59:59,  1.49it/s, loss=0.1, v_num=0, train/loss_simple_step=0.00297, train/loss_vlb_step=1.61e-5, train/loss_step=0.00297, global_step=1221.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  10%|█         | 624/5971 [07:02<1:00:11,  1.48it/s, loss=0.1, v_num=0, train/loss_simple_step=0.00297, train/loss_vlb_step=1.61e-5, train/loss_step=0.00297, global_step=1221.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 624/5971 [07:02<1:00:11,  1.48it/s, loss=0.13, v_num=0, train/loss_simple_step=0.591, train/loss_vlb_step=0.00581, train/loss_step=0.591, global_step=1221.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  10%|█         | 625/5971 [07:03<1:00:12,  1.48it/s, loss=0.13, v_num=0, train/loss_simple_step=0.591, train/loss_vlb_step=0.00581, train/loss_step=0.591, global_step=1221.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 625/5971 [07:03<1:00:12,  1.48it/s, loss=0.138, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000571, train/loss_step=0.163, global_step=1222.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 626/5971 [07:03<1:00:13,  1.48it/s, loss=0.138, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000571, train/loss_step=0.163, global_step=1222.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  10%|█         | 626/5971 [07:03<1:00:13,  1.48it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00962, train/loss_vlb_step=4.29e-5, train/loss_step=0.00962, global_step=1222.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  11%|█         | 627/5971 [07:04<1:00:15,  1.48it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00962, train/loss_vlb_step=4.29e-5, train/loss_step=0.00962, global_step=1222.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  11%|█         | 627/5971 [07:04<1:00:15,  1.48it/s, loss=0.129, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.000565, train/loss_step=0.171, global_step=1222.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  11%|█         | 628/5971 [07:07<1:00:28,  1.47it/s, loss=0.129, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.000565, train/loss_step=0.171, global_step=1222.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  11%|█         | 628/5971 [07:07<1:00:28,  1.47it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0449, train/loss_vlb_step=0.000163, train/loss_step=0.0449, global_step=1222.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  11%|█         | 629/5971 [07:08<1:00:29,  1.47it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0449, train/loss_vlb_step=0.000163, train/loss_step=0.0449, global_step=1222.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  11%|█         | 629/5971 [07:08<1:00:29,  1.47it/s, loss=0.111, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.0011, train/loss_step=0.283, global_step=1223.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  11%|█         | 630/5971 [07:08<1:00:30,  1.47it/s, loss=0.111, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.0011, train/loss_step=0.283, global_step=1223.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  11%|█         | 630/5971 [07:08<1:00:30,  1.47it/s, loss=0.12, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000663, train/loss_step=0.189, global_step=1223.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  11%|█         | 631/5971 [07:09<1:00:31,  1.47it/s, loss=0.12, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000663, train/loss_step=0.189, global_step=1223.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  11%|█         | 631/5971 [07:09<1:00:31,  1.47it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00662, train/loss_vlb_step=3.36e-5, train/loss_step=0.00662, global_step=1223.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  11%|█         | 632/5971 [07:11<1:00:43,  1.47it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00662, train/loss_vlb_step=3.36e-5, train/loss_step=0.00662, global_step=1223.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  11%|█         | 632/5971 [07:11<1:00:43,  1.47it/s, loss=0.114, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000551, train/loss_step=0.166, global_step=1223.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  11%|█         | 633/5971 [07:12<1:00:44,  1.46it/s, loss=0.114, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000551, train/loss_step=0.166, global_step=1223.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  11%|█         | 633/5971 [07:12<1:00:44,  1.46it/s, loss=0.122, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00089, train/loss_step=0.236, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  11%|█         | 634/5971 [07:13<1:00:45,  1.46it/s, loss=0.122, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00089, train/loss_step=0.236, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  11%|█         | 634/5971 [07:13<1:00:45,  1.46it/s, loss=0.14, v_num=0, train/loss_simple_step=0.522, train/loss_vlb_step=0.00543, train/loss_step=0.522, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  11%|█         | 635/5971 [07:14<1:00:46,  1.46it/s, loss=0.14, v_num=0, train/loss_simple_step=0.522, train/loss_vlb_step=0.00543, train/loss_step=0.522, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  11%|█         | 635/5971 [07:14<1:00:46,  1.46it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0598, train/loss_vlb_step=0.00021, train/loss_step=0.0598, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  11%|█         | 636/5971 [07:17<1:00:59,  1.46it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0598, train/loss_vlb_step=0.00021, train/loss_step=0.0598, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  11%|█         | 636/5971 [07:17<1:00:59,  1.46it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:25,  1.94it/s]
Epoch 2:  11%|█         | 638/5971 [07:17<1:00:51,  1.46it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   2%|▏         | 3/167 [00:00<00:29,  5.63it/s]
Epoch 2:  11%|█         | 640/5971 [07:17<1:00:40,  1.46it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   4%|▍         | 7/167 [00:00<00:12, 12.72it/s]
Epoch 2:  11%|█         | 644/5971 [07:17<1:00:15,  1.47it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   6%|▌         | 10/167 [00:00<00:10, 15.65it/s]
Epoch 2:  11%|█         | 648/5971 [07:17<59:52,  1.48it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  

Validating:   8%|▊         | 13/167 [00:01<00:08, 18.06it/s]
Epoch 2:  11%|█         | 652/5971 [07:18<59:28,  1.49it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  10%|▉         | 16/167 [00:01<00:07, 20.75it/s]

Validating:  11%|█▏        | 19/167 [00:01<00:06, 22.75it/s]
Epoch 2:  11%|█         | 656/5971 [07:18<59:05,  1.50it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  13%|█▎        | 22/167 [00:01<00:05, 24.34it/s]
Epoch 2:  11%|█         | 660/5971 [07:18<58:42,  1.51it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  15%|█▍        | 25/167 [00:01<00:05, 25.12it/s]
Epoch 2:  11%|█         | 664/5971 [07:18<58:19,  1.52it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  17%|█▋        | 28/167 [00:01<00:05, 25.00it/s]

Validating:  19%|█▊        | 31/167 [00:01<00:05, 25.99it/s]
Epoch 2:  11%|█         | 668/5971 [07:18<57:57,  1.52it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  20%|██        | 34/167 [00:01<00:05, 26.40it/s]
Epoch 2:  11%|█▏        | 672/5971 [07:18<57:35,  1.53it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  23%|██▎       | 38/167 [00:01<00:04, 26.54it/s]
Epoch 2:  11%|█▏        | 676/5971 [07:19<57:13,  1.54it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 26.14it/s]
Epoch 2:  11%|█▏        | 680/5971 [07:19<56:52,  1.55it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 27.21it/s]
Epoch 2:  11%|█▏        | 684/5971 [07:19<56:30,  1.56it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 27.23it/s]

Validating:  31%|███       | 51/167 [00:02<00:04, 27.54it/s]
Epoch 2:  12%|█▏        | 688/5971 [07:19<56:09,  1.57it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 27.76it/s]
Epoch 2:  12%|█▏        | 692/5971 [07:19<55:48,  1.58it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  34%|███▍      | 57/167 [00:02<00:03, 27.91it/s]
Epoch 2:  12%|█▏        | 696/5971 [07:19<55:27,  1.59it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  36%|███▌      | 60/167 [00:02<00:04, 26.70it/s]

Validating:  38%|███▊      | 63/167 [00:02<00:03, 26.96it/s]
Epoch 2:  12%|█▏        | 700/5971 [07:19<55:07,  1.59it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  40%|███▉      | 66/167 [00:02<00:03, 26.44it/s]
Epoch 2:  12%|█▏        | 704/5971 [07:20<54:47,  1.60it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 25.13it/s]
Epoch 2:  12%|█▏        | 708/5971 [07:20<54:27,  1.61it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 25.66it/s]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 26.22it/s]
Epoch 2:  12%|█▏        | 712/5971 [07:20<54:08,  1.62it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 26.80it/s]
Epoch 2:  12%|█▏        | 716/5971 [07:20<53:48,  1.63it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 26.87it/s]
Epoch 2:  12%|█▏        | 720/5971 [07:20<53:29,  1.64it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  50%|█████     | 84/167 [00:03<00:03, 26.91it/s]

Validating:  52%|█████▏    | 87/167 [00:03<00:02, 27.75it/s]
Epoch 2:  12%|█▏        | 724/5971 [07:20<53:10,  1.64it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  54%|█████▍    | 90/167 [00:03<00:02, 25.96it/s]
Epoch 2:  12%|█▏        | 728/5971 [07:20<52:51,  1.65it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  56%|█████▌    | 93/167 [00:03<00:02, 26.64it/s]
Epoch 2:  12%|█▏        | 732/5971 [07:21<52:32,  1.66it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 26.88it/s]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 27.48it/s]
Epoch 2:  12%|█▏        | 736/5971 [07:21<52:14,  1.67it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 28.23it/s]
Epoch 2:  12%|█▏        | 740/5971 [07:21<51:55,  1.68it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 27.90it/s]
Epoch 2:  12%|█▏        | 744/5971 [07:21<51:37,  1.69it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 28.42it/s]
Epoch 2:  13%|█▎        | 748/5971 [07:21<51:19,  1.70it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  68%|██████▊   | 113/167 [00:04<00:01, 28.79it/s]
Epoch 2:  13%|█▎        | 752/5971 [07:21<51:02,  1.70it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  70%|███████   | 117/167 [00:04<00:01, 29.29it/s]
Epoch 2:  13%|█▎        | 756/5971 [07:21<50:44,  1.71it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  72%|███████▏  | 120/167 [00:04<00:01, 28.62it/s]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 27.41it/s]
Epoch 2:  13%|█▎        | 760/5971 [07:22<50:27,  1.72it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 28.22it/s]
Epoch 2:  13%|█▎        | 764/5971 [07:22<50:09,  1.73it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 26.11it/s]
Epoch 2:  13%|█▎        | 768/5971 [07:22<49:53,  1.74it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 26.98it/s]
Epoch 2:  13%|█▎        | 772/5971 [07:22<49:36,  1.75it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 29.01it/s]
Epoch 2:  13%|█▎        | 776/5971 [07:22<49:19,  1.76it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  84%|████████▍ | 141/167 [00:05<00:00, 29.95it/s]
Epoch 2:  13%|█▎        | 780/5971 [07:22<49:02,  1.76it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  87%|████████▋ | 145/167 [00:05<00:00, 28.71it/s]
Epoch 2:  13%|█▎        | 784/5971 [07:22<48:46,  1.77it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  89%|████████▊ | 148/167 [00:05<00:00, 28.90it/s]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 28.57it/s]
Epoch 2:  13%|█▎        | 788/5971 [07:23<48:30,  1.78it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 28.38it/s]
Epoch 2:  13%|█▎        | 792/5971 [07:23<48:14,  1.79it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 28.63it/s]
Epoch 2:  13%|█▎        | 796/5971 [07:23<47:58,  1.80it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 28.77it/s]
Epoch 2:  13%|█▎        | 800/5971 [07:23<47:42,  1.81it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 30.03it/s]
Epoch 2:  13%|█▎        | 804/5971 [07:23<47:27,  1.81it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  13%|█▎        | 804/5971 [07:23<47:29,  1.81it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.51e-5, train/loss_step=0.00289, global_step=1224.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Epoch 2:  13%|█▎        | 805/5971 [07:24<47:31,  1.81it/s, loss=0.163, v_num=0, train/loss_simple_step=0.587, train/loss_vlb_step=0.00869, train/loss_step=0.587, global_step=1225.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  13%|█▎        | 806/5971 [07:25<47:33,  1.81it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0283, train/loss_vlb_step=0.000112, train/loss_step=0.0283, global_step=1225.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▎        | 807/5971 [07:26<47:34,  1.81it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0815, train/loss_vlb_step=0.00027, train/loss_step=0.0815, global_step=1225.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  14%|█▎        | 808/5971 [07:28<47:44,  1.80it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0815, train/loss_vlb_step=0.00027, train/loss_step=0.0815, global_step=1225.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▎        | 808/5971 [07:28<47:44,  1.80it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0572, train/loss_vlb_step=0.000192, train/loss_step=0.0572, global_step=1225.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▎        | 809/5971 [07:29<47:45,  1.80it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0235, train/loss_vlb_step=9.54e-5, train/loss_step=0.0235, global_step=1226.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▎        | 810/5971 [07:30<47:47,  1.80it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0519, train/loss_vlb_step=0.000182, train/loss_step=0.0519, global_step=1226.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▎        | 811/5971 [07:31<47:48,  1.80it/s, loss=0.195, v_num=0, train/loss_simple_step=0.621, train/loss_vlb_step=0.00941, train/loss_step=0.621, global_step=1226.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  14%|█▎        | 812/5971 [07:33<47:59,  1.79it/s, loss=0.195, v_num=0, train/loss_simple_step=0.621, train/loss_vlb_step=0.00941, train/loss_step=0.621, global_step=1226.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▎        | 812/5971 [07:33<47:59,  1.79it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00376, train/loss_vlb_step=1.96e-5, train/loss_step=0.00376, global_step=1226.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▎        | 813/5971 [07:34<48:01,  1.79it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0672, train/loss_vlb_step=0.000227, train/loss_step=0.0672, global_step=1227.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  14%|█▎        | 814/5971 [07:35<48:02,  1.79it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00426, train/loss_vlb_step=2.24e-5, train/loss_step=0.00426, global_step=1227.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▎        | 815/5971 [07:36<48:04,  1.79it/s, loss=0.168, v_num=0, train/loss_simple_step=0.331, train/loss_vlb_step=0.00175, train/loss_step=0.331, global_step=1227.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  14%|█▎        | 816/5971 [07:38<48:13,  1.78it/s, loss=0.168, v_num=0, train/loss_simple_step=0.331, train/loss_vlb_step=0.00175, train/loss_step=0.331, global_step=1227.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▎        | 816/5971 [07:38<48:13,  1.78it/s, loss=0.175, v_num=0, train/loss_simple_step=0.178, train/loss_vlb_step=0.000639, train/loss_step=0.178, global_step=1227.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▎        | 817/5971 [07:39<48:14,  1.78it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0461, train/loss_vlb_step=0.000172, train/loss_step=0.0461, global_step=1228.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▎        | 818/5971 [07:40<48:16,  1.78it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0153, train/loss_vlb_step=6.33e-5, train/loss_step=0.0153, global_step=1228.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  14%|█▎        | 819/5971 [07:41<48:17,  1.78it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0016, train/loss_vlb_step=9.44e-6, train/loss_step=0.0016, global_step=1228.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▎        | 820/5971 [07:43<48:26,  1.77it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0016, train/loss_vlb_step=9.44e-6, train/loss_step=0.0016, global_step=1228.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▎        | 820/5971 [07:43<48:26,  1.77it/s, loss=0.173, v_num=0, train/loss_simple_step=0.531, train/loss_vlb_step=0.00402, train/loss_step=0.531, global_step=1228.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  14%|█▎        | 821/5971 [07:44<48:28,  1.77it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0499, train/loss_vlb_step=0.000172, train/loss_step=0.0499, global_step=1229.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 822/5971 [07:45<48:29,  1.77it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0473, train/loss_vlb_step=0.00017, train/loss_step=0.0473, global_step=1229.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  14%|█▍        | 823/5971 [07:45<48:31,  1.77it/s, loss=0.143, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000456, train/loss_step=0.139, global_step=1229.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  14%|█▍        | 824/5971 [07:48<48:41,  1.76it/s, loss=0.143, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000456, train/loss_step=0.139, global_step=1229.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 824/5971 [07:48<48:41,  1.76it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0361, train/loss_vlb_step=0.000128, train/loss_step=0.0361, global_step=1229.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 825/5971 [07:49<48:43,  1.76it/s, loss=0.121, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000361, train/loss_step=0.110, global_step=1230.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  14%|█▍        | 826/5971 [07:50<48:44,  1.76it/s, loss=0.149, v_num=0, train/loss_simple_step=0.577, train/loss_vlb_step=0.0106, train/loss_step=0.577, global_step=1230.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  14%|█▍        | 827/5971 [07:50<48:45,  1.76it/s, loss=0.164, v_num=0, train/loss_simple_step=0.387, train/loss_vlb_step=0.00229, train/loss_step=0.387, global_step=1230.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 828/5971 [07:53<48:55,  1.75it/s, loss=0.164, v_num=0, train/loss_simple_step=0.387, train/loss_vlb_step=0.00229, train/loss_step=0.387, global_step=1230.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 828/5971 [07:53<48:55,  1.75it/s, loss=0.173, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.000865, train/loss_step=0.234, global_step=1230.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 829/5971 [07:54<48:56,  1.75it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.02e-5, train/loss_step=0.0112, global_step=1231.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 830/5971 [07:54<48:58,  1.75it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00797, train/loss_vlb_step=3.83e-5, train/loss_step=0.00797, global_step=1231.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 831/5971 [07:55<48:59,  1.75it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0393, train/loss_vlb_step=0.000143, train/loss_step=0.0393, global_step=1231.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 832/5971 [07:58<49:09,  1.74it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0393, train/loss_vlb_step=0.000143, train/loss_step=0.0393, global_step=1231.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 832/5971 [07:58<49:09,  1.74it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0259, train/loss_vlb_step=9.54e-5, train/loss_step=0.0259, global_step=1231.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  14%|█▍        | 833/5971 [07:59<49:11,  1.74it/s, loss=0.157, v_num=0, train/loss_simple_step=0.360, train/loss_vlb_step=0.00198, train/loss_step=0.360, global_step=1232.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  14%|█▍        | 834/5971 [07:59<49:12,  1.74it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0024, train/loss_vlb_step=1.38e-5, train/loss_step=0.0024, global_step=1232.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 835/5971 [08:00<49:13,  1.74it/s, loss=0.158, v_num=0, train/loss_simple_step=0.352, train/loss_vlb_step=0.00156, train/loss_step=0.352, global_step=1232.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  14%|█▍        | 836/5971 [08:02<49:22,  1.73it/s, loss=0.158, v_num=0, train/loss_simple_step=0.352, train/loss_vlb_step=0.00156, train/loss_step=0.352, global_step=1232.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 836/5971 [08:02<49:22,  1.73it/s, loss=0.179, v_num=0, train/loss_simple_step=0.601, train/loss_vlb_step=0.00503, train/loss_step=0.601, global_step=1232.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 837/5971 [08:03<49:23,  1.73it/s, loss=0.189, v_num=0, train/loss_simple_step=0.260, train/loss_vlb_step=0.00101, train/loss_step=0.260, global_step=1233.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 838/5971 [08:04<49:25,  1.73it/s, loss=0.196, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000457, train/loss_step=0.139, global_step=1233.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 839/5971 [08:05<49:26,  1.73it/s, loss=0.196, v_num=0, train/loss_simple_step=0.00407, train/loss_vlb_step=2.21e-5, train/loss_step=0.00407, global_step=1233.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 840/5971 [08:07<49:35,  1.72it/s, loss=0.196, v_num=0, train/loss_simple_step=0.00407, train/loss_vlb_step=2.21e-5, train/loss_step=0.00407, global_step=1233.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 840/5971 [08:07<49:35,  1.72it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0926, train/loss_vlb_step=0.000305, train/loss_step=0.0926, global_step=1233.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  14%|█▍        | 841/5971 [08:08<49:36,  1.72it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0195, train/loss_vlb_step=7.71e-5, train/loss_step=0.0195, global_step=1234.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  14%|█▍        | 842/5971 [08:09<49:37,  1.72it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00575, train/loss_vlb_step=2.97e-5, train/loss_step=0.00575, global_step=1234.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 843/5971 [08:10<49:39,  1.72it/s, loss=0.179, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.00164, train/loss_step=0.313, global_step=1234.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  14%|█▍        | 844/5971 [08:12<49:47,  1.72it/s, loss=0.179, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.00164, train/loss_step=0.313, global_step=1234.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 844/5971 [08:12<49:47,  1.72it/s, loss=0.22, v_num=0, train/loss_simple_step=0.856, train/loss_vlb_step=0.0872, train/loss_step=0.856, global_step=1234.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  14%|█▍        | 845/5971 [08:13<49:49,  1.71it/s, loss=0.225, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000822, train/loss_step=0.222, global_step=1235.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 846/5971 [08:14<49:50,  1.71it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0103, train/loss_vlb_step=4.62e-5, train/loss_step=0.0103, global_step=1235.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 847/5971 [08:15<49:51,  1.71it/s, loss=0.213, v_num=0, train/loss_simple_step=0.700, train/loss_vlb_step=0.0164, train/loss_step=0.700, global_step=1235.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  14%|█▍        | 848/5971 [08:17<50:00,  1.71it/s, loss=0.213, v_num=0, train/loss_simple_step=0.700, train/loss_vlb_step=0.0164, train/loss_step=0.700, global_step=1235.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 848/5971 [08:17<50:00,  1.71it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0992, train/loss_vlb_step=0.000327, train/loss_step=0.0992, global_step=1235.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 849/5971 [08:18<50:01,  1.71it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.5e-5, train/loss_step=0.0128, global_step=1236.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  14%|█▍        | 850/5971 [08:18<50:02,  1.71it/s, loss=0.207, v_num=0, train/loss_simple_step=0.0196, train/loss_vlb_step=8.24e-5, train/loss_step=0.0196, global_step=1236.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 851/5971 [08:19<50:03,  1.70it/s, loss=0.214, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.000579, train/loss_step=0.176, global_step=1236.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  14%|█▍        | 852/5971 [08:22<50:13,  1.70it/s, loss=0.214, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.000579, train/loss_step=0.176, global_step=1236.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 852/5971 [08:22<50:13,  1.70it/s, loss=0.217, v_num=0, train/loss_simple_step=0.085, train/loss_vlb_step=0.000286, train/loss_step=0.085, global_step=1236.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 853/5971 [08:23<50:15,  1.70it/s, loss=0.199, v_num=0, train/loss_simple_step=0.00319, train/loss_vlb_step=1.88e-5, train/loss_step=0.00319, global_step=1237.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 854/5971 [08:23<50:16,  1.70it/s, loss=0.223, v_num=0, train/loss_simple_step=0.483, train/loss_vlb_step=0.00444, train/loss_step=0.483, global_step=1237.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  14%|█▍        | 855/5971 [08:24<50:17,  1.70it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0205, train/loss_vlb_step=7.76e-5, train/loss_step=0.0205, global_step=1237.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 856/5971 [08:27<50:28,  1.69it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0205, train/loss_vlb_step=7.76e-5, train/loss_step=0.0205, global_step=1237.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 856/5971 [08:27<50:28,  1.69it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0257, train/loss_vlb_step=0.000101, train/loss_step=0.0257, global_step=1237.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 857/5971 [08:28<50:30,  1.69it/s, loss=0.172, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000496, train/loss_step=0.149, global_step=1238.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  14%|█▍        | 858/5971 [08:29<50:31,  1.69it/s, loss=0.176, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000831, train/loss_step=0.222, global_step=1238.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 859/5971 [08:30<50:33,  1.69it/s, loss=0.206, v_num=0, train/loss_simple_step=0.610, train/loss_vlb_step=0.0102, train/loss_step=0.610, global_step=1238.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  14%|█▍        | 860/5971 [08:32<50:44,  1.68it/s, loss=0.206, v_num=0, train/loss_simple_step=0.610, train/loss_vlb_step=0.0102, train/loss_step=0.610, global_step=1238.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 860/5971 [08:32<50:44,  1.68it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0886, train/loss_vlb_step=0.000292, train/loss_step=0.0886, global_step=1238.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 861/5971 [08:33<50:45,  1.68it/s, loss=0.207, v_num=0, train/loss_simple_step=0.0466, train/loss_vlb_step=0.000164, train/loss_step=0.0466, global_step=1239.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 862/5971 [08:34<50:47,  1.68it/s, loss=0.216, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000582, train/loss_step=0.177, global_step=1239.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  14%|█▍        | 863/5971 [08:35<50:48,  1.68it/s, loss=0.213, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.000963, train/loss_step=0.251, global_step=1239.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 864/5971 [08:37<50:56,  1.67it/s, loss=0.213, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.000963, train/loss_step=0.251, global_step=1239.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  14%|█▍        | 864/5971 [08:37<50:56,  1.67it/s, loss=0.215, v_num=0, train/loss_simple_step=0.896, train/loss_vlb_step=0.226, train/loss_step=0.896, global_step=1239.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  14%|█▍        | 865/5971 [08:38<50:58,  1.67it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000109, train/loss_step=0.0285, global_step=1240.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▍        | 866/5971 [08:39<50:59,  1.67it/s, loss=0.208, v_num=0, train/loss_simple_step=0.0614, train/loss_vlb_step=0.000207, train/loss_step=0.0614, global_step=1240.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▍        | 867/5971 [08:40<51:00,  1.67it/s, loss=0.204, v_num=0, train/loss_simple_step=0.622, train/loss_vlb_step=0.0152, train/loss_step=0.622, global_step=1240.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  15%|█▍        | 868/5971 [08:43<51:13,  1.66it/s, loss=0.204, v_num=0, train/loss_simple_step=0.622, train/loss_vlb_step=0.0152, train/loss_step=0.622, global_step=1240.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▍        | 868/5971 [08:43<51:13,  1.66it/s, loss=0.199, v_num=0, train/loss_simple_step=0.00724, train/loss_vlb_step=3.41e-5, train/loss_step=0.00724, global_step=1240.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▍        | 869/5971 [08:44<51:14,  1.66it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0191, train/loss_vlb_step=8.19e-5, train/loss_step=0.0191, global_step=1241.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  15%|█▍        | 870/5971 [08:45<51:15,  1.66it/s, loss=0.213, v_num=0, train/loss_simple_step=0.293, train/loss_vlb_step=0.00153, train/loss_step=0.293, global_step=1241.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▍        | 871/5971 [08:46<51:16,  1.66it/s, loss=0.208, v_num=0, train/loss_simple_step=0.0707, train/loss_vlb_step=0.000241, train/loss_step=0.0707, global_step=1241.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▍        | 872/5971 [08:48<51:26,  1.65it/s, loss=0.208, v_num=0, train/loss_simple_step=0.0707, train/loss_vlb_step=0.000241, train/loss_step=0.0707, global_step=1241.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▍        | 872/5971 [08:48<51:26,  1.65it/s, loss=0.211, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000483, train/loss_step=0.147, global_step=1241.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  15%|█▍        | 873/5971 [08:49<51:27,  1.65it/s, loss=0.212, v_num=0, train/loss_simple_step=0.0225, train/loss_vlb_step=9.2e-5, train/loss_step=0.0225, global_step=1242.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▍        | 874/5971 [08:50<51:28,  1.65it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0815, train/loss_vlb_step=0.000269, train/loss_step=0.0815, global_step=1242.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▍        | 875/5971 [08:51<51:29,  1.65it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00456, train/loss_vlb_step=2.34e-5, train/loss_step=0.00456, global_step=1242.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▍        | 876/5971 [08:53<51:37,  1.64it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00456, train/loss_vlb_step=2.34e-5, train/loss_step=0.00456, global_step=1242.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▍        | 876/5971 [08:53<51:37,  1.64it/s, loss=0.196, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000425, train/loss_step=0.129, global_step=1242.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  15%|█▍        | 877/5971 [08:54<51:40,  1.64it/s, loss=0.196, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000487, train/loss_step=0.145, global_step=1243.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▍        | 878/5971 [08:55<51:41,  1.64it/s, loss=0.192, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000457, train/loss_step=0.139, global_step=1243.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▍        | 879/5971 [08:56<51:42,  1.64it/s, loss=0.167, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000357, train/loss_step=0.109, global_step=1243.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▍        | 880/5971 [08:58<51:50,  1.64it/s, loss=0.167, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000357, train/loss_step=0.109, global_step=1243.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▍        | 880/5971 [08:58<51:50,  1.64it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0676, train/loss_vlb_step=0.000224, train/loss_step=0.0676, global_step=1243.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▍        | 881/5971 [08:59<51:51,  1.64it/s, loss=0.175, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000843, train/loss_step=0.224, global_step=1244.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  15%|█▍        | 882/5971 [09:00<51:52,  1.63it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0889, train/loss_vlb_step=0.000295, train/loss_step=0.0889, global_step=1244.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▍        | 883/5971 [09:01<51:53,  1.63it/s, loss=0.175, v_num=0, train/loss_simple_step=0.339, train/loss_vlb_step=0.00131, train/loss_step=0.339, global_step=1244.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  15%|█▍        | 884/5971 [09:03<52:01,  1.63it/s, loss=0.175, v_num=0, train/loss_simple_step=0.339, train/loss_vlb_step=0.00131, train/loss_step=0.339, global_step=1244.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▍        | 884/5971 [09:03<52:01,  1.63it/s, loss=0.134, v_num=0, train/loss_simple_step=0.087, train/loss_vlb_step=0.000286, train/loss_step=0.087, global_step=1244.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▍        | 885/5971 [09:04<52:02,  1.63it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0438, train/loss_vlb_step=0.00015, train/loss_step=0.0438, global_step=1245.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▍        | 886/5971 [09:04<52:03,  1.63it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.77e-5, train/loss_step=0.0102, global_step=1245.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▍        | 887/5971 [09:05<52:04,  1.63it/s, loss=0.106, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.000329, train/loss_step=0.100, global_step=1245.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  15%|█▍        | 888/5971 [09:08<52:13,  1.62it/s, loss=0.106, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.000329, train/loss_step=0.100, global_step=1245.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▍        | 888/5971 [09:08<52:13,  1.62it/s, loss=0.132, v_num=0, train/loss_simple_step=0.525, train/loss_vlb_step=0.00345, train/loss_step=0.525, global_step=1245.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  15%|█▍        | 889/5971 [09:08<52:14,  1.62it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0308, train/loss_vlb_step=0.000114, train/loss_step=0.0308, global_step=1246.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▍        | 890/5971 [09:09<52:15,  1.62it/s, loss=0.119, v_num=0, train/loss_simple_step=0.015, train/loss_vlb_step=6.44e-5, train/loss_step=0.015, global_step=1246.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  15%|█▍        | 891/5971 [09:10<52:16,  1.62it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0175, train/loss_vlb_step=7.43e-5, train/loss_step=0.0175, global_step=1246.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▍        | 892/5971 [09:13<52:30,  1.61it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0175, train/loss_vlb_step=7.43e-5, train/loss_step=0.0175, global_step=1246.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▍        | 892/5971 [09:13<52:30,  1.61it/s, loss=0.12, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000962, train/loss_step=0.225, global_step=1246.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  15%|█▍        | 893/5971 [09:14<52:31,  1.61it/s, loss=0.128, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000629, train/loss_step=0.180, global_step=1247.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▍        | 894/5971 [09:15<52:31,  1.61it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=5.14e-5, train/loss_step=0.0109, global_step=1247.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▍        | 895/5971 [09:16<52:32,  1.61it/s, loss=0.129, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000318, train/loss_step=0.095, global_step=1247.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  15%|█▌        | 896/5971 [09:19<52:43,  1.60it/s, loss=0.129, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000318, train/loss_step=0.095, global_step=1247.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▌        | 896/5971 [09:19<52:43,  1.60it/s, loss=0.129, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000415, train/loss_step=0.125, global_step=1247.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▌        | 897/5971 [09:20<52:44,  1.60it/s, loss=0.152, v_num=0, train/loss_simple_step=0.602, train/loss_vlb_step=0.00504, train/loss_step=0.602, global_step=1248.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  15%|█▌        | 898/5971 [09:20<52:45,  1.60it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00162, train/loss_vlb_step=9.84e-6, train/loss_step=0.00162, global_step=1248.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▌        | 899/5971 [09:21<52:46,  1.60it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00795, train/loss_vlb_step=3.44e-5, train/loss_step=0.00795, global_step=1248.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  15%|█▌        | 900/5971 [09:24<52:56,  1.60it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00795, train/loss_vlb_step=3.44e-5, train/loss_step=0.00795, global_step=1248.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▌        | 900/5971 [09:24<52:56,  1.60it/s, loss=0.145, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000606, train/loss_step=0.177, global_step=1248.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  15%|█▌        | 901/5971 [09:25<52:57,  1.60it/s, loss=0.135, v_num=0, train/loss_simple_step=0.015, train/loss_vlb_step=6.29e-5, train/loss_step=0.015, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  15%|█▌        | 902/5971 [09:26<52:58,  1.59it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00628, train/loss_vlb_step=3.09e-5, train/loss_step=0.00628, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▌        | 903/5971 [09:27<52:58,  1.59it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0418, train/loss_vlb_step=0.000144, train/loss_step=0.0418, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  15%|█▌        | 904/5971 [09:29<53:09,  1.59it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0418, train/loss_vlb_step=0.000144, train/loss_step=0.0418, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  15%|█▌        | 904/5971 [09:29<53:09,  1.59it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:19,  2.08it/s][A

Validating:   1%|          | 2/167 [00:00<00:50,  3.27it/s][A
Epoch 2:  15%|█▌        | 908/5971 [09:30<52:56,  1.59it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   3%|▎         | 5/167 [00:00<00:18,  8.79it/s][A
Epoch 2:  15%|█▌        | 912/5971 [09:30<52:41,  1.60it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:00<00:12, 13.22it/s][A

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.97it/s][A
Epoch 2:  15%|█▌        | 916/5971 [09:30<52:25,  1.61it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.97it/s][A
Epoch 2:  15%|█▌        | 920/5971 [09:30<52:10,  1.61it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  10%|█         | 17/167 [00:01<00:07, 21.18it/s][A
Epoch 2:  15%|█▌        | 924/5971 [09:30<51:55,  1.62it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.70it/s][A

Validating:  14%|█▍        | 23/167 [00:01<00:05, 24.33it/s][A
Epoch 2:  16%|█▌        | 928/5971 [09:31<51:40,  1.63it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  16%|█▌        | 26/167 [00:01<00:06, 23.03it/s][A
Epoch 2:  16%|█▌        | 932/5971 [09:31<51:25,  1.63it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 23.34it/s][A
Epoch 2:  16%|█▌        | 936/5971 [09:31<51:10,  1.64it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 23.92it/s][A

Validating:  21%|██        | 35/167 [00:01<00:05, 23.49it/s][A
Epoch 2:  16%|█▌        | 940/5971 [09:31<50:56,  1.65it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  23%|██▎       | 38/167 [00:02<00:05, 23.63it/s][A
Epoch 2:  16%|█▌        | 944/5971 [09:31<50:41,  1.65it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  25%|██▍       | 41/167 [00:02<00:05, 24.28it/s][A
Epoch 2:  16%|█▌        | 948/5971 [09:31<50:27,  1.66it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  26%|██▋       | 44/167 [00:02<00:05, 24.41it/s][A

Validating:  28%|██▊       | 47/167 [00:02<00:04, 25.72it/s][A
Epoch 2:  16%|█▌        | 952/5971 [09:32<50:12,  1.67it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 25.66it/s][A
Epoch 2:  16%|█▌        | 956/5971 [09:32<49:58,  1.67it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 24.70it/s][A
Epoch 2:  16%|█▌        | 960/5971 [09:32<49:44,  1.68it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 25.87it/s][A

Validating:  35%|███▌      | 59/167 [00:02<00:04, 26.81it/s][A
Epoch 2:  16%|█▌        | 964/5971 [09:32<49:30,  1.69it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  37%|███▋      | 62/167 [00:02<00:03, 26.91it/s][A
Epoch 2:  16%|█▌        | 968/5971 [09:32<49:16,  1.69it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  39%|███▉      | 65/167 [00:03<00:03, 27.48it/s][A
Epoch 2:  16%|█▋        | 972/5971 [09:32<49:02,  1.70it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  41%|████      | 68/167 [00:03<00:03, 27.21it/s][A

Validating:  43%|████▎     | 71/167 [00:03<00:03, 27.37it/s][A
Epoch 2:  16%|█▋        | 976/5971 [09:32<48:49,  1.71it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 26.83it/s][A
Epoch 2:  16%|█▋        | 980/5971 [09:33<48:35,  1.71it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 27.06it/s][A
Epoch 2:  16%|█▋        | 984/5971 [09:33<48:22,  1.72it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 27.16it/s][A

Validating:  50%|████▉     | 83/167 [00:03<00:03, 27.59it/s][A
Epoch 2:  17%|█▋        | 988/5971 [09:33<48:09,  1.72it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  51%|█████▏    | 86/167 [00:03<00:02, 27.68it/s][A
Epoch 2:  17%|█▋        | 992/5971 [09:33<47:55,  1.73it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  53%|█████▎    | 89/167 [00:04<00:03, 25.94it/s][A
Epoch 2:  17%|█▋        | 996/5971 [09:33<47:42,  1.74it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 27.20it/s][A
Epoch 2:  17%|█▋        | 1000/5971 [09:33<47:29,  1.74it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 27.77it/s][A

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 26.74it/s][A
Epoch 2:  17%|█▋        | 1004/5971 [09:34<47:16,  1.75it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  61%|██████    | 102/167 [00:04<00:02, 24.40it/s][A
Epoch 2:  17%|█▋        | 1008/5971 [09:34<47:04,  1.76it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 24.02it/s][A
Epoch 2:  17%|█▋        | 1012/5971 [09:34<46:51,  1.76it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 24.61it/s][A

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 24.83it/s][A
Epoch 2:  17%|█▋        | 1016/5971 [09:34<46:39,  1.77it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  68%|██████▊   | 114/167 [00:04<00:02, 24.71it/s][A
Epoch 2:  17%|█▋        | 1020/5971 [09:34<46:26,  1.78it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  70%|███████   | 117/167 [00:05<00:01, 25.30it/s][A
Epoch 2:  17%|█▋        | 1024/5971 [09:34<46:14,  1.78it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 25.60it/s][A

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 25.25it/s][A
Epoch 2:  17%|█▋        | 1028/5971 [09:34<46:02,  1.79it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 25.01it/s][A
Epoch 2:  17%|█▋        | 1032/5971 [09:35<45:49,  1.80it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 25.92it/s][A
Epoch 2:  17%|█▋        | 1036/5971 [09:35<45:37,  1.80it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 24.65it/s][A

Validating:  81%|████████  | 135/167 [00:05<00:01, 24.61it/s][A
Epoch 2:  17%|█▋        | 1040/5971 [09:35<45:25,  1.81it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 24.75it/s][A
Epoch 2:  17%|█▋        | 1044/5971 [09:35<45:14,  1.82it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  84%|████████▍ | 141/167 [00:06<00:01, 25.97it/s][A
Epoch 2:  18%|█▊        | 1048/5971 [09:35<45:02,  1.82it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 26.13it/s][A

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 26.27it/s][A
Epoch 2:  18%|█▊        | 1052/5971 [09:35<44:50,  1.83it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 26.68it/s][A
Epoch 2:  18%|█▊        | 1056/5971 [09:36<44:38,  1.83it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 25.79it/s][A
Epoch 2:  18%|█▊        | 1060/5971 [09:36<44:27,  1.84it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 24.65it/s][A

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 24.41it/s][A
Epoch 2:  18%|█▊        | 1064/5971 [09:36<44:15,  1.85it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 24.18it/s][A
Epoch 2:  18%|█▊        | 1068/5971 [09:36<44:04,  1.85it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  99%|█████████▉| 165/167 [00:07<00:00, 23.79it/s][A
Epoch 2:  18%|█▊        | 1072/5971 [09:36<43:53,  1.86it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  18%|█▊        | 1072/5971 [09:37<43:54,  1.86it/s, loss=0.136, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00522, train/loss_step=0.480, global_step=1249.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Epoch 2:  18%|█▊        | 1073/5971 [09:37<43:55,  1.86it/s, loss=0.149, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.0014, train/loss_step=0.305, global_step=1250.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  18%|█▊        | 1074/5971 [09:38<43:57,  1.86it/s, loss=0.187, v_num=0, train/loss_simple_step=0.775, train/loss_vlb_step=0.0498, train/loss_step=0.775, global_step=1250.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  18%|█▊        | 1075/5971 [09:39<43:58,  1.86it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0153, train/loss_vlb_step=6.61e-5, train/loss_step=0.0153, global_step=1250.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  18%|█▊        | 1076/5971 [09:42<44:05,  1.85it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0153, train/loss_vlb_step=6.61e-5, train/loss_step=0.0153, global_step=1250.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  18%|█▊        | 1076/5971 [09:42<44:05,  1.85it/s, loss=0.176, v_num=0, train/loss_simple_step=0.386, train/loss_vlb_step=0.00254, train/loss_step=0.386, global_step=1250.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  18%|█▊        | 1077/5971 [09:43<44:07,  1.85it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.000117, train/loss_step=0.0325, global_step=1251.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  18%|█▊        | 1078/5971 [09:43<44:08,  1.85it/s, loss=0.189, v_num=0, train/loss_simple_step=0.273, train/loss_vlb_step=0.00122, train/loss_step=0.273, global_step=1251.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  18%|█▊        | 1079/5971 [09:44<44:09,  1.85it/s, loss=0.203, v_num=0, train/loss_simple_step=0.301, train/loss_vlb_step=0.00113, train/loss_step=0.301, global_step=1251.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  18%|█▊        | 1080/5971 [09:47<44:20,  1.84it/s, loss=0.203, v_num=0, train/loss_simple_step=0.301, train/loss_vlb_step=0.00113, train/loss_step=0.301, global_step=1251.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  18%|█▊        | 1080/5971 [09:47<44:20,  1.84it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.2e-5, train/loss_step=0.00216, global_step=1251.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  18%|█▊        | 1081/5971 [09:48<44:21,  1.84it/s, loss=0.197, v_num=0, train/loss_simple_step=0.286, train/loss_vlb_step=0.00119, train/loss_step=0.286, global_step=1252.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  18%|█▊        | 1082/5971 [09:49<44:22,  1.84it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0418, train/loss_vlb_step=0.000147, train/loss_step=0.0418, global_step=1252.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  18%|█▊        | 1083/5971 [09:50<44:23,  1.84it/s, loss=0.21, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.0015, train/loss_step=0.319, global_step=1252.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2:  18%|█▊        | 1084/5971 [09:52<44:29,  1.83it/s, loss=0.21, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.0015, train/loss_step=0.319, global_step=1252.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  18%|█▊        | 1084/5971 [09:52<44:29,  1.83it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.24e-5, train/loss_step=0.0119, global_step=1252.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  18%|█▊        | 1085/5971 [09:53<44:31,  1.83it/s, loss=0.19, v_num=0, train/loss_simple_step=0.321, train/loss_vlb_step=0.00158, train/loss_step=0.321, global_step=1253.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  18%|█▊        | 1086/5971 [09:54<44:32,  1.83it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0336, train/loss_vlb_step=0.000124, train/loss_step=0.0336, global_step=1253.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  18%|█▊        | 1087/5971 [09:55<44:33,  1.83it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00395, train/loss_vlb_step=2.08e-5, train/loss_step=0.00395, global_step=1253.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  18%|█▊        | 1088/5971 [09:57<44:40,  1.82it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00395, train/loss_vlb_step=2.08e-5, train/loss_step=0.00395, global_step=1253.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  18%|█▊        | 1088/5971 [09:57<44:40,  1.82it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0038, train/loss_vlb_step=2.01e-5, train/loss_step=0.0038, global_step=1253.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  18%|█▊        | 1089/5971 [09:58<44:41,  1.82it/s, loss=0.189, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000506, train/loss_step=0.149, global_step=1254.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  18%|█▊        | 1090/5971 [09:59<44:42,  1.82it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0745, train/loss_vlb_step=0.000249, train/loss_step=0.0745, global_step=1254.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  18%|█▊        | 1091/5971 [10:00<44:43,  1.82it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0334, train/loss_vlb_step=0.000125, train/loss_step=0.0334, global_step=1254.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  18%|█▊        | 1092/5971 [10:02<44:49,  1.81it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0334, train/loss_vlb_step=0.000125, train/loss_step=0.0334, global_step=1254.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  18%|█▊        | 1092/5971 [10:02<44:49,  1.81it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.09e-5, train/loss_step=0.0121, global_step=1254.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  18%|█▊        | 1093/5971 [10:03<44:50,  1.81it/s, loss=0.17, v_num=0, train/loss_simple_step=0.331, train/loss_vlb_step=0.00147, train/loss_step=0.331, global_step=1255.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  18%|█▊        | 1094/5971 [10:04<44:51,  1.81it/s, loss=0.16, v_num=0, train/loss_simple_step=0.574, train/loss_vlb_step=0.00777, train/loss_step=0.574, global_step=1255.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  18%|█▊        | 1095/5971 [10:05<44:52,  1.81it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0704, train/loss_vlb_step=0.00024, train/loss_step=0.0704, global_step=1255.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  18%|█▊        | 1096/5971 [10:08<45:02,  1.80it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0704, train/loss_vlb_step=0.00024, train/loss_step=0.0704, global_step=1255.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  18%|█▊        | 1096/5971 [10:08<45:02,  1.80it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0735, train/loss_vlb_step=0.000245, train/loss_step=0.0735, global_step=1255.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  18%|█▊        | 1097/5971 [10:08<45:03,  1.80it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00401, train/loss_vlb_step=2.23e-5, train/loss_step=0.00401, global_step=1256.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  18%|█▊        | 1098/5971 [10:09<45:04,  1.80it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00296, train/loss_vlb_step=1.57e-5, train/loss_step=0.00296, global_step=1256.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  18%|█▊        | 1099/5971 [10:10<45:05,  1.80it/s, loss=0.153, v_num=0, train/loss_simple_step=0.709, train/loss_vlb_step=0.0116, train/loss_step=0.709, global_step=1256.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2:  18%|█▊        | 1100/5971 [10:13<45:14,  1.79it/s, loss=0.153, v_num=0, train/loss_simple_step=0.709, train/loss_vlb_step=0.0116, train/loss_step=0.709, global_step=1256.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  18%|█▊        | 1100/5971 [10:13<45:14,  1.79it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0575, train/loss_vlb_step=0.000209, train/loss_step=0.0575, global_step=1256.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  18%|█▊        | 1101/5971 [10:14<45:15,  1.79it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0207, train/loss_vlb_step=8.52e-5, train/loss_step=0.0207, global_step=1257.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  18%|█▊        | 1102/5971 [10:15<45:16,  1.79it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0968, train/loss_vlb_step=0.000323, train/loss_step=0.0968, global_step=1257.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  18%|█▊        | 1103/5971 [10:16<45:17,  1.79it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0358, train/loss_vlb_step=0.00014, train/loss_step=0.0358, global_step=1257.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  18%|█▊        | 1104/5971 [10:18<45:24,  1.79it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0358, train/loss_vlb_step=0.00014, train/loss_step=0.0358, global_step=1257.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  18%|█▊        | 1104/5971 [10:18<45:24,  1.79it/s, loss=0.137, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.00045, train/loss_step=0.136, global_step=1257.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  19%|█▊        | 1105/5971 [10:19<45:25,  1.79it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0566, train/loss_vlb_step=0.000196, train/loss_step=0.0566, global_step=1258.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▊        | 1106/5971 [10:20<45:26,  1.78it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0301, train/loss_vlb_step=0.000116, train/loss_step=0.0301, global_step=1258.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▊        | 1107/5971 [10:21<45:27,  1.78it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00628, train/loss_vlb_step=3.15e-5, train/loss_step=0.00628, global_step=1258.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▊        | 1108/5971 [10:23<45:33,  1.78it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00628, train/loss_vlb_step=3.15e-5, train/loss_step=0.00628, global_step=1258.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▊        | 1108/5971 [10:23<45:33,  1.78it/s, loss=0.131, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000539, train/loss_step=0.155, global_step=1258.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  19%|█▊        | 1109/5971 [10:24<45:34,  1.78it/s, loss=0.136, v_num=0, train/loss_simple_step=0.241, train/loss_vlb_step=0.00084, train/loss_step=0.241, global_step=1259.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  19%|█▊        | 1110/5971 [10:25<45:35,  1.78it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00144, train/loss_vlb_step=8.67e-6, train/loss_step=0.00144, global_step=1259.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▊        | 1111/5971 [10:26<45:36,  1.78it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00255, train/loss_vlb_step=1.44e-5, train/loss_step=0.00255, global_step=1259.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▊        | 1112/5971 [10:28<45:42,  1.77it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00255, train/loss_vlb_step=1.44e-5, train/loss_step=0.00255, global_step=1259.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▊        | 1112/5971 [10:28<45:42,  1.77it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00457, train/loss_vlb_step=2.48e-5, train/loss_step=0.00457, global_step=1259.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  19%|█▊        | 1113/5971 [10:29<45:43,  1.77it/s, loss=0.142, v_num=0, train/loss_simple_step=0.567, train/loss_vlb_step=0.00767, train/loss_step=0.567, global_step=1260.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  19%|█▊        | 1114/5971 [10:30<45:44,  1.77it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0175, train/loss_vlb_step=7.41e-5, train/loss_step=0.0175, global_step=1260.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▊        | 1115/5971 [10:30<45:45,  1.77it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00719, train/loss_vlb_step=3.32e-5, train/loss_step=0.00719, global_step=1260.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▊        | 1116/5971 [10:33<45:51,  1.76it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00719, train/loss_vlb_step=3.32e-5, train/loss_step=0.00719, global_step=1260.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▊        | 1116/5971 [10:33<45:51,  1.76it/s, loss=0.11, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000186, train/loss_step=0.055, global_step=1260.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  19%|█▊        | 1117/5971 [10:33<45:52,  1.76it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0241, train/loss_vlb_step=9.6e-5, train/loss_step=0.0241, global_step=1261.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▊        | 1118/5971 [10:34<45:53,  1.76it/s, loss=0.142, v_num=0, train/loss_simple_step=0.612, train/loss_vlb_step=0.00846, train/loss_step=0.612, global_step=1261.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  19%|█▊        | 1119/5971 [10:35<45:53,  1.76it/s, loss=0.112, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000399, train/loss_step=0.117, global_step=1261.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1120/5971 [10:38<46:01,  1.76it/s, loss=0.112, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000399, train/loss_step=0.117, global_step=1261.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1120/5971 [10:38<46:01,  1.76it/s, loss=0.136, v_num=0, train/loss_simple_step=0.534, train/loss_vlb_step=0.00368, train/loss_step=0.534, global_step=1261.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  19%|█▉        | 1121/5971 [10:38<46:01,  1.76it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0022, train/loss_vlb_step=1.3e-5, train/loss_step=0.0022, global_step=1262.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1122/5971 [10:39<46:02,  1.76it/s, loss=0.136, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000395, train/loss_step=0.119, global_step=1262.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1123/5971 [10:40<46:03,  1.75it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0422, train/loss_vlb_step=0.000149, train/loss_step=0.0422, global_step=1262.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1124/5971 [10:42<46:09,  1.75it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0422, train/loss_vlb_step=0.000149, train/loss_step=0.0422, global_step=1262.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1124/5971 [10:42<46:09,  1.75it/s, loss=0.145, v_num=0, train/loss_simple_step=0.314, train/loss_vlb_step=0.002, train/loss_step=0.314, global_step=1262.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2:  19%|█▉        | 1125/5971 [10:43<46:10,  1.75it/s, loss=0.165, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.00251, train/loss_step=0.454, global_step=1263.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1126/5971 [10:44<46:10,  1.75it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00803, train/loss_vlb_step=3.71e-5, train/loss_step=0.00803, global_step=1263.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1127/5971 [10:45<46:12,  1.75it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0407, train/loss_vlb_step=0.00015, train/loss_step=0.0407, global_step=1263.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  19%|█▉        | 1128/5971 [10:47<46:19,  1.74it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0407, train/loss_vlb_step=0.00015, train/loss_step=0.0407, global_step=1263.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1128/5971 [10:47<46:19,  1.74it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00699, train/loss_vlb_step=3.37e-5, train/loss_step=0.00699, global_step=1263.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1129/5971 [10:48<46:20,  1.74it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0306, train/loss_vlb_step=0.000115, train/loss_step=0.0306, global_step=1264.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  19%|█▉        | 1130/5971 [10:49<46:21,  1.74it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0164, train/loss_vlb_step=7.1e-5, train/loss_step=0.0164, global_step=1264.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  19%|█▉        | 1131/5971 [10:50<46:21,  1.74it/s, loss=0.159, v_num=0, train/loss_simple_step=0.215, train/loss_vlb_step=0.000777, train/loss_step=0.215, global_step=1264.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1132/5971 [10:52<46:28,  1.74it/s, loss=0.159, v_num=0, train/loss_simple_step=0.215, train/loss_vlb_step=0.000777, train/loss_step=0.215, global_step=1264.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1132/5971 [10:52<46:28,  1.74it/s, loss=0.181, v_num=0, train/loss_simple_step=0.434, train/loss_vlb_step=0.00246, train/loss_step=0.434, global_step=1264.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  19%|█▉        | 1133/5971 [10:53<46:29,  1.73it/s, loss=0.16, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000502, train/loss_step=0.143, global_step=1265.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1134/5971 [10:54<46:29,  1.73it/s, loss=0.18, v_num=0, train/loss_simple_step=0.422, train/loss_vlb_step=0.00308, train/loss_step=0.422, global_step=1265.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  19%|█▉        | 1135/5971 [10:55<46:30,  1.73it/s, loss=0.186, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000419, train/loss_step=0.128, global_step=1265.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1136/5971 [10:57<46:36,  1.73it/s, loss=0.186, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000419, train/loss_step=0.128, global_step=1265.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1136/5971 [10:57<46:36,  1.73it/s, loss=0.201, v_num=0, train/loss_simple_step=0.356, train/loss_vlb_step=0.00153, train/loss_step=0.356, global_step=1265.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  19%|█▉        | 1137/5971 [10:58<46:37,  1.73it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=6.15e-5, train/loss_step=0.0135, global_step=1266.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1138/5971 [10:59<46:38,  1.73it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0038, train/loss_vlb_step=2.09e-5, train/loss_step=0.0038, global_step=1266.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1139/5971 [11:00<46:39,  1.73it/s, loss=0.171, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000424, train/loss_step=0.127, global_step=1266.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1140/5971 [11:02<46:45,  1.72it/s, loss=0.171, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000424, train/loss_step=0.127, global_step=1266.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1140/5971 [11:02<46:45,  1.72it/s, loss=0.16, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00156, train/loss_step=0.319, global_step=1266.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  19%|█▉        | 1141/5971 [11:03<46:46,  1.72it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00335, train/loss_vlb_step=1.83e-5, train/loss_step=0.00335, global_step=1267.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1142/5971 [11:04<46:47,  1.72it/s, loss=0.165, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000894, train/loss_step=0.228, global_step=1267.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  19%|█▉        | 1143/5971 [11:05<46:47,  1.72it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0151, train/loss_vlb_step=6.45e-5, train/loss_step=0.0151, global_step=1267.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1144/5971 [11:07<46:53,  1.72it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0151, train/loss_vlb_step=6.45e-5, train/loss_step=0.0151, global_step=1267.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1144/5971 [11:07<46:53,  1.72it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0171, train/loss_vlb_step=7.25e-5, train/loss_step=0.0171, global_step=1267.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1145/5971 [11:08<46:54,  1.71it/s, loss=0.138, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.000835, train/loss_step=0.236, global_step=1268.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  19%|█▉        | 1146/5971 [11:09<46:55,  1.71it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0165, train/loss_vlb_step=7.14e-5, train/loss_step=0.0165, global_step=1268.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1147/5971 [11:10<46:55,  1.71it/s, loss=0.152, v_num=0, train/loss_simple_step=0.310, train/loss_vlb_step=0.00136, train/loss_step=0.310, global_step=1268.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  19%|█▉        | 1148/5971 [11:13<47:06,  1.71it/s, loss=0.152, v_num=0, train/loss_simple_step=0.310, train/loss_vlb_step=0.00136, train/loss_step=0.310, global_step=1268.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1148/5971 [11:13<47:06,  1.71it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00485, train/loss_vlb_step=2.51e-5, train/loss_step=0.00485, global_step=1268.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1149/5971 [11:14<47:07,  1.71it/s, loss=0.168, v_num=0, train/loss_simple_step=0.356, train/loss_vlb_step=0.00146, train/loss_step=0.356, global_step=1269.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  19%|█▉        | 1150/5971 [11:15<47:08,  1.70it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00259, train/loss_vlb_step=1.42e-5, train/loss_step=0.00259, global_step=1269.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1151/5971 [11:16<47:08,  1.70it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=5.04e-5, train/loss_step=0.0113, global_step=1269.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  19%|█▉        | 1152/5971 [11:18<47:14,  1.70it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=5.04e-5, train/loss_step=0.0113, global_step=1269.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1152/5971 [11:18<47:14,  1.70it/s, loss=0.156, v_num=0, train/loss_simple_step=0.411, train/loss_vlb_step=0.00271, train/loss_step=0.411, global_step=1269.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  19%|█▉        | 1153/5971 [11:19<47:15,  1.70it/s, loss=0.176, v_num=0, train/loss_simple_step=0.547, train/loss_vlb_step=0.00725, train/loss_step=0.547, global_step=1270.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1154/5971 [11:20<47:16,  1.70it/s, loss=0.194, v_num=0, train/loss_simple_step=0.765, train/loss_vlb_step=0.0286, train/loss_step=0.765, global_step=1270.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  19%|█▉        | 1155/5971 [11:21<47:17,  1.70it/s, loss=0.194, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000481, train/loss_step=0.138, global_step=1270.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1156/5971 [11:23<47:24,  1.69it/s, loss=0.194, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000481, train/loss_step=0.138, global_step=1270.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1156/5971 [11:23<47:24,  1.69it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0892, train/loss_vlb_step=0.000293, train/loss_step=0.0892, global_step=1270.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1157/5971 [11:24<47:25,  1.69it/s, loss=0.182, v_num=0, train/loss_simple_step=0.035, train/loss_vlb_step=0.00013, train/loss_step=0.035, global_step=1271.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  19%|█▉        | 1158/5971 [11:25<47:26,  1.69it/s, loss=0.189, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000501, train/loss_step=0.151, global_step=1271.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1159/5971 [11:26<47:26,  1.69it/s, loss=0.212, v_num=0, train/loss_simple_step=0.591, train/loss_vlb_step=0.0138, train/loss_step=0.591, global_step=1271.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  19%|█▉        | 1160/5971 [11:29<47:35,  1.68it/s, loss=0.212, v_num=0, train/loss_simple_step=0.591, train/loss_vlb_step=0.0138, train/loss_step=0.591, global_step=1271.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1160/5971 [11:29<47:35,  1.68it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0876, train/loss_vlb_step=0.000291, train/loss_step=0.0876, global_step=1271.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1161/5971 [11:30<47:36,  1.68it/s, loss=0.203, v_num=0, train/loss_simple_step=0.0462, train/loss_vlb_step=0.000165, train/loss_step=0.0462, global_step=1272.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1162/5971 [11:30<47:36,  1.68it/s, loss=0.213, v_num=0, train/loss_simple_step=0.419, train/loss_vlb_step=0.00243, train/loss_step=0.419, global_step=1272.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  19%|█▉        | 1163/5971 [11:31<47:37,  1.68it/s, loss=0.215, v_num=0, train/loss_simple_step=0.0731, train/loss_vlb_step=0.000241, train/loss_step=0.0731, global_step=1272.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1164/5971 [11:34<47:46,  1.68it/s, loss=0.215, v_num=0, train/loss_simple_step=0.0731, train/loss_vlb_step=0.000241, train/loss_step=0.0731, global_step=1272.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  19%|█▉        | 1164/5971 [11:34<47:46,  1.68it/s, loss=0.24, v_num=0, train/loss_simple_step=0.517, train/loss_vlb_step=0.00319, train/loss_step=0.517, global_step=1272.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  20%|█▉        | 1165/5971 [11:35<47:46,  1.68it/s, loss=0.232, v_num=0, train/loss_simple_step=0.0704, train/loss_vlb_step=0.000238, train/loss_step=0.0704, global_step=1273.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  20%|█▉        | 1166/5971 [11:36<47:47,  1.68it/s, loss=0.243, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.000822, train/loss_step=0.231, global_step=1273.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  20%|█▉        | 1167/5971 [11:37<47:48,  1.67it/s, loss=0.235, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.0005, train/loss_step=0.150, global_step=1273.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  20%|█▉        | 1168/5971 [11:40<47:56,  1.67it/s, loss=0.235, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.0005, train/loss_step=0.150, global_step=1273.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  20%|█▉        | 1168/5971 [11:40<47:56,  1.67it/s, loss=0.235, v_num=0, train/loss_simple_step=0.00297, train/loss_vlb_step=1.61e-5, train/loss_step=0.00297, global_step=1273.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  20%|█▉        | 1169/5971 [11:40<47:57,  1.67it/s, loss=0.217, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.05e-5, train/loss_step=0.00174, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  20%|█▉        | 1170/5971 [11:41<47:57,  1.67it/s, loss=0.231, v_num=0, train/loss_simple_step=0.276, train/loss_vlb_step=0.00111, train/loss_step=0.276, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  20%|█▉        | 1171/5971 [11:42<47:58,  1.67it/s, loss=0.238, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000517, train/loss_step=0.156, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  20%|█▉        | 1172/5971 [11:45<48:05,  1.66it/s, loss=0.238, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000517, train/loss_step=0.156, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  20%|█▉        | 1172/5971 [11:45<48:05,  1.66it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Validating: 0it [00:00, ?it/s]
Validating:   0%|          | 0/167 [00:00<?, ?it/s]
Validating:   1%|          | 1/167 [00:00<01:11,  2.31it/s]
Validating:   1%|          | 2/167 [00:00<00:40,  4.11it/s]
Epoch 2:  20%|█▉        | 1176/5971 [11:46<47:56,  1.67it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Validating:   3%|▎         | 5/167 [00:00<00:15, 10.32it/s]
Validating:   4%|▍         | 7/167 [00:00<00:12, 12.60it/s]
Epoch 2:  20%|█▉        | 1180/5971 [11:46<47:44,  1.67it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Validating:   6%|▌         | 10/167 [00:00<00:09, 16.18it/s]
Epoch 2:  20%|█▉        | 1184/5971 [11:46<47:33,  1.68it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Validating:   8%|▊         | 13/167 [00:01<00:07, 19.30it/s]
Epoch 2:  20%|█▉        | 1188/5971 [11:46<47:22,  1.68it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Validating:  10%|▉         | 16/167 [00:01<00:07, 21.17it/s]
Validating:  11%|█▏        | 19/167 [00:01<00:06, 23.16it/s]
Epoch 2:  20%|█▉        | 1192/5971 [11:46<47:10,  1.69it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Validating:  13%|█▎        | 22/167 [00:01<00:06, 23.94it/s]
Epoch 2:  20%|██        | 1196/5971 [11:46<46:59,  1.69it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Validating:  15%|█▍        | 25/167 [00:01<00:05, 23.78it/s]
Epoch 2:  20%|██        | 1200/5971 [11:46<46:48,  1.70it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Validating:  17%|█▋        | 28/167 [00:01<00:05, 24.10it/s]
Validating:  19%|█▊        | 31/167 [00:01<00:05, 25.09it/s]
Epoch 2:  20%|██        | 1204/5971 [11:47<46:37,  1.70it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Validating:  21%|██        | 35/167 [00:01<00:04, 26.68it/s]
Epoch 2:  20%|██        | 1208/5971 [11:47<46:26,  1.71it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Validating:  23%|██▎       | 38/167 [00:01<00:04, 26.70it/s]
Epoch 2:  20%|██        | 1212/5971 [11:47<46:15,  1.71it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Validating:  25%|██▍       | 41/167 [00:02<00:04, 25.75it/s]
Epoch 2:  20%|██        | 1216/5971 [11:47<46:04,  1.72it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Validating:  26%|██▋       | 44/167 [00:02<00:04, 26.58it/s]
Epoch 2:  20%|██        | 1220/5971 [11:47<45:53,  1.73it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Validating:  29%|██▊       | 48/167 [00:02<00:04, 27.41it/s]
Validating:  31%|███       | 51/167 [00:02<00:04, 27.81it/s]
Epoch 2:  20%|██        | 1224/5971 [11:47<45:42,  1.73it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Validating:  32%|███▏      | 54/167 [00:02<00:04, 25.99it/s]
Epoch 2:  21%|██        | 1228/5971 [11:48<45:32,  1.74it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Validating:  34%|███▍      | 57/167 [00:02<00:04, 26.40it/s]
Epoch 2:  21%|██        | 1232/5971 [11:48<45:21,  1.74it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Validating:  36%|███▌      | 60/167 [00:02<00:03, 27.12it/s]
Validating:  38%|███▊      | 63/167 [00:02<00:03, 26.58it/s]
Epoch 2:  21%|██        | 1236/5971 [11:48<45:11,  1.75it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Validating:  40%|███▉      | 66/167 [00:03<00:03, 25.84it/s]
Epoch 2:  21%|██        | 1240/5971 [11:48<45:00,  1.75it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Validating:  41%|████▏     | 69/167 [00:03<00:03, 26.34it/s]
Epoch 2:  21%|██        | 1244/5971 [11:48<44:50,  1.76it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Validating:  43%|████▎     | 72/167 [00:03<00:03, 25.34it/s]
Validating:  45%|████▍     | 75/167 [00:03<00:03, 25.55it/s]
Epoch 2:  21%|██        | 1248/5971 [11:48<44:40,  1.76it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Validating:  47%|████▋     | 78/167 [00:03<00:05, 15.05it/s]
Epoch 2:  21%|██        | 1252/5971 [11:49<44:31,  1.77it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Validating:  49%|████▊     | 81/167 [00:03<00:04, 17.22it/s]
Epoch 2:  21%|██        | 1256/5971 [11:49<44:20,  1.77it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Validating:  50%|█████     | 84/167 [00:03<00:04, 19.22it/s]
Validating:  52%|█████▏    | 87/167 [00:04<00:04, 19.98it/s]
Epoch 2:  21%|██        | 1260/5971 [11:49<44:10,  1.78it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Validating:  54%|█████▍    | 90/167 [00:04<00:03, 20.29it/s]
Epoch 2:  21%|██        | 1264/5971 [11:49<44:00,  1.78it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Validating:  56%|█████▌    | 93/167 [00:04<00:03, 20.33it/s]
Epoch 2:  21%|██        | 1268/5971 [11:49<43:50,  1.79it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Validating:  57%|█████▋    | 96/167 [00:04<00:03, 22.40it/s]
Validating:  59%|█████▉    | 99/167 [00:04<00:02, 22.71it/s]
Epoch 2:  21%|██▏       | 1272/5971 [11:50<43:41,  1.79it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Validating:  61%|██████    | 102/167 [00:04<00:02, 24.21it/s]
Epoch 2:  21%|██▏       | 1276/5971 [11:50<43:31,  1.80it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 25.28it/s][A
Epoch 2:  21%|██▏       | 1280/5971 [11:50<43:21,  1.80it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 26.22it/s][A

Validating:  66%|██████▋   | 111/167 [00:05<00:02, 26.69it/s][A
Epoch 2:  22%|██▏       | 1284/5971 [11:50<43:11,  1.81it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  68%|██████▊   | 114/167 [00:05<00:02, 26.26it/s][A
Epoch 2:  22%|██▏       | 1288/5971 [11:50<43:01,  1.81it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  70%|███████   | 117/167 [00:05<00:01, 25.96it/s][A
Epoch 2:  22%|██▏       | 1292/5971 [11:50<42:52,  1.82it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 26.50it/s][A

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 27.03it/s][A
Epoch 2:  22%|██▏       | 1296/5971 [11:50<42:42,  1.82it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 26.61it/s][A
Epoch 2:  22%|██▏       | 1300/5971 [11:51<42:33,  1.83it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 28.07it/s][A
Epoch 2:  22%|██▏       | 1304/5971 [11:51<42:23,  1.83it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 27.68it/s][A
Epoch 2:  22%|██▏       | 1308/5971 [11:51<42:14,  1.84it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  82%|████████▏ | 137/167 [00:06<00:01, 28.57it/s][A
Epoch 2:  22%|██▏       | 1312/5971 [11:51<42:04,  1.85it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  84%|████████▍ | 140/167 [00:06<00:00, 28.45it/s][A

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 25.43it/s][A
Epoch 2:  22%|██▏       | 1316/5971 [11:51<41:55,  1.85it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 24.81it/s][A
Epoch 2:  22%|██▏       | 1320/5971 [11:51<41:46,  1.86it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 24.71it/s][A
Epoch 2:  22%|██▏       | 1324/5971 [11:52<41:37,  1.86it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 24.27it/s][A

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 23.84it/s][A
Epoch 2:  22%|██▏       | 1328/5971 [11:52<41:28,  1.87it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 19.42it/s][A
Epoch 2:  22%|██▏       | 1332/5971 [11:52<41:19,  1.87it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  97%|█████████▋| 162/167 [00:07<00:00, 22.18it/s][A
Epoch 2:  22%|██▏       | 1336/5971 [11:52<41:10,  1.88it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  99%|█████████▉| 165/167 [00:07<00:00, 22.02it/s][A
Epoch 2:  22%|██▏       | 1340/5971 [11:52<41:01,  1.88it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  22%|██▏       | 1340/5971 [11:53<41:04,  1.88it/s, loss=0.223, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1274.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Epoch 2:  22%|██▏       | 1341/5971 [11:54<41:05,  1.88it/s, loss=0.218, v_num=0, train/loss_simple_step=0.455, train/loss_vlb_step=0.00538, train/loss_step=0.455, global_step=1275.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  22%|██▏       | 1342/5971 [11:55<41:06,  1.88it/s, loss=0.189, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000672, train/loss_step=0.193, global_step=1275.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  22%|██▏       | 1343/5971 [11:56<41:06,  1.88it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0233, train/loss_vlb_step=8.97e-5, train/loss_step=0.0233, global_step=1275.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1344/5971 [11:58<41:11,  1.87it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0233, train/loss_vlb_step=8.97e-5, train/loss_step=0.0233, global_step=1275.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1344/5971 [11:58<41:11,  1.87it/s, loss=0.196, v_num=0, train/loss_simple_step=0.343, train/loss_vlb_step=0.00141, train/loss_step=0.343, global_step=1275.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  23%|██▎       | 1345/5971 [11:59<41:13,  1.87it/s, loss=0.195, v_num=0, train/loss_simple_step=0.00913, train/loss_vlb_step=4.42e-5, train/loss_step=0.00913, global_step=1276.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1346/5971 [12:00<41:14,  1.87it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00333, train/loss_vlb_step=1.83e-5, train/loss_step=0.00333, global_step=1276.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1347/5971 [12:01<41:14,  1.87it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0738, train/loss_vlb_step=0.000254, train/loss_step=0.0738, global_step=1276.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  23%|██▎       | 1348/5971 [12:03<41:20,  1.86it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0738, train/loss_vlb_step=0.000254, train/loss_step=0.0738, global_step=1276.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1348/5971 [12:03<41:20,  1.86it/s, loss=0.184, v_num=0, train/loss_simple_step=0.530, train/loss_vlb_step=0.00368, train/loss_step=0.530, global_step=1276.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  23%|██▎       | 1349/5971 [12:04<41:21,  1.86it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0622, train/loss_vlb_step=0.000205, train/loss_step=0.0622, global_step=1277.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1350/5971 [12:05<41:22,  1.86it/s, loss=0.169, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000358, train/loss_step=0.108, global_step=1277.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  23%|██▎       | 1351/5971 [12:06<41:22,  1.86it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000225, train/loss_step=0.0642, global_step=1277.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1352/5971 [12:08<41:27,  1.86it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000225, train/loss_step=0.0642, global_step=1277.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1352/5971 [12:08<41:27,  1.86it/s, loss=0.182, v_num=0, train/loss_simple_step=0.779, train/loss_vlb_step=0.0292, train/loss_step=0.779, global_step=1277.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  23%|██▎       | 1353/5971 [12:09<41:28,  1.86it/s, loss=0.2, v_num=0, train/loss_simple_step=0.442, train/loss_vlb_step=0.00586, train/loss_step=0.442, global_step=1278.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  23%|██▎       | 1354/5971 [12:10<41:29,  1.85it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0744, train/loss_vlb_step=0.000259, train/loss_step=0.0744, global_step=1278.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1355/5971 [12:11<41:29,  1.85it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=6.1e-5, train/loss_step=0.0144, global_step=1278.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  23%|██▎       | 1356/5971 [12:13<41:34,  1.85it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=6.1e-5, train/loss_step=0.0144, global_step=1278.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1356/5971 [12:13<41:34,  1.85it/s, loss=0.191, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000361, train/loss_step=0.108, global_step=1278.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1357/5971 [12:14<41:35,  1.85it/s, loss=0.214, v_num=0, train/loss_simple_step=0.463, train/loss_vlb_step=0.00296, train/loss_step=0.463, global_step=1279.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  23%|██▎       | 1358/5971 [12:15<41:36,  1.85it/s, loss=0.214, v_num=0, train/loss_simple_step=0.279, train/loss_vlb_step=0.000979, train/loss_step=0.279, global_step=1279.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1359/5971 [12:16<41:36,  1.85it/s, loss=0.209, v_num=0, train/loss_simple_step=0.0426, train/loss_vlb_step=0.000151, train/loss_step=0.0426, global_step=1279.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1360/5971 [12:18<41:42,  1.84it/s, loss=0.209, v_num=0, train/loss_simple_step=0.0426, train/loss_vlb_step=0.000151, train/loss_step=0.0426, global_step=1279.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1360/5971 [12:18<41:42,  1.84it/s, loss=0.204, v_num=0, train/loss_simple_step=0.00795, train/loss_vlb_step=3.51e-5, train/loss_step=0.00795, global_step=1279.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1361/5971 [12:19<41:44,  1.84it/s, loss=0.181, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.64e-5, train/loss_step=0.00293, global_step=1280.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1362/5971 [12:20<41:44,  1.84it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.82e-5, train/loss_step=0.0157, global_step=1280.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  23%|██▎       | 1363/5971 [12:21<41:45,  1.84it/s, loss=0.207, v_num=0, train/loss_simple_step=0.708, train/loss_vlb_step=0.0233, train/loss_step=0.708, global_step=1280.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  23%|██▎       | 1364/5971 [12:23<41:50,  1.84it/s, loss=0.207, v_num=0, train/loss_simple_step=0.708, train/loss_vlb_step=0.0233, train/loss_step=0.708, global_step=1280.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1364/5971 [12:23<41:50,  1.84it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0363, train/loss_vlb_step=0.00013, train/loss_step=0.0363, global_step=1280.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1365/5971 [12:24<41:51,  1.83it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0427, train/loss_vlb_step=0.000161, train/loss_step=0.0427, global_step=1281.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1366/5971 [12:25<41:51,  1.83it/s, loss=0.216, v_num=0, train/loss_simple_step=0.456, train/loss_vlb_step=0.00282, train/loss_step=0.456, global_step=1281.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  23%|██▎       | 1367/5971 [12:26<41:52,  1.83it/s, loss=0.213, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.26e-5, train/loss_step=0.018, global_step=1281.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1368/5971 [12:28<41:57,  1.83it/s, loss=0.213, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.26e-5, train/loss_step=0.018, global_step=1281.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1368/5971 [12:28<41:57,  1.83it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0326, train/loss_vlb_step=0.000123, train/loss_step=0.0326, global_step=1281.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1369/5971 [12:29<41:58,  1.83it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0223, train/loss_vlb_step=8.77e-5, train/loss_step=0.0223, global_step=1282.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  23%|██▎       | 1370/5971 [12:30<41:58,  1.83it/s, loss=0.193, v_num=0, train/loss_simple_step=0.246, train/loss_vlb_step=0.000975, train/loss_step=0.246, global_step=1282.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  23%|██▎       | 1371/5971 [12:31<41:59,  1.83it/s, loss=0.19, v_num=0, train/loss_simple_step=0.003, train/loss_vlb_step=1.69e-5, train/loss_step=0.003, global_step=1282.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  23%|██▎       | 1372/5971 [12:33<42:03,  1.82it/s, loss=0.19, v_num=0, train/loss_simple_step=0.003, train/loss_vlb_step=1.69e-5, train/loss_step=0.003, global_step=1282.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1372/5971 [12:33<42:03,  1.82it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00274, train/loss_vlb_step=1.58e-5, train/loss_step=0.00274, global_step=1282.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1373/5971 [12:34<42:04,  1.82it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00143, train/loss_vlb_step=8.36e-6, train/loss_step=0.00143, global_step=1283.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1374/5971 [12:35<42:04,  1.82it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0787, train/loss_vlb_step=0.000266, train/loss_step=0.0787, global_step=1283.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  23%|██▎       | 1375/5971 [12:36<42:05,  1.82it/s, loss=0.135, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000435, train/loss_step=0.131, global_step=1283.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  23%|██▎       | 1376/5971 [12:38<42:10,  1.82it/s, loss=0.135, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000435, train/loss_step=0.131, global_step=1283.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1376/5971 [12:38<42:10,  1.82it/s, loss=0.145, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00116, train/loss_step=0.308, global_step=1283.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  23%|██▎       | 1377/5971 [12:39<42:11,  1.82it/s, loss=0.132, v_num=0, train/loss_simple_step=0.212, train/loss_vlb_step=0.000986, train/loss_step=0.212, global_step=1284.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1378/5971 [12:40<42:11,  1.81it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0502, train/loss_vlb_step=0.000169, train/loss_step=0.0502, global_step=1284.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1379/5971 [12:40<42:12,  1.81it/s, loss=0.14, v_num=0, train/loss_simple_step=0.417, train/loss_vlb_step=0.00205, train/loss_step=0.417, global_step=1284.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  23%|██▎       | 1380/5971 [12:43<42:16,  1.81it/s, loss=0.14, v_num=0, train/loss_simple_step=0.417, train/loss_vlb_step=0.00205, train/loss_step=0.417, global_step=1284.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1380/5971 [12:43<42:16,  1.81it/s, loss=0.149, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000666, train/loss_step=0.184, global_step=1284.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1381/5971 [12:43<42:17,  1.81it/s, loss=0.155, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000421, train/loss_step=0.128, global_step=1285.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1382/5971 [12:44<42:17,  1.81it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0763, train/loss_vlb_step=0.000264, train/loss_step=0.0763, global_step=1285.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1383/5971 [12:45<42:18,  1.81it/s, loss=0.131, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000566, train/loss_step=0.168, global_step=1285.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  23%|██▎       | 1384/5971 [12:47<42:23,  1.80it/s, loss=0.131, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000566, train/loss_step=0.168, global_step=1285.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1384/5971 [12:47<42:23,  1.80it/s, loss=0.138, v_num=0, train/loss_simple_step=0.183, train/loss_vlb_step=0.000614, train/loss_step=0.183, global_step=1285.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1385/5971 [12:48<42:23,  1.80it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0271, train/loss_vlb_step=9.59e-5, train/loss_step=0.0271, global_step=1286.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1386/5971 [12:49<42:24,  1.80it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000108, train/loss_step=0.0285, global_step=1286.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1387/5971 [12:50<42:24,  1.80it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00331, train/loss_vlb_step=1.87e-5, train/loss_step=0.00331, global_step=1286.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1388/5971 [12:52<42:29,  1.80it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00331, train/loss_vlb_step=1.87e-5, train/loss_step=0.00331, global_step=1286.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1388/5971 [12:52<42:29,  1.80it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0049, train/loss_vlb_step=2.38e-5, train/loss_step=0.0049, global_step=1286.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  23%|██▎       | 1389/5971 [12:53<42:30,  1.80it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0674, train/loss_vlb_step=0.000226, train/loss_step=0.0674, global_step=1287.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1390/5971 [12:54<42:30,  1.80it/s, loss=0.118, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.00118, train/loss_step=0.283, global_step=1287.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  23%|██▎       | 1391/5971 [12:55<42:31,  1.80it/s, loss=0.132, v_num=0, train/loss_simple_step=0.281, train/loss_vlb_step=0.00107, train/loss_step=0.281, global_step=1287.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1392/5971 [12:57<42:35,  1.79it/s, loss=0.132, v_num=0, train/loss_simple_step=0.281, train/loss_vlb_step=0.00107, train/loss_step=0.281, global_step=1287.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1392/5971 [12:57<42:35,  1.79it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0538, train/loss_vlb_step=0.000184, train/loss_step=0.0538, global_step=1287.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1393/5971 [12:58<42:36,  1.79it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00384, train/loss_vlb_step=1.96e-5, train/loss_step=0.00384, global_step=1288.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1394/5971 [12:59<42:36,  1.79it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0191, train/loss_vlb_step=7.65e-5, train/loss_step=0.0191, global_step=1288.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  23%|██▎       | 1395/5971 [13:00<42:37,  1.79it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0754, train/loss_vlb_step=0.000251, train/loss_step=0.0754, global_step=1288.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1396/5971 [13:02<42:42,  1.79it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0754, train/loss_vlb_step=0.000251, train/loss_step=0.0754, global_step=1288.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1396/5971 [13:02<42:42,  1.79it/s, loss=0.121, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000597, train/loss_step=0.163, global_step=1288.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  23%|██▎       | 1397/5971 [13:03<42:42,  1.78it/s, loss=0.122, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.000941, train/loss_step=0.230, global_step=1289.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1398/5971 [13:04<42:43,  1.78it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.000124, train/loss_step=0.0325, global_step=1289.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1399/5971 [13:04<42:43,  1.78it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0424, train/loss_vlb_step=0.000154, train/loss_step=0.0424, global_step=1289.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1400/5971 [13:07<42:48,  1.78it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0424, train/loss_vlb_step=0.000154, train/loss_step=0.0424, global_step=1289.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1400/5971 [13:07<42:48,  1.78it/s, loss=0.094, v_num=0, train/loss_simple_step=0.00869, train/loss_vlb_step=3.75e-5, train/loss_step=0.00869, global_step=1289.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1401/5971 [13:08<42:48,  1.78it/s, loss=0.0892, v_num=0, train/loss_simple_step=0.0323, train/loss_vlb_step=0.000121, train/loss_step=0.0323, global_step=1290.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  23%|██▎       | 1402/5971 [13:08<42:49,  1.78it/s, loss=0.106, v_num=0, train/loss_simple_step=0.404, train/loss_vlb_step=0.00283, train/loss_step=0.404, global_step=1290.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  23%|██▎       | 1403/5971 [13:09<42:49,  1.78it/s, loss=0.0973, v_num=0, train/loss_simple_step=0.00223, train/loss_vlb_step=1.3e-5, train/loss_step=0.00223, global_step=1290.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▎       | 1404/5971 [13:12<42:54,  1.77it/s, loss=0.0973, v_num=0, train/loss_simple_step=0.00223, train/loss_vlb_step=1.3e-5, train/loss_step=0.00223, global_step=1290.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▎       | 1404/5971 [13:12<42:54,  1.77it/s, loss=0.0888, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=5.83e-5, train/loss_step=0.0138, global_step=1290.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  24%|██▎       | 1405/5971 [13:13<42:55,  1.77it/s, loss=0.0896, v_num=0, train/loss_simple_step=0.0432, train/loss_vlb_step=0.000153, train/loss_step=0.0432, global_step=1291.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▎       | 1406/5971 [13:13<42:55,  1.77it/s, loss=0.0884, v_num=0, train/loss_simple_step=0.00348, train/loss_vlb_step=1.89e-5, train/loss_step=0.00348, global_step=1291.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▎       | 1407/5971 [13:14<42:56,  1.77it/s, loss=0.0888, v_num=0, train/loss_simple_step=0.0117, train/loss_vlb_step=4.92e-5, train/loss_step=0.0117, global_step=1291.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  24%|██▎       | 1408/5971 [13:17<43:01,  1.77it/s, loss=0.0888, v_num=0, train/loss_simple_step=0.0117, train/loss_vlb_step=4.92e-5, train/loss_step=0.0117, global_step=1291.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▎       | 1408/5971 [13:17<43:01,  1.77it/s, loss=0.0994, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000795, train/loss_step=0.216, global_step=1291.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  24%|██▎       | 1409/5971 [13:17<43:01,  1.77it/s, loss=0.0981, v_num=0, train/loss_simple_step=0.042, train/loss_vlb_step=0.000145, train/loss_step=0.042, global_step=1292.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▎       | 1410/5971 [13:18<43:02,  1.77it/s, loss=0.0873, v_num=0, train/loss_simple_step=0.067, train/loss_vlb_step=0.000226, train/loss_step=0.067, global_step=1292.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▎       | 1411/5971 [13:19<43:02,  1.77it/s, loss=0.101, v_num=0, train/loss_simple_step=0.561, train/loss_vlb_step=0.0157, train/loss_step=0.561, global_step=1292.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  24%|██▎       | 1412/5971 [13:22<43:07,  1.76it/s, loss=0.101, v_num=0, train/loss_simple_step=0.561, train/loss_vlb_step=0.0157, train/loss_step=0.561, global_step=1292.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▎       | 1412/5971 [13:22<43:07,  1.76it/s, loss=0.112, v_num=0, train/loss_simple_step=0.276, train/loss_vlb_step=0.00113, train/loss_step=0.276, global_step=1292.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▎       | 1413/5971 [13:22<43:08,  1.76it/s, loss=0.124, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000893, train/loss_step=0.233, global_step=1293.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▎       | 1414/5971 [13:23<43:08,  1.76it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00542, train/loss_vlb_step=2.63e-5, train/loss_step=0.00542, global_step=1293.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▎       | 1415/5971 [13:24<43:08,  1.76it/s, loss=0.13, v_num=0, train/loss_simple_step=0.212, train/loss_vlb_step=0.000764, train/loss_step=0.212, global_step=1293.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  24%|██▎       | 1416/5971 [13:27<43:14,  1.76it/s, loss=0.13, v_num=0, train/loss_simple_step=0.212, train/loss_vlb_step=0.000764, train/loss_step=0.212, global_step=1293.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▎       | 1416/5971 [13:27<43:14,  1.76it/s, loss=0.14, v_num=0, train/loss_simple_step=0.373, train/loss_vlb_step=0.00237, train/loss_step=0.373, global_step=1293.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  24%|██▎       | 1417/5971 [13:28<43:15,  1.75it/s, loss=0.14, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000814, train/loss_step=0.216, global_step=1294.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▎       | 1418/5971 [13:29<43:15,  1.75it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.8e-5, train/loss_step=0.0105, global_step=1294.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▍       | 1419/5971 [13:29<43:16,  1.75it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00369, train/loss_vlb_step=1.95e-5, train/loss_step=0.00369, global_step=1294.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▍       | 1420/5971 [13:32<43:21,  1.75it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00369, train/loss_vlb_step=1.95e-5, train/loss_step=0.00369, global_step=1294.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▍       | 1420/5971 [13:32<43:21,  1.75it/s, loss=0.143, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000497, train/loss_step=0.141, global_step=1294.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  24%|██▍       | 1421/5971 [13:33<43:21,  1.75it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0782, train/loss_vlb_step=0.000268, train/loss_step=0.0782, global_step=1295.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▍       | 1422/5971 [13:34<43:22,  1.75it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0972, train/loss_vlb_step=0.000321, train/loss_step=0.0972, global_step=1295.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  24%|██▍       | 1423/5971 [13:34<43:22,  1.75it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0665, train/loss_vlb_step=0.000225, train/loss_step=0.0665, global_step=1295.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▍       | 1424/5971 [13:37<43:27,  1.74it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0665, train/loss_vlb_step=0.000225, train/loss_step=0.0665, global_step=1295.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▍       | 1424/5971 [13:37<43:27,  1.74it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00284, train/loss_vlb_step=1.58e-5, train/loss_step=0.00284, global_step=1295.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▍       | 1425/5971 [13:38<43:27,  1.74it/s, loss=0.155, v_num=0, train/loss_simple_step=0.490, train/loss_vlb_step=0.0057, train/loss_step=0.490, global_step=1296.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2:  24%|██▍       | 1426/5971 [13:38<43:28,  1.74it/s, loss=0.163, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000501, train/loss_step=0.151, global_step=1296.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▍       | 1427/5971 [13:39<43:28,  1.74it/s, loss=0.172, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.00071, train/loss_step=0.196, global_step=1296.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  24%|██▍       | 1428/5971 [13:41<43:33,  1.74it/s, loss=0.172, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.00071, train/loss_step=0.196, global_step=1296.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▍       | 1428/5971 [13:41<43:33,  1.74it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0478, train/loss_vlb_step=0.000163, train/loss_step=0.0478, global_step=1296.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▍       | 1429/5971 [13:42<43:33,  1.74it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0913, train/loss_vlb_step=0.000314, train/loss_step=0.0913, global_step=1297.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▍       | 1430/5971 [13:43<43:33,  1.74it/s, loss=0.173, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000697, train/loss_step=0.205, global_step=1297.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  24%|██▍       | 1431/5971 [13:44<43:34,  1.74it/s, loss=0.158, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00129, train/loss_step=0.269, global_step=1297.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  24%|██▍       | 1432/5971 [13:46<43:38,  1.73it/s, loss=0.158, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00129, train/loss_step=0.269, global_step=1297.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▍       | 1432/5971 [13:46<43:38,  1.73it/s, loss=0.151, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.00041, train/loss_step=0.122, global_step=1297.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▍       | 1433/5971 [13:47<43:38,  1.73it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0365, train/loss_vlb_step=0.000137, train/loss_step=0.0365, global_step=1298.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▍       | 1434/5971 [13:48<43:39,  1.73it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00592, train/loss_vlb_step=2.89e-5, train/loss_step=0.00592, global_step=1298.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▍       | 1435/5971 [13:49<43:39,  1.73it/s, loss=0.14, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000681, train/loss_step=0.188, global_step=1298.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  24%|██▍       | 1436/5971 [13:51<43:44,  1.73it/s, loss=0.14, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000681, train/loss_step=0.188, global_step=1298.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▍       | 1436/5971 [13:51<43:44,  1.73it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00658, train/loss_vlb_step=3.1e-5, train/loss_step=0.00658, global_step=1298.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▍       | 1437/5971 [13:52<43:44,  1.73it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00594, train/loss_vlb_step=2.96e-5, train/loss_step=0.00594, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▍       | 1438/5971 [13:53<43:44,  1.73it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=5.51e-5, train/loss_step=0.0118, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  24%|██▍       | 1439/5971 [13:54<43:45,  1.73it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00189, train/loss_vlb_step=1.11e-5, train/loss_step=0.00189, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▍       | 1440/5971 [13:56<43:49,  1.72it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00189, train/loss_vlb_step=1.11e-5, train/loss_step=0.00189, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  24%|██▍       | 1440/5971 [13:56<43:49,  1.72it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:10,  2.34it/s][A

Validating:   1%|          | 2/167 [00:00<00:46,  3.58it/s][A
Epoch 2:  24%|██▍       | 1444/5971 [13:56<43:41,  1.73it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.33it/s][A
Epoch 2:  24%|██▍       | 1448/5971 [13:57<43:32,  1.73it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:00<00:11, 14.03it/s][A

Validating:   7%|▋         | 11/167 [00:00<00:08, 17.80it/s][A
Epoch 2:  24%|██▍       | 1452/5971 [13:57<43:23,  1.74it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   8%|▊         | 14/167 [00:01<00:07, 20.39it/s][A
Epoch 2:  24%|██▍       | 1456/5971 [13:57<43:14,  1.74it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  10%|█         | 17/167 [00:01<00:06, 22.01it/s][A
Epoch 2:  24%|██▍       | 1460/5971 [13:57<43:05,  1.74it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.70it/s][A

Validating:  14%|█▍        | 23/167 [00:01<00:06, 22.67it/s][A
Epoch 2:  25%|██▍       | 1464/5971 [13:57<42:57,  1.75it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.17it/s][A
Epoch 2:  25%|██▍       | 1468/5971 [13:57<42:48,  1.75it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 25.53it/s][A
Epoch 2:  25%|██▍       | 1472/5971 [13:57<42:39,  1.76it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 25.86it/s][A

Validating:  21%|██        | 35/167 [00:01<00:05, 26.18it/s][A
Epoch 2:  25%|██▍       | 1476/5971 [13:58<42:30,  1.76it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  23%|██▎       | 38/167 [00:01<00:04, 27.01it/s][A
Epoch 2:  25%|██▍       | 1480/5971 [13:58<42:22,  1.77it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 27.12it/s][A
Epoch 2:  25%|██▍       | 1484/5971 [13:58<42:13,  1.77it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 27.82it/s][A

Validating:  28%|██▊       | 47/167 [00:02<00:04, 27.88it/s][A
Epoch 2:  25%|██▍       | 1488/5971 [13:58<42:04,  1.78it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 27.32it/s][A
Epoch 2:  25%|██▍       | 1492/5971 [13:58<41:56,  1.78it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 26.19it/s][A
Epoch 2:  25%|██▌       | 1496/5971 [13:58<41:47,  1.78it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 26.43it/s][A

Validating:  35%|███▌      | 59/167 [00:02<00:04, 26.03it/s][A
Epoch 2:  25%|██▌       | 1500/5971 [13:59<41:39,  1.79it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  37%|███▋      | 62/167 [00:02<00:04, 25.87it/s][A
Epoch 2:  25%|██▌       | 1504/5971 [13:59<41:30,  1.79it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  39%|███▉      | 65/167 [00:02<00:03, 26.77it/s][A
Epoch 2:  25%|██▌       | 1508/5971 [13:59<41:22,  1.80it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  41%|████      | 68/167 [00:03<00:03, 26.76it/s][A

Validating:  43%|████▎     | 71/167 [00:03<00:03, 26.46it/s][A
Epoch 2:  25%|██▌       | 1512/5971 [13:59<41:14,  1.80it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 27.28it/s][A
Epoch 2:  25%|██▌       | 1516/5971 [13:59<41:05,  1.81it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 27.38it/s][A
Epoch 2:  25%|██▌       | 1520/5971 [13:59<40:57,  1.81it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 27.52it/s][A
Epoch 2:  26%|██▌       | 1524/5971 [13:59<40:49,  1.82it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  50%|█████     | 84/167 [00:03<00:02, 28.15it/s][A

Validating:  52%|█████▏    | 87/167 [00:03<00:02, 28.42it/s][A
Epoch 2:  26%|██▌       | 1528/5971 [14:00<40:41,  1.82it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  54%|█████▍    | 90/167 [00:03<00:02, 26.70it/s][A
Epoch 2:  26%|██▌       | 1532/5971 [14:00<40:32,  1.82it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 26.05it/s][A
Epoch 2:  26%|██▌       | 1536/5971 [14:00<40:24,  1.83it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 25.73it/s][A

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 26.41it/s][A
Epoch 2:  26%|██▌       | 1540/5971 [14:00<40:16,  1.83it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  61%|██████    | 102/167 [00:04<00:02, 27.18it/s][A
Epoch 2:  26%|██▌       | 1544/5971 [14:00<40:08,  1.84it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 27.43it/s][A
Epoch 2:  26%|██▌       | 1548/5971 [14:00<40:00,  1.84it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 27.03it/s][A

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 26.34it/s][A
Epoch 2:  26%|██▌       | 1552/5971 [14:00<39:52,  1.85it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  68%|██████▊   | 114/167 [00:04<00:01, 26.98it/s][A
Epoch 2:  26%|██▌       | 1556/5971 [14:01<39:45,  1.85it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  70%|███████   | 117/167 [00:04<00:01, 26.38it/s][A
Epoch 2:  26%|██▌       | 1560/5971 [14:01<39:37,  1.86it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 26.79it/s][A

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 27.14it/s][A
Epoch 2:  26%|██▌       | 1564/5971 [14:01<39:29,  1.86it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 25.11it/s][A
Epoch 2:  26%|██▋       | 1568/5971 [14:01<39:21,  1.86it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 25.42it/s][A
Epoch 2:  26%|██▋       | 1572/5971 [14:01<39:13,  1.87it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 25.36it/s][A
Epoch 2:  26%|██▋       | 1576/5971 [14:01<39:06,  1.87it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 26.83it/s][A

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 27.37it/s][A
Epoch 2:  26%|██▋       | 1580/5971 [14:02<38:58,  1.88it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  85%|████████▌ | 142/167 [00:05<00:00, 26.57it/s][A
Epoch 2:  27%|██▋       | 1584/5971 [14:02<38:51,  1.88it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  87%|████████▋ | 145/167 [00:05<00:00, 26.66it/s][A
Epoch 2:  27%|██▋       | 1588/5971 [14:02<38:43,  1.89it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 26.34it/s][A

Validating:  90%|█████████ | 151/167 [00:06<00:00, 25.55it/s][A
Epoch 2:  27%|██▋       | 1592/5971 [14:02<38:35,  1.89it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 22.19it/s][A
Epoch 2:  27%|██▋       | 1596/5971 [14:02<38:28,  1.90it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 20.86it/s][A
Epoch 2:  27%|██▋       | 1600/5971 [14:02<38:21,  1.90it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 22.79it/s][A

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 23.57it/s][A
Epoch 2:  27%|██▋       | 1604/5971 [14:03<38:13,  1.90it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 24.63it/s][A
Epoch 2:  27%|██▋       | 1608/5971 [14:03<38:06,  1.91it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1608/5971 [14:03<38:07,  1.91it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=1299.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.35it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.42it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.24it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.85it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.37it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.75it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.89it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.05it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.15it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.28it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.41it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.51it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.52it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.45it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.40it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.33it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.31it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:06,  5.31it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.41it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.49it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.54it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.45it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.53it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.57it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.61it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.63it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.66it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.67it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.68it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.67it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.68it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.69it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:02,  5.70it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.69it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.67it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.63it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.55it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.48it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.44it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.51it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.55it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.59it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.63it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.65it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.67it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.67it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.68it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.69it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.69it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.69it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.24it/s]

Epoch 2:  27%|██▋       | 1609/5971 [14:15<38:37,  1.88it/s, loss=0.106, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.00037, train/loss_step=0.112, global_step=1300.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  27%|██▋       | 1609/5971 [14:15<38:39,  1.88it/s, loss=0.106, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.00037, train/loss_step=0.112, global_step=1300.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.33it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.42it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.26it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.89it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.37it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.72it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.98it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.16it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.30it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.37it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.43it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.50it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.54it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.57it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.60it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.63it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.66it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.68it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.68it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.64it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.59it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.62it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.66it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.68it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.67it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.66it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.67it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.69it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.68it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.68it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.68it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.68it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.67it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.68it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.69it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.69it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.69it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.69it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.70it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.69it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.70it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.70it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.69it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.68it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.67it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.65it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:08<00:00,  5.67it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.67it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.67it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.68it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.31it/s]

Epoch 2:  27%|██▋       | 1610/5971 [14:27<39:07,  1.86it/s, loss=0.106, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.00037, train/loss_step=0.112, global_step=1300.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1610/5971 [14:27<39:07,  1.86it/s, loss=0.109, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=1300.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.31it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.36it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.17it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.85it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.35it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.73it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.00it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.06it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.18it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.26it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.30it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.40it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.49it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.54it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.57it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.57it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.55it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.54it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.56it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.59it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.61it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.61it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.59it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.62it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.64it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.64it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.66it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.66it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.65it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.63it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.57it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.48it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.36it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:03,  5.31it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.28it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.26it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.25it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.25it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.24it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.24it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.24it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.22it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.21it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.20it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.20it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.19it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.19it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.20it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.20it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.20it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.10it/s]

Epoch 2:  27%|██▋       | 1611/5971 [14:39<39:38,  1.83it/s, loss=0.109, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=1300.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1611/5971 [14:39<39:38,  1.83it/s, loss=0.11, v_num=0, train/loss_simple_step=0.094, train/loss_vlb_step=0.000313, train/loss_step=0.094, global_step=1300.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.44it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.28it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.93it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.42it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.78it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.04it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.22it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.35it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.44it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.49it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.52it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.55it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.56it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.58it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.60it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.62it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.63it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.64it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.63it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.64it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.65it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.65it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.65it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:04<00:04,  5.63it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.54it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.44it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.38it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.34it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.35it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.35it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.40it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.46it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.51it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.54it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.57it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.50it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.42it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.36it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.33it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.25it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.19it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.19it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.19it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.20it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.21it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.23it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.24it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.25it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.23it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.15it/s]

Epoch 2:  27%|██▋       | 1612/5971 [14:52<40:12,  1.81it/s, loss=0.11, v_num=0, train/loss_simple_step=0.094, train/loss_vlb_step=0.000313, train/loss_step=0.094, global_step=1300.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1612/5971 [14:52<40:12,  1.81it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00173, train/loss_vlb_step=1.02e-5, train/loss_step=0.00173, global_step=1300.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1613/5971 [14:53<40:13,  1.81it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00173, train/loss_vlb_step=1.02e-5, train/loss_step=0.00173, global_step=1300.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1613/5971 [14:53<40:13,  1.81it/s, loss=0.0855, v_num=0, train/loss_simple_step=0.0034, train/loss_vlb_step=1.84e-5, train/loss_step=0.0034, global_step=1301.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1614/5971 [14:54<40:13,  1.81it/s, loss=0.0855, v_num=0, train/loss_simple_step=0.0034, train/loss_vlb_step=1.84e-5, train/loss_step=0.0034, global_step=1301.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1614/5971 [14:54<40:13,  1.81it/s, loss=0.0781, v_num=0, train/loss_simple_step=0.00204, train/loss_vlb_step=1.18e-5, train/loss_step=0.00204, global_step=1301.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1615/5971 [14:55<40:13,  1.80it/s, loss=0.0781, v_num=0, train/loss_simple_step=0.00204, train/loss_vlb_step=1.18e-5, train/loss_step=0.00204, global_step=1301.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1615/5971 [14:55<40:13,  1.80it/s, loss=0.111, v_num=0, train/loss_simple_step=0.855, train/loss_vlb_step=0.0872, train/loss_step=0.855, global_step=1301.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]      
Epoch 2:  27%|██▋       | 1616/5971 [14:57<40:17,  1.80it/s, loss=0.111, v_num=0, train/loss_simple_step=0.855, train/loss_vlb_step=0.0872, train/loss_step=0.855, global_step=1301.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1616/5971 [14:57<40:17,  1.80it/s, loss=0.114, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000343, train/loss_step=0.102, global_step=1301.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1617/5971 [14:58<40:17,  1.80it/s, loss=0.114, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000343, train/loss_step=0.102, global_step=1301.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1617/5971 [14:58<40:17,  1.80it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0611, train/loss_vlb_step=0.000206, train/loss_step=0.0611, global_step=1302.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1618/5971 [14:59<40:17,  1.80it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0611, train/loss_vlb_step=0.000206, train/loss_step=0.0611, global_step=1302.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1618/5971 [14:59<40:17,  1.80it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0852, train/loss_vlb_step=0.00028, train/loss_step=0.0852, global_step=1302.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  27%|██▋       | 1619/5971 [15:00<40:18,  1.80it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0852, train/loss_vlb_step=0.00028, train/loss_step=0.0852, global_step=1302.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1619/5971 [15:00<40:18,  1.80it/s, loss=0.0931, v_num=0, train/loss_simple_step=0.00632, train/loss_vlb_step=3.06e-5, train/loss_step=0.00632, global_step=1302.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1620/5971 [15:02<40:22,  1.80it/s, loss=0.0931, v_num=0, train/loss_simple_step=0.00632, train/loss_vlb_step=3.06e-5, train/loss_step=0.00632, global_step=1302.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1620/5971 [15:02<40:22,  1.80it/s, loss=0.128, v_num=0, train/loss_simple_step=0.811, train/loss_vlb_step=0.0691, train/loss_step=0.811, global_step=1302.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]      
Epoch 2:  27%|██▋       | 1621/5971 [15:03<40:22,  1.80it/s, loss=0.128, v_num=0, train/loss_simple_step=0.811, train/loss_vlb_step=0.0691, train/loss_step=0.811, global_step=1302.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1621/5971 [15:03<40:22,  1.80it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0344, train/loss_vlb_step=0.000128, train/loss_step=0.0344, global_step=1303.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1622/5971 [15:04<40:22,  1.80it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0344, train/loss_vlb_step=0.000128, train/loss_step=0.0344, global_step=1303.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1622/5971 [15:04<40:22,  1.80it/s, loss=0.149, v_num=0, train/loss_simple_step=0.432, train/loss_vlb_step=0.00327, train/loss_step=0.432, global_step=1303.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  27%|██▋       | 1623/5971 [15:05<40:23,  1.79it/s, loss=0.149, v_num=0, train/loss_simple_step=0.432, train/loss_vlb_step=0.00327, train/loss_step=0.432, global_step=1303.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1623/5971 [15:05<40:23,  1.79it/s, loss=0.189, v_num=0, train/loss_simple_step=0.986, train/loss_vlb_step=0.496, train/loss_step=0.986, global_step=1303.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  27%|██▋       | 1624/5971 [15:07<40:26,  1.79it/s, loss=0.189, v_num=0, train/loss_simple_step=0.986, train/loss_vlb_step=0.496, train/loss_step=0.986, global_step=1303.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1624/5971 [15:07<40:26,  1.79it/s, loss=0.212, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00318, train/loss_step=0.480, global_step=1303.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1625/5971 [15:08<40:26,  1.79it/s, loss=0.212, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00318, train/loss_step=0.480, global_step=1303.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1625/5971 [15:08<40:26,  1.79it/s, loss=0.214, v_num=0, train/loss_simple_step=0.0344, train/loss_vlb_step=0.000126, train/loss_step=0.0344, global_step=1304.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1626/5971 [15:08<40:27,  1.79it/s, loss=0.214, v_num=0, train/loss_simple_step=0.0344, train/loss_vlb_step=0.000126, train/loss_step=0.0344, global_step=1304.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1626/5971 [15:08<40:27,  1.79it/s, loss=0.215, v_num=0, train/loss_simple_step=0.0441, train/loss_vlb_step=0.000153, train/loss_step=0.0441, global_step=1304.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1627/5971 [15:09<40:27,  1.79it/s, loss=0.215, v_num=0, train/loss_simple_step=0.0441, train/loss_vlb_step=0.000153, train/loss_step=0.0441, global_step=1304.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1627/5971 [15:09<40:27,  1.79it/s, loss=0.218, v_num=0, train/loss_simple_step=0.0483, train/loss_vlb_step=0.000174, train/loss_step=0.0483, global_step=1304.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1628/5971 [15:11<40:31,  1.79it/s, loss=0.218, v_num=0, train/loss_simple_step=0.0483, train/loss_vlb_step=0.000174, train/loss_step=0.0483, global_step=1304.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1628/5971 [15:11<40:31,  1.79it/s, loss=0.222, v_num=0, train/loss_simple_step=0.0938, train/loss_vlb_step=0.000311, train/loss_step=0.0938, global_step=1304.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1629/5971 [15:12<40:31,  1.79it/s, loss=0.222, v_num=0, train/loss_simple_step=0.0938, train/loss_vlb_step=0.000311, train/loss_step=0.0938, global_step=1304.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1629/5971 [15:12<40:31,  1.79it/s, loss=0.217, v_num=0, train/loss_simple_step=0.00905, train/loss_vlb_step=4.15e-5, train/loss_step=0.00905, global_step=1305.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1630/5971 [15:13<40:31,  1.79it/s, loss=0.217, v_num=0, train/loss_simple_step=0.00905, train/loss_vlb_step=4.15e-5, train/loss_step=0.00905, global_step=1305.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1630/5971 [15:13<40:31,  1.79it/s, loss=0.215, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000365, train/loss_step=0.110, global_step=1305.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  27%|██▋       | 1631/5971 [15:14<40:32,  1.78it/s, loss=0.215, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000365, train/loss_step=0.110, global_step=1305.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1631/5971 [15:14<40:32,  1.78it/s, loss=0.242, v_num=0, train/loss_simple_step=0.639, train/loss_vlb_step=0.00884, train/loss_step=0.639, global_step=1305.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  27%|██▋       | 1632/5971 [15:17<40:36,  1.78it/s, loss=0.242, v_num=0, train/loss_simple_step=0.639, train/loss_vlb_step=0.00884, train/loss_step=0.639, global_step=1305.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1632/5971 [15:17<40:36,  1.78it/s, loss=0.263, v_num=0, train/loss_simple_step=0.415, train/loss_vlb_step=0.00279, train/loss_step=0.415, global_step=1305.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1633/5971 [15:18<40:37,  1.78it/s, loss=0.263, v_num=0, train/loss_simple_step=0.415, train/loss_vlb_step=0.00279, train/loss_step=0.415, global_step=1305.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1633/5971 [15:18<40:37,  1.78it/s, loss=0.265, v_num=0, train/loss_simple_step=0.0491, train/loss_vlb_step=0.000167, train/loss_step=0.0491, global_step=1306.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1634/5971 [15:19<40:37,  1.78it/s, loss=0.265, v_num=0, train/loss_simple_step=0.0491, train/loss_vlb_step=0.000167, train/loss_step=0.0491, global_step=1306.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1634/5971 [15:19<40:37,  1.78it/s, loss=0.269, v_num=0, train/loss_simple_step=0.0852, train/loss_vlb_step=0.000281, train/loss_step=0.0852, global_step=1306.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1635/5971 [15:19<40:38,  1.78it/s, loss=0.269, v_num=0, train/loss_simple_step=0.0852, train/loss_vlb_step=0.000281, train/loss_step=0.0852, global_step=1306.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1635/5971 [15:19<40:38,  1.78it/s, loss=0.234, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000509, train/loss_step=0.154, global_step=1306.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  27%|██▋       | 1636/5971 [15:22<40:41,  1.78it/s, loss=0.234, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000509, train/loss_step=0.154, global_step=1306.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1636/5971 [15:22<40:41,  1.78it/s, loss=0.23, v_num=0, train/loss_simple_step=0.0233, train/loss_vlb_step=9.22e-5, train/loss_step=0.0233, global_step=1306.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1637/5971 [15:23<40:42,  1.77it/s, loss=0.23, v_num=0, train/loss_simple_step=0.0233, train/loss_vlb_step=9.22e-5, train/loss_step=0.0233, global_step=1306.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1637/5971 [15:23<40:42,  1.77it/s, loss=0.233, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000419, train/loss_step=0.127, global_step=1307.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1638/5971 [15:23<40:42,  1.77it/s, loss=0.233, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000419, train/loss_step=0.127, global_step=1307.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1638/5971 [15:23<40:42,  1.77it/s, loss=0.229, v_num=0, train/loss_simple_step=0.00651, train/loss_vlb_step=3.24e-5, train/loss_step=0.00651, global_step=1307.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1639/5971 [15:24<40:42,  1.77it/s, loss=0.229, v_num=0, train/loss_simple_step=0.00651, train/loss_vlb_step=3.24e-5, train/loss_step=0.00651, global_step=1307.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1639/5971 [15:24<40:42,  1.77it/s, loss=0.248, v_num=0, train/loss_simple_step=0.374, train/loss_vlb_step=0.0022, train/loss_step=0.374, global_step=1307.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2:  27%|██▋       | 1640/5971 [15:26<40:46,  1.77it/s, loss=0.248, v_num=0, train/loss_simple_step=0.374, train/loss_vlb_step=0.0022, train/loss_step=0.374, global_step=1307.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1640/5971 [15:26<40:46,  1.77it/s, loss=0.207, v_num=0, train/loss_simple_step=0.00224, train/loss_vlb_step=1.24e-5, train/loss_step=0.00224, global_step=1307.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1641/5971 [15:27<40:46,  1.77it/s, loss=0.207, v_num=0, train/loss_simple_step=0.00224, train/loss_vlb_step=1.24e-5, train/loss_step=0.00224, global_step=1307.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1641/5971 [15:27<40:46,  1.77it/s, loss=0.206, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.64e-5, train/loss_step=0.00974, global_step=1308.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1642/5971 [15:28<40:47,  1.77it/s, loss=0.206, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.64e-5, train/loss_step=0.00974, global_step=1308.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  27%|██▋       | 1642/5971 [15:28<40:47,  1.77it/s, loss=0.192, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000504, train/loss_step=0.150, global_step=1308.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  28%|██▊       | 1643/5971 [15:29<40:47,  1.77it/s, loss=0.192, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000504, train/loss_step=0.150, global_step=1308.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1643/5971 [15:29<40:47,  1.77it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00372, train/loss_vlb_step=2.04e-5, train/loss_step=0.00372, global_step=1308.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1644/5971 [15:31<40:51,  1.77it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00372, train/loss_vlb_step=2.04e-5, train/loss_step=0.00372, global_step=1308.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1644/5971 [15:31<40:51,  1.77it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00352, train/loss_vlb_step=1.91e-5, train/loss_step=0.00352, global_step=1308.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1645/5971 [15:32<40:51,  1.76it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00352, train/loss_vlb_step=1.91e-5, train/loss_step=0.00352, global_step=1308.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1645/5971 [15:32<40:51,  1.76it/s, loss=0.139, v_num=0, train/loss_simple_step=0.429, train/loss_vlb_step=0.00313, train/loss_step=0.429, global_step=1309.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  28%|██▊       | 1646/5971 [15:33<40:52,  1.76it/s, loss=0.139, v_num=0, train/loss_simple_step=0.429, train/loss_vlb_step=0.00313, train/loss_step=0.429, global_step=1309.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1646/5971 [15:33<40:52,  1.76it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0722, train/loss_vlb_step=0.000239, train/loss_step=0.0722, global_step=1309.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1647/5971 [15:34<40:52,  1.76it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0722, train/loss_vlb_step=0.000239, train/loss_step=0.0722, global_step=1309.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1647/5971 [15:34<40:52,  1.76it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0304, train/loss_vlb_step=0.000115, train/loss_step=0.0304, global_step=1309.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1648/5971 [15:36<40:55,  1.76it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0304, train/loss_vlb_step=0.000115, train/loss_step=0.0304, global_step=1309.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1648/5971 [15:36<40:55,  1.76it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0723, train/loss_vlb_step=0.000238, train/loss_step=0.0723, global_step=1309.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1649/5971 [15:37<40:56,  1.76it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0723, train/loss_vlb_step=0.000238, train/loss_step=0.0723, global_step=1309.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1649/5971 [15:37<40:56,  1.76it/s, loss=0.153, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.0013, train/loss_step=0.308, global_step=1310.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  28%|██▊       | 1650/5971 [15:38<40:56,  1.76it/s, loss=0.153, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.0013, train/loss_step=0.308, global_step=1310.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1650/5971 [15:38<40:56,  1.76it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00803, train/loss_vlb_step=4.02e-5, train/loss_step=0.00803, global_step=1310.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1651/5971 [15:39<40:56,  1.76it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00803, train/loss_vlb_step=4.02e-5, train/loss_step=0.00803, global_step=1310.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1651/5971 [15:39<40:56,  1.76it/s, loss=0.126, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.000723, train/loss_step=0.197, global_step=1310.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  28%|██▊       | 1652/5971 [15:41<41:00,  1.76it/s, loss=0.126, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.000723, train/loss_step=0.197, global_step=1310.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1652/5971 [15:41<41:00,  1.76it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.53e-5, train/loss_step=0.0149, global_step=1310.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1653/5971 [15:42<41:00,  1.75it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.53e-5, train/loss_step=0.0149, global_step=1310.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1653/5971 [15:42<41:00,  1.75it/s, loss=0.12, v_num=0, train/loss_simple_step=0.323, train/loss_vlb_step=0.0025, train/loss_step=0.323, global_step=1311.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  28%|██▊       | 1654/5971 [15:43<41:00,  1.75it/s, loss=0.12, v_num=0, train/loss_simple_step=0.323, train/loss_vlb_step=0.0025, train/loss_step=0.323, global_step=1311.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1654/5971 [15:43<41:00,  1.75it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0454, train/loss_vlb_step=0.000159, train/loss_step=0.0454, global_step=1311.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1655/5971 [15:44<41:01,  1.75it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0454, train/loss_vlb_step=0.000159, train/loss_step=0.0454, global_step=1311.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1655/5971 [15:44<41:01,  1.75it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00511, train/loss_vlb_step=2.6e-5, train/loss_step=0.00511, global_step=1311.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  28%|██▊       | 1656/5971 [15:46<41:04,  1.75it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00511, train/loss_vlb_step=2.6e-5, train/loss_step=0.00511, global_step=1311.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1656/5971 [15:46<41:04,  1.75it/s, loss=0.122, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.00114, train/loss_step=0.255, global_step=1311.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  28%|██▊       | 1657/5971 [15:47<41:04,  1.75it/s, loss=0.122, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.00114, train/loss_step=0.255, global_step=1311.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1657/5971 [15:47<41:04,  1.75it/s, loss=0.12, v_num=0, train/loss_simple_step=0.096, train/loss_vlb_step=0.000323, train/loss_step=0.096, global_step=1312.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1658/5971 [15:48<41:05,  1.75it/s, loss=0.12, v_num=0, train/loss_simple_step=0.096, train/loss_vlb_step=0.000323, train/loss_step=0.096, global_step=1312.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1658/5971 [15:48<41:05,  1.75it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0528, train/loss_vlb_step=0.000182, train/loss_step=0.0528, global_step=1312.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1659/5971 [15:49<41:05,  1.75it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0528, train/loss_vlb_step=0.000182, train/loss_step=0.0528, global_step=1312.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1659/5971 [15:49<41:05,  1.75it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0731, train/loss_vlb_step=0.000248, train/loss_step=0.0731, global_step=1312.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1660/5971 [15:51<41:08,  1.75it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0731, train/loss_vlb_step=0.000248, train/loss_step=0.0731, global_step=1312.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1660/5971 [15:51<41:08,  1.75it/s, loss=0.125, v_num=0, train/loss_simple_step=0.344, train/loss_vlb_step=0.00171, train/loss_step=0.344, global_step=1312.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  28%|██▊       | 1661/5971 [15:52<41:09,  1.75it/s, loss=0.125, v_num=0, train/loss_simple_step=0.344, train/loss_vlb_step=0.00171, train/loss_step=0.344, global_step=1312.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1661/5971 [15:52<41:09,  1.75it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00222, train/loss_vlb_step=1.26e-5, train/loss_step=0.00222, global_step=1313.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1662/5971 [15:53<41:09,  1.74it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00222, train/loss_vlb_step=1.26e-5, train/loss_step=0.00222, global_step=1313.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1662/5971 [15:53<41:09,  1.74it/s, loss=0.126, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.000634, train/loss_step=0.176, global_step=1313.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  28%|██▊       | 1663/5971 [15:53<41:09,  1.74it/s, loss=0.126, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.000634, train/loss_step=0.176, global_step=1313.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1663/5971 [15:53<41:09,  1.74it/s, loss=0.141, v_num=0, train/loss_simple_step=0.311, train/loss_vlb_step=0.00132, train/loss_step=0.311, global_step=1313.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  28%|██▊       | 1664/5971 [15:56<41:13,  1.74it/s, loss=0.141, v_num=0, train/loss_simple_step=0.311, train/loss_vlb_step=0.00132, train/loss_step=0.311, global_step=1313.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1664/5971 [15:56<41:13,  1.74it/s, loss=0.149, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.00059, train/loss_step=0.170, global_step=1313.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1665/5971 [15:56<41:13,  1.74it/s, loss=0.149, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.00059, train/loss_step=0.170, global_step=1313.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1665/5971 [15:56<41:13,  1.74it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0946, train/loss_vlb_step=0.000311, train/loss_step=0.0946, global_step=1314.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1666/5971 [15:57<41:13,  1.74it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0946, train/loss_vlb_step=0.000311, train/loss_step=0.0946, global_step=1314.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1666/5971 [15:57<41:13,  1.74it/s, loss=0.148, v_num=0, train/loss_simple_step=0.372, train/loss_vlb_step=0.00211, train/loss_step=0.372, global_step=1314.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  28%|██▊       | 1667/5971 [15:58<41:13,  1.74it/s, loss=0.148, v_num=0, train/loss_simple_step=0.372, train/loss_vlb_step=0.00211, train/loss_step=0.372, global_step=1314.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1667/5971 [15:58<41:13,  1.74it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0913, train/loss_vlb_step=0.000301, train/loss_step=0.0913, global_step=1314.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1668/5971 [16:00<41:17,  1.74it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0913, train/loss_vlb_step=0.000301, train/loss_step=0.0913, global_step=1314.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1668/5971 [16:00<41:17,  1.74it/s, loss=0.153, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=1314.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  28%|██▊       | 1669/5971 [16:01<41:17,  1.74it/s, loss=0.153, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=1314.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1669/5971 [16:01<41:17,  1.74it/s, loss=0.146, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000587, train/loss_step=0.175, global_step=1315.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1670/5971 [16:02<41:17,  1.74it/s, loss=0.146, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000587, train/loss_step=0.175, global_step=1315.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1670/5971 [16:02<41:17,  1.74it/s, loss=0.164, v_num=0, train/loss_simple_step=0.367, train/loss_vlb_step=0.00235, train/loss_step=0.367, global_step=1315.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  28%|██▊       | 1671/5971 [16:03<41:17,  1.74it/s, loss=0.164, v_num=0, train/loss_simple_step=0.367, train/loss_vlb_step=0.00235, train/loss_step=0.367, global_step=1315.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1671/5971 [16:03<41:17,  1.74it/s, loss=0.163, v_num=0, train/loss_simple_step=0.178, train/loss_vlb_step=0.000611, train/loss_step=0.178, global_step=1315.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1672/5971 [16:05<41:21,  1.73it/s, loss=0.163, v_num=0, train/loss_simple_step=0.178, train/loss_vlb_step=0.000611, train/loss_step=0.178, global_step=1315.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1672/5971 [16:05<41:21,  1.73it/s, loss=0.164, v_num=0, train/loss_simple_step=0.023, train/loss_vlb_step=9.12e-5, train/loss_step=0.023, global_step=1315.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  28%|██▊       | 1673/5971 [16:06<41:21,  1.73it/s, loss=0.164, v_num=0, train/loss_simple_step=0.023, train/loss_vlb_step=9.12e-5, train/loss_step=0.023, global_step=1315.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1673/5971 [16:06<41:21,  1.73it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0475, train/loss_vlb_step=0.000172, train/loss_step=0.0475, global_step=1316.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1674/5971 [16:07<41:21,  1.73it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0475, train/loss_vlb_step=0.000172, train/loss_step=0.0475, global_step=1316.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1674/5971 [16:07<41:21,  1.73it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0616, train/loss_vlb_step=0.000207, train/loss_step=0.0616, global_step=1316.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1675/5971 [16:08<41:21,  1.73it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0616, train/loss_vlb_step=0.000207, train/loss_step=0.0616, global_step=1316.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1675/5971 [16:08<41:21,  1.73it/s, loss=0.174, v_num=0, train/loss_simple_step=0.468, train/loss_vlb_step=0.00358, train/loss_step=0.468, global_step=1316.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  28%|██▊       | 1676/5971 [16:10<41:25,  1.73it/s, loss=0.174, v_num=0, train/loss_simple_step=0.468, train/loss_vlb_step=0.00358, train/loss_step=0.468, global_step=1316.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1676/5971 [16:10<41:25,  1.73it/s, loss=0.171, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000685, train/loss_step=0.190, global_step=1316.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1677/5971 [16:11<41:25,  1.73it/s, loss=0.171, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000685, train/loss_step=0.190, global_step=1316.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1677/5971 [16:11<41:25,  1.73it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0316, train/loss_vlb_step=0.000123, train/loss_step=0.0316, global_step=1317.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1678/5971 [16:12<41:25,  1.73it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0316, train/loss_vlb_step=0.000123, train/loss_step=0.0316, global_step=1317.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1678/5971 [16:12<41:25,  1.73it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.48e-5, train/loss_step=0.0108, global_step=1317.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  28%|██▊       | 1679/5971 [16:13<41:26,  1.73it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.48e-5, train/loss_step=0.0108, global_step=1317.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1679/5971 [16:13<41:26,  1.73it/s, loss=0.172, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000768, train/loss_step=0.208, global_step=1317.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  28%|██▊       | 1680/5971 [16:15<41:29,  1.72it/s, loss=0.172, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000768, train/loss_step=0.208, global_step=1317.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1680/5971 [16:15<41:29,  1.72it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0193, train/loss_vlb_step=8.24e-5, train/loss_step=0.0193, global_step=1317.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1681/5971 [16:16<41:29,  1.72it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0193, train/loss_vlb_step=8.24e-5, train/loss_step=0.0193, global_step=1317.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1681/5971 [16:16<41:29,  1.72it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0244, train/loss_vlb_step=9.21e-5, train/loss_step=0.0244, global_step=1318.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1682/5971 [16:17<41:29,  1.72it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0244, train/loss_vlb_step=9.21e-5, train/loss_step=0.0244, global_step=1318.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1682/5971 [16:17<41:29,  1.72it/s, loss=0.184, v_num=0, train/loss_simple_step=0.713, train/loss_vlb_step=0.021, train/loss_step=0.713, global_step=1318.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  28%|██▊       | 1683/5971 [16:17<41:30,  1.72it/s, loss=0.184, v_num=0, train/loss_simple_step=0.713, train/loss_vlb_step=0.021, train/loss_step=0.713, global_step=1318.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1683/5971 [16:17<41:30,  1.72it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.000122, train/loss_step=0.0325, global_step=1318.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1684/5971 [16:20<41:33,  1.72it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.000122, train/loss_step=0.0325, global_step=1318.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1684/5971 [16:20<41:33,  1.72it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00324, train/loss_vlb_step=1.73e-5, train/loss_step=0.00324, global_step=1318.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1685/5971 [16:20<41:33,  1.72it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00324, train/loss_vlb_step=1.73e-5, train/loss_step=0.00324, global_step=1318.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1685/5971 [16:20<41:33,  1.72it/s, loss=0.168, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000757, train/loss_step=0.214, global_step=1319.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  28%|██▊       | 1686/5971 [16:21<41:33,  1.72it/s, loss=0.168, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000757, train/loss_step=0.214, global_step=1319.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1686/5971 [16:21<41:33,  1.72it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00608, train/loss_vlb_step=3.14e-5, train/loss_step=0.00608, global_step=1319.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1687/5971 [16:22<41:34,  1.72it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00608, train/loss_vlb_step=3.14e-5, train/loss_step=0.00608, global_step=1319.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1687/5971 [16:22<41:34,  1.72it/s, loss=0.151, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000412, train/loss_step=0.121, global_step=1319.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  28%|██▊       | 1688/5971 [16:24<41:37,  1.71it/s, loss=0.151, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000412, train/loss_step=0.121, global_step=1319.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1688/5971 [16:24<41:37,  1.71it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00234, train/loss_vlb_step=1.36e-5, train/loss_step=0.00234, global_step=1319.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1689/5971 [16:25<41:37,  1.71it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00234, train/loss_vlb_step=1.36e-5, train/loss_step=0.00234, global_step=1319.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1689/5971 [16:25<41:37,  1.71it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0296, train/loss_vlb_step=0.000111, train/loss_step=0.0296, global_step=1320.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  28%|██▊       | 1690/5971 [16:26<41:38,  1.71it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0296, train/loss_vlb_step=0.000111, train/loss_step=0.0296, global_step=1320.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1690/5971 [16:26<41:38,  1.71it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0266, train/loss_vlb_step=0.000106, train/loss_step=0.0266, global_step=1320.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1691/5971 [16:27<41:38,  1.71it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0266, train/loss_vlb_step=0.000106, train/loss_step=0.0266, global_step=1320.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1691/5971 [16:27<41:38,  1.71it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00188, train/loss_vlb_step=1.1e-5, train/loss_step=0.00188, global_step=1320.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1692/5971 [16:29<41:41,  1.71it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00188, train/loss_vlb_step=1.1e-5, train/loss_step=0.00188, global_step=1320.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1692/5971 [16:29<41:41,  1.71it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00634, train/loss_vlb_step=3.14e-5, train/loss_step=0.00634, global_step=1320.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1693/5971 [16:30<41:41,  1.71it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00634, train/loss_vlb_step=3.14e-5, train/loss_step=0.00634, global_step=1320.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1693/5971 [16:30<41:41,  1.71it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0675, train/loss_vlb_step=0.000222, train/loss_step=0.0675, global_step=1321.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  28%|██▊       | 1694/5971 [16:31<41:41,  1.71it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0675, train/loss_vlb_step=0.000222, train/loss_step=0.0675, global_step=1321.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1694/5971 [16:31<41:41,  1.71it/s, loss=0.134, v_num=0, train/loss_simple_step=0.502, train/loss_vlb_step=0.00628, train/loss_step=0.502, global_step=1321.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  28%|██▊       | 1695/5971 [16:32<41:41,  1.71it/s, loss=0.134, v_num=0, train/loss_simple_step=0.502, train/loss_vlb_step=0.00628, train/loss_step=0.502, global_step=1321.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1695/5971 [16:32<41:41,  1.71it/s, loss=0.119, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000633, train/loss_step=0.170, global_step=1321.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1696/5971 [16:34<41:45,  1.71it/s, loss=0.119, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000633, train/loss_step=0.170, global_step=1321.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1696/5971 [16:34<41:45,  1.71it/s, loss=0.122, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.00106, train/loss_step=0.240, global_step=1321.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  28%|██▊       | 1697/5971 [16:35<41:45,  1.71it/s, loss=0.122, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.00106, train/loss_step=0.240, global_step=1321.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1697/5971 [16:35<41:45,  1.71it/s, loss=0.136, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00134, train/loss_step=0.327, global_step=1322.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1698/5971 [16:36<41:45,  1.71it/s, loss=0.136, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00134, train/loss_step=0.327, global_step=1322.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1698/5971 [16:36<41:45,  1.71it/s, loss=0.146, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.000754, train/loss_step=0.210, global_step=1322.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1699/5971 [16:37<41:45,  1.70it/s, loss=0.146, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.000754, train/loss_step=0.210, global_step=1322.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1699/5971 [16:37<41:45,  1.70it/s, loss=0.144, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000534, train/loss_step=0.157, global_step=1322.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1700/5971 [16:39<41:49,  1.70it/s, loss=0.144, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000534, train/loss_step=0.157, global_step=1322.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1700/5971 [16:39<41:49,  1.70it/s, loss=0.167, v_num=0, train/loss_simple_step=0.490, train/loss_vlb_step=0.00509, train/loss_step=0.490, global_step=1322.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  28%|██▊       | 1701/5971 [16:40<41:49,  1.70it/s, loss=0.167, v_num=0, train/loss_simple_step=0.490, train/loss_vlb_step=0.00509, train/loss_step=0.490, global_step=1322.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  28%|██▊       | 1701/5971 [16:40<41:49,  1.70it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0629, train/loss_vlb_step=0.000216, train/loss_step=0.0629, global_step=1323.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  29%|██▊       | 1702/5971 [16:41<41:49,  1.70it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0629, train/loss_vlb_step=0.000216, train/loss_step=0.0629, global_step=1323.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  29%|██▊       | 1702/5971 [16:41<41:49,  1.70it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00663, train/loss_vlb_step=3.39e-5, train/loss_step=0.00663, global_step=1323.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  29%|██▊       | 1703/5971 [16:41<41:49,  1.70it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00663, train/loss_vlb_step=3.39e-5, train/loss_step=0.00663, global_step=1323.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  29%|██▊       | 1703/5971 [16:41<41:49,  1.70it/s, loss=0.145, v_num=0, train/loss_simple_step=0.256, train/loss_vlb_step=0.00101, train/loss_step=0.256, global_step=1323.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  29%|██▊       | 1704/5971 [16:44<41:52,  1.70it/s, loss=0.145, v_num=0, train/loss_simple_step=0.256, train/loss_vlb_step=0.00101, train/loss_step=0.256, global_step=1323.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  29%|██▊       | 1704/5971 [16:44<41:52,  1.70it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00144, train/loss_vlb_step=8.7e-6, train/loss_step=0.00144, global_step=1323.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  29%|██▊       | 1705/5971 [16:45<41:53,  1.70it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00144, train/loss_vlb_step=8.7e-6, train/loss_step=0.00144, global_step=1323.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  29%|██▊       | 1705/5971 [16:45<41:53,  1.70it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00851, train/loss_vlb_step=4.13e-5, train/loss_step=0.00851, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  29%|██▊       | 1706/5971 [16:45<41:53,  1.70it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00851, train/loss_vlb_step=4.13e-5, train/loss_step=0.00851, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  29%|██▊       | 1706/5971 [16:45<41:53,  1.70it/s, loss=0.163, v_num=0, train/loss_simple_step=0.574, train/loss_vlb_step=0.00794, train/loss_step=0.574, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  29%|██▊       | 1707/5971 [16:46<41:53,  1.70it/s, loss=0.163, v_num=0, train/loss_simple_step=0.574, train/loss_vlb_step=0.00794, train/loss_step=0.574, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  29%|██▊       | 1707/5971 [16:46<41:53,  1.70it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00154, train/loss_vlb_step=9.25e-6, train/loss_step=0.00154, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  29%|██▊       | 1708/5971 [16:49<41:56,  1.69it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00154, train/loss_vlb_step=9.25e-6, train/loss_step=0.00154, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  29%|██▊       | 1708/5971 [16:49<41:56,  1.69it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:08,  2.42it/s]
Epoch 2:  29%|██▊       | 1710/5971 [16:49<41:53,  1.69it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   1%|          | 2/167 [00:00<00:41,  3.99it/s]
Epoch 2:  29%|██▊       | 1712/5971 [16:49<41:50,  1.70it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   3%|▎         | 5/167 [00:00<00:16,  9.97it/s]
Epoch 2:  29%|██▊       | 1715/5971 [16:49<41:44,  1.70it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:00<00:11, 14.05it/s]
Epoch 2:  29%|██▉       | 1718/5971 [16:49<41:38,  1.70it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.91it/s]
Epoch 2:  29%|██▉       | 1721/5971 [16:50<41:32,  1.70it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.72it/s]
Epoch 2:  29%|██▉       | 1724/5971 [16:50<41:26,  1.71it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  10%|█         | 17/167 [00:01<00:06, 22.09it/s]
Epoch 2:  29%|██▉       | 1727/5971 [16:50<41:21,  1.71it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 23.93it/s]
Epoch 2:  29%|██▉       | 1731/5971 [16:50<41:13,  1.71it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  14%|█▍        | 23/167 [00:01<00:05, 24.66it/s]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 25.48it/s]
Epoch 2:  29%|██▉       | 1735/5971 [16:50<41:05,  1.72it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 25.28it/s]
Epoch 2:  29%|██▉       | 1739/5971 [16:50<40:58,  1.72it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 26.03it/s]
Epoch 2:  29%|██▉       | 1743/5971 [16:50<40:50,  1.73it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  21%|██        | 35/167 [00:01<00:05, 25.45it/s][A

Validating:  23%|██▎       | 38/167 [00:01<00:05, 25.62it/s][A
Epoch 2:  29%|██▉       | 1747/5971 [16:51<40:43,  1.73it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 25.58it/s][A
Epoch 2:  29%|██▉       | 1751/5971 [16:51<40:35,  1.73it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 26.06it/s][A
Epoch 2:  29%|██▉       | 1755/5971 [16:51<40:28,  1.74it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 25.20it/s][A

Validating:  30%|██▉       | 50/167 [00:02<00:04, 24.87it/s][A
Epoch 2:  29%|██▉       | 1759/5971 [16:51<40:20,  1.74it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 24.73it/s][A
Epoch 2:  30%|██▉       | 1763/5971 [16:51<40:13,  1.74it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 24.59it/s][A
Epoch 2:  30%|██▉       | 1767/5971 [16:51<40:05,  1.75it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  35%|███▌      | 59/167 [00:02<00:04, 23.82it/s][A

Validating:  37%|███▋      | 62/167 [00:02<00:04, 25.29it/s][A
Epoch 2:  30%|██▉       | 1771/5971 [16:51<39:58,  1.75it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  39%|███▉      | 65/167 [00:03<00:04, 24.91it/s][A
Epoch 2:  30%|██▉       | 1775/5971 [16:52<39:51,  1.75it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  41%|████      | 68/167 [00:03<00:03, 25.83it/s][A
Epoch 2:  30%|██▉       | 1779/5971 [16:52<39:43,  1.76it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 25.69it/s][A

Validating:  44%|████▍     | 74/167 [00:03<00:03, 26.66it/s][A
Epoch 2:  30%|██▉       | 1783/5971 [16:52<39:36,  1.76it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 27.10it/s][A
Epoch 2:  30%|██▉       | 1787/5971 [16:52<39:29,  1.77it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 27.64it/s][A
Epoch 2:  30%|██▉       | 1791/5971 [16:52<39:22,  1.77it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 27.01it/s][A

Validating:  51%|█████▏    | 86/167 [00:03<00:02, 27.62it/s][A
Epoch 2:  30%|███       | 1795/5971 [16:52<39:15,  1.77it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  53%|█████▎    | 89/167 [00:03<00:02, 27.96it/s][A
Epoch 2:  30%|███       | 1799/5971 [16:52<39:07,  1.78it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  55%|█████▌    | 92/167 [00:03<00:02, 27.90it/s][A
Epoch 2:  30%|███       | 1803/5971 [16:53<39:00,  1.78it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 27.96it/s][A

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 27.91it/s][A
Epoch 2:  30%|███       | 1807/5971 [16:53<38:53,  1.78it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  60%|██████    | 101/167 [00:04<00:02, 26.60it/s][A
Epoch 2:  30%|███       | 1811/5971 [16:53<38:46,  1.79it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 26.32it/s][A
Epoch 2:  30%|███       | 1815/5971 [16:53<38:39,  1.79it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 26.74it/s][A
Epoch 2:  30%|███       | 1819/5971 [16:53<38:32,  1.80it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 26.97it/s][A

Validating:  68%|██████▊   | 114/167 [00:04<00:01, 27.36it/s][A
Epoch 2:  31%|███       | 1823/5971 [16:53<38:25,  1.80it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  70%|███████   | 117/167 [00:04<00:01, 26.23it/s][A
Epoch 2:  31%|███       | 1827/5971 [16:54<38:18,  1.80it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 26.96it/s][A
Epoch 2:  31%|███       | 1831/5971 [16:54<38:11,  1.81it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 25.60it/s][A
Epoch 2:  31%|███       | 1835/5971 [16:54<38:05,  1.81it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 26.14it/s][A

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 26.26it/s][A
Epoch 2:  31%|███       | 1839/5971 [16:54<37:58,  1.81it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 25.84it/s][A
Epoch 2:  31%|███       | 1843/5971 [16:54<37:51,  1.82it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 26.89it/s][A
Epoch 2:  31%|███       | 1847/5971 [16:54<37:44,  1.82it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 26.40it/s][A

Validating:  85%|████████▌ | 142/167 [00:05<00:01, 24.93it/s][A
Epoch 2:  31%|███       | 1851/5971 [16:54<37:37,  1.82it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 25.91it/s][A
Epoch 2:  31%|███       | 1855/5971 [16:55<37:31,  1.83it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 26.93it/s][A
Epoch 2:  31%|███       | 1859/5971 [16:55<37:24,  1.83it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 26.53it/s][A

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 26.37it/s][A
Epoch 2:  31%|███       | 1863/5971 [16:55<37:17,  1.84it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 26.43it/s][A
Epoch 2:  31%|███▏      | 1867/5971 [16:55<37:11,  1.84it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 21.02it/s][A
Epoch 2:  31%|███▏      | 1871/5971 [16:55<37:04,  1.84it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 22.11it/s][A

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 22.59it/s][A
Epoch 2:  31%|███▏      | 1875/5971 [16:55<36:58,  1.85it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  31%|███▏      | 1876/5971 [16:56<36:57,  1.85it/s, loss=0.162, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000364, train/loss_step=0.109, global_step=1324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Epoch 2:  31%|███▏      | 1877/5971 [16:57<36:57,  1.85it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0775, train/loss_vlb_step=0.000264, train/loss_step=0.0775, global_step=1325.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  31%|███▏      | 1878/5971 [16:58<36:58,  1.85it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00327, train/loss_vlb_step=1.77e-5, train/loss_step=0.00327, global_step=1325.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  31%|███▏      | 1879/5971 [16:59<36:58,  1.84it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00327, train/loss_vlb_step=1.77e-5, train/loss_step=0.00327, global_step=1325.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  31%|███▏      | 1879/5971 [16:59<36:58,  1.84it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00223, train/loss_vlb_step=1.26e-5, train/loss_step=0.00223, global_step=1325.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  31%|███▏      | 1880/5971 [17:01<37:01,  1.84it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.87e-5, train/loss_step=0.0132, global_step=1325.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  32%|███▏      | 1881/5971 [17:02<37:01,  1.84it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00744, train/loss_vlb_step=3.6e-5, train/loss_step=0.00744, global_step=1326.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1882/5971 [17:03<37:01,  1.84it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0587, train/loss_vlb_step=0.000201, train/loss_step=0.0587, global_step=1326.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1883/5971 [17:03<37:01,  1.84it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0587, train/loss_vlb_step=0.000201, train/loss_step=0.0587, global_step=1326.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1883/5971 [17:03<37:01,  1.84it/s, loss=0.136, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000401, train/loss_step=0.121, global_step=1326.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  32%|███▏      | 1884/5971 [17:06<37:04,  1.84it/s, loss=0.146, v_num=0, train/loss_simple_step=0.432, train/loss_vlb_step=0.00264, train/loss_step=0.432, global_step=1326.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  32%|███▏      | 1885/5971 [17:06<37:04,  1.84it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00203, train/loss_vlb_step=1.21e-5, train/loss_step=0.00203, global_step=1327.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1886/5971 [17:07<37:05,  1.84it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=6.81e-5, train/loss_step=0.0166, global_step=1327.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  32%|███▏      | 1887/5971 [17:08<37:05,  1.84it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=6.81e-5, train/loss_step=0.0166, global_step=1327.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1887/5971 [17:08<37:05,  1.84it/s, loss=0.121, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000573, train/loss_step=0.168, global_step=1327.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1888/5971 [17:10<37:08,  1.83it/s, loss=0.0963, v_num=0, train/loss_simple_step=0.0032, train/loss_vlb_step=1.7e-5, train/loss_step=0.0032, global_step=1327.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1889/5971 [17:11<37:08,  1.83it/s, loss=0.104, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.00071, train/loss_step=0.210, global_step=1328.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  32%|███▏      | 1890/5971 [17:12<37:08,  1.83it/s, loss=0.122, v_num=0, train/loss_simple_step=0.374, train/loss_vlb_step=0.00268, train/loss_step=0.374, global_step=1328.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1891/5971 [17:13<37:08,  1.83it/s, loss=0.122, v_num=0, train/loss_simple_step=0.374, train/loss_vlb_step=0.00268, train/loss_step=0.374, global_step=1328.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1891/5971 [17:13<37:08,  1.83it/s, loss=0.136, v_num=0, train/loss_simple_step=0.534, train/loss_vlb_step=0.00739, train/loss_step=0.534, global_step=1328.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1892/5971 [17:15<37:11,  1.83it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0248, train/loss_vlb_step=9.69e-5, train/loss_step=0.0248, global_step=1328.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1893/5971 [17:16<37:11,  1.83it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00333, train/loss_vlb_step=1.86e-5, train/loss_step=0.00333, global_step=1329.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1894/5971 [17:17<37:11,  1.83it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.02e-5, train/loss_step=0.0139, global_step=1329.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  32%|███▏      | 1895/5971 [17:18<37:12,  1.83it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.02e-5, train/loss_step=0.0139, global_step=1329.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1895/5971 [17:18<37:12,  1.83it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0241, train/loss_vlb_step=9.07e-5, train/loss_step=0.0241, global_step=1329.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  32%|███▏      | 1896/5971 [17:20<37:15,  1.82it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.56e-5, train/loss_step=0.0128, global_step=1329.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1897/5971 [17:21<37:15,  1.82it/s, loss=0.109, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.00054, train/loss_step=0.158, global_step=1330.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  32%|███▏      | 1898/5971 [17:22<37:15,  1.82it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00297, train/loss_vlb_step=1.61e-5, train/loss_step=0.00297, global_step=1330.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1899/5971 [17:23<37:15,  1.82it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00297, train/loss_vlb_step=1.61e-5, train/loss_step=0.00297, global_step=1330.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1899/5971 [17:23<37:15,  1.82it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0357, train/loss_vlb_step=0.000122, train/loss_step=0.0357, global_step=1330.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  32%|███▏      | 1900/5971 [17:25<37:18,  1.82it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00503, train/loss_vlb_step=2.5e-5, train/loss_step=0.00503, global_step=1330.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  32%|███▏      | 1901/5971 [17:26<37:18,  1.82it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=1331.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1902/5971 [17:27<37:18,  1.82it/s, loss=0.115, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000493, train/loss_step=0.150, global_step=1331.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  32%|███▏      | 1903/5971 [17:27<37:19,  1.82it/s, loss=0.115, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000493, train/loss_step=0.150, global_step=1331.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1903/5971 [17:27<37:19,  1.82it/s, loss=0.126, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00199, train/loss_step=0.345, global_step=1331.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  32%|███▏      | 1904/5971 [17:30<37:21,  1.81it/s, loss=0.114, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000694, train/loss_step=0.200, global_step=1331.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1905/5971 [17:31<37:22,  1.81it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0226, train/loss_vlb_step=8.99e-5, train/loss_step=0.0226, global_step=1332.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1906/5971 [17:31<37:22,  1.81it/s, loss=0.132, v_num=0, train/loss_simple_step=0.342, train/loss_vlb_step=0.00172, train/loss_step=0.342, global_step=1332.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  32%|███▏      | 1907/5971 [17:32<37:22,  1.81it/s, loss=0.132, v_num=0, train/loss_simple_step=0.342, train/loss_vlb_step=0.00172, train/loss_step=0.342, global_step=1332.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1907/5971 [17:32<37:22,  1.81it/s, loss=0.152, v_num=0, train/loss_simple_step=0.586, train/loss_vlb_step=0.00682, train/loss_step=0.586, global_step=1332.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1908/5971 [17:34<37:25,  1.81it/s, loss=0.168, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.0016, train/loss_step=0.313, global_step=1332.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  32%|███▏      | 1909/5971 [17:35<37:25,  1.81it/s, loss=0.163, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000337, train/loss_step=0.102, global_step=1333.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1910/5971 [17:36<37:25,  1.81it/s, loss=0.153, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.000593, train/loss_step=0.176, global_step=1333.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1911/5971 [17:37<37:25,  1.81it/s, loss=0.153, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.000593, train/loss_step=0.176, global_step=1333.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1911/5971 [17:37<37:25,  1.81it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0913, train/loss_vlb_step=0.000301, train/loss_step=0.0913, global_step=1333.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1912/5971 [17:39<37:28,  1.81it/s, loss=0.142, v_num=0, train/loss_simple_step=0.254, train/loss_vlb_step=0.000969, train/loss_step=0.254, global_step=1333.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  32%|███▏      | 1913/5971 [17:40<37:28,  1.80it/s, loss=0.155, v_num=0, train/loss_simple_step=0.259, train/loss_vlb_step=0.0012, train/loss_step=0.259, global_step=1334.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  32%|███▏      | 1914/5971 [17:41<37:29,  1.80it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0025, train/loss_vlb_step=1.37e-5, train/loss_step=0.0025, global_step=1334.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1915/5971 [17:42<37:29,  1.80it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0025, train/loss_vlb_step=1.37e-5, train/loss_step=0.0025, global_step=1334.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1915/5971 [17:42<37:29,  1.80it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0478, train/loss_vlb_step=0.000161, train/loss_step=0.0478, global_step=1334.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1916/5971 [17:44<37:32,  1.80it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00601, train/loss_vlb_step=3.19e-5, train/loss_step=0.00601, global_step=1334.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1917/5971 [17:45<37:32,  1.80it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00158, train/loss_vlb_step=9.6e-6, train/loss_step=0.00158, global_step=1335.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  32%|███▏      | 1918/5971 [17:46<37:32,  1.80it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.5e-5, train/loss_step=0.0105, global_step=1335.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  32%|███▏      | 1919/5971 [17:47<37:32,  1.80it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.5e-5, train/loss_step=0.0105, global_step=1335.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1919/5971 [17:47<37:32,  1.80it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0235, train/loss_vlb_step=9.29e-5, train/loss_step=0.0235, global_step=1335.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1920/5971 [17:50<37:37,  1.79it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0307, train/loss_vlb_step=0.000114, train/loss_step=0.0307, global_step=1335.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1921/5971 [17:51<37:37,  1.79it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=7.34e-5, train/loss_step=0.0173, global_step=1336.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  32%|███▏      | 1922/5971 [17:52<37:37,  1.79it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0395, train/loss_vlb_step=0.000142, train/loss_step=0.0395, global_step=1336.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1923/5971 [17:53<37:37,  1.79it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0395, train/loss_vlb_step=0.000142, train/loss_step=0.0395, global_step=1336.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1923/5971 [17:53<37:37,  1.79it/s, loss=0.129, v_num=0, train/loss_simple_step=0.064, train/loss_vlb_step=0.000217, train/loss_step=0.064, global_step=1336.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  32%|███▏      | 1924/5971 [17:55<37:40,  1.79it/s, loss=0.146, v_num=0, train/loss_simple_step=0.525, train/loss_vlb_step=0.00546, train/loss_step=0.525, global_step=1336.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  32%|███▏      | 1925/5971 [17:56<37:40,  1.79it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.57e-5, train/loss_step=0.0102, global_step=1337.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1926/5971 [17:57<37:40,  1.79it/s, loss=0.14, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000925, train/loss_step=0.242, global_step=1337.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  32%|███▏      | 1927/5971 [17:57<37:40,  1.79it/s, loss=0.14, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000925, train/loss_step=0.242, global_step=1337.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1927/5971 [17:57<37:40,  1.79it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0013, train/loss_vlb_step=7.14e-6, train/loss_step=0.0013, global_step=1337.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1928/5971 [18:00<37:43,  1.79it/s, loss=0.101, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000383, train/loss_step=0.114, global_step=1337.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  32%|███▏      | 1929/5971 [18:00<37:43,  1.79it/s, loss=0.116, v_num=0, train/loss_simple_step=0.399, train/loss_vlb_step=0.0024, train/loss_step=0.399, global_step=1338.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  32%|███▏      | 1930/5971 [18:01<37:44,  1.78it/s, loss=0.107, v_num=0, train/loss_simple_step=0.00933, train/loss_vlb_step=4.34e-5, train/loss_step=0.00933, global_step=1338.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1931/5971 [18:02<37:44,  1.78it/s, loss=0.107, v_num=0, train/loss_simple_step=0.00933, train/loss_vlb_step=4.34e-5, train/loss_step=0.00933, global_step=1338.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1931/5971 [18:02<37:44,  1.78it/s, loss=0.131, v_num=0, train/loss_simple_step=0.554, train/loss_vlb_step=0.00735, train/loss_step=0.554, global_step=1338.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  32%|███▏      | 1932/5971 [18:05<37:47,  1.78it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=5.05e-5, train/loss_step=0.0116, global_step=1338.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1933/5971 [18:05<37:47,  1.78it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0765, train/loss_vlb_step=0.000264, train/loss_step=0.0765, global_step=1339.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1934/5971 [18:06<37:47,  1.78it/s, loss=0.122, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.000963, train/loss_step=0.250, global_step=1339.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  32%|███▏      | 1935/5971 [18:07<37:47,  1.78it/s, loss=0.122, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.000963, train/loss_step=0.250, global_step=1339.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1935/5971 [18:07<37:47,  1.78it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0213, train/loss_vlb_step=8.26e-5, train/loss_step=0.0213, global_step=1339.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1936/5971 [18:11<37:52,  1.78it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00647, train/loss_vlb_step=3.3e-5, train/loss_step=0.00647, global_step=1339.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1937/5971 [18:11<37:52,  1.77it/s, loss=0.141, v_num=0, train/loss_simple_step=0.407, train/loss_vlb_step=0.00197, train/loss_step=0.407, global_step=1340.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  32%|███▏      | 1938/5971 [18:12<37:52,  1.77it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00207, train/loss_vlb_step=1.22e-5, train/loss_step=0.00207, global_step=1340.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1939/5971 [18:13<37:52,  1.77it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00207, train/loss_vlb_step=1.22e-5, train/loss_step=0.00207, global_step=1340.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  32%|███▏      | 1939/5971 [18:13<37:52,  1.77it/s, loss=0.15, v_num=0, train/loss_simple_step=0.215, train/loss_vlb_step=0.00078, train/loss_step=0.215, global_step=1340.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  32%|███▏      | 1940/5971 [18:15<37:55,  1.77it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0408, train/loss_vlb_step=0.000144, train/loss_step=0.0408, global_step=1340.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  33%|███▎      | 1941/5971 [18:16<37:55,  1.77it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00254, train/loss_vlb_step=1.42e-5, train/loss_step=0.00254, global_step=1341.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  33%|███▎      | 1942/5971 [18:17<37:55,  1.77it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0451, train/loss_vlb_step=0.000158, train/loss_step=0.0451, global_step=1341.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  33%|███▎      | 1943/5971 [18:18<37:55,  1.77it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0451, train/loss_vlb_step=0.000158, train/loss_step=0.0451, global_step=1341.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  33%|███▎      | 1943/5971 [18:18<37:55,  1.77it/s, loss=0.155, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000529, train/loss_step=0.161, global_step=1341.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  33%|███▎      | 1944/5971 [18:20<37:58,  1.77it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00286, train/loss_vlb_step=1.64e-5, train/loss_step=0.00286, global_step=1341.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  33%|███▎      | 1945/5971 [18:21<37:59,  1.77it/s, loss=0.136, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000578, train/loss_step=0.163, global_step=1342.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  33%|███▎      | 1946/5971 [18:22<37:59,  1.77it/s, loss=0.131, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000419, train/loss_step=0.128, global_step=1342.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  33%|███▎      | 1947/5971 [18:23<37:59,  1.77it/s, loss=0.131, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000419, train/loss_step=0.128, global_step=1342.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  33%|███▎      | 1947/5971 [18:23<37:59,  1.77it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0704, train/loss_vlb_step=0.000239, train/loss_step=0.0704, global_step=1342.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  33%|███▎      | 1948/5971 [18:25<38:02,  1.76it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0334, train/loss_vlb_step=0.000122, train/loss_step=0.0334, global_step=1342.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  33%|███▎      | 1949/5971 [18:26<38:02,  1.76it/s, loss=0.119, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000585, train/loss_step=0.175, global_step=1343.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  33%|███▎      | 1950/5971 [18:27<38:02,  1.76it/s, loss=0.126, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000474, train/loss_step=0.144, global_step=1343.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  33%|███▎      | 1951/5971 [18:28<38:02,  1.76it/s, loss=0.126, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000474, train/loss_step=0.144, global_step=1343.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  33%|███▎      | 1951/5971 [18:28<38:02,  1.76it/s, loss=0.0991, v_num=0, train/loss_simple_step=0.0259, train/loss_vlb_step=0.000104, train/loss_step=0.0259, global_step=1343.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  33%|███▎      | 1952/5971 [18:30<38:05,  1.76it/s, loss=0.0992, v_num=0, train/loss_simple_step=0.0123, train/loss_vlb_step=5.64e-5, train/loss_step=0.0123, global_step=1343.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  33%|███▎      | 1953/5971 [18:31<38:05,  1.76it/s, loss=0.106, v_num=0, train/loss_simple_step=0.221, train/loss_vlb_step=0.000801, train/loss_step=0.221, global_step=1344.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  33%|███▎      | 1954/5971 [18:32<38:05,  1.76it/s, loss=0.0952, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=9.69e-5, train/loss_step=0.026, global_step=1344.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  33%|███▎      | 1955/5971 [18:33<38:05,  1.76it/s, loss=0.0952, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=9.69e-5, train/loss_step=0.026, global_step=1344.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  33%|███▎      | 1955/5971 [18:33<38:05,  1.76it/s, loss=0.0943, v_num=0, train/loss_simple_step=0.00421, train/loss_vlb_step=2.15e-5, train/loss_step=0.00421, global_step=1344.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  33%|███▎      | 1956/5971 [18:35<38:07,  1.75it/s, loss=0.0943, v_num=0, train/loss_simple_step=0.00619, train/loss_vlb_step=3.19e-5, train/loss_step=0.00619, global_step=1344.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  33%|███▎      | 1957/5971 [18:36<38:08,  1.75it/s, loss=0.0741, v_num=0, train/loss_simple_step=0.00342, train/loss_vlb_step=1.74e-5, train/loss_step=0.00342, global_step=1345.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  33%|███▎      | 1958/5971 [18:36<38:08,  1.75it/s, loss=0.0759, v_num=0, train/loss_simple_step=0.0374, train/loss_vlb_step=0.00014, train/loss_step=0.0374, global_step=1345.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  33%|███▎      | 1959/5971 [18:37<38:08,  1.75it/s, loss=0.0759, v_num=0, train/loss_simple_step=0.0374, train/loss_vlb_step=0.00014, train/loss_step=0.0374, global_step=1345.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  33%|███▎      | 1959/5971 [18:37<38:08,  1.75it/s, loss=0.0658, v_num=0, train/loss_simple_step=0.0134, train/loss_vlb_step=5.97e-5, train/loss_step=0.0134, global_step=1345.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  33%|███▎      | 1960/5971 [18:40<38:10,  1.75it/s, loss=0.084, v_num=0, train/loss_simple_step=0.405, train/loss_vlb_step=0.00194, train/loss_step=0.405, global_step=1345.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  33%|███▎      | 1961/5971 [18:40<38:10,  1.75it/s, loss=0.0927, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000584, train/loss_step=0.177, global_step=1346.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  33%|███▎      | 1962/5971 [18:41<38:11,  1.75it/s, loss=0.0933, v_num=0, train/loss_simple_step=0.0558, train/loss_vlb_step=0.000202, train/loss_step=0.0558, global_step=1346.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  33%|███▎      | 1963/5971 [18:42<38:11,  1.75it/s, loss=0.0933, v_num=0, train/loss_simple_step=0.0558, train/loss_vlb_step=0.000202, train/loss_step=0.0558, global_step=1346.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  33%|███▎      | 1963/5971 [18:42<38:11,  1.75it/s, loss=0.0874, v_num=0, train/loss_simple_step=0.0438, train/loss_vlb_step=0.000162, train/loss_step=0.0438, global_step=1346.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  33%|███▎      | 1964/5971 [18:44<38:13,  1.75it/s, loss=0.0878, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.33e-5, train/loss_step=0.0119, global_step=1346.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  33%|███▎      | 1965/5971 [18:45<38:13,  1.75it/s, loss=0.0804, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=6.09e-5, train/loss_step=0.0136, global_step=1347.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  33%|███▎      | 1966/5971 [18:46<38:13,  1.75it/s, loss=0.0741, v_num=0, train/loss_simple_step=0.00222, train/loss_vlb_step=1.31e-5, train/loss_step=0.00222, global_step=1347.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  33%|███▎      | 1967/5971 [18:47<38:13,  1.75it/s, loss=0.0741, v_num=0, train/loss_simple_step=0.00222, train/loss_vlb_step=1.31e-5, train/loss_step=0.00222, global_step=1347.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  33%|███▎      | 1967/5971 [18:47<38:13,  1.75it/s, loss=0.0708, v_num=0, train/loss_simple_step=0.00483, train/loss_vlb_step=2.53e-5, train/loss_step=0.00483, global_step=1347.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  33%|███▎      | 1968/5971 [18:49<38:16,  1.74it/s, loss=0.0705, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=0.000104, train/loss_step=0.0272, global_step=1347.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  33%|███▎      | 1969/5971 [18:50<38:16,  1.74it/s, loss=0.0764, v_num=0, train/loss_simple_step=0.293, train/loss_vlb_step=0.00116, train/loss_step=0.293, global_step=1348.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  33%|███▎      | 1970/5971 [18:51<38:16,  1.74it/s, loss=0.104, v_num=0, train/loss_simple_step=0.689, train/loss_vlb_step=0.0184, train/loss_step=0.689, global_step=1348.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  33%|███▎      | 1971/5971 [18:52<38:16,  1.74it/s, loss=0.104, v_num=0, train/loss_simple_step=0.689, train/loss_vlb_step=0.0184, train/loss_step=0.689, global_step=1348.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  33%|███▎      | 1971/5971 [18:52<38:16,  1.74it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00387, train/loss_vlb_step=2.03e-5, train/loss_step=0.00387, global_step=1348.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  33%|███▎      | 1972/5971 [18:54<38:19,  1.74it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=5.04e-5, train/loss_step=0.0114, global_step=1348.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  33%|███▎      | 1973/5971 [18:55<38:19,  1.74it/s, loss=0.111, v_num=0, train/loss_simple_step=0.398, train/loss_vlb_step=0.00204, train/loss_step=0.398, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  33%|███▎      | 1974/5971 [18:56<38:19,  1.74it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00398, train/loss_vlb_step=2.11e-5, train/loss_step=0.00398, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  33%|███▎      | 1975/5971 [18:56<38:19,  1.74it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00398, train/loss_vlb_step=2.11e-5, train/loss_step=0.00398, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  33%|███▎      | 1975/5971 [18:56<38:19,  1.74it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0071, train/loss_vlb_step=3.47e-5, train/loss_step=0.0071, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  33%|███▎      | 1976/5971 [18:59<38:21,  1.74it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:01<03:50,  1.39s/it][A

Validating:   1%|          | 2/167 [00:01<01:57,  1.40it/s][A
Epoch 2:  33%|███▎      | 1979/5971 [19:00<38:19,  1.74it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   3%|▎         | 5/167 [00:01<00:38,  4.25it/s][A
Epoch 2:  33%|███▎      | 1983/5971 [19:00<38:13,  1.74it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:01<00:22,  7.16it/s][A
Epoch 2:  33%|███▎      | 1987/5971 [19:01<38:06,  1.74it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   7%|▋         | 11/167 [00:02<00:14, 10.40it/s][A

Validating:   8%|▊         | 14/167 [00:02<00:11, 13.60it/s][A
Epoch 2:  33%|███▎      | 1991/5971 [19:01<38:00,  1.75it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  10%|█         | 17/167 [00:02<00:09, 15.71it/s][A
Epoch 2:  33%|███▎      | 1995/5971 [19:01<37:53,  1.75it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  12%|█▏        | 20/167 [00:02<00:08, 17.80it/s][A
Epoch 2:  33%|███▎      | 1999/5971 [19:01<37:47,  1.75it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  14%|█▍        | 23/167 [00:02<00:07, 19.82it/s][A

Validating:  16%|█▌        | 26/167 [00:02<00:06, 21.52it/s][A
Epoch 2:  34%|███▎      | 2003/5971 [19:01<37:40,  1.76it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  17%|█▋        | 29/167 [00:02<00:06, 21.75it/s][A
Epoch 2:  34%|███▎      | 2007/5971 [19:01<37:34,  1.76it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  19%|█▉        | 32/167 [00:02<00:05, 23.43it/s][A
Epoch 2:  34%|███▎      | 2011/5971 [19:02<37:27,  1.76it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  21%|██        | 35/167 [00:02<00:05, 23.24it/s][A

Validating:  23%|██▎       | 38/167 [00:03<00:05, 24.48it/s][A
Epoch 2:  34%|███▎      | 2015/5971 [19:02<37:21,  1.76it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  25%|██▍       | 41/167 [00:03<00:05, 23.65it/s][A
Epoch 2:  34%|███▍      | 2019/5971 [19:02<37:15,  1.77it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  26%|██▋       | 44/167 [00:03<00:05, 22.88it/s][A
Epoch 2:  34%|███▍      | 2023/5971 [19:02<37:08,  1.77it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  28%|██▊       | 47/167 [00:03<00:05, 22.54it/s][A

Validating:  30%|██▉       | 50/167 [00:03<00:04, 24.35it/s][A
Epoch 2:  34%|███▍      | 2027/5971 [19:02<37:02,  1.77it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  32%|███▏      | 53/167 [00:03<00:04, 24.33it/s][A
Epoch 2:  34%|███▍      | 2031/5971 [19:02<36:56,  1.78it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  34%|███▎      | 56/167 [00:03<00:04, 24.32it/s][A
Epoch 2:  34%|███▍      | 2035/5971 [19:03<36:49,  1.78it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  35%|███▌      | 59/167 [00:03<00:04, 25.65it/s][A

Validating:  37%|███▋      | 62/167 [00:04<00:04, 26.23it/s][A
Epoch 2:  34%|███▍      | 2039/5971 [19:03<36:43,  1.78it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  39%|███▉      | 65/167 [00:04<00:03, 27.07it/s][A
Epoch 2:  34%|███▍      | 2043/5971 [19:03<36:37,  1.79it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  41%|████      | 68/167 [00:04<00:03, 27.48it/s][A
Epoch 2:  34%|███▍      | 2047/5971 [19:03<36:30,  1.79it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  43%|████▎     | 71/167 [00:04<00:03, 27.11it/s][A

Validating:  44%|████▍     | 74/167 [00:04<00:03, 24.67it/s][A
Epoch 2:  34%|███▍      | 2051/5971 [19:03<36:24,  1.79it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  47%|████▋     | 78/167 [00:04<00:03, 26.02it/s][A
Epoch 2:  34%|███▍      | 2055/5971 [19:03<36:18,  1.80it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  49%|████▊     | 81/167 [00:04<00:03, 26.84it/s][A
Epoch 2:  34%|███▍      | 2059/5971 [19:03<36:12,  1.80it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  50%|█████     | 84/167 [00:04<00:03, 26.99it/s][A
Epoch 2:  35%|███▍      | 2063/5971 [19:04<36:06,  1.80it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  52%|█████▏    | 87/167 [00:04<00:02, 27.67it/s][A

Validating:  54%|█████▍    | 90/167 [00:05<00:02, 28.14it/s][A
Epoch 2:  35%|███▍      | 2067/5971 [19:04<36:00,  1.81it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  56%|█████▋    | 94/167 [00:05<00:02, 28.68it/s][A
Epoch 2:  35%|███▍      | 2071/5971 [19:04<35:53,  1.81it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  58%|█████▊    | 97/167 [00:05<00:02, 28.33it/s][A
Epoch 2:  35%|███▍      | 2075/5971 [19:04<35:47,  1.81it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  60%|█████▉    | 100/167 [00:05<00:02, 27.23it/s][A
Epoch 2:  35%|███▍      | 2079/5971 [19:04<35:41,  1.82it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  62%|██████▏   | 104/167 [00:05<00:02, 27.50it/s][A
Epoch 2:  35%|███▍      | 2083/5971 [19:04<35:35,  1.82it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  64%|██████▍   | 107/167 [00:05<00:02, 25.38it/s][A

Validating:  66%|██████▌   | 110/167 [00:05<00:02, 26.20it/s][A
Epoch 2:  35%|███▍      | 2087/5971 [19:04<35:29,  1.82it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  68%|██████▊   | 113/167 [00:05<00:02, 26.73it/s][A
Epoch 2:  35%|███▌      | 2091/5971 [19:05<35:23,  1.83it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  69%|██████▉   | 116/167 [00:06<00:01, 25.86it/s][A
Epoch 2:  35%|███▌      | 2095/5971 [19:05<35:17,  1.83it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  71%|███████▏  | 119/167 [00:06<00:01, 26.32it/s][A

Validating:  73%|███████▎  | 122/167 [00:06<00:01, 25.55it/s][A
Epoch 2:  35%|███▌      | 2099/5971 [19:05<35:11,  1.83it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  75%|███████▍  | 125/167 [00:06<00:01, 24.98it/s][A
Epoch 2:  35%|███▌      | 2103/5971 [19:05<35:06,  1.84it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  77%|███████▋  | 128/167 [00:06<00:01, 26.04it/s][A
Epoch 2:  35%|███▌      | 2107/5971 [19:05<35:00,  1.84it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  78%|███████▊  | 131/167 [00:06<00:01, 25.87it/s][A

Validating:  80%|████████  | 134/167 [00:06<00:01, 25.20it/s][A
Epoch 2:  35%|███▌      | 2111/5971 [19:05<34:54,  1.84it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  82%|████████▏ | 137/167 [00:06<00:01, 25.53it/s][A
Epoch 2:  35%|███▌      | 2115/5971 [19:06<34:48,  1.85it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  84%|████████▍ | 140/167 [00:07<00:01, 23.95it/s][A
Epoch 2:  35%|███▌      | 2119/5971 [19:06<34:42,  1.85it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  86%|████████▌ | 143/167 [00:07<00:00, 24.44it/s][A

Validating:  87%|████████▋ | 146/167 [00:07<00:00, 25.77it/s][A
Epoch 2:  36%|███▌      | 2123/5971 [19:06<34:36,  1.85it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  89%|████████▉ | 149/167 [00:07<00:00, 25.26it/s][A
Epoch 2:  36%|███▌      | 2127/5971 [19:06<34:31,  1.86it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  91%|█████████ | 152/167 [00:07<00:00, 25.97it/s][A
Epoch 2:  36%|███▌      | 2131/5971 [19:06<34:25,  1.86it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  93%|█████████▎| 155/167 [00:07<00:00, 26.39it/s][A

Validating:  95%|█████████▍| 158/167 [00:07<00:00, 26.26it/s][A
Epoch 2:  36%|███▌      | 2135/5971 [19:06<34:19,  1.86it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  96%|█████████▋| 161/167 [00:07<00:00, 25.93it/s][A
Epoch 2:  36%|███▌      | 2139/5971 [19:07<34:13,  1.87it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  99%|█████████▉| 165/167 [00:07<00:00, 26.85it/s][A
Epoch 2:  36%|███▌      | 2143/5971 [19:07<34:08,  1.87it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  36%|███▌      | 2144/5971 [19:07<34:07,  1.87it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=7.52e-6, train/loss_step=0.00134, global_step=1349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Epoch 2:  36%|███▌      | 2145/5971 [19:08<34:07,  1.87it/s, loss=0.126, v_num=0, train/loss_simple_step=0.323, train/loss_vlb_step=0.00145, train/loss_step=0.323, global_step=1350.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  36%|███▌      | 2146/5971 [19:09<34:07,  1.87it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0152, train/loss_vlb_step=6.67e-5, train/loss_step=0.0152, global_step=1350.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  36%|███▌      | 2147/5971 [19:10<34:07,  1.87it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0152, train/loss_vlb_step=6.67e-5, train/loss_step=0.0152, global_step=1350.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  36%|███▌      | 2147/5971 [19:10<34:07,  1.87it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0646, train/loss_vlb_step=0.000222, train/loss_step=0.0646, global_step=1350.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  36%|███▌      | 2148/5971 [19:12<34:10,  1.86it/s, loss=0.118, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000766, train/loss_step=0.218, global_step=1350.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  36%|███▌      | 2149/5971 [19:13<34:10,  1.86it/s, loss=0.12, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000734, train/loss_step=0.216, global_step=1351.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  36%|███▌      | 2150/5971 [19:14<34:10,  1.86it/s, loss=0.137, v_num=0, train/loss_simple_step=0.387, train/loss_vlb_step=0.00291, train/loss_step=0.387, global_step=1351.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  36%|███▌      | 2151/5971 [19:15<34:10,  1.86it/s, loss=0.137, v_num=0, train/loss_simple_step=0.387, train/loss_vlb_step=0.00291, train/loss_step=0.387, global_step=1351.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  36%|███▌      | 2151/5971 [19:15<34:10,  1.86it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0243, train/loss_vlb_step=9.55e-5, train/loss_step=0.0243, global_step=1351.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  36%|███▌      | 2152/5971 [19:17<34:12,  1.86it/s, loss=0.167, v_num=0, train/loss_simple_step=0.638, train/loss_vlb_step=0.00755, train/loss_step=0.638, global_step=1351.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  36%|███▌      | 2153/5971 [19:18<34:12,  1.86it/s, loss=0.175, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000613, train/loss_step=0.172, global_step=1352.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  36%|███▌      | 2154/5971 [19:18<34:12,  1.86it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0287, train/loss_vlb_step=0.000108, train/loss_step=0.0287, global_step=1352.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  36%|███▌      | 2155/5971 [19:19<34:12,  1.86it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0287, train/loss_vlb_step=0.000108, train/loss_step=0.0287, global_step=1352.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  36%|███▌      | 2155/5971 [19:19<34:12,  1.86it/s, loss=0.189, v_num=0, train/loss_simple_step=0.258, train/loss_vlb_step=0.000913, train/loss_step=0.258, global_step=1352.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  36%|███▌      | 2156/5971 [19:22<34:15,  1.86it/s, loss=0.194, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000409, train/loss_step=0.123, global_step=1352.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  36%|███▌      | 2157/5971 [19:22<34:15,  1.86it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.65e-5, train/loss_step=0.0126, global_step=1353.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  36%|███▌      | 2158/5971 [19:23<34:15,  1.85it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0372, train/loss_vlb_step=0.000142, train/loss_step=0.0372, global_step=1353.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  36%|███▌      | 2159/5971 [19:24<34:15,  1.85it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0372, train/loss_vlb_step=0.000142, train/loss_step=0.0372, global_step=1353.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  36%|███▌      | 2159/5971 [19:24<34:15,  1.85it/s, loss=0.173, v_num=0, train/loss_simple_step=0.528, train/loss_vlb_step=0.00457, train/loss_step=0.528, global_step=1353.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  36%|███▌      | 2160/5971 [19:27<34:19,  1.85it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00222, train/loss_vlb_step=1.19e-5, train/loss_step=0.00222, global_step=1353.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  36%|███▌      | 2161/5971 [19:28<34:19,  1.85it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0888, train/loss_vlb_step=0.000301, train/loss_step=0.0888, global_step=1354.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  36%|███▌      | 2162/5971 [19:29<34:19,  1.85it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0184, train/loss_vlb_step=7.82e-5, train/loss_step=0.0184, global_step=1354.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  36%|███▌      | 2163/5971 [19:30<34:19,  1.85it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0184, train/loss_vlb_step=7.82e-5, train/loss_step=0.0184, global_step=1354.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  36%|███▌      | 2163/5971 [19:30<34:19,  1.85it/s, loss=0.198, v_num=0, train/loss_simple_step=0.797, train/loss_vlb_step=0.0194, train/loss_step=0.797, global_step=1354.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  36%|███▌      | 2164/5971 [19:32<34:22,  1.85it/s, loss=0.214, v_num=0, train/loss_simple_step=0.331, train/loss_vlb_step=0.00147, train/loss_step=0.331, global_step=1354.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  36%|███▋      | 2165/5971 [19:33<34:22,  1.85it/s, loss=0.222, v_num=0, train/loss_simple_step=0.481, train/loss_vlb_step=0.00305, train/loss_step=0.481, global_step=1355.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  36%|███▋      | 2166/5971 [19:34<34:22,  1.85it/s, loss=0.222, v_num=0, train/loss_simple_step=0.0226, train/loss_vlb_step=8.83e-5, train/loss_step=0.0226, global_step=1355.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  36%|███▋      | 2167/5971 [19:35<34:22,  1.84it/s, loss=0.222, v_num=0, train/loss_simple_step=0.0226, train/loss_vlb_step=8.83e-5, train/loss_step=0.0226, global_step=1355.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  36%|███▋      | 2167/5971 [19:35<34:22,  1.84it/s, loss=0.22, v_num=0, train/loss_simple_step=0.0158, train/loss_vlb_step=7.03e-5, train/loss_step=0.0158, global_step=1355.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  36%|███▋      | 2168/5971 [19:37<34:24,  1.84it/s, loss=0.226, v_num=0, train/loss_simple_step=0.332, train/loss_vlb_step=0.00151, train/loss_step=0.332, global_step=1355.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  36%|███▋      | 2169/5971 [19:38<34:24,  1.84it/s, loss=0.233, v_num=0, train/loss_simple_step=0.355, train/loss_vlb_step=0.00195, train/loss_step=0.355, global_step=1356.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  36%|███▋      | 2170/5971 [19:39<34:24,  1.84it/s, loss=0.214, v_num=0, train/loss_simple_step=0.0145, train/loss_vlb_step=5.8e-5, train/loss_step=0.0145, global_step=1356.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  36%|███▋      | 2171/5971 [19:40<34:24,  1.84it/s, loss=0.214, v_num=0, train/loss_simple_step=0.0145, train/loss_vlb_step=5.8e-5, train/loss_step=0.0145, global_step=1356.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  36%|███▋      | 2171/5971 [19:40<34:24,  1.84it/s, loss=0.213, v_num=0, train/loss_simple_step=0.00224, train/loss_vlb_step=1.26e-5, train/loss_step=0.00224, global_step=1356.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  36%|███▋      | 2172/5971 [19:42<34:27,  1.84it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0025, train/loss_vlb_step=1.42e-5, train/loss_step=0.0025, global_step=1356.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  36%|███▋      | 2173/5971 [19:43<34:27,  1.84it/s, loss=0.189, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00159, train/loss_step=0.329, global_step=1357.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  36%|███▋      | 2174/5971 [19:44<34:27,  1.84it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0188, train/loss_vlb_step=7.93e-5, train/loss_step=0.0188, global_step=1357.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  36%|███▋      | 2175/5971 [19:45<34:27,  1.84it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0188, train/loss_vlb_step=7.93e-5, train/loss_step=0.0188, global_step=1357.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  36%|███▋      | 2175/5971 [19:45<34:27,  1.84it/s, loss=0.18, v_num=0, train/loss_simple_step=0.080, train/loss_vlb_step=0.000264, train/loss_step=0.080, global_step=1357.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  36%|███▋      | 2176/5971 [19:47<34:29,  1.83it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0026, train/loss_vlb_step=1.5e-5, train/loss_step=0.0026, global_step=1357.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  36%|███▋      | 2177/5971 [19:48<34:29,  1.83it/s, loss=0.18, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000513, train/loss_step=0.148, global_step=1358.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  36%|███▋      | 2178/5971 [19:49<34:30,  1.83it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0532, train/loss_vlb_step=0.00018, train/loss_step=0.0532, global_step=1358.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  36%|███▋      | 2179/5971 [19:50<34:30,  1.83it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0532, train/loss_vlb_step=0.00018, train/loss_step=0.0532, global_step=1358.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  36%|███▋      | 2179/5971 [19:50<34:30,  1.83it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00203, train/loss_vlb_step=1.18e-5, train/loss_step=0.00203, global_step=1358.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2180/5971 [19:52<34:32,  1.83it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0279, train/loss_vlb_step=0.000113, train/loss_step=0.0279, global_step=1358.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  37%|███▋      | 2181/5971 [19:53<34:32,  1.83it/s, loss=0.158, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000449, train/loss_step=0.135, global_step=1359.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  37%|███▋      | 2182/5971 [19:54<34:32,  1.83it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0289, train/loss_vlb_step=0.000114, train/loss_step=0.0289, global_step=1359.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2183/5971 [19:54<34:32,  1.83it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0289, train/loss_vlb_step=0.000114, train/loss_step=0.0289, global_step=1359.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2183/5971 [19:54<34:32,  1.83it/s, loss=0.149, v_num=0, train/loss_simple_step=0.605, train/loss_vlb_step=0.00438, train/loss_step=0.605, global_step=1359.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  37%|███▋      | 2184/5971 [19:57<34:34,  1.83it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0815, train/loss_vlb_step=0.00027, train/loss_step=0.0815, global_step=1359.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2185/5971 [19:57<34:34,  1.82it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0573, train/loss_vlb_step=0.000196, train/loss_step=0.0573, global_step=1360.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2186/5971 [19:58<34:34,  1.82it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0815, train/loss_vlb_step=0.000278, train/loss_step=0.0815, global_step=1360.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2187/5971 [19:59<34:34,  1.82it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0815, train/loss_vlb_step=0.000278, train/loss_step=0.0815, global_step=1360.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2187/5971 [19:59<34:34,  1.82it/s, loss=0.128, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000755, train/loss_step=0.194, global_step=1360.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  37%|███▋      | 2188/5971 [20:01<34:37,  1.82it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00605, train/loss_vlb_step=3.11e-5, train/loss_step=0.00605, global_step=1360.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2189/5971 [20:02<34:37,  1.82it/s, loss=0.0948, v_num=0, train/loss_simple_step=0.0262, train/loss_vlb_step=0.000102, train/loss_step=0.0262, global_step=1361.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2190/5971 [20:03<34:37,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.370, train/loss_vlb_step=0.00185, train/loss_step=0.370, global_step=1361.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  37%|███▋      | 2191/5971 [20:04<34:37,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.370, train/loss_vlb_step=0.00185, train/loss_step=0.370, global_step=1361.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2191/5971 [20:04<34:37,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00211, train/loss_vlb_step=1.24e-5, train/loss_step=0.00211, global_step=1361.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2192/5971 [20:06<34:39,  1.82it/s, loss=0.133, v_num=0, train/loss_simple_step=0.412, train/loss_vlb_step=0.00253, train/loss_step=0.412, global_step=1361.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  37%|███▋      | 2193/5971 [20:07<34:39,  1.82it/s, loss=0.131, v_num=0, train/loss_simple_step=0.280, train/loss_vlb_step=0.00108, train/loss_step=0.280, global_step=1362.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2194/5971 [20:08<34:39,  1.82it/s, loss=0.147, v_num=0, train/loss_simple_step=0.347, train/loss_vlb_step=0.00222, train/loss_step=0.347, global_step=1362.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2195/5971 [20:09<34:39,  1.82it/s, loss=0.147, v_num=0, train/loss_simple_step=0.347, train/loss_vlb_step=0.00222, train/loss_step=0.347, global_step=1362.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2195/5971 [20:09<34:39,  1.82it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0634, train/loss_vlb_step=0.000215, train/loss_step=0.0634, global_step=1362.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2196/5971 [20:11<34:42,  1.81it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00422, train/loss_vlb_step=2.21e-5, train/loss_step=0.00422, global_step=1362.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2197/5971 [20:12<34:42,  1.81it/s, loss=0.168, v_num=0, train/loss_simple_step=0.576, train/loss_vlb_step=0.00558, train/loss_step=0.576, global_step=1363.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  37%|███▋      | 2198/5971 [20:13<34:42,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00366, train/loss_vlb_step=2.07e-5, train/loss_step=0.00366, global_step=1363.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2199/5971 [20:14<34:42,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00366, train/loss_vlb_step=2.07e-5, train/loss_step=0.00366, global_step=1363.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2199/5971 [20:14<34:42,  1.81it/s, loss=0.188, v_num=0, train/loss_simple_step=0.458, train/loss_vlb_step=0.00297, train/loss_step=0.458, global_step=1363.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  37%|███▋      | 2200/5971 [20:16<34:44,  1.81it/s, loss=0.188, v_num=0, train/loss_simple_step=0.019, train/loss_vlb_step=8.24e-5, train/loss_step=0.019, global_step=1363.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2201/5971 [20:17<34:44,  1.81it/s, loss=0.181, v_num=0, train/loss_simple_step=0.00484, train/loss_vlb_step=2.49e-5, train/loss_step=0.00484, global_step=1364.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2202/5971 [20:18<34:44,  1.81it/s, loss=0.203, v_num=0, train/loss_simple_step=0.461, train/loss_vlb_step=0.00318, train/loss_step=0.461, global_step=1364.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  37%|███▋      | 2203/5971 [20:19<34:44,  1.81it/s, loss=0.203, v_num=0, train/loss_simple_step=0.461, train/loss_vlb_step=0.00318, train/loss_step=0.461, global_step=1364.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2203/5971 [20:19<34:44,  1.81it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00236, train/loss_vlb_step=1.34e-5, train/loss_step=0.00236, global_step=1364.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2204/5971 [20:21<34:47,  1.80it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0707, train/loss_vlb_step=0.000233, train/loss_step=0.0707, global_step=1364.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  37%|███▋      | 2205/5971 [20:22<34:47,  1.80it/s, loss=0.178, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000633, train/loss_step=0.180, global_step=1365.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  37%|███▋      | 2206/5971 [20:23<34:47,  1.80it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0261, train/loss_vlb_step=9.49e-5, train/loss_step=0.0261, global_step=1365.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2207/5971 [20:24<34:47,  1.80it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0261, train/loss_vlb_step=9.49e-5, train/loss_step=0.0261, global_step=1365.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2207/5971 [20:24<34:47,  1.80it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00957, train/loss_vlb_step=4.31e-5, train/loss_step=0.00957, global_step=1365.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2208/5971 [20:27<34:50,  1.80it/s, loss=0.174, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000536, train/loss_step=0.161, global_step=1365.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  37%|███▋      | 2209/5971 [20:28<34:50,  1.80it/s, loss=0.173, v_num=0, train/loss_simple_step=0.005, train/loss_vlb_step=2.56e-5, train/loss_step=0.005, global_step=1366.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  37%|███▋      | 2210/5971 [20:29<34:50,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0236, train/loss_vlb_step=9.22e-5, train/loss_step=0.0236, global_step=1366.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2211/5971 [20:30<34:51,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0236, train/loss_vlb_step=9.22e-5, train/loss_step=0.0236, global_step=1366.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2211/5971 [20:30<34:51,  1.80it/s, loss=0.171, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00125, train/loss_step=0.305, global_step=1366.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  37%|███▋      | 2212/5971 [20:32<34:53,  1.80it/s, loss=0.183, v_num=0, train/loss_simple_step=0.654, train/loss_vlb_step=0.0085, train/loss_step=0.654, global_step=1366.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  37%|███▋      | 2213/5971 [20:33<34:53,  1.80it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0682, train/loss_vlb_step=0.000233, train/loss_step=0.0682, global_step=1367.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2214/5971 [20:34<34:53,  1.79it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0615, train/loss_vlb_step=0.000215, train/loss_step=0.0615, global_step=1367.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2215/5971 [20:34<34:53,  1.79it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0615, train/loss_vlb_step=0.000215, train/loss_step=0.0615, global_step=1367.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2215/5971 [20:34<34:53,  1.79it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0864, train/loss_vlb_step=0.000286, train/loss_step=0.0864, global_step=1367.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2216/5971 [20:37<34:55,  1.79it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0389, train/loss_vlb_step=0.000146, train/loss_step=0.0389, global_step=1367.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2217/5971 [20:38<34:55,  1.79it/s, loss=0.155, v_num=0, train/loss_simple_step=0.469, train/loss_vlb_step=0.00503, train/loss_step=0.469, global_step=1368.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  37%|███▋      | 2218/5971 [20:39<34:55,  1.79it/s, loss=0.167, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.000835, train/loss_step=0.234, global_step=1368.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2219/5971 [20:40<34:55,  1.79it/s, loss=0.167, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.000835, train/loss_step=0.234, global_step=1368.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2219/5971 [20:40<34:55,  1.79it/s, loss=0.152, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000579, train/loss_step=0.158, global_step=1368.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2220/5971 [20:42<34:58,  1.79it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0852, train/loss_vlb_step=0.000283, train/loss_step=0.0852, global_step=1368.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2221/5971 [20:43<34:58,  1.79it/s, loss=0.161, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000361, train/loss_step=0.110, global_step=1369.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  37%|███▋      | 2222/5971 [20:44<34:58,  1.79it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0436, train/loss_vlb_step=0.000161, train/loss_step=0.0436, global_step=1369.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2223/5971 [20:44<34:57,  1.79it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0436, train/loss_vlb_step=0.000161, train/loss_step=0.0436, global_step=1369.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2223/5971 [20:44<34:57,  1.79it/s, loss=0.151, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.000909, train/loss_step=0.231, global_step=1369.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  37%|███▋      | 2224/5971 [20:47<35:00,  1.78it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0871, train/loss_vlb_step=0.00029, train/loss_step=0.0871, global_step=1369.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2225/5971 [20:48<35:00,  1.78it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0207, train/loss_vlb_step=8.36e-5, train/loss_step=0.0207, global_step=1370.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2226/5971 [20:49<35:00,  1.78it/s, loss=0.162, v_num=0, train/loss_simple_step=0.394, train/loss_vlb_step=0.00194, train/loss_step=0.394, global_step=1370.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  37%|███▋      | 2227/5971 [20:50<35:00,  1.78it/s, loss=0.162, v_num=0, train/loss_simple_step=0.394, train/loss_vlb_step=0.00194, train/loss_step=0.394, global_step=1370.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2227/5971 [20:50<35:00,  1.78it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00264, train/loss_vlb_step=1.48e-5, train/loss_step=0.00264, global_step=1370.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2228/5971 [20:52<35:03,  1.78it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0696, train/loss_vlb_step=0.000242, train/loss_step=0.0696, global_step=1370.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  37%|███▋      | 2229/5971 [20:53<35:03,  1.78it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000317, train/loss_step=0.0958, global_step=1371.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2230/5971 [20:54<35:03,  1.78it/s, loss=0.186, v_num=0, train/loss_simple_step=0.497, train/loss_vlb_step=0.00348, train/loss_step=0.497, global_step=1371.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  37%|███▋      | 2231/5971 [20:55<35:03,  1.78it/s, loss=0.186, v_num=0, train/loss_simple_step=0.497, train/loss_vlb_step=0.00348, train/loss_step=0.497, global_step=1371.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2231/5971 [20:55<35:03,  1.78it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0078, train/loss_vlb_step=3.9e-5, train/loss_step=0.0078, global_step=1371.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2232/5971 [20:58<35:06,  1.77it/s, loss=0.149, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.000797, train/loss_step=0.219, global_step=1371.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2233/5971 [20:59<35:07,  1.77it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0428, train/loss_vlb_step=0.000156, train/loss_step=0.0428, global_step=1372.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2234/5971 [21:00<35:07,  1.77it/s, loss=0.145, v_num=0, train/loss_simple_step=0.014, train/loss_vlb_step=6.14e-5, train/loss_step=0.014, global_step=1372.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  37%|███▋      | 2235/5971 [21:01<35:07,  1.77it/s, loss=0.145, v_num=0, train/loss_simple_step=0.014, train/loss_vlb_step=6.14e-5, train/loss_step=0.014, global_step=1372.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2235/5971 [21:01<35:07,  1.77it/s, loss=0.144, v_num=0, train/loss_simple_step=0.058, train/loss_vlb_step=0.000208, train/loss_step=0.058, global_step=1372.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2236/5971 [21:03<35:09,  1.77it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0155, train/loss_vlb_step=6.64e-5, train/loss_step=0.0155, global_step=1372.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2237/5971 [21:04<35:09,  1.77it/s, loss=0.135, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00143, train/loss_step=0.318, global_step=1373.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  37%|███▋      | 2238/5971 [21:05<35:09,  1.77it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00173, train/loss_vlb_step=1.04e-5, train/loss_step=0.00173, global_step=1373.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2239/5971 [21:06<35:09,  1.77it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00173, train/loss_vlb_step=1.04e-5, train/loss_step=0.00173, global_step=1373.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  37%|███▋      | 2239/5971 [21:06<35:09,  1.77it/s, loss=0.155, v_num=0, train/loss_simple_step=0.794, train/loss_vlb_step=0.0278, train/loss_step=0.794, global_step=1373.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2:  38%|███▊      | 2240/5971 [21:08<35:11,  1.77it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00476, train/loss_vlb_step=2.45e-5, train/loss_step=0.00476, global_step=1373.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  38%|███▊      | 2241/5971 [21:09<35:11,  1.77it/s, loss=0.154, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000522, train/loss_step=0.154, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  38%|███▊      | 2242/5971 [21:10<35:11,  1.77it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.11e-5, train/loss_step=0.00199, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  38%|███▊      | 2243/5971 [21:10<35:11,  1.77it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.11e-5, train/loss_step=0.00199, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  38%|███▊      | 2243/5971 [21:10<35:11,  1.77it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0383, train/loss_vlb_step=0.000144, train/loss_step=0.0383, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  38%|███▊      | 2244/5971 [21:13<35:14,  1.76it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Validating: 0it [00:00, ?it/s]
Validating:   0%|          | 0/167 [00:00<?, ?it/s]
Validating:   1%|          | 1/167 [00:00<01:06,  2.51it/s]
Validating:   1%|          | 2/167 [00:00<00:53,  3.08it/s]
Epoch 2:  38%|███▊      | 2247/5971 [21:14<35:11,  1.76it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Validating:   3%|▎         | 5/167 [00:00<00:23,  6.89it/s]
Epoch 2:  38%|███▊      | 2251/5971 [21:14<35:05,  1.77it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:01<00:14, 11.04it/s][A
Epoch 2:  38%|███▊      | 2255/5971 [21:14<34:59,  1.77it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   7%|▋         | 11/167 [00:01<00:10, 14.39it/s][A

Validating:   8%|▊         | 14/167 [00:01<00:08, 17.72it/s][A
Epoch 2:  38%|███▊      | 2259/5971 [21:14<34:54,  1.77it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  10%|█         | 17/167 [00:01<00:07, 19.56it/s][A
Epoch 2:  38%|███▊      | 2263/5971 [21:15<34:48,  1.78it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 21.68it/s][A
Epoch 2:  38%|███▊      | 2267/5971 [21:15<34:42,  1.78it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 23.03it/s][A

Validating:  16%|█▌        | 26/167 [00:01<00:05, 23.67it/s][A
Epoch 2:  38%|███▊      | 2271/5971 [21:15<34:37,  1.78it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  17%|█▋        | 29/167 [00:01<00:06, 21.84it/s][A
Epoch 2:  38%|███▊      | 2275/5971 [21:15<34:31,  1.78it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  19%|█▉        | 32/167 [00:01<00:06, 22.02it/s][A
Epoch 2:  38%|███▊      | 2279/5971 [21:15<34:25,  1.79it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  21%|██        | 35/167 [00:02<00:05, 23.28it/s][A

Validating:  23%|██▎       | 38/167 [00:02<00:05, 24.87it/s][A
Epoch 2:  38%|███▊      | 2283/5971 [21:15<34:20,  1.79it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  25%|██▍       | 41/167 [00:02<00:05, 24.81it/s][A
Epoch 2:  38%|███▊      | 2287/5971 [21:16<34:14,  1.79it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  26%|██▋       | 44/167 [00:02<00:07, 16.77it/s][A
Epoch 2:  38%|███▊      | 2291/5971 [21:16<34:09,  1.80it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  28%|██▊       | 47/167 [00:02<00:07, 15.97it/s][A

Validating:  30%|██▉       | 50/167 [00:03<00:12,  9.43it/s][A
Epoch 2:  38%|███▊      | 2295/5971 [21:17<34:04,  1.80it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  32%|███▏      | 53/167 [00:03<00:09, 11.86it/s][A
Epoch 2:  39%|███▊      | 2299/5971 [21:17<33:59,  1.80it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  34%|███▎      | 56/167 [00:03<00:08, 13.76it/s][A

Validating:  35%|███▍      | 58/167 [00:03<00:07, 14.43it/s][A
Epoch 2:  39%|███▊      | 2303/5971 [21:17<33:53,  1.80it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  36%|███▌      | 60/167 [00:03<00:07, 15.17it/s][A

Validating:  37%|███▋      | 62/167 [00:04<00:07, 14.96it/s][A
Epoch 2:  39%|███▊      | 2307/5971 [21:17<33:48,  1.81it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  38%|███▊      | 64/167 [00:04<00:08, 12.83it/s][A
Epoch 2:  39%|███▊      | 2311/5971 [21:18<33:43,  1.81it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  40%|████      | 67/167 [00:04<00:06, 16.01it/s][A

Validating:  42%|████▏     | 70/167 [00:04<00:05, 17.65it/s][A
Epoch 2:  39%|███▉      | 2315/5971 [21:18<33:37,  1.81it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  44%|████▎     | 73/167 [00:04<00:04, 19.19it/s][A
Epoch 2:  39%|███▉      | 2319/5971 [21:18<33:32,  1.81it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  46%|████▌     | 76/167 [00:04<00:04, 21.37it/s][A
Epoch 2:  39%|███▉      | 2323/5971 [21:18<33:27,  1.82it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  47%|████▋     | 79/167 [00:04<00:03, 22.16it/s][A

Validating:  49%|████▉     | 82/167 [00:05<00:03, 23.13it/s][A
Epoch 2:  39%|███▉      | 2327/5971 [21:18<33:21,  1.82it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  51%|█████     | 85/167 [00:05<00:03, 24.63it/s][A
Epoch 2:  39%|███▉      | 2331/5971 [21:18<33:16,  1.82it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  53%|█████▎    | 88/167 [00:05<00:03, 25.06it/s][A
Epoch 2:  39%|███▉      | 2335/5971 [21:19<33:10,  1.83it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  55%|█████▌    | 92/167 [00:05<00:02, 26.41it/s][A
Epoch 2:  39%|███▉      | 2339/5971 [21:19<33:05,  1.83it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  57%|█████▋    | 96/167 [00:05<00:02, 27.92it/s][A
Epoch 2:  39%|███▉      | 2343/5971 [21:19<33:00,  1.83it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  59%|█████▉    | 99/167 [00:05<00:02, 27.58it/s][A

Validating:  61%|██████    | 102/167 [00:05<00:02, 27.17it/s][A
Epoch 2:  39%|███▉      | 2347/5971 [21:19<32:54,  1.84it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  63%|██████▎   | 105/167 [00:05<00:02, 27.13it/s][A
Epoch 2:  39%|███▉      | 2351/5971 [21:19<32:49,  1.84it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  65%|██████▍   | 108/167 [00:05<00:02, 25.96it/s][A
Epoch 2:  39%|███▉      | 2355/5971 [21:19<32:44,  1.84it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  66%|██████▋   | 111/167 [00:06<00:02, 26.83it/s][A

Validating:  68%|██████▊   | 114/167 [00:06<00:01, 27.12it/s][A
Epoch 2:  40%|███▉      | 2359/5971 [21:19<32:38,  1.84it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  70%|███████   | 117/167 [00:06<00:01, 26.51it/s][A
Epoch 2:  40%|███▉      | 2363/5971 [21:20<32:33,  1.85it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  72%|███████▏  | 120/167 [00:06<00:01, 26.38it/s][A
Epoch 2:  40%|███▉      | 2367/5971 [21:20<32:28,  1.85it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  74%|███████▎  | 123/167 [00:06<00:02, 21.16it/s][A

Validating:  75%|███████▌  | 126/167 [00:06<00:02, 19.74it/s][A
Epoch 2:  40%|███▉      | 2371/5971 [21:20<32:23,  1.85it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  77%|███████▋  | 129/167 [00:06<00:01, 19.86it/s][A
Epoch 2:  40%|███▉      | 2375/5971 [21:20<32:18,  1.86it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  79%|███████▉  | 132/167 [00:07<00:01, 20.89it/s][A
Epoch 2:  40%|███▉      | 2379/5971 [21:20<32:13,  1.86it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  81%|████████  | 135/167 [00:07<00:01, 22.09it/s][A

Validating:  83%|████████▎ | 138/167 [00:07<00:01, 22.98it/s][A
Epoch 2:  40%|███▉      | 2383/5971 [21:21<32:08,  1.86it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  84%|████████▍ | 141/167 [00:07<00:01, 24.00it/s][A
Epoch 2:  40%|███▉      | 2387/5971 [21:21<32:02,  1.86it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  86%|████████▌ | 144/167 [00:07<00:00, 24.43it/s][A
Epoch 2:  40%|████      | 2391/5971 [21:21<31:57,  1.87it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  88%|████████▊ | 147/167 [00:07<00:00, 24.01it/s][A

Validating:  90%|████████▉ | 150/167 [00:07<00:00, 25.37it/s][A
Epoch 2:  40%|████      | 2395/5971 [21:21<31:52,  1.87it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  92%|█████████▏| 153/167 [00:08<00:00, 19.16it/s][A
Epoch 2:  40%|████      | 2399/5971 [21:21<31:47,  1.87it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  94%|█████████▍| 157/167 [00:08<00:00, 21.86it/s][A
Epoch 2:  40%|████      | 2403/5971 [21:21<31:42,  1.88it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  96%|█████████▌| 160/167 [00:08<00:00, 22.82it/s][A
Epoch 2:  40%|████      | 2407/5971 [21:22<31:37,  1.88it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  98%|█████████▊| 164/167 [00:08<00:00, 24.73it/s][A
Epoch 2:  40%|████      | 2411/5971 [21:22<31:32,  1.88it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating: 100%|██████████| 167/167 [00:08<00:00, 25.89it/s][A
Epoch 2:  40%|████      | 2412/5971 [21:22<31:31,  1.88it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  40%|████      | 2413/5971 [21:23<31:31,  1.88it/s, loss=0.153, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.000889, train/loss_step=0.251, global_step=1375.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  40%|████      | 2414/5971 [21:24<31:31,  1.88it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0534, train/loss_vlb_step=0.000184, train/loss_step=0.0534, global_step=1375.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  40%|████      | 2415/5971 [21:25<31:31,  1.88it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0534, train/loss_vlb_step=0.000184, train/loss_step=0.0534, global_step=1375.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  40%|████      | 2415/5971 [21:25<31:31,  1.88it/s, loss=0.177, v_num=0, train/loss_simple_step=0.820, train/loss_vlb_step=0.0699, train/loss_step=0.820, global_step=1375.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  40%|████      | 2416/5971 [21:28<31:34,  1.88it/s, loss=0.184, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000728, train/loss_step=0.200, global_step=1375.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  40%|████      | 2417/5971 [21:29<31:34,  1.88it/s, loss=0.202, v_num=0, train/loss_simple_step=0.461, train/loss_vlb_step=0.004, train/loss_step=0.461, global_step=1376.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  40%|████      | 2418/5971 [21:30<31:34,  1.88it/s, loss=0.198, v_num=0, train/loss_simple_step=0.417, train/loss_vlb_step=0.00305, train/loss_step=0.417, global_step=1376.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2419/5971 [21:30<31:34,  1.87it/s, loss=0.198, v_num=0, train/loss_simple_step=0.417, train/loss_vlb_step=0.00305, train/loss_step=0.417, global_step=1376.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2419/5971 [21:30<31:34,  1.87it/s, loss=0.212, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00186, train/loss_step=0.292, global_step=1376.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2420/5971 [21:33<31:36,  1.87it/s, loss=0.22, v_num=0, train/loss_simple_step=0.374, train/loss_vlb_step=0.00188, train/loss_step=0.374, global_step=1376.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  41%|████      | 2421/5971 [21:33<31:36,  1.87it/s, loss=0.225, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000499, train/loss_step=0.151, global_step=1377.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2422/5971 [21:34<31:36,  1.87it/s, loss=0.226, v_num=0, train/loss_simple_step=0.0224, train/loss_vlb_step=9.07e-5, train/loss_step=0.0224, global_step=1377.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2423/5971 [21:35<31:36,  1.87it/s, loss=0.226, v_num=0, train/loss_simple_step=0.0224, train/loss_vlb_step=9.07e-5, train/loss_step=0.0224, global_step=1377.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2423/5971 [21:35<31:36,  1.87it/s, loss=0.223, v_num=0, train/loss_simple_step=0.00576, train/loss_vlb_step=2.83e-5, train/loss_step=0.00576, global_step=1377.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2424/5971 [21:37<31:38,  1.87it/s, loss=0.224, v_num=0, train/loss_simple_step=0.0309, train/loss_vlb_step=0.000117, train/loss_step=0.0309, global_step=1377.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  41%|████      | 2425/5971 [21:38<31:38,  1.87it/s, loss=0.208, v_num=0, train/loss_simple_step=0.00141, train/loss_vlb_step=8.57e-6, train/loss_step=0.00141, global_step=1378.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2426/5971 [21:39<31:38,  1.87it/s, loss=0.212, v_num=0, train/loss_simple_step=0.0698, train/loss_vlb_step=0.000243, train/loss_step=0.0698, global_step=1378.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  41%|████      | 2427/5971 [21:40<31:38,  1.87it/s, loss=0.212, v_num=0, train/loss_simple_step=0.0698, train/loss_vlb_step=0.000243, train/loss_step=0.0698, global_step=1378.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2427/5971 [21:40<31:38,  1.87it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00497, train/loss_vlb_step=2.43e-5, train/loss_step=0.00497, global_step=1378.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2428/5971 [21:42<31:40,  1.86it/s, loss=0.182, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000706, train/loss_step=0.192, global_step=1378.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  41%|████      | 2429/5971 [21:43<31:40,  1.86it/s, loss=0.189, v_num=0, train/loss_simple_step=0.302, train/loss_vlb_step=0.00119, train/loss_step=0.302, global_step=1379.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  41%|████      | 2430/5971 [21:44<31:40,  1.86it/s, loss=0.201, v_num=0, train/loss_simple_step=0.239, train/loss_vlb_step=0.000952, train/loss_step=0.239, global_step=1379.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2431/5971 [21:45<31:40,  1.86it/s, loss=0.201, v_num=0, train/loss_simple_step=0.239, train/loss_vlb_step=0.000952, train/loss_step=0.239, global_step=1379.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2431/5971 [21:45<31:40,  1.86it/s, loss=0.209, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000858, train/loss_step=0.201, global_step=1379.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2432/5971 [21:47<31:42,  1.86it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.1e-5, train/loss_step=0.0112, global_step=1379.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2433/5971 [21:48<31:42,  1.86it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0182, train/loss_vlb_step=7.55e-5, train/loss_step=0.0182, global_step=1380.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2434/5971 [21:49<31:42,  1.86it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0293, train/loss_vlb_step=0.000108, train/loss_step=0.0293, global_step=1380.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2435/5971 [21:50<31:42,  1.86it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0293, train/loss_vlb_step=0.000108, train/loss_step=0.0293, global_step=1380.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2435/5971 [21:50<31:42,  1.86it/s, loss=0.154, v_num=0, train/loss_simple_step=0.058, train/loss_vlb_step=0.000192, train/loss_step=0.058, global_step=1380.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  41%|████      | 2436/5971 [21:52<31:44,  1.86it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0189, train/loss_vlb_step=7.8e-5, train/loss_step=0.0189, global_step=1380.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2437/5971 [21:53<31:44,  1.86it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0134, train/loss_vlb_step=6.14e-5, train/loss_step=0.0134, global_step=1381.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2438/5971 [21:54<31:44,  1.86it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00206, train/loss_vlb_step=1.24e-5, train/loss_step=0.00206, global_step=1381.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2439/5971 [21:55<31:44,  1.85it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00206, train/loss_vlb_step=1.24e-5, train/loss_step=0.00206, global_step=1381.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2439/5971 [21:55<31:44,  1.85it/s, loss=0.109, v_num=0, train/loss_simple_step=0.425, train/loss_vlb_step=0.00262, train/loss_step=0.425, global_step=1381.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  41%|████      | 2440/5971 [21:57<31:46,  1.85it/s, loss=0.104, v_num=0, train/loss_simple_step=0.276, train/loss_vlb_step=0.00103, train/loss_step=0.276, global_step=1381.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2441/5971 [21:58<31:46,  1.85it/s, loss=0.134, v_num=0, train/loss_simple_step=0.748, train/loss_vlb_step=0.0387, train/loss_step=0.748, global_step=1382.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  41%|████      | 2442/5971 [21:59<31:46,  1.85it/s, loss=0.14, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.00053, train/loss_step=0.161, global_step=1382.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2443/5971 [22:00<31:46,  1.85it/s, loss=0.14, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.00053, train/loss_step=0.161, global_step=1382.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2443/5971 [22:00<31:46,  1.85it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00751, train/loss_vlb_step=3.58e-5, train/loss_step=0.00751, global_step=1382.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2444/5971 [22:02<31:47,  1.85it/s, loss=0.149, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000646, train/loss_step=0.196, global_step=1382.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  41%|████      | 2445/5971 [22:03<31:48,  1.85it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0246, train/loss_vlb_step=9.63e-5, train/loss_step=0.0246, global_step=1383.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2446/5971 [22:04<31:48,  1.85it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0403, train/loss_vlb_step=0.00015, train/loss_step=0.0403, global_step=1383.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2447/5971 [22:05<31:48,  1.85it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0403, train/loss_vlb_step=0.00015, train/loss_step=0.0403, global_step=1383.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2447/5971 [22:05<31:48,  1.85it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.21e-5, train/loss_step=0.0115, global_step=1383.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2448/5971 [22:07<31:49,  1.84it/s, loss=0.172, v_num=0, train/loss_simple_step=0.661, train/loss_vlb_step=0.0108, train/loss_step=0.661, global_step=1383.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  41%|████      | 2449/5971 [22:08<31:49,  1.84it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0522, train/loss_vlb_step=0.000188, train/loss_step=0.0522, global_step=1384.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2450/5971 [22:09<31:49,  1.84it/s, loss=0.178, v_num=0, train/loss_simple_step=0.599, train/loss_vlb_step=0.101, train/loss_step=0.599, global_step=1384.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  41%|████      | 2451/5971 [22:10<31:49,  1.84it/s, loss=0.178, v_num=0, train/loss_simple_step=0.599, train/loss_vlb_step=0.101, train/loss_step=0.599, global_step=1384.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2451/5971 [22:10<31:49,  1.84it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00847, train/loss_vlb_step=3.95e-5, train/loss_step=0.00847, global_step=1384.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2452/5971 [22:12<31:51,  1.84it/s, loss=0.18, v_num=0, train/loss_simple_step=0.241, train/loss_vlb_step=0.000867, train/loss_step=0.241, global_step=1384.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  41%|████      | 2453/5971 [22:13<31:51,  1.84it/s, loss=0.179, v_num=0, train/loss_simple_step=0.00403, train/loss_vlb_step=2.13e-5, train/loss_step=0.00403, global_step=1385.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2454/5971 [22:14<31:51,  1.84it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.9e-5, train/loss_step=0.0126, global_step=1385.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  41%|████      | 2455/5971 [22:15<31:51,  1.84it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.9e-5, train/loss_step=0.0126, global_step=1385.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2455/5971 [22:15<31:51,  1.84it/s, loss=0.175, v_num=0, train/loss_simple_step=0.00235, train/loss_vlb_step=1.35e-5, train/loss_step=0.00235, global_step=1385.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2456/5971 [22:17<31:53,  1.84it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0228, train/loss_vlb_step=9.2e-5, train/loss_step=0.0228, global_step=1385.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  41%|████      | 2457/5971 [22:18<31:53,  1.84it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0352, train/loss_vlb_step=0.000129, train/loss_step=0.0352, global_step=1386.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2458/5971 [22:19<31:53,  1.84it/s, loss=0.21, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0187, train/loss_step=0.668, global_step=1386.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2:  41%|████      | 2459/5971 [22:20<31:53,  1.84it/s, loss=0.21, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0187, train/loss_step=0.668, global_step=1386.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2459/5971 [22:20<31:53,  1.84it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0738, train/loss_vlb_step=0.000245, train/loss_step=0.0738, global_step=1386.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2460/5971 [22:22<31:55,  1.83it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0884, train/loss_vlb_step=0.000293, train/loss_step=0.0884, global_step=1386.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2461/5971 [22:23<31:54,  1.83it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0378, train/loss_vlb_step=0.000136, train/loss_step=0.0378, global_step=1387.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2462/5971 [22:24<31:54,  1.83it/s, loss=0.147, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.00049, train/loss_step=0.147, global_step=1387.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  41%|████      | 2463/5971 [22:24<31:54,  1.83it/s, loss=0.147, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.00049, train/loss_step=0.147, global_step=1387.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████      | 2463/5971 [22:24<31:54,  1.83it/s, loss=0.148, v_num=0, train/loss_simple_step=0.036, train/loss_vlb_step=0.000135, train/loss_step=0.036, global_step=1387.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████▏     | 2464/5971 [22:27<31:56,  1.83it/s, loss=0.145, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000466, train/loss_step=0.141, global_step=1387.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████▏     | 2465/5971 [22:28<31:56,  1.83it/s, loss=0.145, v_num=0, train/loss_simple_step=0.013, train/loss_vlb_step=5.76e-5, train/loss_step=0.013, global_step=1388.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  41%|████▏     | 2466/5971 [22:29<31:56,  1.83it/s, loss=0.146, v_num=0, train/loss_simple_step=0.059, train/loss_vlb_step=0.000209, train/loss_step=0.059, global_step=1388.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████▏     | 2467/5971 [22:29<31:56,  1.83it/s, loss=0.146, v_num=0, train/loss_simple_step=0.059, train/loss_vlb_step=0.000209, train/loss_step=0.059, global_step=1388.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████▏     | 2467/5971 [22:29<31:56,  1.83it/s, loss=0.147, v_num=0, train/loss_simple_step=0.038, train/loss_vlb_step=0.000136, train/loss_step=0.038, global_step=1388.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████▏     | 2468/5971 [22:32<31:58,  1.83it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0439, train/loss_vlb_step=0.000156, train/loss_step=0.0439, global_step=1388.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████▏     | 2469/5971 [22:33<31:58,  1.83it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=5.39e-5, train/loss_step=0.0116, global_step=1389.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  41%|████▏     | 2470/5971 [22:33<31:58,  1.82it/s, loss=0.0866, v_num=0, train/loss_simple_step=0.0471, train/loss_vlb_step=0.000173, train/loss_step=0.0471, global_step=1389.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████▏     | 2471/5971 [22:34<31:58,  1.82it/s, loss=0.0866, v_num=0, train/loss_simple_step=0.0471, train/loss_vlb_step=0.000173, train/loss_step=0.0471, global_step=1389.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████▏     | 2471/5971 [22:34<31:58,  1.82it/s, loss=0.105, v_num=0, train/loss_simple_step=0.374, train/loss_vlb_step=0.00177, train/loss_step=0.374, global_step=1389.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  41%|████▏     | 2472/5971 [22:36<31:59,  1.82it/s, loss=0.108, v_num=0, train/loss_simple_step=0.310, train/loss_vlb_step=0.00116, train/loss_step=0.310, global_step=1389.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████▏     | 2473/5971 [22:37<31:59,  1.82it/s, loss=0.12, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.000848, train/loss_step=0.247, global_step=1390.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████▏     | 2474/5971 [22:38<31:59,  1.82it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00286, train/loss_vlb_step=1.64e-5, train/loss_step=0.00286, global_step=1390.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████▏     | 2475/5971 [22:39<31:59,  1.82it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00286, train/loss_vlb_step=1.64e-5, train/loss_step=0.00286, global_step=1390.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████▏     | 2475/5971 [22:39<31:59,  1.82it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0384, train/loss_vlb_step=0.000133, train/loss_step=0.0384, global_step=1390.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  41%|████▏     | 2476/5971 [22:43<32:03,  1.82it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0162, train/loss_vlb_step=6.73e-5, train/loss_step=0.0162, global_step=1390.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  41%|████▏     | 2477/5971 [22:44<32:03,  1.82it/s, loss=0.127, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000495, train/loss_step=0.148, global_step=1391.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  42%|████▏     | 2478/5971 [22:45<32:03,  1.82it/s, loss=0.104, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000951, train/loss_step=0.216, global_step=1391.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2479/5971 [22:46<32:03,  1.82it/s, loss=0.104, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000951, train/loss_step=0.216, global_step=1391.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2479/5971 [22:46<32:03,  1.82it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0341, train/loss_vlb_step=0.000127, train/loss_step=0.0341, global_step=1391.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2480/5971 [22:48<32:06,  1.81it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0634, train/loss_vlb_step=0.000221, train/loss_step=0.0634, global_step=1391.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2481/5971 [22:49<32:06,  1.81it/s, loss=0.128, v_num=0, train/loss_simple_step=0.571, train/loss_vlb_step=0.0108, train/loss_step=0.571, global_step=1392.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  42%|████▏     | 2482/5971 [22:50<32:05,  1.81it/s, loss=0.136, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00206, train/loss_step=0.318, global_step=1392.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2483/5971 [22:51<32:05,  1.81it/s, loss=0.136, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00206, train/loss_step=0.318, global_step=1392.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2483/5971 [22:51<32:05,  1.81it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00194, train/loss_vlb_step=1.15e-5, train/loss_step=0.00194, global_step=1392.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2484/5971 [22:54<32:08,  1.81it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0196, train/loss_vlb_step=8.17e-5, train/loss_step=0.0196, global_step=1392.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  42%|████▏     | 2485/5971 [22:55<32:08,  1.81it/s, loss=0.146, v_num=0, train/loss_simple_step=0.359, train/loss_vlb_step=0.00182, train/loss_step=0.359, global_step=1393.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  42%|████▏     | 2486/5971 [22:55<32:08,  1.81it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00446, train/loss_vlb_step=2.28e-5, train/loss_step=0.00446, global_step=1393.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2487/5971 [22:56<32:07,  1.81it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00446, train/loss_vlb_step=2.28e-5, train/loss_step=0.00446, global_step=1393.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2487/5971 [22:56<32:07,  1.81it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00729, train/loss_vlb_step=3.46e-5, train/loss_step=0.00729, global_step=1393.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2488/5971 [22:59<32:10,  1.80it/s, loss=0.151, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000878, train/loss_step=0.233, global_step=1393.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  42%|████▏     | 2489/5971 [23:00<32:10,  1.80it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00635, train/loss_vlb_step=3.12e-5, train/loss_step=0.00635, global_step=1394.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2490/5971 [23:01<32:10,  1.80it/s, loss=0.166, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00171, train/loss_step=0.348, global_step=1394.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  42%|████▏     | 2491/5971 [23:02<32:09,  1.80it/s, loss=0.166, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00171, train/loss_step=0.348, global_step=1394.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2491/5971 [23:02<32:09,  1.80it/s, loss=0.163, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00145, train/loss_step=0.318, global_step=1394.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2492/5971 [23:04<32:11,  1.80it/s, loss=0.168, v_num=0, train/loss_simple_step=0.404, train/loss_vlb_step=0.0029, train/loss_step=0.404, global_step=1394.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  42%|████▏     | 2493/5971 [23:05<32:11,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00245, train/loss_vlb_step=1.43e-5, train/loss_step=0.00245, global_step=1395.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2494/5971 [23:06<32:11,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00812, train/loss_vlb_step=3.82e-5, train/loss_step=0.00812, global_step=1395.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2495/5971 [23:06<32:11,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00812, train/loss_vlb_step=3.82e-5, train/loss_step=0.00812, global_step=1395.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2495/5971 [23:06<32:11,  1.80it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=6.33e-5, train/loss_step=0.0156, global_step=1395.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  42%|████▏     | 2496/5971 [23:09<32:13,  1.80it/s, loss=0.16, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000399, train/loss_step=0.121, global_step=1395.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  42%|████▏     | 2497/5971 [23:09<32:13,  1.80it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.39e-5, train/loss_step=0.0121, global_step=1396.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2498/5971 [23:10<32:12,  1.80it/s, loss=0.148, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000382, train/loss_step=0.116, global_step=1396.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  42%|████▏     | 2499/5971 [23:11<32:12,  1.80it/s, loss=0.148, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000382, train/loss_step=0.116, global_step=1396.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2499/5971 [23:11<32:12,  1.80it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0282, train/loss_vlb_step=0.000113, train/loss_step=0.0282, global_step=1396.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2500/5971 [23:13<32:14,  1.79it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.000113, train/loss_step=0.0305, global_step=1396.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2501/5971 [23:14<32:14,  1.79it/s, loss=0.135, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.0019, train/loss_step=0.349, global_step=1397.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  42%|████▏     | 2502/5971 [23:15<32:14,  1.79it/s, loss=0.131, v_num=0, train/loss_simple_step=0.226, train/loss_vlb_step=0.000838, train/loss_step=0.226, global_step=1397.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2503/5971 [23:16<32:14,  1.79it/s, loss=0.131, v_num=0, train/loss_simple_step=0.226, train/loss_vlb_step=0.000838, train/loss_step=0.226, global_step=1397.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2503/5971 [23:16<32:14,  1.79it/s, loss=0.137, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000463, train/loss_step=0.140, global_step=1397.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2504/5971 [23:18<32:15,  1.79it/s, loss=0.156, v_num=0, train/loss_simple_step=0.389, train/loss_vlb_step=0.0023, train/loss_step=0.389, global_step=1397.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  42%|████▏     | 2505/5971 [23:19<32:15,  1.79it/s, loss=0.157, v_num=0, train/loss_simple_step=0.386, train/loss_vlb_step=0.00264, train/loss_step=0.386, global_step=1398.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2506/5971 [23:20<32:15,  1.79it/s, loss=0.163, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000387, train/loss_step=0.117, global_step=1398.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2507/5971 [23:21<32:15,  1.79it/s, loss=0.163, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000387, train/loss_step=0.117, global_step=1398.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2507/5971 [23:21<32:15,  1.79it/s, loss=0.171, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000585, train/loss_step=0.168, global_step=1398.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2508/5971 [23:24<32:18,  1.79it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.13e-5, train/loss_step=0.0112, global_step=1398.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2509/5971 [23:25<32:18,  1.79it/s, loss=0.166, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.00041, train/loss_step=0.124, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  42%|████▏     | 2510/5971 [23:26<32:18,  1.79it/s, loss=0.157, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000603, train/loss_step=0.177, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2511/5971 [23:27<32:18,  1.79it/s, loss=0.157, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000603, train/loss_step=0.177, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2511/5971 [23:27<32:18,  1.79it/s, loss=0.147, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000361, train/loss_step=0.110, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  42%|████▏     | 2512/5971 [23:29<32:20,  1.78it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:12,  2.28it/s][A

Validating:   1%|          | 2/167 [00:00<00:41,  3.97it/s][A
Epoch 2:  42%|████▏     | 2515/5971 [23:30<32:17,  1.78it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   3%|▎         | 5/167 [00:00<00:15, 10.30it/s][A
Epoch 2:  42%|████▏     | 2519/5971 [23:30<32:12,  1.79it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:00<00:10, 14.62it/s][A
Epoch 2:  42%|████▏     | 2523/5971 [23:30<32:07,  1.79it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   7%|▋         | 11/167 [00:00<00:09, 17.08it/s][A

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.27it/s][A
Epoch 2:  42%|████▏     | 2527/5971 [23:30<32:02,  1.79it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  10%|█         | 17/167 [00:01<00:06, 21.89it/s][A
Epoch 2:  42%|████▏     | 2531/5971 [23:31<31:57,  1.79it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.00it/s][A
Epoch 2:  42%|████▏     | 2535/5971 [23:31<31:52,  1.80it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 23.61it/s][A

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.89it/s][A
Epoch 2:  43%|████▎     | 2539/5971 [23:31<31:47,  1.80it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 25.41it/s][A
Epoch 2:  43%|████▎     | 2543/5971 [23:31<31:42,  1.80it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  20%|█▉        | 33/167 [00:01<00:04, 27.05it/s][A
Epoch 2:  43%|████▎     | 2547/5971 [23:31<31:37,  1.80it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  22%|██▏       | 36/167 [00:01<00:05, 25.73it/s][A
Epoch 2:  43%|████▎     | 2551/5971 [23:31<31:32,  1.81it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  23%|██▎       | 39/167 [00:02<00:05, 24.07it/s][A

Validating:  25%|██▌       | 42/167 [00:02<00:04, 25.08it/s][A
Epoch 2:  43%|████▎     | 2555/5971 [23:32<31:27,  1.81it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 25.88it/s][A
Epoch 2:  43%|████▎     | 2559/5971 [23:32<31:22,  1.81it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 25.92it/s][A
Epoch 2:  43%|████▎     | 2563/5971 [23:32<31:17,  1.82it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  31%|███       | 51/167 [00:02<00:04, 26.78it/s][A

Validating:  32%|███▏      | 54/167 [00:02<00:04, 25.83it/s][A
Epoch 2:  43%|████▎     | 2567/5971 [23:32<31:12,  1.82it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 26.62it/s][A
Epoch 2:  43%|████▎     | 2571/5971 [23:32<31:07,  1.82it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  36%|███▌      | 60/167 [00:02<00:03, 27.03it/s][A
Epoch 2:  43%|████▎     | 2575/5971 [23:32<31:02,  1.82it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  38%|███▊      | 63/167 [00:02<00:04, 25.49it/s][A

Validating:  40%|███▉      | 66/167 [00:03<00:03, 26.37it/s][A
Epoch 2:  43%|████▎     | 2579/5971 [23:32<30:57,  1.83it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 26.55it/s][A
Epoch 2:  43%|████▎     | 2583/5971 [23:33<30:52,  1.83it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 25.87it/s][A
Epoch 2:  43%|████▎     | 2587/5971 [23:33<30:47,  1.83it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 25.52it/s][A

Validating:  47%|████▋     | 78/167 [00:03<00:03, 24.86it/s][A
Epoch 2:  43%|████▎     | 2591/5971 [23:33<30:43,  1.83it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 25.48it/s][A
Epoch 2:  43%|████▎     | 2595/5971 [23:33<30:38,  1.84it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  50%|█████     | 84/167 [00:03<00:03, 26.31it/s][A
Epoch 2:  44%|████▎     | 2599/5971 [23:33<30:33,  1.84it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  52%|█████▏    | 87/167 [00:03<00:03, 25.78it/s][A

Validating:  54%|█████▍    | 90/167 [00:03<00:02, 26.14it/s][A
Epoch 2:  44%|████▎     | 2603/5971 [23:33<30:28,  1.84it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 26.38it/s][A
Epoch 2:  44%|████▎     | 2607/5971 [23:34<30:23,  1.84it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 26.48it/s][A
Epoch 2:  44%|████▎     | 2611/5971 [23:34<30:19,  1.85it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 27.44it/s][A

Validating:  61%|██████    | 102/167 [00:04<00:02, 26.63it/s][A
Epoch 2:  44%|████▍     | 2615/5971 [23:34<30:14,  1.85it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 25.30it/s][A
Epoch 2:  44%|████▍     | 2619/5971 [23:34<30:09,  1.85it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 26.20it/s][A
Epoch 2:  44%|████▍     | 2623/5971 [23:34<30:04,  1.85it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 26.34it/s][A

Validating:  68%|██████▊   | 114/167 [00:04<00:01, 27.34it/s][A
Epoch 2:  44%|████▍     | 2627/5971 [23:34<30:00,  1.86it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  70%|███████   | 117/167 [00:04<00:01, 28.05it/s][A
Epoch 2:  44%|████▍     | 2631/5971 [23:34<29:55,  1.86it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 27.40it/s][A
Epoch 2:  44%|████▍     | 2635/5971 [23:35<29:50,  1.86it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 28.43it/s][A
Epoch 2:  44%|████▍     | 2639/5971 [23:35<29:46,  1.87it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 28.23it/s][A

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 27.84it/s][A
Epoch 2:  44%|████▍     | 2643/5971 [23:35<29:41,  1.87it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 28.09it/s][A
Epoch 2:  44%|████▍     | 2647/5971 [23:35<29:36,  1.87it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 27.85it/s][A
Epoch 2:  44%|████▍     | 2651/5971 [23:35<29:32,  1.87it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 25.93it/s][A
Epoch 2:  44%|████▍     | 2655/5971 [23:35<29:27,  1.88it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  86%|████████▌ | 143/167 [00:05<00:00, 27.37it/s][A
Epoch 2:  45%|████▍     | 2659/5971 [23:35<29:22,  1.88it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 28.61it/s][A

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 27.35it/s][A
Epoch 2:  45%|████▍     | 2663/5971 [23:36<29:18,  1.88it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 27.98it/s][A
Epoch 2:  45%|████▍     | 2667/5971 [23:36<29:13,  1.88it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 27.35it/s][A
Epoch 2:  45%|████▍     | 2671/5971 [23:36<29:09,  1.89it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 26.06it/s][A
Epoch 2:  45%|████▍     | 2675/5971 [23:36<29:04,  1.89it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 27.50it/s][A

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 27.73it/s][A
Epoch 2:  45%|████▍     | 2679/5971 [23:36<29:00,  1.89it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▍     | 2680/5971 [23:37<28:59,  1.89it/s, loss=0.138, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.0012, train/loss_step=0.235, global_step=1399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.32it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.38it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.21it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.85it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.31it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.66it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.92it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.11it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.29it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.38it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.39it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.46it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.52it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.56it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.59it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.52it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.46it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.41it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.43it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.45it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.37it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.39it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.42it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.43it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.43it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.31it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.30it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.38it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.42it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.45it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.40it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.42it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.43it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.46it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.46it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.43it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:03,  4.27it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  4.49it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  4.69it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:02,  4.78it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  4.88it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  4.94it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  4.99it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.03it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:00,  5.10it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.19it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.20it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.16it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.23it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  5.29it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  4.99it/s]

Epoch 2:  45%|████▍     | 2681/5971 [23:49<29:13,  1.88it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00188, train/loss_vlb_step=1.12e-5, train/loss_step=0.00188, global_step=1400.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.36it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:15,  3.08it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.61it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.09it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.46it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.73it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.91it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.07it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.18it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.27it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.38it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:06,  5.45it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.50it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.50it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.33it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.38it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.43it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.46it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.45it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.45it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.48it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.44it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.48it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.52it/s][A
Epoch 2:  45%|████▍     | 2681/5971 [23:56<29:21,  1.87it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00188, train/loss_vlb_step=1.12e-5, train/loss_step=0.00188, global_step=1400.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.51it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.51it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.54it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.56it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.37it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.38it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.40it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.43it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.43it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.45it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.47it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.37it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.38it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.29it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.21it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.12it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.00it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.06it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  4.96it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:01,  4.95it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  4.91it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.02it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.12it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.21it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.26it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.02it/s]

Epoch 2:  45%|████▍     | 2682/5971 [24:01<29:27,  1.86it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00188, train/loss_vlb_step=1.12e-5, train/loss_step=0.00188, global_step=1400.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▍     | 2682/5971 [24:01<29:27,  1.86it/s, loss=0.161, v_num=0, train/loss_simple_step=0.464, train/loss_vlb_step=0.00301, train/loss_step=0.464, global_step=1400.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:38,  1.26it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.31it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.14it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.76it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.23it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.43it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.62it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.78it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  4.89it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.00it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.13it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.18it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:07,  5.18it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:07,  5.14it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.22it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.18it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.18it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:04<00:06,  5.23it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:06,  5.15it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.12it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.11it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.11it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:05<00:05,  5.13it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:05,  5.04it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:05,  4.95it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  4.99it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  4.89it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:06<00:04,  4.92it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:06<00:04,  5.03it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.10it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.14it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.09it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:07<00:03,  5.10it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:07<00:03,  5.13it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.11it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.05it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.10it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.13it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:08<00:02,  5.10it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.08it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.00it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  4.95it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:09<00:01,  4.80it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:09<00:01,  4.77it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:01,  4.93it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.04it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.14it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.20it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:10<00:00,  5.28it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  5.35it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  4.83it/s]

Epoch 2:  45%|████▍     | 2683/5971 [24:14<29:42,  1.85it/s, loss=0.161, v_num=0, train/loss_simple_step=0.464, train/loss_vlb_step=0.00301, train/loss_step=0.464, global_step=1400.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▍     | 2683/5971 [24:14<29:42,  1.85it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00785, train/loss_vlb_step=3.75e-5, train/loss_step=0.00785, global_step=1400.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.31it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.31it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:15,  3.09it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.69it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.15it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.44it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.65it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.83it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  4.97it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.09it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.13it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.19it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:07,  5.28it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.27it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.29it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.19it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.25it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:06,  5.32it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.38it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.43it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.48it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.52it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.57it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.49it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.37it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.38it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.39it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.33it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:06<00:04,  5.23it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.17it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.09it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.06it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.06it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:07<00:03,  5.11it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.12it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.27it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.36it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.32it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.34it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.26it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.23it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.20it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.19it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.18it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:00,  5.05it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.03it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.04it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.13it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.21it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  5.21it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  4.95it/s]

Epoch 2:  45%|████▍     | 2684/5971 [24:28<29:57,  1.83it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00785, train/loss_vlb_step=3.75e-5, train/loss_step=0.00785, global_step=1400.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▍     | 2684/5971 [24:28<29:57,  1.83it/s, loss=0.16, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000342, train/loss_step=0.104, global_step=1400.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  45%|████▍     | 2685/5971 [24:29<29:57,  1.83it/s, loss=0.16, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000342, train/loss_step=0.104, global_step=1400.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▍     | 2685/5971 [24:29<29:57,  1.83it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0456, train/loss_vlb_step=0.000159, train/loss_step=0.0456, global_step=1401.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▍     | 2686/5971 [24:30<29:57,  1.83it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0456, train/loss_vlb_step=0.000159, train/loss_step=0.0456, global_step=1401.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▍     | 2686/5971 [24:30<29:57,  1.83it/s, loss=0.172, v_num=0, train/loss_simple_step=0.324, train/loss_vlb_step=0.00126, train/loss_step=0.324, global_step=1401.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  45%|████▌     | 2687/5971 [24:31<29:57,  1.83it/s, loss=0.172, v_num=0, train/loss_simple_step=0.324, train/loss_vlb_step=0.00126, train/loss_step=0.324, global_step=1401.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2687/5971 [24:31<29:57,  1.83it/s, loss=0.171, v_num=0, train/loss_simple_step=0.00952, train/loss_vlb_step=4.2e-5, train/loss_step=0.00952, global_step=1401.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2688/5971 [24:33<29:58,  1.82it/s, loss=0.171, v_num=0, train/loss_simple_step=0.00952, train/loss_vlb_step=4.2e-5, train/loss_step=0.00952, global_step=1401.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2688/5971 [24:33<29:58,  1.82it/s, loss=0.187, v_num=0, train/loss_simple_step=0.343, train/loss_vlb_step=0.00198, train/loss_step=0.343, global_step=1401.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  45%|████▌     | 2689/5971 [24:34<29:58,  1.82it/s, loss=0.187, v_num=0, train/loss_simple_step=0.343, train/loss_vlb_step=0.00198, train/loss_step=0.343, global_step=1401.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2689/5971 [24:34<29:58,  1.82it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0661, train/loss_vlb_step=0.000228, train/loss_step=0.0661, global_step=1402.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2690/5971 [24:35<29:58,  1.82it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0661, train/loss_vlb_step=0.000228, train/loss_step=0.0661, global_step=1402.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2690/5971 [24:35<29:58,  1.82it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0133, train/loss_vlb_step=5.58e-5, train/loss_step=0.0133, global_step=1402.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  45%|████▌     | 2691/5971 [24:36<29:58,  1.82it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0133, train/loss_vlb_step=5.58e-5, train/loss_step=0.0133, global_step=1402.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2691/5971 [24:36<29:58,  1.82it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.52e-5, train/loss_step=0.0127, global_step=1402.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2692/5971 [24:38<30:00,  1.82it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.52e-5, train/loss_step=0.0127, global_step=1402.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2692/5971 [24:38<30:00,  1.82it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00278, train/loss_vlb_step=1.52e-5, train/loss_step=0.00278, global_step=1402.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2693/5971 [24:39<29:59,  1.82it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00278, train/loss_vlb_step=1.52e-5, train/loss_step=0.00278, global_step=1402.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2693/5971 [24:39<29:59,  1.82it/s, loss=0.126, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000588, train/loss_step=0.179, global_step=1403.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  45%|████▌     | 2694/5971 [24:40<30:00,  1.82it/s, loss=0.126, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000588, train/loss_step=0.179, global_step=1403.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2694/5971 [24:40<30:00,  1.82it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0564, train/loss_vlb_step=0.00019, train/loss_step=0.0564, global_step=1403.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2695/5971 [24:41<29:59,  1.82it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0564, train/loss_vlb_step=0.00019, train/loss_step=0.0564, global_step=1403.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2695/5971 [24:41<29:59,  1.82it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00456, train/loss_vlb_step=2.47e-5, train/loss_step=0.00456, global_step=1403.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2696/5971 [24:43<30:01,  1.82it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00456, train/loss_vlb_step=2.47e-5, train/loss_step=0.00456, global_step=1403.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2696/5971 [24:43<30:01,  1.82it/s, loss=0.137, v_num=0, train/loss_simple_step=0.453, train/loss_vlb_step=0.00298, train/loss_step=0.453, global_step=1403.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  45%|████▌     | 2697/5971 [24:44<30:01,  1.82it/s, loss=0.137, v_num=0, train/loss_simple_step=0.453, train/loss_vlb_step=0.00298, train/loss_step=0.453, global_step=1403.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2697/5971 [24:44<30:01,  1.82it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0182, train/loss_vlb_step=6.87e-5, train/loss_step=0.0182, global_step=1404.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2698/5971 [24:45<30:01,  1.82it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0182, train/loss_vlb_step=6.87e-5, train/loss_step=0.0182, global_step=1404.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2698/5971 [24:45<30:01,  1.82it/s, loss=0.133, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.00069, train/loss_step=0.203, global_step=1404.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  45%|████▌     | 2699/5971 [24:46<30:01,  1.82it/s, loss=0.133, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.00069, train/loss_step=0.203, global_step=1404.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2699/5971 [24:46<30:01,  1.82it/s, loss=0.132, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000335, train/loss_step=0.102, global_step=1404.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2700/5971 [24:48<30:02,  1.81it/s, loss=0.132, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000335, train/loss_step=0.102, global_step=1404.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2700/5971 [24:48<30:02,  1.81it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00593, train/loss_vlb_step=2.99e-5, train/loss_step=0.00593, global_step=1404.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2701/5971 [24:49<30:02,  1.81it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00593, train/loss_vlb_step=2.99e-5, train/loss_step=0.00593, global_step=1404.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2701/5971 [24:49<30:02,  1.81it/s, loss=0.157, v_num=0, train/loss_simple_step=0.733, train/loss_vlb_step=0.0228, train/loss_step=0.733, global_step=1405.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2:  45%|████▌     | 2702/5971 [24:50<30:02,  1.81it/s, loss=0.157, v_num=0, train/loss_simple_step=0.733, train/loss_vlb_step=0.0228, train/loss_step=0.733, global_step=1405.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2702/5971 [24:50<30:02,  1.81it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0553, train/loss_vlb_step=0.000184, train/loss_step=0.0553, global_step=1405.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2703/5971 [24:51<30:02,  1.81it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0553, train/loss_vlb_step=0.000184, train/loss_step=0.0553, global_step=1405.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2703/5971 [24:51<30:02,  1.81it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00494, train/loss_vlb_step=2.57e-5, train/loss_step=0.00494, global_step=1405.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2704/5971 [24:53<30:03,  1.81it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00494, train/loss_vlb_step=2.57e-5, train/loss_step=0.00494, global_step=1405.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2704/5971 [24:53<30:03,  1.81it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0283, train/loss_vlb_step=0.000108, train/loss_step=0.0283, global_step=1405.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  45%|████▌     | 2705/5971 [24:54<30:03,  1.81it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0283, train/loss_vlb_step=0.000108, train/loss_step=0.0283, global_step=1405.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2705/5971 [24:54<30:03,  1.81it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0964, train/loss_vlb_step=0.000319, train/loss_step=0.0964, global_step=1406.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2706/5971 [24:55<30:03,  1.81it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0964, train/loss_vlb_step=0.000319, train/loss_step=0.0964, global_step=1406.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2706/5971 [24:55<30:03,  1.81it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00681, train/loss_vlb_step=3.31e-5, train/loss_step=0.00681, global_step=1406.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2707/5971 [24:56<30:03,  1.81it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00681, train/loss_vlb_step=3.31e-5, train/loss_step=0.00681, global_step=1406.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2707/5971 [24:56<30:03,  1.81it/s, loss=0.127, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000516, train/loss_step=0.155, global_step=1406.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  45%|████▌     | 2708/5971 [24:58<30:04,  1.81it/s, loss=0.127, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000516, train/loss_step=0.155, global_step=1406.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2708/5971 [24:58<30:04,  1.81it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.92e-5, train/loss_step=0.00347, global_step=1406.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2709/5971 [24:59<30:04,  1.81it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.92e-5, train/loss_step=0.00347, global_step=1406.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2709/5971 [24:59<30:04,  1.81it/s, loss=0.107, v_num=0, train/loss_simple_step=0.00502, train/loss_vlb_step=2.5e-5, train/loss_step=0.00502, global_step=1407.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2710/5971 [25:00<30:04,  1.81it/s, loss=0.107, v_num=0, train/loss_simple_step=0.00502, train/loss_vlb_step=2.5e-5, train/loss_step=0.00502, global_step=1407.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2710/5971 [25:00<30:04,  1.81it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0145, train/loss_vlb_step=5.83e-5, train/loss_step=0.0145, global_step=1407.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  45%|████▌     | 2711/5971 [25:00<30:04,  1.81it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0145, train/loss_vlb_step=5.83e-5, train/loss_step=0.0145, global_step=1407.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2711/5971 [25:00<30:04,  1.81it/s, loss=0.133, v_num=0, train/loss_simple_step=0.528, train/loss_vlb_step=0.00686, train/loss_step=0.528, global_step=1407.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  45%|████▌     | 2712/5971 [25:03<30:05,  1.80it/s, loss=0.133, v_num=0, train/loss_simple_step=0.528, train/loss_vlb_step=0.00686, train/loss_step=0.528, global_step=1407.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2712/5971 [25:03<30:05,  1.80it/s, loss=0.146, v_num=0, train/loss_simple_step=0.276, train/loss_vlb_step=0.00108, train/loss_step=0.276, global_step=1407.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2713/5971 [25:04<30:05,  1.80it/s, loss=0.146, v_num=0, train/loss_simple_step=0.276, train/loss_vlb_step=0.00108, train/loss_step=0.276, global_step=1407.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2713/5971 [25:04<30:05,  1.80it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00502, train/loss_vlb_step=2.56e-5, train/loss_step=0.00502, global_step=1408.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2714/5971 [25:04<30:05,  1.80it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00502, train/loss_vlb_step=2.56e-5, train/loss_step=0.00502, global_step=1408.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2714/5971 [25:04<30:05,  1.80it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0167, train/loss_vlb_step=7.08e-5, train/loss_step=0.0167, global_step=1408.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  45%|████▌     | 2715/5971 [25:05<30:05,  1.80it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0167, train/loss_vlb_step=7.08e-5, train/loss_step=0.0167, global_step=1408.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2715/5971 [25:05<30:05,  1.80it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0582, train/loss_vlb_step=0.000202, train/loss_step=0.0582, global_step=1408.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2716/5971 [25:08<30:06,  1.80it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0582, train/loss_vlb_step=0.000202, train/loss_step=0.0582, global_step=1408.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  45%|████▌     | 2716/5971 [25:08<30:06,  1.80it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0228, train/loss_vlb_step=8.67e-5, train/loss_step=0.0228, global_step=1408.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  46%|████▌     | 2717/5971 [25:09<30:06,  1.80it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0228, train/loss_vlb_step=8.67e-5, train/loss_step=0.0228, global_step=1408.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2717/5971 [25:09<30:06,  1.80it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00126, train/loss_vlb_step=7.57e-6, train/loss_step=0.00126, global_step=1409.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2718/5971 [25:10<30:06,  1.80it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00126, train/loss_vlb_step=7.57e-6, train/loss_step=0.00126, global_step=1409.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2718/5971 [25:10<30:06,  1.80it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=5.45e-5, train/loss_step=0.0118, global_step=1409.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  46%|████▌     | 2719/5971 [25:10<30:06,  1.80it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=5.45e-5, train/loss_step=0.0118, global_step=1409.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2719/5971 [25:10<30:06,  1.80it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0979, train/loss_vlb_step=0.000322, train/loss_step=0.0979, global_step=1409.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2720/5971 [25:13<30:08,  1.80it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0979, train/loss_vlb_step=0.000322, train/loss_step=0.0979, global_step=1409.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2720/5971 [25:13<30:08,  1.80it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0268, train/loss_vlb_step=0.000108, train/loss_step=0.0268, global_step=1409.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2721/5971 [25:14<30:08,  1.80it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0268, train/loss_vlb_step=0.000108, train/loss_step=0.0268, global_step=1409.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2721/5971 [25:14<30:08,  1.80it/s, loss=0.0719, v_num=0, train/loss_simple_step=0.0244, train/loss_vlb_step=9.25e-5, train/loss_step=0.0244, global_step=1410.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2722/5971 [25:15<30:08,  1.80it/s, loss=0.0719, v_num=0, train/loss_simple_step=0.0244, train/loss_vlb_step=9.25e-5, train/loss_step=0.0244, global_step=1410.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2722/5971 [25:15<30:08,  1.80it/s, loss=0.0693, v_num=0, train/loss_simple_step=0.00313, train/loss_vlb_step=1.74e-5, train/loss_step=0.00313, global_step=1410.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2723/5971 [25:16<30:07,  1.80it/s, loss=0.0693, v_num=0, train/loss_simple_step=0.00313, train/loss_vlb_step=1.74e-5, train/loss_step=0.00313, global_step=1410.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2723/5971 [25:16<30:07,  1.80it/s, loss=0.0882, v_num=0, train/loss_simple_step=0.383, train/loss_vlb_step=0.00211, train/loss_step=0.383, global_step=1410.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  46%|████▌     | 2724/5971 [25:18<30:09,  1.79it/s, loss=0.0882, v_num=0, train/loss_simple_step=0.383, train/loss_vlb_step=0.00211, train/loss_step=0.383, global_step=1410.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2724/5971 [25:18<30:09,  1.79it/s, loss=0.104, v_num=0, train/loss_simple_step=0.343, train/loss_vlb_step=0.00172, train/loss_step=0.343, global_step=1410.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  46%|████▌     | 2725/5971 [25:19<30:09,  1.79it/s, loss=0.104, v_num=0, train/loss_simple_step=0.343, train/loss_vlb_step=0.00172, train/loss_step=0.343, global_step=1410.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2725/5971 [25:19<30:09,  1.79it/s, loss=0.123, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00323, train/loss_step=0.480, global_step=1411.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2726/5971 [25:20<30:09,  1.79it/s, loss=0.123, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00323, train/loss_step=0.480, global_step=1411.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2726/5971 [25:20<30:09,  1.79it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0341, train/loss_vlb_step=0.000125, train/loss_step=0.0341, global_step=1411.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2727/5971 [25:21<30:08,  1.79it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0341, train/loss_vlb_step=0.000125, train/loss_step=0.0341, global_step=1411.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2727/5971 [25:21<30:08,  1.79it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0351, train/loss_vlb_step=0.000129, train/loss_step=0.0351, global_step=1411.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2728/5971 [25:23<30:10,  1.79it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0351, train/loss_vlb_step=0.000129, train/loss_step=0.0351, global_step=1411.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2728/5971 [25:23<30:10,  1.79it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0188, train/loss_vlb_step=7.97e-5, train/loss_step=0.0188, global_step=1411.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  46%|████▌     | 2729/5971 [25:24<30:10,  1.79it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0188, train/loss_vlb_step=7.97e-5, train/loss_step=0.0188, global_step=1411.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2729/5971 [25:24<30:10,  1.79it/s, loss=0.14, v_num=0, train/loss_simple_step=0.417, train/loss_vlb_step=0.0102, train/loss_step=0.417, global_step=1412.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  46%|████▌     | 2730/5971 [25:25<30:09,  1.79it/s, loss=0.14, v_num=0, train/loss_simple_step=0.417, train/loss_vlb_step=0.0102, train/loss_step=0.417, global_step=1412.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2730/5971 [25:25<30:09,  1.79it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0684, train/loss_vlb_step=0.000236, train/loss_step=0.0684, global_step=1412.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2731/5971 [25:26<30:09,  1.79it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0684, train/loss_vlb_step=0.000236, train/loss_step=0.0684, global_step=1412.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2731/5971 [25:26<30:09,  1.79it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0615, train/loss_vlb_step=0.000209, train/loss_step=0.0615, global_step=1412.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2732/5971 [25:29<30:12,  1.79it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0615, train/loss_vlb_step=0.000209, train/loss_step=0.0615, global_step=1412.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2732/5971 [25:29<30:12,  1.79it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00716, train/loss_vlb_step=3.57e-5, train/loss_step=0.00716, global_step=1412.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2733/5971 [25:30<30:12,  1.79it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00716, train/loss_vlb_step=3.57e-5, train/loss_step=0.00716, global_step=1412.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2733/5971 [25:30<30:12,  1.79it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0203, train/loss_vlb_step=8.01e-5, train/loss_step=0.0203, global_step=1413.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  46%|████▌     | 2734/5971 [25:31<30:12,  1.79it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0203, train/loss_vlb_step=8.01e-5, train/loss_step=0.0203, global_step=1413.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2734/5971 [25:31<30:12,  1.79it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0322, train/loss_vlb_step=0.000122, train/loss_step=0.0322, global_step=1413.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2735/5971 [25:31<30:11,  1.79it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0322, train/loss_vlb_step=0.000122, train/loss_step=0.0322, global_step=1413.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2735/5971 [25:31<30:11,  1.79it/s, loss=0.126, v_num=0, train/loss_simple_step=0.429, train/loss_vlb_step=0.00313, train/loss_step=0.429, global_step=1413.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  46%|████▌     | 2736/5971 [25:34<30:14,  1.78it/s, loss=0.126, v_num=0, train/loss_simple_step=0.429, train/loss_vlb_step=0.00313, train/loss_step=0.429, global_step=1413.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2736/5971 [25:34<30:14,  1.78it/s, loss=0.134, v_num=0, train/loss_simple_step=0.195, train/loss_vlb_step=0.000739, train/loss_step=0.195, global_step=1413.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2737/5971 [25:35<30:13,  1.78it/s, loss=0.134, v_num=0, train/loss_simple_step=0.195, train/loss_vlb_step=0.000739, train/loss_step=0.195, global_step=1413.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2737/5971 [25:35<30:13,  1.78it/s, loss=0.164, v_num=0, train/loss_simple_step=0.586, train/loss_vlb_step=0.00583, train/loss_step=0.586, global_step=1414.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  46%|████▌     | 2738/5971 [25:36<30:13,  1.78it/s, loss=0.164, v_num=0, train/loss_simple_step=0.586, train/loss_vlb_step=0.00583, train/loss_step=0.586, global_step=1414.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2738/5971 [25:36<30:13,  1.78it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0522, train/loss_vlb_step=0.000173, train/loss_step=0.0522, global_step=1414.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2739/5971 [25:37<30:13,  1.78it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0522, train/loss_vlb_step=0.000173, train/loss_step=0.0522, global_step=1414.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2739/5971 [25:37<30:13,  1.78it/s, loss=0.185, v_num=0, train/loss_simple_step=0.478, train/loss_vlb_step=0.00373, train/loss_step=0.478, global_step=1414.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  46%|████▌     | 2740/5971 [25:39<30:15,  1.78it/s, loss=0.185, v_num=0, train/loss_simple_step=0.478, train/loss_vlb_step=0.00373, train/loss_step=0.478, global_step=1414.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2740/5971 [25:39<30:15,  1.78it/s, loss=0.205, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00229, train/loss_step=0.438, global_step=1414.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2741/5971 [25:40<30:14,  1.78it/s, loss=0.205, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00229, train/loss_step=0.438, global_step=1414.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2741/5971 [25:40<30:14,  1.78it/s, loss=0.212, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000531, train/loss_step=0.161, global_step=1415.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2742/5971 [25:41<30:14,  1.78it/s, loss=0.212, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000531, train/loss_step=0.161, global_step=1415.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2742/5971 [25:41<30:14,  1.78it/s, loss=0.218, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000383, train/loss_step=0.117, global_step=1415.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2743/5971 [25:42<30:14,  1.78it/s, loss=0.218, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000383, train/loss_step=0.117, global_step=1415.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2743/5971 [25:42<30:14,  1.78it/s, loss=0.199, v_num=0, train/loss_simple_step=0.00182, train/loss_vlb_step=1.05e-5, train/loss_step=0.00182, global_step=1415.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2744/5971 [25:45<30:16,  1.78it/s, loss=0.199, v_num=0, train/loss_simple_step=0.00182, train/loss_vlb_step=1.05e-5, train/loss_step=0.00182, global_step=1415.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2744/5971 [25:45<30:16,  1.78it/s, loss=0.194, v_num=0, train/loss_simple_step=0.252, train/loss_vlb_step=0.00112, train/loss_step=0.252, global_step=1415.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  46%|████▌     | 2745/5971 [25:45<30:16,  1.78it/s, loss=0.194, v_num=0, train/loss_simple_step=0.252, train/loss_vlb_step=0.00112, train/loss_step=0.252, global_step=1415.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2745/5971 [25:45<30:16,  1.78it/s, loss=0.176, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.00036, train/loss_step=0.110, global_step=1416.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2746/5971 [25:46<30:16,  1.78it/s, loss=0.176, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.00036, train/loss_step=0.110, global_step=1416.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2746/5971 [25:46<30:16,  1.78it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0612, train/loss_vlb_step=0.000206, train/loss_step=0.0612, global_step=1416.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2747/5971 [25:47<30:15,  1.78it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0612, train/loss_vlb_step=0.000206, train/loss_step=0.0612, global_step=1416.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2747/5971 [25:47<30:15,  1.78it/s, loss=0.183, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000551, train/loss_step=0.161, global_step=1416.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  46%|████▌     | 2748/5971 [25:51<30:18,  1.77it/s, loss=0.183, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000551, train/loss_step=0.161, global_step=1416.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2748/5971 [25:51<30:18,  1.77it/s, loss=0.187, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000281, train/loss_step=0.083, global_step=1416.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2749/5971 [25:52<30:18,  1.77it/s, loss=0.187, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000281, train/loss_step=0.083, global_step=1416.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2749/5971 [25:52<30:18,  1.77it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0996, train/loss_vlb_step=0.000331, train/loss_step=0.0996, global_step=1417.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2750/5971 [25:52<30:18,  1.77it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0996, train/loss_vlb_step=0.000331, train/loss_step=0.0996, global_step=1417.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2750/5971 [25:52<30:18,  1.77it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0794, train/loss_vlb_step=0.000267, train/loss_step=0.0794, global_step=1417.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2751/5971 [25:53<30:18,  1.77it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0794, train/loss_vlb_step=0.000267, train/loss_step=0.0794, global_step=1417.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2751/5971 [25:53<30:18,  1.77it/s, loss=0.185, v_num=0, train/loss_simple_step=0.339, train/loss_vlb_step=0.00172, train/loss_step=0.339, global_step=1417.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  46%|████▌     | 2752/5971 [25:56<30:19,  1.77it/s, loss=0.185, v_num=0, train/loss_simple_step=0.339, train/loss_vlb_step=0.00172, train/loss_step=0.339, global_step=1417.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2752/5971 [25:56<30:19,  1.77it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=4.76e-5, train/loss_step=0.0125, global_step=1417.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2753/5971 [25:57<30:19,  1.77it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=4.76e-5, train/loss_step=0.0125, global_step=1417.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2753/5971 [25:57<30:19,  1.77it/s, loss=0.193, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000611, train/loss_step=0.181, global_step=1418.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  46%|████▌     | 2754/5971 [25:58<30:19,  1.77it/s, loss=0.193, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000611, train/loss_step=0.181, global_step=1418.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2754/5971 [25:58<30:19,  1.77it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0137, train/loss_vlb_step=5.72e-5, train/loss_step=0.0137, global_step=1418.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2755/5971 [25:58<30:19,  1.77it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0137, train/loss_vlb_step=5.72e-5, train/loss_step=0.0137, global_step=1418.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2755/5971 [25:58<30:19,  1.77it/s, loss=0.211, v_num=0, train/loss_simple_step=0.802, train/loss_vlb_step=0.0683, train/loss_step=0.802, global_step=1418.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  46%|████▌     | 2756/5971 [26:01<30:20,  1.77it/s, loss=0.211, v_num=0, train/loss_simple_step=0.802, train/loss_vlb_step=0.0683, train/loss_step=0.802, global_step=1418.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2756/5971 [26:01<30:20,  1.77it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0935, train/loss_vlb_step=0.000307, train/loss_step=0.0935, global_step=1418.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2757/5971 [26:01<30:20,  1.77it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0935, train/loss_vlb_step=0.000307, train/loss_step=0.0935, global_step=1418.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2757/5971 [26:01<30:20,  1.77it/s, loss=0.203, v_num=0, train/loss_simple_step=0.533, train/loss_vlb_step=0.00447, train/loss_step=0.533, global_step=1419.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  46%|████▌     | 2758/5971 [26:02<30:20,  1.77it/s, loss=0.203, v_num=0, train/loss_simple_step=0.533, train/loss_vlb_step=0.00447, train/loss_step=0.533, global_step=1419.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2758/5971 [26:02<30:20,  1.77it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.89e-5, train/loss_step=0.0111, global_step=1419.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2759/5971 [26:03<30:19,  1.76it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.89e-5, train/loss_step=0.0111, global_step=1419.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2759/5971 [26:03<30:19,  1.76it/s, loss=0.198, v_num=0, train/loss_simple_step=0.418, train/loss_vlb_step=0.00339, train/loss_step=0.418, global_step=1419.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  46%|████▌     | 2760/5971 [26:05<30:21,  1.76it/s, loss=0.198, v_num=0, train/loss_simple_step=0.418, train/loss_vlb_step=0.00339, train/loss_step=0.418, global_step=1419.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2760/5971 [26:05<30:21,  1.76it/s, loss=0.197, v_num=0, train/loss_simple_step=0.408, train/loss_vlb_step=0.00281, train/loss_step=0.408, global_step=1419.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2761/5971 [26:06<30:20,  1.76it/s, loss=0.197, v_num=0, train/loss_simple_step=0.408, train/loss_vlb_step=0.00281, train/loss_step=0.408, global_step=1419.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▌     | 2761/5971 [26:06<30:20,  1.76it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0869, train/loss_vlb_step=0.000287, train/loss_step=0.0869, global_step=1420.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▋     | 2762/5971 [26:07<30:20,  1.76it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0869, train/loss_vlb_step=0.000287, train/loss_step=0.0869, global_step=1420.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▋     | 2762/5971 [26:07<30:20,  1.76it/s, loss=0.195, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000522, train/loss_step=0.155, global_step=1420.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  46%|████▋     | 2763/5971 [26:08<30:20,  1.76it/s, loss=0.195, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000522, train/loss_step=0.155, global_step=1420.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▋     | 2763/5971 [26:08<30:20,  1.76it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=6.39e-5, train/loss_step=0.0143, global_step=1420.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▋     | 2764/5971 [26:10<30:21,  1.76it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=6.39e-5, train/loss_step=0.0143, global_step=1420.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▋     | 2764/5971 [26:10<30:21,  1.76it/s, loss=0.199, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.0013, train/loss_step=0.318, global_step=1420.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  46%|████▋     | 2765/5971 [26:11<30:21,  1.76it/s, loss=0.199, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.0013, train/loss_step=0.318, global_step=1420.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▋     | 2765/5971 [26:11<30:21,  1.76it/s, loss=0.194, v_num=0, train/loss_simple_step=0.00117, train/loss_vlb_step=7.09e-6, train/loss_step=0.00117, global_step=1421.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▋     | 2766/5971 [26:12<30:21,  1.76it/s, loss=0.194, v_num=0, train/loss_simple_step=0.00117, train/loss_vlb_step=7.09e-6, train/loss_step=0.00117, global_step=1421.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▋     | 2766/5971 [26:12<30:21,  1.76it/s, loss=0.198, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000491, train/loss_step=0.147, global_step=1421.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  46%|████▋     | 2767/5971 [26:13<30:21,  1.76it/s, loss=0.198, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000491, train/loss_step=0.147, global_step=1421.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▋     | 2767/5971 [26:13<30:21,  1.76it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0705, train/loss_vlb_step=0.000248, train/loss_step=0.0705, global_step=1421.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▋     | 2768/5971 [26:15<30:22,  1.76it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0705, train/loss_vlb_step=0.000248, train/loss_step=0.0705, global_step=1421.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▋     | 2768/5971 [26:15<30:22,  1.76it/s, loss=0.208, v_num=0, train/loss_simple_step=0.378, train/loss_vlb_step=0.00256, train/loss_step=0.378, global_step=1421.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  46%|████▋     | 2769/5971 [26:16<30:22,  1.76it/s, loss=0.208, v_num=0, train/loss_simple_step=0.378, train/loss_vlb_step=0.00256, train/loss_step=0.378, global_step=1421.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▋     | 2769/5971 [26:16<30:22,  1.76it/s, loss=0.218, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.0014, train/loss_step=0.289, global_step=1422.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  46%|████▋     | 2770/5971 [26:17<30:22,  1.76it/s, loss=0.218, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.0014, train/loss_step=0.289, global_step=1422.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▋     | 2770/5971 [26:17<30:22,  1.76it/s, loss=0.216, v_num=0, train/loss_simple_step=0.0499, train/loss_vlb_step=0.000172, train/loss_step=0.0499, global_step=1422.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▋     | 2771/5971 [26:18<30:21,  1.76it/s, loss=0.216, v_num=0, train/loss_simple_step=0.0499, train/loss_vlb_step=0.000172, train/loss_step=0.0499, global_step=1422.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▋     | 2771/5971 [26:18<30:21,  1.76it/s, loss=0.199, v_num=0, train/loss_simple_step=0.00384, train/loss_vlb_step=2.05e-5, train/loss_step=0.00384, global_step=1422.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▋     | 2772/5971 [26:20<30:23,  1.75it/s, loss=0.199, v_num=0, train/loss_simple_step=0.00384, train/loss_vlb_step=2.05e-5, train/loss_step=0.00384, global_step=1422.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▋     | 2772/5971 [26:20<30:23,  1.75it/s, loss=0.208, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.00068, train/loss_step=0.196, global_step=1422.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  46%|████▋     | 2773/5971 [26:21<30:23,  1.75it/s, loss=0.208, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.00068, train/loss_step=0.196, global_step=1422.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▋     | 2773/5971 [26:21<30:23,  1.75it/s, loss=0.227, v_num=0, train/loss_simple_step=0.551, train/loss_vlb_step=0.00609, train/loss_step=0.551, global_step=1423.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▋     | 2774/5971 [26:22<30:23,  1.75it/s, loss=0.227, v_num=0, train/loss_simple_step=0.551, train/loss_vlb_step=0.00609, train/loss_step=0.551, global_step=1423.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▋     | 2774/5971 [26:22<30:23,  1.75it/s, loss=0.23, v_num=0, train/loss_simple_step=0.0836, train/loss_vlb_step=0.000281, train/loss_step=0.0836, global_step=1423.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▋     | 2775/5971 [26:23<30:22,  1.75it/s, loss=0.23, v_num=0, train/loss_simple_step=0.0836, train/loss_vlb_step=0.000281, train/loss_step=0.0836, global_step=1423.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▋     | 2775/5971 [26:23<30:22,  1.75it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0492, train/loss_vlb_step=0.000179, train/loss_step=0.0492, global_step=1423.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▋     | 2776/5971 [26:25<30:24,  1.75it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0492, train/loss_vlb_step=0.000179, train/loss_step=0.0492, global_step=1423.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  46%|████▋     | 2776/5971 [26:25<30:24,  1.75it/s, loss=0.198, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000687, train/loss_step=0.202, global_step=1423.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  47%|████▋     | 2777/5971 [26:26<30:24,  1.75it/s, loss=0.198, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000687, train/loss_step=0.202, global_step=1423.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  47%|████▋     | 2777/5971 [26:26<30:24,  1.75it/s, loss=0.179, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000453, train/loss_step=0.138, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  47%|████▋     | 2778/5971 [26:27<30:23,  1.75it/s, loss=0.179, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000453, train/loss_step=0.138, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  47%|████▋     | 2778/5971 [26:27<30:23,  1.75it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00433, train/loss_vlb_step=2.1e-5, train/loss_step=0.00433, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  47%|████▋     | 2779/5971 [26:28<30:23,  1.75it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00433, train/loss_vlb_step=2.1e-5, train/loss_step=0.00433, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  47%|████▋     | 2779/5971 [26:28<30:23,  1.75it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0523, train/loss_vlb_step=0.000177, train/loss_step=0.0523, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  47%|████▋     | 2780/5971 [26:30<30:24,  1.75it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0523, train/loss_vlb_step=0.000177, train/loss_step=0.0523, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  47%|████▋     | 2780/5971 [26:30<30:24,  1.75it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:22,  2.00it/s][A
Epoch 2:  47%|████▋     | 2782/5971 [26:30<30:23,  1.75it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   1%|          | 2/167 [00:00<00:45,  3.66it/s][A
Epoch 2:  47%|████▋     | 2784/5971 [26:31<30:20,  1.75it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   3%|▎         | 5/167 [00:00<00:16,  9.58it/s][A
Epoch 2:  47%|████▋     | 2787/5971 [26:31<30:17,  1.75it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:00<00:11, 14.22it/s][A
Epoch 2:  47%|████▋     | 2790/5971 [26:31<30:13,  1.75it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   7%|▋         | 11/167 [00:00<00:09, 17.07it/s][A
Epoch 2:  47%|████▋     | 2793/5971 [26:31<30:10,  1.76it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   8%|▊         | 14/167 [00:01<00:08, 18.97it/s][A
Epoch 2:  47%|████▋     | 2796/5971 [26:31<30:06,  1.76it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  10%|█         | 17/167 [00:01<00:07, 20.96it/s][A
Epoch 2:  47%|████▋     | 2800/5971 [26:31<30:01,  1.76it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 23.14it/s][A

Validating:  14%|█▍        | 23/167 [00:01<00:05, 24.82it/s][A
Epoch 2:  47%|████▋     | 2804/5971 [26:31<29:57,  1.76it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 25.37it/s][A
Epoch 2:  47%|████▋     | 2808/5971 [26:32<29:52,  1.76it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 25.29it/s][A
Epoch 2:  47%|████▋     | 2812/5971 [26:32<29:47,  1.77it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 26.79it/s][A
Epoch 2:  47%|████▋     | 2816/5971 [26:32<29:43,  1.77it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  22%|██▏       | 36/167 [00:01<00:04, 27.58it/s][A

Validating:  23%|██▎       | 39/167 [00:01<00:04, 27.84it/s][A
Epoch 2:  47%|████▋     | 2820/5971 [26:32<29:38,  1.77it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 28.39it/s][A
Epoch 2:  47%|████▋     | 2824/5971 [26:32<29:34,  1.77it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 28.15it/s][A
Epoch 2:  47%|████▋     | 2828/5971 [26:32<29:29,  1.78it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 28.66it/s][A

Validating:  31%|███       | 51/167 [00:02<00:04, 26.60it/s][A
Epoch 2:  47%|████▋     | 2832/5971 [26:32<29:24,  1.78it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 27.02it/s][A
Epoch 2:  47%|████▋     | 2836/5971 [26:33<29:20,  1.78it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 27.46it/s][A
Epoch 2:  48%|████▊     | 2840/5971 [26:33<29:15,  1.78it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  36%|███▌      | 60/167 [00:02<00:03, 27.23it/s][A

Validating:  38%|███▊      | 63/167 [00:02<00:03, 27.25it/s][A
Epoch 2:  48%|████▊     | 2844/5971 [26:33<29:11,  1.79it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  40%|███▉      | 66/167 [00:02<00:03, 27.63it/s][A
Epoch 2:  48%|████▊     | 2848/5971 [26:33<29:06,  1.79it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 27.74it/s][A
Epoch 2:  48%|████▊     | 2852/5971 [26:33<29:02,  1.79it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 27.37it/s][A

Validating:  45%|████▍     | 75/167 [00:03<00:03, 27.58it/s][A
Epoch 2:  48%|████▊     | 2856/5971 [26:33<28:57,  1.79it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 26.90it/s][A
Epoch 2:  48%|████▊     | 2860/5971 [26:33<28:53,  1.79it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 25.51it/s][A
Epoch 2:  48%|████▊     | 2864/5971 [26:34<28:48,  1.80it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  50%|█████     | 84/167 [00:03<00:03, 26.02it/s][A

Validating:  52%|█████▏    | 87/167 [00:03<00:03, 26.49it/s][A
Epoch 2:  48%|████▊     | 2868/5971 [26:34<28:44,  1.80it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  54%|█████▍    | 90/167 [00:03<00:02, 26.00it/s][A
Epoch 2:  48%|████▊     | 2872/5971 [26:34<28:39,  1.80it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 27.14it/s][A
Epoch 2:  48%|████▊     | 2876/5971 [26:34<28:35,  1.80it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 27.46it/s][A
Epoch 2:  48%|████▊     | 2880/5971 [26:34<28:30,  1.81it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 26.68it/s][A

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 25.37it/s][A
Epoch 2:  48%|████▊     | 2884/5971 [26:34<28:26,  1.81it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 26.15it/s][A
Epoch 2:  48%|████▊     | 2888/5971 [26:34<28:22,  1.81it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 27.01it/s][A
Epoch 2:  48%|████▊     | 2892/5971 [26:35<28:17,  1.81it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  67%|██████▋   | 112/167 [00:04<00:01, 27.57it/s][A
Epoch 2:  49%|████▊     | 2896/5971 [26:35<28:13,  1.82it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  69%|██████▉   | 116/167 [00:04<00:01, 28.86it/s][A

Validating:  71%|███████▏  | 119/167 [00:04<00:01, 27.55it/s][A
Epoch 2:  49%|████▊     | 2900/5971 [26:35<28:08,  1.82it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 26.97it/s][A
Epoch 2:  49%|████▊     | 2904/5971 [26:35<28:04,  1.82it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 27.73it/s][A
Epoch 2:  49%|████▊     | 2908/5971 [26:35<28:00,  1.82it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 27.37it/s][A

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 27.10it/s][A
Epoch 2:  49%|████▉     | 2912/5971 [26:35<27:55,  1.83it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  80%|████████  | 134/167 [00:05<00:01, 26.84it/s][A
Epoch 2:  49%|████▉     | 2916/5971 [26:35<27:51,  1.83it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 25.04it/s][A
Epoch 2:  49%|████▉     | 2920/5971 [26:36<27:47,  1.83it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  84%|████████▍ | 140/167 [00:05<00:01, 25.39it/s][A

Validating:  86%|████████▌ | 143/167 [00:05<00:00, 25.92it/s][A
Epoch 2:  49%|████▉     | 2924/5971 [26:36<27:42,  1.83it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  87%|████████▋ | 146/167 [00:05<00:00, 26.20it/s][A
Epoch 2:  49%|████▉     | 2928/5971 [26:36<27:38,  1.83it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 25.85it/s][A
Epoch 2:  49%|████▉     | 2932/5971 [26:36<27:34,  1.84it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 26.21it/s][A

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 26.83it/s][A
Epoch 2:  49%|████▉     | 2936/5971 [26:36<27:30,  1.84it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 26.33it/s][A
Epoch 2:  49%|████▉     | 2940/5971 [26:36<27:25,  1.84it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 26.98it/s][A
Epoch 2:  49%|████▉     | 2944/5971 [26:37<27:21,  1.84it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 26.63it/s][A

Validating: 100%|██████████| 167/167 [00:06<00:00, 26.48it/s][A
Epoch 2:  49%|████▉     | 2948/5971 [26:37<27:17,  1.85it/s, loss=0.151, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.00087, train/loss_step=0.236, global_step=1424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Epoch 2:  49%|████▉     | 2949/5971 [26:38<27:17,  1.85it/s, loss=0.159, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.000851, train/loss_step=0.236, global_step=1425.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  49%|████▉     | 2950/5971 [26:39<27:17,  1.84it/s, loss=0.161, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000776, train/loss_step=0.205, global_step=1425.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  49%|████▉     | 2951/5971 [26:40<27:17,  1.84it/s, loss=0.181, v_num=0, train/loss_simple_step=0.404, train/loss_vlb_step=0.00248, train/loss_step=0.404, global_step=1425.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  49%|████▉     | 2952/5971 [26:42<27:18,  1.84it/s, loss=0.181, v_num=0, train/loss_simple_step=0.404, train/loss_vlb_step=0.00248, train/loss_step=0.404, global_step=1425.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  49%|████▉     | 2952/5971 [26:42<27:18,  1.84it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00704, train/loss_vlb_step=3.53e-5, train/loss_step=0.00704, global_step=1425.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  49%|████▉     | 2953/5971 [26:43<27:18,  1.84it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0151, train/loss_vlb_step=6.57e-5, train/loss_step=0.0151, global_step=1426.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  49%|████▉     | 2954/5971 [26:44<27:18,  1.84it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0426, train/loss_vlb_step=0.000152, train/loss_step=0.0426, global_step=1426.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  49%|████▉     | 2955/5971 [26:45<27:17,  1.84it/s, loss=0.172, v_num=0, train/loss_simple_step=0.297, train/loss_vlb_step=0.00143, train/loss_step=0.297, global_step=1426.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  50%|████▉     | 2956/5971 [26:47<27:19,  1.84it/s, loss=0.172, v_num=0, train/loss_simple_step=0.297, train/loss_vlb_step=0.00143, train/loss_step=0.297, global_step=1426.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|████▉     | 2956/5971 [26:47<27:19,  1.84it/s, loss=0.161, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000538, train/loss_step=0.160, global_step=1426.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|████▉     | 2957/5971 [26:48<27:19,  1.84it/s, loss=0.163, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00166, train/loss_step=0.319, global_step=1427.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  50%|████▉     | 2958/5971 [26:49<27:18,  1.84it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=5.18e-5, train/loss_step=0.0114, global_step=1427.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|████▉     | 2959/5971 [26:50<27:18,  1.84it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.01e-5, train/loss_step=0.00382, global_step=1427.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|████▉     | 2960/5971 [26:52<27:19,  1.84it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.01e-5, train/loss_step=0.00382, global_step=1427.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|████▉     | 2960/5971 [26:52<27:19,  1.84it/s, loss=0.18, v_num=0, train/loss_simple_step=0.578, train/loss_vlb_step=0.0117, train/loss_step=0.578, global_step=1427.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]      
Epoch 2:  50%|████▉     | 2961/5971 [26:53<27:19,  1.84it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0702, train/loss_vlb_step=0.000231, train/loss_step=0.0702, global_step=1428.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|████▉     | 2962/5971 [26:54<27:19,  1.84it/s, loss=0.161, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000701, train/loss_step=0.188, global_step=1428.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  50%|████▉     | 2963/5971 [26:55<27:19,  1.83it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.24e-5, train/loss_step=0.0121, global_step=1428.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|████▉     | 2964/5971 [26:57<27:20,  1.83it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.24e-5, train/loss_step=0.0121, global_step=1428.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|████▉     | 2964/5971 [26:57<27:20,  1.83it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00356, train/loss_vlb_step=1.87e-5, train/loss_step=0.00356, global_step=1428.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|████▉     | 2965/5971 [26:58<27:20,  1.83it/s, loss=0.152, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.00065, train/loss_step=0.197, global_step=1429.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  50%|████▉     | 2966/5971 [26:59<27:19,  1.83it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00483, train/loss_vlb_step=2.5e-5, train/loss_step=0.00483, global_step=1429.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|████▉     | 2967/5971 [27:00<27:19,  1.83it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00177, train/loss_vlb_step=1.06e-5, train/loss_step=0.00177, global_step=1429.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|████▉     | 2968/5971 [27:02<27:20,  1.83it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00177, train/loss_vlb_step=1.06e-5, train/loss_step=0.00177, global_step=1429.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|████▉     | 2968/5971 [27:02<27:20,  1.83it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0528, train/loss_vlb_step=0.000183, train/loss_step=0.0528, global_step=1429.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  50%|████▉     | 2969/5971 [27:03<27:20,  1.83it/s, loss=0.133, v_num=0, train/loss_simple_step=0.089, train/loss_vlb_step=0.000293, train/loss_step=0.089, global_step=1430.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  50%|████▉     | 2970/5971 [27:04<27:20,  1.83it/s, loss=0.132, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000839, train/loss_step=0.189, global_step=1430.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|████▉     | 2971/5971 [27:05<27:20,  1.83it/s, loss=0.124, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.000979, train/loss_step=0.236, global_step=1430.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|████▉     | 2972/5971 [27:07<27:21,  1.83it/s, loss=0.124, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.000979, train/loss_step=0.236, global_step=1430.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|████▉     | 2972/5971 [27:07<27:21,  1.83it/s, loss=0.131, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000476, train/loss_step=0.142, global_step=1430.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|████▉     | 2973/5971 [27:08<27:21,  1.83it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00263, train/loss_vlb_step=1.53e-5, train/loss_step=0.00263, global_step=1431.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|████▉     | 2974/5971 [27:08<27:21,  1.83it/s, loss=0.134, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000409, train/loss_step=0.124, global_step=1431.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  50%|████▉     | 2975/5971 [27:09<27:20,  1.83it/s, loss=0.125, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000373, train/loss_step=0.113, global_step=1431.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|████▉     | 2976/5971 [27:12<27:21,  1.82it/s, loss=0.125, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000373, train/loss_step=0.113, global_step=1431.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|████▉     | 2976/5971 [27:12<27:21,  1.82it/s, loss=0.124, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.00049, train/loss_step=0.149, global_step=1431.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  50%|████▉     | 2977/5971 [27:12<27:21,  1.82it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.07e-5, train/loss_step=0.0115, global_step=1432.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|████▉     | 2978/5971 [27:13<27:21,  1.82it/s, loss=0.126, v_num=0, train/loss_simple_step=0.358, train/loss_vlb_step=0.00157, train/loss_step=0.358, global_step=1432.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  50%|████▉     | 2979/5971 [27:14<27:21,  1.82it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0355, train/loss_vlb_step=0.000134, train/loss_step=0.0355, global_step=1432.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|████▉     | 2980/5971 [27:16<27:22,  1.82it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0355, train/loss_vlb_step=0.000134, train/loss_step=0.0355, global_step=1432.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|████▉     | 2980/5971 [27:16<27:22,  1.82it/s, loss=0.124, v_num=0, train/loss_simple_step=0.503, train/loss_vlb_step=0.00324, train/loss_step=0.503, global_step=1432.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  50%|████▉     | 2981/5971 [27:17<27:22,  1.82it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0995, train/loss_vlb_step=0.000328, train/loss_step=0.0995, global_step=1433.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|████▉     | 2982/5971 [27:18<27:21,  1.82it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0487, train/loss_vlb_step=0.000164, train/loss_step=0.0487, global_step=1433.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|████▉     | 2983/5971 [27:19<27:21,  1.82it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0275, train/loss_vlb_step=0.000105, train/loss_step=0.0275, global_step=1433.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|████▉     | 2984/5971 [27:21<27:22,  1.82it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0275, train/loss_vlb_step=0.000105, train/loss_step=0.0275, global_step=1433.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|████▉     | 2984/5971 [27:21<27:22,  1.82it/s, loss=0.128, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000608, train/loss_step=0.180, global_step=1433.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  50%|████▉     | 2985/5971 [27:22<27:22,  1.82it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0449, train/loss_vlb_step=0.000167, train/loss_step=0.0449, global_step=1434.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|█████     | 2986/5971 [27:23<27:22,  1.82it/s, loss=0.142, v_num=0, train/loss_simple_step=0.430, train/loss_vlb_step=0.00299, train/loss_step=0.430, global_step=1434.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  50%|█████     | 2987/5971 [27:24<27:22,  1.82it/s, loss=0.154, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000806, train/loss_step=0.240, global_step=1434.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|█████     | 2988/5971 [27:26<27:23,  1.81it/s, loss=0.154, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000806, train/loss_step=0.240, global_step=1434.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|█████     | 2988/5971 [27:26<27:23,  1.81it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0509, train/loss_vlb_step=0.000182, train/loss_step=0.0509, global_step=1434.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|█████     | 2989/5971 [27:27<27:23,  1.81it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0475, train/loss_vlb_step=0.000177, train/loss_step=0.0475, global_step=1435.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|█████     | 2990/5971 [27:28<27:23,  1.81it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00332, train/loss_vlb_step=1.78e-5, train/loss_step=0.00332, global_step=1435.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|█████     | 2991/5971 [27:29<27:22,  1.81it/s, loss=0.149, v_num=0, train/loss_simple_step=0.359, train/loss_vlb_step=0.00186, train/loss_step=0.359, global_step=1435.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  50%|█████     | 2992/5971 [27:31<27:23,  1.81it/s, loss=0.149, v_num=0, train/loss_simple_step=0.359, train/loss_vlb_step=0.00186, train/loss_step=0.359, global_step=1435.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|█████     | 2992/5971 [27:31<27:23,  1.81it/s, loss=0.163, v_num=0, train/loss_simple_step=0.432, train/loss_vlb_step=0.00284, train/loss_step=0.432, global_step=1435.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|█████     | 2993/5971 [27:32<27:23,  1.81it/s, loss=0.205, v_num=0, train/loss_simple_step=0.851, train/loss_vlb_step=0.0402, train/loss_step=0.851, global_step=1436.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  50%|█████     | 2994/5971 [27:33<27:23,  1.81it/s, loss=0.199, v_num=0, train/loss_simple_step=0.0035, train/loss_vlb_step=1.81e-5, train/loss_step=0.0035, global_step=1436.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|█████     | 2995/5971 [27:34<27:23,  1.81it/s, loss=0.216, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.00393, train/loss_step=0.454, global_step=1436.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  50%|█████     | 2996/5971 [27:36<27:24,  1.81it/s, loss=0.216, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.00393, train/loss_step=0.454, global_step=1436.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|█████     | 2996/5971 [27:36<27:24,  1.81it/s, loss=0.22, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.000889, train/loss_step=0.229, global_step=1436.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|█████     | 2997/5971 [27:37<27:24,  1.81it/s, loss=0.223, v_num=0, train/loss_simple_step=0.0653, train/loss_vlb_step=0.00023, train/loss_step=0.0653, global_step=1437.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|█████     | 2998/5971 [27:38<27:23,  1.81it/s, loss=0.205, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.58e-5, train/loss_step=0.00289, global_step=1437.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|█████     | 2999/5971 [27:39<27:23,  1.81it/s, loss=0.215, v_num=0, train/loss_simple_step=0.226, train/loss_vlb_step=0.000792, train/loss_step=0.226, global_step=1437.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  50%|█████     | 3000/5971 [27:41<27:24,  1.81it/s, loss=0.215, v_num=0, train/loss_simple_step=0.226, train/loss_vlb_step=0.000792, train/loss_step=0.226, global_step=1437.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|█████     | 3000/5971 [27:41<27:24,  1.81it/s, loss=0.201, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000776, train/loss_step=0.224, global_step=1437.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|█████     | 3001/5971 [27:42<27:24,  1.81it/s, loss=0.225, v_num=0, train/loss_simple_step=0.573, train/loss_vlb_step=0.00745, train/loss_step=0.573, global_step=1438.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  50%|█████     | 3002/5971 [27:42<27:24,  1.81it/s, loss=0.237, v_num=0, train/loss_simple_step=0.297, train/loss_vlb_step=0.00128, train/loss_step=0.297, global_step=1438.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|█████     | 3003/5971 [27:43<27:23,  1.81it/s, loss=0.24, v_num=0, train/loss_simple_step=0.0948, train/loss_vlb_step=0.000312, train/loss_step=0.0948, global_step=1438.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|█████     | 3004/5971 [27:46<27:24,  1.80it/s, loss=0.24, v_num=0, train/loss_simple_step=0.0948, train/loss_vlb_step=0.000312, train/loss_step=0.0948, global_step=1438.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|█████     | 3004/5971 [27:46<27:24,  1.80it/s, loss=0.232, v_num=0, train/loss_simple_step=0.0019, train/loss_vlb_step=1.08e-5, train/loss_step=0.0019, global_step=1438.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|█████     | 3005/5971 [27:46<27:24,  1.80it/s, loss=0.234, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.00034, train/loss_step=0.103, global_step=1439.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  50%|█████     | 3006/5971 [27:47<27:24,  1.80it/s, loss=0.225, v_num=0, train/loss_simple_step=0.246, train/loss_vlb_step=0.001, train/loss_step=0.246, global_step=1439.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  50%|█████     | 3007/5971 [27:48<27:24,  1.80it/s, loss=0.213, v_num=0, train/loss_simple_step=0.00334, train/loss_vlb_step=1.79e-5, train/loss_step=0.00334, global_step=1439.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|█████     | 3008/5971 [27:51<27:25,  1.80it/s, loss=0.213, v_num=0, train/loss_simple_step=0.00334, train/loss_vlb_step=1.79e-5, train/loss_step=0.00334, global_step=1439.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|█████     | 3008/5971 [27:51<27:25,  1.80it/s, loss=0.223, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.000862, train/loss_step=0.237, global_step=1439.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  50%|█████     | 3009/5971 [27:51<27:25,  1.80it/s, loss=0.221, v_num=0, train/loss_simple_step=0.00962, train/loss_vlb_step=4.57e-5, train/loss_step=0.00962, global_step=1440.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|█████     | 3010/5971 [27:52<27:25,  1.80it/s, loss=0.231, v_num=0, train/loss_simple_step=0.198, train/loss_vlb_step=0.000745, train/loss_step=0.198, global_step=1440.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  50%|█████     | 3011/5971 [27:53<27:24,  1.80it/s, loss=0.22, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000521, train/loss_step=0.151, global_step=1440.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  50%|█████     | 3012/5971 [27:55<27:25,  1.80it/s, loss=0.22, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000521, train/loss_step=0.151, global_step=1440.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|█████     | 3012/5971 [27:55<27:25,  1.80it/s, loss=0.23, v_num=0, train/loss_simple_step=0.637, train/loss_vlb_step=0.00753, train/loss_step=0.637, global_step=1440.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  50%|█████     | 3013/5971 [27:56<27:25,  1.80it/s, loss=0.19, v_num=0, train/loss_simple_step=0.050, train/loss_vlb_step=0.000181, train/loss_step=0.050, global_step=1441.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|█████     | 3014/5971 [27:57<27:25,  1.80it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0568, train/loss_vlb_step=0.000193, train/loss_step=0.0568, global_step=1441.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  50%|█████     | 3015/5971 [27:58<27:25,  1.80it/s, loss=0.177, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.00048, train/loss_step=0.145, global_step=1441.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  51%|█████     | 3016/5971 [28:00<27:26,  1.79it/s, loss=0.177, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.00048, train/loss_step=0.145, global_step=1441.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  51%|█████     | 3016/5971 [28:00<27:26,  1.79it/s, loss=0.171, v_num=0, train/loss_simple_step=0.096, train/loss_vlb_step=0.000317, train/loss_step=0.096, global_step=1441.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  51%|█████     | 3017/5971 [28:01<27:26,  1.79it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0551, train/loss_vlb_step=0.00019, train/loss_step=0.0551, global_step=1442.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  51%|█████     | 3018/5971 [28:02<27:25,  1.79it/s, loss=0.181, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.0008, train/loss_step=0.208, global_step=1442.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  51%|█████     | 3019/5971 [28:03<27:25,  1.79it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0344, train/loss_vlb_step=0.000121, train/loss_step=0.0344, global_step=1442.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  51%|█████     | 3020/5971 [28:05<27:26,  1.79it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0344, train/loss_vlb_step=0.000121, train/loss_step=0.0344, global_step=1442.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  51%|█████     | 3020/5971 [28:05<27:26,  1.79it/s, loss=0.171, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.000748, train/loss_step=0.219, global_step=1442.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  51%|█████     | 3021/5971 [28:06<27:26,  1.79it/s, loss=0.165, v_num=0, train/loss_simple_step=0.467, train/loss_vlb_step=0.00321, train/loss_step=0.467, global_step=1443.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  51%|█████     | 3022/5971 [28:07<27:26,  1.79it/s, loss=0.157, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000447, train/loss_step=0.135, global_step=1443.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  51%|█████     | 3023/5971 [28:08<27:25,  1.79it/s, loss=0.174, v_num=0, train/loss_simple_step=0.433, train/loss_vlb_step=0.00222, train/loss_step=0.433, global_step=1443.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  51%|█████     | 3024/5971 [28:10<27:26,  1.79it/s, loss=0.174, v_num=0, train/loss_simple_step=0.433, train/loss_vlb_step=0.00222, train/loss_step=0.433, global_step=1443.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  51%|█████     | 3024/5971 [28:10<27:26,  1.79it/s, loss=0.2, v_num=0, train/loss_simple_step=0.521, train/loss_vlb_step=0.0064, train/loss_step=0.521, global_step=1443.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  51%|█████     | 3025/5971 [28:11<27:26,  1.79it/s, loss=0.204, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000593, train/loss_step=0.173, global_step=1444.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  51%|█████     | 3026/5971 [28:12<27:26,  1.79it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0151, train/loss_vlb_step=6.08e-5, train/loss_step=0.0151, global_step=1444.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  51%|█████     | 3027/5971 [28:13<27:26,  1.79it/s, loss=0.199, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000452, train/loss_step=0.135, global_step=1444.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  51%|█████     | 3028/5971 [28:15<27:27,  1.79it/s, loss=0.199, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000452, train/loss_step=0.135, global_step=1444.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  51%|█████     | 3028/5971 [28:15<27:27,  1.79it/s, loss=0.196, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000591, train/loss_step=0.173, global_step=1444.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  51%|█████     | 3029/5971 [28:16<27:26,  1.79it/s, loss=0.216, v_num=0, train/loss_simple_step=0.409, train/loss_vlb_step=0.00216, train/loss_step=0.409, global_step=1445.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  51%|█████     | 3030/5971 [28:16<27:26,  1.79it/s, loss=0.207, v_num=0, train/loss_simple_step=0.0286, train/loss_vlb_step=0.000109, train/loss_step=0.0286, global_step=1445.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  51%|█████     | 3031/5971 [28:17<27:26,  1.79it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000316, train/loss_step=0.0958, global_step=1445.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  51%|█████     | 3032/5971 [28:20<27:27,  1.78it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000316, train/loss_step=0.0958, global_step=1445.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  51%|█████     | 3032/5971 [28:20<27:27,  1.78it/s, loss=0.18, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000518, train/loss_step=0.154, global_step=1445.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  51%|█████     | 3033/5971 [28:20<27:27,  1.78it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00582, train/loss_vlb_step=2.95e-5, train/loss_step=0.00582, global_step=1446.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  51%|█████     | 3034/5971 [28:21<27:26,  1.78it/s, loss=0.212, v_num=0, train/loss_simple_step=0.731, train/loss_vlb_step=0.013, train/loss_step=0.731, global_step=1446.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]      
Epoch 2:  51%|█████     | 3035/5971 [28:22<27:26,  1.78it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=5.22e-5, train/loss_step=0.0118, global_step=1446.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  51%|█████     | 3036/5971 [28:25<27:27,  1.78it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=5.22e-5, train/loss_step=0.0118, global_step=1446.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  51%|█████     | 3036/5971 [28:25<27:27,  1.78it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0335, train/loss_vlb_step=0.000126, train/loss_step=0.0335, global_step=1446.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  51%|█████     | 3037/5971 [28:26<27:27,  1.78it/s, loss=0.225, v_num=0, train/loss_simple_step=0.511, train/loss_vlb_step=0.00402, train/loss_step=0.511, global_step=1447.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  51%|█████     | 3038/5971 [28:27<27:27,  1.78it/s, loss=0.214, v_num=0, train/loss_simple_step=0.00309, train/loss_vlb_step=1.71e-5, train/loss_step=0.00309, global_step=1447.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  51%|█████     | 3039/5971 [28:27<27:27,  1.78it/s, loss=0.22, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000477, train/loss_step=0.142, global_step=1447.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  51%|█████     | 3040/5971 [28:29<27:28,  1.78it/s, loss=0.22, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000477, train/loss_step=0.142, global_step=1447.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  51%|█████     | 3040/5971 [28:29<27:28,  1.78it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.000254, train/loss_step=0.0771, global_step=1447.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  51%|█████     | 3041/5971 [28:30<27:27,  1.78it/s, loss=0.222, v_num=0, train/loss_simple_step=0.648, train/loss_vlb_step=0.0135, train/loss_step=0.648, global_step=1448.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  51%|█████     | 3042/5971 [28:31<27:27,  1.78it/s, loss=0.217, v_num=0, train/loss_simple_step=0.038, train/loss_vlb_step=0.000139, train/loss_step=0.038, global_step=1448.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  51%|█████     | 3043/5971 [28:32<27:27,  1.78it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.78e-5, train/loss_step=0.0104, global_step=1448.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  51%|█████     | 3044/5971 [28:34<27:28,  1.78it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.78e-5, train/loss_step=0.0104, global_step=1448.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  51%|█████     | 3044/5971 [28:34<27:28,  1.78it/s, loss=0.194, v_num=0, train/loss_simple_step=0.486, train/loss_vlb_step=0.00498, train/loss_step=0.486, global_step=1448.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  51%|█████     | 3045/5971 [28:35<27:28,  1.78it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00329, train/loss_vlb_step=1.81e-5, train/loss_step=0.00329, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  51%|█████     | 3046/5971 [28:36<27:27,  1.77it/s, loss=0.198, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.00102, train/loss_step=0.262, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  51%|█████     | 3047/5971 [28:37<27:27,  1.77it/s, loss=0.196, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000347, train/loss_step=0.105, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  51%|█████     | 3048/5971 [28:39<27:28,  1.77it/s, loss=0.196, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000347, train/loss_step=0.105, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  51%|█████     | 3048/5971 [28:39<27:28,  1.77it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating: 0it [00:00, ?it/s]
Validating:   0%|          | 0/167 [00:00<?, ?it/s]
Validating:   1%|          | 1/167 [00:00<01:05,  2.54it/s]
Validating:   1%|          | 2/167 [00:00<00:39,  4.21it/s]
Epoch 2:  51%|█████     | 3052/5971 [28:40<27:24,  1.77it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   3%|▎         | 5/167 [00:00<00:15, 10.26it/s]
Epoch 2:  51%|█████     | 3056/5971 [28:40<27:20,  1.78it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:00<00:10, 15.11it/s]
Validating:   7%|▋         | 11/167 [00:00<00:08, 18.92it/s]
Epoch 2:  51%|█████     | 3060/5971 [28:40<27:16,  1.78it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   8%|▊         | 14/167 [00:00<00:07, 21.13it/s]
Epoch 2:  51%|█████▏    | 3064/5971 [28:40<27:12,  1.78it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  10%|█         | 17/167 [00:01<00:06, 23.02it/s]
Epoch 2:  51%|█████▏    | 3068/5971 [28:40<27:07,  1.78it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  12%|█▏        | 20/167 [00:01<00:05, 24.81it/s]
Validating:  14%|█▍        | 23/167 [00:01<00:05, 25.24it/s]
Epoch 2:  51%|█████▏    | 3072/5971 [28:41<27:03,  1.79it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 25.18it/s]
Epoch 2:  52%|█████▏    | 3076/5971 [28:41<26:59,  1.79it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 26.39it/s]
Epoch 2:  52%|█████▏    | 3080/5971 [28:41<26:55,  1.79it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 26.68it/s]
Validating:  21%|██        | 35/167 [00:01<00:04, 27.19it/s]
Epoch 2:  52%|█████▏    | 3084/5971 [28:41<26:51,  1.79it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  23%|██▎       | 38/167 [00:01<00:04, 27.22it/s]
Epoch 2:  52%|█████▏    | 3088/5971 [28:41<26:46,  1.79it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  25%|██▍       | 41/167 [00:01<00:04, 27.10it/s]
Epoch 2:  52%|█████▏    | 3092/5971 [28:41<26:42,  1.80it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 26.69it/s]
Epoch 2:  52%|█████▏    | 3096/5971 [28:41<26:38,  1.80it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 28.08it/s]
Validating:  31%|███       | 51/167 [00:02<00:04, 27.49it/s]
Epoch 2:  52%|█████▏    | 3100/5971 [28:42<26:34,  1.80it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 26.44it/s]
Epoch 2:  52%|█████▏    | 3104/5971 [28:42<26:30,  1.80it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 26.90it/s]
Epoch 2:  52%|█████▏    | 3108/5971 [28:42<26:26,  1.81it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  36%|███▌      | 60/167 [00:02<00:03, 27.03it/s]
Epoch 2:  52%|█████▏    | 3112/5971 [28:42<26:21,  1.81it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  38%|███▊      | 64/167 [00:02<00:03, 27.34it/s]
Validating:  40%|████      | 67/167 [00:02<00:03, 26.70it/s]
Epoch 2:  52%|█████▏    | 3116/5971 [28:42<26:17,  1.81it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 26.86it/s]
Epoch 2:  52%|█████▏    | 3120/5971 [28:42<26:13,  1.81it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 26.16it/s]
Epoch 2:  52%|█████▏    | 3124/5971 [28:42<26:09,  1.81it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 27.90it/s]
Epoch 2:  52%|█████▏    | 3128/5971 [28:43<26:05,  1.82it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  49%|████▊     | 81/167 [00:03<00:02, 29.06it/s]
Epoch 2:  52%|█████▏    | 3132/5971 [28:43<26:01,  1.82it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  50%|█████     | 84/167 [00:03<00:02, 29.19it/s]
Epoch 2:  53%|█████▎    | 3136/5971 [28:43<25:57,  1.82it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  53%|█████▎    | 88/167 [00:03<00:02, 29.64it/s]
Validating:  54%|█████▍    | 91/167 [00:03<00:02, 28.83it/s]
Epoch 2:  53%|█████▎    | 3140/5971 [28:43<25:53,  1.82it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  56%|█████▋    | 94/167 [00:03<00:02, 27.67it/s]
Epoch 2:  53%|█████▎    | 3144/5971 [28:43<25:49,  1.82it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  58%|█████▊    | 97/167 [00:03<00:02, 27.37it/s]
Epoch 2:  53%|█████▎    | 3148/5971 [28:43<25:45,  1.83it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 27.42it/s]
Validating:  62%|██████▏   | 103/167 [00:04<00:02, 28.01it/s]
Epoch 2:  53%|█████▎    | 3152/5971 [28:43<25:41,  1.83it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 27.15it/s][A
Epoch 2:  53%|█████▎    | 3156/5971 [28:44<25:37,  1.83it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 27.69it/s][A
Epoch 2:  53%|█████▎    | 3160/5971 [28:44<25:33,  1.83it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  68%|██████▊   | 113/167 [00:04<00:01, 29.02it/s][A
Epoch 2:  53%|█████▎    | 3164/5971 [28:44<25:29,  1.84it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  70%|███████   | 117/167 [00:04<00:01, 30.18it/s][A
Epoch 2:  53%|█████▎    | 3168/5971 [28:44<25:25,  1.84it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  72%|███████▏  | 121/167 [00:04<00:01, 30.04it/s][A
Epoch 2:  53%|█████▎    | 3172/5971 [28:44<25:21,  1.84it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  74%|███████▍  | 124/167 [00:04<00:01, 29.29it/s][A

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 29.13it/s][A
Epoch 2:  53%|█████▎    | 3176/5971 [28:44<25:17,  1.84it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 29.32it/s][A
Epoch 2:  53%|█████▎    | 3180/5971 [28:44<25:13,  1.84it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 28.90it/s][A
Epoch 2:  53%|█████▎    | 3184/5971 [28:45<25:09,  1.85it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 27.75it/s][A

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 27.43it/s][A
Epoch 2:  53%|█████▎    | 3188/5971 [28:45<25:05,  1.85it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  85%|████████▌ | 142/167 [00:05<00:00, 27.57it/s][A
Epoch 2:  53%|█████▎    | 3192/5971 [28:45<25:01,  1.85it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  87%|████████▋ | 145/167 [00:05<00:00, 27.73it/s][A
Epoch 2:  54%|█████▎    | 3196/5971 [28:45<24:57,  1.85it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  89%|████████▊ | 148/167 [00:05<00:00, 27.90it/s][A
Epoch 2:  54%|█████▎    | 3200/5971 [28:45<24:53,  1.85it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  91%|█████████ | 152/167 [00:05<00:00, 29.43it/s][A

Validating:  93%|█████████▎| 155/167 [00:05<00:00, 29.50it/s][A
Epoch 2:  54%|█████▎    | 3204/5971 [28:45<24:49,  1.86it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 29.05it/s][A
Epoch 2:  54%|█████▎    | 3208/5971 [28:45<24:46,  1.86it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 28.59it/s][A
Epoch 2:  54%|█████▍    | 3212/5971 [28:46<24:42,  1.86it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 27.69it/s][A

Validating: 100%|██████████| 167/167 [00:06<00:00, 28.01it/s][A
Epoch 2:  54%|█████▍    | 3216/5971 [28:46<24:38,  1.86it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=1449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3217/5971 [28:47<24:38,  1.86it/s, loss=0.184, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00144, train/loss_step=0.329, global_step=1450.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  54%|█████▍    | 3218/5971 [28:48<24:38,  1.86it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.7e-5, train/loss_step=0.0127, global_step=1450.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3219/5971 [28:49<24:37,  1.86it/s, loss=0.186, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000521, train/loss_step=0.156, global_step=1450.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3220/5971 [28:51<24:39,  1.86it/s, loss=0.186, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000521, train/loss_step=0.156, global_step=1450.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3220/5971 [28:51<24:39,  1.86it/s, loss=0.197, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00183, train/loss_step=0.365, global_step=1450.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  54%|█████▍    | 3221/5971 [28:52<24:38,  1.86it/s, loss=0.197, v_num=0, train/loss_simple_step=0.00618, train/loss_vlb_step=2.9e-5, train/loss_step=0.00618, global_step=1451.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3222/5971 [28:53<24:38,  1.86it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00745, train/loss_vlb_step=3.59e-5, train/loss_step=0.00745, global_step=1451.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3223/5971 [28:54<24:38,  1.86it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=5e-5, train/loss_step=0.0113, global_step=1451.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]      
Epoch 2:  54%|█████▍    | 3224/5971 [28:56<24:39,  1.86it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=5e-5, train/loss_step=0.0113, global_step=1451.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3224/5971 [28:56<24:39,  1.86it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00792, train/loss_vlb_step=3.71e-5, train/loss_step=0.00792, global_step=1451.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3225/5971 [28:57<24:38,  1.86it/s, loss=0.169, v_num=0, train/loss_simple_step=0.714, train/loss_vlb_step=0.025, train/loss_step=0.714, global_step=1452.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]      
Epoch 2:  54%|█████▍    | 3226/5971 [28:58<24:38,  1.86it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0347, train/loss_vlb_step=0.000125, train/loss_step=0.0347, global_step=1452.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3227/5971 [28:59<24:38,  1.86it/s, loss=0.187, v_num=0, train/loss_simple_step=0.472, train/loss_vlb_step=0.00361, train/loss_step=0.472, global_step=1452.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  54%|█████▍    | 3228/5971 [29:01<24:39,  1.85it/s, loss=0.187, v_num=0, train/loss_simple_step=0.472, train/loss_vlb_step=0.00361, train/loss_step=0.472, global_step=1452.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3228/5971 [29:01<24:39,  1.85it/s, loss=0.189, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000387, train/loss_step=0.117, global_step=1452.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3229/5971 [29:02<24:39,  1.85it/s, loss=0.17, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00102, train/loss_step=0.267, global_step=1453.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  54%|█████▍    | 3230/5971 [29:03<24:38,  1.85it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00318, train/loss_vlb_step=1.79e-5, train/loss_step=0.00318, global_step=1453.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3231/5971 [29:04<24:38,  1.85it/s, loss=0.174, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000407, train/loss_step=0.123, global_step=1453.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  54%|█████▍    | 3232/5971 [29:06<24:39,  1.85it/s, loss=0.174, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000407, train/loss_step=0.123, global_step=1453.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3232/5971 [29:06<24:39,  1.85it/s, loss=0.161, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000824, train/loss_step=0.216, global_step=1453.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3233/5971 [29:07<24:39,  1.85it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0952, train/loss_vlb_step=0.000314, train/loss_step=0.0952, global_step=1454.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3234/5971 [29:08<24:38,  1.85it/s, loss=0.169, v_num=0, train/loss_simple_step=0.336, train/loss_vlb_step=0.00168, train/loss_step=0.336, global_step=1454.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  54%|█████▍    | 3235/5971 [29:08<24:38,  1.85it/s, loss=0.169, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=1454.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3236/5971 [29:11<24:39,  1.85it/s, loss=0.169, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=1454.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3236/5971 [29:11<24:39,  1.85it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0242, train/loss_vlb_step=9.44e-5, train/loss_step=0.0242, global_step=1454.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3237/5971 [29:12<24:39,  1.85it/s, loss=0.159, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000362, train/loss_step=0.110, global_step=1455.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3238/5971 [29:12<24:39,  1.85it/s, loss=0.169, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000745, train/loss_step=0.204, global_step=1455.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3239/5971 [29:13<24:38,  1.85it/s, loss=0.168, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000483, train/loss_step=0.145, global_step=1455.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3240/5971 [29:16<24:39,  1.85it/s, loss=0.168, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000483, train/loss_step=0.145, global_step=1455.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3240/5971 [29:16<24:39,  1.85it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=6.22e-5, train/loss_step=0.0144, global_step=1455.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3241/5971 [29:17<24:39,  1.85it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0638, train/loss_vlb_step=0.000212, train/loss_step=0.0638, global_step=1456.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3242/5971 [29:17<24:39,  1.84it/s, loss=0.17, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00166, train/loss_step=0.329, global_step=1456.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  54%|█████▍    | 3243/5971 [29:18<24:39,  1.84it/s, loss=0.181, v_num=0, train/loss_simple_step=0.248, train/loss_vlb_step=0.000973, train/loss_step=0.248, global_step=1456.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3244/5971 [29:20<24:39,  1.84it/s, loss=0.181, v_num=0, train/loss_simple_step=0.248, train/loss_vlb_step=0.000973, train/loss_step=0.248, global_step=1456.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3244/5971 [29:20<24:39,  1.84it/s, loss=0.217, v_num=0, train/loss_simple_step=0.724, train/loss_vlb_step=0.0203, train/loss_step=0.724, global_step=1456.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  54%|█████▍    | 3245/5971 [29:21<24:39,  1.84it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0495, train/loss_vlb_step=0.00017, train/loss_step=0.0495, global_step=1457.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3246/5971 [29:22<24:39,  1.84it/s, loss=0.2, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00206, train/loss_step=0.363, global_step=1457.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  54%|█████▍    | 3247/5971 [29:23<24:39,  1.84it/s, loss=0.213, v_num=0, train/loss_simple_step=0.723, train/loss_vlb_step=0.0129, train/loss_step=0.723, global_step=1457.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3248/5971 [29:25<24:39,  1.84it/s, loss=0.213, v_num=0, train/loss_simple_step=0.723, train/loss_vlb_step=0.0129, train/loss_step=0.723, global_step=1457.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3248/5971 [29:25<24:39,  1.84it/s, loss=0.209, v_num=0, train/loss_simple_step=0.0433, train/loss_vlb_step=0.00016, train/loss_step=0.0433, global_step=1457.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3249/5971 [29:26<24:39,  1.84it/s, loss=0.199, v_num=0, train/loss_simple_step=0.0537, train/loss_vlb_step=0.00019, train/loss_step=0.0537, global_step=1458.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3250/5971 [29:27<24:39,  1.84it/s, loss=0.199, v_num=0, train/loss_simple_step=0.00235, train/loss_vlb_step=1.33e-5, train/loss_step=0.00235, global_step=1458.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3251/5971 [29:28<24:39,  1.84it/s, loss=0.193, v_num=0, train/loss_simple_step=0.00334, train/loss_vlb_step=1.83e-5, train/loss_step=0.00334, global_step=1458.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3252/5971 [29:30<24:39,  1.84it/s, loss=0.193, v_num=0, train/loss_simple_step=0.00334, train/loss_vlb_step=1.83e-5, train/loss_step=0.00334, global_step=1458.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3252/5971 [29:30<24:39,  1.84it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=5.16e-5, train/loss_step=0.0113, global_step=1458.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  54%|█████▍    | 3253/5971 [29:31<24:39,  1.84it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=6.03e-5, train/loss_step=0.0143, global_step=1459.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  54%|█████▍    | 3254/5971 [29:32<24:39,  1.84it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0707, train/loss_vlb_step=0.00024, train/loss_step=0.0707, global_step=1459.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▍    | 3255/5971 [29:33<24:39,  1.84it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0129, train/loss_vlb_step=5.42e-5, train/loss_step=0.0129, global_step=1459.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  55%|█████▍    | 3256/5971 [29:35<24:39,  1.83it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0129, train/loss_vlb_step=5.42e-5, train/loss_step=0.0129, global_step=1459.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▍    | 3256/5971 [29:35<24:39,  1.83it/s, loss=0.168, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000569, train/loss_step=0.165, global_step=1459.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▍    | 3257/5971 [29:36<24:39,  1.83it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0698, train/loss_vlb_step=0.00023, train/loss_step=0.0698, global_step=1460.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▍    | 3258/5971 [29:37<24:39,  1.83it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.59e-5, train/loss_step=0.0104, global_step=1460.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▍    | 3259/5971 [29:37<24:39,  1.83it/s, loss=0.164, v_num=0, train/loss_simple_step=0.300, train/loss_vlb_step=0.00112, train/loss_step=0.300, global_step=1460.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  55%|█████▍    | 3260/5971 [29:40<24:39,  1.83it/s, loss=0.164, v_num=0, train/loss_simple_step=0.300, train/loss_vlb_step=0.00112, train/loss_step=0.300, global_step=1460.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▍    | 3260/5971 [29:40<24:39,  1.83it/s, loss=0.168, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=1460.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▍    | 3261/5971 [29:41<24:39,  1.83it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00905, train/loss_vlb_step=4.15e-5, train/loss_step=0.00905, global_step=1461.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▍    | 3262/5971 [29:42<24:40,  1.83it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=5.88e-5, train/loss_step=0.0136, global_step=1461.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  55%|█████▍    | 3263/5971 [29:43<24:39,  1.83it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00172, train/loss_vlb_step=1.03e-5, train/loss_step=0.00172, global_step=1461.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▍    | 3264/5971 [29:45<24:40,  1.83it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00172, train/loss_vlb_step=1.03e-5, train/loss_step=0.00172, global_step=1461.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▍    | 3264/5971 [29:45<24:40,  1.83it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0098, train/loss_vlb_step=4.63e-5, train/loss_step=0.0098, global_step=1461.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  55%|█████▍    | 3265/5971 [29:46<24:40,  1.83it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0425, train/loss_vlb_step=0.000158, train/loss_step=0.0425, global_step=1462.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▍    | 3266/5971 [29:47<24:40,  1.83it/s, loss=0.0846, v_num=0, train/loss_simple_step=0.0336, train/loss_vlb_step=0.000119, train/loss_step=0.0336, global_step=1462.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▍    | 3267/5971 [29:48<24:39,  1.83it/s, loss=0.0511, v_num=0, train/loss_simple_step=0.0525, train/loss_vlb_step=0.000183, train/loss_step=0.0525, global_step=1462.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▍    | 3268/5971 [29:50<24:40,  1.83it/s, loss=0.0511, v_num=0, train/loss_simple_step=0.0525, train/loss_vlb_step=0.000183, train/loss_step=0.0525, global_step=1462.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▍    | 3268/5971 [29:50<24:40,  1.83it/s, loss=0.049, v_num=0, train/loss_simple_step=0.00261, train/loss_vlb_step=1.49e-5, train/loss_step=0.00261, global_step=1462.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▍    | 3269/5971 [29:51<24:40,  1.83it/s, loss=0.0466, v_num=0, train/loss_simple_step=0.00439, train/loss_vlb_step=2.27e-5, train/loss_step=0.00439, global_step=1463.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▍    | 3270/5971 [29:52<24:40,  1.82it/s, loss=0.06, v_num=0, train/loss_simple_step=0.270, train/loss_vlb_step=0.00118, train/loss_step=0.270, global_step=1463.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]      
Epoch 2:  55%|█████▍    | 3271/5971 [29:53<24:39,  1.82it/s, loss=0.0602, v_num=0, train/loss_simple_step=0.00853, train/loss_vlb_step=3.95e-5, train/loss_step=0.00853, global_step=1463.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▍    | 3272/5971 [29:55<24:40,  1.82it/s, loss=0.0602, v_num=0, train/loss_simple_step=0.00853, train/loss_vlb_step=3.95e-5, train/loss_step=0.00853, global_step=1463.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▍    | 3272/5971 [29:55<24:40,  1.82it/s, loss=0.0735, v_num=0, train/loss_simple_step=0.277, train/loss_vlb_step=0.00109, train/loss_step=0.277, global_step=1463.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  55%|█████▍    | 3273/5971 [29:56<24:40,  1.82it/s, loss=0.0798, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000469, train/loss_step=0.140, global_step=1464.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▍    | 3274/5971 [29:57<24:40,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.729, train/loss_vlb_step=0.0194, train/loss_step=0.729, global_step=1464.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  55%|█████▍    | 3275/5971 [29:58<24:39,  1.82it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0739, train/loss_vlb_step=0.000246, train/loss_step=0.0739, global_step=1464.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▍    | 3276/5971 [30:00<24:40,  1.82it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0739, train/loss_vlb_step=0.000246, train/loss_step=0.0739, global_step=1464.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▍    | 3276/5971 [30:00<24:40,  1.82it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00275, train/loss_vlb_step=1.5e-5, train/loss_step=0.00275, global_step=1464.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▍    | 3277/5971 [30:01<24:40,  1.82it/s, loss=0.11, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000386, train/loss_step=0.117, global_step=1465.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  55%|█████▍    | 3278/5971 [30:01<24:39,  1.82it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0573, train/loss_vlb_step=0.000194, train/loss_step=0.0573, global_step=1465.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▍    | 3279/5971 [30:02<24:39,  1.82it/s, loss=0.098, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.7e-5, train/loss_step=0.0125, global_step=1465.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  55%|█████▍    | 3280/5971 [30:05<24:40,  1.82it/s, loss=0.098, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.7e-5, train/loss_step=0.0125, global_step=1465.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▍    | 3280/5971 [30:05<24:40,  1.82it/s, loss=0.101, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000526, train/loss_step=0.160, global_step=1465.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▍    | 3281/5971 [30:06<24:40,  1.82it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0664, train/loss_vlb_step=0.000221, train/loss_step=0.0664, global_step=1466.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▍    | 3282/5971 [30:07<24:40,  1.82it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00204, train/loss_vlb_step=1.2e-5, train/loss_step=0.00204, global_step=1466.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▍    | 3283/5971 [30:07<24:39,  1.82it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00266, train/loss_vlb_step=1.41e-5, train/loss_step=0.00266, global_step=1466.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▍    | 3284/5971 [30:09<24:40,  1.81it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00266, train/loss_vlb_step=1.41e-5, train/loss_step=0.00266, global_step=1466.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▍    | 3284/5971 [30:09<24:40,  1.81it/s, loss=0.137, v_num=0, train/loss_simple_step=0.681, train/loss_vlb_step=0.0103, train/loss_step=0.681, global_step=1466.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2:  55%|█████▌    | 3285/5971 [30:10<24:40,  1.81it/s, loss=0.14, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000351, train/loss_step=0.106, global_step=1467.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▌    | 3286/5971 [30:11<24:39,  1.81it/s, loss=0.148, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000694, train/loss_step=0.202, global_step=1467.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▌    | 3287/5971 [30:12<24:39,  1.81it/s, loss=0.169, v_num=0, train/loss_simple_step=0.474, train/loss_vlb_step=0.00524, train/loss_step=0.474, global_step=1467.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  55%|█████▌    | 3288/5971 [30:15<24:40,  1.81it/s, loss=0.169, v_num=0, train/loss_simple_step=0.474, train/loss_vlb_step=0.00524, train/loss_step=0.474, global_step=1467.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▌    | 3288/5971 [30:15<24:40,  1.81it/s, loss=0.194, v_num=0, train/loss_simple_step=0.483, train/loss_vlb_step=0.00376, train/loss_step=0.483, global_step=1467.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▌    | 3289/5971 [30:15<24:40,  1.81it/s, loss=0.194, v_num=0, train/loss_simple_step=0.00558, train/loss_vlb_step=2.69e-5, train/loss_step=0.00558, global_step=1468.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▌    | 3290/5971 [30:16<24:40,  1.81it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0571, train/loss_vlb_step=0.000201, train/loss_step=0.0571, global_step=1468.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  55%|█████▌    | 3291/5971 [30:17<24:39,  1.81it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0216, train/loss_vlb_step=8.68e-5, train/loss_step=0.0216, global_step=1468.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  55%|█████▌    | 3292/5971 [30:19<24:40,  1.81it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0216, train/loss_vlb_step=8.68e-5, train/loss_step=0.0216, global_step=1468.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▌    | 3292/5971 [30:19<24:40,  1.81it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00639, train/loss_vlb_step=3.12e-5, train/loss_step=0.00639, global_step=1468.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▌    | 3293/5971 [30:20<24:40,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0491, train/loss_vlb_step=0.000172, train/loss_step=0.0491, global_step=1469.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▌    | 3294/5971 [30:21<24:39,  1.81it/s, loss=0.146, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00144, train/loss_step=0.345, global_step=1469.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  55%|█████▌    | 3295/5971 [30:22<24:39,  1.81it/s, loss=0.156, v_num=0, train/loss_simple_step=0.265, train/loss_vlb_step=0.00112, train/loss_step=0.265, global_step=1469.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▌    | 3296/5971 [30:24<24:40,  1.81it/s, loss=0.156, v_num=0, train/loss_simple_step=0.265, train/loss_vlb_step=0.00112, train/loss_step=0.265, global_step=1469.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▌    | 3296/5971 [30:24<24:40,  1.81it/s, loss=0.158, v_num=0, train/loss_simple_step=0.042, train/loss_vlb_step=0.000149, train/loss_step=0.042, global_step=1469.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▌    | 3297/5971 [30:25<24:40,  1.81it/s, loss=0.154, v_num=0, train/loss_simple_step=0.049, train/loss_vlb_step=0.000177, train/loss_step=0.049, global_step=1470.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▌    | 3298/5971 [30:26<24:39,  1.81it/s, loss=0.176, v_num=0, train/loss_simple_step=0.487, train/loss_vlb_step=0.0821, train/loss_step=0.487, global_step=1470.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  55%|█████▌    | 3299/5971 [30:27<24:39,  1.81it/s, loss=0.19, v_num=0, train/loss_simple_step=0.304, train/loss_vlb_step=0.00145, train/loss_step=0.304, global_step=1470.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▌    | 3300/5971 [30:29<24:40,  1.80it/s, loss=0.19, v_num=0, train/loss_simple_step=0.304, train/loss_vlb_step=0.00145, train/loss_step=0.304, global_step=1470.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▌    | 3300/5971 [30:29<24:40,  1.80it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00739, train/loss_vlb_step=3.65e-5, train/loss_step=0.00739, global_step=1470.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▌    | 3301/5971 [30:30<24:39,  1.80it/s, loss=0.18, v_num=0, train/loss_simple_step=0.00331, train/loss_vlb_step=1.74e-5, train/loss_step=0.00331, global_step=1471.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  55%|█████▌    | 3302/5971 [30:31<24:39,  1.80it/s, loss=0.186, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=1471.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  55%|█████▌    | 3303/5971 [30:32<24:39,  1.80it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00696, train/loss_vlb_step=3.34e-5, train/loss_step=0.00696, global_step=1471.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▌    | 3304/5971 [30:34<24:40,  1.80it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00696, train/loss_vlb_step=3.34e-5, train/loss_step=0.00696, global_step=1471.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▌    | 3304/5971 [30:34<24:40,  1.80it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00264, train/loss_vlb_step=1.5e-5, train/loss_step=0.00264, global_step=1471.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  55%|█████▌    | 3305/5971 [30:34<24:39,  1.80it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00497, train/loss_vlb_step=2.59e-5, train/loss_step=0.00497, global_step=1472.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▌    | 3306/5971 [30:35<24:39,  1.80it/s, loss=0.145, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000514, train/loss_step=0.155, global_step=1472.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  55%|█████▌    | 3307/5971 [30:36<24:39,  1.80it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00628, train/loss_vlb_step=3.11e-5, train/loss_step=0.00628, global_step=1472.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▌    | 3308/5971 [30:39<24:40,  1.80it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00628, train/loss_vlb_step=3.11e-5, train/loss_step=0.00628, global_step=1472.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▌    | 3308/5971 [30:39<24:40,  1.80it/s, loss=0.0976, v_num=0, train/loss_simple_step=0.010, train/loss_vlb_step=4.77e-5, train/loss_step=0.010, global_step=1472.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  55%|█████▌    | 3309/5971 [30:40<24:39,  1.80it/s, loss=0.0976, v_num=0, train/loss_simple_step=0.00628, train/loss_vlb_step=3.01e-5, train/loss_step=0.00628, global_step=1473.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▌    | 3310/5971 [30:41<24:39,  1.80it/s, loss=0.104, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.000677, train/loss_step=0.186, global_step=1473.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  55%|█████▌    | 3311/5971 [30:42<24:39,  1.80it/s, loss=0.109, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000384, train/loss_step=0.117, global_step=1473.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▌    | 3312/5971 [30:44<24:40,  1.80it/s, loss=0.109, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000384, train/loss_step=0.117, global_step=1473.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▌    | 3312/5971 [30:44<24:40,  1.80it/s, loss=0.118, v_num=0, train/loss_simple_step=0.198, train/loss_vlb_step=0.000677, train/loss_step=0.198, global_step=1473.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  55%|█████▌    | 3313/5971 [30:45<24:40,  1.80it/s, loss=0.138, v_num=0, train/loss_simple_step=0.432, train/loss_vlb_step=0.00248, train/loss_step=0.432, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  56%|█████▌    | 3314/5971 [30:46<24:39,  1.80it/s, loss=0.13, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000651, train/loss_step=0.192, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  56%|█████▌    | 3315/5971 [30:47<24:39,  1.80it/s, loss=0.133, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00245, train/loss_step=0.318, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  56%|█████▌    | 3316/5971 [30:49<24:40,  1.79it/s, loss=0.133, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00245, train/loss_step=0.318, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  56%|█████▌    | 3316/5971 [30:49<24:40,  1.79it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:20,  2.05it/s][A

Validating:   1%|          | 2/167 [00:00<00:46,  3.55it/s][A
Epoch 2:  56%|█████▌    | 3320/5971 [30:49<24:36,  1.80it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.08it/s][A
Epoch 2:  56%|█████▌    | 3324/5971 [30:50<24:32,  1.80it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:00<00:12, 13.20it/s][A

Validating:   7%|▋         | 11/167 [00:01<00:09, 16.32it/s][A
Epoch 2:  56%|█████▌    | 3328/5971 [30:50<24:28,  1.80it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   9%|▉         | 15/167 [00:01<00:07, 19.98it/s][A
Epoch 2:  56%|█████▌    | 3332/5971 [30:50<24:25,  1.80it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  11%|█         | 18/167 [00:01<00:06, 21.92it/s][A
Epoch 2:  56%|█████▌    | 3336/5971 [30:50<24:21,  1.80it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 23.52it/s][A
Epoch 2:  56%|█████▌    | 3340/5971 [30:50<24:17,  1.81it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  14%|█▍        | 24/167 [00:01<00:06, 23.18it/s][A

Validating:  16%|█▌        | 27/167 [00:01<00:05, 23.97it/s][A
Epoch 2:  56%|█████▌    | 3344/5971 [30:50<24:13,  1.81it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 23.89it/s][A
Epoch 2:  56%|█████▌    | 3348/5971 [30:51<24:09,  1.81it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 25.33it/s][A
Epoch 2:  56%|█████▌    | 3352/5971 [30:51<24:05,  1.81it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  22%|██▏       | 36/167 [00:01<00:04, 26.56it/s][A

Validating:  23%|██▎       | 39/167 [00:02<00:04, 27.06it/s][A
Epoch 2:  56%|█████▌    | 3356/5971 [30:51<24:02,  1.81it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 26.34it/s][A
Epoch 2:  56%|█████▋    | 3360/5971 [30:51<23:58,  1.82it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 25.89it/s][A
Epoch 2:  56%|█████▋    | 3364/5971 [30:51<23:54,  1.82it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 26.23it/s][A

Validating:  31%|███       | 51/167 [00:02<00:04, 25.27it/s][A
Epoch 2:  56%|█████▋    | 3368/5971 [30:51<23:50,  1.82it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 24.92it/s][A
Epoch 2:  56%|█████▋    | 3372/5971 [30:51<23:46,  1.82it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 25.17it/s][A
Epoch 2:  57%|█████▋    | 3376/5971 [30:52<23:43,  1.82it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  36%|███▌      | 60/167 [00:02<00:04, 23.87it/s][A

Validating:  38%|███▊      | 63/167 [00:03<00:04, 21.50it/s][A
Epoch 2:  57%|█████▋    | 3380/5971 [30:52<23:39,  1.83it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  40%|███▉      | 66/167 [00:03<00:04, 20.33it/s][A
Epoch 2:  57%|█████▋    | 3384/5971 [30:52<23:35,  1.83it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  41%|████▏     | 69/167 [00:03<00:04, 22.29it/s][A
Epoch 2:  57%|█████▋    | 3388/5971 [30:52<23:32,  1.83it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 23.99it/s][A

Validating:  45%|████▍     | 75/167 [00:03<00:03, 25.47it/s][A
Epoch 2:  57%|█████▋    | 3392/5971 [30:52<23:28,  1.83it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 25.03it/s][A
Epoch 2:  57%|█████▋    | 3396/5971 [30:52<23:24,  1.83it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 25.41it/s][A
Epoch 2:  57%|█████▋    | 3400/5971 [30:53<23:20,  1.84it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  50%|█████     | 84/167 [00:03<00:03, 25.46it/s][A

Validating:  52%|█████▏    | 87/167 [00:04<00:03, 25.69it/s][A
Epoch 2:  57%|█████▋    | 3404/5971 [30:53<23:17,  1.84it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  54%|█████▍    | 90/167 [00:04<00:02, 25.82it/s][A
Epoch 2:  57%|█████▋    | 3408/5971 [30:53<23:13,  1.84it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 25.19it/s][A
Epoch 2:  57%|█████▋    | 3412/5971 [30:53<23:09,  1.84it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 25.20it/s][A

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 25.18it/s][A
Epoch 2:  57%|█████▋    | 3416/5971 [30:53<23:06,  1.84it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  61%|██████    | 102/167 [00:04<00:02, 25.60it/s][A
Epoch 2:  57%|█████▋    | 3420/5971 [30:53<23:02,  1.85it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 26.77it/s][A
Epoch 2:  57%|█████▋    | 3424/5971 [30:54<22:58,  1.85it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 26.80it/s][A

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 26.60it/s][A
Epoch 2:  57%|█████▋    | 3428/5971 [30:54<22:55,  1.85it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  68%|██████▊   | 114/167 [00:05<00:02, 26.33it/s][A
Epoch 2:  57%|█████▋    | 3432/5971 [30:54<22:51,  1.85it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  70%|███████   | 117/167 [00:05<00:01, 26.13it/s][A
Epoch 2:  58%|█████▊    | 3436/5971 [30:54<22:47,  1.85it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 26.45it/s][A

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 26.28it/s][A
Epoch 2:  58%|█████▊    | 3440/5971 [30:54<22:44,  1.86it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 26.43it/s][A
Epoch 2:  58%|█████▊    | 3444/5971 [30:54<22:40,  1.86it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 25.59it/s][A
Epoch 2:  58%|█████▊    | 3448/5971 [30:54<22:36,  1.86it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 25.78it/s][A

Validating:  81%|████████  | 135/167 [00:05<00:01, 25.97it/s][A
Epoch 2:  58%|█████▊    | 3452/5971 [30:55<22:33,  1.86it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 26.67it/s][A
Epoch 2:  58%|█████▊    | 3456/5971 [30:55<22:29,  1.86it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  84%|████████▍ | 141/167 [00:06<00:00, 26.69it/s][A
Epoch 2:  58%|█████▊    | 3460/5971 [30:55<22:26,  1.87it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 26.19it/s][A

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 26.97it/s][A
Epoch 2:  58%|█████▊    | 3464/5971 [30:55<22:22,  1.87it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 26.82it/s][A
Epoch 2:  58%|█████▊    | 3468/5971 [30:55<22:18,  1.87it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 27.64it/s][A
Epoch 2:  58%|█████▊    | 3472/5971 [30:55<22:15,  1.87it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 27.79it/s][A

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 27.72it/s][A
Epoch 2:  58%|█████▊    | 3476/5971 [30:55<22:11,  1.87it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 26.87it/s][A
Epoch 2:  58%|█████▊    | 3480/5971 [30:56<22:08,  1.88it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  99%|█████████▉| 165/167 [00:06<00:00, 26.76it/s][A
Epoch 2:  58%|█████▊    | 3484/5971 [30:56<22:04,  1.88it/s, loss=0.162, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.00726, train/loss_step=0.624, global_step=1474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  58%|█████▊    | 3485/5971 [30:57<22:04,  1.88it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00316, train/loss_vlb_step=1.66e-5, train/loss_step=0.00316, global_step=1475.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  58%|█████▊    | 3486/5971 [30:58<22:04,  1.88it/s, loss=0.158, v_num=0, train/loss_simple_step=0.461, train/loss_vlb_step=0.00262, train/loss_step=0.461, global_step=1475.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  58%|█████▊    | 3487/5971 [30:59<22:04,  1.88it/s, loss=0.146, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000191, train/loss_step=0.055, global_step=1475.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  58%|█████▊    | 3488/5971 [31:01<22:04,  1.87it/s, loss=0.146, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000191, train/loss_step=0.055, global_step=1475.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  58%|█████▊    | 3488/5971 [31:01<22:04,  1.87it/s, loss=0.153, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000512, train/loss_step=0.155, global_step=1475.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  58%|█████▊    | 3489/5971 [31:02<22:04,  1.87it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0542, train/loss_vlb_step=0.000185, train/loss_step=0.0542, global_step=1476.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  58%|█████▊    | 3490/5971 [31:03<22:04,  1.87it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.55e-5, train/loss_step=0.00481, global_step=1476.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  58%|█████▊    | 3491/5971 [31:04<22:03,  1.87it/s, loss=0.166, v_num=0, train/loss_simple_step=0.337, train/loss_vlb_step=0.00168, train/loss_step=0.337, global_step=1476.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  58%|█████▊    | 3492/5971 [31:06<22:04,  1.87it/s, loss=0.166, v_num=0, train/loss_simple_step=0.337, train/loss_vlb_step=0.00168, train/loss_step=0.337, global_step=1476.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  58%|█████▊    | 3492/5971 [31:06<22:04,  1.87it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0811, train/loss_vlb_step=0.000267, train/loss_step=0.0811, global_step=1476.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  58%|█████▊    | 3493/5971 [31:07<22:04,  1.87it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0043, train/loss_vlb_step=2.21e-5, train/loss_step=0.0043, global_step=1477.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  59%|█████▊    | 3494/5971 [31:08<22:03,  1.87it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00153, train/loss_vlb_step=9.04e-6, train/loss_step=0.00153, global_step=1477.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▊    | 3495/5971 [31:08<22:03,  1.87it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0239, train/loss_vlb_step=9.42e-5, train/loss_step=0.0239, global_step=1477.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  59%|█████▊    | 3496/5971 [31:11<22:04,  1.87it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0239, train/loss_vlb_step=9.42e-5, train/loss_step=0.0239, global_step=1477.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▊    | 3496/5971 [31:11<22:04,  1.87it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0878, train/loss_vlb_step=0.000289, train/loss_step=0.0878, global_step=1477.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▊    | 3497/5971 [31:12<22:04,  1.87it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00811, train/loss_vlb_step=3.85e-5, train/loss_step=0.00811, global_step=1478.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▊    | 3498/5971 [31:12<22:03,  1.87it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=7.71e-5, train/loss_step=0.0192, global_step=1478.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  59%|█████▊    | 3499/5971 [31:13<22:03,  1.87it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00154, train/loss_vlb_step=9.08e-6, train/loss_step=0.00154, global_step=1478.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▊    | 3500/5971 [31:15<22:04,  1.87it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00154, train/loss_vlb_step=9.08e-6, train/loss_step=0.00154, global_step=1478.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▊    | 3500/5971 [31:15<22:04,  1.87it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0218, train/loss_vlb_step=8.61e-5, train/loss_step=0.0218, global_step=1478.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  59%|█████▊    | 3501/5971 [31:16<22:03,  1.87it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0963, train/loss_vlb_step=0.000317, train/loss_step=0.0963, global_step=1479.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▊    | 3502/5971 [31:17<22:03,  1.87it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00331, train/loss_vlb_step=1.79e-5, train/loss_step=0.00331, global_step=1479.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▊    | 3503/5971 [31:18<22:03,  1.87it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0763, train/loss_vlb_step=0.000252, train/loss_step=0.0763, global_step=1479.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  59%|█████▊    | 3504/5971 [31:21<22:03,  1.86it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0763, train/loss_vlb_step=0.000252, train/loss_step=0.0763, global_step=1479.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▊    | 3504/5971 [31:21<22:03,  1.86it/s, loss=0.102, v_num=0, train/loss_simple_step=0.541, train/loss_vlb_step=0.00764, train/loss_step=0.541, global_step=1479.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  59%|█████▊    | 3505/5971 [31:21<22:03,  1.86it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.39e-5, train/loss_step=0.0122, global_step=1480.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▊    | 3506/5971 [31:22<22:03,  1.86it/s, loss=0.0813, v_num=0, train/loss_simple_step=0.0421, train/loss_vlb_step=0.00016, train/loss_step=0.0421, global_step=1480.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▊    | 3507/5971 [31:23<22:03,  1.86it/s, loss=0.0858, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000483, train/loss_step=0.145, global_step=1480.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  59%|█████▉    | 3508/5971 [31:25<22:03,  1.86it/s, loss=0.0858, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000483, train/loss_step=0.145, global_step=1480.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3508/5971 [31:25<22:03,  1.86it/s, loss=0.0784, v_num=0, train/loss_simple_step=0.00699, train/loss_vlb_step=3.37e-5, train/loss_step=0.00699, global_step=1480.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3509/5971 [31:26<22:03,  1.86it/s, loss=0.0817, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000401, train/loss_step=0.120, global_step=1481.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  59%|█████▉    | 3510/5971 [31:27<22:03,  1.86it/s, loss=0.0875, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.00041, train/loss_step=0.121, global_step=1481.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  59%|█████▉    | 3511/5971 [31:28<22:02,  1.86it/s, loss=0.0999, v_num=0, train/loss_simple_step=0.585, train/loss_vlb_step=0.0076, train/loss_step=0.585, global_step=1481.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  59%|█████▉    | 3512/5971 [31:30<22:03,  1.86it/s, loss=0.0999, v_num=0, train/loss_simple_step=0.585, train/loss_vlb_step=0.0076, train/loss_step=0.585, global_step=1481.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3512/5971 [31:30<22:03,  1.86it/s, loss=0.096, v_num=0, train/loss_simple_step=0.00393, train/loss_vlb_step=2.06e-5, train/loss_step=0.00393, global_step=1481.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3513/5971 [31:31<22:03,  1.86it/s, loss=0.106, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000802, train/loss_step=0.205, global_step=1482.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  59%|█████▉    | 3514/5971 [31:32<22:03,  1.86it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0692, train/loss_vlb_step=0.000228, train/loss_step=0.0692, global_step=1482.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3515/5971 [31:33<22:02,  1.86it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0367, train/loss_vlb_step=0.000127, train/loss_step=0.0367, global_step=1482.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  59%|█████▉    | 3516/5971 [31:35<22:03,  1.86it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0367, train/loss_vlb_step=0.000127, train/loss_step=0.0367, global_step=1482.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3516/5971 [31:35<22:03,  1.86it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0286, train/loss_vlb_step=0.000107, train/loss_step=0.0286, global_step=1482.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3517/5971 [31:36<22:03,  1.85it/s, loss=0.129, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00406, train/loss_step=0.452, global_step=1483.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  59%|█████▉    | 3518/5971 [31:37<22:02,  1.85it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0813, train/loss_vlb_step=0.00027, train/loss_step=0.0813, global_step=1483.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3519/5971 [31:38<22:02,  1.85it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0077, train/loss_vlb_step=3.46e-5, train/loss_step=0.0077, global_step=1483.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3520/5971 [31:40<22:02,  1.85it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0077, train/loss_vlb_step=3.46e-5, train/loss_step=0.0077, global_step=1483.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3520/5971 [31:40<22:02,  1.85it/s, loss=0.161, v_num=0, train/loss_simple_step=0.577, train/loss_vlb_step=0.00854, train/loss_step=0.577, global_step=1483.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  59%|█████▉    | 3521/5971 [31:41<22:02,  1.85it/s, loss=0.168, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.000904, train/loss_step=0.250, global_step=1484.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3522/5971 [31:42<22:02,  1.85it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0759, train/loss_vlb_step=0.000254, train/loss_step=0.0759, global_step=1484.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3523/5971 [31:43<22:02,  1.85it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0908, train/loss_vlb_step=0.0003, train/loss_step=0.0908, global_step=1484.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  59%|█████▉    | 3524/5971 [31:45<22:02,  1.85it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0908, train/loss_vlb_step=0.0003, train/loss_step=0.0908, global_step=1484.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3524/5971 [31:45<22:02,  1.85it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00341, train/loss_vlb_step=1.88e-5, train/loss_step=0.00341, global_step=1484.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3525/5971 [31:46<22:02,  1.85it/s, loss=0.155, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000756, train/loss_step=0.202, global_step=1485.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  59%|█████▉    | 3526/5971 [31:47<22:02,  1.85it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00524, train/loss_vlb_step=2.67e-5, train/loss_step=0.00524, global_step=1485.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3527/5971 [31:48<22:01,  1.85it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0367, train/loss_vlb_step=0.000127, train/loss_step=0.0367, global_step=1485.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  59%|█████▉    | 3528/5971 [31:50<22:02,  1.85it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0367, train/loss_vlb_step=0.000127, train/loss_step=0.0367, global_step=1485.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3528/5971 [31:50<22:02,  1.85it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0262, train/loss_vlb_step=0.000104, train/loss_step=0.0262, global_step=1485.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3529/5971 [31:51<22:02,  1.85it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0311, train/loss_vlb_step=0.000115, train/loss_step=0.0311, global_step=1486.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3530/5971 [31:52<22:01,  1.85it/s, loss=0.156, v_num=0, train/loss_simple_step=0.351, train/loss_vlb_step=0.00261, train/loss_step=0.351, global_step=1486.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  59%|█████▉    | 3531/5971 [31:53<22:01,  1.85it/s, loss=0.141, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00122, train/loss_step=0.287, global_step=1486.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3532/5971 [31:55<22:02,  1.84it/s, loss=0.141, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00122, train/loss_step=0.287, global_step=1486.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3532/5971 [31:55<22:02,  1.84it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00552, train/loss_vlb_step=2.9e-5, train/loss_step=0.00552, global_step=1486.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3533/5971 [31:56<22:02,  1.84it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0368, train/loss_vlb_step=0.000138, train/loss_step=0.0368, global_step=1487.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3534/5971 [31:57<22:01,  1.84it/s, loss=0.167, v_num=0, train/loss_simple_step=0.753, train/loss_vlb_step=0.0282, train/loss_step=0.753, global_step=1487.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  59%|█████▉    | 3535/5971 [31:58<22:01,  1.84it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00387, train/loss_vlb_step=2.04e-5, train/loss_step=0.00387, global_step=1487.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3536/5971 [32:01<22:02,  1.84it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00387, train/loss_vlb_step=2.04e-5, train/loss_step=0.00387, global_step=1487.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3536/5971 [32:01<22:02,  1.84it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0208, train/loss_vlb_step=8.56e-5, train/loss_step=0.0208, global_step=1487.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  59%|█████▉    | 3537/5971 [32:02<22:02,  1.84it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0466, train/loss_vlb_step=0.00017, train/loss_step=0.0466, global_step=1488.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3538/5971 [32:03<22:02,  1.84it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0513, train/loss_vlb_step=0.000177, train/loss_step=0.0513, global_step=1488.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3539/5971 [32:03<22:01,  1.84it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0815, train/loss_vlb_step=0.000283, train/loss_step=0.0815, global_step=1488.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3540/5971 [32:06<22:02,  1.84it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0815, train/loss_vlb_step=0.000283, train/loss_step=0.0815, global_step=1488.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3540/5971 [32:06<22:02,  1.84it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=1488.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3541/5971 [32:06<22:01,  1.84it/s, loss=0.126, v_num=0, train/loss_simple_step=0.416, train/loss_vlb_step=0.00204, train/loss_step=0.416, global_step=1489.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  59%|█████▉    | 3542/5971 [32:07<22:01,  1.84it/s, loss=0.151, v_num=0, train/loss_simple_step=0.574, train/loss_vlb_step=0.0087, train/loss_step=0.574, global_step=1489.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  59%|█████▉    | 3543/5971 [32:08<22:01,  1.84it/s, loss=0.158, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.000903, train/loss_step=0.231, global_step=1489.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3544/5971 [32:10<22:01,  1.84it/s, loss=0.158, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.000903, train/loss_step=0.231, global_step=1489.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3544/5971 [32:10<22:01,  1.84it/s, loss=0.177, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00288, train/loss_step=0.376, global_step=1489.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  59%|█████▉    | 3545/5971 [32:11<22:01,  1.84it/s, loss=0.179, v_num=0, train/loss_simple_step=0.244, train/loss_vlb_step=0.000999, train/loss_step=0.244, global_step=1490.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3546/5971 [32:12<22:01,  1.84it/s, loss=0.183, v_num=0, train/loss_simple_step=0.080, train/loss_vlb_step=0.000267, train/loss_step=0.080, global_step=1490.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3547/5971 [32:13<22:00,  1.84it/s, loss=0.2, v_num=0, train/loss_simple_step=0.392, train/loss_vlb_step=0.00268, train/loss_step=0.392, global_step=1490.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  59%|█████▉    | 3548/5971 [32:16<22:01,  1.83it/s, loss=0.2, v_num=0, train/loss_simple_step=0.392, train/loss_vlb_step=0.00268, train/loss_step=0.392, global_step=1490.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3548/5971 [32:16<22:01,  1.83it/s, loss=0.224, v_num=0, train/loss_simple_step=0.501, train/loss_vlb_step=0.00593, train/loss_step=0.501, global_step=1490.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3549/5971 [32:17<22:01,  1.83it/s, loss=0.235, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000941, train/loss_step=0.255, global_step=1491.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3550/5971 [32:18<22:01,  1.83it/s, loss=0.246, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.00612, train/loss_step=0.563, global_step=1491.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  59%|█████▉    | 3551/5971 [32:18<22:01,  1.83it/s, loss=0.25, v_num=0, train/loss_simple_step=0.367, train/loss_vlb_step=0.00162, train/loss_step=0.367, global_step=1491.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  59%|█████▉    | 3552/5971 [32:21<22:01,  1.83it/s, loss=0.25, v_num=0, train/loss_simple_step=0.367, train/loss_vlb_step=0.00162, train/loss_step=0.367, global_step=1491.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  59%|█████▉    | 3552/5971 [32:21<22:01,  1.83it/s, loss=0.259, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000657, train/loss_step=0.192, global_step=1491.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  60%|█████▉    | 3553/5971 [32:22<22:01,  1.83it/s, loss=0.274, v_num=0, train/loss_simple_step=0.323, train/loss_vlb_step=0.00186, train/loss_step=0.323, global_step=1492.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  60%|█████▉    | 3554/5971 [32:23<22:01,  1.83it/s, loss=0.264, v_num=0, train/loss_simple_step=0.551, train/loss_vlb_step=0.00796, train/loss_step=0.551, global_step=1492.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  60%|█████▉    | 3555/5971 [32:23<22:00,  1.83it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0162, train/loss_vlb_step=6.72e-5, train/loss_step=0.0162, global_step=1492.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  60%|█████▉    | 3556/5971 [32:26<22:01,  1.83it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0162, train/loss_vlb_step=6.72e-5, train/loss_step=0.0162, global_step=1492.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  60%|█████▉    | 3556/5971 [32:26<22:01,  1.83it/s, loss=0.265, v_num=0, train/loss_simple_step=0.046, train/loss_vlb_step=0.000154, train/loss_step=0.046, global_step=1492.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  60%|█████▉    | 3557/5971 [32:26<22:00,  1.83it/s, loss=0.265, v_num=0, train/loss_simple_step=0.034, train/loss_vlb_step=0.000125, train/loss_step=0.034, global_step=1493.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  60%|█████▉    | 3558/5971 [32:27<22:00,  1.83it/s, loss=0.277, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00115, train/loss_step=0.287, global_step=1493.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  60%|█████▉    | 3559/5971 [32:28<22:00,  1.83it/s, loss=0.274, v_num=0, train/loss_simple_step=0.0268, train/loss_vlb_step=0.000105, train/loss_step=0.0268, global_step=1493.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  60%|█████▉    | 3560/5971 [32:31<22:01,  1.82it/s, loss=0.274, v_num=0, train/loss_simple_step=0.0268, train/loss_vlb_step=0.000105, train/loss_step=0.0268, global_step=1493.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  60%|█████▉    | 3560/5971 [32:31<22:01,  1.82it/s, loss=0.274, v_num=0, train/loss_simple_step=0.00204, train/loss_vlb_step=1.21e-5, train/loss_step=0.00204, global_step=1493.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  60%|█████▉    | 3561/5971 [32:32<22:00,  1.82it/s, loss=0.27, v_num=0, train/loss_simple_step=0.334, train/loss_vlb_step=0.00226, train/loss_step=0.334, global_step=1494.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2:  60%|█████▉    | 3562/5971 [32:33<22:00,  1.82it/s, loss=0.258, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00205, train/loss_step=0.349, global_step=1494.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  60%|█████▉    | 3563/5971 [32:34<22:00,  1.82it/s, loss=0.247, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.85e-5, train/loss_step=0.0102, global_step=1494.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  60%|█████▉    | 3564/5971 [32:36<22:01,  1.82it/s, loss=0.247, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.85e-5, train/loss_step=0.0102, global_step=1494.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  60%|█████▉    | 3564/5971 [32:36<22:01,  1.82it/s, loss=0.232, v_num=0, train/loss_simple_step=0.0608, train/loss_vlb_step=0.00021, train/loss_step=0.0608, global_step=1494.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  60%|█████▉    | 3565/5971 [32:37<22:00,  1.82it/s, loss=0.22, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.87e-5, train/loss_step=0.0105, global_step=1495.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  60%|█████▉    | 3566/5971 [32:38<22:00,  1.82it/s, loss=0.216, v_num=0, train/loss_simple_step=0.00571, train/loss_vlb_step=2.69e-5, train/loss_step=0.00571, global_step=1495.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  60%|█████▉    | 3567/5971 [32:39<22:00,  1.82it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0912, train/loss_vlb_step=0.000306, train/loss_step=0.0912, global_step=1495.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  60%|█████▉    | 3568/5971 [32:41<22:00,  1.82it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0912, train/loss_vlb_step=0.000306, train/loss_step=0.0912, global_step=1495.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  60%|█████▉    | 3568/5971 [32:41<22:00,  1.82it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.47e-5, train/loss_step=0.0126, global_step=1495.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  60%|█████▉    | 3569/5971 [32:42<22:00,  1.82it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0194, train/loss_vlb_step=8.1e-5, train/loss_step=0.0194, global_step=1496.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  60%|█████▉    | 3570/5971 [32:43<21:59,  1.82it/s, loss=0.152, v_num=0, train/loss_simple_step=0.294, train/loss_vlb_step=0.00151, train/loss_step=0.294, global_step=1496.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  60%|█████▉    | 3571/5971 [32:43<21:59,  1.82it/s, loss=0.153, v_num=0, train/loss_simple_step=0.385, train/loss_vlb_step=0.00222, train/loss_step=0.385, global_step=1496.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  60%|█████▉    | 3572/5971 [32:46<22:00,  1.82it/s, loss=0.153, v_num=0, train/loss_simple_step=0.385, train/loss_vlb_step=0.00222, train/loss_step=0.385, global_step=1496.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  60%|█████▉    | 3572/5971 [32:46<22:00,  1.82it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0081, train/loss_vlb_step=3.84e-5, train/loss_step=0.0081, global_step=1496.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  60%|█████▉    | 3573/5971 [32:47<21:59,  1.82it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00863, train/loss_vlb_step=4.07e-5, train/loss_step=0.00863, global_step=1497.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  60%|█████▉    | 3574/5971 [32:48<21:59,  1.82it/s, loss=0.1, v_num=0, train/loss_simple_step=0.00247, train/loss_vlb_step=1.27e-5, train/loss_step=0.00247, global_step=1497.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  60%|█████▉    | 3575/5971 [32:49<21:59,  1.82it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0129, train/loss_vlb_step=5.64e-5, train/loss_step=0.0129, global_step=1497.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  60%|█████▉    | 3576/5971 [32:51<21:59,  1.81it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0129, train/loss_vlb_step=5.64e-5, train/loss_step=0.0129, global_step=1497.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  60%|█████▉    | 3576/5971 [32:51<21:59,  1.81it/s, loss=0.0984, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.7e-5, train/loss_step=0.0149, global_step=1497.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  60%|█████▉    | 3577/5971 [32:52<21:59,  1.81it/s, loss=0.0985, v_num=0, train/loss_simple_step=0.0352, train/loss_vlb_step=0.00013, train/loss_step=0.0352, global_step=1498.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  60%|█████▉    | 3578/5971 [32:52<21:59,  1.81it/s, loss=0.0914, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000478, train/loss_step=0.145, global_step=1498.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  60%|█████▉    | 3579/5971 [32:53<21:58,  1.81it/s, loss=0.0934, v_num=0, train/loss_simple_step=0.0665, train/loss_vlb_step=0.000227, train/loss_step=0.0665, global_step=1498.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  60%|█████▉    | 3580/5971 [32:55<21:59,  1.81it/s, loss=0.0934, v_num=0, train/loss_simple_step=0.0665, train/loss_vlb_step=0.000227, train/loss_step=0.0665, global_step=1498.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  60%|█████▉    | 3580/5971 [32:55<21:59,  1.81it/s, loss=0.0938, v_num=0, train/loss_simple_step=0.00977, train/loss_vlb_step=4.42e-5, train/loss_step=0.00977, global_step=1498.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  60%|█████▉    | 3581/5971 [32:56<21:59,  1.81it/s, loss=0.0854, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000575, train/loss_step=0.166, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  60%|█████▉    | 3581/5971 [33:06<22:05,  1.80it/s, loss=0.0854, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000575, train/loss_step=0.166, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  60%|█████▉    | 3582/5971 [33:41<22:27,  1.77it/s, loss=0.0854, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000575, train/loss_step=0.166, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  60%|█████▉    | 3582/5971 [33:41<22:27,  1.77it/s, loss=0.0948, v_num=0, train/loss_simple_step=0.537, train/loss_vlb_step=0.00501, train/loss_step=0.537, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  60%|██████    | 3583/5971 [33:41<22:27,  1.77it/s, loss=0.0948, v_num=0, train/loss_simple_step=0.537, train/loss_vlb_step=0.00501, train/loss_step=0.537, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  60%|██████    | 3583/5971 [33:41<22:27,  1.77it/s, loss=0.111, v_num=0, train/loss_simple_step=0.324, train/loss_vlb_step=0.00148, train/loss_step=0.324, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  60%|██████    | 3584/5971 [33:44<22:27,  1.77it/s, loss=0.111, v_num=0, train/loss_simple_step=0.324, train/loss_vlb_step=0.00148, train/loss_step=0.324, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  60%|██████    | 3584/5971 [33:44<22:27,  1.77it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:11,  2.31it/s]
Epoch 2:  60%|██████    | 3586/5971 [33:44<22:26,  1.77it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   1%|          | 2/167 [00:00<00:43,  3.79it/s]
Epoch 2:  60%|██████    | 3588/5971 [33:44<22:24,  1.77it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   3%|▎         | 5/167 [00:00<00:16,  9.85it/s]
Epoch 2:  60%|██████    | 3591/5971 [33:45<22:21,  1.77it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.78it/s]
Epoch 2:  60%|██████    | 3594/5971 [33:45<22:19,  1.78it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.62it/s]
Epoch 2:  60%|██████    | 3597/5971 [33:45<22:16,  1.78it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.51it/s]
Epoch 2:  60%|██████    | 3601/5971 [33:45<22:12,  1.78it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  10%|█         | 17/167 [00:01<00:06, 21.82it/s]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.94it/s]
Epoch 2:  60%|██████    | 3605/5971 [33:45<22:09,  1.78it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  14%|█▍        | 23/167 [00:01<00:05, 24.64it/s]
Epoch 2:  60%|██████    | 3609/5971 [33:45<22:05,  1.78it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 26.56it/s]
Epoch 2:  61%|██████    | 3613/5971 [33:45<22:01,  1.78it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 25.83it/s]
Epoch 2:  61%|██████    | 3617/5971 [33:46<21:58,  1.79it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 26.45it/s]

Validating:  22%|██▏       | 36/167 [00:01<00:04, 26.76it/s]
Epoch 2:  61%|██████    | 3621/5971 [33:46<21:54,  1.79it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  23%|██▎       | 39/167 [00:01<00:04, 26.00it/s]
Epoch 2:  61%|██████    | 3625/5971 [33:46<21:51,  1.79it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 25.55it/s]
Epoch 2:  61%|██████    | 3629/5971 [33:46<21:47,  1.79it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 26.25it/s]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 25.63it/s]
Epoch 2:  61%|██████    | 3633/5971 [33:46<21:43,  1.79it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  31%|███       | 51/167 [00:02<00:04, 26.22it/s]
Epoch 2:  61%|██████    | 3637/5971 [33:46<21:40,  1.79it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 26.19it/s]
Epoch 2:  61%|██████    | 3641/5971 [33:46<21:36,  1.80it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  35%|███▍      | 58/167 [00:02<00:03, 27.40it/s]
Epoch 2:  61%|██████    | 3645/5971 [33:47<21:33,  1.80it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  37%|███▋      | 61/167 [00:02<00:03, 27.39it/s]

Validating:  38%|███▊      | 64/167 [00:02<00:03, 27.82it/s]
Epoch 2:  61%|██████    | 3649/5971 [33:47<21:29,  1.80it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  40%|████      | 67/167 [00:03<00:03, 28.31it/s][A
Epoch 2:  61%|██████    | 3653/5971 [33:47<21:26,  1.80it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 27.89it/s][A
Epoch 2:  61%|██████    | 3657/5971 [33:47<21:22,  1.80it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 26.77it/s][A

Validating:  46%|████▌     | 76/167 [00:03<00:03, 27.03it/s][A
Epoch 2:  61%|██████▏   | 3661/5971 [33:47<21:19,  1.81it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 26.40it/s][A
Epoch 2:  61%|██████▏   | 3665/5971 [33:47<21:15,  1.81it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  49%|████▉     | 82/167 [00:03<00:03, 27.13it/s][A
Epoch 2:  61%|██████▏   | 3669/5971 [33:47<21:12,  1.81it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  51%|█████     | 85/167 [00:03<00:03, 27.10it/s][A

Validating:  53%|█████▎    | 88/167 [00:03<00:02, 27.18it/s][A
Epoch 2:  62%|██████▏   | 3673/5971 [33:48<21:08,  1.81it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  55%|█████▌    | 92/167 [00:03<00:02, 27.83it/s][A
Epoch 2:  62%|██████▏   | 3677/5971 [33:48<21:05,  1.81it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 26.54it/s][A
Epoch 2:  62%|██████▏   | 3681/5971 [33:48<21:01,  1.82it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 26.33it/s][A
Epoch 2:  62%|██████▏   | 3685/5971 [33:48<20:58,  1.82it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  60%|██████    | 101/167 [00:04<00:02, 24.71it/s][A

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 24.48it/s][A
Epoch 2:  62%|██████▏   | 3689/5971 [33:48<20:54,  1.82it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 25.17it/s][A
Epoch 2:  62%|██████▏   | 3693/5971 [33:48<20:51,  1.82it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 25.53it/s][A
Epoch 2:  62%|██████▏   | 3697/5971 [33:49<20:47,  1.82it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  68%|██████▊   | 113/167 [00:04<00:02, 24.65it/s][A

Validating:  69%|██████▉   | 116/167 [00:04<00:02, 25.33it/s][A
Epoch 2:  62%|██████▏   | 3701/5971 [33:49<20:44,  1.82it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 25.27it/s][A
Epoch 2:  62%|██████▏   | 3705/5971 [33:49<20:40,  1.83it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 26.45it/s][A
Epoch 2:  62%|██████▏   | 3709/5971 [33:49<20:37,  1.83it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 27.17it/s][A
Epoch 2:  62%|██████▏   | 3713/5971 [33:49<20:33,  1.83it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 28.00it/s][A
Epoch 2:  62%|██████▏   | 3717/5971 [33:49<20:30,  1.83it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 27.04it/s][A

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 25.31it/s][A
Epoch 2:  62%|██████▏   | 3721/5971 [33:50<20:27,  1.83it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 25.90it/s][A
Epoch 2:  62%|██████▏   | 3725/5971 [33:50<20:23,  1.84it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  85%|████████▌ | 142/167 [00:05<00:00, 25.88it/s][A
Epoch 2:  62%|██████▏   | 3729/5971 [33:50<20:20,  1.84it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  87%|████████▋ | 145/167 [00:05<00:00, 26.28it/s][A

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 26.26it/s][A
Epoch 2:  63%|██████▎   | 3733/5971 [33:50<20:16,  1.84it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 27.08it/s][A
Epoch 2:  63%|██████▎   | 3737/5971 [33:50<20:13,  1.84it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 26.77it/s][A
Epoch 2:  63%|██████▎   | 3741/5971 [33:50<20:10,  1.84it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 26.83it/s][A

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 26.32it/s][A
Epoch 2:  63%|██████▎   | 3745/5971 [33:50<20:06,  1.84it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 25.92it/s][A
Epoch 2:  63%|██████▎   | 3749/5971 [33:51<20:03,  1.85it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 26.04it/s][A
Epoch 2:  63%|██████▎   | 3752/5971 [33:51<20:01,  1.85it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.33it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.37it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.18it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.79it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.28it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.66it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.94it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.14it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.29it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.37it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.43it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.47it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.53it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.45it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.49it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.51it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.51it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.53it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.58it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:06,  4.63it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:06,  4.48it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  4.75it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  4.98it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:05,  5.17it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.32it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.42it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.39it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.43it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.46it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.50it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.55it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.59it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.61it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.62it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.51it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.44it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.51it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.05it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  4.22it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:02,  4.43it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  4.59it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  4.80it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  4.97it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.13it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:00,  5.23it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.30it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.36it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.41it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.43it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  5.46it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  4.98it/s]

Epoch 2:  63%|██████▎   | 3753/5971 [34:03<20:07,  1.84it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.13e-5, train/loss_step=0.00844, global_step=1499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3753/5971 [34:03<20:07,  1.84it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00227, train/loss_vlb_step=1.35e-5, train/loss_step=0.00227, global_step=1500.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.32it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.34it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:15,  3.10it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.59it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:11,  4.04it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.41it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.68it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.88it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.05it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.19it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.30it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.29it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:06,  5.32it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.38it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.45it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.50it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.51it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.51it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.47it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.44it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.45it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.45it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.43it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.44it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.37it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  4.81it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:06,  3.68it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:06<00:05,  4.01it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:06<00:04,  4.26it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:04,  4.53it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:04,  4.66it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  4.84it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:07<00:03,  4.98it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:07<00:03,  5.11it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.18it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.30it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.40it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.45it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:08<00:02,  5.37it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.45it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.50it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.52it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.44it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:09<00:01,  5.36it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:00,  5.32it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.25it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.15it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.17it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:10<00:00,  5.25it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  5.25it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  4.88it/s]

Epoch 2:  63%|██████▎   | 3754/5971 [34:16<20:14,  1.83it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00227, train/loss_vlb_step=1.35e-5, train/loss_step=0.00227, global_step=1500.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3754/5971 [34:16<20:14,  1.83it/s, loss=0.127, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.00238, train/loss_step=0.391, global_step=1500.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.35it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.36it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.19it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.83it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.31it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.67it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.92it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.11it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.22it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.27it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.31it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.34it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.39it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.43it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.43it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.43it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.45it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.46it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.45it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.44it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.46it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.46it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.46it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.37it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.34it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.35it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.36it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.35it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.38it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.30it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.29it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.30it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.33it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.38it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.42it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.49it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.54it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.58it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.52it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.40it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.30it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.29it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.31it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.22it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.33it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.42it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.49it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.48it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.45it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.46it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.11it/s]

Epoch 2:  63%|██████▎   | 3755/5971 [34:28<20:20,  1.82it/s, loss=0.127, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.00238, train/loss_step=0.391, global_step=1500.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3755/5971 [34:28<20:20,  1.82it/s, loss=0.128, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000432, train/loss_step=0.124, global_step=1500.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:26,  1.86it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:15,  3.07it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:00<00:12,  3.86it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:10,  4.40it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:09,  4.77it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:08,  5.02it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.15it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:07,  5.27it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:01<00:07,  5.32it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.34it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.26it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.26it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:07,  5.27it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:02<00:06,  5.28it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.28it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.36it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.39it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.40it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.25it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.20it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.15it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.14it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.18it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.21it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.18it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.18it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.23it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.31it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.41it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.45it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.49it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.54it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.55it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.57it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.57it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.56it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.59it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.61it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.49it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.43it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.31it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.31it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.32it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.32it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.34it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.39it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.39it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.39it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.38it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.38it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.19it/s]

Epoch 2:  63%|██████▎   | 3756/5971 [34:42<20:27,  1.80it/s, loss=0.128, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000432, train/loss_step=0.124, global_step=1500.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3756/5971 [34:42<20:27,  1.80it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00143, train/loss_vlb_step=8.46e-6, train/loss_step=0.00143, global_step=1500.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3757/5971 [34:43<20:27,  1.80it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00143, train/loss_vlb_step=8.46e-6, train/loss_step=0.00143, global_step=1500.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3757/5971 [34:43<20:27,  1.80it/s, loss=0.152, v_num=0, train/loss_simple_step=0.505, train/loss_vlb_step=0.0039, train/loss_step=0.505, global_step=1501.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2:  63%|██████▎   | 3758/5971 [34:44<20:26,  1.80it/s, loss=0.152, v_num=0, train/loss_simple_step=0.505, train/loss_vlb_step=0.0039, train/loss_step=0.505, global_step=1501.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3758/5971 [34:44<20:26,  1.80it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00893, train/loss_vlb_step=4.04e-5, train/loss_step=0.00893, global_step=1501.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3759/5971 [34:44<20:26,  1.80it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00893, train/loss_vlb_step=4.04e-5, train/loss_step=0.00893, global_step=1501.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3759/5971 [34:44<20:26,  1.80it/s, loss=0.141, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00262, train/loss_step=0.452, global_step=1501.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  63%|██████▎   | 3760/5971 [34:47<20:26,  1.80it/s, loss=0.141, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00262, train/loss_step=0.452, global_step=1501.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3760/5971 [34:47<20:26,  1.80it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0653, train/loss_vlb_step=0.000217, train/loss_step=0.0653, global_step=1501.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3761/5971 [34:48<20:26,  1.80it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0653, train/loss_vlb_step=0.000217, train/loss_step=0.0653, global_step=1501.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3761/5971 [34:48<20:26,  1.80it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00838, train/loss_vlb_step=3.97e-5, train/loss_step=0.00838, global_step=1502.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3762/5971 [34:48<20:26,  1.80it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00838, train/loss_vlb_step=3.97e-5, train/loss_step=0.00838, global_step=1502.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3762/5971 [34:48<20:26,  1.80it/s, loss=0.15, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000415, train/loss_step=0.126, global_step=1502.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  63%|██████▎   | 3763/5971 [34:49<20:25,  1.80it/s, loss=0.15, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000415, train/loss_step=0.126, global_step=1502.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3763/5971 [34:49<20:25,  1.80it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0714, train/loss_vlb_step=0.000241, train/loss_step=0.0714, global_step=1502.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3764/5971 [34:52<20:26,  1.80it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0714, train/loss_vlb_step=0.000241, train/loss_step=0.0714, global_step=1502.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3764/5971 [34:52<20:26,  1.80it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0399, train/loss_vlb_step=0.000143, train/loss_step=0.0399, global_step=1502.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3765/5971 [34:53<20:26,  1.80it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0399, train/loss_vlb_step=0.000143, train/loss_step=0.0399, global_step=1502.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3765/5971 [34:53<20:26,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0631, train/loss_vlb_step=0.00022, train/loss_step=0.0631, global_step=1503.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  63%|██████▎   | 3766/5971 [34:54<20:25,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0631, train/loss_vlb_step=0.00022, train/loss_step=0.0631, global_step=1503.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3766/5971 [34:54<20:25,  1.80it/s, loss=0.157, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000581, train/loss_step=0.168, global_step=1503.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  63%|██████▎   | 3767/5971 [34:54<20:25,  1.80it/s, loss=0.157, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000581, train/loss_step=0.168, global_step=1503.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3767/5971 [34:54<20:25,  1.80it/s, loss=0.166, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00097, train/loss_step=0.253, global_step=1503.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  63%|██████▎   | 3768/5971 [34:57<20:25,  1.80it/s, loss=0.166, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00097, train/loss_step=0.253, global_step=1503.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3768/5971 [34:57<20:25,  1.80it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=1503.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3769/5971 [34:57<20:25,  1.80it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=1503.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3769/5971 [34:57<20:25,  1.80it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00784, train/loss_vlb_step=3.86e-5, train/loss_step=0.00784, global_step=1504.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3770/5971 [34:58<20:25,  1.80it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00784, train/loss_vlb_step=3.86e-5, train/loss_step=0.00784, global_step=1504.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3770/5971 [34:58<20:25,  1.80it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=6.17e-5, train/loss_step=0.0143, global_step=1504.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  63%|██████▎   | 3771/5971 [34:59<20:24,  1.80it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=6.17e-5, train/loss_step=0.0143, global_step=1504.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3771/5971 [34:59<20:24,  1.80it/s, loss=0.141, v_num=0, train/loss_simple_step=0.475, train/loss_vlb_step=0.00328, train/loss_step=0.475, global_step=1504.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  63%|██████▎   | 3772/5971 [35:01<20:25,  1.80it/s, loss=0.141, v_num=0, train/loss_simple_step=0.475, train/loss_vlb_step=0.00328, train/loss_step=0.475, global_step=1504.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3772/5971 [35:01<20:25,  1.80it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0566, train/loss_vlb_step=0.000195, train/loss_step=0.0566, global_step=1504.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3773/5971 [35:02<20:24,  1.79it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0566, train/loss_vlb_step=0.000195, train/loss_step=0.0566, global_step=1504.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3773/5971 [35:02<20:24,  1.79it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0092, train/loss_vlb_step=4.37e-5, train/loss_step=0.0092, global_step=1505.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  63%|██████▎   | 3774/5971 [35:03<20:24,  1.79it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0092, train/loss_vlb_step=4.37e-5, train/loss_step=0.0092, global_step=1505.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3774/5971 [35:03<20:24,  1.79it/s, loss=0.131, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.000491, train/loss_step=0.146, global_step=1505.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  63%|██████▎   | 3775/5971 [35:04<20:23,  1.79it/s, loss=0.131, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.000491, train/loss_step=0.146, global_step=1505.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3775/5971 [35:04<20:23,  1.79it/s, loss=0.148, v_num=0, train/loss_simple_step=0.458, train/loss_vlb_step=0.00371, train/loss_step=0.458, global_step=1505.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  63%|██████▎   | 3776/5971 [35:07<20:24,  1.79it/s, loss=0.148, v_num=0, train/loss_simple_step=0.458, train/loss_vlb_step=0.00371, train/loss_step=0.458, global_step=1505.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3776/5971 [35:07<20:24,  1.79it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00434, train/loss_vlb_step=2.18e-5, train/loss_step=0.00434, global_step=1505.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3777/5971 [35:08<20:24,  1.79it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00434, train/loss_vlb_step=2.18e-5, train/loss_step=0.00434, global_step=1505.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3777/5971 [35:08<20:24,  1.79it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0417, train/loss_vlb_step=0.000154, train/loss_step=0.0417, global_step=1506.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  63%|██████▎   | 3778/5971 [35:09<20:24,  1.79it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0417, train/loss_vlb_step=0.000154, train/loss_step=0.0417, global_step=1506.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3778/5971 [35:09<20:24,  1.79it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0361, train/loss_vlb_step=0.000136, train/loss_step=0.0361, global_step=1506.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3779/5971 [35:10<20:23,  1.79it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0361, train/loss_vlb_step=0.000136, train/loss_step=0.0361, global_step=1506.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3779/5971 [35:10<20:23,  1.79it/s, loss=0.11, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000415, train/loss_step=0.126, global_step=1506.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  63%|██████▎   | 3780/5971 [35:12<20:24,  1.79it/s, loss=0.11, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000415, train/loss_step=0.126, global_step=1506.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3780/5971 [35:12<20:24,  1.79it/s, loss=0.15, v_num=0, train/loss_simple_step=0.854, train/loss_vlb_step=0.108, train/loss_step=0.854, global_step=1506.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  63%|██████▎   | 3781/5971 [35:13<20:23,  1.79it/s, loss=0.15, v_num=0, train/loss_simple_step=0.854, train/loss_vlb_step=0.108, train/loss_step=0.854, global_step=1506.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3781/5971 [35:13<20:23,  1.79it/s, loss=0.171, v_num=0, train/loss_simple_step=0.439, train/loss_vlb_step=0.00279, train/loss_step=0.439, global_step=1507.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3782/5971 [35:14<20:23,  1.79it/s, loss=0.171, v_num=0, train/loss_simple_step=0.439, train/loss_vlb_step=0.00279, train/loss_step=0.439, global_step=1507.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3782/5971 [35:14<20:23,  1.79it/s, loss=0.188, v_num=0, train/loss_simple_step=0.474, train/loss_vlb_step=0.00442, train/loss_step=0.474, global_step=1507.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3783/5971 [35:15<20:22,  1.79it/s, loss=0.188, v_num=0, train/loss_simple_step=0.474, train/loss_vlb_step=0.00442, train/loss_step=0.474, global_step=1507.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3783/5971 [35:15<20:22,  1.79it/s, loss=0.191, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000426, train/loss_step=0.128, global_step=1507.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3784/5971 [35:17<20:23,  1.79it/s, loss=0.191, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000426, train/loss_step=0.128, global_step=1507.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3784/5971 [35:17<20:23,  1.79it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0335, train/loss_vlb_step=0.000127, train/loss_step=0.0335, global_step=1507.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3785/5971 [35:18<20:23,  1.79it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0335, train/loss_vlb_step=0.000127, train/loss_step=0.0335, global_step=1507.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3785/5971 [35:18<20:23,  1.79it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00264, train/loss_vlb_step=1.47e-5, train/loss_step=0.00264, global_step=1508.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3786/5971 [35:19<20:22,  1.79it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00264, train/loss_vlb_step=1.47e-5, train/loss_step=0.00264, global_step=1508.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3786/5971 [35:19<20:22,  1.79it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0564, train/loss_vlb_step=0.000195, train/loss_step=0.0564, global_step=1508.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  63%|██████▎   | 3787/5971 [35:20<20:22,  1.79it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0564, train/loss_vlb_step=0.000195, train/loss_step=0.0564, global_step=1508.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3787/5971 [35:20<20:22,  1.79it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0937, train/loss_vlb_step=0.000324, train/loss_step=0.0937, global_step=1508.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3788/5971 [35:22<20:22,  1.79it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0937, train/loss_vlb_step=0.000324, train/loss_step=0.0937, global_step=1508.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3788/5971 [35:22<20:22,  1.79it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0847, train/loss_vlb_step=0.000279, train/loss_step=0.0847, global_step=1508.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3789/5971 [35:23<20:22,  1.79it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0847, train/loss_vlb_step=0.000279, train/loss_step=0.0847, global_step=1508.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3789/5971 [35:23<20:22,  1.79it/s, loss=0.197, v_num=0, train/loss_simple_step=0.399, train/loss_vlb_step=0.00302, train/loss_step=0.399, global_step=1509.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  63%|██████▎   | 3790/5971 [35:23<20:21,  1.78it/s, loss=0.197, v_num=0, train/loss_simple_step=0.399, train/loss_vlb_step=0.00302, train/loss_step=0.399, global_step=1509.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3790/5971 [35:23<20:21,  1.78it/s, loss=0.196, v_num=0, train/loss_simple_step=0.00357, train/loss_vlb_step=1.91e-5, train/loss_step=0.00357, global_step=1509.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3791/5971 [35:24<20:21,  1.78it/s, loss=0.196, v_num=0, train/loss_simple_step=0.00357, train/loss_vlb_step=1.91e-5, train/loss_step=0.00357, global_step=1509.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  63%|██████▎   | 3791/5971 [35:24<20:21,  1.78it/s, loss=0.183, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000931, train/loss_step=0.213, global_step=1509.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  64%|██████▎   | 3792/5971 [35:26<20:21,  1.78it/s, loss=0.183, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000931, train/loss_step=0.213, global_step=1509.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▎   | 3792/5971 [35:26<20:21,  1.78it/s, loss=0.193, v_num=0, train/loss_simple_step=0.259, train/loss_vlb_step=0.00111, train/loss_step=0.259, global_step=1509.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  64%|██████▎   | 3793/5971 [35:27<20:21,  1.78it/s, loss=0.193, v_num=0, train/loss_simple_step=0.259, train/loss_vlb_step=0.00111, train/loss_step=0.259, global_step=1509.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▎   | 3793/5971 [35:27<20:21,  1.78it/s, loss=0.193, v_num=0, train/loss_simple_step=0.00912, train/loss_vlb_step=4.17e-5, train/loss_step=0.00912, global_step=1510.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▎   | 3794/5971 [35:28<20:21,  1.78it/s, loss=0.193, v_num=0, train/loss_simple_step=0.00912, train/loss_vlb_step=4.17e-5, train/loss_step=0.00912, global_step=1510.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▎   | 3794/5971 [35:28<20:21,  1.78it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0706, train/loss_vlb_step=0.000234, train/loss_step=0.0706, global_step=1510.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  64%|██████▎   | 3795/5971 [35:29<20:20,  1.78it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0706, train/loss_vlb_step=0.000234, train/loss_step=0.0706, global_step=1510.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▎   | 3795/5971 [35:29<20:20,  1.78it/s, loss=0.172, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.00035, train/loss_step=0.104, global_step=1510.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  64%|██████▎   | 3796/5971 [35:32<20:21,  1.78it/s, loss=0.172, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.00035, train/loss_step=0.104, global_step=1510.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▎   | 3796/5971 [35:32<20:21,  1.78it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.44e-5, train/loss_step=0.00731, global_step=1510.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▎   | 3797/5971 [35:32<20:20,  1.78it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.44e-5, train/loss_step=0.00731, global_step=1510.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▎   | 3797/5971 [35:32<20:20,  1.78it/s, loss=0.189, v_num=0, train/loss_simple_step=0.381, train/loss_vlb_step=0.0022, train/loss_step=0.381, global_step=1511.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2:  64%|██████▎   | 3798/5971 [35:33<20:20,  1.78it/s, loss=0.189, v_num=0, train/loss_simple_step=0.381, train/loss_vlb_step=0.0022, train/loss_step=0.381, global_step=1511.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▎   | 3798/5971 [35:33<20:20,  1.78it/s, loss=0.196, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000642, train/loss_step=0.175, global_step=1511.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▎   | 3799/5971 [35:34<20:20,  1.78it/s, loss=0.196, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000642, train/loss_step=0.175, global_step=1511.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▎   | 3799/5971 [35:34<20:20,  1.78it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0382, train/loss_vlb_step=0.000144, train/loss_step=0.0382, global_step=1511.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▎   | 3800/5971 [35:36<20:20,  1.78it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0382, train/loss_vlb_step=0.000144, train/loss_step=0.0382, global_step=1511.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▎   | 3800/5971 [35:36<20:20,  1.78it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0232, train/loss_vlb_step=9.51e-5, train/loss_step=0.0232, global_step=1511.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  64%|██████▎   | 3801/5971 [35:37<20:20,  1.78it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0232, train/loss_vlb_step=9.51e-5, train/loss_step=0.0232, global_step=1511.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▎   | 3801/5971 [35:37<20:20,  1.78it/s, loss=0.128, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=5.06e-5, train/loss_step=0.011, global_step=1512.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  64%|██████▎   | 3802/5971 [35:38<20:19,  1.78it/s, loss=0.128, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=5.06e-5, train/loss_step=0.011, global_step=1512.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▎   | 3802/5971 [35:38<20:19,  1.78it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0391, train/loss_vlb_step=0.000133, train/loss_step=0.0391, global_step=1512.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▎   | 3803/5971 [35:39<20:19,  1.78it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0391, train/loss_vlb_step=0.000133, train/loss_step=0.0391, global_step=1512.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▎   | 3803/5971 [35:39<20:19,  1.78it/s, loss=0.127, v_num=0, train/loss_simple_step=0.542, train/loss_vlb_step=0.00487, train/loss_step=0.542, global_step=1512.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  64%|██████▎   | 3804/5971 [35:42<20:20,  1.78it/s, loss=0.127, v_num=0, train/loss_simple_step=0.542, train/loss_vlb_step=0.00487, train/loss_step=0.542, global_step=1512.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▎   | 3804/5971 [35:42<20:20,  1.78it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0196, train/loss_vlb_step=8.02e-5, train/loss_step=0.0196, global_step=1512.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▎   | 3805/5971 [35:43<20:19,  1.78it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0196, train/loss_vlb_step=8.02e-5, train/loss_step=0.0196, global_step=1512.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▎   | 3805/5971 [35:43<20:19,  1.78it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0231, train/loss_vlb_step=9.13e-5, train/loss_step=0.0231, global_step=1513.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▎   | 3806/5971 [35:44<20:19,  1.78it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0231, train/loss_vlb_step=9.13e-5, train/loss_step=0.0231, global_step=1513.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▎   | 3806/5971 [35:44<20:19,  1.78it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0343, train/loss_vlb_step=0.000128, train/loss_step=0.0343, global_step=1513.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3807/5971 [35:44<20:18,  1.78it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0343, train/loss_vlb_step=0.000128, train/loss_step=0.0343, global_step=1513.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3807/5971 [35:44<20:18,  1.78it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000108, train/loss_step=0.0285, global_step=1513.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3808/5971 [35:47<20:19,  1.77it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000108, train/loss_step=0.0285, global_step=1513.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3808/5971 [35:47<20:19,  1.77it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0974, train/loss_vlb_step=0.00032, train/loss_step=0.0974, global_step=1513.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  64%|██████▍   | 3809/5971 [35:48<20:19,  1.77it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0974, train/loss_vlb_step=0.00032, train/loss_step=0.0974, global_step=1513.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3809/5971 [35:48<20:19,  1.77it/s, loss=0.136, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.013, train/loss_step=0.643, global_step=1514.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  64%|██████▍   | 3810/5971 [35:49<20:18,  1.77it/s, loss=0.136, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.013, train/loss_step=0.643, global_step=1514.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3810/5971 [35:49<20:18,  1.77it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0586, train/loss_vlb_step=0.000199, train/loss_step=0.0586, global_step=1514.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3811/5971 [35:50<20:18,  1.77it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0586, train/loss_vlb_step=0.000199, train/loss_step=0.0586, global_step=1514.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3811/5971 [35:50<20:18,  1.77it/s, loss=0.148, v_num=0, train/loss_simple_step=0.389, train/loss_vlb_step=0.00223, train/loss_step=0.389, global_step=1514.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  64%|██████▍   | 3812/5971 [35:52<20:18,  1.77it/s, loss=0.148, v_num=0, train/loss_simple_step=0.389, train/loss_vlb_step=0.00223, train/loss_step=0.389, global_step=1514.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3812/5971 [35:52<20:18,  1.77it/s, loss=0.136, v_num=0, train/loss_simple_step=0.033, train/loss_vlb_step=0.00012, train/loss_step=0.033, global_step=1514.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3813/5971 [35:53<20:18,  1.77it/s, loss=0.136, v_num=0, train/loss_simple_step=0.033, train/loss_vlb_step=0.00012, train/loss_step=0.033, global_step=1514.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3813/5971 [35:53<20:18,  1.77it/s, loss=0.142, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000396, train/loss_step=0.120, global_step=1515.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3814/5971 [35:54<20:17,  1.77it/s, loss=0.142, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000396, train/loss_step=0.120, global_step=1515.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3814/5971 [35:54<20:17,  1.77it/s, loss=0.162, v_num=0, train/loss_simple_step=0.479, train/loss_vlb_step=0.00246, train/loss_step=0.479, global_step=1515.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  64%|██████▍   | 3815/5971 [35:54<20:17,  1.77it/s, loss=0.162, v_num=0, train/loss_simple_step=0.479, train/loss_vlb_step=0.00246, train/loss_step=0.479, global_step=1515.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3815/5971 [35:54<20:17,  1.77it/s, loss=0.17, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.000942, train/loss_step=0.253, global_step=1515.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3816/5971 [35:57<20:17,  1.77it/s, loss=0.17, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.000942, train/loss_step=0.253, global_step=1515.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3816/5971 [35:57<20:17,  1.77it/s, loss=0.175, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000336, train/loss_step=0.102, global_step=1515.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3817/5971 [35:58<20:17,  1.77it/s, loss=0.175, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000336, train/loss_step=0.102, global_step=1515.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3817/5971 [35:58<20:17,  1.77it/s, loss=0.182, v_num=0, train/loss_simple_step=0.523, train/loss_vlb_step=0.00552, train/loss_step=0.523, global_step=1516.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  64%|██████▍   | 3818/5971 [35:59<20:17,  1.77it/s, loss=0.182, v_num=0, train/loss_simple_step=0.523, train/loss_vlb_step=0.00552, train/loss_step=0.523, global_step=1516.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3818/5971 [35:59<20:17,  1.77it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00259, train/loss_vlb_step=1.43e-5, train/loss_step=0.00259, global_step=1516.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3819/5971 [35:59<20:16,  1.77it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00259, train/loss_vlb_step=1.43e-5, train/loss_step=0.00259, global_step=1516.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3819/5971 [35:59<20:16,  1.77it/s, loss=0.178, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=1516.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  64%|██████▍   | 3820/5971 [36:02<20:17,  1.77it/s, loss=0.178, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=1516.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3820/5971 [36:02<20:17,  1.77it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00234, train/loss_vlb_step=1.3e-5, train/loss_step=0.00234, global_step=1516.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3821/5971 [36:03<20:16,  1.77it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00234, train/loss_vlb_step=1.3e-5, train/loss_step=0.00234, global_step=1516.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3821/5971 [36:03<20:16,  1.77it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00608, train/loss_vlb_step=2.96e-5, train/loss_step=0.00608, global_step=1517.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3822/5971 [36:03<20:16,  1.77it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00608, train/loss_vlb_step=2.96e-5, train/loss_step=0.00608, global_step=1517.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3822/5971 [36:03<20:16,  1.77it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=7.34e-5, train/loss_step=0.0166, global_step=1517.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  64%|██████▍   | 3823/5971 [36:04<20:16,  1.77it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=7.34e-5, train/loss_step=0.0166, global_step=1517.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3823/5971 [36:04<20:16,  1.77it/s, loss=0.158, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.00063, train/loss_step=0.181, global_step=1517.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  64%|██████▍   | 3824/5971 [36:07<20:16,  1.76it/s, loss=0.158, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.00063, train/loss_step=0.181, global_step=1517.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3824/5971 [36:07<20:16,  1.76it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0244, train/loss_vlb_step=9.41e-5, train/loss_step=0.0244, global_step=1517.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3825/5971 [36:08<20:16,  1.76it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0244, train/loss_vlb_step=9.41e-5, train/loss_step=0.0244, global_step=1517.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3825/5971 [36:08<20:16,  1.76it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00365, train/loss_vlb_step=2.02e-5, train/loss_step=0.00365, global_step=1518.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3826/5971 [36:08<20:15,  1.76it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00365, train/loss_vlb_step=2.02e-5, train/loss_step=0.00365, global_step=1518.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3826/5971 [36:08<20:15,  1.76it/s, loss=0.156, v_num=0, train/loss_simple_step=0.019, train/loss_vlb_step=8.02e-5, train/loss_step=0.019, global_step=1518.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  64%|██████▍   | 3827/5971 [36:09<20:15,  1.76it/s, loss=0.156, v_num=0, train/loss_simple_step=0.019, train/loss_vlb_step=8.02e-5, train/loss_step=0.019, global_step=1518.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3827/5971 [36:09<20:15,  1.76it/s, loss=0.167, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00122, train/loss_step=0.253, global_step=1518.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3828/5971 [36:12<20:15,  1.76it/s, loss=0.167, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00122, train/loss_step=0.253, global_step=1518.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3828/5971 [36:12<20:15,  1.76it/s, loss=0.188, v_num=0, train/loss_simple_step=0.506, train/loss_vlb_step=0.00415, train/loss_step=0.506, global_step=1518.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3829/5971 [36:12<20:15,  1.76it/s, loss=0.188, v_num=0, train/loss_simple_step=0.506, train/loss_vlb_step=0.00415, train/loss_step=0.506, global_step=1518.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3829/5971 [36:12<20:15,  1.76it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00326, train/loss_vlb_step=1.88e-5, train/loss_step=0.00326, global_step=1519.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3830/5971 [36:13<20:14,  1.76it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00326, train/loss_vlb_step=1.88e-5, train/loss_step=0.00326, global_step=1519.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3830/5971 [36:13<20:14,  1.76it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00618, train/loss_vlb_step=2.93e-5, train/loss_step=0.00618, global_step=1519.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3831/5971 [36:14<20:14,  1.76it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00618, train/loss_vlb_step=2.93e-5, train/loss_step=0.00618, global_step=1519.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3831/5971 [36:14<20:14,  1.76it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00886, train/loss_vlb_step=4.24e-5, train/loss_step=0.00886, global_step=1519.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3832/5971 [36:17<20:15,  1.76it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00886, train/loss_vlb_step=4.24e-5, train/loss_step=0.00886, global_step=1519.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3832/5971 [36:17<20:15,  1.76it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0491, train/loss_vlb_step=0.00017, train/loss_step=0.0491, global_step=1519.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  64%|██████▍   | 3833/5971 [36:18<20:14,  1.76it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0491, train/loss_vlb_step=0.00017, train/loss_step=0.0491, global_step=1519.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3833/5971 [36:18<20:14,  1.76it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0155, train/loss_vlb_step=6.47e-5, train/loss_step=0.0155, global_step=1520.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  64%|██████▍   | 3834/5971 [36:19<20:14,  1.76it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0155, train/loss_vlb_step=6.47e-5, train/loss_step=0.0155, global_step=1520.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3834/5971 [36:19<20:14,  1.76it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0226, train/loss_vlb_step=8.93e-5, train/loss_step=0.0226, global_step=1520.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3835/5971 [36:20<20:13,  1.76it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0226, train/loss_vlb_step=8.93e-5, train/loss_step=0.0226, global_step=1520.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3835/5971 [36:20<20:13,  1.76it/s, loss=0.108, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.001, train/loss_step=0.274, global_step=1520.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  64%|██████▍   | 3836/5971 [36:23<20:14,  1.76it/s, loss=0.108, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.001, train/loss_step=0.274, global_step=1520.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3836/5971 [36:23<20:14,  1.76it/s, loss=0.112, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.000581, train/loss_step=0.176, global_step=1520.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3837/5971 [36:24<20:14,  1.76it/s, loss=0.112, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.000581, train/loss_step=0.176, global_step=1520.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3837/5971 [36:24<20:14,  1.76it/s, loss=0.111, v_num=0, train/loss_simple_step=0.507, train/loss_vlb_step=0.00646, train/loss_step=0.507, global_step=1521.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  64%|██████▍   | 3838/5971 [36:25<20:14,  1.76it/s, loss=0.111, v_num=0, train/loss_simple_step=0.507, train/loss_vlb_step=0.00646, train/loss_step=0.507, global_step=1521.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3838/5971 [36:25<20:14,  1.76it/s, loss=0.119, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000558, train/loss_step=0.162, global_step=1521.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3839/5971 [36:26<20:13,  1.76it/s, loss=0.119, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000558, train/loss_step=0.162, global_step=1521.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3839/5971 [36:26<20:13,  1.76it/s, loss=0.12, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000541, train/loss_step=0.155, global_step=1521.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  64%|██████▍   | 3840/5971 [36:28<20:14,  1.76it/s, loss=0.12, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000541, train/loss_step=0.155, global_step=1521.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3840/5971 [36:28<20:14,  1.76it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0548, train/loss_vlb_step=0.000186, train/loss_step=0.0548, global_step=1521.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3841/5971 [36:29<20:13,  1.75it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0548, train/loss_vlb_step=0.000186, train/loss_step=0.0548, global_step=1521.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3841/5971 [36:29<20:13,  1.75it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0454, train/loss_vlb_step=0.000165, train/loss_step=0.0454, global_step=1522.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3842/5971 [36:30<20:13,  1.75it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0454, train/loss_vlb_step=0.000165, train/loss_step=0.0454, global_step=1522.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3842/5971 [36:30<20:13,  1.75it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0168, train/loss_vlb_step=6.82e-5, train/loss_step=0.0168, global_step=1522.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  64%|██████▍   | 3843/5971 [36:31<20:12,  1.75it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0168, train/loss_vlb_step=6.82e-5, train/loss_step=0.0168, global_step=1522.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3843/5971 [36:31<20:12,  1.75it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=6.23e-5, train/loss_step=0.0136, global_step=1522.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3844/5971 [36:33<20:13,  1.75it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=6.23e-5, train/loss_step=0.0136, global_step=1522.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3844/5971 [36:33<20:13,  1.75it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0362, train/loss_vlb_step=0.000138, train/loss_step=0.0362, global_step=1522.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3845/5971 [36:34<20:12,  1.75it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0362, train/loss_vlb_step=0.000138, train/loss_step=0.0362, global_step=1522.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3845/5971 [36:34<20:12,  1.75it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0397, train/loss_vlb_step=0.000143, train/loss_step=0.0397, global_step=1523.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3846/5971 [36:35<20:12,  1.75it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0397, train/loss_vlb_step=0.000143, train/loss_step=0.0397, global_step=1523.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3846/5971 [36:35<20:12,  1.75it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0658, train/loss_vlb_step=0.000226, train/loss_step=0.0658, global_step=1523.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3847/5971 [36:35<20:12,  1.75it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0658, train/loss_vlb_step=0.000226, train/loss_step=0.0658, global_step=1523.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3847/5971 [36:35<20:12,  1.75it/s, loss=0.118, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000731, train/loss_step=0.192, global_step=1523.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  64%|██████▍   | 3848/5971 [36:38<20:12,  1.75it/s, loss=0.118, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000731, train/loss_step=0.192, global_step=1523.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3848/5971 [36:38<20:12,  1.75it/s, loss=0.0974, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000336, train/loss_step=0.102, global_step=1523.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3849/5971 [36:39<20:12,  1.75it/s, loss=0.0974, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000336, train/loss_step=0.102, global_step=1523.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3849/5971 [36:39<20:12,  1.75it/s, loss=0.098, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=7.05e-5, train/loss_step=0.0166, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3850/5971 [36:40<20:11,  1.75it/s, loss=0.098, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=7.05e-5, train/loss_step=0.0166, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3850/5971 [36:40<20:11,  1.75it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0575, train/loss_vlb_step=0.000195, train/loss_step=0.0575, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3851/5971 [36:40<20:11,  1.75it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0575, train/loss_vlb_step=0.000195, train/loss_step=0.0575, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  64%|██████▍   | 3851/5971 [36:40<20:11,  1.75it/s, loss=0.105, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000337, train/loss_step=0.103, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  65%|██████▍   | 3852/5971 [36:43<20:11,  1.75it/s, loss=0.105, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000337, train/loss_step=0.103, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  65%|██████▍   | 3852/5971 [36:43<20:11,  1.75it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:12,  2.28it/s][A
Epoch 2:  65%|██████▍   | 3854/5971 [36:43<20:10,  1.75it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   1%|          | 2/167 [00:00<00:50,  3.26it/s][A
Epoch 2:  65%|██████▍   | 3856/5971 [36:43<20:08,  1.75it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   3%|▎         | 5/167 [00:00<00:18,  8.84it/s][A
Epoch 2:  65%|██████▍   | 3859/5971 [36:43<20:05,  1.75it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.27it/s][A
Epoch 2:  65%|██████▍   | 3862/5971 [36:44<20:03,  1.75it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.70it/s][A
Epoch 2:  65%|██████▍   | 3865/5971 [36:44<20:00,  1.75it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   8%|▊         | 14/167 [00:01<00:08, 18.63it/s][A
Epoch 2:  65%|██████▍   | 3868/5971 [36:44<19:58,  1.76it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  10%|█         | 17/167 [00:01<00:07, 20.96it/s][A
Epoch 2:  65%|██████▍   | 3871/5971 [36:44<19:55,  1.76it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.84it/s][A
Epoch 2:  65%|██████▍   | 3874/5971 [36:44<19:52,  1.76it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  14%|█▍        | 23/167 [00:01<00:05, 24.20it/s][A
Epoch 2:  65%|██████▍   | 3877/5971 [36:44<19:50,  1.76it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.53it/s][A
Epoch 2:  65%|██████▍   | 3880/5971 [36:44<19:47,  1.76it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 24.65it/s][A
Epoch 2:  65%|██████▌   | 3883/5971 [36:44<19:45,  1.76it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 25.81it/s][A
Epoch 2:  65%|██████▌   | 3886/5971 [36:44<19:42,  1.76it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  21%|██        | 35/167 [00:01<00:05, 26.17it/s][A
Epoch 2:  65%|██████▌   | 3889/5971 [36:45<19:40,  1.76it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  23%|██▎       | 38/167 [00:02<00:04, 26.06it/s][A
Epoch 2:  65%|██████▌   | 3892/5971 [36:45<19:37,  1.77it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 25.84it/s][A
Epoch 2:  65%|██████▌   | 3895/5971 [36:45<19:35,  1.77it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 26.79it/s][A
Epoch 2:  65%|██████▌   | 3898/5971 [36:45<19:32,  1.77it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 27.66it/s][A
Epoch 2:  65%|██████▌   | 3901/5971 [36:45<19:30,  1.77it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 28.23it/s][A
Epoch 2:  65%|██████▌   | 3904/5971 [36:45<19:27,  1.77it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 26.40it/s][A
Epoch 2:  65%|██████▌   | 3907/5971 [36:45<19:24,  1.77it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 26.17it/s][A
Epoch 2:  65%|██████▌   | 3910/5971 [36:45<19:22,  1.77it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  35%|███▌      | 59/167 [00:02<00:04, 25.84it/s][A
Epoch 2:  66%|██████▌   | 3913/5971 [36:45<19:19,  1.77it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  37%|███▋      | 62/167 [00:02<00:04, 26.20it/s][A
Epoch 2:  66%|██████▌   | 3916/5971 [36:46<19:17,  1.78it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  39%|███▉      | 65/167 [00:03<00:03, 26.97it/s][A
Epoch 2:  66%|██████▌   | 3919/5971 [36:46<19:14,  1.78it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  41%|████      | 68/167 [00:03<00:03, 26.59it/s][A
Epoch 2:  66%|██████▌   | 3922/5971 [36:46<19:12,  1.78it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 26.28it/s][A
Epoch 2:  66%|██████▌   | 3925/5971 [36:46<19:09,  1.78it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 26.99it/s][A
Epoch 2:  66%|██████▌   | 3928/5971 [36:46<19:07,  1.78it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 27.09it/s][A
Epoch 2:  66%|██████▌   | 3931/5971 [36:46<19:04,  1.78it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 25.81it/s][A
Epoch 2:  66%|██████▌   | 3934/5971 [36:46<19:02,  1.78it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 23.84it/s][A
Epoch 2:  66%|██████▌   | 3937/5971 [36:46<18:59,  1.78it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  51%|█████▏    | 86/167 [00:03<00:03, 24.57it/s][A
Epoch 2:  66%|██████▌   | 3940/5971 [36:46<18:57,  1.79it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  53%|█████▎    | 89/167 [00:03<00:03, 25.22it/s][A
Epoch 2:  66%|██████▌   | 3943/5971 [36:47<18:54,  1.79it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 27.28it/s][A
Epoch 2:  66%|██████▌   | 3947/5971 [36:47<18:51,  1.79it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 27.86it/s][A
Epoch 2:  66%|██████▌   | 3951/5971 [36:47<18:48,  1.79it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 27.09it/s][A

Validating:  61%|██████    | 102/167 [00:04<00:02, 26.20it/s][A
Epoch 2:  66%|██████▌   | 3955/5971 [36:47<18:44,  1.79it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 27.17it/s][A
Epoch 2:  66%|██████▋   | 3959/5971 [36:47<18:41,  1.79it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 27.85it/s][A
Epoch 2:  66%|██████▋   | 3963/5971 [36:47<18:38,  1.80it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  67%|██████▋   | 112/167 [00:04<00:01, 28.14it/s][A
Epoch 2:  66%|██████▋   | 3967/5971 [36:47<18:35,  1.80it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  69%|██████▉   | 115/167 [00:04<00:01, 27.14it/s][A

Validating:  71%|███████   | 118/167 [00:05<00:01, 25.80it/s][A
Epoch 2:  67%|██████▋   | 3971/5971 [36:48<18:31,  1.80it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 25.01it/s][A
Epoch 2:  67%|██████▋   | 3975/5971 [36:48<18:28,  1.80it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 24.72it/s][A
Epoch 2:  67%|██████▋   | 3979/5971 [36:48<18:25,  1.80it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 25.69it/s][A

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 26.71it/s][A
Epoch 2:  67%|██████▋   | 3983/5971 [36:48<18:22,  1.80it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 27.26it/s][A
Epoch 2:  67%|██████▋   | 3987/5971 [36:48<18:18,  1.81it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 26.09it/s][A
Epoch 2:  67%|██████▋   | 3991/5971 [36:48<18:15,  1.81it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 25.95it/s][A

Validating:  85%|████████▌ | 142/167 [00:05<00:00, 25.55it/s][A
Epoch 2:  67%|██████▋   | 3995/5971 [36:49<18:12,  1.81it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 25.11it/s][A
Epoch 2:  67%|██████▋   | 3999/5971 [36:49<18:09,  1.81it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 25.60it/s][A
Epoch 2:  67%|██████▋   | 4003/5971 [36:49<18:05,  1.81it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 26.56it/s][A

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 27.26it/s][A
Epoch 2:  67%|██████▋   | 4007/5971 [36:49<18:02,  1.81it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 26.93it/s][A
Epoch 2:  67%|██████▋   | 4011/5971 [36:49<17:59,  1.82it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 26.89it/s][A
Epoch 2:  67%|██████▋   | 4015/5971 [36:49<17:56,  1.82it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 27.71it/s][A

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 26.99it/s][A
Epoch 2:  67%|██████▋   | 4019/5971 [36:49<17:53,  1.82it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  67%|██████▋   | 4020/5971 [36:50<17:52,  1.82it/s, loss=0.106, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000209, train/loss_step=0.060, global_step=1524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Epoch 2:  67%|██████▋   | 4021/5971 [36:51<17:52,  1.82it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0906, train/loss_vlb_step=0.000306, train/loss_step=0.0906, global_step=1525.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  67%|██████▋   | 4022/5971 [36:52<17:51,  1.82it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00738, train/loss_vlb_step=3.58e-5, train/loss_step=0.00738, global_step=1525.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  67%|██████▋   | 4023/5971 [36:53<17:51,  1.82it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00738, train/loss_vlb_step=3.58e-5, train/loss_step=0.00738, global_step=1525.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  67%|██████▋   | 4023/5971 [36:53<17:51,  1.82it/s, loss=0.0954, v_num=0, train/loss_simple_step=0.00548, train/loss_vlb_step=2.74e-5, train/loss_step=0.00548, global_step=1525.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  67%|██████▋   | 4024/5971 [36:55<17:51,  1.82it/s, loss=0.087, v_num=0, train/loss_simple_step=0.00867, train/loss_vlb_step=3.95e-5, train/loss_step=0.00867, global_step=1525.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  67%|██████▋   | 4025/5971 [36:56<17:51,  1.82it/s, loss=0.0624, v_num=0, train/loss_simple_step=0.0152, train/loss_vlb_step=6.43e-5, train/loss_step=0.0152, global_step=1526.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  67%|██████▋   | 4026/5971 [36:57<17:50,  1.82it/s, loss=0.0723, v_num=0, train/loss_simple_step=0.360, train/loss_vlb_step=0.00171, train/loss_step=0.360, global_step=1526.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  67%|██████▋   | 4027/5971 [36:58<17:50,  1.82it/s, loss=0.0723, v_num=0, train/loss_simple_step=0.360, train/loss_vlb_step=0.00171, train/loss_step=0.360, global_step=1526.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  67%|██████▋   | 4027/5971 [36:58<17:50,  1.82it/s, loss=0.0747, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000712, train/loss_step=0.204, global_step=1526.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  67%|██████▋   | 4028/5971 [37:01<17:51,  1.81it/s, loss=0.0753, v_num=0, train/loss_simple_step=0.0656, train/loss_vlb_step=0.000229, train/loss_step=0.0656, global_step=1526.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  67%|██████▋   | 4029/5971 [37:02<17:50,  1.81it/s, loss=0.0877, v_num=0, train/loss_simple_step=0.293, train/loss_vlb_step=0.00144, train/loss_step=0.293, global_step=1527.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  67%|██████▋   | 4030/5971 [37:02<17:50,  1.81it/s, loss=0.0913, v_num=0, train/loss_simple_step=0.0896, train/loss_vlb_step=0.000297, train/loss_step=0.0896, global_step=1527.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4031/5971 [37:03<17:50,  1.81it/s, loss=0.0913, v_num=0, train/loss_simple_step=0.0896, train/loss_vlb_step=0.000297, train/loss_step=0.0896, global_step=1527.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4031/5971 [37:03<17:50,  1.81it/s, loss=0.111, v_num=0, train/loss_simple_step=0.398, train/loss_vlb_step=0.00218, train/loss_step=0.398, global_step=1527.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  68%|██████▊   | 4032/5971 [37:06<17:50,  1.81it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00785, train/loss_vlb_step=3.67e-5, train/loss_step=0.00785, global_step=1527.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4033/5971 [37:06<17:49,  1.81it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0215, train/loss_vlb_step=8.99e-5, train/loss_step=0.0215, global_step=1528.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  68%|██████▊   | 4034/5971 [37:08<17:49,  1.81it/s, loss=0.111, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000379, train/loss_step=0.115, global_step=1528.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  68%|██████▊   | 4035/5971 [37:09<17:49,  1.81it/s, loss=0.111, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000379, train/loss_step=0.115, global_step=1528.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4035/5971 [37:09<17:49,  1.81it/s, loss=0.112, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.000802, train/loss_step=0.220, global_step=1528.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4036/5971 [37:12<17:50,  1.81it/s, loss=0.12, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.00108, train/loss_step=0.266, global_step=1528.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  68%|██████▊   | 4037/5971 [37:13<17:49,  1.81it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00689, train/loss_vlb_step=3.31e-5, train/loss_step=0.00689, global_step=1529.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4038/5971 [37:14<17:49,  1.81it/s, loss=0.127, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000769, train/loss_step=0.211, global_step=1529.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  68%|██████▊   | 4039/5971 [37:15<17:48,  1.81it/s, loss=0.127, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000769, train/loss_step=0.211, global_step=1529.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4039/5971 [37:15<17:48,  1.81it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000138, train/loss_step=0.0392, global_step=1529.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4040/5971 [37:17<17:49,  1.81it/s, loss=0.144, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00353, train/loss_step=0.448, global_step=1529.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  68%|██████▊   | 4041/5971 [37:18<17:48,  1.81it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0471, train/loss_vlb_step=0.000175, train/loss_step=0.0471, global_step=1530.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4042/5971 [37:19<17:48,  1.81it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0722, train/loss_vlb_step=0.000237, train/loss_step=0.0722, global_step=1530.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4043/5971 [37:20<17:47,  1.81it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0722, train/loss_vlb_step=0.000237, train/loss_step=0.0722, global_step=1530.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4043/5971 [37:20<17:47,  1.81it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0882, train/loss_vlb_step=0.000295, train/loss_step=0.0882, global_step=1530.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4044/5971 [37:22<17:48,  1.80it/s, loss=0.177, v_num=0, train/loss_simple_step=0.568, train/loss_vlb_step=0.00609, train/loss_step=0.568, global_step=1530.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  68%|██████▊   | 4045/5971 [37:23<17:47,  1.80it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00274, train/loss_vlb_step=1.57e-5, train/loss_step=0.00274, global_step=1531.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4046/5971 [37:24<17:47,  1.80it/s, loss=0.182, v_num=0, train/loss_simple_step=0.484, train/loss_vlb_step=0.00617, train/loss_step=0.484, global_step=1531.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  68%|██████▊   | 4047/5971 [37:25<17:47,  1.80it/s, loss=0.182, v_num=0, train/loss_simple_step=0.484, train/loss_vlb_step=0.00617, train/loss_step=0.484, global_step=1531.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4047/5971 [37:25<17:47,  1.80it/s, loss=0.194, v_num=0, train/loss_simple_step=0.440, train/loss_vlb_step=0.00343, train/loss_step=0.440, global_step=1531.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4048/5971 [37:27<17:47,  1.80it/s, loss=0.21, v_num=0, train/loss_simple_step=0.384, train/loss_vlb_step=0.00246, train/loss_step=0.384, global_step=1531.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  68%|██████▊   | 4049/5971 [37:28<17:46,  1.80it/s, loss=0.203, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.00049, train/loss_step=0.149, global_step=1532.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4050/5971 [37:29<17:46,  1.80it/s, loss=0.208, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000673, train/loss_step=0.185, global_step=1532.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4051/5971 [37:30<17:46,  1.80it/s, loss=0.208, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000673, train/loss_step=0.185, global_step=1532.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4051/5971 [37:30<17:46,  1.80it/s, loss=0.194, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000405, train/loss_step=0.123, global_step=1532.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4052/5971 [37:32<17:46,  1.80it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0724, train/loss_vlb_step=0.000248, train/loss_step=0.0724, global_step=1532.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4053/5971 [37:33<17:45,  1.80it/s, loss=0.218, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00417, train/loss_step=0.448, global_step=1533.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  68%|██████▊   | 4054/5971 [37:34<17:45,  1.80it/s, loss=0.228, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.00162, train/loss_step=0.313, global_step=1533.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4055/5971 [37:34<17:45,  1.80it/s, loss=0.228, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.00162, train/loss_step=0.313, global_step=1533.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4055/5971 [37:34<17:45,  1.80it/s, loss=0.218, v_num=0, train/loss_simple_step=0.0058, train/loss_vlb_step=2.98e-5, train/loss_step=0.0058, global_step=1533.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4056/5971 [37:37<17:45,  1.80it/s, loss=0.209, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000335, train/loss_step=0.102, global_step=1533.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  68%|██████▊   | 4057/5971 [37:37<17:45,  1.80it/s, loss=0.242, v_num=0, train/loss_simple_step=0.663, train/loss_vlb_step=0.0108, train/loss_step=0.663, global_step=1534.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  68%|██████▊   | 4058/5971 [37:38<17:44,  1.80it/s, loss=0.233, v_num=0, train/loss_simple_step=0.031, train/loss_vlb_step=0.000113, train/loss_step=0.031, global_step=1534.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4059/5971 [37:39<17:44,  1.80it/s, loss=0.233, v_num=0, train/loss_simple_step=0.031, train/loss_vlb_step=0.000113, train/loss_step=0.031, global_step=1534.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4059/5971 [37:39<17:44,  1.80it/s, loss=0.241, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000668, train/loss_step=0.201, global_step=1534.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4060/5971 [37:42<17:44,  1.80it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.89e-5, train/loss_step=0.0101, global_step=1534.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4061/5971 [37:42<17:44,  1.79it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0439, train/loss_vlb_step=0.000159, train/loss_step=0.0439, global_step=1535.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4062/5971 [37:43<17:43,  1.79it/s, loss=0.218, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000163, train/loss_step=0.0446, global_step=1535.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4063/5971 [37:44<17:43,  1.79it/s, loss=0.218, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000163, train/loss_step=0.0446, global_step=1535.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4063/5971 [37:44<17:43,  1.79it/s, loss=0.237, v_num=0, train/loss_simple_step=0.475, train/loss_vlb_step=0.0037, train/loss_step=0.475, global_step=1535.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  68%|██████▊   | 4064/5971 [37:47<17:43,  1.79it/s, loss=0.215, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000377, train/loss_step=0.114, global_step=1535.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4065/5971 [37:48<17:43,  1.79it/s, loss=0.215, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.58e-5, train/loss_step=0.0101, global_step=1536.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4066/5971 [37:49<17:42,  1.79it/s, loss=0.202, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.000789, train/loss_step=0.231, global_step=1536.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  68%|██████▊   | 4067/5971 [37:50<17:42,  1.79it/s, loss=0.202, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.000789, train/loss_step=0.231, global_step=1536.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4067/5971 [37:50<17:42,  1.79it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0486, train/loss_vlb_step=0.000165, train/loss_step=0.0486, global_step=1536.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4068/5971 [37:52<17:42,  1.79it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0974, train/loss_vlb_step=0.000328, train/loss_step=0.0974, global_step=1536.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4069/5971 [37:53<17:42,  1.79it/s, loss=0.162, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.02e-5, train/loss_step=0.016, global_step=1537.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  68%|██████▊   | 4070/5971 [37:53<17:41,  1.79it/s, loss=0.176, v_num=0, train/loss_simple_step=0.481, train/loss_vlb_step=0.00293, train/loss_step=0.481, global_step=1537.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4071/5971 [37:54<17:41,  1.79it/s, loss=0.176, v_num=0, train/loss_simple_step=0.481, train/loss_vlb_step=0.00293, train/loss_step=0.481, global_step=1537.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4071/5971 [37:54<17:41,  1.79it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0145, train/loss_vlb_step=6.04e-5, train/loss_step=0.0145, global_step=1537.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4072/5971 [37:56<17:41,  1.79it/s, loss=0.201, v_num=0, train/loss_simple_step=0.669, train/loss_vlb_step=0.0163, train/loss_step=0.669, global_step=1537.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  68%|██████▊   | 4073/5971 [37:57<17:41,  1.79it/s, loss=0.179, v_num=0, train/loss_simple_step=0.00336, train/loss_vlb_step=1.84e-5, train/loss_step=0.00336, global_step=1538.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4074/5971 [37:58<17:40,  1.79it/s, loss=0.169, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000412, train/loss_step=0.125, global_step=1538.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  68%|██████▊   | 4075/5971 [37:59<17:40,  1.79it/s, loss=0.169, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000412, train/loss_step=0.125, global_step=1538.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4075/5971 [37:59<17:40,  1.79it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00606, train/loss_vlb_step=3.06e-5, train/loss_step=0.00606, global_step=1538.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4076/5971 [38:01<17:40,  1.79it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00252, train/loss_vlb_step=1.41e-5, train/loss_step=0.00252, global_step=1538.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4077/5971 [38:02<17:40,  1.79it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0927, train/loss_vlb_step=0.000306, train/loss_step=0.0927, global_step=1539.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  68%|██████▊   | 4078/5971 [38:03<17:39,  1.79it/s, loss=0.141, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000471, train/loss_step=0.142, global_step=1539.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  68%|██████▊   | 4079/5971 [38:04<17:39,  1.79it/s, loss=0.141, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000471, train/loss_step=0.142, global_step=1539.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4079/5971 [38:04<17:39,  1.79it/s, loss=0.143, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000896, train/loss_step=0.233, global_step=1539.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4080/5971 [38:06<17:39,  1.78it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0228, train/loss_vlb_step=8.44e-5, train/loss_step=0.0228, global_step=1539.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4081/5971 [38:07<17:39,  1.78it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00738, train/loss_vlb_step=3.54e-5, train/loss_step=0.00738, global_step=1540.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4082/5971 [38:08<17:38,  1.78it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00214, train/loss_vlb_step=1.22e-5, train/loss_step=0.00214, global_step=1540.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  68%|██████▊   | 4083/5971 [38:09<17:38,  1.78it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00214, train/loss_vlb_step=1.22e-5, train/loss_step=0.00214, global_step=1540.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4083/5971 [38:09<17:38,  1.78it/s, loss=0.126, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.00075, train/loss_step=0.202, global_step=1540.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  68%|██████▊   | 4084/5971 [38:11<17:38,  1.78it/s, loss=0.138, v_num=0, train/loss_simple_step=0.354, train/loss_vlb_step=0.00182, train/loss_step=0.354, global_step=1540.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4085/5971 [38:12<17:38,  1.78it/s, loss=0.14, v_num=0, train/loss_simple_step=0.040, train/loss_vlb_step=0.000146, train/loss_step=0.040, global_step=1541.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4086/5971 [38:13<17:37,  1.78it/s, loss=0.161, v_num=0, train/loss_simple_step=0.653, train/loss_vlb_step=0.0107, train/loss_step=0.653, global_step=1541.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  68%|██████▊   | 4087/5971 [38:14<17:37,  1.78it/s, loss=0.161, v_num=0, train/loss_simple_step=0.653, train/loss_vlb_step=0.0107, train/loss_step=0.653, global_step=1541.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4087/5971 [38:14<17:37,  1.78it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00867, train/loss_vlb_step=4.02e-5, train/loss_step=0.00867, global_step=1541.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  68%|██████▊   | 4088/5971 [38:16<17:37,  1.78it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000212, train/loss_step=0.0642, global_step=1541.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  68%|██████▊   | 4089/5971 [38:17<17:37,  1.78it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0018, train/loss_vlb_step=1.02e-5, train/loss_step=0.0018, global_step=1542.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  68%|██████▊   | 4090/5971 [38:18<17:36,  1.78it/s, loss=0.148, v_num=0, train/loss_simple_step=0.317, train/loss_vlb_step=0.00132, train/loss_step=0.317, global_step=1542.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  69%|██████▊   | 4091/5971 [38:19<17:36,  1.78it/s, loss=0.148, v_num=0, train/loss_simple_step=0.317, train/loss_vlb_step=0.00132, train/loss_step=0.317, global_step=1542.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  69%|██████▊   | 4091/5971 [38:19<17:36,  1.78it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00139, train/loss_vlb_step=8.25e-6, train/loss_step=0.00139, global_step=1542.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  69%|██████▊   | 4092/5971 [38:21<17:36,  1.78it/s, loss=0.123, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000636, train/loss_step=0.175, global_step=1542.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  69%|██████▊   | 4093/5971 [38:22<17:36,  1.78it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0254, train/loss_vlb_step=9.7e-5, train/loss_step=0.0254, global_step=1543.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  69%|██████▊   | 4094/5971 [38:23<17:35,  1.78it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0657, train/loss_vlb_step=0.000227, train/loss_step=0.0657, global_step=1543.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  69%|██████▊   | 4095/5971 [38:24<17:35,  1.78it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0657, train/loss_vlb_step=0.000227, train/loss_step=0.0657, global_step=1543.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  69%|██████▊   | 4095/5971 [38:24<17:35,  1.78it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00401, train/loss_vlb_step=2.08e-5, train/loss_step=0.00401, global_step=1543.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  69%|██████▊   | 4096/5971 [38:26<17:35,  1.78it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0174, train/loss_vlb_step=7.14e-5, train/loss_step=0.0174, global_step=1543.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  69%|██████▊   | 4097/5971 [38:27<17:35,  1.78it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00226, train/loss_vlb_step=1.29e-5, train/loss_step=0.00226, global_step=1544.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  69%|██████▊   | 4098/5971 [38:28<17:34,  1.78it/s, loss=0.117, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000495, train/loss_step=0.150, global_step=1544.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  69%|██████▊   | 4099/5971 [38:28<17:34,  1.78it/s, loss=0.117, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000495, train/loss_step=0.150, global_step=1544.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  69%|██████▊   | 4099/5971 [38:28<17:34,  1.78it/s, loss=0.112, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000444, train/loss_step=0.134, global_step=1544.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  69%|██████▊   | 4100/5971 [38:31<17:34,  1.77it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000326, train/loss_step=0.0991, global_step=1544.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  69%|██████▊   | 4101/5971 [38:32<17:34,  1.77it/s, loss=0.126, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000737, train/loss_step=0.200, global_step=1545.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  69%|██████▊   | 4102/5971 [38:32<17:33,  1.77it/s, loss=0.165, v_num=0, train/loss_simple_step=0.793, train/loss_vlb_step=0.0411, train/loss_step=0.793, global_step=1545.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  69%|██████▊   | 4103/5971 [38:33<17:33,  1.77it/s, loss=0.165, v_num=0, train/loss_simple_step=0.793, train/loss_vlb_step=0.0411, train/loss_step=0.793, global_step=1545.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  69%|██████▊   | 4103/5971 [38:33<17:33,  1.77it/s, loss=0.169, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.000977, train/loss_step=0.266, global_step=1545.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  69%|██████▊   | 4104/5971 [38:36<17:33,  1.77it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0031, train/loss_vlb_step=1.75e-5, train/loss_step=0.0031, global_step=1545.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  69%|██████▊   | 4105/5971 [38:37<17:32,  1.77it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00455, train/loss_vlb_step=2.32e-5, train/loss_step=0.00455, global_step=1546.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  69%|██████▉   | 4106/5971 [38:37<17:32,  1.77it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0174, train/loss_vlb_step=7.06e-5, train/loss_step=0.0174, global_step=1546.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  69%|██████▉   | 4107/5971 [38:38<17:32,  1.77it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0174, train/loss_vlb_step=7.06e-5, train/loss_step=0.0174, global_step=1546.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  69%|██████▉   | 4107/5971 [38:38<17:32,  1.77it/s, loss=0.152, v_num=0, train/loss_simple_step=0.704, train/loss_vlb_step=0.0158, train/loss_step=0.704, global_step=1546.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  69%|██████▉   | 4108/5971 [38:40<17:32,  1.77it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0148, train/loss_vlb_step=6.49e-5, train/loss_step=0.0148, global_step=1546.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  69%|██████▉   | 4109/5971 [38:41<17:31,  1.77it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00133, train/loss_vlb_step=7.97e-6, train/loss_step=0.00133, global_step=1547.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  69%|██████▉   | 4110/5971 [38:42<17:31,  1.77it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0681, train/loss_vlb_step=0.000227, train/loss_step=0.0681, global_step=1547.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  69%|██████▉   | 4111/5971 [38:43<17:31,  1.77it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0681, train/loss_vlb_step=0.000227, train/loss_step=0.0681, global_step=1547.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  69%|██████▉   | 4111/5971 [38:43<17:31,  1.77it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0218, train/loss_vlb_step=8.77e-5, train/loss_step=0.0218, global_step=1547.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  69%|██████▉   | 4112/5971 [38:46<17:31,  1.77it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.00016, train/loss_step=0.0446, global_step=1547.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  69%|██████▉   | 4113/5971 [38:46<17:30,  1.77it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000281, train/loss_step=0.0837, global_step=1548.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  69%|██████▉   | 4114/5971 [38:47<17:30,  1.77it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00212, train/loss_vlb_step=1.21e-5, train/loss_step=0.00212, global_step=1548.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  69%|██████▉   | 4115/5971 [38:48<17:30,  1.77it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00212, train/loss_vlb_step=1.21e-5, train/loss_step=0.00212, global_step=1548.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  69%|██████▉   | 4115/5971 [38:48<17:30,  1.77it/s, loss=0.137, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000379, train/loss_step=0.115, global_step=1548.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  69%|██████▉   | 4116/5971 [38:50<17:30,  1.77it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0512, train/loss_vlb_step=0.000183, train/loss_step=0.0512, global_step=1548.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  69%|██████▉   | 4117/5971 [38:51<17:29,  1.77it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0287, train/loss_vlb_step=0.000113, train/loss_step=0.0287, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  69%|██████▉   | 4118/5971 [38:52<17:29,  1.77it/s, loss=0.14, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000467, train/loss_step=0.142, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  69%|██████▉   | 4119/5971 [38:53<17:28,  1.77it/s, loss=0.14, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000467, train/loss_step=0.142, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  69%|██████▉   | 4119/5971 [38:53<17:28,  1.77it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000205, train/loss_step=0.0596, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  69%|██████▉   | 4120/5971 [38:55<17:29,  1.76it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:19,  2.09it/s][A

Validating:   1%|          | 2/167 [00:00<00:48,  3.42it/s][A
Epoch 2:  69%|██████▉   | 4123/5971 [38:56<17:26,  1.77it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   3%|▎         | 5/167 [00:00<00:18,  8.86it/s][A
Epoch 2:  69%|██████▉   | 4127/5971 [38:56<17:23,  1.77it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.54it/s][A
Epoch 2:  69%|██████▉   | 4131/5971 [38:56<17:20,  1.77it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.71it/s][A

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.82it/s][A
Epoch 2:  69%|██████▉   | 4135/5971 [38:56<17:17,  1.77it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  10%|█         | 17/167 [00:01<00:06, 21.93it/s][A
Epoch 2:  69%|██████▉   | 4139/5971 [38:57<17:14,  1.77it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 23.64it/s][A
Epoch 2:  69%|██████▉   | 4143/5971 [38:57<17:10,  1.77it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  14%|█▍        | 23/167 [00:01<00:05, 24.28it/s][A

Validating:  16%|█▌        | 26/167 [00:01<00:06, 23.34it/s][A
Epoch 2:  69%|██████▉   | 4147/5971 [38:57<17:07,  1.77it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 23.55it/s][A
Epoch 2:  70%|██████▉   | 4151/5971 [38:57<17:04,  1.78it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 24.38it/s][A
Epoch 2:  70%|██████▉   | 4155/5971 [38:57<17:01,  1.78it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  21%|██        | 35/167 [00:01<00:05, 25.77it/s][A

Validating:  23%|██▎       | 38/167 [00:02<00:05, 25.08it/s][A
Epoch 2:  70%|██████▉   | 4159/5971 [38:57<16:58,  1.78it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 25.36it/s][A
Epoch 2:  70%|██████▉   | 4163/5971 [38:57<16:55,  1.78it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 26.28it/s][A
Epoch 2:  70%|██████▉   | 4167/5971 [38:58<16:51,  1.78it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 26.18it/s][A
Epoch 2:  70%|██████▉   | 4171/5971 [38:58<16:48,  1.78it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  31%|███       | 51/167 [00:02<00:04, 27.87it/s][A

Validating:  32%|███▏      | 54/167 [00:02<00:04, 27.19it/s][A
Epoch 2:  70%|██████▉   | 4175/5971 [38:58<16:45,  1.79it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 27.48it/s][A
Epoch 2:  70%|██████▉   | 4179/5971 [38:58<16:42,  1.79it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  36%|███▌      | 60/167 [00:02<00:03, 27.28it/s][A
Epoch 2:  70%|███████   | 4183/5971 [38:58<16:39,  1.79it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  38%|███▊      | 63/167 [00:02<00:03, 26.42it/s][A

Validating:  40%|███▉      | 66/167 [00:03<00:03, 27.10it/s][A
Epoch 2:  70%|███████   | 4187/5971 [38:58<16:36,  1.79it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 27.65it/s][A
Epoch 2:  70%|███████   | 4191/5971 [38:58<16:33,  1.79it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 27.14it/s][A
Epoch 2:  70%|███████   | 4195/5971 [38:59<16:30,  1.79it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 26.81it/s][A

Validating:  47%|████▋     | 78/167 [00:03<00:03, 26.40it/s][A
Epoch 2:  70%|███████   | 4199/5971 [38:59<16:26,  1.80it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 27.38it/s][A
Epoch 2:  70%|███████   | 4203/5971 [38:59<16:23,  1.80it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  50%|█████     | 84/167 [00:03<00:03, 27.40it/s][A
Epoch 2:  70%|███████   | 4207/5971 [38:59<16:20,  1.80it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  52%|█████▏    | 87/167 [00:03<00:03, 26.62it/s][A

Validating:  54%|█████▍    | 90/167 [00:03<00:02, 25.91it/s][A
Epoch 2:  71%|███████   | 4211/5971 [38:59<16:17,  1.80it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 26.11it/s][A
Epoch 2:  71%|███████   | 4215/5971 [38:59<16:14,  1.80it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 27.79it/s][A
Epoch 2:  71%|███████   | 4219/5971 [39:00<16:11,  1.80it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 26.27it/s][A
Epoch 2:  71%|███████   | 4223/5971 [39:00<16:08,  1.80it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 26.20it/s][A

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 26.82it/s][A
Epoch 2:  71%|███████   | 4227/5971 [39:00<16:05,  1.81it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 25.92it/s][A
Epoch 2:  71%|███████   | 4231/5971 [39:00<16:02,  1.81it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  67%|██████▋   | 112/167 [00:04<00:02, 25.67it/s][A
Epoch 2:  71%|███████   | 4235/5971 [39:00<15:59,  1.81it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  69%|██████▉   | 115/167 [00:04<00:01, 26.07it/s][A

Validating:  71%|███████   | 118/167 [00:05<00:01, 24.98it/s][A
Epoch 2:  71%|███████   | 4239/5971 [39:00<15:56,  1.81it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 23.97it/s][A
Epoch 2:  71%|███████   | 4243/5971 [39:00<15:53,  1.81it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 25.94it/s][A
Epoch 2:  71%|███████   | 4247/5971 [39:01<15:50,  1.81it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 24.86it/s][A
Epoch 2:  71%|███████   | 4251/5971 [39:01<15:47,  1.82it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 24.67it/s][A
Epoch 2:  71%|███████▏  | 4255/5971 [39:01<15:44,  1.82it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  81%|████████  | 135/167 [00:05<00:01, 26.23it/s][A

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 26.85it/s][A
Epoch 2:  71%|███████▏  | 4259/5971 [39:01<15:41,  1.82it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  84%|████████▍ | 141/167 [00:05<00:00, 27.24it/s][A
Epoch 2:  71%|███████▏  | 4263/5971 [39:01<15:38,  1.82it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 27.13it/s][A
Epoch 2:  71%|███████▏  | 4267/5971 [39:01<15:35,  1.82it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 26.43it/s][A

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 26.81it/s][A
Epoch 2:  72%|███████▏  | 4271/5971 [39:02<15:31,  1.82it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 27.08it/s][A
Epoch 2:  72%|███████▏  | 4275/5971 [39:02<15:28,  1.83it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 26.67it/s][A
Epoch 2:  72%|███████▏  | 4279/5971 [39:02<15:25,  1.83it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 26.44it/s][A

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 26.52it/s][A
Epoch 2:  72%|███████▏  | 4283/5971 [39:02<15:22,  1.83it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  99%|█████████▉| 165/167 [00:06<00:00, 26.36it/s][A
Epoch 2:  72%|███████▏  | 4287/5971 [39:02<15:20,  1.83it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4288/5971 [39:03<15:19,  1.83it/s, loss=0.148, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00201, train/loss_step=0.348, global_step=1549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Epoch 2:  72%|███████▏  | 4289/5971 [39:04<15:19,  1.83it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00854, train/loss_vlb_step=3.94e-5, train/loss_step=0.00854, global_step=1550.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4290/5971 [39:04<15:18,  1.83it/s, loss=0.107, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.000487, train/loss_step=0.146, global_step=1550.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  72%|███████▏  | 4291/5971 [39:05<15:18,  1.83it/s, loss=0.107, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.000487, train/loss_step=0.146, global_step=1550.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4291/5971 [39:05<15:18,  1.83it/s, loss=0.0935, v_num=0, train/loss_simple_step=0.00485, train/loss_vlb_step=2.47e-5, train/loss_step=0.00485, global_step=1550.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4292/5971 [39:08<15:18,  1.83it/s, loss=0.094, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=6.24e-5, train/loss_step=0.0143, global_step=1550.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  72%|███████▏  | 4293/5971 [39:09<15:18,  1.83it/s, loss=0.0993, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000361, train/loss_step=0.110, global_step=1551.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4294/5971 [39:10<15:17,  1.83it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0362, train/loss_vlb_step=0.000126, train/loss_step=0.0362, global_step=1551.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  72%|███████▏  | 4295/5971 [39:11<15:17,  1.83it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0362, train/loss_vlb_step=0.000126, train/loss_step=0.0362, global_step=1551.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4295/5971 [39:11<15:17,  1.83it/s, loss=0.0656, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.26e-5, train/loss_step=0.0119, global_step=1551.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4296/5971 [39:13<15:17,  1.83it/s, loss=0.0657, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=6.61e-5, train/loss_step=0.0156, global_step=1551.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4297/5971 [39:14<15:17,  1.83it/s, loss=0.07, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.00029, train/loss_step=0.0883, global_step=1552.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  72%|███████▏  | 4298/5971 [39:15<15:16,  1.83it/s, loss=0.0673, v_num=0, train/loss_simple_step=0.0134, train/loss_vlb_step=5.62e-5, train/loss_step=0.0134, global_step=1552.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4299/5971 [39:16<15:16,  1.82it/s, loss=0.0673, v_num=0, train/loss_simple_step=0.0134, train/loss_vlb_step=5.62e-5, train/loss_step=0.0134, global_step=1552.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4299/5971 [39:16<15:16,  1.82it/s, loss=0.0693, v_num=0, train/loss_simple_step=0.0632, train/loss_vlb_step=0.000213, train/loss_step=0.0632, global_step=1552.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4300/5971 [39:18<15:16,  1.82it/s, loss=0.0835, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00151, train/loss_step=0.327, global_step=1552.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  72%|███████▏  | 4301/5971 [39:19<15:16,  1.82it/s, loss=0.0795, v_num=0, train/loss_simple_step=0.00297, train/loss_vlb_step=1.65e-5, train/loss_step=0.00297, global_step=1553.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4302/5971 [39:20<15:15,  1.82it/s, loss=0.0798, v_num=0, train/loss_simple_step=0.00847, train/loss_vlb_step=4.07e-5, train/loss_step=0.00847, global_step=1553.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4303/5971 [39:21<15:15,  1.82it/s, loss=0.0798, v_num=0, train/loss_simple_step=0.00847, train/loss_vlb_step=4.07e-5, train/loss_step=0.00847, global_step=1553.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4303/5971 [39:21<15:15,  1.82it/s, loss=0.0956, v_num=0, train/loss_simple_step=0.433, train/loss_vlb_step=0.00464, train/loss_step=0.433, global_step=1553.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  72%|███████▏  | 4304/5971 [39:23<15:15,  1.82it/s, loss=0.0947, v_num=0, train/loss_simple_step=0.0331, train/loss_vlb_step=0.000127, train/loss_step=0.0331, global_step=1553.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4305/5971 [39:24<15:14,  1.82it/s, loss=0.0935, v_num=0, train/loss_simple_step=0.00324, train/loss_vlb_step=1.68e-5, train/loss_step=0.00324, global_step=1554.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4306/5971 [39:25<15:14,  1.82it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00926, train/loss_vlb_step=4.21e-5, train/loss_step=0.00926, global_step=1554.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4307/5971 [39:26<15:13,  1.82it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.00926, train/loss_vlb_step=4.21e-5, train/loss_step=0.00926, global_step=1554.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4307/5971 [39:26<15:13,  1.82it/s, loss=0.0847, v_num=0, train/loss_simple_step=0.0175, train/loss_vlb_step=7.41e-5, train/loss_step=0.0175, global_step=1554.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  72%|███████▏  | 4308/5971 [39:29<15:14,  1.82it/s, loss=0.0766, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.000659, train/loss_step=0.186, global_step=1554.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  72%|███████▏  | 4309/5971 [39:30<15:13,  1.82it/s, loss=0.0763, v_num=0, train/loss_simple_step=0.00146, train/loss_vlb_step=8.6e-6, train/loss_step=0.00146, global_step=1555.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4310/5971 [39:30<15:13,  1.82it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00386, train/loss_vlb_step=2.02e-5, train/loss_step=0.00386, global_step=1555.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4311/5971 [39:31<15:13,  1.82it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00386, train/loss_vlb_step=2.02e-5, train/loss_step=0.00386, global_step=1555.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4311/5971 [39:31<15:13,  1.82it/s, loss=0.0702, v_num=0, train/loss_simple_step=0.0252, train/loss_vlb_step=9.38e-5, train/loss_step=0.0252, global_step=1555.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  72%|███████▏  | 4312/5971 [39:34<15:13,  1.82it/s, loss=0.0851, v_num=0, train/loss_simple_step=0.312, train/loss_vlb_step=0.00127, train/loss_step=0.312, global_step=1555.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  72%|███████▏  | 4313/5971 [39:34<15:12,  1.82it/s, loss=0.0861, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000432, train/loss_step=0.129, global_step=1556.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4314/5971 [39:35<15:12,  1.82it/s, loss=0.0852, v_num=0, train/loss_simple_step=0.0197, train/loss_vlb_step=8.22e-5, train/loss_step=0.0197, global_step=1556.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4315/5971 [39:36<15:11,  1.82it/s, loss=0.0852, v_num=0, train/loss_simple_step=0.0197, train/loss_vlb_step=8.22e-5, train/loss_step=0.0197, global_step=1556.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4315/5971 [39:36<15:11,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.574, train/loss_vlb_step=0.00991, train/loss_step=0.574, global_step=1556.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  72%|███████▏  | 4316/5971 [39:38<15:12,  1.81it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0179, train/loss_vlb_step=7.15e-5, train/loss_step=0.0179, global_step=1556.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4317/5971 [39:39<15:11,  1.81it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.17e-5, train/loss_step=0.0112, global_step=1557.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  72%|███████▏  | 4318/5971 [39:40<15:11,  1.81it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0134, train/loss_vlb_step=5.78e-5, train/loss_step=0.0134, global_step=1557.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4319/5971 [39:41<15:10,  1.81it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0134, train/loss_vlb_step=5.78e-5, train/loss_step=0.0134, global_step=1557.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4319/5971 [39:41<15:10,  1.81it/s, loss=0.11, v_num=0, train/loss_simple_step=0.063, train/loss_vlb_step=0.000208, train/loss_step=0.063, global_step=1557.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  72%|███████▏  | 4320/5971 [39:44<15:11,  1.81it/s, loss=0.109, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00186, train/loss_step=0.318, global_step=1557.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4321/5971 [39:45<15:10,  1.81it/s, loss=0.13, v_num=0, train/loss_simple_step=0.412, train/loss_vlb_step=0.00312, train/loss_step=0.412, global_step=1558.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  72%|███████▏  | 4322/5971 [39:46<15:10,  1.81it/s, loss=0.135, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000361, train/loss_step=0.109, global_step=1558.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4323/5971 [39:47<15:09,  1.81it/s, loss=0.135, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000361, train/loss_step=0.109, global_step=1558.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4323/5971 [39:47<15:09,  1.81it/s, loss=0.13, v_num=0, train/loss_simple_step=0.332, train/loss_vlb_step=0.00256, train/loss_step=0.332, global_step=1558.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  72%|███████▏  | 4324/5971 [39:49<15:09,  1.81it/s, loss=0.148, v_num=0, train/loss_simple_step=0.410, train/loss_vlb_step=0.00297, train/loss_step=0.410, global_step=1558.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4325/5971 [39:50<15:09,  1.81it/s, loss=0.171, v_num=0, train/loss_simple_step=0.457, train/loss_vlb_step=0.00334, train/loss_step=0.457, global_step=1559.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4326/5971 [39:50<15:08,  1.81it/s, loss=0.18, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000686, train/loss_step=0.193, global_step=1559.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4327/5971 [39:51<15:08,  1.81it/s, loss=0.18, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000686, train/loss_step=0.193, global_step=1559.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4327/5971 [39:51<15:08,  1.81it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0884, train/loss_vlb_step=0.000296, train/loss_step=0.0884, global_step=1559.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  72%|███████▏  | 4328/5971 [39:54<15:08,  1.81it/s, loss=0.211, v_num=0, train/loss_simple_step=0.727, train/loss_vlb_step=0.0164, train/loss_step=0.727, global_step=1559.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  73%|███████▎  | 4329/5971 [39:55<15:08,  1.81it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0444, train/loss_vlb_step=0.000158, train/loss_step=0.0444, global_step=1560.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4330/5971 [39:56<15:07,  1.81it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0032, train/loss_vlb_step=1.73e-5, train/loss_step=0.0032, global_step=1560.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  73%|███████▎  | 4331/5971 [39:57<15:07,  1.81it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0032, train/loss_vlb_step=1.73e-5, train/loss_step=0.0032, global_step=1560.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4331/5971 [39:57<15:07,  1.81it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0954, train/loss_vlb_step=0.000313, train/loss_step=0.0954, global_step=1560.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4332/5971 [39:59<15:07,  1.81it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0842, train/loss_vlb_step=0.000287, train/loss_step=0.0842, global_step=1560.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4333/5971 [40:00<15:07,  1.81it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0274, train/loss_vlb_step=0.000104, train/loss_step=0.0274, global_step=1561.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  73%|███████▎  | 4334/5971 [40:00<15:06,  1.81it/s, loss=0.199, v_num=0, train/loss_simple_step=0.00176, train/loss_vlb_step=1.06e-5, train/loss_step=0.00176, global_step=1561.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4335/5971 [40:01<15:06,  1.81it/s, loss=0.199, v_num=0, train/loss_simple_step=0.00176, train/loss_vlb_step=1.06e-5, train/loss_step=0.00176, global_step=1561.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4335/5971 [40:01<15:06,  1.81it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=6.17e-5, train/loss_step=0.0144, global_step=1561.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  73%|███████▎  | 4336/5971 [40:03<15:06,  1.80it/s, loss=0.176, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000398, train/loss_step=0.116, global_step=1561.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  73%|███████▎  | 4337/5971 [40:04<15:05,  1.80it/s, loss=0.186, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000901, train/loss_step=0.211, global_step=1562.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4338/5971 [40:05<15:05,  1.80it/s, loss=0.201, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.00124, train/loss_step=0.313, global_step=1562.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  73%|███████▎  | 4339/5971 [40:06<15:04,  1.80it/s, loss=0.201, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.00124, train/loss_step=0.313, global_step=1562.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4339/5971 [40:06<15:04,  1.80it/s, loss=0.198, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.62e-5, train/loss_step=0.00289, global_step=1562.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4340/5971 [40:09<15:05,  1.80it/s, loss=0.204, v_num=0, train/loss_simple_step=0.428, train/loss_vlb_step=0.00363, train/loss_step=0.428, global_step=1562.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  73%|███████▎  | 4341/5971 [40:09<15:04,  1.80it/s, loss=0.189, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000369, train/loss_step=0.112, global_step=1563.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4342/5971 [40:10<15:04,  1.80it/s, loss=0.206, v_num=0, train/loss_simple_step=0.449, train/loss_vlb_step=0.00354, train/loss_step=0.449, global_step=1563.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  73%|███████▎  | 4343/5971 [40:11<15:03,  1.80it/s, loss=0.206, v_num=0, train/loss_simple_step=0.449, train/loss_vlb_step=0.00354, train/loss_step=0.449, global_step=1563.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4343/5971 [40:11<15:03,  1.80it/s, loss=0.215, v_num=0, train/loss_simple_step=0.514, train/loss_vlb_step=0.00379, train/loss_step=0.514, global_step=1563.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4344/5971 [40:13<15:03,  1.80it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0592, train/loss_vlb_step=0.000201, train/loss_step=0.0592, global_step=1563.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4345/5971 [40:15<15:03,  1.80it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000135, train/loss_step=0.0359, global_step=1564.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4346/5971 [40:15<15:03,  1.80it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00697, train/loss_vlb_step=3.47e-5, train/loss_step=0.00697, global_step=1564.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4347/5971 [40:16<15:02,  1.80it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00697, train/loss_vlb_step=3.47e-5, train/loss_step=0.00697, global_step=1564.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4347/5971 [40:16<15:02,  1.80it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0368, train/loss_vlb_step=0.000134, train/loss_step=0.0368, global_step=1564.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  73%|███████▎  | 4348/5971 [40:19<15:02,  1.80it/s, loss=0.154, v_num=0, train/loss_simple_step=0.525, train/loss_vlb_step=0.00426, train/loss_step=0.525, global_step=1564.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  73%|███████▎  | 4349/5971 [40:20<15:02,  1.80it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0279, train/loss_vlb_step=0.000107, train/loss_step=0.0279, global_step=1565.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4350/5971 [40:20<15:01,  1.80it/s, loss=0.16, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000441, train/loss_step=0.134, global_step=1565.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  73%|███████▎  | 4351/5971 [40:21<15:01,  1.80it/s, loss=0.16, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000441, train/loss_step=0.134, global_step=1565.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4351/5971 [40:21<15:01,  1.80it/s, loss=0.162, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000449, train/loss_step=0.136, global_step=1565.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4352/5971 [40:23<15:01,  1.80it/s, loss=0.17, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000908, train/loss_step=0.255, global_step=1565.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  73%|███████▎  | 4353/5971 [40:24<15:01,  1.80it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0546, train/loss_vlb_step=0.000193, train/loss_step=0.0546, global_step=1566.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4354/5971 [40:25<15:00,  1.80it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0234, train/loss_vlb_step=9.07e-5, train/loss_step=0.0234, global_step=1566.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  73%|███████▎  | 4355/5971 [40:26<15:00,  1.80it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0234, train/loss_vlb_step=9.07e-5, train/loss_step=0.0234, global_step=1566.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4355/5971 [40:26<15:00,  1.80it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0576, train/loss_vlb_step=0.0002, train/loss_step=0.0576, global_step=1566.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  73%|███████▎  | 4356/5971 [40:28<15:00,  1.79it/s, loss=0.182, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00113, train/loss_step=0.267, global_step=1566.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  73%|███████▎  | 4357/5971 [40:29<14:59,  1.79it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0674, train/loss_vlb_step=0.000235, train/loss_step=0.0674, global_step=1567.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4358/5971 [40:30<14:59,  1.79it/s, loss=0.167, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000529, train/loss_step=0.155, global_step=1567.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  73%|███████▎  | 4359/5971 [40:31<14:59,  1.79it/s, loss=0.167, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000529, train/loss_step=0.155, global_step=1567.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4359/5971 [40:31<14:59,  1.79it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0246, train/loss_vlb_step=0.000101, train/loss_step=0.0246, global_step=1567.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4360/5971 [40:33<14:59,  1.79it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0178, train/loss_vlb_step=7.06e-5, train/loss_step=0.0178, global_step=1567.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  73%|███████▎  | 4361/5971 [40:34<14:58,  1.79it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00338, train/loss_vlb_step=1.92e-5, train/loss_step=0.00338, global_step=1568.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4362/5971 [40:35<14:58,  1.79it/s, loss=0.132, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000896, train/loss_step=0.240, global_step=1568.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  73%|███████▎  | 4363/5971 [40:36<14:57,  1.79it/s, loss=0.132, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000896, train/loss_step=0.240, global_step=1568.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4363/5971 [40:36<14:57,  1.79it/s, loss=0.107, v_num=0, train/loss_simple_step=0.00421, train/loss_vlb_step=2.23e-5, train/loss_step=0.00421, global_step=1568.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4364/5971 [40:38<14:57,  1.79it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0669, train/loss_vlb_step=0.000228, train/loss_step=0.0669, global_step=1568.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  73%|███████▎  | 4365/5971 [40:39<14:57,  1.79it/s, loss=0.13, v_num=0, train/loss_simple_step=0.505, train/loss_vlb_step=0.00644, train/loss_step=0.505, global_step=1569.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  73%|███████▎  | 4366/5971 [40:40<14:56,  1.79it/s, loss=0.144, v_num=0, train/loss_simple_step=0.286, train/loss_vlb_step=0.00118, train/loss_step=0.286, global_step=1569.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4367/5971 [40:41<14:56,  1.79it/s, loss=0.144, v_num=0, train/loss_simple_step=0.286, train/loss_vlb_step=0.00118, train/loss_step=0.286, global_step=1569.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4367/5971 [40:41<14:56,  1.79it/s, loss=0.153, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000811, train/loss_step=0.216, global_step=1569.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4368/5971 [40:43<14:56,  1.79it/s, loss=0.148, v_num=0, train/loss_simple_step=0.420, train/loss_vlb_step=0.00236, train/loss_step=0.420, global_step=1569.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  73%|███████▎  | 4369/5971 [40:44<14:56,  1.79it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.61e-5, train/loss_step=0.00481, global_step=1570.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4370/5971 [40:45<14:55,  1.79it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0828, train/loss_vlb_step=0.000281, train/loss_step=0.0828, global_step=1570.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  73%|███████▎  | 4371/5971 [40:46<14:55,  1.79it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0828, train/loss_vlb_step=0.000281, train/loss_step=0.0828, global_step=1570.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4371/5971 [40:46<14:55,  1.79it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0607, train/loss_vlb_step=0.000204, train/loss_step=0.0607, global_step=1570.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4372/5971 [40:48<14:55,  1.79it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00318, train/loss_vlb_step=1.73e-5, train/loss_step=0.00318, global_step=1570.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4373/5971 [40:49<14:54,  1.79it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0405, train/loss_vlb_step=0.000151, train/loss_step=0.0405, global_step=1571.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  73%|███████▎  | 4374/5971 [40:50<14:54,  1.79it/s, loss=0.137, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000843, train/loss_step=0.225, global_step=1571.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  73%|███████▎  | 4375/5971 [40:51<14:54,  1.79it/s, loss=0.137, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000843, train/loss_step=0.225, global_step=1571.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4375/5971 [40:51<14:54,  1.79it/s, loss=0.158, v_num=0, train/loss_simple_step=0.461, train/loss_vlb_step=0.00299, train/loss_step=0.461, global_step=1571.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  73%|███████▎  | 4376/5971 [40:53<14:54,  1.78it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0603, train/loss_vlb_step=0.000205, train/loss_step=0.0603, global_step=1571.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4377/5971 [40:54<14:53,  1.78it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=0.000108, train/loss_step=0.0272, global_step=1572.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4378/5971 [40:55<14:53,  1.78it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00342, train/loss_vlb_step=1.8e-5, train/loss_step=0.00342, global_step=1572.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4379/5971 [40:56<14:52,  1.78it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00342, train/loss_vlb_step=1.8e-5, train/loss_step=0.00342, global_step=1572.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4379/5971 [40:56<14:52,  1.78it/s, loss=0.144, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000532, train/loss_step=0.151, global_step=1572.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  73%|███████▎  | 4380/5971 [40:58<14:52,  1.78it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0734, train/loss_vlb_step=0.000247, train/loss_step=0.0734, global_step=1572.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4381/5971 [40:59<14:52,  1.78it/s, loss=0.167, v_num=0, train/loss_simple_step=0.408, train/loss_vlb_step=0.00206, train/loss_step=0.408, global_step=1573.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  73%|███████▎  | 4382/5971 [41:00<14:51,  1.78it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0672, train/loss_vlb_step=0.000224, train/loss_step=0.0672, global_step=1573.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4383/5971 [41:01<14:51,  1.78it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0672, train/loss_vlb_step=0.000224, train/loss_step=0.0672, global_step=1573.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4383/5971 [41:01<14:51,  1.78it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0294, train/loss_vlb_step=0.000115, train/loss_step=0.0294, global_step=1573.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  73%|███████▎  | 4384/5971 [41:03<14:51,  1.78it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0452, train/loss_vlb_step=0.000164, train/loss_step=0.0452, global_step=1573.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4385/5971 [41:04<14:51,  1.78it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00697, train/loss_vlb_step=3.28e-5, train/loss_step=0.00697, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4386/5971 [41:05<14:50,  1.78it/s, loss=0.141, v_num=0, train/loss_simple_step=0.432, train/loss_vlb_step=0.0028, train/loss_step=0.432, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2:  73%|███████▎  | 4387/5971 [41:05<14:50,  1.78it/s, loss=0.141, v_num=0, train/loss_simple_step=0.432, train/loss_vlb_step=0.0028, train/loss_step=0.432, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4387/5971 [41:05<14:50,  1.78it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0291, train/loss_vlb_step=0.000104, train/loss_step=0.0291, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  73%|███████▎  | 4388/5971 [41:08<14:50,  1.78it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:13,  2.27it/s]

Validating:   1%|          | 2/167 [00:00<01:01,  2.67it/s]
Epoch 2:  74%|███████▎  | 4391/5971 [41:09<14:48,  1.78it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   2%|▏         | 4/167 [00:00<00:28,  5.74it/s]
Epoch 2:  74%|███████▎  | 4395/5971 [41:09<14:45,  1.78it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   4%|▍         | 7/167 [00:01<00:16,  9.94it/s]

Validating:   6%|▌         | 10/167 [00:01<00:11, 13.63it/s]
Epoch 2:  74%|███████▎  | 4399/5971 [41:09<14:42,  1.78it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   8%|▊         | 13/167 [00:01<00:09, 16.39it/s]
Epoch 2:  74%|███████▎  | 4403/5971 [41:09<14:39,  1.78it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  10%|▉         | 16/167 [00:01<00:07, 19.55it/s]
Epoch 2:  74%|███████▍  | 4407/5971 [41:09<14:36,  1.78it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  11%|█▏        | 19/167 [00:01<00:06, 21.33it/s]

Validating:  13%|█▎        | 22/167 [00:01<00:06, 21.60it/s]
Epoch 2:  74%|███████▍  | 4411/5971 [41:09<14:33,  1.79it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  15%|█▍        | 25/167 [00:01<00:05, 23.70it/s]
Epoch 2:  74%|███████▍  | 4415/5971 [41:10<14:30,  1.79it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  17%|█▋        | 28/167 [00:01<00:05, 23.24it/s]
Epoch 2:  74%|███████▍  | 4419/5971 [41:10<14:27,  1.79it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  19%|█▊        | 31/167 [00:01<00:06, 22.61it/s]

Validating:  20%|██        | 34/167 [00:02<00:05, 23.27it/s]
Epoch 2:  74%|███████▍  | 4423/5971 [41:10<14:24,  1.79it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  22%|██▏       | 37/167 [00:02<00:05, 23.50it/s]
Epoch 2:  74%|███████▍  | 4427/5971 [41:10<14:21,  1.79it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  24%|██▍       | 40/167 [00:02<00:05, 23.86it/s]
Epoch 2:  74%|███████▍  | 4431/5971 [41:10<14:18,  1.79it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  26%|██▌       | 43/167 [00:02<00:04, 25.11it/s]

Validating:  28%|██▊       | 46/167 [00:02<00:04, 25.65it/s]
Epoch 2:  74%|███████▍  | 4435/5971 [41:10<14:15,  1.80it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  29%|██▉       | 49/167 [00:02<00:04, 25.59it/s]
Epoch 2:  74%|███████▍  | 4439/5971 [41:11<14:12,  1.80it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  31%|███       | 52/167 [00:02<00:04, 25.61it/s]
Epoch 2:  74%|███████▍  | 4443/5971 [41:11<14:09,  1.80it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  33%|███▎      | 55/167 [00:02<00:04, 25.92it/s]

Validating:  35%|███▍      | 58/167 [00:03<00:04, 25.51it/s]
Epoch 2:  74%|███████▍  | 4447/5971 [41:11<14:06,  1.80it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  37%|███▋      | 61/167 [00:03<00:04, 26.31it/s]
Epoch 2:  75%|███████▍  | 4451/5971 [41:11<14:03,  1.80it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  38%|███▊      | 64/167 [00:03<00:03, 26.89it/s]
Epoch 2:  75%|███████▍  | 4455/5971 [41:11<14:00,  1.80it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  40%|████      | 67/167 [00:03<00:03, 26.94it/s]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 25.32it/s]
Epoch 2:  75%|███████▍  | 4459/5971 [41:11<13:57,  1.80it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 25.58it/s]
Epoch 2:  75%|███████▍  | 4463/5971 [41:11<13:55,  1.81it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 26.40it/s]
Epoch 2:  75%|███████▍  | 4467/5971 [41:12<13:52,  1.81it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 27.70it/s]
Epoch 2:  75%|███████▍  | 4471/5971 [41:12<13:49,  1.81it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  50%|█████     | 84/167 [00:03<00:02, 28.59it/s]
Epoch 2:  75%|███████▍  | 4475/5971 [41:12<13:46,  1.81it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  52%|█████▏    | 87/167 [00:04<00:02, 28.72it/s]

Validating:  54%|█████▍    | 90/167 [00:04<00:02, 27.70it/s]
Epoch 2:  75%|███████▌  | 4479/5971 [41:12<13:43,  1.81it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 27.84it/s]
Epoch 2:  75%|███████▌  | 4483/5971 [41:12<13:40,  1.81it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 28.86it/s]
Epoch 2:  75%|███████▌  | 4487/5971 [41:12<13:37,  1.81it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 28.71it/s]
Epoch 2:  75%|███████▌  | 4491/5971 [41:12<13:34,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 28.76it/s]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 28.26it/s]
Epoch 2:  75%|███████▌  | 4495/5971 [41:13<13:31,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 27.98it/s]
Epoch 2:  75%|███████▌  | 4499/5971 [41:13<13:29,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  67%|██████▋   | 112/167 [00:05<00:02, 26.10it/s]
Epoch 2:  75%|███████▌  | 4503/5971 [41:13<13:26,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  69%|██████▉   | 115/167 [00:05<00:02, 25.55it/s]

Validating:  71%|███████   | 118/167 [00:05<00:01, 25.49it/s]
Epoch 2:  75%|███████▌  | 4507/5971 [41:13<13:23,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 26.97it/s]
Epoch 2:  76%|███████▌  | 4511/5971 [41:13<13:20,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 25.09it/s]
Epoch 2:  76%|███████▌  | 4515/5971 [41:13<13:17,  1.83it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 24.49it/s]
Epoch 2:  76%|███████▌  | 4519/5971 [41:14<13:14,  1.83it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 25.59it/s]

Validating:  80%|████████  | 134/167 [00:05<00:01, 26.23it/s]
Epoch 2:  76%|███████▌  | 4523/5971 [41:14<13:11,  1.83it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 26.19it/s]
Epoch 2:  76%|███████▌  | 4527/5971 [41:14<13:09,  1.83it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  84%|████████▍ | 140/167 [00:06<00:01, 26.41it/s]
Epoch 2:  76%|███████▌  | 4531/5971 [41:14<13:06,  1.83it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 27.27it/s]
Epoch 2:  76%|███████▌  | 4535/5971 [41:14<13:03,  1.83it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 26.59it/s][A

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 26.16it/s][A
Epoch 2:  76%|███████▌  | 4539/5971 [41:14<13:00,  1.83it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 26.33it/s][A
Epoch 2:  76%|███████▌  | 4543/5971 [41:14<12:57,  1.84it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 27.10it/s][A
Epoch 2:  76%|███████▌  | 4547/5971 [41:15<12:54,  1.84it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 26.87it/s][A
Epoch 2:  76%|███████▌  | 4551/5971 [41:15<12:52,  1.84it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 27.56it/s][A

Validating:  99%|█████████▉| 166/167 [00:07<00:00, 27.56it/s][A
Epoch 2:  76%|███████▋  | 4555/5971 [41:15<12:49,  1.84it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  76%|███████▋  | 4556/5971 [41:15<12:48,  1.84it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  76%|███████▋  | 4557/5971 [41:16<12:48,  1.84it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0205, train/loss_vlb_step=8.4e-5, train/loss_step=0.0205, global_step=1575.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  76%|███████▋  | 4558/5971 [41:17<12:47,  1.84it/s, loss=0.137, v_num=0, train/loss_simple_step=0.555, train/loss_vlb_step=0.00531, train/loss_step=0.555, global_step=1575.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  76%|███████▋  | 4559/5971 [41:18<12:47,  1.84it/s, loss=0.137, v_num=0, train/loss_simple_step=0.555, train/loss_vlb_step=0.00531, train/loss_step=0.555, global_step=1575.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  76%|███████▋  | 4559/5971 [41:18<12:47,  1.84it/s, loss=0.152, v_num=0, train/loss_simple_step=0.357, train/loss_vlb_step=0.00208, train/loss_step=0.357, global_step=1575.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  76%|███████▋  | 4560/5971 [41:20<12:47,  1.84it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00705, train/loss_vlb_step=3.41e-5, train/loss_step=0.00705, global_step=1575.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  76%|███████▋  | 4561/5971 [41:21<12:47,  1.84it/s, loss=0.156, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000379, train/loss_step=0.115, global_step=1576.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  76%|███████▋  | 4562/5971 [41:22<12:46,  1.84it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00472, train/loss_vlb_step=2.42e-5, train/loss_step=0.00472, global_step=1576.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  76%|███████▋  | 4563/5971 [41:23<12:46,  1.84it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00472, train/loss_vlb_step=2.42e-5, train/loss_step=0.00472, global_step=1576.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  76%|███████▋  | 4563/5971 [41:23<12:46,  1.84it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00254, train/loss_vlb_step=1.46e-5, train/loss_step=0.00254, global_step=1576.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  76%|███████▋  | 4564/5971 [41:25<12:46,  1.84it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0661, train/loss_vlb_step=0.000233, train/loss_step=0.0661, global_step=1576.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  76%|███████▋  | 4565/5971 [41:26<12:45,  1.84it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00668, train/loss_vlb_step=3.18e-5, train/loss_step=0.00668, global_step=1577.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  76%|███████▋  | 4566/5971 [41:27<12:45,  1.84it/s, loss=0.129, v_num=0, train/loss_simple_step=0.167, train/loss_vlb_step=0.000559, train/loss_step=0.167, global_step=1577.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  76%|███████▋  | 4567/5971 [41:28<12:44,  1.84it/s, loss=0.129, v_num=0, train/loss_simple_step=0.167, train/loss_vlb_step=0.000559, train/loss_step=0.167, global_step=1577.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  76%|███████▋  | 4567/5971 [41:28<12:44,  1.84it/s, loss=0.138, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00123, train/loss_step=0.319, global_step=1577.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  77%|███████▋  | 4568/5971 [41:30<12:44,  1.83it/s, loss=0.159, v_num=0, train/loss_simple_step=0.491, train/loss_vlb_step=0.00336, train/loss_step=0.491, global_step=1577.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4569/5971 [41:31<12:44,  1.83it/s, loss=0.145, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000473, train/loss_step=0.139, global_step=1578.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4570/5971 [41:32<12:43,  1.83it/s, loss=0.148, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000428, train/loss_step=0.128, global_step=1578.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4571/5971 [41:33<12:43,  1.83it/s, loss=0.148, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000428, train/loss_step=0.128, global_step=1578.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4571/5971 [41:33<12:43,  1.83it/s, loss=0.157, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000731, train/loss_step=0.205, global_step=1578.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4572/5971 [41:35<12:43,  1.83it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0363, train/loss_vlb_step=0.000134, train/loss_step=0.0363, global_step=1578.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4573/5971 [41:36<12:43,  1.83it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00653, train/loss_vlb_step=3.2e-5, train/loss_step=0.00653, global_step=1579.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4574/5971 [41:37<12:42,  1.83it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0602, train/loss_vlb_step=0.000209, train/loss_step=0.0602, global_step=1579.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4575/5971 [41:38<12:42,  1.83it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0602, train/loss_vlb_step=0.000209, train/loss_step=0.0602, global_step=1579.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4575/5971 [41:38<12:42,  1.83it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00364, train/loss_vlb_step=2e-5, train/loss_step=0.00364, global_step=1579.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  77%|███████▋  | 4576/5971 [41:40<12:42,  1.83it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0892, train/loss_vlb_step=0.000293, train/loss_step=0.0892, global_step=1579.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4577/5971 [41:41<12:41,  1.83it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0044, train/loss_vlb_step=2.2e-5, train/loss_step=0.0044, global_step=1580.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  77%|███████▋  | 4578/5971 [41:42<12:41,  1.83it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0188, train/loss_vlb_step=7.57e-5, train/loss_step=0.0188, global_step=1580.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4579/5971 [41:43<12:40,  1.83it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0188, train/loss_vlb_step=7.57e-5, train/loss_step=0.0188, global_step=1580.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4579/5971 [41:43<12:40,  1.83it/s, loss=0.106, v_num=0, train/loss_simple_step=0.252, train/loss_vlb_step=0.00113, train/loss_step=0.252, global_step=1580.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  77%|███████▋  | 4580/5971 [41:45<12:40,  1.83it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0584, train/loss_vlb_step=0.000203, train/loss_step=0.0584, global_step=1580.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4581/5971 [41:46<12:40,  1.83it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=5.06e-5, train/loss_step=0.0105, global_step=1581.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  77%|███████▋  | 4582/5971 [41:47<12:39,  1.83it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.04e-5, train/loss_step=0.0115, global_step=1581.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4583/5971 [41:48<12:39,  1.83it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.04e-5, train/loss_step=0.0115, global_step=1581.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4583/5971 [41:48<12:39,  1.83it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0434, train/loss_vlb_step=0.000152, train/loss_step=0.0434, global_step=1581.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4584/5971 [41:50<12:39,  1.83it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0148, train/loss_vlb_step=6.07e-5, train/loss_step=0.0148, global_step=1581.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  77%|███████▋  | 4585/5971 [41:51<12:38,  1.83it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00916, train/loss_vlb_step=4.29e-5, train/loss_step=0.00916, global_step=1582.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4586/5971 [41:52<12:38,  1.83it/s, loss=0.115, v_num=0, train/loss_simple_step=0.401, train/loss_vlb_step=0.0023, train/loss_step=0.401, global_step=1582.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2:  77%|███████▋  | 4587/5971 [41:52<12:38,  1.83it/s, loss=0.115, v_num=0, train/loss_simple_step=0.401, train/loss_vlb_step=0.0023, train/loss_step=0.401, global_step=1582.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4587/5971 [41:52<12:38,  1.83it/s, loss=0.106, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000439, train/loss_step=0.129, global_step=1582.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4588/5971 [41:55<12:37,  1.82it/s, loss=0.0829, v_num=0, train/loss_simple_step=0.0367, train/loss_vlb_step=0.000136, train/loss_step=0.0367, global_step=1582.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4589/5971 [41:55<12:37,  1.82it/s, loss=0.0765, v_num=0, train/loss_simple_step=0.00956, train/loss_vlb_step=4.42e-5, train/loss_step=0.00956, global_step=1583.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4590/5971 [41:56<12:37,  1.82it/s, loss=0.0706, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=5.09e-5, train/loss_step=0.011, global_step=1583.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  77%|███████▋  | 4591/5971 [41:57<12:36,  1.82it/s, loss=0.0706, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=5.09e-5, train/loss_step=0.011, global_step=1583.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4591/5971 [41:57<12:36,  1.82it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.310, train/loss_vlb_step=0.00143, train/loss_step=0.310, global_step=1583.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4592/5971 [42:00<12:36,  1.82it/s, loss=0.101, v_num=0, train/loss_simple_step=0.546, train/loss_vlb_step=0.0103, train/loss_step=0.546, global_step=1583.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  77%|███████▋  | 4593/5971 [42:01<12:36,  1.82it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0027, train/loss_vlb_step=1.59e-5, train/loss_step=0.0027, global_step=1584.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4594/5971 [42:01<12:35,  1.82it/s, loss=0.0988, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=6.13e-5, train/loss_step=0.0143, global_step=1584.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4595/5971 [42:02<12:35,  1.82it/s, loss=0.0988, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=6.13e-5, train/loss_step=0.0143, global_step=1584.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4595/5971 [42:02<12:35,  1.82it/s, loss=0.105, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000436, train/loss_step=0.132, global_step=1584.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  77%|███████▋  | 4596/5971 [42:04<12:35,  1.82it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00169, train/loss_vlb_step=1.02e-5, train/loss_step=0.00169, global_step=1584.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4597/5971 [42:05<12:34,  1.82it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0323, train/loss_vlb_step=0.000127, train/loss_step=0.0323, global_step=1585.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  77%|███████▋  | 4598/5971 [42:06<12:34,  1.82it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00278, train/loss_vlb_step=1.54e-5, train/loss_step=0.00278, global_step=1585.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4599/5971 [42:07<12:33,  1.82it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00278, train/loss_vlb_step=1.54e-5, train/loss_step=0.00278, global_step=1585.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4599/5971 [42:07<12:33,  1.82it/s, loss=0.089, v_num=0, train/loss_simple_step=0.00249, train/loss_vlb_step=1.45e-5, train/loss_step=0.00249, global_step=1585.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4600/5971 [42:10<12:33,  1.82it/s, loss=0.0862, v_num=0, train/loss_simple_step=0.00308, train/loss_vlb_step=1.71e-5, train/loss_step=0.00308, global_step=1585.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4601/5971 [42:10<12:33,  1.82it/s, loss=0.101, v_num=0, train/loss_simple_step=0.315, train/loss_vlb_step=0.00134, train/loss_step=0.315, global_step=1586.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2:  77%|███████▋  | 4602/5971 [42:11<12:33,  1.82it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0286, train/loss_vlb_step=0.000108, train/loss_step=0.0286, global_step=1586.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4603/5971 [42:12<12:32,  1.82it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0286, train/loss_vlb_step=0.000108, train/loss_step=0.0286, global_step=1586.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4603/5971 [42:12<12:32,  1.82it/s, loss=0.1, v_num=0, train/loss_simple_step=0.00179, train/loss_vlb_step=1.03e-5, train/loss_step=0.00179, global_step=1586.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  77%|███████▋  | 4604/5971 [42:14<12:32,  1.82it/s, loss=0.119, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00181, train/loss_step=0.382, global_step=1586.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  77%|███████▋  | 4605/5971 [42:15<12:32,  1.82it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00661, train/loss_vlb_step=3.43e-5, train/loss_step=0.00661, global_step=1587.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4606/5971 [42:16<12:31,  1.82it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0673, train/loss_vlb_step=0.000231, train/loss_step=0.0673, global_step=1587.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  77%|███████▋  | 4607/5971 [42:17<12:31,  1.82it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0673, train/loss_vlb_step=0.000231, train/loss_step=0.0673, global_step=1587.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4607/5971 [42:17<12:31,  1.82it/s, loss=0.119, v_num=0, train/loss_simple_step=0.479, train/loss_vlb_step=0.00308, train/loss_step=0.479, global_step=1587.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  77%|███████▋  | 4608/5971 [42:19<12:31,  1.81it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00413, train/loss_vlb_step=2.15e-5, train/loss_step=0.00413, global_step=1587.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4609/5971 [42:20<12:30,  1.81it/s, loss=0.144, v_num=0, train/loss_simple_step=0.542, train/loss_vlb_step=0.0063, train/loss_step=0.542, global_step=1588.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2:  77%|███████▋  | 4610/5971 [42:21<12:30,  1.81it/s, loss=0.147, v_num=0, train/loss_simple_step=0.066, train/loss_vlb_step=0.000218, train/loss_step=0.066, global_step=1588.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4611/5971 [42:22<12:29,  1.81it/s, loss=0.147, v_num=0, train/loss_simple_step=0.066, train/loss_vlb_step=0.000218, train/loss_step=0.066, global_step=1588.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4611/5971 [42:22<12:29,  1.81it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0185, train/loss_vlb_step=7.73e-5, train/loss_step=0.0185, global_step=1588.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4612/5971 [42:24<12:29,  1.81it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0211, train/loss_vlb_step=8.59e-5, train/loss_step=0.0211, global_step=1588.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4613/5971 [42:25<12:29,  1.81it/s, loss=0.126, v_num=0, train/loss_simple_step=0.401, train/loss_vlb_step=0.00268, train/loss_step=0.401, global_step=1589.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  77%|███████▋  | 4614/5971 [42:26<12:28,  1.81it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00422, train/loss_vlb_step=2.21e-5, train/loss_step=0.00422, global_step=1589.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4615/5971 [42:27<12:28,  1.81it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00422, train/loss_vlb_step=2.21e-5, train/loss_step=0.00422, global_step=1589.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4615/5971 [42:27<12:28,  1.81it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00205, train/loss_vlb_step=1.18e-5, train/loss_step=0.00205, global_step=1589.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4616/5971 [42:29<12:28,  1.81it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0181, train/loss_vlb_step=7.25e-5, train/loss_step=0.0181, global_step=1589.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  77%|███████▋  | 4617/5971 [42:30<12:27,  1.81it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0725, train/loss_vlb_step=0.000249, train/loss_step=0.0725, global_step=1590.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4618/5971 [42:31<12:27,  1.81it/s, loss=0.141, v_num=0, train/loss_simple_step=0.389, train/loss_vlb_step=0.002, train/loss_step=0.389, global_step=1590.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2:  77%|███████▋  | 4619/5971 [42:32<12:26,  1.81it/s, loss=0.141, v_num=0, train/loss_simple_step=0.389, train/loss_vlb_step=0.002, train/loss_step=0.389, global_step=1590.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4619/5971 [42:32<12:26,  1.81it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.54e-5, train/loss_step=0.0125, global_step=1590.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4620/5971 [42:34<12:26,  1.81it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0049, train/loss_vlb_step=2.46e-5, train/loss_step=0.0049, global_step=1590.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4621/5971 [42:35<12:26,  1.81it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0407, train/loss_vlb_step=0.000152, train/loss_step=0.0407, global_step=1591.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4622/5971 [42:36<12:26,  1.81it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0567, train/loss_vlb_step=0.000194, train/loss_step=0.0567, global_step=1591.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4623/5971 [42:37<12:25,  1.81it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0567, train/loss_vlb_step=0.000194, train/loss_step=0.0567, global_step=1591.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4623/5971 [42:37<12:25,  1.81it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00426, train/loss_vlb_step=2.18e-5, train/loss_step=0.00426, global_step=1591.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4624/5971 [42:39<12:25,  1.81it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0275, train/loss_vlb_step=0.000105, train/loss_step=0.0275, global_step=1591.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4625/5971 [42:40<12:25,  1.81it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0302, train/loss_vlb_step=0.000107, train/loss_step=0.0302, global_step=1592.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4626/5971 [42:41<12:24,  1.81it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.98e-5, train/loss_step=0.0135, global_step=1592.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  77%|███████▋  | 4627/5971 [42:42<12:24,  1.81it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.98e-5, train/loss_step=0.0135, global_step=1592.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  77%|███████▋  | 4627/5971 [42:42<12:24,  1.81it/s, loss=0.107, v_num=0, train/loss_simple_step=0.402, train/loss_vlb_step=0.00341, train/loss_step=0.402, global_step=1592.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  78%|███████▊  | 4628/5971 [42:44<12:24,  1.81it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0191, train/loss_vlb_step=7.91e-5, train/loss_step=0.0191, global_step=1592.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  78%|███████▊  | 4629/5971 [42:45<12:23,  1.80it/s, loss=0.0847, v_num=0, train/loss_simple_step=0.0909, train/loss_vlb_step=0.000306, train/loss_step=0.0909, global_step=1593.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  78%|███████▊  | 4630/5971 [42:46<12:23,  1.80it/s, loss=0.104, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00446, train/loss_step=0.448, global_step=1593.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  78%|███████▊  | 4631/5971 [42:47<12:22,  1.80it/s, loss=0.104, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00446, train/loss_step=0.448, global_step=1593.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  78%|███████▊  | 4631/5971 [42:47<12:22,  1.80it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0395, train/loss_vlb_step=0.000136, train/loss_step=0.0395, global_step=1593.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  78%|███████▊  | 4632/5971 [42:49<12:22,  1.80it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0311, train/loss_vlb_step=0.000114, train/loss_step=0.0311, global_step=1593.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  78%|███████▊  | 4633/5971 [42:50<12:22,  1.80it/s, loss=0.0857, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.56e-5, train/loss_step=0.00731, global_step=1594.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  78%|███████▊  | 4634/5971 [42:51<12:21,  1.80it/s, loss=0.0953, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000646, train/loss_step=0.196, global_step=1594.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  78%|███████▊  | 4635/5971 [42:52<12:21,  1.80it/s, loss=0.0953, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000646, train/loss_step=0.196, global_step=1594.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  78%|███████▊  | 4635/5971 [42:52<12:21,  1.80it/s, loss=0.0969, v_num=0, train/loss_simple_step=0.0346, train/loss_vlb_step=0.000132, train/loss_step=0.0346, global_step=1594.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  78%|███████▊  | 4636/5971 [42:54<12:21,  1.80it/s, loss=0.0985, v_num=0, train/loss_simple_step=0.0508, train/loss_vlb_step=0.00018, train/loss_step=0.0508, global_step=1594.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  78%|███████▊  | 4637/5971 [42:55<12:20,  1.80it/s, loss=0.0954, v_num=0, train/loss_simple_step=0.00947, train/loss_vlb_step=4.45e-5, train/loss_step=0.00947, global_step=1595.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  78%|███████▊  | 4638/5971 [42:56<12:20,  1.80it/s, loss=0.0767, v_num=0, train/loss_simple_step=0.0145, train/loss_vlb_step=6.02e-5, train/loss_step=0.0145, global_step=1595.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  78%|███████▊  | 4639/5971 [42:56<12:19,  1.80it/s, loss=0.0767, v_num=0, train/loss_simple_step=0.0145, train/loss_vlb_step=6.02e-5, train/loss_step=0.0145, global_step=1595.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  78%|███████▊  | 4639/5971 [42:56<12:19,  1.80it/s, loss=0.084, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.00055, train/loss_step=0.160, global_step=1595.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  78%|███████▊  | 4640/5971 [42:59<12:19,  1.80it/s, loss=0.108, v_num=0, train/loss_simple_step=0.483, train/loss_vlb_step=0.00318, train/loss_step=0.483, global_step=1595.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  78%|███████▊  | 4641/5971 [43:00<12:19,  1.80it/s, loss=0.115, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.00066, train/loss_step=0.189, global_step=1596.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  78%|███████▊  | 4642/5971 [43:01<12:18,  1.80it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000411, train/loss_step=0.123, global_step=1596.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  78%|███████▊  | 4643/5971 [43:02<12:18,  1.80it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000411, train/loss_step=0.123, global_step=1596.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  78%|███████▊  | 4643/5971 [43:02<12:18,  1.80it/s, loss=0.13, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000793, train/loss_step=0.227, global_step=1596.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  78%|███████▊  | 4644/5971 [43:04<12:18,  1.80it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00419, train/loss_vlb_step=2.08e-5, train/loss_step=0.00419, global_step=1596.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  78%|███████▊  | 4645/5971 [43:05<12:17,  1.80it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.73e-5, train/loss_step=0.0127, global_step=1597.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  78%|███████▊  | 4646/5971 [43:06<12:17,  1.80it/s, loss=0.128, v_num=0, train/loss_simple_step=0.019, train/loss_vlb_step=7.77e-5, train/loss_step=0.019, global_step=1597.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  78%|███████▊  | 4647/5971 [43:06<12:16,  1.80it/s, loss=0.128, v_num=0, train/loss_simple_step=0.019, train/loss_vlb_step=7.77e-5, train/loss_step=0.019, global_step=1597.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  78%|███████▊  | 4647/5971 [43:06<12:16,  1.80it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000149, train/loss_step=0.0423, global_step=1597.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  78%|███████▊  | 4648/5971 [43:09<12:16,  1.80it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=6.32e-5, train/loss_step=0.0144, global_step=1597.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  78%|███████▊  | 4649/5971 [43:09<12:16,  1.80it/s, loss=0.109, v_num=0, train/loss_simple_step=0.073, train/loss_vlb_step=0.000246, train/loss_step=0.073, global_step=1598.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  78%|███████▊  | 4650/5971 [43:10<12:15,  1.80it/s, loss=0.0871, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=5.27e-5, train/loss_step=0.0118, global_step=1598.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  78%|███████▊  | 4651/5971 [43:11<12:15,  1.79it/s, loss=0.0871, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=5.27e-5, train/loss_step=0.0118, global_step=1598.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  78%|███████▊  | 4651/5971 [43:11<12:15,  1.79it/s, loss=0.0979, v_num=0, train/loss_simple_step=0.254, train/loss_vlb_step=0.00109, train/loss_step=0.254, global_step=1598.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  78%|███████▊  | 4652/5971 [43:13<12:15,  1.79it/s, loss=0.0964, v_num=0, train/loss_simple_step=0.00166, train/loss_vlb_step=9.87e-6, train/loss_step=0.00166, global_step=1598.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  78%|███████▊  | 4653/5971 [43:14<12:14,  1.79it/s, loss=0.122, v_num=0, train/loss_simple_step=0.515, train/loss_vlb_step=0.00492, train/loss_step=0.515, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2:  78%|███████▊  | 4654/5971 [43:15<12:14,  1.79it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0839, train/loss_vlb_step=0.000281, train/loss_step=0.0839, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  78%|███████▊  | 4655/5971 [43:16<12:13,  1.79it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0839, train/loss_vlb_step=0.000281, train/loss_step=0.0839, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  78%|███████▊  | 4655/5971 [43:16<12:13,  1.79it/s, loss=0.127, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.00102, train/loss_step=0.251, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  78%|███████▊  | 4656/5971 [43:18<12:13,  1.79it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:15,  2.19it/s][A

Validating:   1%|          | 2/167 [00:00<00:49,  3.34it/s][A
Epoch 2:  78%|███████▊  | 4659/5971 [43:19<12:11,  1.79it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   3%|▎         | 5/167 [00:00<00:18,  8.95it/s][A
Epoch 2:  78%|███████▊  | 4663/5971 [43:19<12:09,  1.79it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.26it/s][A
Epoch 2:  78%|███████▊  | 4667/5971 [43:19<12:06,  1.80it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   7%|▋         | 11/167 [00:01<00:10, 15.54it/s][A

Validating:   8%|▊         | 14/167 [00:01<00:08, 18.04it/s][A
Epoch 2:  78%|███████▊  | 4671/5971 [43:19<12:03,  1.80it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  10%|█         | 17/167 [00:01<00:07, 20.29it/s][A
Epoch 2:  78%|███████▊  | 4675/5971 [43:20<12:00,  1.80it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  12%|█▏        | 20/167 [00:01<00:07, 18.86it/s][A
Epoch 2:  78%|███████▊  | 4679/5971 [43:20<11:57,  1.80it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  14%|█▍        | 23/167 [00:01<00:10, 13.96it/s][A

Validating:  15%|█▍        | 25/167 [00:02<00:12, 11.22it/s][A
Epoch 2:  78%|███████▊  | 4683/5971 [43:20<11:55,  1.80it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  17%|█▋        | 28/167 [00:02<00:10, 13.72it/s][A
Epoch 2:  78%|███████▊  | 4687/5971 [43:21<11:52,  1.80it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  19%|█▊        | 31/167 [00:02<00:08, 15.79it/s][A

Validating:  20%|██        | 34/167 [00:02<00:07, 17.93it/s][A
Epoch 2:  79%|███████▊  | 4691/5971 [43:21<11:49,  1.80it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  22%|██▏       | 37/167 [00:02<00:06, 19.97it/s][A
Epoch 2:  79%|███████▊  | 4695/5971 [43:21<11:46,  1.81it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  24%|██▍       | 40/167 [00:02<00:06, 19.93it/s][A
Epoch 2:  79%|███████▊  | 4699/5971 [43:21<11:44,  1.81it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  26%|██▌       | 43/167 [00:03<00:08, 15.20it/s][A

Validating:  28%|██▊       | 46/167 [00:03<00:06, 17.77it/s][A
Epoch 2:  79%|███████▉  | 4703/5971 [43:21<11:41,  1.81it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  29%|██▉       | 49/167 [00:03<00:05, 19.86it/s][A
Epoch 2:  79%|███████▉  | 4707/5971 [43:22<11:38,  1.81it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  31%|███       | 52/167 [00:03<00:05, 21.40it/s][A
Epoch 2:  79%|███████▉  | 4711/5971 [43:22<11:35,  1.81it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  33%|███▎      | 55/167 [00:03<00:05, 21.12it/s][A

Validating:  35%|███▍      | 58/167 [00:03<00:04, 22.24it/s][A
Epoch 2:  79%|███████▉  | 4715/5971 [43:22<11:33,  1.81it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  37%|███▋      | 61/167 [00:03<00:04, 23.01it/s][A
Epoch 2:  79%|███████▉  | 4719/5971 [43:22<11:30,  1.81it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  38%|███▊      | 64/167 [00:03<00:04, 23.27it/s][A
Epoch 2:  79%|███████▉  | 4723/5971 [43:22<11:27,  1.82it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  40%|████      | 67/167 [00:04<00:04, 20.76it/s][A

Validating:  42%|████▏     | 70/167 [00:04<00:04, 21.27it/s][A
Epoch 2:  79%|███████▉  | 4727/5971 [43:22<11:24,  1.82it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  44%|████▎     | 73/167 [00:04<00:04, 22.67it/s][A
Epoch 2:  79%|███████▉  | 4731/5971 [43:23<11:22,  1.82it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  46%|████▌     | 76/167 [00:04<00:03, 23.92it/s][A
Epoch 2:  79%|███████▉  | 4735/5971 [43:23<11:19,  1.82it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  47%|████▋     | 79/167 [00:04<00:03, 25.13it/s][A

Validating:  49%|████▉     | 82/167 [00:04<00:03, 25.21it/s][A
Epoch 2:  79%|███████▉  | 4739/5971 [43:23<11:16,  1.82it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  51%|█████     | 85/167 [00:04<00:03, 24.97it/s][A
Epoch 2:  79%|███████▉  | 4743/5971 [43:23<11:13,  1.82it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  53%|█████▎    | 88/167 [00:04<00:03, 24.67it/s][A
Epoch 2:  80%|███████▉  | 4747/5971 [43:23<11:11,  1.82it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  54%|█████▍    | 91/167 [00:04<00:03, 25.13it/s][A

Validating:  56%|█████▋    | 94/167 [00:05<00:03, 23.58it/s][A
Epoch 2:  80%|███████▉  | 4751/5971 [43:23<11:08,  1.82it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  58%|█████▊    | 97/167 [00:05<00:03, 23.01it/s][A
Epoch 2:  80%|███████▉  | 4755/5971 [43:24<11:05,  1.83it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  60%|█████▉    | 100/167 [00:05<00:02, 24.29it/s][A
Epoch 2:  80%|███████▉  | 4759/5971 [43:24<11:03,  1.83it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  62%|██████▏   | 103/167 [00:05<00:02, 24.30it/s][A

Validating:  63%|██████▎   | 106/167 [00:05<00:02, 24.08it/s][A
Epoch 2:  80%|███████▉  | 4763/5971 [43:24<11:00,  1.83it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  65%|██████▌   | 109/167 [00:05<00:02, 24.78it/s][A
Epoch 2:  80%|███████▉  | 4767/5971 [43:24<10:57,  1.83it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  67%|██████▋   | 112/167 [00:05<00:02, 25.36it/s][A
Epoch 2:  80%|███████▉  | 4771/5971 [43:24<10:54,  1.83it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  69%|██████▉   | 115/167 [00:05<00:02, 25.99it/s][A

Validating:  71%|███████   | 118/167 [00:06<00:01, 25.69it/s][A
Epoch 2:  80%|███████▉  | 4775/5971 [43:24<10:52,  1.83it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  72%|███████▏  | 121/167 [00:06<00:01, 26.27it/s][A
Epoch 2:  80%|████████  | 4779/5971 [43:24<10:49,  1.83it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  74%|███████▍  | 124/167 [00:06<00:01, 26.15it/s][A
Epoch 2:  80%|████████  | 4783/5971 [43:25<10:46,  1.84it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  76%|███████▌  | 127/167 [00:06<00:01, 26.33it/s][A

Validating:  78%|███████▊  | 130/167 [00:06<00:01, 27.01it/s][A
Epoch 2:  80%|████████  | 4787/5971 [43:25<10:44,  1.84it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  80%|███████▉  | 133/167 [00:06<00:01, 26.99it/s][A
Epoch 2:  80%|████████  | 4791/5971 [43:25<10:41,  1.84it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  81%|████████▏ | 136/167 [00:06<00:01, 26.26it/s][A
Epoch 2:  80%|████████  | 4795/5971 [43:25<10:38,  1.84it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  83%|████████▎ | 139/167 [00:06<00:01, 24.61it/s][A

Validating:  85%|████████▌ | 142/167 [00:06<00:01, 24.44it/s][A
Epoch 2:  80%|████████  | 4799/5971 [43:25<10:36,  1.84it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  87%|████████▋ | 145/167 [00:07<00:00, 24.85it/s][A
Epoch 2:  80%|████████  | 4803/5971 [43:25<10:33,  1.84it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  89%|████████▊ | 148/167 [00:07<00:00, 25.39it/s][A
Epoch 2:  81%|████████  | 4807/5971 [43:26<10:30,  1.84it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  90%|█████████ | 151/167 [00:07<00:00, 24.68it/s][A

Validating:  92%|█████████▏| 154/167 [00:07<00:00, 25.18it/s][A
Epoch 2:  81%|████████  | 4811/5971 [43:26<10:28,  1.85it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  94%|█████████▍| 157/167 [00:07<00:00, 24.12it/s][A
Epoch 2:  81%|████████  | 4815/5971 [43:26<10:25,  1.85it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  96%|█████████▌| 160/167 [00:07<00:00, 24.28it/s][A
Epoch 2:  81%|████████  | 4819/5971 [43:26<10:22,  1.85it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  98%|█████████▊| 164/167 [00:07<00:00, 25.36it/s][A
Epoch 2:  81%|████████  | 4823/5971 [43:26<10:20,  1.85it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4824/5971 [43:27<10:19,  1.85it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.36it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.46it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.32it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.97it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.45it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.76it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.95it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.11it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.25it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.33it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.44it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.51it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.45it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.27it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.31it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.34it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.34it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.37it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.41it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.36it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.37it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.39it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.42it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.43it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.44it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.51it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.42it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.37it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.37it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.41it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.46it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.46it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.49it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.50it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.50it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.52it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.51it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.45it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.49it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.55it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.59it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.62it/s][A
Epoch 2:  81%|████████  | 4824/5971 [43:36<10:22,  1.84it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.51it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.49it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.48it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.46it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.38it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.20it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.12it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.08it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.13it/s]

Epoch 2:  81%|████████  | 4825/5971 [43:39<10:21,  1.84it/s, loss=0.127, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000188, train/loss_step=0.055, global_step=1599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4825/5971 [43:39<10:21,  1.84it/s, loss=0.132, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000361, train/loss_step=0.109, global_step=1600.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.31it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.35it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.16it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.77it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.26it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.62it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.90it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.11it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.22it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.29it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.33it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.37it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.41it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.43it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.46it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.48it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.50it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.49it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.49it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.48it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.43it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.45it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.47it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.48it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.48it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.43it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.38it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.34it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.31it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.35it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.40it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.47it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.51it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.52it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.45it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.43it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.43it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.42it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.44it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.47it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.48it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.50it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.48it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.47it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.47it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.46it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.47it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.52it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.44it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.48it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.13it/s]

Epoch 2:  81%|████████  | 4826/5971 [43:51<10:24,  1.83it/s, loss=0.132, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000361, train/loss_step=0.109, global_step=1600.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4826/5971 [43:51<10:24,  1.83it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00404, train/loss_vlb_step=2.16e-5, train/loss_step=0.00404, global_step=1600.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.30it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.36it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.19it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.79it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.27it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.65it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.92it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.12it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.27it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.36it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.40it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.39it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.32it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.17it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.12it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.06it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  4.98it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:06,  5.03it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.04it/s]

Epoch 2:  81%|████████  | 4827/5971 [44:03<10:26,  1.83it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00404, train/loss_vlb_step=2.16e-5, train/loss_step=0.00404, global_step=1600.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4827/5971 [44:03<10:26,  1.83it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.24e-5, train/loss_step=0.00216, global_step=1600.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  4.90it/s]

Epoch 2:  81%|████████  | 4828/5971 [44:18<10:29,  1.82it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.24e-5, train/loss_step=0.00216, global_step=1600.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4828/5971 [44:18<10:29,  1.82it/s, loss=0.125, v_num=0, train/loss_simple_step=0.500, train/loss_vlb_step=0.00322, train/loss_step=0.500, global_step=1600.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  81%|████████  | 4829/5971 [44:19<10:28,  1.82it/s, loss=0.125, v_num=0, train/loss_simple_step=0.500, train/loss_vlb_step=0.00322, train/loss_step=0.500, global_step=1600.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4829/5971 [44:19<10:28,  1.82it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0213, train/loss_vlb_step=9e-5, train/loss_step=0.0213, global_step=1601.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  81%|████████  | 4830/5971 [44:20<10:28,  1.82it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0213, train/loss_vlb_step=9e-5, train/loss_step=0.0213, global_step=1601.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4830/5971 [44:20<10:28,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0597, train/loss_vlb_step=0.000204, train/loss_step=0.0597, global_step=1601.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4831/5971 [44:21<10:27,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0597, train/loss_vlb_step=0.000204, train/loss_step=0.0597, global_step=1601.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4831/5971 [44:21<10:27,  1.82it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0496, train/loss_vlb_step=0.000171, train/loss_step=0.0496, global_step=1601.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4832/5971 [44:23<10:27,  1.81it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0496, train/loss_vlb_step=0.000171, train/loss_step=0.0496, global_step=1601.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4832/5971 [44:23<10:27,  1.81it/s, loss=0.106, v_num=0, train/loss_simple_step=0.044, train/loss_vlb_step=0.000165, train/loss_step=0.044, global_step=1601.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  81%|████████  | 4833/5971 [44:24<10:27,  1.81it/s, loss=0.106, v_num=0, train/loss_simple_step=0.044, train/loss_vlb_step=0.000165, train/loss_step=0.044, global_step=1601.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4833/5971 [44:24<10:27,  1.81it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.32e-5, train/loss_step=0.0122, global_step=1602.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4834/5971 [44:25<10:26,  1.81it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.32e-5, train/loss_step=0.0122, global_step=1602.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4834/5971 [44:25<10:26,  1.81it/s, loss=0.111, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000388, train/loss_step=0.118, global_step=1602.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  81%|████████  | 4835/5971 [44:26<10:26,  1.81it/s, loss=0.111, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000388, train/loss_step=0.118, global_step=1602.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4835/5971 [44:26<10:26,  1.81it/s, loss=0.124, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00119, train/loss_step=0.292, global_step=1602.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  81%|████████  | 4836/5971 [44:28<10:26,  1.81it/s, loss=0.124, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00119, train/loss_step=0.292, global_step=1602.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4836/5971 [44:28<10:26,  1.81it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0815, train/loss_vlb_step=0.000271, train/loss_step=0.0815, global_step=1602.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4837/5971 [44:29<10:25,  1.81it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0815, train/loss_vlb_step=0.000271, train/loss_step=0.0815, global_step=1602.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4837/5971 [44:29<10:25,  1.81it/s, loss=0.124, v_num=0, train/loss_simple_step=0.010, train/loss_vlb_step=4.37e-5, train/loss_step=0.010, global_step=1603.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  81%|████████  | 4838/5971 [44:30<10:25,  1.81it/s, loss=0.124, v_num=0, train/loss_simple_step=0.010, train/loss_vlb_step=4.37e-5, train/loss_step=0.010, global_step=1603.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4838/5971 [44:30<10:25,  1.81it/s, loss=0.127, v_num=0, train/loss_simple_step=0.077, train/loss_vlb_step=0.000255, train/loss_step=0.077, global_step=1603.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4839/5971 [44:31<10:24,  1.81it/s, loss=0.127, v_num=0, train/loss_simple_step=0.077, train/loss_vlb_step=0.000255, train/loss_step=0.077, global_step=1603.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4839/5971 [44:31<10:24,  1.81it/s, loss=0.128, v_num=0, train/loss_simple_step=0.263, train/loss_vlb_step=0.00115, train/loss_step=0.263, global_step=1603.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  81%|████████  | 4840/5971 [44:33<10:24,  1.81it/s, loss=0.128, v_num=0, train/loss_simple_step=0.263, train/loss_vlb_step=0.00115, train/loss_step=0.263, global_step=1603.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4840/5971 [44:33<10:24,  1.81it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0519, train/loss_vlb_step=0.000186, train/loss_step=0.0519, global_step=1603.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4841/5971 [44:34<10:24,  1.81it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0519, train/loss_vlb_step=0.000186, train/loss_step=0.0519, global_step=1603.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4841/5971 [44:34<10:24,  1.81it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0759, train/loss_vlb_step=0.000253, train/loss_step=0.0759, global_step=1604.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4842/5971 [44:35<10:23,  1.81it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0759, train/loss_vlb_step=0.000253, train/loss_step=0.0759, global_step=1604.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4842/5971 [44:35<10:23,  1.81it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0687, train/loss_vlb_step=0.000231, train/loss_step=0.0687, global_step=1604.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4843/5971 [44:36<10:23,  1.81it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0687, train/loss_vlb_step=0.000231, train/loss_step=0.0687, global_step=1604.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4843/5971 [44:36<10:23,  1.81it/s, loss=0.0959, v_num=0, train/loss_simple_step=0.0224, train/loss_vlb_step=8.59e-5, train/loss_step=0.0224, global_step=1604.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4844/5971 [44:38<10:23,  1.81it/s, loss=0.0959, v_num=0, train/loss_simple_step=0.0224, train/loss_vlb_step=8.59e-5, train/loss_step=0.0224, global_step=1604.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4844/5971 [44:38<10:23,  1.81it/s, loss=0.0933, v_num=0, train/loss_simple_step=0.00423, train/loss_vlb_step=2.25e-5, train/loss_step=0.00423, global_step=1604.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4845/5971 [44:39<10:22,  1.81it/s, loss=0.0933, v_num=0, train/loss_simple_step=0.00423, train/loss_vlb_step=2.25e-5, train/loss_step=0.00423, global_step=1604.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4845/5971 [44:39<10:22,  1.81it/s, loss=0.0957, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000535, train/loss_step=0.157, global_step=1605.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  81%|████████  | 4846/5971 [44:40<10:22,  1.81it/s, loss=0.0957, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000535, train/loss_step=0.157, global_step=1605.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4846/5971 [44:40<10:22,  1.81it/s, loss=0.0997, v_num=0, train/loss_simple_step=0.0834, train/loss_vlb_step=0.000277, train/loss_step=0.0834, global_step=1605.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4847/5971 [44:41<10:21,  1.81it/s, loss=0.0997, v_num=0, train/loss_simple_step=0.0834, train/loss_vlb_step=0.000277, train/loss_step=0.0834, global_step=1605.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4847/5971 [44:41<10:21,  1.81it/s, loss=0.106, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000421, train/loss_step=0.127, global_step=1605.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  81%|████████  | 4848/5971 [44:43<10:21,  1.81it/s, loss=0.106, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000421, train/loss_step=0.127, global_step=1605.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4848/5971 [44:43<10:21,  1.81it/s, loss=0.081, v_num=0, train/loss_simple_step=0.00141, train/loss_vlb_step=8.31e-6, train/loss_step=0.00141, global_step=1605.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4849/5971 [44:44<10:21,  1.81it/s, loss=0.081, v_num=0, train/loss_simple_step=0.00141, train/loss_vlb_step=8.31e-6, train/loss_step=0.00141, global_step=1605.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4849/5971 [44:44<10:21,  1.81it/s, loss=0.0825, v_num=0, train/loss_simple_step=0.0519, train/loss_vlb_step=0.000185, train/loss_step=0.0519, global_step=1606.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4850/5971 [44:45<10:20,  1.81it/s, loss=0.0825, v_num=0, train/loss_simple_step=0.0519, train/loss_vlb_step=0.000185, train/loss_step=0.0519, global_step=1606.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4850/5971 [44:45<10:20,  1.81it/s, loss=0.0798, v_num=0, train/loss_simple_step=0.00583, train/loss_vlb_step=2.82e-5, train/loss_step=0.00583, global_step=1606.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4851/5971 [44:46<10:20,  1.81it/s, loss=0.0798, v_num=0, train/loss_simple_step=0.00583, train/loss_vlb_step=2.82e-5, train/loss_step=0.00583, global_step=1606.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████  | 4851/5971 [44:46<10:20,  1.81it/s, loss=0.0782, v_num=0, train/loss_simple_step=0.0167, train/loss_vlb_step=6.39e-5, train/loss_step=0.0167, global_step=1606.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  81%|████████▏ | 4852/5971 [44:48<10:20,  1.80it/s, loss=0.0782, v_num=0, train/loss_simple_step=0.0167, train/loss_vlb_step=6.39e-5, train/loss_step=0.0167, global_step=1606.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████▏ | 4852/5971 [44:48<10:20,  1.80it/s, loss=0.0814, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000359, train/loss_step=0.109, global_step=1606.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  81%|████████▏ | 4853/5971 [44:49<10:19,  1.80it/s, loss=0.0814, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000359, train/loss_step=0.109, global_step=1606.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████▏ | 4853/5971 [44:49<10:19,  1.80it/s, loss=0.088, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000517, train/loss_step=0.143, global_step=1607.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  81%|████████▏ | 4854/5971 [44:50<10:19,  1.80it/s, loss=0.088, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000517, train/loss_step=0.143, global_step=1607.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████▏ | 4854/5971 [44:50<10:19,  1.80it/s, loss=0.0969, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=1607.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████▏ | 4855/5971 [44:51<10:18,  1.80it/s, loss=0.0969, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=1607.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████▏ | 4855/5971 [44:51<10:18,  1.80it/s, loss=0.0833, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.37e-5, train/loss_step=0.021, global_step=1607.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████▏ | 4856/5971 [44:53<10:18,  1.80it/s, loss=0.0833, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.37e-5, train/loss_step=0.021, global_step=1607.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████▏ | 4856/5971 [44:53<10:18,  1.80it/s, loss=0.0846, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000347, train/loss_step=0.106, global_step=1607.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████▏ | 4857/5971 [44:54<10:17,  1.80it/s, loss=0.0846, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000347, train/loss_step=0.106, global_step=1607.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████▏ | 4857/5971 [44:54<10:17,  1.80it/s, loss=0.0857, v_num=0, train/loss_simple_step=0.0327, train/loss_vlb_step=0.000122, train/loss_step=0.0327, global_step=1608.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████▏ | 4858/5971 [44:55<10:17,  1.80it/s, loss=0.0857, v_num=0, train/loss_simple_step=0.0327, train/loss_vlb_step=0.000122, train/loss_step=0.0327, global_step=1608.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████▏ | 4858/5971 [44:55<10:17,  1.80it/s, loss=0.0827, v_num=0, train/loss_simple_step=0.0164, train/loss_vlb_step=6.78e-5, train/loss_step=0.0164, global_step=1608.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  81%|████████▏ | 4859/5971 [44:56<10:16,  1.80it/s, loss=0.0827, v_num=0, train/loss_simple_step=0.0164, train/loss_vlb_step=6.78e-5, train/loss_step=0.0164, global_step=1608.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████▏ | 4859/5971 [44:56<10:16,  1.80it/s, loss=0.0709, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000114, train/loss_step=0.0285, global_step=1608.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████▏ | 4860/5971 [44:58<10:16,  1.80it/s, loss=0.0709, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000114, train/loss_step=0.0285, global_step=1608.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████▏ | 4860/5971 [44:58<10:16,  1.80it/s, loss=0.0693, v_num=0, train/loss_simple_step=0.0186, train/loss_vlb_step=7.27e-5, train/loss_step=0.0186, global_step=1608.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  81%|████████▏ | 4861/5971 [44:59<10:16,  1.80it/s, loss=0.0693, v_num=0, train/loss_simple_step=0.0186, train/loss_vlb_step=7.27e-5, train/loss_step=0.0186, global_step=1608.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████▏ | 4861/5971 [44:59<10:16,  1.80it/s, loss=0.0665, v_num=0, train/loss_simple_step=0.0204, train/loss_vlb_step=7.58e-5, train/loss_step=0.0204, global_step=1609.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████▏ | 4862/5971 [45:00<10:15,  1.80it/s, loss=0.0665, v_num=0, train/loss_simple_step=0.0204, train/loss_vlb_step=7.58e-5, train/loss_step=0.0204, global_step=1609.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████▏ | 4862/5971 [45:00<10:15,  1.80it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000398, train/loss_step=0.120, global_step=1609.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  81%|████████▏ | 4863/5971 [45:01<10:15,  1.80it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000398, train/loss_step=0.120, global_step=1609.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████▏ | 4863/5971 [45:01<10:15,  1.80it/s, loss=0.0731, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000342, train/loss_step=0.103, global_step=1609.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████▏ | 4864/5971 [45:03<10:15,  1.80it/s, loss=0.0731, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000342, train/loss_step=0.103, global_step=1609.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████▏ | 4864/5971 [45:03<10:15,  1.80it/s, loss=0.0818, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000636, train/loss_step=0.179, global_step=1609.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████▏ | 4865/5971 [45:04<10:14,  1.80it/s, loss=0.0818, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000636, train/loss_step=0.179, global_step=1609.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████▏ | 4865/5971 [45:04<10:14,  1.80it/s, loss=0.0767, v_num=0, train/loss_simple_step=0.0533, train/loss_vlb_step=0.000192, train/loss_step=0.0533, global_step=1610.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████▏ | 4866/5971 [45:05<10:14,  1.80it/s, loss=0.0767, v_num=0, train/loss_simple_step=0.0533, train/loss_vlb_step=0.000192, train/loss_step=0.0533, global_step=1610.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  81%|████████▏ | 4866/5971 [45:05<10:14,  1.80it/s, loss=0.0754, v_num=0, train/loss_simple_step=0.0586, train/loss_vlb_step=0.000206, train/loss_step=0.0586, global_step=1610.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4867/5971 [45:06<10:13,  1.80it/s, loss=0.0754, v_num=0, train/loss_simple_step=0.0586, train/loss_vlb_step=0.000206, train/loss_step=0.0586, global_step=1610.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4867/5971 [45:06<10:13,  1.80it/s, loss=0.0699, v_num=0, train/loss_simple_step=0.0159, train/loss_vlb_step=6.86e-5, train/loss_step=0.0159, global_step=1610.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  82%|████████▏ | 4868/5971 [45:08<10:13,  1.80it/s, loss=0.0699, v_num=0, train/loss_simple_step=0.0159, train/loss_vlb_step=6.86e-5, train/loss_step=0.0159, global_step=1610.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4868/5971 [45:08<10:13,  1.80it/s, loss=0.0699, v_num=0, train/loss_simple_step=0.00139, train/loss_vlb_step=8.45e-6, train/loss_step=0.00139, global_step=1610.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4869/5971 [45:09<10:13,  1.80it/s, loss=0.0699, v_num=0, train/loss_simple_step=0.00139, train/loss_vlb_step=8.45e-6, train/loss_step=0.00139, global_step=1610.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4869/5971 [45:09<10:13,  1.80it/s, loss=0.0781, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000755, train/loss_step=0.217, global_step=1611.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  82%|████████▏ | 4870/5971 [45:10<10:12,  1.80it/s, loss=0.0781, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000755, train/loss_step=0.217, global_step=1611.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4870/5971 [45:10<10:12,  1.80it/s, loss=0.0798, v_num=0, train/loss_simple_step=0.0393, train/loss_vlb_step=0.000144, train/loss_step=0.0393, global_step=1611.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4871/5971 [45:11<10:12,  1.80it/s, loss=0.0798, v_num=0, train/loss_simple_step=0.0393, train/loss_vlb_step=0.000144, train/loss_step=0.0393, global_step=1611.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4871/5971 [45:11<10:12,  1.80it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.344, train/loss_vlb_step=0.00184, train/loss_step=0.344, global_step=1611.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  82%|████████▏ | 4872/5971 [45:14<10:12,  1.80it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.344, train/loss_vlb_step=0.00184, train/loss_step=0.344, global_step=1611.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4872/5971 [45:14<10:12,  1.80it/s, loss=0.0911, v_num=0, train/loss_simple_step=0.00766, train/loss_vlb_step=3.62e-5, train/loss_step=0.00766, global_step=1611.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4873/5971 [45:15<10:11,  1.80it/s, loss=0.0911, v_num=0, train/loss_simple_step=0.00766, train/loss_vlb_step=3.62e-5, train/loss_step=0.00766, global_step=1611.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4873/5971 [45:15<10:11,  1.80it/s, loss=0.109, v_num=0, train/loss_simple_step=0.504, train/loss_vlb_step=0.0035, train/loss_step=0.504, global_step=1612.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]      
Epoch 2:  82%|████████▏ | 4874/5971 [45:16<10:11,  1.79it/s, loss=0.109, v_num=0, train/loss_simple_step=0.504, train/loss_vlb_step=0.0035, train/loss_step=0.504, global_step=1612.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4874/5971 [45:16<10:11,  1.79it/s, loss=0.0944, v_num=0, train/loss_simple_step=0.0021, train/loss_vlb_step=1.26e-5, train/loss_step=0.0021, global_step=1612.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4875/5971 [45:16<10:10,  1.79it/s, loss=0.0944, v_num=0, train/loss_simple_step=0.0021, train/loss_vlb_step=1.26e-5, train/loss_step=0.0021, global_step=1612.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4875/5971 [45:16<10:10,  1.79it/s, loss=0.0987, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000352, train/loss_step=0.106, global_step=1612.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  82%|████████▏ | 4876/5971 [45:19<10:10,  1.79it/s, loss=0.0987, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000352, train/loss_step=0.106, global_step=1612.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4876/5971 [45:19<10:10,  1.79it/s, loss=0.0935, v_num=0, train/loss_simple_step=0.00251, train/loss_vlb_step=1.35e-5, train/loss_step=0.00251, global_step=1612.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4877/5971 [45:20<10:10,  1.79it/s, loss=0.0935, v_num=0, train/loss_simple_step=0.00251, train/loss_vlb_step=1.35e-5, train/loss_step=0.00251, global_step=1612.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4877/5971 [45:20<10:10,  1.79it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.0435, train/loss_vlb_step=0.000159, train/loss_step=0.0435, global_step=1613.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  82%|████████▏ | 4878/5971 [45:20<10:09,  1.79it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.0435, train/loss_vlb_step=0.000159, train/loss_step=0.0435, global_step=1613.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4878/5971 [45:20<10:09,  1.79it/s, loss=0.0933, v_num=0, train/loss_simple_step=0.00168, train/loss_vlb_step=9.87e-6, train/loss_step=0.00168, global_step=1613.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4879/5971 [45:21<10:09,  1.79it/s, loss=0.0933, v_num=0, train/loss_simple_step=0.00168, train/loss_vlb_step=9.87e-6, train/loss_step=0.00168, global_step=1613.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4879/5971 [45:21<10:09,  1.79it/s, loss=0.0986, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000445, train/loss_step=0.134, global_step=1613.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  82%|████████▏ | 4880/5971 [45:24<10:08,  1.79it/s, loss=0.0986, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000445, train/loss_step=0.134, global_step=1613.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4880/5971 [45:24<10:08,  1.79it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0916, train/loss_vlb_step=0.000302, train/loss_step=0.0916, global_step=1613.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4881/5971 [45:25<10:08,  1.79it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0916, train/loss_vlb_step=0.000302, train/loss_step=0.0916, global_step=1613.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4881/5971 [45:25<10:08,  1.79it/s, loss=0.104, v_num=0, train/loss_simple_step=0.046, train/loss_vlb_step=0.00016, train/loss_step=0.046, global_step=1614.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  82%|████████▏ | 4882/5971 [45:26<10:07,  1.79it/s, loss=0.104, v_num=0, train/loss_simple_step=0.046, train/loss_vlb_step=0.00016, train/loss_step=0.046, global_step=1614.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4882/5971 [45:26<10:07,  1.79it/s, loss=0.102, v_num=0, train/loss_simple_step=0.093, train/loss_vlb_step=0.000306, train/loss_step=0.093, global_step=1614.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4883/5971 [45:27<10:07,  1.79it/s, loss=0.102, v_num=0, train/loss_simple_step=0.093, train/loss_vlb_step=0.000306, train/loss_step=0.093, global_step=1614.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4883/5971 [45:27<10:07,  1.79it/s, loss=0.105, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000502, train/loss_step=0.152, global_step=1614.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4884/5971 [45:29<10:07,  1.79it/s, loss=0.105, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000502, train/loss_step=0.152, global_step=1614.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4884/5971 [45:29<10:07,  1.79it/s, loss=0.0977, v_num=0, train/loss_simple_step=0.0405, train/loss_vlb_step=0.000141, train/loss_step=0.0405, global_step=1614.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4885/5971 [45:30<10:06,  1.79it/s, loss=0.0977, v_num=0, train/loss_simple_step=0.0405, train/loss_vlb_step=0.000141, train/loss_step=0.0405, global_step=1614.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4885/5971 [45:30<10:06,  1.79it/s, loss=0.0955, v_num=0, train/loss_simple_step=0.00919, train/loss_vlb_step=4.17e-5, train/loss_step=0.00919, global_step=1615.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4886/5971 [45:31<10:06,  1.79it/s, loss=0.0955, v_num=0, train/loss_simple_step=0.00919, train/loss_vlb_step=4.17e-5, train/loss_step=0.00919, global_step=1615.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4886/5971 [45:31<10:06,  1.79it/s, loss=0.0958, v_num=0, train/loss_simple_step=0.0637, train/loss_vlb_step=0.000212, train/loss_step=0.0637, global_step=1615.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  82%|████████▏ | 4887/5971 [45:32<10:05,  1.79it/s, loss=0.0958, v_num=0, train/loss_simple_step=0.0637, train/loss_vlb_step=0.000212, train/loss_step=0.0637, global_step=1615.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4887/5971 [45:32<10:05,  1.79it/s, loss=0.106, v_num=0, train/loss_simple_step=0.215, train/loss_vlb_step=0.000756, train/loss_step=0.215, global_step=1615.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  82%|████████▏ | 4888/5971 [45:34<10:05,  1.79it/s, loss=0.106, v_num=0, train/loss_simple_step=0.215, train/loss_vlb_step=0.000756, train/loss_step=0.215, global_step=1615.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4888/5971 [45:34<10:05,  1.79it/s, loss=0.123, v_num=0, train/loss_simple_step=0.337, train/loss_vlb_step=0.00156, train/loss_step=0.337, global_step=1615.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  82%|████████▏ | 4889/5971 [45:35<10:05,  1.79it/s, loss=0.123, v_num=0, train/loss_simple_step=0.337, train/loss_vlb_step=0.00156, train/loss_step=0.337, global_step=1615.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4889/5971 [45:35<10:05,  1.79it/s, loss=0.126, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.00113, train/loss_step=0.278, global_step=1616.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4890/5971 [45:36<10:04,  1.79it/s, loss=0.126, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.00113, train/loss_step=0.278, global_step=1616.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4890/5971 [45:36<10:04,  1.79it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00246, train/loss_vlb_step=1.43e-5, train/loss_step=0.00246, global_step=1616.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4891/5971 [45:37<10:04,  1.79it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00246, train/loss_vlb_step=1.43e-5, train/loss_step=0.00246, global_step=1616.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4891/5971 [45:37<10:04,  1.79it/s, loss=0.117, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000779, train/loss_step=0.211, global_step=1616.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  82%|████████▏ | 4892/5971 [45:39<10:04,  1.79it/s, loss=0.117, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000779, train/loss_step=0.211, global_step=1616.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4892/5971 [45:39<10:04,  1.79it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00213, train/loss_vlb_step=1.24e-5, train/loss_step=0.00213, global_step=1616.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4893/5971 [45:40<10:03,  1.79it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00213, train/loss_vlb_step=1.24e-5, train/loss_step=0.00213, global_step=1616.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4893/5971 [45:40<10:03,  1.79it/s, loss=0.0997, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000539, train/loss_step=0.162, global_step=1617.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  82%|████████▏ | 4894/5971 [45:41<10:03,  1.79it/s, loss=0.0997, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000539, train/loss_step=0.162, global_step=1617.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4894/5971 [45:41<10:03,  1.79it/s, loss=0.106, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000448, train/loss_step=0.136, global_step=1617.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  82%|████████▏ | 4895/5971 [45:42<10:02,  1.79it/s, loss=0.106, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000448, train/loss_step=0.136, global_step=1617.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4895/5971 [45:42<10:02,  1.79it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00992, train/loss_vlb_step=4.41e-5, train/loss_step=0.00992, global_step=1617.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4896/5971 [45:44<10:02,  1.78it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00992, train/loss_vlb_step=4.41e-5, train/loss_step=0.00992, global_step=1617.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4896/5971 [45:44<10:02,  1.78it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0186, train/loss_vlb_step=7.68e-5, train/loss_step=0.0186, global_step=1617.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  82%|████████▏ | 4897/5971 [45:45<10:01,  1.78it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0186, train/loss_vlb_step=7.68e-5, train/loss_step=0.0186, global_step=1617.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4897/5971 [45:45<10:01,  1.78it/s, loss=0.112, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.000863, train/loss_step=0.231, global_step=1618.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  82%|████████▏ | 4898/5971 [45:46<10:01,  1.78it/s, loss=0.112, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.000863, train/loss_step=0.231, global_step=1618.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4898/5971 [45:46<10:01,  1.78it/s, loss=0.138, v_num=0, train/loss_simple_step=0.532, train/loss_vlb_step=0.0058, train/loss_step=0.532, global_step=1618.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  82%|████████▏ | 4899/5971 [45:47<10:01,  1.78it/s, loss=0.138, v_num=0, train/loss_simple_step=0.532, train/loss_vlb_step=0.0058, train/loss_step=0.532, global_step=1618.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4899/5971 [45:47<10:01,  1.78it/s, loss=0.143, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000956, train/loss_step=0.228, global_step=1618.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4900/5971 [45:49<10:00,  1.78it/s, loss=0.143, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000956, train/loss_step=0.228, global_step=1618.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4900/5971 [45:49<10:00,  1.78it/s, loss=0.158, v_num=0, train/loss_simple_step=0.388, train/loss_vlb_step=0.00173, train/loss_step=0.388, global_step=1618.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  82%|████████▏ | 4901/5971 [45:50<10:00,  1.78it/s, loss=0.158, v_num=0, train/loss_simple_step=0.388, train/loss_vlb_step=0.00173, train/loss_step=0.388, global_step=1618.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4901/5971 [45:50<10:00,  1.78it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0169, train/loss_vlb_step=7.02e-5, train/loss_step=0.0169, global_step=1619.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4902/5971 [45:51<09:59,  1.78it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0169, train/loss_vlb_step=7.02e-5, train/loss_step=0.0169, global_step=1619.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4902/5971 [45:51<09:59,  1.78it/s, loss=0.153, v_num=0, train/loss_simple_step=0.027, train/loss_vlb_step=0.000102, train/loss_step=0.027, global_step=1619.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  82%|████████▏ | 4903/5971 [45:52<09:59,  1.78it/s, loss=0.153, v_num=0, train/loss_simple_step=0.027, train/loss_vlb_step=0.000102, train/loss_step=0.027, global_step=1619.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4903/5971 [45:52<09:59,  1.78it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000156, train/loss_step=0.0453, global_step=1619.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4904/5971 [45:54<09:59,  1.78it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000156, train/loss_step=0.0453, global_step=1619.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4904/5971 [45:54<09:59,  1.78it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0709, train/loss_vlb_step=0.000246, train/loss_step=0.0709, global_step=1619.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4905/5971 [45:55<09:58,  1.78it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0709, train/loss_vlb_step=0.000246, train/loss_step=0.0709, global_step=1619.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4905/5971 [45:55<09:58,  1.78it/s, loss=0.191, v_num=0, train/loss_simple_step=0.838, train/loss_vlb_step=0.0481, train/loss_step=0.838, global_step=1620.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  82%|████████▏ | 4906/5971 [45:56<09:58,  1.78it/s, loss=0.191, v_num=0, train/loss_simple_step=0.838, train/loss_vlb_step=0.0481, train/loss_step=0.838, global_step=1620.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4906/5971 [45:56<09:58,  1.78it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00566, train/loss_vlb_step=2.81e-5, train/loss_step=0.00566, global_step=1620.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4907/5971 [45:57<09:57,  1.78it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00566, train/loss_vlb_step=2.81e-5, train/loss_step=0.00566, global_step=1620.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4907/5971 [45:57<09:57,  1.78it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00326, train/loss_vlb_step=1.79e-5, train/loss_step=0.00326, global_step=1620.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4908/5971 [45:59<09:57,  1.78it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00326, train/loss_vlb_step=1.79e-5, train/loss_step=0.00326, global_step=1620.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4908/5971 [45:59<09:57,  1.78it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00278, train/loss_vlb_step=1.6e-5, train/loss_step=0.00278, global_step=1620.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  82%|████████▏ | 4909/5971 [46:00<09:57,  1.78it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00278, train/loss_vlb_step=1.6e-5, train/loss_step=0.00278, global_step=1620.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4909/5971 [46:00<09:57,  1.78it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=5.05e-5, train/loss_step=0.0113, global_step=1621.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4910/5971 [46:01<09:56,  1.78it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=5.05e-5, train/loss_step=0.0113, global_step=1621.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4910/5971 [46:01<09:56,  1.78it/s, loss=0.181, v_num=0, train/loss_simple_step=0.678, train/loss_vlb_step=0.0152, train/loss_step=0.678, global_step=1621.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  82%|████████▏ | 4911/5971 [46:02<09:56,  1.78it/s, loss=0.181, v_num=0, train/loss_simple_step=0.678, train/loss_vlb_step=0.0152, train/loss_step=0.678, global_step=1621.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4911/5971 [46:02<09:56,  1.78it/s, loss=0.186, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00181, train/loss_step=0.308, global_step=1621.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4912/5971 [46:04<09:55,  1.78it/s, loss=0.186, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00181, train/loss_step=0.308, global_step=1621.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4912/5971 [46:04<09:55,  1.78it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00213, train/loss_vlb_step=1.18e-5, train/loss_step=0.00213, global_step=1621.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4913/5971 [46:05<09:55,  1.78it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00213, train/loss_vlb_step=1.18e-5, train/loss_step=0.00213, global_step=1621.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4913/5971 [46:05<09:55,  1.78it/s, loss=0.184, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000418, train/loss_step=0.126, global_step=1622.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  82%|████████▏ | 4914/5971 [46:06<09:54,  1.78it/s, loss=0.184, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000418, train/loss_step=0.126, global_step=1622.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4914/5971 [46:06<09:54,  1.78it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00309, train/loss_vlb_step=1.73e-5, train/loss_step=0.00309, global_step=1622.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4915/5971 [46:07<09:54,  1.78it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00309, train/loss_vlb_step=1.73e-5, train/loss_step=0.00309, global_step=1622.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4915/5971 [46:07<09:54,  1.78it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0968, train/loss_vlb_step=0.000321, train/loss_step=0.0968, global_step=1622.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  82%|████████▏ | 4916/5971 [46:09<09:54,  1.78it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0968, train/loss_vlb_step=0.000321, train/loss_step=0.0968, global_step=1622.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4916/5971 [46:09<09:54,  1.78it/s, loss=0.189, v_num=0, train/loss_simple_step=0.167, train/loss_vlb_step=0.000549, train/loss_step=0.167, global_step=1622.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  82%|████████▏ | 4917/5971 [46:10<09:53,  1.78it/s, loss=0.189, v_num=0, train/loss_simple_step=0.167, train/loss_vlb_step=0.000549, train/loss_step=0.167, global_step=1622.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4917/5971 [46:10<09:53,  1.78it/s, loss=0.185, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000479, train/loss_step=0.145, global_step=1623.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4918/5971 [46:11<09:53,  1.77it/s, loss=0.185, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000479, train/loss_step=0.145, global_step=1623.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4918/5971 [46:11<09:53,  1.77it/s, loss=0.167, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000553, train/loss_step=0.168, global_step=1623.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4919/5971 [46:12<09:52,  1.77it/s, loss=0.167, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000553, train/loss_step=0.168, global_step=1623.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4919/5971 [46:12<09:52,  1.77it/s, loss=0.173, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00164, train/loss_step=0.353, global_step=1623.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  82%|████████▏ | 4920/5971 [46:14<09:52,  1.77it/s, loss=0.173, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00164, train/loss_step=0.353, global_step=1623.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4920/5971 [46:14<09:52,  1.77it/s, loss=0.163, v_num=0, train/loss_simple_step=0.195, train/loss_vlb_step=0.000731, train/loss_step=0.195, global_step=1623.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4921/5971 [46:15<09:52,  1.77it/s, loss=0.163, v_num=0, train/loss_simple_step=0.195, train/loss_vlb_step=0.000731, train/loss_step=0.195, global_step=1623.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4921/5971 [46:15<09:52,  1.77it/s, loss=0.206, v_num=0, train/loss_simple_step=0.867, train/loss_vlb_step=0.219, train/loss_step=0.867, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  82%|████████▏ | 4922/5971 [46:16<09:51,  1.77it/s, loss=0.206, v_num=0, train/loss_simple_step=0.867, train/loss_vlb_step=0.219, train/loss_step=0.867, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4922/5971 [46:16<09:51,  1.77it/s, loss=0.205, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.55e-5, train/loss_step=0.00487, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4923/5971 [46:17<09:51,  1.77it/s, loss=0.205, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.55e-5, train/loss_step=0.00487, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4923/5971 [46:17<09:51,  1.77it/s, loss=0.216, v_num=0, train/loss_simple_step=0.281, train/loss_vlb_step=0.00124, train/loss_step=0.281, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  82%|████████▏ | 4924/5971 [46:19<09:50,  1.77it/s, loss=0.216, v_num=0, train/loss_simple_step=0.281, train/loss_vlb_step=0.00124, train/loss_step=0.281, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  82%|████████▏ | 4924/5971 [46:19<09:50,  1.77it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:05,  2.52it/s]
Epoch 2:  82%|████████▏ | 4926/5971 [46:19<09:49,  1.77it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   1%|          | 2/167 [00:00<00:45,  3.66it/s]
Epoch 2:  83%|████████▎ | 4928/5971 [46:19<09:48,  1.77it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   2%|▏         | 4/167 [00:00<00:21,  7.50it/s]
Epoch 2:  83%|████████▎ | 4931/5971 [46:20<09:46,  1.77it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   4%|▍         | 7/167 [00:00<00:12, 12.62it/s]
Epoch 2:  83%|████████▎ | 4934/5971 [46:20<09:44,  1.78it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   6%|▌         | 10/167 [00:00<00:09, 15.99it/s]
Epoch 2:  83%|████████▎ | 4937/5971 [46:20<09:42,  1.78it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   8%|▊         | 13/167 [00:01<00:08, 18.27it/s]
Epoch 2:  83%|████████▎ | 4940/5971 [46:20<09:40,  1.78it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  10%|▉         | 16/167 [00:01<00:07, 19.98it/s]
Epoch 2:  83%|████████▎ | 4943/5971 [46:20<09:38,  1.78it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  11%|█▏        | 19/167 [00:01<00:06, 21.49it/s]
Epoch 2:  83%|████████▎ | 4946/5971 [46:20<09:36,  1.78it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  13%|█▎        | 22/167 [00:01<00:06, 23.39it/s]
Epoch 2:  83%|████████▎ | 4949/5971 [46:20<09:34,  1.78it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  15%|█▍        | 25/167 [00:01<00:05, 24.37it/s]
Epoch 2:  83%|████████▎ | 4952/5971 [46:20<09:32,  1.78it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  17%|█▋        | 28/167 [00:01<00:05, 25.35it/s]
Epoch 2:  83%|████████▎ | 4956/5971 [46:20<09:29,  1.78it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  19%|█▉        | 32/167 [00:01<00:04, 27.12it/s]

Validating:  21%|██        | 35/167 [00:01<00:05, 26.07it/s]
Epoch 2:  83%|████████▎ | 4960/5971 [46:21<09:26,  1.78it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  23%|██▎       | 38/167 [00:01<00:04, 26.56it/s]
Epoch 2:  83%|████████▎ | 4964/5971 [46:21<09:24,  1.79it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 27.00it/s]
Epoch 2:  83%|████████▎ | 4968/5971 [46:21<09:21,  1.79it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 27.47it/s]
Epoch 2:  83%|████████▎ | 4972/5971 [46:21<09:18,  1.79it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 28.28it/s]

Validating:  31%|███       | 51/167 [00:02<00:04, 26.15it/s]
Epoch 2:  83%|████████▎ | 4976/5971 [46:21<09:16,  1.79it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 25.64it/s]
Epoch 2:  83%|████████▎ | 4980/5971 [46:21<09:13,  1.79it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 26.28it/s]
Epoch 2:  83%|████████▎ | 4984/5971 [46:22<09:10,  1.79it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  36%|███▌      | 60/167 [00:02<00:04, 26.59it/s]

Validating:  38%|███▊      | 63/167 [00:02<00:03, 26.67it/s]
Epoch 2:  84%|████████▎ | 4988/5971 [46:22<09:08,  1.79it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  40%|████      | 67/167 [00:03<00:03, 27.58it/s]
Epoch 2:  84%|████████▎ | 4992/5971 [46:22<09:05,  1.79it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 28.67it/s][A
Epoch 2:  84%|████████▎ | 4996/5971 [46:22<09:02,  1.80it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 28.07it/s][A
Epoch 2:  84%|████████▎ | 5000/5971 [46:22<09:00,  1.80it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 28.28it/s][A
Epoch 2:  84%|████████▍ | 5004/5971 [46:22<08:57,  1.80it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 26.66it/s][A

Validating:  50%|████▉     | 83/167 [00:03<00:03, 27.25it/s][A
Epoch 2:  84%|████████▍ | 5008/5971 [46:22<08:55,  1.80it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  51%|█████▏    | 86/167 [00:03<00:02, 27.62it/s][A
Epoch 2:  84%|████████▍ | 5012/5971 [46:23<08:52,  1.80it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  53%|█████▎    | 89/167 [00:03<00:02, 27.60it/s][A
Epoch 2:  84%|████████▍ | 5016/5971 [46:23<08:49,  1.80it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  55%|█████▌    | 92/167 [00:03<00:02, 26.17it/s][A

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 26.53it/s][A
Epoch 2:  84%|████████▍ | 5020/5971 [46:23<08:47,  1.80it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 26.49it/s][A
Epoch 2:  84%|████████▍ | 5024/5971 [46:23<08:44,  1.81it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  60%|██████    | 101/167 [00:04<00:02, 27.42it/s][A
Epoch 2:  84%|████████▍ | 5028/5971 [46:23<08:41,  1.81it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 27.14it/s][A

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 27.45it/s][A
Epoch 2:  84%|████████▍ | 5032/5971 [46:23<08:39,  1.81it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 25.81it/s][A
Epoch 2:  84%|████████▍ | 5036/5971 [46:23<08:36,  1.81it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  68%|██████▊   | 114/167 [00:04<00:01, 27.45it/s][A
Epoch 2:  84%|████████▍ | 5040/5971 [46:24<08:34,  1.81it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  70%|███████   | 117/167 [00:04<00:01, 25.86it/s][A
Epoch 2:  84%|████████▍ | 5044/5971 [46:24<08:31,  1.81it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 25.91it/s][A

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 26.20it/s][A
Epoch 2:  85%|████████▍ | 5048/5971 [46:24<08:29,  1.81it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 27.37it/s][A
Epoch 2:  85%|████████▍ | 5052/5971 [46:24<08:26,  1.81it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 27.20it/s][A
Epoch 2:  85%|████████▍ | 5056/5971 [46:24<08:23,  1.82it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 27.39it/s][A
Epoch 2:  85%|████████▍ | 5060/5971 [46:24<08:21,  1.82it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 28.48it/s][A
Epoch 2:  85%|████████▍ | 5064/5971 [46:24<08:18,  1.82it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  84%|████████▍ | 140/167 [00:05<00:00, 27.81it/s][A

Validating:  86%|████████▌ | 143/167 [00:05<00:00, 28.03it/s][A
Epoch 2:  85%|████████▍ | 5068/5971 [46:25<08:16,  1.82it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  87%|████████▋ | 146/167 [00:05<00:00, 27.72it/s][A
Epoch 2:  85%|████████▍ | 5072/5971 [46:25<08:13,  1.82it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 27.69it/s][A
Epoch 2:  85%|████████▌ | 5076/5971 [46:25<08:11,  1.82it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 26.88it/s][A

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 26.05it/s][A
Epoch 2:  85%|████████▌ | 5080/5971 [46:25<08:08,  1.82it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 25.45it/s][A
Epoch 2:  85%|████████▌ | 5084/5971 [46:25<08:05,  1.83it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 24.49it/s][A
Epoch 2:  85%|████████▌ | 5088/5971 [46:25<08:03,  1.83it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 24.48it/s][A

Validating: 100%|██████████| 167/167 [00:06<00:00, 23.89it/s][A
Epoch 2:  85%|████████▌ | 5092/5971 [46:26<08:00,  1.83it/s, loss=0.23, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00166, train/loss_step=0.349, global_step=1624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  85%|████████▌ | 5093/5971 [46:27<08:00,  1.83it/s, loss=0.205, v_num=0, train/loss_simple_step=0.337, train/loss_vlb_step=0.00177, train/loss_step=0.337, global_step=1625.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  85%|████████▌ | 5094/5971 [46:28<07:59,  1.83it/s, loss=0.205, v_num=0, train/loss_simple_step=0.00417, train/loss_vlb_step=2.2e-5, train/loss_step=0.00417, global_step=1625.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  85%|████████▌ | 5095/5971 [46:29<07:59,  1.83it/s, loss=0.218, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000939, train/loss_step=0.255, global_step=1625.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  85%|████████▌ | 5096/5971 [46:31<07:59,  1.83it/s, loss=0.218, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000939, train/loss_step=0.255, global_step=1625.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  85%|████████▌ | 5096/5971 [46:31<07:59,  1.83it/s, loss=0.231, v_num=0, train/loss_simple_step=0.272, train/loss_vlb_step=0.000996, train/loss_step=0.272, global_step=1625.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  85%|████████▌ | 5097/5971 [46:32<07:58,  1.83it/s, loss=0.231, v_num=0, train/loss_simple_step=0.0117, train/loss_vlb_step=4.86e-5, train/loss_step=0.0117, global_step=1626.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  85%|████████▌ | 5098/5971 [46:33<07:58,  1.83it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0606, train/loss_vlb_step=0.000208, train/loss_step=0.0606, global_step=1626.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  85%|████████▌ | 5099/5971 [46:34<07:57,  1.82it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0304, train/loss_vlb_step=0.000114, train/loss_step=0.0304, global_step=1626.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  85%|████████▌ | 5100/5971 [46:36<07:57,  1.82it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0304, train/loss_vlb_step=0.000114, train/loss_step=0.0304, global_step=1626.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  85%|████████▌ | 5100/5971 [46:36<07:57,  1.82it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0506, train/loss_vlb_step=0.000185, train/loss_step=0.0506, global_step=1626.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  85%|████████▌ | 5101/5971 [46:37<07:57,  1.82it/s, loss=0.22, v_num=0, train/loss_simple_step=0.749, train/loss_vlb_step=0.020, train/loss_step=0.749, global_step=1627.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]      
Epoch 2:  85%|████████▌ | 5102/5971 [46:38<07:56,  1.82it/s, loss=0.221, v_num=0, train/loss_simple_step=0.0162, train/loss_vlb_step=6.15e-5, train/loss_step=0.0162, global_step=1627.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  85%|████████▌ | 5103/5971 [46:39<07:56,  1.82it/s, loss=0.223, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000516, train/loss_step=0.154, global_step=1627.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  85%|████████▌ | 5104/5971 [46:41<07:55,  1.82it/s, loss=0.223, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000516, train/loss_step=0.154, global_step=1627.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  85%|████████▌ | 5104/5971 [46:41<07:55,  1.82it/s, loss=0.239, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00469, train/loss_step=0.477, global_step=1627.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  85%|████████▌ | 5105/5971 [46:42<07:55,  1.82it/s, loss=0.234, v_num=0, train/loss_simple_step=0.0545, train/loss_vlb_step=0.000195, train/loss_step=0.0545, global_step=1628.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5106/5971 [46:43<07:54,  1.82it/s, loss=0.229, v_num=0, train/loss_simple_step=0.0553, train/loss_vlb_step=0.000184, train/loss_step=0.0553, global_step=1628.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5107/5971 [46:44<07:54,  1.82it/s, loss=0.214, v_num=0, train/loss_simple_step=0.0572, train/loss_vlb_step=0.000194, train/loss_step=0.0572, global_step=1628.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5108/5971 [46:47<07:54,  1.82it/s, loss=0.214, v_num=0, train/loss_simple_step=0.0572, train/loss_vlb_step=0.000194, train/loss_step=0.0572, global_step=1628.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5108/5971 [46:47<07:54,  1.82it/s, loss=0.212, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.000484, train/loss_step=0.146, global_step=1628.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  86%|████████▌ | 5109/5971 [46:48<07:53,  1.82it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=6.53e-5, train/loss_step=0.0156, global_step=1629.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5110/5971 [46:48<07:53,  1.82it/s, loss=0.176, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000455, train/loss_step=0.138, global_step=1629.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  86%|████████▌ | 5111/5971 [46:49<07:52,  1.82it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0309, train/loss_vlb_step=0.000117, train/loss_step=0.0309, global_step=1629.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5112/5971 [46:51<07:52,  1.82it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0309, train/loss_vlb_step=0.000117, train/loss_step=0.0309, global_step=1629.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5112/5971 [46:51<07:52,  1.82it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00292, train/loss_vlb_step=1.65e-5, train/loss_step=0.00292, global_step=1629.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5113/5971 [46:52<07:51,  1.82it/s, loss=0.151, v_num=0, train/loss_simple_step=0.430, train/loss_vlb_step=0.00265, train/loss_step=0.430, global_step=1630.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  86%|████████▌ | 5114/5971 [46:53<07:51,  1.82it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00542, train/loss_vlb_step=2.84e-5, train/loss_step=0.00542, global_step=1630.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5115/5971 [46:54<07:50,  1.82it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.73e-5, train/loss_step=0.0132, global_step=1630.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  86%|████████▌ | 5116/5971 [46:57<07:50,  1.82it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.73e-5, train/loss_step=0.0132, global_step=1630.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5116/5971 [46:57<07:50,  1.82it/s, loss=0.125, v_num=0, train/loss_simple_step=0.010, train/loss_vlb_step=4.31e-5, train/loss_step=0.010, global_step=1630.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  86%|████████▌ | 5117/5971 [46:57<07:50,  1.82it/s, loss=0.143, v_num=0, train/loss_simple_step=0.357, train/loss_vlb_step=0.00167, train/loss_step=0.357, global_step=1631.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5118/5971 [46:58<07:49,  1.82it/s, loss=0.172, v_num=0, train/loss_simple_step=0.640, train/loss_vlb_step=0.0144, train/loss_step=0.640, global_step=1631.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  86%|████████▌ | 5119/5971 [46:59<07:49,  1.82it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0233, train/loss_vlb_step=9.62e-5, train/loss_step=0.0233, global_step=1631.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5120/5971 [47:01<07:48,  1.81it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0233, train/loss_vlb_step=9.62e-5, train/loss_step=0.0233, global_step=1631.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5120/5971 [47:01<07:48,  1.81it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00588, train/loss_vlb_step=3.01e-5, train/loss_step=0.00588, global_step=1631.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5121/5971 [47:02<07:48,  1.81it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.82e-5, train/loss_step=0.0135, global_step=1632.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  86%|████████▌ | 5122/5971 [47:03<07:47,  1.81it/s, loss=0.145, v_num=0, train/loss_simple_step=0.265, train/loss_vlb_step=0.00105, train/loss_step=0.265, global_step=1632.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  86%|████████▌ | 5123/5971 [47:04<07:47,  1.81it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0155, train/loss_vlb_step=6.84e-5, train/loss_step=0.0155, global_step=1632.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5124/5971 [47:06<07:47,  1.81it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0155, train/loss_vlb_step=6.84e-5, train/loss_step=0.0155, global_step=1632.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5124/5971 [47:06<07:47,  1.81it/s, loss=0.146, v_num=0, train/loss_simple_step=0.649, train/loss_vlb_step=0.0165, train/loss_step=0.649, global_step=1632.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  86%|████████▌ | 5125/5971 [47:07<07:46,  1.81it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0222, train/loss_vlb_step=9.13e-5, train/loss_step=0.0222, global_step=1633.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5126/5971 [47:08<07:46,  1.81it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00554, train/loss_vlb_step=2.66e-5, train/loss_step=0.00554, global_step=1633.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5127/5971 [47:09<07:45,  1.81it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00377, train/loss_vlb_step=2e-5, train/loss_step=0.00377, global_step=1633.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  86%|████████▌ | 5128/5971 [47:11<07:45,  1.81it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00377, train/loss_vlb_step=2e-5, train/loss_step=0.00377, global_step=1633.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5128/5971 [47:11<07:45,  1.81it/s, loss=0.152, v_num=0, train/loss_simple_step=0.393, train/loss_vlb_step=0.0019, train/loss_step=0.393, global_step=1633.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  86%|████████▌ | 5129/5971 [47:12<07:44,  1.81it/s, loss=0.183, v_num=0, train/loss_simple_step=0.644, train/loss_vlb_step=0.0212, train/loss_step=0.644, global_step=1634.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5130/5971 [47:13<07:44,  1.81it/s, loss=0.19, v_num=0, train/loss_simple_step=0.276, train/loss_vlb_step=0.00121, train/loss_step=0.276, global_step=1634.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5131/5971 [47:14<07:43,  1.81it/s, loss=0.212, v_num=0, train/loss_simple_step=0.472, train/loss_vlb_step=0.0059, train/loss_step=0.472, global_step=1634.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5132/5971 [47:16<07:43,  1.81it/s, loss=0.212, v_num=0, train/loss_simple_step=0.472, train/loss_vlb_step=0.0059, train/loss_step=0.472, global_step=1634.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5132/5971 [47:16<07:43,  1.81it/s, loss=0.231, v_num=0, train/loss_simple_step=0.374, train/loss_vlb_step=0.00187, train/loss_step=0.374, global_step=1634.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5133/5971 [47:17<07:43,  1.81it/s, loss=0.21, v_num=0, train/loss_simple_step=0.00516, train/loss_vlb_step=2.48e-5, train/loss_step=0.00516, global_step=1635.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5134/5971 [47:18<07:42,  1.81it/s, loss=0.21, v_num=0, train/loss_simple_step=0.00912, train/loss_vlb_step=4.21e-5, train/loss_step=0.00912, global_step=1635.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5135/5971 [47:19<07:42,  1.81it/s, loss=0.217, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000513, train/loss_step=0.154, global_step=1635.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  86%|████████▌ | 5136/5971 [47:22<07:41,  1.81it/s, loss=0.217, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000513, train/loss_step=0.154, global_step=1635.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5136/5971 [47:22<07:41,  1.81it/s, loss=0.218, v_num=0, train/loss_simple_step=0.0274, train/loss_vlb_step=0.000102, train/loss_step=0.0274, global_step=1635.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5137/5971 [47:22<07:41,  1.81it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0495, train/loss_vlb_step=0.000166, train/loss_step=0.0495, global_step=1636.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5138/5971 [47:23<07:40,  1.81it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0538, train/loss_vlb_step=0.000191, train/loss_step=0.0538, global_step=1636.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5139/5971 [47:24<07:40,  1.81it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0175, train/loss_vlb_step=7.09e-5, train/loss_step=0.0175, global_step=1636.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  86%|████████▌ | 5140/5971 [47:26<07:40,  1.81it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0175, train/loss_vlb_step=7.09e-5, train/loss_step=0.0175, global_step=1636.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5140/5971 [47:26<07:40,  1.81it/s, loss=0.182, v_num=0, train/loss_simple_step=0.195, train/loss_vlb_step=0.000756, train/loss_step=0.195, global_step=1636.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  86%|████████▌ | 5141/5971 [47:27<07:39,  1.81it/s, loss=0.182, v_num=0, train/loss_simple_step=0.00208, train/loss_vlb_step=1.22e-5, train/loss_step=0.00208, global_step=1637.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5142/5971 [47:28<07:39,  1.81it/s, loss=0.181, v_num=0, train/loss_simple_step=0.257, train/loss_vlb_step=0.00118, train/loss_step=0.257, global_step=1637.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  86%|████████▌ | 5143/5971 [47:29<07:38,  1.81it/s, loss=0.193, v_num=0, train/loss_simple_step=0.259, train/loss_vlb_step=0.000888, train/loss_step=0.259, global_step=1637.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5144/5971 [47:31<07:38,  1.80it/s, loss=0.193, v_num=0, train/loss_simple_step=0.259, train/loss_vlb_step=0.000888, train/loss_step=0.259, global_step=1637.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5144/5971 [47:31<07:38,  1.80it/s, loss=0.176, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00163, train/loss_step=0.292, global_step=1637.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  86%|████████▌ | 5145/5971 [47:32<07:37,  1.80it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0216, train/loss_vlb_step=8.88e-5, train/loss_step=0.0216, global_step=1638.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5146/5971 [47:33<07:37,  1.80it/s, loss=0.189, v_num=0, train/loss_simple_step=0.271, train/loss_vlb_step=0.000982, train/loss_step=0.271, global_step=1638.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  86%|████████▌ | 5147/5971 [47:34<07:36,  1.80it/s, loss=0.189, v_num=0, train/loss_simple_step=0.00407, train/loss_vlb_step=2.24e-5, train/loss_step=0.00407, global_step=1638.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5148/5971 [47:36<07:36,  1.80it/s, loss=0.189, v_num=0, train/loss_simple_step=0.00407, train/loss_vlb_step=2.24e-5, train/loss_step=0.00407, global_step=1638.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▌ | 5148/5971 [47:36<07:36,  1.80it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0363, train/loss_vlb_step=0.000138, train/loss_step=0.0363, global_step=1638.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  86%|████████▌ | 5149/5971 [47:37<07:36,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00195, train/loss_vlb_step=1.11e-5, train/loss_step=0.00195, global_step=1639.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▋ | 5150/5971 [47:38<07:35,  1.80it/s, loss=0.13, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000337, train/loss_step=0.101, global_step=1639.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  86%|████████▋ | 5151/5971 [47:39<07:35,  1.80it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0778, train/loss_vlb_step=0.000256, train/loss_step=0.0778, global_step=1639.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▋ | 5152/5971 [47:41<07:34,  1.80it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0778, train/loss_vlb_step=0.000256, train/loss_step=0.0778, global_step=1639.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▋ | 5152/5971 [47:41<07:34,  1.80it/s, loss=0.0935, v_num=0, train/loss_simple_step=0.0343, train/loss_vlb_step=0.000123, train/loss_step=0.0343, global_step=1639.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▋ | 5153/5971 [47:42<07:34,  1.80it/s, loss=0.0998, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.00044, train/loss_step=0.131, global_step=1640.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  86%|████████▋ | 5154/5971 [47:43<07:33,  1.80it/s, loss=0.107, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.000493, train/loss_step=0.146, global_step=1640.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▋ | 5155/5971 [47:43<07:33,  1.80it/s, loss=0.0992, v_num=0, train/loss_simple_step=0.005, train/loss_vlb_step=2.69e-5, train/loss_step=0.005, global_step=1640.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▋ | 5156/5971 [47:46<07:32,  1.80it/s, loss=0.0992, v_num=0, train/loss_simple_step=0.005, train/loss_vlb_step=2.69e-5, train/loss_step=0.005, global_step=1640.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▋ | 5156/5971 [47:46<07:32,  1.80it/s, loss=0.0986, v_num=0, train/loss_simple_step=0.0137, train/loss_vlb_step=5.3e-5, train/loss_step=0.0137, global_step=1640.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▋ | 5157/5971 [47:47<07:32,  1.80it/s, loss=0.104, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000562, train/loss_step=0.164, global_step=1641.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  86%|████████▋ | 5158/5971 [47:48<07:31,  1.80it/s, loss=0.113, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000836, train/loss_step=0.228, global_step=1641.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▋ | 5159/5971 [47:48<07:31,  1.80it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0371, train/loss_vlb_step=0.000143, train/loss_step=0.0371, global_step=1641.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▋ | 5160/5971 [47:51<07:31,  1.80it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0371, train/loss_vlb_step=0.000143, train/loss_step=0.0371, global_step=1641.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▋ | 5160/5971 [47:51<07:31,  1.80it/s, loss=0.14, v_num=0, train/loss_simple_step=0.707, train/loss_vlb_step=0.0119, train/loss_step=0.707, global_step=1641.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2:  86%|████████▋ | 5161/5971 [47:51<07:30,  1.80it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0324, train/loss_vlb_step=0.000124, train/loss_step=0.0324, global_step=1642.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▋ | 5162/5971 [47:52<07:30,  1.80it/s, loss=0.153, v_num=0, train/loss_simple_step=0.496, train/loss_vlb_step=0.00425, train/loss_step=0.496, global_step=1642.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  86%|████████▋ | 5163/5971 [47:53<07:29,  1.80it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00469, train/loss_vlb_step=2.44e-5, train/loss_step=0.00469, global_step=1642.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▋ | 5164/5971 [47:56<07:29,  1.80it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00469, train/loss_vlb_step=2.44e-5, train/loss_step=0.00469, global_step=1642.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  86%|████████▋ | 5164/5971 [47:56<07:29,  1.80it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0879, train/loss_vlb_step=0.000299, train/loss_step=0.0879, global_step=1642.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  87%|████████▋ | 5165/5971 [47:57<07:28,  1.80it/s, loss=0.136, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000469, train/loss_step=0.140, global_step=1643.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  87%|████████▋ | 5166/5971 [47:58<07:28,  1.80it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0153, train/loss_vlb_step=6.62e-5, train/loss_step=0.0153, global_step=1643.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  87%|████████▋ | 5167/5971 [47:58<07:27,  1.80it/s, loss=0.151, v_num=0, train/loss_simple_step=0.551, train/loss_vlb_step=0.00745, train/loss_step=0.551, global_step=1643.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  87%|████████▋ | 5168/5971 [48:00<07:27,  1.79it/s, loss=0.151, v_num=0, train/loss_simple_step=0.551, train/loss_vlb_step=0.00745, train/loss_step=0.551, global_step=1643.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  87%|████████▋ | 5168/5971 [48:00<07:27,  1.79it/s, loss=0.157, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000514, train/loss_step=0.156, global_step=1643.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  87%|████████▋ | 5169/5971 [48:01<07:27,  1.79it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00263, train/loss_vlb_step=1.5e-5, train/loss_step=0.00263, global_step=1644.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  87%|████████▋ | 5170/5971 [48:02<07:26,  1.79it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00269, train/loss_vlb_step=1.51e-5, train/loss_step=0.00269, global_step=1644.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  87%|████████▋ | 5171/5971 [48:03<07:26,  1.79it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00206, train/loss_vlb_step=1.18e-5, train/loss_step=0.00206, global_step=1644.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  87%|████████▋ | 5172/5971 [48:05<07:25,  1.79it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00206, train/loss_vlb_step=1.18e-5, train/loss_step=0.00206, global_step=1644.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  87%|████████▋ | 5172/5971 [48:05<07:25,  1.79it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.67e-5, train/loss_step=0.0102, global_step=1644.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  87%|████████▋ | 5173/5971 [48:06<07:25,  1.79it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00346, train/loss_vlb_step=1.79e-5, train/loss_step=0.00346, global_step=1645.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  87%|████████▋ | 5174/5971 [48:07<07:24,  1.79it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0191, train/loss_vlb_step=8.42e-5, train/loss_step=0.0191, global_step=1645.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  87%|████████▋ | 5175/5971 [48:08<07:24,  1.79it/s, loss=0.15, v_num=0, train/loss_simple_step=0.326, train/loss_vlb_step=0.0014, train/loss_step=0.326, global_step=1645.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  87%|████████▋ | 5176/5971 [48:10<07:23,  1.79it/s, loss=0.15, v_num=0, train/loss_simple_step=0.326, train/loss_vlb_step=0.0014, train/loss_step=0.326, global_step=1645.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  87%|████████▋ | 5176/5971 [48:10<07:23,  1.79it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=5.65e-5, train/loss_step=0.0136, global_step=1645.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  87%|████████▋ | 5177/5971 [48:11<07:23,  1.79it/s, loss=0.158, v_num=0, train/loss_simple_step=0.325, train/loss_vlb_step=0.00181, train/loss_step=0.325, global_step=1646.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  87%|████████▋ | 5178/5971 [48:12<07:22,  1.79it/s, loss=0.186, v_num=0, train/loss_simple_step=0.795, train/loss_vlb_step=0.0345, train/loss_step=0.795, global_step=1646.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  87%|████████▋ | 5179/5971 [48:13<07:22,  1.79it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0498, train/loss_vlb_step=0.000178, train/loss_step=0.0498, global_step=1646.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  87%|████████▋ | 5180/5971 [48:15<07:22,  1.79it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0498, train/loss_vlb_step=0.000178, train/loss_step=0.0498, global_step=1646.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  87%|████████▋ | 5180/5971 [48:15<07:22,  1.79it/s, loss=0.198, v_num=0, train/loss_simple_step=0.924, train/loss_vlb_step=0.465, train/loss_step=0.924, global_step=1646.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2:  87%|████████▋ | 5181/5971 [48:16<07:21,  1.79it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0188, train/loss_vlb_step=7.9e-5, train/loss_step=0.0188, global_step=1647.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  87%|████████▋ | 5182/5971 [48:17<07:21,  1.79it/s, loss=0.196, v_num=0, train/loss_simple_step=0.465, train/loss_vlb_step=0.00353, train/loss_step=0.465, global_step=1647.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  87%|████████▋ | 5183/5971 [48:17<07:20,  1.79it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0274, train/loss_vlb_step=0.00011, train/loss_step=0.0274, global_step=1647.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  87%|████████▋ | 5184/5971 [48:20<07:20,  1.79it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0274, train/loss_vlb_step=0.00011, train/loss_step=0.0274, global_step=1647.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  87%|████████▋ | 5184/5971 [48:20<07:20,  1.79it/s, loss=0.209, v_num=0, train/loss_simple_step=0.334, train/loss_vlb_step=0.00148, train/loss_step=0.334, global_step=1647.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  87%|████████▋ | 5185/5971 [48:21<07:19,  1.79it/s, loss=0.202, v_num=0, train/loss_simple_step=0.00253, train/loss_vlb_step=1.45e-5, train/loss_step=0.00253, global_step=1648.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  87%|████████▋ | 5186/5971 [48:22<07:19,  1.79it/s, loss=0.211, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000689, train/loss_step=0.192, global_step=1648.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  87%|████████▋ | 5187/5971 [48:23<07:18,  1.79it/s, loss=0.192, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000563, train/loss_step=0.165, global_step=1648.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  87%|████████▋ | 5188/5971 [48:25<07:18,  1.79it/s, loss=0.192, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000563, train/loss_step=0.165, global_step=1648.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  87%|████████▋ | 5188/5971 [48:25<07:18,  1.79it/s, loss=0.189, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000358, train/loss_step=0.109, global_step=1648.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  87%|████████▋ | 5189/5971 [48:26<07:17,  1.79it/s, loss=0.189, v_num=0, train/loss_simple_step=0.00408, train/loss_vlb_step=2.21e-5, train/loss_step=0.00408, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  87%|████████▋ | 5190/5971 [48:27<07:17,  1.79it/s, loss=0.198, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.000576, train/loss_step=0.171, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  87%|████████▋ | 5191/5971 [48:28<07:16,  1.79it/s, loss=0.223, v_num=0, train/loss_simple_step=0.501, train/loss_vlb_step=0.00434, train/loss_step=0.501, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  87%|████████▋ | 5192/5971 [48:30<07:16,  1.78it/s, loss=0.223, v_num=0, train/loss_simple_step=0.501, train/loss_vlb_step=0.00434, train/loss_step=0.501, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  87%|████████▋ | 5192/5971 [48:30<07:16,  1.78it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:01<03:58,  1.44s/it][A

Validating:   1%|          | 2/167 [00:01<02:03,  1.34it/s][A
Epoch 2:  87%|████████▋ | 5196/5971 [48:32<07:14,  1.78it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   3%|▎         | 5/167 [00:01<00:38,  4.16it/s][A
Epoch 2:  87%|████████▋ | 5200/5971 [48:32<07:11,  1.79it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:01<00:22,  7.15it/s][A

Validating:   7%|▋         | 11/167 [00:02<00:15, 10.20it/s][A
Epoch 2:  87%|████████▋ | 5204/5971 [48:32<07:09,  1.79it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   8%|▊         | 14/167 [00:02<00:11, 12.78it/s][A
Epoch 2:  87%|████████▋ | 5208/5971 [48:32<07:06,  1.79it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  10%|█         | 17/167 [00:02<00:09, 15.19it/s][A
Epoch 2:  87%|████████▋ | 5212/5971 [48:32<07:04,  1.79it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  12%|█▏        | 20/167 [00:02<00:08, 17.87it/s][A

Validating:  14%|█▍        | 23/167 [00:02<00:07, 19.41it/s][A
Epoch 2:  87%|████████▋ | 5216/5971 [48:33<07:01,  1.79it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  16%|█▌        | 26/167 [00:02<00:06, 21.53it/s][A
Epoch 2:  87%|████████▋ | 5220/5971 [48:33<06:59,  1.79it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  18%|█▊        | 30/167 [00:02<00:05, 23.87it/s][A
Epoch 2:  87%|████████▋ | 5224/5971 [48:33<06:56,  1.79it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  20%|█▉        | 33/167 [00:02<00:05, 24.88it/s][A
Epoch 2:  88%|████████▊ | 5228/5971 [48:33<06:53,  1.79it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  22%|██▏       | 36/167 [00:03<00:05, 25.10it/s][A

Validating:  23%|██▎       | 39/167 [00:03<00:04, 25.76it/s][A
Epoch 2:  88%|████████▊ | 5232/5971 [48:33<06:51,  1.80it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  25%|██▌       | 42/167 [00:03<00:04, 26.33it/s][A
Epoch 2:  88%|████████▊ | 5236/5971 [48:33<06:48,  1.80it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  27%|██▋       | 45/167 [00:03<00:04, 26.93it/s][A
Epoch 2:  88%|████████▊ | 5240/5971 [48:33<06:46,  1.80it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  29%|██▊       | 48/167 [00:03<00:04, 26.86it/s][A

Validating:  31%|███       | 51/167 [00:03<00:04, 25.89it/s][A
Epoch 2:  88%|████████▊ | 5244/5971 [48:34<06:43,  1.80it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  32%|███▏      | 54/167 [00:03<00:04, 26.01it/s][A
Epoch 2:  88%|████████▊ | 5248/5971 [48:34<06:41,  1.80it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  34%|███▍      | 57/167 [00:03<00:04, 24.69it/s][A
Epoch 2:  88%|████████▊ | 5252/5971 [48:34<06:38,  1.80it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  36%|███▌      | 60/167 [00:03<00:04, 25.86it/s][A

Validating:  38%|███▊      | 63/167 [00:04<00:03, 26.66it/s][A
Epoch 2:  88%|████████▊ | 5256/5971 [48:34<06:36,  1.80it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  40%|███▉      | 66/167 [00:04<00:03, 26.98it/s][A
Epoch 2:  88%|████████▊ | 5260/5971 [48:34<06:33,  1.80it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  41%|████▏     | 69/167 [00:04<00:03, 27.14it/s][A
Epoch 2:  88%|████████▊ | 5264/5971 [48:34<06:31,  1.81it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  43%|████▎     | 72/167 [00:04<00:03, 25.92it/s][A

Validating:  45%|████▍     | 75/167 [00:04<00:03, 26.45it/s][A
Epoch 2:  88%|████████▊ | 5268/5971 [48:34<06:28,  1.81it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  47%|████▋     | 78/167 [00:04<00:03, 25.89it/s][A
Epoch 2:  88%|████████▊ | 5272/5971 [48:35<06:26,  1.81it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  49%|████▊     | 81/167 [00:04<00:03, 26.87it/s][A
Epoch 2:  88%|████████▊ | 5276/5971 [48:35<06:23,  1.81it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  50%|█████     | 84/167 [00:04<00:03, 26.66it/s][A

Validating:  52%|█████▏    | 87/167 [00:04<00:02, 26.86it/s][A
Epoch 2:  88%|████████▊ | 5280/5971 [48:35<06:21,  1.81it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  54%|█████▍    | 90/167 [00:05<00:02, 27.47it/s][A
Epoch 2:  88%|████████▊ | 5284/5971 [48:35<06:18,  1.81it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  56%|█████▌    | 93/167 [00:05<00:02, 27.54it/s][A
Epoch 2:  89%|████████▊ | 5288/5971 [48:35<06:16,  1.81it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  57%|█████▋    | 96/167 [00:05<00:02, 27.62it/s][A

Validating:  59%|█████▉    | 99/167 [00:05<00:02, 25.86it/s][A
Epoch 2:  89%|████████▊ | 5292/5971 [48:35<06:14,  1.82it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  61%|██████    | 102/167 [00:05<00:02, 26.50it/s][A
Epoch 2:  89%|████████▊ | 5296/5971 [48:36<06:11,  1.82it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  63%|██████▎   | 105/167 [00:05<00:02, 25.49it/s][A
Epoch 2:  89%|████████▉ | 5300/5971 [48:36<06:09,  1.82it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  65%|██████▍   | 108/167 [00:05<00:02, 26.48it/s][A

Validating:  66%|██████▋   | 111/167 [00:05<00:02, 27.20it/s][A
Epoch 2:  89%|████████▉ | 5304/5971 [48:36<06:06,  1.82it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  68%|██████▊   | 114/167 [00:05<00:01, 26.96it/s][A
Epoch 2:  89%|████████▉ | 5308/5971 [48:36<06:04,  1.82it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  70%|███████   | 117/167 [00:06<00:01, 26.58it/s][A
Epoch 2:  89%|████████▉ | 5312/5971 [48:36<06:01,  1.82it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  72%|███████▏  | 120/167 [00:06<00:01, 27.12it/s][A

Validating:  74%|███████▎  | 123/167 [00:06<00:01, 27.85it/s][A
Epoch 2:  89%|████████▉ | 5316/5971 [48:36<05:59,  1.82it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  75%|███████▌  | 126/167 [00:06<00:01, 27.48it/s][A
Epoch 2:  89%|████████▉ | 5320/5971 [48:36<05:56,  1.82it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  77%|███████▋  | 129/167 [00:06<00:01, 27.60it/s][A
Epoch 2:  89%|████████▉ | 5324/5971 [48:37<05:54,  1.83it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  79%|███████▉  | 132/167 [00:06<00:01, 27.26it/s][A

Validating:  81%|████████  | 135/167 [00:06<00:01, 26.92it/s][A
Epoch 2:  89%|████████▉ | 5328/5971 [48:37<05:51,  1.83it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  83%|████████▎ | 138/167 [00:06<00:01, 27.02it/s][A
Epoch 2:  89%|████████▉ | 5332/5971 [48:37<05:49,  1.83it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  84%|████████▍ | 141/167 [00:06<00:00, 26.43it/s][A
Epoch 2:  89%|████████▉ | 5336/5971 [48:37<05:47,  1.83it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  86%|████████▌ | 144/167 [00:07<00:00, 26.52it/s][A

Validating:  88%|████████▊ | 147/167 [00:07<00:00, 25.14it/s][A
Epoch 2:  89%|████████▉ | 5340/5971 [48:37<05:44,  1.83it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  90%|████████▉ | 150/167 [00:07<00:00, 24.80it/s][A
Epoch 2:  89%|████████▉ | 5344/5971 [48:37<05:42,  1.83it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  92%|█████████▏| 153/167 [00:07<00:00, 23.52it/s][A
Epoch 2:  90%|████████▉ | 5348/5971 [48:38<05:39,  1.83it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  93%|█████████▎| 156/167 [00:07<00:00, 23.94it/s][A

Validating:  95%|█████████▌| 159/167 [00:07<00:00, 24.58it/s][A
Epoch 2:  90%|████████▉ | 5352/5971 [48:38<05:37,  1.83it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  97%|█████████▋| 162/167 [00:07<00:00, 24.38it/s][A
Epoch 2:  90%|████████▉ | 5356/5971 [48:38<05:35,  1.84it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  99%|█████████▉| 165/167 [00:07<00:00, 24.60it/s][A
Epoch 2:  90%|████████▉ | 5360/5971 [48:38<05:32,  1.84it/s, loss=0.234, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.0011, train/loss_step=0.237, global_step=1649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Epoch 2:  90%|████████▉ | 5361/5971 [48:39<05:32,  1.84it/s, loss=0.253, v_num=0, train/loss_simple_step=0.371, train/loss_vlb_step=0.00211, train/loss_step=0.371, global_step=1650.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|████████▉ | 5362/5971 [48:40<05:31,  1.84it/s, loss=0.255, v_num=0, train/loss_simple_step=0.0628, train/loss_vlb_step=0.000209, train/loss_step=0.0628, global_step=1650.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|████████▉ | 5363/5971 [48:41<05:31,  1.84it/s, loss=0.24, v_num=0, train/loss_simple_step=0.0224, train/loss_vlb_step=8.61e-5, train/loss_step=0.0224, global_step=1650.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  90%|████████▉ | 5364/5971 [48:45<05:30,  1.83it/s, loss=0.24, v_num=0, train/loss_simple_step=0.0224, train/loss_vlb_step=8.61e-5, train/loss_step=0.0224, global_step=1650.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|████████▉ | 5364/5971 [48:45<05:30,  1.83it/s, loss=0.242, v_num=0, train/loss_simple_step=0.0687, train/loss_vlb_step=0.000228, train/loss_step=0.0687, global_step=1650.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|████████▉ | 5365/5971 [48:46<05:30,  1.83it/s, loss=0.228, v_num=0, train/loss_simple_step=0.0494, train/loss_vlb_step=0.000179, train/loss_step=0.0494, global_step=1651.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|████████▉ | 5366/5971 [48:46<05:29,  1.83it/s, loss=0.189, v_num=0, train/loss_simple_step=0.00228, train/loss_vlb_step=1.31e-5, train/loss_step=0.00228, global_step=1651.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|████████▉ | 5367/5971 [48:47<05:29,  1.83it/s, loss=0.198, v_num=0, train/loss_simple_step=0.226, train/loss_vlb_step=0.000805, train/loss_step=0.226, global_step=1651.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  90%|████████▉ | 5368/5971 [48:50<05:29,  1.83it/s, loss=0.198, v_num=0, train/loss_simple_step=0.226, train/loss_vlb_step=0.000805, train/loss_step=0.226, global_step=1651.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|████████▉ | 5368/5971 [48:50<05:29,  1.83it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00618, train/loss_vlb_step=2.8e-5, train/loss_step=0.00618, global_step=1651.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|████████▉ | 5369/5971 [48:51<05:28,  1.83it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0206, train/loss_vlb_step=8.16e-5, train/loss_step=0.0206, global_step=1652.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  90%|████████▉ | 5370/5971 [48:51<05:28,  1.83it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00759, train/loss_vlb_step=3.74e-5, train/loss_step=0.00759, global_step=1652.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|████████▉ | 5371/5971 [48:52<05:27,  1.83it/s, loss=0.15, v_num=0, train/loss_simple_step=0.439, train/loss_vlb_step=0.0045, train/loss_step=0.439, global_step=1652.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]      
Epoch 2:  90%|████████▉ | 5372/5971 [48:55<05:27,  1.83it/s, loss=0.15, v_num=0, train/loss_simple_step=0.439, train/loss_vlb_step=0.0045, train/loss_step=0.439, global_step=1652.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|████████▉ | 5372/5971 [48:55<05:27,  1.83it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0846, train/loss_vlb_step=0.000281, train/loss_step=0.0846, global_step=1652.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|████████▉ | 5373/5971 [48:56<05:26,  1.83it/s, loss=0.16, v_num=0, train/loss_simple_step=0.466, train/loss_vlb_step=0.00606, train/loss_step=0.466, global_step=1653.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  90%|█████████ | 5374/5971 [48:57<05:26,  1.83it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0335, train/loss_vlb_step=0.000126, train/loss_step=0.0335, global_step=1653.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|█████████ | 5375/5971 [48:58<05:25,  1.83it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00814, train/loss_vlb_step=3.82e-5, train/loss_step=0.00814, global_step=1653.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|█████████ | 5376/5971 [49:00<05:25,  1.83it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00814, train/loss_vlb_step=3.82e-5, train/loss_step=0.00814, global_step=1653.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|█████████ | 5376/5971 [49:00<05:25,  1.83it/s, loss=0.145, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000421, train/loss_step=0.126, global_step=1653.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  90%|█████████ | 5377/5971 [49:01<05:24,  1.83it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0591, train/loss_vlb_step=0.000212, train/loss_step=0.0591, global_step=1654.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|█████████ | 5378/5971 [49:02<05:24,  1.83it/s, loss=0.148, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000581, train/loss_step=0.160, global_step=1654.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  90%|█████████ | 5379/5971 [49:03<05:23,  1.83it/s, loss=0.133, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000741, train/loss_step=0.209, global_step=1654.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|█████████ | 5380/5971 [49:05<05:23,  1.83it/s, loss=0.133, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000741, train/loss_step=0.209, global_step=1654.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|█████████ | 5380/5971 [49:05<05:23,  1.83it/s, loss=0.128, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000441, train/loss_step=0.134, global_step=1654.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|█████████ | 5381/5971 [49:06<05:22,  1.83it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0365, train/loss_vlb_step=0.00014, train/loss_step=0.0365, global_step=1655.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|█████████ | 5382/5971 [49:07<05:22,  1.83it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0026, train/loss_vlb_step=1.51e-5, train/loss_step=0.0026, global_step=1655.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|█████████ | 5383/5971 [49:07<05:21,  1.83it/s, loss=0.123, v_num=0, train/loss_simple_step=0.312, train/loss_vlb_step=0.00155, train/loss_step=0.312, global_step=1655.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  90%|█████████ | 5384/5971 [49:10<05:21,  1.83it/s, loss=0.123, v_num=0, train/loss_simple_step=0.312, train/loss_vlb_step=0.00155, train/loss_step=0.312, global_step=1655.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|█████████ | 5384/5971 [49:10<05:21,  1.83it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0306, train/loss_vlb_step=0.000116, train/loss_step=0.0306, global_step=1655.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|█████████ | 5385/5971 [49:11<05:21,  1.83it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000326, train/loss_step=0.0991, global_step=1656.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|█████████ | 5386/5971 [49:12<05:20,  1.82it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00585, train/loss_vlb_step=3e-5, train/loss_step=0.00585, global_step=1656.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  90%|█████████ | 5387/5971 [49:12<05:20,  1.82it/s, loss=0.124, v_num=0, train/loss_simple_step=0.239, train/loss_vlb_step=0.000825, train/loss_step=0.239, global_step=1656.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|█████████ | 5388/5971 [49:15<05:19,  1.82it/s, loss=0.124, v_num=0, train/loss_simple_step=0.239, train/loss_vlb_step=0.000825, train/loss_step=0.239, global_step=1656.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|█████████ | 5388/5971 [49:15<05:19,  1.82it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00736, train/loss_vlb_step=3.71e-5, train/loss_step=0.00736, global_step=1656.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|█████████ | 5389/5971 [49:16<05:19,  1.82it/s, loss=0.13, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000477, train/loss_step=0.144, global_step=1657.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  90%|█████████ | 5390/5971 [49:16<05:18,  1.82it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0278, train/loss_vlb_step=0.000104, train/loss_step=0.0278, global_step=1657.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|█████████ | 5391/5971 [49:17<05:18,  1.82it/s, loss=0.119, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000761, train/loss_step=0.194, global_step=1657.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  90%|█████████ | 5392/5971 [49:20<05:17,  1.82it/s, loss=0.119, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000761, train/loss_step=0.194, global_step=1657.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|█████████ | 5392/5971 [49:20<05:17,  1.82it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0328, train/loss_vlb_step=0.000124, train/loss_step=0.0328, global_step=1657.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|█████████ | 5393/5971 [49:21<05:17,  1.82it/s, loss=0.103, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000697, train/loss_step=0.202, global_step=1658.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  90%|█████████ | 5394/5971 [49:21<05:16,  1.82it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00407, train/loss_vlb_step=2.1e-5, train/loss_step=0.00407, global_step=1658.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|█████████ | 5395/5971 [49:22<05:16,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.000891, train/loss_step=0.236, global_step=1658.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  90%|█████████ | 5396/5971 [49:25<05:15,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.000891, train/loss_step=0.236, global_step=1658.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|█████████ | 5396/5971 [49:25<05:15,  1.82it/s, loss=0.122, v_num=0, train/loss_simple_step=0.303, train/loss_vlb_step=0.00135, train/loss_step=0.303, global_step=1658.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  90%|█████████ | 5397/5971 [49:25<05:15,  1.82it/s, loss=0.151, v_num=0, train/loss_simple_step=0.648, train/loss_vlb_step=0.0146, train/loss_step=0.648, global_step=1659.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  90%|█████████ | 5398/5971 [49:26<05:14,  1.82it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0254, train/loss_vlb_step=0.0001, train/loss_step=0.0254, global_step=1659.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|█████████ | 5399/5971 [49:27<05:14,  1.82it/s, loss=0.163, v_num=0, train/loss_simple_step=0.578, train/loss_vlb_step=0.008, train/loss_step=0.578, global_step=1659.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  90%|█████████ | 5400/5971 [49:29<05:13,  1.82it/s, loss=0.163, v_num=0, train/loss_simple_step=0.578, train/loss_vlb_step=0.008, train/loss_step=0.578, global_step=1659.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|█████████ | 5400/5971 [49:29<05:13,  1.82it/s, loss=0.198, v_num=0, train/loss_simple_step=0.823, train/loss_vlb_step=0.0331, train/loss_step=0.823, global_step=1659.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|█████████ | 5401/5971 [49:30<05:13,  1.82it/s, loss=0.197, v_num=0, train/loss_simple_step=0.032, train/loss_vlb_step=0.000121, train/loss_step=0.032, global_step=1660.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|█████████ | 5402/5971 [49:31<05:12,  1.82it/s, loss=0.199, v_num=0, train/loss_simple_step=0.0389, train/loss_vlb_step=0.000143, train/loss_step=0.0389, global_step=1660.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  90%|█████████ | 5403/5971 [49:32<05:12,  1.82it/s, loss=0.184, v_num=0, train/loss_simple_step=0.00839, train/loss_vlb_step=4.15e-5, train/loss_step=0.00839, global_step=1660.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5404/5971 [49:34<05:12,  1.82it/s, loss=0.184, v_num=0, train/loss_simple_step=0.00839, train/loss_vlb_step=4.15e-5, train/loss_step=0.00839, global_step=1660.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5404/5971 [49:34<05:12,  1.82it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0226, train/loss_vlb_step=9.05e-5, train/loss_step=0.0226, global_step=1660.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  91%|█████████ | 5405/5971 [49:35<05:11,  1.82it/s, loss=0.189, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.000731, train/loss_step=0.210, global_step=1661.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  91%|█████████ | 5406/5971 [49:36<05:11,  1.82it/s, loss=0.195, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000417, train/loss_step=0.127, global_step=1661.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5407/5971 [49:37<05:10,  1.82it/s, loss=0.184, v_num=0, train/loss_simple_step=0.00722, train/loss_vlb_step=3.5e-5, train/loss_step=0.00722, global_step=1661.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5408/5971 [49:39<05:10,  1.82it/s, loss=0.184, v_num=0, train/loss_simple_step=0.00722, train/loss_vlb_step=3.5e-5, train/loss_step=0.00722, global_step=1661.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5408/5971 [49:39<05:10,  1.82it/s, loss=0.193, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000637, train/loss_step=0.190, global_step=1661.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  91%|█████████ | 5409/5971 [49:40<05:09,  1.82it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00117, train/loss_vlb_step=7.1e-6, train/loss_step=0.00117, global_step=1662.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5410/5971 [49:41<05:09,  1.81it/s, loss=0.191, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000441, train/loss_step=0.134, global_step=1662.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  91%|█████████ | 5411/5971 [49:42<05:08,  1.81it/s, loss=0.187, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000364, train/loss_step=0.111, global_step=1662.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5412/5971 [49:44<05:08,  1.81it/s, loss=0.187, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000364, train/loss_step=0.111, global_step=1662.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5412/5971 [49:44<05:08,  1.81it/s, loss=0.193, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.00051, train/loss_step=0.151, global_step=1662.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  91%|█████████ | 5413/5971 [49:45<05:07,  1.81it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00173, train/loss_vlb_step=1.02e-5, train/loss_step=0.00173, global_step=1663.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5414/5971 [49:46<05:07,  1.81it/s, loss=0.186, v_num=0, train/loss_simple_step=0.077, train/loss_vlb_step=0.000259, train/loss_step=0.077, global_step=1663.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  91%|█████████ | 5415/5971 [49:47<05:06,  1.81it/s, loss=0.175, v_num=0, train/loss_simple_step=0.00269, train/loss_vlb_step=1.47e-5, train/loss_step=0.00269, global_step=1663.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5416/5971 [49:49<05:06,  1.81it/s, loss=0.175, v_num=0, train/loss_simple_step=0.00269, train/loss_vlb_step=1.47e-5, train/loss_step=0.00269, global_step=1663.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5416/5971 [49:49<05:06,  1.81it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0972, train/loss_vlb_step=0.000321, train/loss_step=0.0972, global_step=1663.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  91%|█████████ | 5417/5971 [49:50<05:05,  1.81it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0479, train/loss_vlb_step=0.000158, train/loss_step=0.0479, global_step=1664.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5418/5971 [49:51<05:05,  1.81it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0486, train/loss_vlb_step=0.000177, train/loss_step=0.0486, global_step=1664.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5419/5971 [49:52<05:04,  1.81it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0266, train/loss_vlb_step=9.6e-5, train/loss_step=0.0266, global_step=1664.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  91%|█████████ | 5420/5971 [49:54<05:04,  1.81it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0266, train/loss_vlb_step=9.6e-5, train/loss_step=0.0266, global_step=1664.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5420/5971 [49:54<05:04,  1.81it/s, loss=0.0694, v_num=0, train/loss_simple_step=0.0524, train/loss_vlb_step=0.000182, train/loss_step=0.0524, global_step=1664.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5421/5971 [49:55<05:03,  1.81it/s, loss=0.0731, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000354, train/loss_step=0.107, global_step=1665.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  91%|█████████ | 5422/5971 [49:56<05:03,  1.81it/s, loss=0.0713, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.3e-5, train/loss_step=0.00225, global_step=1665.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5423/5971 [49:57<05:02,  1.81it/s, loss=0.0732, v_num=0, train/loss_simple_step=0.0476, train/loss_vlb_step=0.00017, train/loss_step=0.0476, global_step=1665.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  91%|█████████ | 5424/5971 [50:00<05:02,  1.81it/s, loss=0.0732, v_num=0, train/loss_simple_step=0.0476, train/loss_vlb_step=0.00017, train/loss_step=0.0476, global_step=1665.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5424/5971 [50:00<05:02,  1.81it/s, loss=0.0725, v_num=0, train/loss_simple_step=0.00783, train/loss_vlb_step=3.58e-5, train/loss_step=0.00783, global_step=1665.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5425/5971 [50:01<05:02,  1.81it/s, loss=0.063, v_num=0, train/loss_simple_step=0.0191, train/loss_vlb_step=7.49e-5, train/loss_step=0.0191, global_step=1666.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  91%|█████████ | 5426/5971 [50:02<05:01,  1.81it/s, loss=0.0655, v_num=0, train/loss_simple_step=0.178, train/loss_vlb_step=0.00064, train/loss_step=0.178, global_step=1666.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  91%|█████████ | 5427/5971 [50:02<05:00,  1.81it/s, loss=0.0653, v_num=0, train/loss_simple_step=0.00425, train/loss_vlb_step=2.14e-5, train/loss_step=0.00425, global_step=1666.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5428/5971 [50:05<05:00,  1.81it/s, loss=0.0653, v_num=0, train/loss_simple_step=0.00425, train/loss_vlb_step=2.14e-5, train/loss_step=0.00425, global_step=1666.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5428/5971 [50:05<05:00,  1.81it/s, loss=0.0606, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=1666.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  91%|█████████ | 5429/5971 [50:06<05:00,  1.81it/s, loss=0.0734, v_num=0, train/loss_simple_step=0.257, train/loss_vlb_step=0.00101, train/loss_step=0.257, global_step=1667.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  91%|█████████ | 5430/5971 [50:07<04:59,  1.81it/s, loss=0.0837, v_num=0, train/loss_simple_step=0.340, train/loss_vlb_step=0.00164, train/loss_step=0.340, global_step=1667.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5431/5971 [50:08<04:59,  1.81it/s, loss=0.104, v_num=0, train/loss_simple_step=0.512, train/loss_vlb_step=0.00503, train/loss_step=0.512, global_step=1667.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  91%|█████████ | 5432/5971 [50:10<04:58,  1.80it/s, loss=0.104, v_num=0, train/loss_simple_step=0.512, train/loss_vlb_step=0.00503, train/loss_step=0.512, global_step=1667.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5432/5971 [50:10<04:58,  1.80it/s, loss=0.0967, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.57e-5, train/loss_step=0.0105, global_step=1667.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5433/5971 [50:11<04:58,  1.80it/s, loss=0.13, v_num=0, train/loss_simple_step=0.673, train/loss_vlb_step=0.0107, train/loss_step=0.673, global_step=1668.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2:  91%|█████████ | 5434/5971 [50:12<04:57,  1.80it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00831, train/loss_vlb_step=3.89e-5, train/loss_step=0.00831, global_step=1668.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5435/5971 [50:13<04:57,  1.80it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0402, train/loss_vlb_step=0.000153, train/loss_step=0.0402, global_step=1668.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  91%|█████████ | 5436/5971 [50:15<04:56,  1.80it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0402, train/loss_vlb_step=0.000153, train/loss_step=0.0402, global_step=1668.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5436/5971 [50:15<04:56,  1.80it/s, loss=0.151, v_num=0, train/loss_simple_step=0.549, train/loss_vlb_step=0.00487, train/loss_step=0.549, global_step=1668.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  91%|█████████ | 5437/5971 [50:16<04:56,  1.80it/s, loss=0.171, v_num=0, train/loss_simple_step=0.436, train/loss_vlb_step=0.0028, train/loss_step=0.436, global_step=1669.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  91%|█████████ | 5438/5971 [50:17<04:55,  1.80it/s, loss=0.184, v_num=0, train/loss_simple_step=0.306, train/loss_vlb_step=0.00152, train/loss_step=0.306, global_step=1669.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5439/5971 [50:18<04:55,  1.80it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0314, train/loss_vlb_step=0.000124, train/loss_step=0.0314, global_step=1669.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5440/5971 [50:20<04:54,  1.80it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0314, train/loss_vlb_step=0.000124, train/loss_step=0.0314, global_step=1669.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5440/5971 [50:20<04:54,  1.80it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0026, train/loss_vlb_step=1.53e-5, train/loss_step=0.0026, global_step=1669.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  91%|█████████ | 5441/5971 [50:21<04:54,  1.80it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0061, train/loss_vlb_step=3.05e-5, train/loss_step=0.0061, global_step=1670.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5442/5971 [50:22<04:53,  1.80it/s, loss=0.189, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.000927, train/loss_step=0.262, global_step=1670.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  91%|█████████ | 5443/5971 [50:23<04:53,  1.80it/s, loss=0.193, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000375, train/loss_step=0.114, global_step=1670.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5444/5971 [50:25<04:52,  1.80it/s, loss=0.193, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000375, train/loss_step=0.114, global_step=1670.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5444/5971 [50:25<04:52,  1.80it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00106, train/loss_vlb_step=6.35e-6, train/loss_step=0.00106, global_step=1670.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5445/5971 [50:26<04:52,  1.80it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0313, train/loss_vlb_step=0.000115, train/loss_step=0.0313, global_step=1671.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  91%|█████████ | 5446/5971 [50:27<04:51,  1.80it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0516, train/loss_vlb_step=0.000186, train/loss_step=0.0516, global_step=1671.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5447/5971 [50:28<04:51,  1.80it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00398, train/loss_vlb_step=2e-5, train/loss_step=0.00398, global_step=1671.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  91%|█████████ | 5448/5971 [50:30<04:50,  1.80it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00398, train/loss_vlb_step=2e-5, train/loss_step=0.00398, global_step=1671.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████ | 5448/5971 [50:30<04:50,  1.80it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0385, train/loss_vlb_step=0.000134, train/loss_step=0.0385, global_step=1671.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████▏| 5449/5971 [50:31<04:50,  1.80it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0087, train/loss_vlb_step=3.96e-5, train/loss_step=0.0087, global_step=1672.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  91%|█████████▏| 5450/5971 [50:32<04:49,  1.80it/s, loss=0.177, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00243, train/loss_step=0.448, global_step=1672.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  91%|█████████▏| 5451/5971 [50:33<04:49,  1.80it/s, loss=0.163, v_num=0, train/loss_simple_step=0.244, train/loss_vlb_step=0.00104, train/loss_step=0.244, global_step=1672.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████▏| 5452/5971 [50:35<04:48,  1.80it/s, loss=0.163, v_num=0, train/loss_simple_step=0.244, train/loss_vlb_step=0.00104, train/loss_step=0.244, global_step=1672.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████▏| 5452/5971 [50:35<04:48,  1.80it/s, loss=0.173, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000801, train/loss_step=0.209, global_step=1672.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████▏| 5453/5971 [50:36<04:48,  1.80it/s, loss=0.149, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000696, train/loss_step=0.192, global_step=1673.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████▏| 5454/5971 [50:37<04:47,  1.80it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=6.02e-5, train/loss_step=0.0143, global_step=1673.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████▏| 5455/5971 [50:38<04:47,  1.80it/s, loss=0.158, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000721, train/loss_step=0.211, global_step=1673.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  91%|█████████▏| 5456/5971 [50:40<04:46,  1.79it/s, loss=0.158, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000721, train/loss_step=0.211, global_step=1673.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████▏| 5456/5971 [50:40<04:46,  1.79it/s, loss=0.139, v_num=0, train/loss_simple_step=0.169, train/loss_vlb_step=0.000577, train/loss_step=0.169, global_step=1673.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████▏| 5457/5971 [50:41<04:46,  1.79it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0397, train/loss_vlb_step=0.000133, train/loss_step=0.0397, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████▏| 5458/5971 [50:42<04:45,  1.79it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00503, train/loss_vlb_step=2.51e-5, train/loss_step=0.00503, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████▏| 5459/5971 [50:42<04:45,  1.79it/s, loss=0.113, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000681, train/loss_step=0.202, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  91%|█████████▏| 5460/5971 [50:45<04:44,  1.79it/s, loss=0.113, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000681, train/loss_step=0.202, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  91%|█████████▏| 5460/5971 [50:45<04:44,  1.79it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:03,  2.61it/s]

Validating:   1%|          | 2/167 [00:00<00:44,  3.71it/s]
Epoch 2:  92%|█████████▏| 5464/5971 [50:45<04:42,  1.79it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.48it/s]
Epoch 2:  92%|█████████▏| 5468/5971 [50:45<04:40,  1.80it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.94it/s]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.84it/s]
Epoch 2:  92%|█████████▏| 5472/5971 [50:46<04:37,  1.80it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.55it/s]
Epoch 2:  92%|█████████▏| 5476/5971 [50:46<04:35,  1.80it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  10%|█         | 17/167 [00:01<00:07, 21.04it/s]
Epoch 2:  92%|█████████▏| 5480/5971 [50:46<04:32,  1.80it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 24.29it/s]
Epoch 2:  92%|█████████▏| 5484/5971 [50:46<04:30,  1.80it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  14%|█▍        | 24/167 [00:01<00:05, 25.15it/s]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 25.96it/s]
Epoch 2:  92%|█████████▏| 5488/5971 [50:46<04:28,  1.80it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 26.00it/s]
Epoch 2:  92%|█████████▏| 5492/5971 [50:46<04:25,  1.80it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 24.50it/s]
Epoch 2:  92%|█████████▏| 5496/5971 [50:47<04:23,  1.80it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  22%|██▏       | 36/167 [00:01<00:05, 25.46it/s]

Validating:  23%|██▎       | 39/167 [00:01<00:04, 25.91it/s]
Epoch 2:  92%|█████████▏| 5500/5971 [50:47<04:20,  1.81it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 25.95it/s]
Epoch 2:  92%|█████████▏| 5504/5971 [50:47<04:18,  1.81it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 25.44it/s][A
Epoch 2:  92%|█████████▏| 5508/5971 [50:47<04:16,  1.81it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 24.26it/s][A

Validating:  31%|███       | 51/167 [00:02<00:04, 24.76it/s][A
Epoch 2:  92%|█████████▏| 5512/5971 [50:47<04:13,  1.81it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 25.06it/s][A
Epoch 2:  92%|█████████▏| 5516/5971 [50:47<04:11,  1.81it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 24.63it/s][A
Epoch 2:  92%|█████████▏| 5520/5971 [50:47<04:08,  1.81it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  36%|███▌      | 60/167 [00:02<00:04, 25.19it/s][A

Validating:  38%|███▊      | 63/167 [00:02<00:04, 25.26it/s][A
Epoch 2:  93%|█████████▎| 5524/5971 [50:48<04:06,  1.81it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  40%|███▉      | 66/167 [00:03<00:03, 25.62it/s][A
Epoch 2:  93%|█████████▎| 5528/5971 [50:48<04:04,  1.81it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 25.39it/s][A
Epoch 2:  93%|█████████▎| 5532/5971 [50:48<04:01,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 25.90it/s][A

Validating:  45%|████▍     | 75/167 [00:03<00:03, 25.94it/s][A
Epoch 2:  93%|█████████▎| 5536/5971 [50:48<03:59,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 25.96it/s][A
Epoch 2:  93%|█████████▎| 5540/5971 [50:48<03:57,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 25.48it/s][A
Epoch 2:  93%|█████████▎| 5544/5971 [50:48<03:54,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  50%|█████     | 84/167 [00:03<00:03, 24.46it/s][A

Validating:  52%|█████▏    | 87/167 [00:03<00:03, 25.60it/s][A
Epoch 2:  93%|█████████▎| 5548/5971 [50:49<03:52,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  54%|█████▍    | 90/167 [00:04<00:02, 25.68it/s][A
Epoch 2:  93%|█████████▎| 5552/5971 [50:49<03:50,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 25.78it/s][A
Epoch 2:  93%|█████████▎| 5556/5971 [50:49<03:47,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 26.20it/s][A

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 26.40it/s][A
Epoch 2:  93%|█████████▎| 5560/5971 [50:49<03:45,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 27.70it/s][A
Epoch 2:  93%|█████████▎| 5564/5971 [50:49<03:43,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 26.30it/s][A
Epoch 2:  93%|█████████▎| 5568/5971 [50:49<03:40,  1.83it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 26.87it/s][A
Epoch 2:  93%|█████████▎| 5572/5971 [50:49<03:38,  1.83it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  67%|██████▋   | 112/167 [00:04<00:02, 26.92it/s][A

Validating:  69%|██████▉   | 115/167 [00:04<00:01, 27.09it/s][A
Epoch 2:  93%|█████████▎| 5576/5971 [50:50<03:36,  1.83it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  71%|███████   | 118/167 [00:05<00:01, 25.61it/s][A
Epoch 2:  93%|█████████▎| 5580/5971 [50:50<03:33,  1.83it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 25.25it/s][A
Epoch 2:  94%|█████████▎| 5584/5971 [50:50<03:31,  1.83it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 25.20it/s][A

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 25.30it/s][A
Epoch 2:  94%|█████████▎| 5588/5971 [50:50<03:29,  1.83it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 25.76it/s][A
Epoch 2:  94%|█████████▎| 5592/5971 [50:50<03:26,  1.83it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 26.03it/s][A
Epoch 2:  94%|█████████▎| 5596/5971 [50:50<03:24,  1.83it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 24.11it/s][A

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 25.34it/s][A
Epoch 2:  94%|█████████▍| 5600/5971 [50:51<03:22,  1.84it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  85%|████████▌ | 142/167 [00:06<00:00, 25.49it/s][A
Epoch 2:  94%|█████████▍| 5604/5971 [50:51<03:19,  1.84it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 25.58it/s][A
Epoch 2:  94%|█████████▍| 5608/5971 [50:51<03:17,  1.84it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 25.01it/s][A

Validating:  90%|█████████ | 151/167 [00:06<00:00, 26.11it/s][A
Epoch 2:  94%|█████████▍| 5612/5971 [50:51<03:15,  1.84it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 26.40it/s][A
Epoch 2:  94%|█████████▍| 5616/5971 [50:51<03:12,  1.84it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 24.94it/s][A
Epoch 2:  94%|█████████▍| 5620/5971 [50:51<03:10,  1.84it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 25.66it/s][A

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 26.21it/s][A
Epoch 2:  94%|█████████▍| 5624/5971 [50:51<03:08,  1.84it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 27.03it/s][A
Epoch 2:  94%|█████████▍| 5628/5971 [50:52<03:05,  1.84it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.94e-5, train/loss_step=0.00565, global_step=1674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  94%|█████████▍| 5629/5971 [50:53<03:05,  1.84it/s, loss=0.141, v_num=0, train/loss_simple_step=0.576, train/loss_vlb_step=0.00764, train/loss_step=0.576, global_step=1675.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  94%|█████████▍| 5630/5971 [50:54<03:04,  1.84it/s, loss=0.137, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000609, train/loss_step=0.174, global_step=1675.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  94%|█████████▍| 5631/5971 [50:55<03:04,  1.84it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0562, train/loss_vlb_step=0.000195, train/loss_step=0.0562, global_step=1675.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  94%|█████████▍| 5632/5971 [50:58<03:04,  1.84it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0562, train/loss_vlb_step=0.000195, train/loss_step=0.0562, global_step=1675.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  94%|█████████▍| 5632/5971 [50:58<03:04,  1.84it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0196, train/loss_vlb_step=7.42e-5, train/loss_step=0.0196, global_step=1675.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  94%|█████████▍| 5633/5971 [50:59<03:03,  1.84it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0628, train/loss_vlb_step=0.000209, train/loss_step=0.0628, global_step=1676.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  94%|█████████▍| 5634/5971 [51:00<03:03,  1.84it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00553, train/loss_vlb_step=2.66e-5, train/loss_step=0.00553, global_step=1676.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  94%|█████████▍| 5635/5971 [51:01<03:02,  1.84it/s, loss=0.139, v_num=0, train/loss_simple_step=0.094, train/loss_vlb_step=0.000313, train/loss_step=0.094, global_step=1676.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  94%|█████████▍| 5636/5971 [51:03<03:02,  1.84it/s, loss=0.139, v_num=0, train/loss_simple_step=0.094, train/loss_vlb_step=0.000313, train/loss_step=0.094, global_step=1676.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  94%|█████████▍| 5636/5971 [51:03<03:02,  1.84it/s, loss=0.141, v_num=0, train/loss_simple_step=0.078, train/loss_vlb_step=0.000263, train/loss_step=0.078, global_step=1676.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  94%|█████████▍| 5637/5971 [51:04<03:01,  1.84it/s, loss=0.174, v_num=0, train/loss_simple_step=0.676, train/loss_vlb_step=0.0152, train/loss_step=0.676, global_step=1677.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  94%|█████████▍| 5638/5971 [51:05<03:01,  1.84it/s, loss=0.162, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.000715, train/loss_step=0.206, global_step=1677.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  94%|█████████▍| 5639/5971 [51:06<03:00,  1.84it/s, loss=0.166, v_num=0, train/loss_simple_step=0.314, train/loss_vlb_step=0.00162, train/loss_step=0.314, global_step=1677.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  94%|█████████▍| 5640/5971 [51:08<03:00,  1.84it/s, loss=0.166, v_num=0, train/loss_simple_step=0.314, train/loss_vlb_step=0.00162, train/loss_step=0.314, global_step=1677.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  94%|█████████▍| 5640/5971 [51:08<03:00,  1.84it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0074, train/loss_vlb_step=3.5e-5, train/loss_step=0.0074, global_step=1677.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  94%|█████████▍| 5641/5971 [51:09<02:59,  1.84it/s, loss=0.158, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.000886, train/loss_step=0.251, global_step=1678.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  94%|█████████▍| 5642/5971 [51:10<02:59,  1.84it/s, loss=0.179, v_num=0, train/loss_simple_step=0.425, train/loss_vlb_step=0.00268, train/loss_step=0.425, global_step=1678.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  95%|█████████▍| 5643/5971 [51:11<02:58,  1.84it/s, loss=0.174, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000364, train/loss_step=0.110, global_step=1678.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▍| 5644/5971 [51:13<02:58,  1.84it/s, loss=0.174, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000364, train/loss_step=0.110, global_step=1678.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▍| 5644/5971 [51:13<02:58,  1.84it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0107, train/loss_vlb_step=4.68e-5, train/loss_step=0.0107, global_step=1678.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▍| 5645/5971 [51:14<02:57,  1.84it/s, loss=0.165, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=6.91e-5, train/loss_step=0.017, global_step=1679.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  95%|█████████▍| 5646/5971 [51:15<02:56,  1.84it/s, loss=0.21, v_num=0, train/loss_simple_step=0.915, train/loss_vlb_step=0.116, train/loss_step=0.915, global_step=1679.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  95%|█████████▍| 5647/5971 [51:16<02:56,  1.84it/s, loss=0.201, v_num=0, train/loss_simple_step=0.00465, train/loss_vlb_step=2.42e-5, train/loss_step=0.00465, global_step=1679.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▍| 5648/5971 [51:18<02:56,  1.83it/s, loss=0.201, v_num=0, train/loss_simple_step=0.00465, train/loss_vlb_step=2.42e-5, train/loss_step=0.00465, global_step=1679.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▍| 5648/5971 [51:18<02:56,  1.83it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0242, train/loss_vlb_step=9.09e-5, train/loss_step=0.0242, global_step=1679.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  95%|█████████▍| 5649/5971 [51:19<02:55,  1.83it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0888, train/loss_vlb_step=0.000294, train/loss_step=0.0888, global_step=1680.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▍| 5650/5971 [51:20<02:54,  1.83it/s, loss=0.191, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.0022, train/loss_step=0.452, global_step=1680.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  95%|█████████▍| 5651/5971 [51:21<02:54,  1.83it/s, loss=0.194, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000383, train/loss_step=0.116, global_step=1680.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▍| 5652/5971 [51:23<02:54,  1.83it/s, loss=0.194, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000383, train/loss_step=0.116, global_step=1680.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▍| 5652/5971 [51:23<02:54,  1.83it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0564, train/loss_vlb_step=0.000189, train/loss_step=0.0564, global_step=1680.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▍| 5653/5971 [51:24<02:53,  1.83it/s, loss=0.206, v_num=0, train/loss_simple_step=0.258, train/loss_vlb_step=0.00101, train/loss_step=0.258, global_step=1681.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  95%|█████████▍| 5654/5971 [51:25<02:52,  1.83it/s, loss=0.214, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000636, train/loss_step=0.184, global_step=1681.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▍| 5655/5971 [51:26<02:52,  1.83it/s, loss=0.214, v_num=0, train/loss_simple_step=0.0812, train/loss_vlb_step=0.000271, train/loss_step=0.0812, global_step=1681.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▍| 5656/5971 [51:29<02:52,  1.83it/s, loss=0.214, v_num=0, train/loss_simple_step=0.0812, train/loss_vlb_step=0.000271, train/loss_step=0.0812, global_step=1681.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▍| 5656/5971 [51:29<02:52,  1.83it/s, loss=0.216, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000391, train/loss_step=0.119, global_step=1681.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  95%|█████████▍| 5657/5971 [51:30<02:51,  1.83it/s, loss=0.207, v_num=0, train/loss_simple_step=0.498, train/loss_vlb_step=0.004, train/loss_step=0.498, global_step=1682.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  95%|█████████▍| 5658/5971 [51:30<02:50,  1.83it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0106, train/loss_vlb_step=4.81e-5, train/loss_step=0.0106, global_step=1682.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▍| 5659/5971 [51:31<02:50,  1.83it/s, loss=0.205, v_num=0, train/loss_simple_step=0.462, train/loss_vlb_step=0.00589, train/loss_step=0.462, global_step=1682.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  95%|█████████▍| 5660/5971 [51:34<02:50,  1.83it/s, loss=0.205, v_num=0, train/loss_simple_step=0.462, train/loss_vlb_step=0.00589, train/loss_step=0.462, global_step=1682.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▍| 5660/5971 [51:34<02:50,  1.83it/s, loss=0.208, v_num=0, train/loss_simple_step=0.0663, train/loss_vlb_step=0.000234, train/loss_step=0.0663, global_step=1682.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▍| 5661/5971 [51:35<02:49,  1.83it/s, loss=0.195, v_num=0, train/loss_simple_step=0.00349, train/loss_vlb_step=1.9e-5, train/loss_step=0.00349, global_step=1683.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▍| 5662/5971 [51:36<02:48,  1.83it/s, loss=0.186, v_num=0, train/loss_simple_step=0.239, train/loss_vlb_step=0.00085, train/loss_step=0.239, global_step=1683.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  95%|█████████▍| 5663/5971 [51:37<02:48,  1.83it/s, loss=0.2, v_num=0, train/loss_simple_step=0.392, train/loss_vlb_step=0.00225, train/loss_step=0.392, global_step=1683.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  95%|█████████▍| 5664/5971 [51:40<02:48,  1.83it/s, loss=0.2, v_num=0, train/loss_simple_step=0.392, train/loss_vlb_step=0.00225, train/loss_step=0.392, global_step=1683.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▍| 5664/5971 [51:40<02:48,  1.83it/s, loss=0.216, v_num=0, train/loss_simple_step=0.330, train/loss_vlb_step=0.00151, train/loss_step=0.330, global_step=1683.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▍| 5665/5971 [51:41<02:47,  1.83it/s, loss=0.221, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000438, train/loss_step=0.127, global_step=1684.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▍| 5666/5971 [51:42<02:46,  1.83it/s, loss=0.189, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.00107, train/loss_step=0.266, global_step=1684.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  95%|█████████▍| 5667/5971 [51:43<02:46,  1.83it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0487, train/loss_vlb_step=0.00017, train/loss_step=0.0487, global_step=1684.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▍| 5668/5971 [51:45<02:46,  1.83it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0487, train/loss_vlb_step=0.00017, train/loss_step=0.0487, global_step=1684.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▍| 5668/5971 [51:45<02:46,  1.83it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0487, train/loss_vlb_step=0.000164, train/loss_step=0.0487, global_step=1684.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▍| 5669/5971 [51:46<02:45,  1.83it/s, loss=0.198, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000744, train/loss_step=0.211, global_step=1685.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  95%|█████████▍| 5670/5971 [51:47<02:44,  1.82it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0155, train/loss_vlb_step=6.98e-5, train/loss_step=0.0155, global_step=1685.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▍| 5671/5971 [51:48<02:44,  1.82it/s, loss=0.176, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000368, train/loss_step=0.111, global_step=1685.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  95%|█████████▍| 5672/5971 [51:51<02:43,  1.82it/s, loss=0.176, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000368, train/loss_step=0.111, global_step=1685.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▍| 5672/5971 [51:51<02:43,  1.82it/s, loss=0.192, v_num=0, train/loss_simple_step=0.369, train/loss_vlb_step=0.00179, train/loss_step=0.369, global_step=1685.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  95%|█████████▌| 5673/5971 [51:52<02:43,  1.82it/s, loss=0.192, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.00126, train/loss_step=0.249, global_step=1686.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▌| 5674/5971 [51:52<02:42,  1.82it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0496, train/loss_vlb_step=0.00017, train/loss_step=0.0496, global_step=1686.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▌| 5675/5971 [51:53<02:42,  1.82it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0536, train/loss_vlb_step=0.000189, train/loss_step=0.0536, global_step=1686.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▌| 5676/5971 [51:56<02:41,  1.82it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0536, train/loss_vlb_step=0.000189, train/loss_step=0.0536, global_step=1686.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▌| 5676/5971 [51:56<02:41,  1.82it/s, loss=0.202, v_num=0, train/loss_simple_step=0.492, train/loss_vlb_step=0.003, train/loss_step=0.492, global_step=1686.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2:  95%|█████████▌| 5677/5971 [51:57<02:41,  1.82it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0208, train/loss_vlb_step=8.25e-5, train/loss_step=0.0208, global_step=1687.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▌| 5678/5971 [51:58<02:40,  1.82it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00211, train/loss_vlb_step=1.24e-5, train/loss_step=0.00211, global_step=1687.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▌| 5679/5971 [51:59<02:40,  1.82it/s, loss=0.163, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000608, train/loss_step=0.175, global_step=1687.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  95%|█████████▌| 5680/5971 [52:01<02:39,  1.82it/s, loss=0.163, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000608, train/loss_step=0.175, global_step=1687.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▌| 5680/5971 [52:01<02:39,  1.82it/s, loss=0.167, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.00048, train/loss_step=0.144, global_step=1687.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  95%|█████████▌| 5681/5971 [52:02<02:39,  1.82it/s, loss=0.169, v_num=0, train/loss_simple_step=0.028, train/loss_vlb_step=0.000103, train/loss_step=0.028, global_step=1688.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▌| 5682/5971 [52:03<02:38,  1.82it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0979, train/loss_vlb_step=0.000323, train/loss_step=0.0979, global_step=1688.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▌| 5683/5971 [52:04<02:38,  1.82it/s, loss=0.149, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000461, train/loss_step=0.133, global_step=1688.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  95%|█████████▌| 5684/5971 [52:06<02:37,  1.82it/s, loss=0.149, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000461, train/loss_step=0.133, global_step=1688.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▌| 5684/5971 [52:06<02:37,  1.82it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0235, train/loss_vlb_step=8.95e-5, train/loss_step=0.0235, global_step=1688.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▌| 5685/5971 [52:07<02:37,  1.82it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0024, train/loss_vlb_step=1.42e-5, train/loss_step=0.0024, global_step=1689.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▌| 5686/5971 [52:08<02:36,  1.82it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0106, train/loss_vlb_step=4.86e-5, train/loss_step=0.0106, global_step=1689.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▌| 5687/5971 [52:09<02:36,  1.82it/s, loss=0.118, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000414, train/loss_step=0.125, global_step=1689.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  95%|█████████▌| 5688/5971 [52:11<02:35,  1.82it/s, loss=0.118, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000414, train/loss_step=0.125, global_step=1689.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▌| 5688/5971 [52:11<02:35,  1.82it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00863, train/loss_vlb_step=4.24e-5, train/loss_step=0.00863, global_step=1689.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▌| 5689/5971 [52:12<02:35,  1.82it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0262, train/loss_vlb_step=0.000101, train/loss_step=0.0262, global_step=1690.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  95%|█████████▌| 5690/5971 [52:13<02:34,  1.82it/s, loss=0.112, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000394, train/loss_step=0.117, global_step=1690.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  95%|█████████▌| 5691/5971 [52:14<02:34,  1.82it/s, loss=0.126, v_num=0, train/loss_simple_step=0.394, train/loss_vlb_step=0.0029, train/loss_step=0.394, global_step=1690.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  95%|█████████▌| 5692/5971 [52:16<02:33,  1.81it/s, loss=0.126, v_num=0, train/loss_simple_step=0.394, train/loss_vlb_step=0.0029, train/loss_step=0.394, global_step=1690.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▌| 5692/5971 [52:16<02:33,  1.81it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0451, train/loss_vlb_step=0.000157, train/loss_step=0.0451, global_step=1690.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▌| 5693/5971 [52:17<02:33,  1.81it/s, loss=0.108, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.000822, train/loss_step=0.210, global_step=1691.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  95%|█████████▌| 5694/5971 [52:18<02:32,  1.81it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0016, train/loss_vlb_step=9.58e-6, train/loss_step=0.0016, global_step=1691.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▌| 5695/5971 [52:19<02:32,  1.81it/s, loss=0.115, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000974, train/loss_step=0.240, global_step=1691.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  95%|█████████▌| 5696/5971 [52:21<02:31,  1.81it/s, loss=0.115, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000974, train/loss_step=0.240, global_step=1691.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▌| 5696/5971 [52:21<02:31,  1.81it/s, loss=0.0904, v_num=0, train/loss_simple_step=0.00321, train/loss_vlb_step=1.79e-5, train/loss_step=0.00321, global_step=1691.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▌| 5697/5971 [52:22<02:31,  1.81it/s, loss=0.0908, v_num=0, train/loss_simple_step=0.0291, train/loss_vlb_step=0.000114, train/loss_step=0.0291, global_step=1692.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  95%|█████████▌| 5698/5971 [52:23<02:30,  1.81it/s, loss=0.0916, v_num=0, train/loss_simple_step=0.0174, train/loss_vlb_step=7.27e-5, train/loss_step=0.0174, global_step=1692.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  95%|█████████▌| 5699/5971 [52:24<02:30,  1.81it/s, loss=0.0987, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00168, train/loss_step=0.318, global_step=1692.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  95%|█████████▌| 5700/5971 [52:26<02:29,  1.81it/s, loss=0.0987, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00168, train/loss_step=0.318, global_step=1692.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▌| 5700/5971 [52:26<02:29,  1.81it/s, loss=0.103, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.00155, train/loss_step=0.222, global_step=1692.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  95%|█████████▌| 5701/5971 [52:27<02:29,  1.81it/s, loss=0.119, v_num=0, train/loss_simple_step=0.350, train/loss_vlb_step=0.00166, train/loss_step=0.350, global_step=1693.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  95%|█████████▌| 5702/5971 [52:28<02:28,  1.81it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0322, train/loss_vlb_step=0.000122, train/loss_step=0.0322, global_step=1693.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  96%|█████████▌| 5703/5971 [52:29<02:27,  1.81it/s, loss=0.116, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000486, train/loss_step=0.148, global_step=1693.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  96%|█████████▌| 5704/5971 [52:31<02:27,  1.81it/s, loss=0.116, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000486, train/loss_step=0.148, global_step=1693.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  96%|█████████▌| 5704/5971 [52:31<02:27,  1.81it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00221, train/loss_vlb_step=1.26e-5, train/loss_step=0.00221, global_step=1693.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  96%|█████████▌| 5705/5971 [52:32<02:26,  1.81it/s, loss=0.129, v_num=0, train/loss_simple_step=0.277, train/loss_vlb_step=0.00107, train/loss_step=0.277, global_step=1694.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  96%|█████████▌| 5706/5971 [52:33<02:26,  1.81it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0955, train/loss_vlb_step=0.000317, train/loss_step=0.0955, global_step=1694.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  96%|█████████▌| 5707/5971 [52:34<02:25,  1.81it/s, loss=0.134, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000461, train/loss_step=0.137, global_step=1694.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  96%|█████████▌| 5708/5971 [52:36<02:25,  1.81it/s, loss=0.134, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000461, train/loss_step=0.137, global_step=1694.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  96%|█████████▌| 5708/5971 [52:36<02:25,  1.81it/s, loss=0.153, v_num=0, train/loss_simple_step=0.403, train/loss_vlb_step=0.0024, train/loss_step=0.403, global_step=1694.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  96%|█████████▌| 5709/5971 [52:37<02:24,  1.81it/s, loss=0.164, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.000797, train/loss_step=0.235, global_step=1695.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  96%|█████████▌| 5710/5971 [52:37<02:24,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000433, train/loss_step=0.132, global_step=1695.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  96%|█████████▌| 5711/5971 [52:38<02:23,  1.81it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.55e-5, train/loss_step=0.00487, global_step=1695.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  96%|█████████▌| 5712/5971 [52:41<02:23,  1.81it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.55e-5, train/loss_step=0.00487, global_step=1695.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  96%|█████████▌| 5712/5971 [52:41<02:23,  1.81it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00264, train/loss_vlb_step=1.52e-5, train/loss_step=0.00264, global_step=1695.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  96%|█████████▌| 5713/5971 [52:42<02:22,  1.81it/s, loss=0.142, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000661, train/loss_step=0.194, global_step=1696.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  96%|█████████▌| 5714/5971 [52:43<02:22,  1.81it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0932, train/loss_vlb_step=0.000306, train/loss_step=0.0932, global_step=1696.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  96%|█████████▌| 5715/5971 [52:44<02:21,  1.81it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0027, train/loss_vlb_step=1.57e-5, train/loss_step=0.0027, global_step=1696.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  96%|█████████▌| 5716/5971 [52:46<02:21,  1.81it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0027, train/loss_vlb_step=1.57e-5, train/loss_step=0.0027, global_step=1696.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  96%|█████████▌| 5716/5971 [52:46<02:21,  1.81it/s, loss=0.14, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000347, train/loss_step=0.105, global_step=1696.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  96%|█████████▌| 5717/5971 [52:47<02:20,  1.81it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=1697.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  96%|█████████▌| 5718/5971 [52:48<02:20,  1.81it/s, loss=0.156, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00182, train/loss_step=0.319, global_step=1697.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  96%|█████████▌| 5719/5971 [52:49<02:19,  1.80it/s, loss=0.159, v_num=0, train/loss_simple_step=0.367, train/loss_vlb_step=0.00197, train/loss_step=0.367, global_step=1697.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  96%|█████████▌| 5720/5971 [52:51<02:19,  1.80it/s, loss=0.159, v_num=0, train/loss_simple_step=0.367, train/loss_vlb_step=0.00197, train/loss_step=0.367, global_step=1697.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  96%|█████████▌| 5720/5971 [52:51<02:19,  1.80it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0155, train/loss_vlb_step=6.61e-5, train/loss_step=0.0155, global_step=1697.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  96%|█████████▌| 5721/5971 [52:52<02:18,  1.80it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0191, train/loss_vlb_step=7.71e-5, train/loss_step=0.0191, global_step=1698.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  96%|█████████▌| 5722/5971 [52:53<02:18,  1.80it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0446, train/loss_vlb_step=0.000162, train/loss_step=0.0446, global_step=1698.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  96%|█████████▌| 5723/5971 [52:53<02:17,  1.80it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00625, train/loss_vlb_step=3.05e-5, train/loss_step=0.00625, global_step=1698.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  96%|█████████▌| 5724/5971 [52:55<02:17,  1.80it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00625, train/loss_vlb_step=3.05e-5, train/loss_step=0.00625, global_step=1698.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  96%|█████████▌| 5724/5971 [52:55<02:17,  1.80it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0172, train/loss_vlb_step=7.21e-5, train/loss_step=0.0172, global_step=1698.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  96%|█████████▌| 5725/5971 [52:56<02:16,  1.80it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00357, train/loss_vlb_step=1.91e-5, train/loss_step=0.00357, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  96%|█████████▌| 5726/5971 [52:57<02:15,  1.80it/s, loss=0.119, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000953, train/loss_step=0.224, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  96%|█████████▌| 5727/5971 [52:58<02:15,  1.80it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0175, train/loss_vlb_step=7.26e-5, train/loss_step=0.0175, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  96%|█████████▌| 5728/5971 [53:01<02:14,  1.80it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0175, train/loss_vlb_step=7.26e-5, train/loss_step=0.0175, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  96%|█████████▌| 5728/5971 [53:01<02:14,  1.80it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:07,  2.45it/s][A

Validating:   1%|          | 2/167 [00:00<00:50,  3.26it/s][A
Epoch 2:  96%|█████████▌| 5732/5971 [53:02<02:12,  1.80it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   3%|▎         | 5/167 [00:00<00:18,  8.89it/s][A
Epoch 2:  96%|█████████▌| 5736/5971 [53:02<02:10,  1.80it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.34it/s][A

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.77it/s][A
Epoch 2:  96%|█████████▌| 5740/5971 [53:02<02:08,  1.80it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:   8%|▊         | 14/167 [00:01<00:08, 19.00it/s][A
Epoch 2:  96%|█████████▌| 5744/5971 [53:02<02:05,  1.81it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  10%|█         | 17/167 [00:01<00:07, 21.23it/s][A
Epoch 2:  96%|█████████▋| 5748/5971 [53:02<02:03,  1.81it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.93it/s][A

Validating:  14%|█▍        | 23/167 [00:01<00:06, 23.64it/s][A
Epoch 2:  96%|█████████▋| 5752/5971 [53:03<02:01,  1.81it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.28it/s][A
Epoch 2:  96%|█████████▋| 5756/5971 [53:03<01:58,  1.81it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 24.92it/s][A
Epoch 2:  96%|█████████▋| 5760/5971 [53:03<01:56,  1.81it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 25.10it/s][A

Validating:  21%|██        | 35/167 [00:01<00:05, 24.32it/s][A
Epoch 2:  97%|█████████▋| 5764/5971 [53:03<01:54,  1.81it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  23%|██▎       | 38/167 [00:02<00:05, 24.33it/s][A
Epoch 2:  97%|█████████▋| 5768/5971 [53:03<01:52,  1.81it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 25.57it/s][A
Epoch 2:  97%|█████████▋| 5772/5971 [53:03<01:49,  1.81it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 25.43it/s][A

Validating:  28%|██▊       | 47/167 [00:02<00:04, 26.09it/s][A
Epoch 2:  97%|█████████▋| 5776/5971 [53:03<01:47,  1.81it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 26.02it/s][A
Epoch 2:  97%|█████████▋| 5780/5971 [53:04<01:45,  1.82it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 26.25it/s][A
Epoch 2:  97%|█████████▋| 5784/5971 [53:04<01:42,  1.82it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 24.86it/s][A

Validating:  35%|███▌      | 59/167 [00:02<00:04, 25.22it/s][A
Epoch 2:  97%|█████████▋| 5788/5971 [53:04<01:40,  1.82it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  37%|███▋      | 62/167 [00:02<00:04, 25.27it/s][A
Epoch 2:  97%|█████████▋| 5792/5971 [53:04<01:38,  1.82it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  39%|███▉      | 65/167 [00:03<00:05, 19.54it/s][A
Epoch 2:  97%|█████████▋| 5796/5971 [53:04<01:36,  1.82it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  41%|████      | 68/167 [00:03<00:04, 19.96it/s][A

Validating:  43%|████▎     | 71/167 [00:03<00:04, 21.69it/s][A
Epoch 2:  97%|█████████▋| 5800/5971 [53:05<01:33,  1.82it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  44%|████▍     | 74/167 [00:03<00:04, 22.68it/s][A
Epoch 2:  97%|█████████▋| 5804/5971 [53:05<01:31,  1.82it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 22.60it/s][A
Epoch 2:  97%|█████████▋| 5808/5971 [53:05<01:29,  1.82it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 23.12it/s][A

Validating:  50%|████▉     | 83/167 [00:03<00:03, 24.70it/s][A
Epoch 2:  97%|█████████▋| 5812/5971 [53:05<01:27,  1.82it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  51%|█████▏    | 86/167 [00:04<00:03, 25.56it/s][A
Epoch 2:  97%|█████████▋| 5816/5971 [53:05<01:24,  1.83it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  53%|█████▎    | 89/167 [00:04<00:03, 24.61it/s][A
Epoch 2:  97%|█████████▋| 5820/5971 [53:05<01:22,  1.83it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 25.37it/s][A

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 24.48it/s][A
Epoch 2:  98%|█████████▊| 5824/5971 [53:05<01:20,  1.83it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 24.51it/s][A
Epoch 2:  98%|█████████▊| 5828/5971 [53:06<01:18,  1.83it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  60%|██████    | 101/167 [00:04<00:02, 25.30it/s][A
Epoch 2:  98%|█████████▊| 5832/5971 [53:06<01:15,  1.83it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 24.58it/s][A

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 25.55it/s][A
Epoch 2:  98%|█████████▊| 5836/5971 [53:06<01:13,  1.83it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  66%|██████▌   | 110/167 [00:05<00:02, 25.39it/s][A
Epoch 2:  98%|█████████▊| 5840/5971 [53:06<01:11,  1.83it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  68%|██████▊   | 113/167 [00:05<00:02, 25.61it/s][A
Epoch 2:  98%|█████████▊| 5844/5971 [53:06<01:09,  1.83it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  69%|██████▉   | 116/167 [00:05<00:01, 26.73it/s][A

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 26.38it/s][A
Epoch 2:  98%|█████████▊| 5848/5971 [53:06<01:07,  1.84it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 26.97it/s][A
Epoch 2:  98%|█████████▊| 5852/5971 [53:07<01:04,  1.84it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 25.09it/s][A
Epoch 2:  98%|█████████▊| 5856/5971 [53:07<01:02,  1.84it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 25.56it/s][A

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 25.82it/s][A
Epoch 2:  98%|█████████▊| 5860/5971 [53:07<01:00,  1.84it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  80%|████████  | 134/167 [00:05<00:01, 25.79it/s][A
Epoch 2:  98%|█████████▊| 5864/5971 [53:07<00:58,  1.84it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  82%|████████▏ | 137/167 [00:06<00:01, 26.13it/s][A
Epoch 2:  98%|█████████▊| 5868/5971 [53:07<00:55,  1.84it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  84%|████████▍ | 140/167 [00:06<00:01, 25.05it/s][A

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 24.54it/s][A
Epoch 2:  98%|█████████▊| 5872/5971 [53:07<00:53,  1.84it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 24.41it/s][A
Epoch 2:  98%|█████████▊| 5876/5971 [53:08<00:51,  1.84it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 24.67it/s][A
Epoch 2:  98%|█████████▊| 5880/5971 [53:08<00:49,  1.84it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 25.54it/s][A

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 25.56it/s][A
Epoch 2:  99%|█████████▊| 5884/5971 [53:08<00:47,  1.85it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 24.58it/s][A
Epoch 2:  99%|█████████▊| 5888/5971 [53:08<00:44,  1.85it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  97%|█████████▋| 162/167 [00:07<00:00, 26.44it/s][A
Epoch 2:  99%|█████████▊| 5892/5971 [53:08<00:42,  1.85it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Validating:  99%|█████████▉| 166/167 [00:07<00:00, 26.81it/s][A
Epoch 2:  99%|█████████▊| 5896/5971 [53:08<00:40,  1.85it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.28e-6, train/loss_step=0.00156, global_step=1699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.35it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.41it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.24it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.88it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.36it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.57it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.78it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.89it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.03it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.13it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.16it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.10it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  4.96it/s]

Epoch 2:  99%|█████████▉| 5897/5971 [53:21<00:40,  1.84it/s, loss=0.0927, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.000812, train/loss_step=0.230, global_step=1700.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Epoch 2:  99%|█████████▉| 5897/5971 [53:26<00:40,  1.84it/s, loss=0.0927, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.000812, train/loss_step=0.230, global_step=1700.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.06it/s]

Epoch 2:  99%|█████████▉| 5898/5971 [53:33<00:39,  1.84it/s, loss=0.0927, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.000812, train/loss_step=0.230, global_step=1700.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5898/5971 [53:33<00:39,  1.84it/s, loss=0.0928, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.00044, train/loss_step=0.134, global_step=1700.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.05it/s]

Epoch 2:  99%|█████████▉| 5899/5971 [53:46<00:39,  1.83it/s, loss=0.0928, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.00044, train/loss_step=0.134, global_step=1700.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5899/5971 [53:46<00:39,  1.83it/s, loss=0.127, v_num=0, train/loss_simple_step=0.689, train/loss_vlb_step=0.0168, train/loss_step=0.689, global_step=1700.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.06it/s]

Epoch 2:  99%|█████████▉| 5900/5971 [53:59<00:38,  1.82it/s, loss=0.127, v_num=0, train/loss_simple_step=0.689, train/loss_vlb_step=0.0168, train/loss_step=0.689, global_step=1700.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5900/5971 [53:59<00:38,  1.82it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=0.000108, train/loss_step=0.0272, global_step=1700.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5901/5971 [54:00<00:38,  1.82it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=0.000108, train/loss_step=0.0272, global_step=1700.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5901/5971 [54:00<00:38,  1.82it/s, loss=0.125, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000394, train/loss_step=0.120, global_step=1701.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  99%|█████████▉| 5902/5971 [54:01<00:37,  1.82it/s, loss=0.125, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000394, train/loss_step=0.120, global_step=1701.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5902/5971 [54:01<00:37,  1.82it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0488, train/loss_vlb_step=0.000171, train/loss_step=0.0488, global_step=1701.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5903/5971 [54:02<00:37,  1.82it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0488, train/loss_vlb_step=0.000171, train/loss_step=0.0488, global_step=1701.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5903/5971 [54:02<00:37,  1.82it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.37e-5, train/loss_step=0.0121, global_step=1701.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  99%|█████████▉| 5904/5971 [54:05<00:36,  1.82it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.37e-5, train/loss_step=0.0121, global_step=1701.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5904/5971 [54:05<00:36,  1.82it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0718, train/loss_vlb_step=0.000246, train/loss_step=0.0718, global_step=1701.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5905/5971 [54:05<00:36,  1.82it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0718, train/loss_vlb_step=0.000246, train/loss_step=0.0718, global_step=1701.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5905/5971 [54:05<00:36,  1.82it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0298, train/loss_vlb_step=0.000109, train/loss_step=0.0298, global_step=1702.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  99%|█████████▉| 5906/5971 [54:06<00:35,  1.82it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0298, train/loss_vlb_step=0.000109, train/loss_step=0.0298, global_step=1702.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5906/5971 [54:06<00:35,  1.82it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=5.92e-5, train/loss_step=0.0144, global_step=1702.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5907/5971 [54:07<00:35,  1.82it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=5.92e-5, train/loss_step=0.0144, global_step=1702.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5907/5971 [54:07<00:35,  1.82it/s, loss=0.0872, v_num=0, train/loss_simple_step=0.0172, train/loss_vlb_step=7.13e-5, train/loss_step=0.0172, global_step=1702.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5908/5971 [54:09<00:34,  1.82it/s, loss=0.0872, v_num=0, train/loss_simple_step=0.0172, train/loss_vlb_step=7.13e-5, train/loss_step=0.0172, global_step=1702.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5908/5971 [54:09<00:34,  1.82it/s, loss=0.0869, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.81e-5, train/loss_step=0.0102, global_step=1702.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5909/5971 [54:10<00:34,  1.82it/s, loss=0.0869, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.81e-5, train/loss_step=0.0102, global_step=1702.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5909/5971 [54:10<00:34,  1.82it/s, loss=0.0891, v_num=0, train/loss_simple_step=0.0623, train/loss_vlb_step=0.000212, train/loss_step=0.0623, global_step=1703.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5910/5971 [54:11<00:33,  1.82it/s, loss=0.0891, v_num=0, train/loss_simple_step=0.0623, train/loss_vlb_step=0.000212, train/loss_step=0.0623, global_step=1703.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5910/5971 [54:11<00:33,  1.82it/s, loss=0.0907, v_num=0, train/loss_simple_step=0.0782, train/loss_vlb_step=0.000265, train/loss_step=0.0782, global_step=1703.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5911/5971 [54:12<00:33,  1.82it/s, loss=0.0907, v_num=0, train/loss_simple_step=0.0782, train/loss_vlb_step=0.000265, train/loss_step=0.0782, global_step=1703.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5911/5971 [54:12<00:33,  1.82it/s, loss=0.106, v_num=0, train/loss_simple_step=0.317, train/loss_vlb_step=0.0013, train/loss_step=0.317, global_step=1703.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2:  99%|█████████▉| 5912/5971 [54:14<00:32,  1.82it/s, loss=0.106, v_num=0, train/loss_simple_step=0.317, train/loss_vlb_step=0.0013, train/loss_step=0.317, global_step=1703.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5912/5971 [54:14<00:32,  1.82it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0189, train/loss_vlb_step=7.8e-5, train/loss_step=0.0189, global_step=1703.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5913/5971 [54:15<00:31,  1.82it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0189, train/loss_vlb_step=7.8e-5, train/loss_step=0.0189, global_step=1703.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5913/5971 [54:15<00:31,  1.82it/s, loss=0.122, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00137, train/loss_step=0.308, global_step=1704.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  99%|█████████▉| 5914/5971 [54:16<00:31,  1.82it/s, loss=0.122, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00137, train/loss_step=0.308, global_step=1704.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5914/5971 [54:16<00:31,  1.82it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0685, train/loss_vlb_step=0.00023, train/loss_step=0.0685, global_step=1704.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5915/5971 [54:17<00:30,  1.82it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0685, train/loss_vlb_step=0.00023, train/loss_step=0.0685, global_step=1704.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5915/5971 [54:17<00:30,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0027, train/loss_vlb_step=1.43e-5, train/loss_step=0.0027, global_step=1704.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5916/5971 [54:19<00:30,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0027, train/loss_vlb_step=1.43e-5, train/loss_step=0.0027, global_step=1704.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5916/5971 [54:19<00:30,  1.82it/s, loss=0.125, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.00092, train/loss_step=0.240, global_step=1704.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  99%|█████████▉| 5917/5971 [54:20<00:29,  1.82it/s, loss=0.125, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.00092, train/loss_step=0.240, global_step=1704.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5917/5971 [54:20<00:29,  1.82it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0211, train/loss_vlb_step=8.48e-5, train/loss_step=0.0211, global_step=1705.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5918/5971 [54:21<00:29,  1.81it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0211, train/loss_vlb_step=8.48e-5, train/loss_step=0.0211, global_step=1705.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5918/5971 [54:21<00:29,  1.81it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0172, train/loss_vlb_step=7.1e-5, train/loss_step=0.0172, global_step=1705.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  99%|█████████▉| 5919/5971 [54:22<00:28,  1.81it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0172, train/loss_vlb_step=7.1e-5, train/loss_step=0.0172, global_step=1705.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5919/5971 [54:22<00:28,  1.81it/s, loss=0.0759, v_num=0, train/loss_simple_step=0.0312, train/loss_vlb_step=0.000126, train/loss_step=0.0312, global_step=1705.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5920/5971 [54:24<00:28,  1.81it/s, loss=0.0759, v_num=0, train/loss_simple_step=0.0312, train/loss_vlb_step=0.000126, train/loss_step=0.0312, global_step=1705.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5920/5971 [54:24<00:28,  1.81it/s, loss=0.0836, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000602, train/loss_step=0.181, global_step=1705.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  99%|█████████▉| 5921/5971 [54:25<00:27,  1.81it/s, loss=0.0836, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000602, train/loss_step=0.181, global_step=1705.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5921/5971 [54:25<00:27,  1.81it/s, loss=0.0974, v_num=0, train/loss_simple_step=0.397, train/loss_vlb_step=0.00234, train/loss_step=0.397, global_step=1706.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  99%|█████████▉| 5922/5971 [54:26<00:27,  1.81it/s, loss=0.0974, v_num=0, train/loss_simple_step=0.397, train/loss_vlb_step=0.00234, train/loss_step=0.397, global_step=1706.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5922/5971 [54:26<00:27,  1.81it/s, loss=0.11, v_num=0, train/loss_simple_step=0.295, train/loss_vlb_step=0.0012, train/loss_step=0.295, global_step=1706.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  99%|█████████▉| 5923/5971 [54:27<00:26,  1.81it/s, loss=0.11, v_num=0, train/loss_simple_step=0.295, train/loss_vlb_step=0.0012, train/loss_step=0.295, global_step=1706.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5923/5971 [54:27<00:26,  1.81it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00185, train/loss_vlb_step=1.11e-5, train/loss_step=0.00185, global_step=1706.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5924/5971 [54:29<00:25,  1.81it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00185, train/loss_vlb_step=1.11e-5, train/loss_step=0.00185, global_step=1706.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5924/5971 [54:29<00:25,  1.81it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0401, train/loss_vlb_step=0.00014, train/loss_step=0.0401, global_step=1706.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2:  99%|█████████▉| 5925/5971 [54:30<00:25,  1.81it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0401, train/loss_vlb_step=0.00014, train/loss_step=0.0401, global_step=1706.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5925/5971 [54:30<00:25,  1.81it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00545, train/loss_vlb_step=2.7e-5, train/loss_step=0.00545, global_step=1707.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5926/5971 [54:31<00:24,  1.81it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00545, train/loss_vlb_step=2.7e-5, train/loss_step=0.00545, global_step=1707.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5926/5971 [54:31<00:24,  1.81it/s, loss=0.12, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.00124, train/loss_step=0.289, global_step=1707.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2:  99%|█████████▉| 5927/5971 [54:32<00:24,  1.81it/s, loss=0.12, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.00124, train/loss_step=0.289, global_step=1707.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5927/5971 [54:32<00:24,  1.81it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0723, train/loss_vlb_step=0.000246, train/loss_step=0.0723, global_step=1707.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5928/5971 [54:34<00:23,  1.81it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0723, train/loss_vlb_step=0.000246, train/loss_step=0.0723, global_step=1707.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5928/5971 [54:34<00:23,  1.81it/s, loss=0.145, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00265, train/loss_step=0.452, global_step=1707.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  99%|█████████▉| 5929/5971 [54:35<00:23,  1.81it/s, loss=0.145, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00265, train/loss_step=0.452, global_step=1707.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5929/5971 [54:35<00:23,  1.81it/s, loss=0.162, v_num=0, train/loss_simple_step=0.396, train/loss_vlb_step=0.00243, train/loss_step=0.396, global_step=1708.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5930/5971 [54:36<00:22,  1.81it/s, loss=0.162, v_num=0, train/loss_simple_step=0.396, train/loss_vlb_step=0.00243, train/loss_step=0.396, global_step=1708.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5930/5971 [54:36<00:22,  1.81it/s, loss=0.168, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000738, train/loss_step=0.213, global_step=1708.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5931/5971 [54:37<00:22,  1.81it/s, loss=0.168, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000738, train/loss_step=0.213, global_step=1708.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5931/5971 [54:37<00:22,  1.81it/s, loss=0.159, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.00043, train/loss_step=0.128, global_step=1708.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  99%|█████████▉| 5932/5971 [54:39<00:21,  1.81it/s, loss=0.159, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.00043, train/loss_step=0.128, global_step=1708.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5932/5971 [54:39<00:21,  1.81it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.71e-5, train/loss_step=0.0101, global_step=1708.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5933/5971 [54:40<00:21,  1.81it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.71e-5, train/loss_step=0.0101, global_step=1708.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5933/5971 [54:40<00:21,  1.81it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00829, train/loss_vlb_step=3.81e-5, train/loss_step=0.00829, global_step=1709.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5934/5971 [54:41<00:20,  1.81it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00829, train/loss_vlb_step=3.81e-5, train/loss_step=0.00829, global_step=1709.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5934/5971 [54:41<00:20,  1.81it/s, loss=0.153, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.000847, train/loss_step=0.250, global_step=1709.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2:  99%|█████████▉| 5935/5971 [54:42<00:19,  1.81it/s, loss=0.153, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.000847, train/loss_step=0.250, global_step=1709.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5935/5971 [54:42<00:19,  1.81it/s, loss=0.172, v_num=0, train/loss_simple_step=0.399, train/loss_vlb_step=0.00214, train/loss_step=0.399, global_step=1709.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  99%|█████████▉| 5936/5971 [54:45<00:19,  1.81it/s, loss=0.172, v_num=0, train/loss_simple_step=0.399, train/loss_vlb_step=0.00214, train/loss_step=0.399, global_step=1709.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5936/5971 [54:45<00:19,  1.81it/s, loss=0.162, v_num=0, train/loss_simple_step=0.035, train/loss_vlb_step=0.000127, train/loss_step=0.035, global_step=1709.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5937/5971 [54:46<00:18,  1.81it/s, loss=0.162, v_num=0, train/loss_simple_step=0.035, train/loss_vlb_step=0.000127, train/loss_step=0.035, global_step=1709.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5937/5971 [54:46<00:18,  1.81it/s, loss=0.17, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.000617, train/loss_step=0.187, global_step=1710.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  99%|█████████▉| 5938/5971 [54:47<00:18,  1.81it/s, loss=0.17, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.000617, train/loss_step=0.187, global_step=1710.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5938/5971 [54:47<00:18,  1.81it/s, loss=0.188, v_num=0, train/loss_simple_step=0.378, train/loss_vlb_step=0.00148, train/loss_step=0.378, global_step=1710.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5939/5971 [54:48<00:17,  1.81it/s, loss=0.188, v_num=0, train/loss_simple_step=0.378, train/loss_vlb_step=0.00148, train/loss_step=0.378, global_step=1710.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5939/5971 [54:48<00:17,  1.81it/s, loss=0.21, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.00232, train/loss_step=0.454, global_step=1710.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:  99%|█████████▉| 5940/5971 [54:50<00:17,  1.81it/s, loss=0.21, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.00232, train/loss_step=0.454, global_step=1710.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5940/5971 [54:50<00:17,  1.81it/s, loss=0.219, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00198, train/loss_step=0.365, global_step=1710.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5941/5971 [54:51<00:16,  1.81it/s, loss=0.219, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00198, train/loss_step=0.365, global_step=1710.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2:  99%|█████████▉| 5941/5971 [54:51<00:16,  1.81it/s, loss=0.199, v_num=0, train/loss_simple_step=0.00332, train/loss_vlb_step=1.81e-5, train/loss_step=0.00332, global_step=1711.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5942/5971 [54:52<00:16,  1.81it/s, loss=0.199, v_num=0, train/loss_simple_step=0.00332, train/loss_vlb_step=1.81e-5, train/loss_step=0.00332, global_step=1711.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5942/5971 [54:52<00:16,  1.81it/s, loss=0.185, v_num=0, train/loss_simple_step=0.00799, train/loss_vlb_step=3.88e-5, train/loss_step=0.00799, global_step=1711.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5943/5971 [54:53<00:15,  1.80it/s, loss=0.185, v_num=0, train/loss_simple_step=0.00799, train/loss_vlb_step=3.88e-5, train/loss_step=0.00799, global_step=1711.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5943/5971 [54:53<00:15,  1.80it/s, loss=0.191, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000436, train/loss_step=0.131, global_step=1711.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2: 100%|█████████▉| 5944/5971 [54:55<00:14,  1.80it/s, loss=0.191, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000436, train/loss_step=0.131, global_step=1711.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5944/5971 [54:55<00:14,  1.80it/s, loss=0.202, v_num=0, train/loss_simple_step=0.256, train/loss_vlb_step=0.00115, train/loss_step=0.256, global_step=1711.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2: 100%|█████████▉| 5945/5971 [54:56<00:14,  1.80it/s, loss=0.202, v_num=0, train/loss_simple_step=0.256, train/loss_vlb_step=0.00115, train/loss_step=0.256, global_step=1711.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5945/5971 [54:56<00:14,  1.80it/s, loss=0.208, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000394, train/loss_step=0.120, global_step=1712.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5946/5971 [54:57<00:13,  1.80it/s, loss=0.208, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000394, train/loss_step=0.120, global_step=1712.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5946/5971 [54:57<00:13,  1.80it/s, loss=0.195, v_num=0, train/loss_simple_step=0.041, train/loss_vlb_step=0.000149, train/loss_step=0.041, global_step=1712.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5947/5971 [54:58<00:13,  1.80it/s, loss=0.195, v_num=0, train/loss_simple_step=0.041, train/loss_vlb_step=0.000149, train/loss_step=0.041, global_step=1712.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5947/5971 [54:58<00:13,  1.80it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0197, train/loss_vlb_step=7.29e-5, train/loss_step=0.0197, global_step=1712.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5948/5971 [55:00<00:12,  1.80it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0197, train/loss_vlb_step=7.29e-5, train/loss_step=0.0197, global_step=1712.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5948/5971 [55:00<00:12,  1.80it/s, loss=0.181, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.000739, train/loss_step=0.210, global_step=1712.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2: 100%|█████████▉| 5949/5971 [55:01<00:12,  1.80it/s, loss=0.181, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.000739, train/loss_step=0.210, global_step=1712.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5949/5971 [55:01<00:12,  1.80it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0637, train/loss_vlb_step=0.000216, train/loss_step=0.0637, global_step=1713.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5950/5971 [55:02<00:11,  1.80it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0637, train/loss_vlb_step=0.000216, train/loss_step=0.0637, global_step=1713.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5950/5971 [55:02<00:11,  1.80it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0131, train/loss_vlb_step=5.94e-5, train/loss_step=0.0131, global_step=1713.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2: 100%|█████████▉| 5951/5971 [55:03<00:11,  1.80it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0131, train/loss_vlb_step=5.94e-5, train/loss_step=0.0131, global_step=1713.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5951/5971 [55:03<00:11,  1.80it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00834, train/loss_vlb_step=4.04e-5, train/loss_step=0.00834, global_step=1713.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5952/5971 [55:05<00:10,  1.80it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00834, train/loss_vlb_step=4.04e-5, train/loss_step=0.00834, global_step=1713.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5952/5971 [55:05<00:10,  1.80it/s, loss=0.162, v_num=0, train/loss_simple_step=0.286, train/loss_vlb_step=0.0012, train/loss_step=0.286, global_step=1713.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2: 100%|█████████▉| 5953/5971 [55:06<00:09,  1.80it/s, loss=0.162, v_num=0, train/loss_simple_step=0.286, train/loss_vlb_step=0.0012, train/loss_step=0.286, global_step=1713.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5953/5971 [55:06<00:09,  1.80it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00355, train/loss_vlb_step=1.91e-5, train/loss_step=0.00355, global_step=1714.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5954/5971 [55:07<00:09,  1.80it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00355, train/loss_vlb_step=1.91e-5, train/loss_step=0.00355, global_step=1714.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5954/5971 [55:07<00:09,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000431, train/loss_step=0.131, global_step=1714.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2: 100%|█████████▉| 5955/5971 [55:08<00:08,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000431, train/loss_step=0.131, global_step=1714.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5955/5971 [55:08<00:08,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0698, train/loss_vlb_step=0.000232, train/loss_step=0.0698, global_step=1714.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5956/5971 [55:10<00:08,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0698, train/loss_vlb_step=0.000232, train/loss_step=0.0698, global_step=1714.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5956/5971 [55:10<00:08,  1.80it/s, loss=0.162, v_num=0, train/loss_simple_step=0.489, train/loss_vlb_step=0.00385, train/loss_step=0.489, global_step=1714.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2: 100%|█████████▉| 5957/5971 [55:11<00:07,  1.80it/s, loss=0.162, v_num=0, train/loss_simple_step=0.489, train/loss_vlb_step=0.00385, train/loss_step=0.489, global_step=1714.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5957/5971 [55:11<00:07,  1.80it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0899, train/loss_vlb_step=0.000295, train/loss_step=0.0899, global_step=1715.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5958/5971 [55:12<00:07,  1.80it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0899, train/loss_vlb_step=0.000295, train/loss_step=0.0899, global_step=1715.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5958/5971 [55:12<00:07,  1.80it/s, loss=0.148, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000752, train/loss_step=0.207, global_step=1715.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2: 100%|█████████▉| 5959/5971 [55:13<00:06,  1.80it/s, loss=0.148, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000752, train/loss_step=0.207, global_step=1715.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5959/5971 [55:13<00:06,  1.80it/s, loss=0.134, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000564, train/loss_step=0.156, global_step=1715.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5960/5971 [55:15<00:06,  1.80it/s, loss=0.134, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000564, train/loss_step=0.156, global_step=1715.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5960/5971 [55:15<00:06,  1.80it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0365, train/loss_vlb_step=0.000133, train/loss_step=0.0365, global_step=1715.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5961/5971 [55:16<00:05,  1.80it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0365, train/loss_vlb_step=0.000133, train/loss_step=0.0365, global_step=1715.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5961/5971 [55:16<00:05,  1.80it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0123, train/loss_vlb_step=5.35e-5, train/loss_step=0.0123, global_step=1716.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2: 100%|█████████▉| 5962/5971 [55:17<00:05,  1.80it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0123, train/loss_vlb_step=5.35e-5, train/loss_step=0.0123, global_step=1716.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5962/5971 [55:17<00:05,  1.80it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00209, train/loss_vlb_step=1.2e-5, train/loss_step=0.00209, global_step=1716.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5963/5971 [55:17<00:04,  1.80it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00209, train/loss_vlb_step=1.2e-5, train/loss_step=0.00209, global_step=1716.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5963/5971 [55:17<00:04,  1.80it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00221, train/loss_vlb_step=1.25e-5, train/loss_step=0.00221, global_step=1716.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5964/5971 [55:20<00:03,  1.80it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00221, train/loss_vlb_step=1.25e-5, train/loss_step=0.00221, global_step=1716.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5964/5971 [55:20<00:03,  1.80it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0772, train/loss_vlb_step=0.000261, train/loss_step=0.0772, global_step=1716.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2: 100%|█████████▉| 5965/5971 [55:21<00:03,  1.80it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0772, train/loss_vlb_step=0.000261, train/loss_step=0.0772, global_step=1716.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5965/5971 [55:21<00:03,  1.80it/s, loss=0.113, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00175, train/loss_step=0.349, global_step=1717.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]   
Epoch 2: 100%|█████████▉| 5966/5971 [55:22<00:02,  1.80it/s, loss=0.113, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00175, train/loss_step=0.349, global_step=1717.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5966/5971 [55:22<00:02,  1.80it/s, loss=0.122, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000806, train/loss_step=0.224, global_step=1717.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5967/5971 [55:23<00:02,  1.80it/s, loss=0.122, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000806, train/loss_step=0.224, global_step=1717.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5967/5971 [55:23<00:02,  1.80it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00852, train/loss_vlb_step=3.91e-5, train/loss_step=0.00852, global_step=1717.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5968/5971 [55:25<00:01,  1.79it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00852, train/loss_vlb_step=3.91e-5, train/loss_step=0.00852, global_step=1717.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5968/5971 [55:25<00:01,  1.79it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0203, train/loss_vlb_step=7.85e-5, train/loss_step=0.0203, global_step=1717.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2: 100%|█████████▉| 5969/5971 [55:26<00:01,  1.79it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0203, train/loss_vlb_step=7.85e-5, train/loss_step=0.0203, global_step=1717.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5969/5971 [55:26<00:01,  1.79it/s, loss=0.118, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000615, train/loss_step=0.180, global_step=1718.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2: 100%|█████████▉| 5970/5971 [55:27<00:00,  1.79it/s, loss=0.118, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000615, train/loss_step=0.180, global_step=1718.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|█████████▉| 5970/5971 [55:27<00:00,  1.79it/s, loss=0.143, v_num=0, train/loss_simple_step=0.504, train/loss_vlb_step=0.0047, train/loss_step=0.504, global_step=1718.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2: 100%|██████████| 5971/5971 [55:28<00:00,  1.79it/s, loss=0.143, v_num=0, train/loss_simple_step=0.504, train/loss_vlb_step=0.0047, train/loss_step=0.504, global_step=1718.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|██████████| 5971/5971 [55:28<00:00,  1.79it/s, loss=0.146, v_num=0, train/loss_simple_step=0.062, train/loss_vlb_step=0.000215, train/loss_step=0.062, global_step=1718.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|██████████| 5971/5971 [55:30<00:00,  1.79it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0153, train/loss_vlb_step=6.25e-5, train/loss_step=0.0153, global_step=1718.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|██████████| 5971/5971 [55:31<00:00,  1.79it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0323, train/loss_vlb_step=0.000118, train/loss_step=0.0323, global_step=1719.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|██████████| 5971/5971 [55:32<00:00,  1.79it/s, loss=0.135, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000559, train/loss_step=0.156, global_step=1719.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2: 100%|██████████| 5971/5971 [55:32<00:00,  1.79it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0412, train/loss_vlb_step=0.000154, train/loss_step=0.0412, global_step=1719.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|██████████| 5971/5971 [55:35<00:00,  1.79it/s, loss=0.12, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.00107, train/loss_step=0.231, global_step=1719.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2: 100%|██████████| 5971/5971 [55:36<00:00,  1.79it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00276, train/loss_vlb_step=1.47e-5, train/loss_step=0.00276, global_step=1720.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|██████████| 5971/5971 [55:37<00:00,  1.79it/s, loss=0.13, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00431, train/loss_step=0.480, global_step=1720.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]     
Epoch 2: 100%|██████████| 5971/5971 [55:38<00:00,  1.79it/s, loss=0.139, v_num=0, train/loss_simple_step=0.336, train/loss_vlb_step=0.0014, train/loss_step=0.336, global_step=1720.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|██████████| 5971/5971 [55:40<00:00,  1.79it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0259, train/loss_vlb_step=9.57e-5, train/loss_step=0.0259, global_step=1720.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|██████████| 5971/5971 [55:41<00:00,  1.79it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0805, train/loss_vlb_step=0.000265, train/loss_step=0.0805, global_step=1721.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|██████████| 5971/5971 [55:41<00:00,  1.79it/s, loss=0.153, v_num=0, train/loss_simple_step=0.238, train/loss_vlb_step=0.000872, train/loss_step=0.238, global_step=1721.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]  
Epoch 2: 100%|██████████| 5971/5971 [55:42<00:00,  1.79it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.81e-5, train/loss_step=0.0125, global_step=1721.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|██████████| 5971/5971 [55:45<00:00,  1.79it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000163, train/loss_step=0.0453, global_step=1721.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|██████████| 5971/5971 [55:46<00:00,  1.78it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0933, train/loss_vlb_step=0.000307, train/loss_step=0.0933, global_step=1722.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|██████████| 5971/5971 [55:46<00:00,  1.78it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0588, train/loss_vlb_step=0.000206, train/loss_step=0.0588, global_step=1722.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|██████████| 5971/5971 [55:47<00:00,  1.78it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0669, train/loss_vlb_step=0.000226, train/loss_step=0.0669, global_step=1722.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|██████████| 5971/5971 [55:49<00:00,  1.78it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00865, train/loss_vlb_step=4.28e-5, train/loss_step=0.00865, global_step=1722.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|██████████| 5971/5971 [55:50<00:00,  1.78it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00314, train/loss_vlb_step=1.66e-5, train/loss_step=0.00314, global_step=1723.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|██████████| 5971/5971 [55:51<00:00,  1.78it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0542, train/loss_vlb_step=0.000193, train/loss_step=0.0542, global_step=1723.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2: 100%|██████████| 5971/5971 [55:52<00:00,  1.78it/s, loss=0.129, v_num=0, train/loss_simple_step=0.601, train/loss_vlb_step=0.0114, train/loss_step=0.601, global_step=1723.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]    
Epoch 2: 100%|██████████| 5971/5971 [55:54<00:00,  1.78it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0729, train/loss_vlb_step=0.000242, train/loss_step=0.0729, global_step=1723.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 2: 100%|██████████| 5971/5971 [55:58<00:00,  1.78it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0172, train/loss_vlb_step=7.51e-5, train/loss_step=0.0172, global_step=1724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 2:   0%|          | 0/5971 [00:00<00:00, 9868.95it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0172, train/loss_vlb_step=7.51e-5, train/loss_step=0.0172, global_step=1724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153] 
Epoch 3:   0%|          | 0/5971 [00:00<00:02, 2603.54it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0172, train/loss_vlb_step=7.51e-5, train/loss_step=0.0172, global_step=1724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 3:   0%|          | 1/5971 [00:02<2:07:42,  1.28s/it, loss=0.131, v_num=0, train/loss_simple_step=0.0172, train/loss_vlb_step=7.51e-5, train/loss_step=0.0172, global_step=1724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00311, train/loss_epoch=0.153]
Epoch 3:   0%|          | 1/5971 [00:02<2:07:50,  1.28s/it, loss=0.144, v_num=0, train/loss_simple_step=0.409, train/loss_vlb_step=0.00202, train/loss_step=0.409, global_step=1725.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:   0%|          | 2/5971 [00:03<2:00:26,  1.21s/it, loss=0.144, v_num=0, train/loss_simple_step=0.409, train/loss_vlb_step=0.00202, train/loss_step=0.409, global_step=1725.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 2/5971 [00:03<2:00:28,  1.21s/it, loss=0.144, v_num=0, train/loss_simple_step=0.0469, train/loss_vlb_step=0.000168, train/loss_step=0.0469, global_step=1725.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 3/5971 [00:04<1:52:12,  1.13s/it, loss=0.144, v_num=0, train/loss_simple_step=0.0469, train/loss_vlb_step=0.000168, train/loss_step=0.0469, global_step=1725.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 3/5971 [00:04<1:52:13,  1.13s/it, loss=0.133, v_num=0, train/loss_simple_step=0.00661, train/loss_vlb_step=3.39e-5, train/loss_step=0.00661, global_step=1725.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 4/5971 [00:06<2:11:14,  1.32s/it, loss=0.133, v_num=0, train/loss_simple_step=0.00661, train/loss_vlb_step=3.39e-5, train/loss_step=0.00661, global_step=1725.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 4/5971 [00:06<2:11:15,  1.32s/it, loss=0.138, v_num=0, train/loss_simple_step=0.0955, train/loss_vlb_step=0.000316, train/loss_step=0.0955, global_step=1725.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:   0%|          | 5/5971 [00:07<2:04:10,  1.25s/it, loss=0.138, v_num=0, train/loss_simple_step=0.0955, train/loss_vlb_step=0.000316, train/loss_step=0.0955, global_step=1725.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 5/5971 [00:07<2:04:11,  1.25s/it, loss=0.128, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00146, train/loss_step=0.296, global_step=1726.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:   0%|          | 6/5971 [00:08<1:58:56,  1.20s/it, loss=0.128, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00146, train/loss_step=0.296, global_step=1726.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 6/5971 [00:08<1:58:57,  1.20s/it, loss=0.119, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000531, train/loss_step=0.152, global_step=1726.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 7/5971 [00:09<1:55:21,  1.16s/it, loss=0.119, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000531, train/loss_step=0.152, global_step=1726.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 7/5971 [00:09<1:55:21,  1.16s/it, loss=0.119, v_num=0, train/loss_simple_step=0.0129, train/loss_vlb_step=5.82e-5, train/loss_step=0.0129, global_step=1726.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 8/5971 [00:11<2:08:46,  1.30s/it, loss=0.119, v_num=0, train/loss_simple_step=0.0129, train/loss_vlb_step=5.82e-5, train/loss_step=0.0129, global_step=1726.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 8/5971 [00:11<2:08:47,  1.30s/it, loss=0.115, v_num=0, train/loss_simple_step=0.00116, train/loss_vlb_step=6.99e-6, train/loss_step=0.00116, global_step=1726.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 9/5971 [00:12<2:04:45,  1.26s/it, loss=0.115, v_num=0, train/loss_simple_step=0.00116, train/loss_vlb_step=6.99e-6, train/loss_step=0.00116, global_step=1726.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 9/5971 [00:12<2:04:46,  1.26s/it, loss=0.11, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000461, train/loss_step=0.139, global_step=1727.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:   0%|          | 10/5971 [00:13<2:01:13,  1.22s/it, loss=0.11, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000461, train/loss_step=0.139, global_step=1727.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 10/5971 [00:13<2:01:13,  1.22s/it, loss=0.11, v_num=0, train/loss_simple_step=0.0155, train/loss_vlb_step=6.46e-5, train/loss_step=0.0155, global_step=1727.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 11/5971 [00:14<1:58:27,  1.19s/it, loss=0.11, v_num=0, train/loss_simple_step=0.0155, train/loss_vlb_step=6.46e-5, train/loss_step=0.0155, global_step=1727.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 11/5971 [00:14<1:58:27,  1.19s/it, loss=0.11, v_num=0, train/loss_simple_step=0.0542, train/loss_vlb_step=0.000202, train/loss_step=0.0542, global_step=1727.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 12/5971 [00:16<2:05:43,  1.27s/it, loss=0.11, v_num=0, train/loss_simple_step=0.0542, train/loss_vlb_step=0.000202, train/loss_step=0.0542, global_step=1727.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 12/5971 [00:16<2:05:44,  1.27s/it, loss=0.116, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000679, train/loss_step=0.202, global_step=1727.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:   0%|          | 13/5971 [00:17<2:03:04,  1.24s/it, loss=0.116, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000679, train/loss_step=0.202, global_step=1727.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 13/5971 [00:17<2:03:05,  1.24s/it, loss=0.113, v_num=0, train/loss_simple_step=0.00215, train/loss_vlb_step=1.29e-5, train/loss_step=0.00215, global_step=1728.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 14/5971 [00:18<2:00:38,  1.22s/it, loss=0.113, v_num=0, train/loss_simple_step=0.00215, train/loss_vlb_step=1.29e-5, train/loss_step=0.00215, global_step=1728.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 14/5971 [00:18<2:00:39,  1.22s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0203, train/loss_vlb_step=8.05e-5, train/loss_step=0.0203, global_step=1728.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:   0%|          | 15/5971 [00:19<1:58:30,  1.19s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0203, train/loss_vlb_step=8.05e-5, train/loss_step=0.0203, global_step=1728.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 15/5971 [00:19<1:58:30,  1.19s/it, loss=0.113, v_num=0, train/loss_simple_step=0.052, train/loss_vlb_step=0.000182, train/loss_step=0.052, global_step=1728.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:   0%|          | 16/5971 [00:21<2:05:28,  1.26s/it, loss=0.113, v_num=0, train/loss_simple_step=0.052, train/loss_vlb_step=0.000182, train/loss_step=0.052, global_step=1728.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 16/5971 [00:21<2:05:29,  1.26s/it, loss=0.137, v_num=0, train/loss_simple_step=0.493, train/loss_vlb_step=0.00392, train/loss_step=0.493, global_step=1728.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:   0%|          | 17/5971 [00:22<2:03:25,  1.24s/it, loss=0.137, v_num=0, train/loss_simple_step=0.493, train/loss_vlb_step=0.00392, train/loss_step=0.493, global_step=1728.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 17/5971 [00:22<2:03:26,  1.24s/it, loss=0.144, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000636, train/loss_step=0.180, global_step=1729.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 18/5971 [00:23<2:01:25,  1.22s/it, loss=0.144, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000636, train/loss_step=0.180, global_step=1729.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 18/5971 [00:23<2:01:25,  1.22s/it, loss=0.114, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.17e-5, train/loss_step=0.018, global_step=1729.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:   0%|          | 19/5971 [00:24<1:59:50,  1.21s/it, loss=0.114, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.17e-5, train/loss_step=0.018, global_step=1729.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 19/5971 [00:24<1:59:50,  1.21s/it, loss=0.111, v_num=0, train/loss_simple_step=0.00787, train/loss_vlb_step=3.75e-5, train/loss_step=0.00787, global_step=1729.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 20/5971 [00:26<2:04:00,  1.25s/it, loss=0.111, v_num=0, train/loss_simple_step=0.00787, train/loss_vlb_step=3.75e-5, train/loss_step=0.00787, global_step=1729.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 20/5971 [00:26<2:04:00,  1.25s/it, loss=0.114, v_num=0, train/loss_simple_step=0.0762, train/loss_vlb_step=0.000255, train/loss_step=0.0762, global_step=1729.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:   0%|          | 21/5971 [00:27<2:02:27,  1.23s/it, loss=0.114, v_num=0, train/loss_simple_step=0.0762, train/loss_vlb_step=0.000255, train/loss_step=0.0762, global_step=1729.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 21/5971 [00:27<2:02:27,  1.23s/it, loss=0.103, v_num=0, train/loss_simple_step=0.182, train/loss_vlb_step=0.000633, train/loss_step=0.182, global_step=1730.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:   0%|          | 22/5971 [00:28<2:00:53,  1.22s/it, loss=0.103, v_num=0, train/loss_simple_step=0.182, train/loss_vlb_step=0.000633, train/loss_step=0.182, global_step=1730.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 22/5971 [00:28<2:00:54,  1.22s/it, loss=0.105, v_num=0, train/loss_simple_step=0.0974, train/loss_vlb_step=0.00032, train/loss_step=0.0974, global_step=1730.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 23/5971 [00:28<1:59:29,  1.21s/it, loss=0.105, v_num=0, train/loss_simple_step=0.0974, train/loss_vlb_step=0.00032, train/loss_step=0.0974, global_step=1730.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 23/5971 [00:28<1:59:29,  1.21s/it, loss=0.114, v_num=0, train/loss_simple_step=0.178, train/loss_vlb_step=0.000606, train/loss_step=0.178, global_step=1730.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:   0%|          | 24/5971 [00:31<2:05:03,  1.26s/it, loss=0.114, v_num=0, train/loss_simple_step=0.178, train/loss_vlb_step=0.000606, train/loss_step=0.178, global_step=1730.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 24/5971 [00:31<2:05:03,  1.26s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0375, train/loss_vlb_step=0.00014, train/loss_step=0.0375, global_step=1730.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 25/5971 [00:32<2:03:40,  1.25s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0375, train/loss_vlb_step=0.00014, train/loss_step=0.0375, global_step=1730.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 25/5971 [00:32<2:03:40,  1.25s/it, loss=0.0983, v_num=0, train/loss_simple_step=0.0456, train/loss_vlb_step=0.000155, train/loss_step=0.0456, global_step=1731.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 26/5971 [00:33<2:02:21,  1.23s/it, loss=0.0983, v_num=0, train/loss_simple_step=0.0456, train/loss_vlb_step=0.000155, train/loss_step=0.0456, global_step=1731.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 26/5971 [00:33<2:02:21,  1.23s/it, loss=0.109, v_num=0, train/loss_simple_step=0.374, train/loss_vlb_step=0.00168, train/loss_step=0.374, global_step=1731.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:   0%|          | 27/5971 [00:34<2:01:02,  1.22s/it, loss=0.109, v_num=0, train/loss_simple_step=0.374, train/loss_vlb_step=0.00168, train/loss_step=0.374, global_step=1731.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 27/5971 [00:34<2:01:02,  1.22s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0467, train/loss_vlb_step=0.000167, train/loss_step=0.0467, global_step=1731.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 28/5971 [00:36<2:04:17,  1.25s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0467, train/loss_vlb_step=0.000167, train/loss_step=0.0467, global_step=1731.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 28/5971 [00:36<2:04:18,  1.25s/it, loss=0.114, v_num=0, train/loss_simple_step=0.0569, train/loss_vlb_step=0.000204, train/loss_step=0.0569, global_step=1731.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 29/5971 [00:37<2:03:08,  1.24s/it, loss=0.114, v_num=0, train/loss_simple_step=0.0569, train/loss_vlb_step=0.000204, train/loss_step=0.0569, global_step=1731.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   0%|          | 29/5971 [00:37<2:03:08,  1.24s/it, loss=0.123, v_num=0, train/loss_simple_step=0.312, train/loss_vlb_step=0.00174, train/loss_step=0.312, global_step=1732.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:   1%|          | 30/5971 [00:38<2:02:07,  1.23s/it, loss=0.123, v_num=0, train/loss_simple_step=0.312, train/loss_vlb_step=0.00174, train/loss_step=0.312, global_step=1732.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 30/5971 [00:38<2:02:07,  1.23s/it, loss=0.137, v_num=0, train/loss_simple_step=0.310, train/loss_vlb_step=0.0014, train/loss_step=0.310, global_step=1732.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:   1%|          | 31/5971 [00:39<2:01:01,  1.22s/it, loss=0.137, v_num=0, train/loss_simple_step=0.310, train/loss_vlb_step=0.0014, train/loss_step=0.310, global_step=1732.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 31/5971 [00:39<2:01:01,  1.22s/it, loss=0.148, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.00113, train/loss_step=0.268, global_step=1732.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 32/5971 [00:41<2:03:41,  1.25s/it, loss=0.148, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.00113, train/loss_step=0.268, global_step=1732.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 32/5971 [00:41<2:03:41,  1.25s/it, loss=0.138, v_num=0, train/loss_simple_step=0.00254, train/loss_vlb_step=1.42e-5, train/loss_step=0.00254, global_step=1732.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 33/5971 [00:42<2:02:39,  1.24s/it, loss=0.138, v_num=0, train/loss_simple_step=0.00254, train/loss_vlb_step=1.42e-5, train/loss_step=0.00254, global_step=1732.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 33/5971 [00:42<2:02:39,  1.24s/it, loss=0.138, v_num=0, train/loss_simple_step=0.00553, train/loss_vlb_step=2.94e-5, train/loss_step=0.00553, global_step=1733.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 34/5971 [00:43<2:01:40,  1.23s/it, loss=0.138, v_num=0, train/loss_simple_step=0.00553, train/loss_vlb_step=2.94e-5, train/loss_step=0.00553, global_step=1733.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 34/5971 [00:43<2:01:40,  1.23s/it, loss=0.154, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00133, train/loss_step=0.327, global_step=1733.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:   1%|          | 35/5971 [00:43<2:00:44,  1.22s/it, loss=0.154, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00133, train/loss_step=0.327, global_step=1733.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 35/5971 [00:43<2:00:44,  1.22s/it, loss=0.151, v_num=0, train/loss_simple_step=0.00681, train/loss_vlb_step=3.3e-5, train/loss_step=0.00681, global_step=1733.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 36/5971 [00:46<2:04:13,  1.26s/it, loss=0.151, v_num=0, train/loss_simple_step=0.00681, train/loss_vlb_step=3.3e-5, train/loss_step=0.00681, global_step=1733.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 36/5971 [00:46<2:04:13,  1.26s/it, loss=0.136, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.000876, train/loss_step=0.197, global_step=1733.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:   1%|          | 37/5971 [00:47<2:03:20,  1.25s/it, loss=0.136, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.000876, train/loss_step=0.197, global_step=1733.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 37/5971 [00:47<2:03:20,  1.25s/it, loss=0.128, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.77e-5, train/loss_step=0.0102, global_step=1734.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 38/5971 [00:48<2:02:22,  1.24s/it, loss=0.128, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.77e-5, train/loss_step=0.0102, global_step=1734.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 38/5971 [00:48<2:02:22,  1.24s/it, loss=0.127, v_num=0, train/loss_simple_step=0.00604, train/loss_vlb_step=3e-5, train/loss_step=0.00604, global_step=1734.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:   1%|          | 39/5971 [00:49<2:01:29,  1.23s/it, loss=0.127, v_num=0, train/loss_simple_step=0.00604, train/loss_vlb_step=3e-5, train/loss_step=0.00604, global_step=1734.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 39/5971 [00:49<2:01:29,  1.23s/it, loss=0.127, v_num=0, train/loss_simple_step=0.00578, train/loss_vlb_step=2.91e-5, train/loss_step=0.00578, global_step=1734.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 40/5971 [00:51<2:03:35,  1.25s/it, loss=0.127, v_num=0, train/loss_simple_step=0.00578, train/loss_vlb_step=2.91e-5, train/loss_step=0.00578, global_step=1734.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 40/5971 [00:51<2:03:35,  1.25s/it, loss=0.124, v_num=0, train/loss_simple_step=0.00852, train/loss_vlb_step=3.97e-5, train/loss_step=0.00852, global_step=1734.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 41/5971 [00:52<2:02:44,  1.24s/it, loss=0.124, v_num=0, train/loss_simple_step=0.00852, train/loss_vlb_step=3.97e-5, train/loss_step=0.00852, global_step=1734.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 41/5971 [00:52<2:02:44,  1.24s/it, loss=0.123, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000547, train/loss_step=0.162, global_step=1735.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:   1%|          | 42/5971 [00:53<2:01:54,  1.23s/it, loss=0.123, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000547, train/loss_step=0.162, global_step=1735.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 42/5971 [00:53<2:01:54,  1.23s/it, loss=0.141, v_num=0, train/loss_simple_step=0.465, train/loss_vlb_step=0.00291, train/loss_step=0.465, global_step=1735.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:   1%|          | 43/5971 [00:53<2:01:05,  1.23s/it, loss=0.141, v_num=0, train/loss_simple_step=0.465, train/loss_vlb_step=0.00291, train/loss_step=0.465, global_step=1735.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 43/5971 [00:53<2:01:05,  1.23s/it, loss=0.134, v_num=0, train/loss_simple_step=0.0281, train/loss_vlb_step=0.00011, train/loss_step=0.0281, global_step=1735.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 44/5971 [00:56<2:03:39,  1.25s/it, loss=0.134, v_num=0, train/loss_simple_step=0.0281, train/loss_vlb_step=0.00011, train/loss_step=0.0281, global_step=1735.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 44/5971 [00:56<2:03:39,  1.25s/it, loss=0.143, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000805, train/loss_step=0.225, global_step=1735.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:   1%|          | 45/5971 [00:57<2:02:54,  1.24s/it, loss=0.143, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000805, train/loss_step=0.225, global_step=1735.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 45/5971 [00:57<2:02:55,  1.24s/it, loss=0.148, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000455, train/loss_step=0.135, global_step=1736.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 46/5971 [00:58<2:02:09,  1.24s/it, loss=0.148, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000455, train/loss_step=0.135, global_step=1736.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 46/5971 [00:58<2:02:09,  1.24s/it, loss=0.14, v_num=0, train/loss_simple_step=0.223, train/loss_vlb_step=0.000762, train/loss_step=0.223, global_step=1736.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:   1%|          | 47/5971 [00:59<2:01:24,  1.23s/it, loss=0.14, v_num=0, train/loss_simple_step=0.223, train/loss_vlb_step=0.000762, train/loss_step=0.223, global_step=1736.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 47/5971 [00:59<2:01:24,  1.23s/it, loss=0.141, v_num=0, train/loss_simple_step=0.0628, train/loss_vlb_step=0.000214, train/loss_step=0.0628, global_step=1736.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 48/5971 [01:01<2:03:12,  1.25s/it, loss=0.141, v_num=0, train/loss_simple_step=0.0628, train/loss_vlb_step=0.000214, train/loss_step=0.0628, global_step=1736.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 48/5971 [01:01<2:03:13,  1.25s/it, loss=0.174, v_num=0, train/loss_simple_step=0.713, train/loss_vlb_step=0.0235, train/loss_step=0.713, global_step=1736.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:   1%|          | 49/5971 [01:02<2:02:31,  1.24s/it, loss=0.174, v_num=0, train/loss_simple_step=0.713, train/loss_vlb_step=0.0235, train/loss_step=0.713, global_step=1736.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 49/5971 [01:02<2:02:31,  1.24s/it, loss=0.163, v_num=0, train/loss_simple_step=0.0986, train/loss_vlb_step=0.000324, train/loss_step=0.0986, global_step=1737.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 50/5971 [01:02<2:01:51,  1.23s/it, loss=0.163, v_num=0, train/loss_simple_step=0.0986, train/loss_vlb_step=0.000324, train/loss_step=0.0986, global_step=1737.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 50/5971 [01:02<2:01:51,  1.23s/it, loss=0.148, v_num=0, train/loss_simple_step=0.0056, train/loss_vlb_step=2.95e-5, train/loss_step=0.0056, global_step=1737.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:   1%|          | 51/5971 [01:03<2:01:12,  1.23s/it, loss=0.148, v_num=0, train/loss_simple_step=0.0056, train/loss_vlb_step=2.95e-5, train/loss_step=0.0056, global_step=1737.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 51/5971 [01:03<2:01:12,  1.23s/it, loss=0.157, v_num=0, train/loss_simple_step=0.445, train/loss_vlb_step=0.00272, train/loss_step=0.445, global_step=1737.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:   1%|          | 52/5971 [01:06<2:02:51,  1.25s/it, loss=0.157, v_num=0, train/loss_simple_step=0.445, train/loss_vlb_step=0.00272, train/loss_step=0.445, global_step=1737.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 52/5971 [01:06<2:02:51,  1.25s/it, loss=0.197, v_num=0, train/loss_simple_step=0.805, train/loss_vlb_step=0.0417, train/loss_step=0.805, global_step=1737.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:   1%|          | 53/5971 [01:06<2:02:13,  1.24s/it, loss=0.197, v_num=0, train/loss_simple_step=0.805, train/loss_vlb_step=0.0417, train/loss_step=0.805, global_step=1737.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 53/5971 [01:06<2:02:13,  1.24s/it, loss=0.197, v_num=0, train/loss_simple_step=0.00472, train/loss_vlb_step=2.29e-5, train/loss_step=0.00472, global_step=1738.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 54/5971 [01:07<2:01:33,  1.23s/it, loss=0.197, v_num=0, train/loss_simple_step=0.00472, train/loss_vlb_step=2.29e-5, train/loss_step=0.00472, global_step=1738.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 54/5971 [01:07<2:01:33,  1.23s/it, loss=0.198, v_num=0, train/loss_simple_step=0.360, train/loss_vlb_step=0.00141, train/loss_step=0.360, global_step=1738.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:   1%|          | 55/5971 [01:08<2:00:55,  1.23s/it, loss=0.198, v_num=0, train/loss_simple_step=0.360, train/loss_vlb_step=0.00141, train/loss_step=0.360, global_step=1738.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 55/5971 [01:08<2:00:55,  1.23s/it, loss=0.2, v_num=0, train/loss_simple_step=0.0409, train/loss_vlb_step=0.000145, train/loss_step=0.0409, global_step=1738.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 56/5971 [01:11<2:03:48,  1.26s/it, loss=0.2, v_num=0, train/loss_simple_step=0.0409, train/loss_vlb_step=0.000145, train/loss_step=0.0409, global_step=1738.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 56/5971 [01:11<2:03:48,  1.26s/it, loss=0.19, v_num=0, train/loss_simple_step=0.00329, train/loss_vlb_step=1.85e-5, train/loss_step=0.00329, global_step=1738.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 57/5971 [01:12<2:03:15,  1.25s/it, loss=0.19, v_num=0, train/loss_simple_step=0.00329, train/loss_vlb_step=1.85e-5, train/loss_step=0.00329, global_step=1738.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 57/5971 [01:12<2:03:15,  1.25s/it, loss=0.195, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000363, train/loss_step=0.110, global_step=1739.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:   1%|          | 58/5971 [01:13<2:02:39,  1.24s/it, loss=0.195, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000363, train/loss_step=0.110, global_step=1739.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 58/5971 [01:13<2:02:39,  1.24s/it, loss=0.195, v_num=0, train/loss_simple_step=0.00429, train/loss_vlb_step=2.3e-5, train/loss_step=0.00429, global_step=1739.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 59/5971 [01:14<2:02:03,  1.24s/it, loss=0.195, v_num=0, train/loss_simple_step=0.00429, train/loss_vlb_step=2.3e-5, train/loss_step=0.00429, global_step=1739.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 59/5971 [01:14<2:02:03,  1.24s/it, loss=0.206, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.000734, train/loss_step=0.210, global_step=1739.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:   1%|          | 60/5971 [01:16<2:03:29,  1.25s/it, loss=0.206, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.000734, train/loss_step=0.210, global_step=1739.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 60/5971 [01:16<2:03:29,  1.25s/it, loss=0.208, v_num=0, train/loss_simple_step=0.0649, train/loss_vlb_step=0.00022, train/loss_step=0.0649, global_step=1739.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 61/5971 [01:17<2:02:54,  1.25s/it, loss=0.208, v_num=0, train/loss_simple_step=0.0649, train/loss_vlb_step=0.00022, train/loss_step=0.0649, global_step=1739.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 61/5971 [01:17<2:02:54,  1.25s/it, loss=0.217, v_num=0, train/loss_simple_step=0.332, train/loss_vlb_step=0.00163, train/loss_step=0.332, global_step=1740.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:   1%|          | 62/5971 [01:18<2:02:19,  1.24s/it, loss=0.217, v_num=0, train/loss_simple_step=0.332, train/loss_vlb_step=0.00163, train/loss_step=0.332, global_step=1740.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 62/5971 [01:18<2:02:19,  1.24s/it, loss=0.204, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000823, train/loss_step=0.203, global_step=1740.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 63/5971 [01:19<2:01:47,  1.24s/it, loss=0.204, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000823, train/loss_step=0.203, global_step=1740.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 63/5971 [01:19<2:01:47,  1.24s/it, loss=0.202, v_num=0, train/loss_simple_step=0.00315, train/loss_vlb_step=1.75e-5, train/loss_step=0.00315, global_step=1740.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 64/5971 [01:21<2:03:06,  1.25s/it, loss=0.202, v_num=0, train/loss_simple_step=0.00315, train/loss_vlb_step=1.75e-5, train/loss_step=0.00315, global_step=1740.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 64/5971 [01:21<2:03:06,  1.25s/it, loss=0.202, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000787, train/loss_step=0.211, global_step=1740.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:   1%|          | 65/5971 [01:22<2:02:35,  1.25s/it, loss=0.202, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000787, train/loss_step=0.211, global_step=1740.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 65/5971 [01:22<2:02:35,  1.25s/it, loss=0.212, v_num=0, train/loss_simple_step=0.337, train/loss_vlb_step=0.00164, train/loss_step=0.337, global_step=1741.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:   1%|          | 66/5971 [01:23<2:02:01,  1.24s/it, loss=0.212, v_num=0, train/loss_simple_step=0.337, train/loss_vlb_step=0.00164, train/loss_step=0.337, global_step=1741.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 66/5971 [01:23<2:02:01,  1.24s/it, loss=0.202, v_num=0, train/loss_simple_step=0.0322, train/loss_vlb_step=0.000121, train/loss_step=0.0322, global_step=1741.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 67/5971 [01:23<2:01:30,  1.23s/it, loss=0.202, v_num=0, train/loss_simple_step=0.0322, train/loss_vlb_step=0.000121, train/loss_step=0.0322, global_step=1741.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 67/5971 [01:23<2:01:30,  1.23s/it, loss=0.199, v_num=0, train/loss_simple_step=0.00456, train/loss_vlb_step=2.32e-5, train/loss_step=0.00456, global_step=1741.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 68/5971 [01:26<2:02:42,  1.25s/it, loss=0.199, v_num=0, train/loss_simple_step=0.00456, train/loss_vlb_step=2.32e-5, train/loss_step=0.00456, global_step=1741.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 68/5971 [01:26<2:02:42,  1.25s/it, loss=0.164, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.3e-5, train/loss_step=0.0115, global_step=1741.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:   1%|          | 69/5971 [01:26<2:02:13,  1.24s/it, loss=0.164, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.3e-5, train/loss_step=0.0115, global_step=1741.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 69/5971 [01:26<2:02:13,  1.24s/it, loss=0.169, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000669, train/loss_step=0.190, global_step=1742.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 70/5971 [01:27<2:01:45,  1.24s/it, loss=0.169, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000669, train/loss_step=0.190, global_step=1742.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 70/5971 [01:27<2:01:45,  1.24s/it, loss=0.17, v_num=0, train/loss_simple_step=0.0222, train/loss_vlb_step=9.19e-5, train/loss_step=0.0222, global_step=1742.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 71/5971 [01:28<2:01:14,  1.23s/it, loss=0.17, v_num=0, train/loss_simple_step=0.0222, train/loss_vlb_step=9.19e-5, train/loss_step=0.0222, global_step=1742.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 71/5971 [01:28<2:01:14,  1.23s/it, loss=0.148, v_num=0, train/loss_simple_step=0.00218, train/loss_vlb_step=1.26e-5, train/loss_step=0.00218, global_step=1742.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 72/5971 [01:31<2:02:34,  1.25s/it, loss=0.148, v_num=0, train/loss_simple_step=0.00218, train/loss_vlb_step=1.26e-5, train/loss_step=0.00218, global_step=1742.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 72/5971 [01:31<2:02:34,  1.25s/it, loss=0.118, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000888, train/loss_step=0.216, global_step=1742.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:   1%|          | 73/5971 [01:31<2:02:08,  1.24s/it, loss=0.118, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000888, train/loss_step=0.216, global_step=1742.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 73/5971 [01:31<2:02:08,  1.24s/it, loss=0.136, v_num=0, train/loss_simple_step=0.367, train/loss_vlb_step=0.00185, train/loss_step=0.367, global_step=1743.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:   1%|          | 74/5971 [01:32<2:01:39,  1.24s/it, loss=0.136, v_num=0, train/loss_simple_step=0.367, train/loss_vlb_step=0.00185, train/loss_step=0.367, global_step=1743.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|          | 74/5971 [01:32<2:01:39,  1.24s/it, loss=0.119, v_num=0, train/loss_simple_step=0.0195, train/loss_vlb_step=8.14e-5, train/loss_step=0.0195, global_step=1743.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|▏         | 75/5971 [01:33<2:01:10,  1.23s/it, loss=0.119, v_num=0, train/loss_simple_step=0.0195, train/loss_vlb_step=8.14e-5, train/loss_step=0.0195, global_step=1743.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|▏         | 75/5971 [01:33<2:01:10,  1.23s/it, loss=0.137, v_num=0, train/loss_simple_step=0.404, train/loss_vlb_step=0.00253, train/loss_step=0.404, global_step=1743.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:   1%|▏         | 76/5971 [01:36<2:02:55,  1.25s/it, loss=0.137, v_num=0, train/loss_simple_step=0.404, train/loss_vlb_step=0.00253, train/loss_step=0.404, global_step=1743.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|▏         | 76/5971 [01:36<2:02:55,  1.25s/it, loss=0.137, v_num=0, train/loss_simple_step=0.00275, train/loss_vlb_step=1.51e-5, train/loss_step=0.00275, global_step=1743.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|▏         | 77/5971 [01:37<2:02:27,  1.25s/it, loss=0.137, v_num=0, train/loss_simple_step=0.00275, train/loss_vlb_step=1.51e-5, train/loss_step=0.00275, global_step=1743.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|▏         | 77/5971 [01:37<2:02:27,  1.25s/it, loss=0.139, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000522, train/loss_step=0.151, global_step=1744.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:   1%|▏         | 78/5971 [01:38<2:01:58,  1.24s/it, loss=0.139, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000522, train/loss_step=0.151, global_step=1744.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|▏         | 78/5971 [01:38<2:01:58,  1.24s/it, loss=0.157, v_num=0, train/loss_simple_step=0.351, train/loss_vlb_step=0.0018, train/loss_step=0.351, global_step=1744.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:   1%|▏         | 79/5971 [01:38<2:01:30,  1.24s/it, loss=0.157, v_num=0, train/loss_simple_step=0.351, train/loss_vlb_step=0.0018, train/loss_step=0.351, global_step=1744.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|▏         | 79/5971 [01:38<2:01:30,  1.24s/it, loss=0.156, v_num=0, train/loss_simple_step=0.191, train/loss_vlb_step=0.000627, train/loss_step=0.191, global_step=1744.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|▏         | 80/5971 [01:41<2:02:38,  1.25s/it, loss=0.156, v_num=0, train/loss_simple_step=0.191, train/loss_vlb_step=0.000627, train/loss_step=0.191, global_step=1744.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|▏         | 80/5971 [01:41<2:02:38,  1.25s/it, loss=0.153, v_num=0, train/loss_simple_step=0.00403, train/loss_vlb_step=2.07e-5, train/loss_step=0.00403, global_step=1744.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|▏         | 81/5971 [01:42<2:02:12,  1.24s/it, loss=0.153, v_num=0, train/loss_simple_step=0.00403, train/loss_vlb_step=2.07e-5, train/loss_step=0.00403, global_step=1744.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|▏         | 81/5971 [01:42<2:02:12,  1.24s/it, loss=0.16, v_num=0, train/loss_simple_step=0.474, train/loss_vlb_step=0.00415, train/loss_step=0.474, global_step=1745.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:   1%|▏         | 82/5971 [01:42<2:01:46,  1.24s/it, loss=0.16, v_num=0, train/loss_simple_step=0.474, train/loss_vlb_step=0.00415, train/loss_step=0.474, global_step=1745.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|▏         | 82/5971 [01:42<2:01:46,  1.24s/it, loss=0.15, v_num=0, train/loss_simple_step=0.00511, train/loss_vlb_step=2.65e-5, train/loss_step=0.00511, global_step=1745.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|▏         | 83/5971 [01:43<2:01:21,  1.24s/it, loss=0.15, v_num=0, train/loss_simple_step=0.00511, train/loss_vlb_step=2.65e-5, train/loss_step=0.00511, global_step=1745.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|▏         | 83/5971 [01:43<2:01:21,  1.24s/it, loss=0.168, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.0016, train/loss_step=0.364, global_step=1745.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:   1%|▏         | 84/5971 [01:46<2:02:22,  1.25s/it, loss=0.168, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.0016, train/loss_step=0.364, global_step=1745.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|▏         | 84/5971 [01:46<2:02:22,  1.25s/it, loss=0.162, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1745.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|▏         | 85/5971 [01:46<2:01:57,  1.24s/it, loss=0.162, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=1745.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|▏         | 85/5971 [01:46<2:01:57,  1.24s/it, loss=0.146, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.7e-5, train/loss_step=0.0122, global_step=1746.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:   1%|▏         | 86/5971 [01:47<2:01:30,  1.24s/it, loss=0.146, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.7e-5, train/loss_step=0.0122, global_step=1746.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|▏         | 86/5971 [01:47<2:01:30,  1.24s/it, loss=0.144, v_num=0, train/loss_simple_step=0.00912, train/loss_vlb_step=4.35e-5, train/loss_step=0.00912, global_step=1746.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|▏         | 87/5971 [01:48<2:01:05,  1.23s/it, loss=0.144, v_num=0, train/loss_simple_step=0.00912, train/loss_vlb_step=4.35e-5, train/loss_step=0.00912, global_step=1746.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|▏         | 87/5971 [01:48<2:01:05,  1.23s/it, loss=0.147, v_num=0, train/loss_simple_step=0.0531, train/loss_vlb_step=0.000187, train/loss_step=0.0531, global_step=1746.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:   1%|▏         | 88/5971 [01:50<2:02:10,  1.25s/it, loss=0.147, v_num=0, train/loss_simple_step=0.0531, train/loss_vlb_step=0.000187, train/loss_step=0.0531, global_step=1746.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|▏         | 88/5971 [01:50<2:02:10,  1.25s/it, loss=0.184, v_num=0, train/loss_simple_step=0.758, train/loss_vlb_step=0.025, train/loss_step=0.758, global_step=1746.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:   1%|▏         | 89/5971 [01:51<2:01:48,  1.24s/it, loss=0.184, v_num=0, train/loss_simple_step=0.758, train/loss_vlb_step=0.025, train/loss_step=0.758, global_step=1746.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   1%|▏         | 89/5971 [01:51<2:01:49,  1.24s/it, loss=0.181, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000386, train/loss_step=0.117, global_step=1747.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   2%|▏         | 90/5971 [01:52<2:01:25,  1.24s/it, loss=0.181, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000386, train/loss_step=0.117, global_step=1747.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   2%|▏         | 90/5971 [01:52<2:01:25,  1.24s/it, loss=0.18, v_num=0, train/loss_simple_step=0.0208, train/loss_vlb_step=8.33e-5, train/loss_step=0.0208, global_step=1747.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   2%|▏         | 91/5971 [01:53<2:01:03,  1.24s/it, loss=0.18, v_num=0, train/loss_simple_step=0.0208, train/loss_vlb_step=8.33e-5, train/loss_step=0.0208, global_step=1747.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   2%|▏         | 91/5971 [01:53<2:01:03,  1.24s/it, loss=0.199, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00195, train/loss_step=0.364, global_step=1747.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:   2%|▏         | 92/5971 [01:55<2:02:00,  1.25s/it, loss=0.199, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00195, train/loss_step=0.364, global_step=1747.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   2%|▏         | 92/5971 [01:55<2:02:00,  1.25s/it, loss=0.188, v_num=0, train/loss_simple_step=0.00768, train/loss_vlb_step=3.76e-5, train/loss_step=0.00768, global_step=1747.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   2%|▏         | 93/5971 [01:56<2:01:47,  1.24s/it, loss=0.188, v_num=0, train/loss_simple_step=0.00768, train/loss_vlb_step=3.76e-5, train/loss_step=0.00768, global_step=1747.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   2%|▏         | 93/5971 [01:56<2:01:47,  1.24s/it, loss=0.17, v_num=0, train/loss_simple_step=0.00278, train/loss_vlb_step=1.59e-5, train/loss_step=0.00278, global_step=1748.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:   2%|▏         | 94/5971 [01:57<2:01:25,  1.24s/it, loss=0.17, v_num=0, train/loss_simple_step=0.00278, train/loss_vlb_step=1.59e-5, train/loss_step=0.00278, global_step=1748.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   2%|▏         | 94/5971 [01:57<2:01:25,  1.24s/it, loss=0.174, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=1748.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:   2%|▏         | 95/5971 [01:58<2:01:01,  1.24s/it, loss=0.174, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=1748.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   2%|▏         | 95/5971 [01:58<2:01:01,  1.24s/it, loss=0.159, v_num=0, train/loss_simple_step=0.0998, train/loss_vlb_step=0.00033, train/loss_step=0.0998, global_step=1748.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   2%|▏         | 96/5971 [02:00<2:02:05,  1.25s/it, loss=0.159, v_num=0, train/loss_simple_step=0.0998, train/loss_vlb_step=0.00033, train/loss_step=0.0998, global_step=1748.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   2%|▏         | 96/5971 [02:00<2:02:05,  1.25s/it, loss=0.176, v_num=0, train/loss_simple_step=0.347, train/loss_vlb_step=0.00197, train/loss_step=0.347, global_step=1748.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:   2%|▏         | 97/5971 [02:01<2:01:45,  1.24s/it, loss=0.176, v_num=0, train/loss_simple_step=0.347, train/loss_vlb_step=0.00197, train/loss_step=0.347, global_step=1748.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   2%|▏         | 97/5971 [02:01<2:01:45,  1.24s/it, loss=0.204, v_num=0, train/loss_simple_step=0.707, train/loss_vlb_step=0.0198, train/loss_step=0.707, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:   2%|▏         | 98/5971 [02:02<2:01:22,  1.24s/it, loss=0.204, v_num=0, train/loss_simple_step=0.707, train/loss_vlb_step=0.0198, train/loss_step=0.707, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   2%|▏         | 98/5971 [02:02<2:01:22,  1.24s/it, loss=0.189, v_num=0, train/loss_simple_step=0.0633, train/loss_vlb_step=0.000212, train/loss_step=0.0633, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   2%|▏         | 99/5971 [02:03<2:00:59,  1.24s/it, loss=0.189, v_num=0, train/loss_simple_step=0.0633, train/loss_vlb_step=0.000212, train/loss_step=0.0633, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   2%|▏         | 99/5971 [02:03<2:01:00,  1.24s/it, loss=0.182, v_num=0, train/loss_simple_step=0.0447, train/loss_vlb_step=0.000162, train/loss_step=0.0447, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   2%|▏         | 100/5971 [02:05<2:01:51,  1.25s/it, loss=0.182, v_num=0, train/loss_simple_step=0.0447, train/loss_vlb_step=0.000162, train/loss_step=0.0447, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   2%|▏         | 100/5971 [02:05<2:01:51,  1.25s/it, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:10,  2.35it/s][A
Epoch 3:   2%|▏         | 102/5971 [02:06<1:59:53,  1.23s/it, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   1%|          | 2/167 [00:00<00:42,  3.86it/s][A
Epoch 3:   2%|▏         | 104/5971 [02:06<1:57:44,  1.20s/it, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   3%|▎         | 5/167 [00:00<00:16,  9.77it/s][A
Epoch 3:   2%|▏         | 107/5971 [02:06<1:54:31,  1.17s/it, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.98it/s][A
Epoch 3:   2%|▏         | 110/5971 [02:06<1:51:28,  1.14s/it, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   7%|▋         | 11/167 [00:00<00:08, 18.01it/s][A
Epoch 3:   2%|▏         | 114/5971 [02:06<1:47:37,  1.10s/it, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   9%|▉         | 15/167 [00:01<00:06, 22.11it/s][A
Epoch 3:   2%|▏         | 118/5971 [02:06<1:44:03,  1.07s/it, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  11%|█         | 18/167 [00:01<00:06, 24.15it/s][A

Validating:  13%|█▎        | 21/167 [00:01<00:05, 25.03it/s][A
Epoch 3:   2%|▏         | 122/5971 [02:07<1:40:42,  1.03s/it, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  14%|█▍        | 24/167 [00:01<00:06, 23.69it/s][A
Epoch 3:   2%|▏         | 126/5971 [02:07<1:37:36,  1.00s/it, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 25.19it/s][A
Epoch 3:   2%|▏         | 130/5971 [02:07<1:34:40,  1.03it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 25.65it/s][A

Validating:  20%|█▉        | 33/167 [00:01<00:05, 25.61it/s][A
Epoch 3:   2%|▏         | 134/5971 [02:07<1:31:54,  1.06it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  22%|██▏       | 36/167 [00:01<00:04, 26.60it/s][A
Epoch 3:   2%|▏         | 138/5971 [02:07<1:29:18,  1.09it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  23%|██▎       | 39/167 [00:01<00:04, 26.98it/s][A
Epoch 3:   2%|▏         | 142/5971 [02:07<1:26:50,  1.12it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 25.66it/s][A

Validating:  27%|██▋       | 45/167 [00:02<00:04, 25.77it/s][A
Epoch 3:   2%|▏         | 146/5971 [02:08<1:24:32,  1.15it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 26.32it/s][A
Epoch 3:   3%|▎         | 150/5971 [02:08<1:22:20,  1.18it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  31%|███       | 51/167 [00:02<00:04, 26.97it/s][A
Epoch 3:   3%|▎         | 154/5971 [02:08<1:20:14,  1.21it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 26.16it/s][A

Validating:  34%|███▍      | 57/167 [00:02<00:04, 26.96it/s][A
Epoch 3:   3%|▎         | 158/5971 [02:08<1:18:15,  1.24it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  36%|███▌      | 60/167 [00:02<00:04, 26.06it/s][A
Epoch 3:   3%|▎         | 162/5971 [02:08<1:16:23,  1.27it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  38%|███▊      | 63/167 [00:02<00:03, 26.51it/s][A
Epoch 3:   3%|▎         | 166/5971 [02:08<1:14:35,  1.30it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  40%|███▉      | 66/167 [00:02<00:03, 27.07it/s][A

Validating:  41%|████▏     | 69/167 [00:03<00:03, 25.64it/s][A
Epoch 3:   3%|▎         | 170/5971 [02:08<1:12:53,  1.33it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 24.85it/s][A
Epoch 3:   3%|▎         | 174/5971 [02:09<1:11:15,  1.36it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 24.50it/s][A
Epoch 3:   3%|▎         | 178/5971 [02:09<1:09:42,  1.38it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 23.79it/s][A

Validating:  49%|████▊     | 81/167 [00:03<00:03, 24.11it/s][A
Epoch 3:   3%|▎         | 182/5971 [02:09<1:08:14,  1.41it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  50%|█████     | 84/167 [00:03<00:03, 24.41it/s][A
Epoch 3:   3%|▎         | 186/5971 [02:09<1:06:48,  1.44it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  52%|█████▏    | 87/167 [00:03<00:03, 23.80it/s][A
Epoch 3:   3%|▎         | 190/5971 [02:09<1:05:27,  1.47it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  54%|█████▍    | 90/167 [00:03<00:03, 24.39it/s][A

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 25.42it/s][A
Epoch 3:   3%|▎         | 194/5971 [02:09<1:04:08,  1.50it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 25.37it/s][A
Epoch 3:   3%|▎         | 198/5971 [02:10<1:02:52,  1.53it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 26.47it/s][A
Epoch 3:   3%|▎         | 202/5971 [02:10<1:01:39,  1.56it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  61%|██████    | 102/167 [00:04<00:02, 26.51it/s][A

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 27.11it/s][A
Epoch 3:   3%|▎         | 206/5971 [02:10<1:00:29,  1.59it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 27.66it/s][A
Epoch 3:   4%|▎         | 210/5971 [02:10<59:22,  1.62it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 27.13it/s][A
Epoch 3:   4%|▎         | 214/5971 [02:10<58:17,  1.65it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  68%|██████▊   | 114/167 [00:04<00:01, 27.33it/s][A

Validating:  70%|███████   | 117/167 [00:05<00:02, 20.42it/s][A
Epoch 3:   4%|▎         | 218/5971 [02:10<57:18,  1.67it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  72%|███████▏  | 120/167 [00:05<00:02, 21.74it/s][A
Epoch 3:   4%|▎         | 222/5971 [02:11<56:18,  1.70it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 22.23it/s][A
Epoch 3:   4%|▍         | 226/5971 [02:11<55:20,  1.73it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 23.59it/s][A

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 23.21it/s][A
Epoch 3:   4%|▍         | 230/5971 [02:11<54:25,  1.76it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 22.30it/s][A
Epoch 3:   4%|▍         | 234/5971 [02:11<53:32,  1.79it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  81%|████████  | 135/167 [00:05<00:01, 22.74it/s][A
Epoch 3:   4%|▍         | 238/5971 [02:11<52:40,  1.81it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 23.13it/s][A

Validating:  84%|████████▍ | 141/167 [00:06<00:01, 22.84it/s][A
Epoch 3:   4%|▍         | 242/5971 [02:11<51:50,  1.84it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 23.99it/s][A
Epoch 3:   4%|▍         | 246/5971 [02:12<51:01,  1.87it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 24.95it/s][A
Epoch 3:   4%|▍         | 250/5971 [02:12<50:13,  1.90it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 26.59it/s][A
Epoch 3:   4%|▍         | 254/5971 [02:12<49:26,  1.93it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 26.77it/s][A

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 25.72it/s][A
Epoch 3:   4%|▍         | 258/5971 [02:12<48:42,  1.95it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 25.84it/s][A
Epoch 3:   4%|▍         | 262/5971 [02:12<47:59,  1.98it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 25.95it/s][A
Epoch 3:   4%|▍         | 266/5971 [02:12<47:17,  2.01it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  99%|█████████▉| 166/167 [00:07<00:00, 26.24it/s]
Epoch 3:   4%|▍         | 268/5971 [02:13<47:05,  2.02it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000175, train/loss_step=0.0497, global_step=1749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Epoch 3:   5%|▍         | 269/5971 [02:14<47:15,  2.01it/s, loss=0.166, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.00037, train/loss_step=0.113, global_step=1750.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:   5%|▍         | 270/5971 [02:15<47:22,  2.01it/s, loss=0.166, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.00037, train/loss_step=0.113, global_step=1750.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▍         | 270/5971 [02:15<47:22,  2.01it/s, loss=0.172, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.00039, train/loss_step=0.119, global_step=1750.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▍         | 271/5971 [02:16<47:30,  2.00it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00656, train/loss_vlb_step=3.24e-5, train/loss_step=0.00656, global_step=1750.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▍         | 272/5971 [02:18<48:16,  1.97it/s, loss=0.161, v_num=0, train/loss_simple_step=0.226, train/loss_vlb_step=0.000815, train/loss_step=0.226, global_step=1750.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:   5%|▍         | 273/5971 [02:19<48:23,  1.96it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00819, train/loss_vlb_step=3.79e-5, train/loss_step=0.00819, global_step=1751.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▍         | 274/5971 [02:20<48:30,  1.96it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00819, train/loss_vlb_step=3.79e-5, train/loss_step=0.00819, global_step=1751.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▍         | 274/5971 [02:20<48:30,  1.96it/s, loss=0.167, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000415, train/loss_step=0.125, global_step=1751.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:   5%|▍         | 275/5971 [02:21<48:37,  1.95it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0941, train/loss_vlb_step=0.000314, train/loss_step=0.0941, global_step=1751.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▍         | 276/5971 [02:24<49:23,  1.92it/s, loss=0.15, v_num=0, train/loss_simple_step=0.394, train/loss_vlb_step=0.00216, train/loss_step=0.394, global_step=1751.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:   5%|▍         | 277/5971 [02:25<49:34,  1.91it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00377, train/loss_vlb_step=2.11e-5, train/loss_step=0.00377, global_step=1752.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▍         | 278/5971 [02:26<49:41,  1.91it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00377, train/loss_vlb_step=2.11e-5, train/loss_step=0.00377, global_step=1752.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▍         | 278/5971 [02:26<49:41,  1.91it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00688, train/loss_vlb_step=3.36e-5, train/loss_step=0.00688, global_step=1752.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▍         | 279/5971 [02:26<49:47,  1.91it/s, loss=0.149, v_num=0, train/loss_simple_step=0.464, train/loss_vlb_step=0.00285, train/loss_step=0.464, global_step=1752.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:   5%|▍         | 280/5971 [02:29<50:23,  1.88it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0043, train/loss_vlb_step=2.35e-5, train/loss_step=0.0043, global_step=1752.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▍         | 281/5971 [02:30<50:30,  1.88it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00822, train/loss_vlb_step=3.77e-5, train/loss_step=0.00822, global_step=1753.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▍         | 282/5971 [02:31<50:37,  1.87it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00822, train/loss_vlb_step=3.77e-5, train/loss_step=0.00822, global_step=1753.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▍         | 282/5971 [02:31<50:37,  1.87it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0568, train/loss_vlb_step=0.000192, train/loss_step=0.0568, global_step=1753.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:   5%|▍         | 283/5971 [02:31<50:43,  1.87it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00268, train/loss_vlb_step=1.48e-5, train/loss_step=0.00268, global_step=1753.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▍         | 284/5971 [02:35<51:35,  1.84it/s, loss=0.143, v_num=0, train/loss_simple_step=0.374, train/loss_vlb_step=0.00246, train/loss_step=0.374, global_step=1753.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:   5%|▍         | 285/5971 [02:36<51:42,  1.83it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.07e-5, train/loss_step=0.0121, global_step=1754.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▍         | 286/5971 [02:36<51:48,  1.83it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.07e-5, train/loss_step=0.0121, global_step=1754.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▍         | 286/5971 [02:36<51:48,  1.83it/s, loss=0.11, v_num=0, train/loss_simple_step=0.092, train/loss_vlb_step=0.000303, train/loss_step=0.092, global_step=1754.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:   5%|▍         | 287/5971 [02:37<51:54,  1.82it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0246, train/loss_vlb_step=0.000102, train/loss_step=0.0246, global_step=1754.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▍         | 288/5971 [02:39<52:24,  1.81it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0399, train/loss_vlb_step=0.000146, train/loss_step=0.0399, global_step=1754.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▍         | 289/5971 [02:40<52:31,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.722, train/loss_vlb_step=0.029, train/loss_step=0.722, global_step=1755.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:   5%|▍         | 290/5971 [02:41<52:37,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.722, train/loss_vlb_step=0.029, train/loss_step=0.722, global_step=1755.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▍         | 290/5971 [02:41<52:37,  1.80it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0186, train/loss_vlb_step=7.8e-5, train/loss_step=0.0186, global_step=1755.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▍         | 291/5971 [02:42<52:43,  1.80it/s, loss=0.142, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.00058, train/loss_step=0.168, global_step=1755.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:   5%|▍         | 292/5971 [02:44<53:18,  1.78it/s, loss=0.164, v_num=0, train/loss_simple_step=0.670, train/loss_vlb_step=0.0102, train/loss_step=0.670, global_step=1755.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:   5%|▍         | 293/5971 [02:45<53:23,  1.77it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00597, train/loss_vlb_step=2.92e-5, train/loss_step=0.00597, global_step=1756.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▍         | 294/5971 [02:46<53:29,  1.77it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00597, train/loss_vlb_step=2.92e-5, train/loss_step=0.00597, global_step=1756.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▍         | 294/5971 [02:46<53:29,  1.77it/s, loss=0.189, v_num=0, train/loss_simple_step=0.628, train/loss_vlb_step=0.00929, train/loss_step=0.628, global_step=1756.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:   5%|▍         | 295/5971 [02:47<53:35,  1.77it/s, loss=0.195, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.00077, train/loss_step=0.213, global_step=1756.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▍         | 296/5971 [02:49<54:04,  1.75it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0106, train/loss_vlb_step=4.98e-5, train/loss_step=0.0106, global_step=1756.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▍         | 297/5971 [02:50<54:09,  1.75it/s, loss=0.176, v_num=0, train/loss_simple_step=0.005, train/loss_vlb_step=2.58e-5, train/loss_step=0.005, global_step=1757.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:   5%|▍         | 298/5971 [02:51<54:15,  1.74it/s, loss=0.176, v_num=0, train/loss_simple_step=0.005, train/loss_vlb_step=2.58e-5, train/loss_step=0.005, global_step=1757.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▍         | 298/5971 [02:51<54:15,  1.74it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0162, train/loss_vlb_step=6.9e-5, train/loss_step=0.0162, global_step=1757.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▌         | 299/5971 [02:52<54:20,  1.74it/s, loss=0.171, v_num=0, train/loss_simple_step=0.358, train/loss_vlb_step=0.0019, train/loss_step=0.358, global_step=1757.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:   5%|▌         | 300/5971 [02:54<54:48,  1.72it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0245, train/loss_vlb_step=9.87e-5, train/loss_step=0.0245, global_step=1757.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▌         | 301/5971 [02:55<54:54,  1.72it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00454, train/loss_vlb_step=2.36e-5, train/loss_step=0.00454, global_step=1758.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▌         | 302/5971 [02:56<54:59,  1.72it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00454, train/loss_vlb_step=2.36e-5, train/loss_step=0.00454, global_step=1758.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▌         | 302/5971 [02:56<54:59,  1.72it/s, loss=0.186, v_num=0, train/loss_simple_step=0.323, train/loss_vlb_step=0.00171, train/loss_step=0.323, global_step=1758.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:   5%|▌         | 303/5971 [02:57<55:04,  1.72it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00818, train/loss_vlb_step=3.78e-5, train/loss_step=0.00818, global_step=1758.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▌         | 304/5971 [02:59<55:36,  1.70it/s, loss=0.186, v_num=0, train/loss_simple_step=0.367, train/loss_vlb_step=0.00227, train/loss_step=0.367, global_step=1758.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:   5%|▌         | 305/5971 [03:00<55:43,  1.69it/s, loss=0.198, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00101, train/loss_step=0.269, global_step=1759.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▌         | 306/5971 [03:01<55:48,  1.69it/s, loss=0.198, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00101, train/loss_step=0.269, global_step=1759.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▌         | 306/5971 [03:01<55:48,  1.69it/s, loss=0.209, v_num=0, train/loss_simple_step=0.310, train/loss_vlb_step=0.00137, train/loss_step=0.310, global_step=1759.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▌         | 307/5971 [03:02<55:53,  1.69it/s, loss=0.227, v_num=0, train/loss_simple_step=0.372, train/loss_vlb_step=0.00194, train/loss_step=0.372, global_step=1759.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▌         | 308/5971 [03:04<56:24,  1.67it/s, loss=0.242, v_num=0, train/loss_simple_step=0.350, train/loss_vlb_step=0.00195, train/loss_step=0.350, global_step=1759.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▌         | 309/5971 [03:05<56:29,  1.67it/s, loss=0.212, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000368, train/loss_step=0.110, global_step=1760.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▌         | 310/5971 [03:06<56:34,  1.67it/s, loss=0.212, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000368, train/loss_step=0.110, global_step=1760.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▌         | 310/5971 [03:06<56:34,  1.67it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0436, train/loss_vlb_step=0.000158, train/loss_step=0.0436, global_step=1760.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▌         | 311/5971 [03:07<56:38,  1.67it/s, loss=0.207, v_num=0, train/loss_simple_step=0.044, train/loss_vlb_step=0.000151, train/loss_step=0.044, global_step=1760.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:   5%|▌         | 312/5971 [03:09<57:08,  1.65it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0403, train/loss_vlb_step=0.000148, train/loss_step=0.0403, global_step=1760.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▌         | 313/5971 [03:10<57:13,  1.65it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0738, train/loss_vlb_step=0.000249, train/loss_step=0.0738, global_step=1761.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▌         | 314/5971 [03:11<57:17,  1.65it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0738, train/loss_vlb_step=0.000249, train/loss_step=0.0738, global_step=1761.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▌         | 314/5971 [03:11<57:17,  1.65it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00195, train/loss_vlb_step=1.18e-5, train/loss_step=0.00195, global_step=1761.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▌         | 315/5971 [03:12<57:22,  1.64it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00406, train/loss_vlb_step=2.15e-5, train/loss_step=0.00406, global_step=1761.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▌         | 316/5971 [03:14<57:48,  1.63it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0058, train/loss_vlb_step=2.88e-5, train/loss_step=0.0058, global_step=1761.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:   5%|▌         | 317/5971 [03:15<57:53,  1.63it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.03e-5, train/loss_step=0.0139, global_step=1762.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▌         | 318/5971 [03:16<57:57,  1.63it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.03e-5, train/loss_step=0.0139, global_step=1762.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▌         | 318/5971 [03:16<57:57,  1.63it/s, loss=0.16, v_num=0, train/loss_simple_step=0.478, train/loss_vlb_step=0.00476, train/loss_step=0.478, global_step=1762.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:   5%|▌         | 319/5971 [03:17<58:02,  1.62it/s, loss=0.147, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000353, train/loss_step=0.101, global_step=1762.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▌         | 320/5971 [03:19<58:30,  1.61it/s, loss=0.157, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.000798, train/loss_step=0.220, global_step=1762.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▌         | 321/5971 [03:20<58:35,  1.61it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0525, train/loss_vlb_step=0.000184, train/loss_step=0.0525, global_step=1763.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▌         | 322/5971 [03:21<58:39,  1.61it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0525, train/loss_vlb_step=0.000184, train/loss_step=0.0525, global_step=1763.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▌         | 322/5971 [03:21<58:39,  1.61it/s, loss=0.157, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.00103, train/loss_step=0.274, global_step=1763.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:   5%|▌         | 323/5971 [03:22<58:43,  1.60it/s, loss=0.162, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000348, train/loss_step=0.105, global_step=1763.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▌         | 324/5971 [03:24<59:21,  1.59it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0449, train/loss_vlb_step=0.000166, train/loss_step=0.0449, global_step=1763.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▌         | 325/5971 [03:26<59:30,  1.58it/s, loss=0.152, v_num=0, train/loss_simple_step=0.384, train/loss_vlb_step=0.00265, train/loss_step=0.384, global_step=1764.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:   5%|▌         | 326/5971 [03:27<59:33,  1.58it/s, loss=0.152, v_num=0, train/loss_simple_step=0.384, train/loss_vlb_step=0.00265, train/loss_step=0.384, global_step=1764.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▌         | 326/5971 [03:27<59:33,  1.58it/s, loss=0.143, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000489, train/loss_step=0.148, global_step=1764.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   5%|▌         | 327/5971 [03:27<59:37,  1.58it/s, loss=0.151, v_num=0, train/loss_simple_step=0.530, train/loss_vlb_step=0.00454, train/loss_step=0.530, global_step=1764.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:   5%|▌         | 328/5971 [03:30<1:00:07,  1.56it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0019, train/loss_vlb_step=1.11e-5, train/loss_step=0.0019, global_step=1764.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 329/5971 [03:31<1:00:14,  1.56it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0394, train/loss_vlb_step=0.000142, train/loss_step=0.0394, global_step=1765.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 330/5971 [03:32<1:00:18,  1.56it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0394, train/loss_vlb_step=0.000142, train/loss_step=0.0394, global_step=1765.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 330/5971 [03:32<1:00:18,  1.56it/s, loss=0.148, v_num=0, train/loss_simple_step=0.392, train/loss_vlb_step=0.00286, train/loss_step=0.392, global_step=1765.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:   6%|▌         | 331/5971 [03:33<1:00:22,  1.56it/s, loss=0.153, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000487, train/loss_step=0.147, global_step=1765.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 332/5971 [03:36<1:00:58,  1.54it/s, loss=0.161, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000711, train/loss_step=0.207, global_step=1765.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 333/5971 [03:37<1:01:06,  1.54it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00435, train/loss_vlb_step=2.34e-5, train/loss_step=0.00435, global_step=1766.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 334/5971 [03:38<1:01:09,  1.54it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00435, train/loss_vlb_step=2.34e-5, train/loss_step=0.00435, global_step=1766.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 334/5971 [03:38<1:01:09,  1.54it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00262, train/loss_vlb_step=1.52e-5, train/loss_step=0.00262, global_step=1766.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 335/5971 [03:38<1:01:12,  1.53it/s, loss=0.179, v_num=0, train/loss_simple_step=0.423, train/loss_vlb_step=0.00267, train/loss_step=0.423, global_step=1766.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:   6%|▌         | 336/5971 [03:42<1:01:53,  1.52it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0644, train/loss_vlb_step=0.000222, train/loss_step=0.0644, global_step=1766.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 337/5971 [03:43<1:02:03,  1.51it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0675, train/loss_vlb_step=0.000222, train/loss_step=0.0675, global_step=1767.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 338/5971 [03:44<1:02:06,  1.51it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0675, train/loss_vlb_step=0.000222, train/loss_step=0.0675, global_step=1767.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 338/5971 [03:44<1:02:06,  1.51it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0847, train/loss_vlb_step=0.000283, train/loss_step=0.0847, global_step=1767.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 339/5971 [03:45<1:02:09,  1.51it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0853, train/loss_vlb_step=0.000289, train/loss_step=0.0853, global_step=1767.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 340/5971 [03:47<1:02:34,  1.50it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0254, train/loss_vlb_step=0.000105, train/loss_step=0.0254, global_step=1767.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 341/5971 [03:48<1:02:38,  1.50it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0728, train/loss_vlb_step=0.000244, train/loss_step=0.0728, global_step=1768.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 342/5971 [03:49<1:02:40,  1.50it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0728, train/loss_vlb_step=0.000244, train/loss_step=0.0728, global_step=1768.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 342/5971 [03:49<1:02:41,  1.50it/s, loss=0.18, v_num=0, train/loss_simple_step=0.767, train/loss_vlb_step=0.0333, train/loss_step=0.767, global_step=1768.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:   6%|▌         | 343/5971 [03:50<1:02:44,  1.50it/s, loss=0.191, v_num=0, train/loss_simple_step=0.326, train/loss_vlb_step=0.00177, train/loss_step=0.326, global_step=1768.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 344/5971 [03:52<1:03:07,  1.49it/s, loss=0.202, v_num=0, train/loss_simple_step=0.271, train/loss_vlb_step=0.00128, train/loss_step=0.271, global_step=1768.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 345/5971 [03:53<1:03:10,  1.48it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00323, train/loss_vlb_step=1.73e-5, train/loss_step=0.00323, global_step=1769.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 346/5971 [03:53<1:03:13,  1.48it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00323, train/loss_vlb_step=1.73e-5, train/loss_step=0.00323, global_step=1769.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 346/5971 [03:53<1:03:13,  1.48it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00972, train/loss_vlb_step=4.6e-5, train/loss_step=0.00972, global_step=1769.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:   6%|▌         | 347/5971 [03:54<1:03:15,  1.48it/s, loss=0.151, v_num=0, train/loss_simple_step=0.019, train/loss_vlb_step=7.84e-5, train/loss_step=0.019, global_step=1769.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:   6%|▌         | 348/5971 [03:57<1:03:43,  1.47it/s, loss=0.157, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000393, train/loss_step=0.119, global_step=1769.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 349/5971 [03:58<1:03:46,  1.47it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00278, train/loss_vlb_step=1.57e-5, train/loss_step=0.00278, global_step=1770.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 350/5971 [03:59<1:03:49,  1.47it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00278, train/loss_vlb_step=1.57e-5, train/loss_step=0.00278, global_step=1770.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 350/5971 [03:59<1:03:49,  1.47it/s, loss=0.141, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000361, train/loss_step=0.110, global_step=1770.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:   6%|▌         | 351/5971 [04:00<1:03:51,  1.47it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0093, train/loss_vlb_step=4.32e-5, train/loss_step=0.0093, global_step=1770.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 352/5971 [04:02<1:04:15,  1.46it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00626, train/loss_vlb_step=3.05e-5, train/loss_step=0.00626, global_step=1770.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 353/5971 [04:03<1:04:18,  1.46it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0048, train/loss_vlb_step=2.55e-5, train/loss_step=0.0048, global_step=1771.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:   6%|▌         | 354/5971 [04:04<1:04:21,  1.45it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0048, train/loss_vlb_step=2.55e-5, train/loss_step=0.0048, global_step=1771.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 354/5971 [04:04<1:04:21,  1.45it/s, loss=0.131, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000507, train/loss_step=0.150, global_step=1771.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:   6%|▌         | 355/5971 [04:04<1:04:23,  1.45it/s, loss=0.119, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000724, train/loss_step=0.189, global_step=1771.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 356/5971 [04:07<1:04:50,  1.44it/s, loss=0.121, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1771.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 357/5971 [04:08<1:04:52,  1.44it/s, loss=0.126, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000713, train/loss_step=0.161, global_step=1772.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 358/5971 [04:09<1:04:54,  1.44it/s, loss=0.126, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000713, train/loss_step=0.161, global_step=1772.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 358/5971 [04:09<1:04:54,  1.44it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0607, train/loss_vlb_step=0.000202, train/loss_step=0.0607, global_step=1772.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 359/5971 [04:09<1:04:56,  1.44it/s, loss=0.141, v_num=0, train/loss_simple_step=0.399, train/loss_vlb_step=0.00239, train/loss_step=0.399, global_step=1772.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:   6%|▌         | 360/5971 [04:12<1:05:24,  1.43it/s, loss=0.171, v_num=0, train/loss_simple_step=0.626, train/loss_vlb_step=0.0102, train/loss_step=0.626, global_step=1772.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:   6%|▌         | 361/5971 [04:13<1:05:26,  1.43it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0348, train/loss_vlb_step=0.000131, train/loss_step=0.0348, global_step=1773.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 362/5971 [04:14<1:05:29,  1.43it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0348, train/loss_vlb_step=0.000131, train/loss_step=0.0348, global_step=1773.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 362/5971 [04:14<1:05:29,  1.43it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0755, train/loss_vlb_step=0.000254, train/loss_step=0.0755, global_step=1773.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 363/5971 [04:15<1:05:31,  1.43it/s, loss=0.135, v_num=0, train/loss_simple_step=0.338, train/loss_vlb_step=0.0015, train/loss_step=0.338, global_step=1773.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:   6%|▌         | 364/5971 [04:17<1:05:52,  1.42it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00269, train/loss_vlb_step=1.44e-5, train/loss_step=0.00269, global_step=1773.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 365/5971 [04:18<1:05:54,  1.42it/s, loss=0.133, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.00087, train/loss_step=0.235, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:   6%|▌         | 366/5971 [04:19<1:05:56,  1.42it/s, loss=0.133, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.00087, train/loss_step=0.235, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 366/5971 [04:19<1:05:56,  1.42it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0877, train/loss_vlb_step=0.000292, train/loss_step=0.0877, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   6%|▌         | 367/5971 [04:19<1:05:58,  1.42it/s, loss=0.142, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000395, train/loss_step=0.119, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:   6%|▌         | 368/5971 [04:22<1:06:26,  1.41it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:11,  2.31it/s][A
Epoch 3:   6%|▌         | 370/5971 [04:22<1:06:10,  1.41it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   1%|          | 2/167 [00:00<00:46,  3.59it/s][A

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.16it/s][A
Epoch 3:   6%|▋         | 374/5971 [04:23<1:05:29,  1.42it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.68it/s][A
Epoch 3:   6%|▋         | 378/5971 [04:23<1:04:47,  1.44it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.87it/s][A
Epoch 3:   6%|▋         | 382/5971 [04:23<1:04:06,  1.45it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.38it/s][A

Validating:  10%|█         | 17/167 [00:01<00:07, 20.55it/s][A
Epoch 3:   6%|▋         | 386/5971 [04:23<1:03:26,  1.47it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 21.70it/s][A
Epoch 3:   7%|▋         | 390/5971 [04:23<1:02:47,  1.48it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 22.79it/s][A
Epoch 3:   7%|▋         | 394/5971 [04:24<1:02:08,  1.50it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 23.61it/s][A

Validating:  17%|█▋        | 29/167 [00:01<00:05, 24.45it/s][A
Epoch 3:   7%|▋         | 398/5971 [04:24<1:01:30,  1.51it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 25.48it/s][A
Epoch 3:   7%|▋         | 402/5971 [04:24<1:00:53,  1.52it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  21%|██        | 35/167 [00:01<00:05, 25.58it/s][A
Epoch 3:   7%|▋         | 406/5971 [04:24<1:00:17,  1.54it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  23%|██▎       | 38/167 [00:02<00:04, 26.34it/s][A

Validating:  25%|██▍       | 41/167 [00:02<00:04, 26.43it/s][A
Epoch 3:   7%|▋         | 410/5971 [04:24<59:41,  1.55it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  

Validating:  26%|██▋       | 44/167 [00:02<00:04, 26.76it/s][A
Epoch 3:   7%|▋         | 414/5971 [04:24<59:06,  1.57it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 26.68it/s][A
Epoch 3:   7%|▋         | 418/5971 [04:24<58:31,  1.58it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 26.72it/s][A

Validating:  32%|███▏      | 53/167 [00:02<00:04, 25.93it/s][A
Epoch 3:   7%|▋         | 422/5971 [04:25<57:58,  1.60it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 25.69it/s][A
Epoch 3:   7%|▋         | 426/5971 [04:25<57:25,  1.61it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  35%|███▌      | 59/167 [00:02<00:04, 25.26it/s][A
Epoch 3:   7%|▋         | 430/5971 [04:25<56:52,  1.62it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  37%|███▋      | 62/167 [00:02<00:04, 26.04it/s][A

Validating:  39%|███▉      | 65/167 [00:03<00:04, 24.99it/s][A
Epoch 3:   7%|▋         | 434/5971 [04:25<56:20,  1.64it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  41%|████      | 68/167 [00:03<00:03, 25.32it/s][A
Epoch 3:   7%|▋         | 438/5971 [04:25<55:49,  1.65it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 24.40it/s][A
Epoch 3:   7%|▋         | 442/5971 [04:25<55:19,  1.67it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 25.66it/s][A

Validating:  46%|████▌     | 77/167 [00:03<00:03, 26.09it/s][A
Epoch 3:   7%|▋         | 446/5971 [04:26<54:48,  1.68it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 26.54it/s][A
Epoch 3:   8%|▊         | 450/5971 [04:26<54:19,  1.69it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 25.04it/s][A
Epoch 3:   8%|▊         | 454/5971 [04:26<53:50,  1.71it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  51%|█████▏    | 86/167 [00:03<00:03, 25.15it/s][A

Validating:  53%|█████▎    | 89/167 [00:03<00:02, 26.37it/s][A
Epoch 3:   8%|▊         | 458/5971 [04:26<53:21,  1.72it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 25.88it/s][A
Epoch 3:   8%|▊         | 462/5971 [04:26<52:53,  1.74it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 26.66it/s][A
Epoch 3:   8%|▊         | 466/5971 [04:26<52:25,  1.75it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 25.10it/s][A

Validating:  60%|██████    | 101/167 [00:04<00:02, 24.79it/s][A
Epoch 3:   8%|▊         | 470/5971 [04:27<51:58,  1.76it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 24.65it/s][A
Epoch 3:   8%|▊         | 474/5971 [04:27<51:32,  1.78it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 25.57it/s][A
Epoch 3:   8%|▊         | 478/5971 [04:27<51:05,  1.79it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 25.91it/s][A
Epoch 3:   8%|▊         | 482/5971 [04:27<50:39,  1.81it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  68%|██████▊   | 114/167 [00:04<00:01, 27.82it/s][A

Validating:  70%|███████   | 117/167 [00:05<00:01, 27.99it/s][A
Epoch 3:   8%|▊         | 486/5971 [04:27<50:13,  1.82it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 27.79it/s][A
Epoch 3:   8%|▊         | 490/5971 [04:27<49:48,  1.83it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 27.61it/s][A
Epoch 3:   8%|▊         | 494/5971 [04:27<49:24,  1.85it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 25.09it/s][A
Epoch 3:   8%|▊         | 498/5971 [04:28<49:00,  1.86it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 24.47it/s][A

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 25.75it/s][A
Epoch 3:   8%|▊         | 502/5971 [04:28<48:36,  1.88it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 25.47it/s][A
Epoch 3:   8%|▊         | 506/5971 [04:28<48:13,  1.89it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 26.50it/s][A
Epoch 3:   9%|▊         | 510/5971 [04:28<47:49,  1.90it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  85%|████████▌ | 142/167 [00:06<00:00, 25.39it/s][A

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 26.30it/s][A
Epoch 3:   9%|▊         | 514/5971 [04:28<47:27,  1.92it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 24.88it/s][A
Epoch 3:   9%|▊         | 518/5971 [04:28<47:04,  1.93it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 24.83it/s][A
Epoch 3:   9%|▊         | 522/5971 [04:29<46:42,  1.94it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 26.14it/s][A

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 26.88it/s][A
Epoch 3:   9%|▉         | 526/5971 [04:29<46:20,  1.96it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 25.14it/s][A
Epoch 3:   9%|▉         | 530/5971 [04:29<45:59,  1.97it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 25.15it/s][A
Epoch 3:   9%|▉         | 534/5971 [04:29<45:38,  1.99it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 23.80it/s][A
Epoch 3:   9%|▉         | 536/5971 [04:29<45:32,  1.99it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000328, train/loss_step=0.0991, global_step=1774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Epoch 3:   9%|▉         | 537/5971 [04:30<45:36,  1.99it/s, loss=0.141, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.75e-5, train/loss_step=0.016, global_step=1775.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:   9%|▉         | 538/5971 [04:31<45:40,  1.98it/s, loss=0.141, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.75e-5, train/loss_step=0.016, global_step=1775.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   9%|▉         | 538/5971 [04:31<45:40,  1.98it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0551, train/loss_vlb_step=0.000191, train/loss_step=0.0551, global_step=1775.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   9%|▉         | 539/5971 [04:32<45:43,  1.98it/s, loss=0.158, v_num=0, train/loss_simple_step=0.387, train/loss_vlb_step=0.00222, train/loss_step=0.387, global_step=1775.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:   9%|▉         | 540/5971 [04:34<46:00,  1.97it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0103, train/loss_vlb_step=4.77e-5, train/loss_step=0.0103, global_step=1775.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   9%|▉         | 541/5971 [04:35<46:03,  1.96it/s, loss=0.174, v_num=0, train/loss_simple_step=0.340, train/loss_vlb_step=0.00143, train/loss_step=0.340, global_step=1776.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:   9%|▉         | 542/5971 [04:36<46:06,  1.96it/s, loss=0.174, v_num=0, train/loss_simple_step=0.340, train/loss_vlb_step=0.00143, train/loss_step=0.340, global_step=1776.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   9%|▉         | 542/5971 [04:36<46:06,  1.96it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00956, train/loss_vlb_step=4.35e-5, train/loss_step=0.00956, global_step=1776.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   9%|▉         | 543/5971 [04:37<46:10,  1.96it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.63e-5, train/loss_step=0.0102, global_step=1776.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:   9%|▉         | 544/5971 [04:40<46:35,  1.94it/s, loss=0.174, v_num=0, train/loss_simple_step=0.409, train/loss_vlb_step=0.0026, train/loss_step=0.409, global_step=1776.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:   9%|▉         | 545/5971 [04:41<46:38,  1.94it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0878, train/loss_vlb_step=0.00029, train/loss_step=0.0878, global_step=1777.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   9%|▉         | 546/5971 [04:42<46:41,  1.94it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0878, train/loss_vlb_step=0.00029, train/loss_step=0.0878, global_step=1777.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   9%|▉         | 546/5971 [04:42<46:41,  1.94it/s, loss=0.174, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000439, train/loss_step=0.132, global_step=1777.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   9%|▉         | 547/5971 [04:43<46:44,  1.93it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0123, train/loss_vlb_step=5.57e-5, train/loss_step=0.0123, global_step=1777.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   9%|▉         | 548/5971 [04:45<47:02,  1.92it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0403, train/loss_vlb_step=0.000142, train/loss_step=0.0403, global_step=1777.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   9%|▉         | 549/5971 [04:46<47:06,  1.92it/s, loss=0.142, v_num=0, train/loss_simple_step=0.379, train/loss_vlb_step=0.0027, train/loss_step=0.379, global_step=1778.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:   9%|▉         | 550/5971 [04:47<47:09,  1.92it/s, loss=0.142, v_num=0, train/loss_simple_step=0.379, train/loss_vlb_step=0.0027, train/loss_step=0.379, global_step=1778.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   9%|▉         | 550/5971 [04:47<47:09,  1.92it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0244, train/loss_vlb_step=9.72e-5, train/loss_step=0.0244, global_step=1778.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   9%|▉         | 551/5971 [04:48<47:12,  1.91it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0772, train/loss_vlb_step=0.000257, train/loss_step=0.0772, global_step=1778.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   9%|▉         | 552/5971 [04:50<47:29,  1.90it/s, loss=0.163, v_num=0, train/loss_simple_step=0.734, train/loss_vlb_step=0.0196, train/loss_step=0.734, global_step=1778.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:   9%|▉         | 553/5971 [04:51<47:32,  1.90it/s, loss=0.163, v_num=0, train/loss_simple_step=0.239, train/loss_vlb_step=0.000833, train/loss_step=0.239, global_step=1779.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   9%|▉         | 554/5971 [04:52<47:35,  1.90it/s, loss=0.163, v_num=0, train/loss_simple_step=0.239, train/loss_vlb_step=0.000833, train/loss_step=0.239, global_step=1779.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   9%|▉         | 554/5971 [04:52<47:35,  1.90it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0211, train/loss_vlb_step=8.69e-5, train/loss_step=0.0211, global_step=1779.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   9%|▉         | 555/5971 [04:53<47:38,  1.89it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0613, train/loss_vlb_step=0.000203, train/loss_step=0.0613, global_step=1779.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   9%|▉         | 556/5971 [04:56<47:58,  1.88it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0133, train/loss_vlb_step=5.71e-5, train/loss_step=0.0133, global_step=1779.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:   9%|▉         | 557/5971 [04:56<48:01,  1.88it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00201, train/loss_vlb_step=1.16e-5, train/loss_step=0.00201, global_step=1780.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   9%|▉         | 558/5971 [04:57<48:03,  1.88it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00201, train/loss_vlb_step=1.16e-5, train/loss_step=0.00201, global_step=1780.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   9%|▉         | 558/5971 [04:57<48:03,  1.88it/s, loss=0.156, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000411, train/loss_step=0.125, global_step=1780.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:   9%|▉         | 559/5971 [04:58<48:06,  1.87it/s, loss=0.154, v_num=0, train/loss_simple_step=0.360, train/loss_vlb_step=0.00205, train/loss_step=0.360, global_step=1780.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:   9%|▉         | 560/5971 [05:01<48:24,  1.86it/s, loss=0.16, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000395, train/loss_step=0.120, global_step=1780.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   9%|▉         | 561/5971 [05:02<48:27,  1.86it/s, loss=0.152, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000681, train/loss_step=0.179, global_step=1781.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   9%|▉         | 562/5971 [05:02<48:30,  1.86it/s, loss=0.152, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000681, train/loss_step=0.179, global_step=1781.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   9%|▉         | 562/5971 [05:02<48:30,  1.86it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0681, train/loss_vlb_step=0.00023, train/loss_step=0.0681, global_step=1781.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   9%|▉         | 563/5971 [05:03<48:33,  1.86it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00202, train/loss_vlb_step=1.21e-5, train/loss_step=0.00202, global_step=1781.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   9%|▉         | 564/5971 [05:06<48:54,  1.84it/s, loss=0.14, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000429, train/loss_step=0.130, global_step=1781.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:   9%|▉         | 565/5971 [05:07<48:56,  1.84it/s, loss=0.165, v_num=0, train/loss_simple_step=0.573, train/loss_vlb_step=0.00634, train/loss_step=0.573, global_step=1782.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   9%|▉         | 566/5971 [05:08<48:59,  1.84it/s, loss=0.165, v_num=0, train/loss_simple_step=0.573, train/loss_vlb_step=0.00634, train/loss_step=0.573, global_step=1782.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   9%|▉         | 566/5971 [05:08<48:59,  1.84it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00316, train/loss_vlb_step=1.79e-5, train/loss_step=0.00316, global_step=1782.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:   9%|▉         | 567/5971 [05:09<49:02,  1.84it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0383, train/loss_vlb_step=0.00014, train/loss_step=0.0383, global_step=1782.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  10%|▉         | 568/5971 [05:11<49:19,  1.83it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0465, train/loss_vlb_step=0.000168, train/loss_step=0.0465, global_step=1782.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|▉         | 569/5971 [05:12<49:22,  1.82it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0204, train/loss_vlb_step=8.49e-5, train/loss_step=0.0204, global_step=1783.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|▉         | 570/5971 [05:13<49:25,  1.82it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0204, train/loss_vlb_step=8.49e-5, train/loss_step=0.0204, global_step=1783.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|▉         | 570/5971 [05:13<49:25,  1.82it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0233, train/loss_vlb_step=8.66e-5, train/loss_step=0.0233, global_step=1783.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|▉         | 571/5971 [05:14<49:27,  1.82it/s, loss=0.147, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000604, train/loss_step=0.172, global_step=1783.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  10%|▉         | 572/5971 [05:16<49:42,  1.81it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0026, train/loss_vlb_step=1.54e-5, train/loss_step=0.0026, global_step=1783.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|▉         | 573/5971 [05:17<49:45,  1.81it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0924, train/loss_vlb_step=0.000306, train/loss_step=0.0924, global_step=1784.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|▉         | 574/5971 [05:18<49:48,  1.81it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0924, train/loss_vlb_step=0.000306, train/loss_step=0.0924, global_step=1784.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|▉         | 574/5971 [05:18<49:48,  1.81it/s, loss=0.118, v_num=0, train/loss_simple_step=0.331, train/loss_vlb_step=0.00147, train/loss_step=0.331, global_step=1784.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  10%|▉         | 575/5971 [05:19<49:50,  1.80it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0521, train/loss_vlb_step=0.000181, train/loss_step=0.0521, global_step=1784.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|▉         | 576/5971 [05:21<50:07,  1.79it/s, loss=0.133, v_num=0, train/loss_simple_step=0.323, train/loss_vlb_step=0.00148, train/loss_step=0.323, global_step=1784.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  10%|▉         | 577/5971 [05:22<50:09,  1.79it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0466, train/loss_vlb_step=0.000164, train/loss_step=0.0466, global_step=1785.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|▉         | 578/5971 [05:23<50:12,  1.79it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0466, train/loss_vlb_step=0.000164, train/loss_step=0.0466, global_step=1785.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|▉         | 578/5971 [05:23<50:12,  1.79it/s, loss=0.141, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.000805, train/loss_step=0.236, global_step=1785.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  10%|▉         | 579/5971 [05:24<50:14,  1.79it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0579, train/loss_vlb_step=0.000202, train/loss_step=0.0579, global_step=1785.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|▉         | 580/5971 [05:27<50:35,  1.78it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0387, train/loss_vlb_step=0.000145, train/loss_step=0.0387, global_step=1785.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|▉         | 581/5971 [05:28<50:38,  1.77it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0332, train/loss_vlb_step=0.00013, train/loss_step=0.0332, global_step=1786.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  10%|▉         | 582/5971 [05:28<50:40,  1.77it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0332, train/loss_vlb_step=0.00013, train/loss_step=0.0332, global_step=1786.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|▉         | 582/5971 [05:28<50:40,  1.77it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00582, train/loss_vlb_step=2.85e-5, train/loss_step=0.00582, global_step=1786.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|▉         | 583/5971 [05:29<50:43,  1.77it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00233, train/loss_vlb_step=1.34e-5, train/loss_step=0.00233, global_step=1786.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|▉         | 584/5971 [05:32<50:59,  1.76it/s, loss=0.121, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00248, train/loss_step=0.320, global_step=1786.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  10%|▉         | 585/5971 [05:33<51:01,  1.76it/s, loss=0.0972, v_num=0, train/loss_simple_step=0.0978, train/loss_vlb_step=0.000324, train/loss_step=0.0978, global_step=1787.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|▉         | 586/5971 [05:33<51:03,  1.76it/s, loss=0.0972, v_num=0, train/loss_simple_step=0.0978, train/loss_vlb_step=0.000324, train/loss_step=0.0978, global_step=1787.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|▉         | 586/5971 [05:33<51:03,  1.76it/s, loss=0.099, v_num=0, train/loss_simple_step=0.0401, train/loss_vlb_step=0.000147, train/loss_step=0.0401, global_step=1787.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  10%|▉         | 587/5971 [05:34<51:06,  1.76it/s, loss=0.0981, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=7.54e-5, train/loss_step=0.0198, global_step=1787.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|▉         | 588/5971 [05:36<51:19,  1.75it/s, loss=0.115, v_num=0, train/loss_simple_step=0.378, train/loss_vlb_step=0.00247, train/loss_step=0.378, global_step=1787.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  10%|▉         | 589/5971 [05:37<51:22,  1.75it/s, loss=0.126, v_num=0, train/loss_simple_step=0.254, train/loss_vlb_step=0.000979, train/loss_step=0.254, global_step=1788.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|▉         | 590/5971 [05:38<51:24,  1.74it/s, loss=0.126, v_num=0, train/loss_simple_step=0.254, train/loss_vlb_step=0.000979, train/loss_step=0.254, global_step=1788.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|▉         | 590/5971 [05:38<51:24,  1.74it/s, loss=0.141, v_num=0, train/loss_simple_step=0.311, train/loss_vlb_step=0.00121, train/loss_step=0.311, global_step=1788.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  10%|▉         | 591/5971 [05:39<51:26,  1.74it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0342, train/loss_vlb_step=0.00013, train/loss_step=0.0342, global_step=1788.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|▉         | 592/5971 [05:41<51:40,  1.73it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0375, train/loss_vlb_step=0.000134, train/loss_step=0.0375, global_step=1788.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|▉         | 593/5971 [05:42<51:42,  1.73it/s, loss=0.143, v_num=0, train/loss_simple_step=0.248, train/loss_vlb_step=0.00105, train/loss_step=0.248, global_step=1789.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  10%|▉         | 594/5971 [05:43<51:44,  1.73it/s, loss=0.143, v_num=0, train/loss_simple_step=0.248, train/loss_vlb_step=0.00105, train/loss_step=0.248, global_step=1789.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|▉         | 594/5971 [05:43<51:44,  1.73it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0425, train/loss_vlb_step=0.000152, train/loss_step=0.0425, global_step=1789.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|▉         | 595/5971 [05:44<51:47,  1.73it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=7.81e-5, train/loss_step=0.0198, global_step=1789.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  10%|▉         | 596/5971 [05:46<52:00,  1.72it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0427, train/loss_vlb_step=0.000154, train/loss_step=0.0427, global_step=1789.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|▉         | 597/5971 [05:47<52:03,  1.72it/s, loss=0.132, v_num=0, train/loss_simple_step=0.423, train/loss_vlb_step=0.00232, train/loss_step=0.423, global_step=1790.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  10%|█         | 598/5971 [05:48<52:05,  1.72it/s, loss=0.132, v_num=0, train/loss_simple_step=0.423, train/loss_vlb_step=0.00232, train/loss_step=0.423, global_step=1790.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|█         | 598/5971 [05:48<52:05,  1.72it/s, loss=0.125, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000339, train/loss_step=0.102, global_step=1790.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|█         | 599/5971 [05:49<52:07,  1.72it/s, loss=0.15, v_num=0, train/loss_simple_step=0.550, train/loss_vlb_step=0.00778, train/loss_step=0.550, global_step=1790.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  10%|█         | 600/5971 [05:51<52:20,  1.71it/s, loss=0.155, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000447, train/loss_step=0.135, global_step=1790.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|█         | 601/5971 [05:52<52:22,  1.71it/s, loss=0.162, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000638, train/loss_step=0.179, global_step=1791.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|█         | 602/5971 [05:53<52:24,  1.71it/s, loss=0.162, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000638, train/loss_step=0.179, global_step=1791.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|█         | 602/5971 [05:53<52:24,  1.71it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0225, train/loss_vlb_step=9.21e-5, train/loss_step=0.0225, global_step=1791.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|█         | 603/5971 [05:54<52:26,  1.71it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00441, train/loss_vlb_step=2.28e-5, train/loss_step=0.00441, global_step=1791.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|█         | 604/5971 [05:56<52:41,  1.70it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0196, train/loss_vlb_step=7.72e-5, train/loss_step=0.0196, global_step=1791.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  10%|█         | 605/5971 [05:57<52:43,  1.70it/s, loss=0.149, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000385, train/loss_step=0.117, global_step=1792.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  10%|█         | 606/5971 [05:58<52:45,  1.69it/s, loss=0.149, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000385, train/loss_step=0.117, global_step=1792.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|█         | 606/5971 [05:58<52:45,  1.69it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00262, train/loss_vlb_step=1.53e-5, train/loss_step=0.00262, global_step=1792.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|█         | 607/5971 [05:59<52:47,  1.69it/s, loss=0.172, v_num=0, train/loss_simple_step=0.524, train/loss_vlb_step=0.00632, train/loss_step=0.524, global_step=1792.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  10%|█         | 608/5971 [06:01<53:04,  1.68it/s, loss=0.166, v_num=0, train/loss_simple_step=0.245, train/loss_vlb_step=0.0009, train/loss_step=0.245, global_step=1792.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  10%|█         | 609/5971 [06:02<53:06,  1.68it/s, loss=0.18, v_num=0, train/loss_simple_step=0.540, train/loss_vlb_step=0.00629, train/loss_step=0.540, global_step=1793.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|█         | 610/5971 [06:03<53:08,  1.68it/s, loss=0.18, v_num=0, train/loss_simple_step=0.540, train/loss_vlb_step=0.00629, train/loss_step=0.540, global_step=1793.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|█         | 610/5971 [06:03<53:08,  1.68it/s, loss=0.173, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000597, train/loss_step=0.175, global_step=1793.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|█         | 611/5971 [06:04<53:10,  1.68it/s, loss=0.178, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000421, train/loss_step=0.125, global_step=1793.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|█         | 612/5971 [06:06<53:24,  1.67it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0341, train/loss_vlb_step=0.000129, train/loss_step=0.0341, global_step=1793.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|█         | 613/5971 [06:07<53:27,  1.67it/s, loss=0.178, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.000878, train/loss_step=0.250, global_step=1794.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  10%|█         | 614/5971 [06:08<53:29,  1.67it/s, loss=0.178, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.000878, train/loss_step=0.250, global_step=1794.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|█         | 614/5971 [06:08<53:29,  1.67it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0178, train/loss_vlb_step=7.81e-5, train/loss_step=0.0178, global_step=1794.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|█         | 615/5971 [06:09<53:31,  1.67it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0201, train/loss_vlb_step=8.24e-5, train/loss_step=0.0201, global_step=1794.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|█         | 616/5971 [06:11<53:46,  1.66it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0294, train/loss_vlb_step=0.000114, train/loss_step=0.0294, global_step=1794.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|█         | 617/5971 [06:12<53:48,  1.66it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=0.000107, train/loss_step=0.0272, global_step=1795.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|█         | 618/5971 [06:13<53:50,  1.66it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=0.000107, train/loss_step=0.0272, global_step=1795.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|█         | 618/5971 [06:13<53:50,  1.66it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00176, train/loss_vlb_step=1.03e-5, train/loss_step=0.00176, global_step=1795.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|█         | 619/5971 [06:14<53:52,  1.66it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.54e-5, train/loss_step=0.0122, global_step=1795.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  10%|█         | 620/5971 [06:16<54:08,  1.65it/s, loss=0.136, v_num=0, train/loss_simple_step=0.377, train/loss_vlb_step=0.00182, train/loss_step=0.377, global_step=1795.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  10%|█         | 621/5971 [06:17<54:09,  1.65it/s, loss=0.147, v_num=0, train/loss_simple_step=0.394, train/loss_vlb_step=0.0021, train/loss_step=0.394, global_step=1796.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  10%|█         | 622/5971 [06:18<54:11,  1.65it/s, loss=0.147, v_num=0, train/loss_simple_step=0.394, train/loss_vlb_step=0.0021, train/loss_step=0.394, global_step=1796.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|█         | 622/5971 [06:18<54:11,  1.65it/s, loss=0.162, v_num=0, train/loss_simple_step=0.330, train/loss_vlb_step=0.00167, train/loss_step=0.330, global_step=1796.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|█         | 623/5971 [06:19<54:13,  1.64it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=5.43e-5, train/loss_step=0.0118, global_step=1796.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|█         | 624/5971 [06:21<54:27,  1.64it/s, loss=0.171, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.000622, train/loss_step=0.187, global_step=1796.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  10%|█         | 625/5971 [06:22<54:29,  1.64it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0978, train/loss_vlb_step=0.000323, train/loss_step=0.0978, global_step=1797.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|█         | 626/5971 [06:23<54:31,  1.63it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0978, train/loss_vlb_step=0.000323, train/loss_step=0.0978, global_step=1797.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  10%|█         | 626/5971 [06:23<54:31,  1.63it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0053, train/loss_vlb_step=2.72e-5, train/loss_step=0.0053, global_step=1797.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  11%|█         | 627/5971 [06:24<54:32,  1.63it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00386, train/loss_vlb_step=1.98e-5, train/loss_step=0.00386, global_step=1797.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  11%|█         | 628/5971 [06:26<54:45,  1.63it/s, loss=0.157, v_num=0, train/loss_simple_step=0.494, train/loss_vlb_step=0.00449, train/loss_step=0.494, global_step=1797.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  11%|█         | 629/5971 [06:27<54:47,  1.63it/s, loss=0.149, v_num=0, train/loss_simple_step=0.387, train/loss_vlb_step=0.00255, train/loss_step=0.387, global_step=1798.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  11%|█         | 630/5971 [06:28<54:48,  1.62it/s, loss=0.149, v_num=0, train/loss_simple_step=0.387, train/loss_vlb_step=0.00255, train/loss_step=0.387, global_step=1798.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  11%|█         | 630/5971 [06:28<54:48,  1.62it/s, loss=0.164, v_num=0, train/loss_simple_step=0.483, train/loss_vlb_step=0.00396, train/loss_step=0.483, global_step=1798.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  11%|█         | 631/5971 [06:29<54:50,  1.62it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00255, train/loss_vlb_step=1.41e-5, train/loss_step=0.00255, global_step=1798.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  11%|█         | 632/5971 [06:31<55:04,  1.62it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0505, train/loss_vlb_step=0.000181, train/loss_step=0.0505, global_step=1798.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  11%|█         | 633/5971 [06:32<55:06,  1.61it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00739, train/loss_vlb_step=3.47e-5, train/loss_step=0.00739, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  11%|█         | 634/5971 [06:33<55:08,  1.61it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00739, train/loss_vlb_step=3.47e-5, train/loss_step=0.00739, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  11%|█         | 634/5971 [06:33<55:08,  1.61it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0828, train/loss_vlb_step=0.000275, train/loss_step=0.0828, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  11%|█         | 635/5971 [06:34<55:09,  1.61it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00351, train/loss_vlb_step=1.95e-5, train/loss_step=0.00351, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  11%|█         | 636/5971 [06:37<55:26,  1.60it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:10,  2.35it/s]
Epoch 3:  11%|█         | 638/5971 [06:37<55:18,  1.61it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   1%|          | 2/167 [00:00<00:39,  4.16it/s]

Validating:   3%|▎         | 5/167 [00:00<00:16,  9.60it/s]
Epoch 3:  11%|█         | 642/5971 [06:37<54:57,  1.62it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.44it/s]
Epoch 3:  11%|█         | 646/5971 [06:38<54:36,  1.63it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   6%|▌         | 10/167 [00:00<00:11, 13.21it/s]

Validating:   8%|▊         | 13/167 [00:01<00:09, 16.47it/s]
Epoch 3:  11%|█         | 650/5971 [06:38<54:15,  1.63it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  10%|▉         | 16/167 [00:01<00:07, 18.89it/s]
Epoch 3:  11%|█         | 654/5971 [06:38<53:54,  1.64it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  11%|█▏        | 19/167 [00:01<00:07, 20.56it/s]
Epoch 3:  11%|█         | 658/5971 [06:38<53:33,  1.65it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  13%|█▎        | 22/167 [00:01<00:06, 22.87it/s]

Validating:  15%|█▍        | 25/167 [00:01<00:05, 23.67it/s]
Epoch 3:  11%|█         | 662/5971 [06:38<53:12,  1.66it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 24.87it/s]
Epoch 3:  11%|█         | 666/5971 [06:38<52:52,  1.67it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 25.05it/s]
Epoch 3:  11%|█         | 670/5971 [06:39<52:32,  1.68it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  21%|██        | 35/167 [00:01<00:05, 25.62it/s]
Epoch 3:  11%|█▏        | 674/5971 [06:39<52:12,  1.69it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  23%|██▎       | 38/167 [00:02<00:05, 24.86it/s]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 25.24it/s]
Epoch 3:  11%|█▏        | 678/5971 [06:39<51:53,  1.70it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 26.13it/s]
Epoch 3:  11%|█▏        | 682/5971 [06:39<51:33,  1.71it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 26.66it/s]
Epoch 3:  11%|█▏        | 686/5971 [06:39<51:14,  1.72it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 25.75it/s]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 26.48it/s]
Epoch 3:  12%|█▏        | 690/5971 [06:39<50:55,  1.73it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 27.11it/s]
Epoch 3:  12%|█▏        | 694/5971 [06:39<50:36,  1.74it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  35%|███▌      | 59/167 [00:02<00:03, 27.24it/s]
Epoch 3:  12%|█▏        | 698/5971 [06:40<50:18,  1.75it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  37%|███▋      | 62/167 [00:02<00:03, 26.59it/s]

Validating:  39%|███▉      | 65/167 [00:03<00:03, 26.09it/s]
Epoch 3:  12%|█▏        | 702/5971 [06:40<49:59,  1.76it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  41%|████      | 68/167 [00:03<00:03, 27.04it/s]
Epoch 3:  12%|█▏        | 706/5971 [06:40<49:41,  1.77it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 26.88it/s]
Epoch 3:  12%|█▏        | 710/5971 [06:40<49:23,  1.78it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 26.39it/s]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 26.81it/s]
Epoch 3:  12%|█▏        | 714/5971 [06:40<49:06,  1.78it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 25.89it/s]
Epoch 3:  12%|█▏        | 718/5971 [06:40<48:48,  1.79it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 25.75it/s]
Epoch 3:  12%|█▏        | 722/5971 [06:41<48:31,  1.80it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  51%|█████▏    | 86/167 [00:03<00:03, 26.42it/s]

Validating:  53%|█████▎    | 89/167 [00:03<00:02, 26.73it/s]
Epoch 3:  12%|█▏        | 726/5971 [06:41<48:14,  1.81it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 25.96it/s]
Epoch 3:  12%|█▏        | 730/5971 [06:41<47:57,  1.82it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 26.11it/s]
Epoch 3:  12%|█▏        | 734/5971 [06:41<47:40,  1.83it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 27.42it/s]
Epoch 3:  12%|█▏        | 738/5971 [06:41<47:23,  1.84it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  61%|██████    | 102/167 [00:04<00:02, 27.89it/s]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 26.26it/s]
Epoch 3:  12%|█▏        | 742/5971 [06:41<47:07,  1.85it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 26.82it/s]
Epoch 3:  12%|█▏        | 746/5971 [06:41<46:51,  1.86it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 27.04it/s]
Epoch 3:  13%|█▎        | 750/5971 [06:42<46:35,  1.87it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  68%|██████▊   | 114/167 [00:04<00:01, 26.58it/s]

Validating:  70%|███████   | 117/167 [00:05<00:01, 27.03it/s]
Epoch 3:  13%|█▎        | 754/5971 [06:42<46:19,  1.88it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 25.38it/s]
Epoch 3:  13%|█▎        | 758/5971 [06:42<46:03,  1.89it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 26.46it/s]
Epoch 3:  13%|█▎        | 762/5971 [06:42<45:47,  1.90it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 27.02it/s]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 27.08it/s]
Epoch 3:  13%|█▎        | 766/5971 [06:42<45:32,  1.90it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 27.34it/s]
Epoch 3:  13%|█▎        | 770/5971 [06:42<45:17,  1.91it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  81%|████████  | 135/167 [00:05<00:01, 27.14it/s]
Epoch 3:  13%|█▎        | 774/5971 [06:42<45:01,  1.92it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 27.45it/s]

Validating:  84%|████████▍ | 141/167 [00:05<00:00, 26.55it/s]
Epoch 3:  13%|█▎        | 778/5971 [06:43<44:47,  1.93it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 26.39it/s][A
Epoch 3:  13%|█▎        | 782/5971 [06:43<44:32,  1.94it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 26.71it/s][A
Epoch 3:  13%|█▎        | 786/5971 [06:43<44:17,  1.95it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 27.04it/s][A

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 26.94it/s][A
Epoch 3:  13%|█▎        | 790/5971 [06:43<44:03,  1.96it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 26.52it/s][A
Epoch 3:  13%|█▎        | 794/5971 [06:43<43:48,  1.97it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 27.04it/s][A
Epoch 3:  13%|█▎        | 798/5971 [06:43<43:34,  1.98it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 26.66it/s][A

Validating:  99%|█████████▉| 165/167 [00:06<00:00, 26.24it/s][A
Epoch 3:  13%|█▎        | 802/5971 [06:43<43:20,  1.99it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  13%|█▎        | 804/5971 [06:44<43:15,  1.99it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000321, train/loss_step=0.0958, global_step=1799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:24,  1.97it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:15,  3.16it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:00<00:11,  3.92it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:10,  4.41it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:09,  4.75it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:08,  4.97it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.12it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.21it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:01<00:07,  5.28it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.32it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.36it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.37it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.38it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:02<00:06,  5.35it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.36it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.34it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.34it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.35it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.38it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.39it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.39it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.40it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.40it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.40it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:04<00:04,  5.41it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.39it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.38it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.40it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.29it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.33it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.36it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.39it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.41it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.43it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.43it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.44it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.44it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.44it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.45it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.46it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.46it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.45it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.44it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.37it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.42it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.45it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.47it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.44it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.42it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.40it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.23it/s]

Epoch 3:  13%|█▎        | 805/5971 [06:56<44:31,  1.93it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0601, train/loss_vlb_step=0.0002, train/loss_step=0.0601, global_step=1800.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  13%|█▎        | 805/5971 [06:57<44:35,  1.93it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0601, train/loss_vlb_step=0.0002, train/loss_step=0.0601, global_step=1800.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:35,  1.37it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.44it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.25it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.86it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.32it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.60it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.81it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.01it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.10it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.11it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.07it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.16it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:07,  5.22it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.34it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.43it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.46it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.51it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.55it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.57it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.53it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.42it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.41it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.45it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.48it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.52it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.50it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.49it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.49it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.52it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.52it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.52it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.49it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.49it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.51it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.47it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.43it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.42it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.43it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.48it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.51it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.54it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.57it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.52it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.42it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.47it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.51it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.55it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.57it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.58it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.44it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.15it/s]

Epoch 3:  13%|█▎        | 806/5971 [07:08<45:45,  1.88it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0601, train/loss_vlb_step=0.0002, train/loss_step=0.0601, global_step=1800.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  13%|█▎        | 806/5971 [07:08<45:45,  1.88it/s, loss=0.156, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=9.34e-5, train/loss_step=0.026, global_step=1800.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:27,  1.79it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:16,  2.94it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:00<00:12,  3.70it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:10,  4.23it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:09,  4.64it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:08,  4.90it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.08it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.21it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.31it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.38it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.43it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.45it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:07,  5.14it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:02<00:07,  5.04it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.14it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.28it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.29it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:06,  5.23it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.30it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.36it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.41it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.49it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.52it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.56it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:04<00:04,  5.60it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.62it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.63it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.65it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.62it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.60it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.58it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.46it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.40it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.41it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.42it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.44it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.44it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.44it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.43it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.42it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.43it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.41it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.36it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.34it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.35it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.34it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.39it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.42it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.45it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.47it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.22it/s]

Epoch 3:  14%|█▎        | 807/5971 [07:21<46:58,  1.83it/s, loss=0.156, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=9.34e-5, train/loss_step=0.026, global_step=1800.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▎        | 807/5971 [07:21<46:58,  1.83it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0017, train/loss_vlb_step=1.02e-5, train/loss_step=0.0017, global_step=1800.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.31it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.38it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:15,  3.13it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.72it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.16it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.51it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.75it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.90it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  4.98it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.05it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.19it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.33it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:06,  5.43it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.51it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.44it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.37it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.40it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.42it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.45it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.44it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.35it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.45it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.49it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.55it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.58it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.51it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.41it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.37it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.32it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.31it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.25it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.17it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.15it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:03,  5.14it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.11it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.13it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.14it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.24it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.33it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.37it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.41it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.36it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.28it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.39it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.46it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.51it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.48it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.54it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.57it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.60it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.06it/s]

Epoch 3:  14%|█▎        | 808/5971 [07:34<48:21,  1.78it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0017, train/loss_vlb_step=1.02e-5, train/loss_step=0.0017, global_step=1800.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▎        | 808/5971 [07:34<48:21,  1.78it/s, loss=0.16, v_num=0, train/loss_simple_step=0.482, train/loss_vlb_step=0.00308, train/loss_step=0.482, global_step=1800.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  14%|█▎        | 809/5971 [07:35<48:23,  1.78it/s, loss=0.16, v_num=0, train/loss_simple_step=0.482, train/loss_vlb_step=0.00308, train/loss_step=0.482, global_step=1800.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▎        | 809/5971 [07:35<48:23,  1.78it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0327, train/loss_vlb_step=0.000124, train/loss_step=0.0327, global_step=1801.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▎        | 810/5971 [07:36<48:24,  1.78it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0327, train/loss_vlb_step=0.000124, train/loss_step=0.0327, global_step=1801.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▎        | 810/5971 [07:36<48:24,  1.78it/s, loss=0.137, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.000787, train/loss_step=0.219, global_step=1801.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  14%|█▎        | 811/5971 [07:37<48:26,  1.78it/s, loss=0.137, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.000787, train/loss_step=0.219, global_step=1801.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▎        | 811/5971 [07:37<48:26,  1.78it/s, loss=0.145, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.00057, train/loss_step=0.170, global_step=1801.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  14%|█▎        | 812/5971 [07:40<48:39,  1.77it/s, loss=0.145, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.00057, train/loss_step=0.170, global_step=1801.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▎        | 812/5971 [07:40<48:39,  1.77it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0457, train/loss_vlb_step=0.000153, train/loss_step=0.0457, global_step=1801.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▎        | 813/5971 [07:40<48:40,  1.77it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0457, train/loss_vlb_step=0.000153, train/loss_step=0.0457, global_step=1801.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▎        | 813/5971 [07:40<48:40,  1.77it/s, loss=0.138, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000336, train/loss_step=0.102, global_step=1802.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  14%|█▎        | 814/5971 [07:41<48:42,  1.76it/s, loss=0.138, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000336, train/loss_step=0.102, global_step=1802.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▎        | 814/5971 [07:41<48:42,  1.76it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0153, train/loss_vlb_step=6.76e-5, train/loss_step=0.0153, global_step=1802.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▎        | 815/5971 [07:42<48:43,  1.76it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0153, train/loss_vlb_step=6.76e-5, train/loss_step=0.0153, global_step=1802.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▎        | 815/5971 [07:42<48:43,  1.76it/s, loss=0.144, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000406, train/loss_step=0.123, global_step=1802.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  14%|█▎        | 816/5971 [07:45<49:00,  1.75it/s, loss=0.144, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000406, train/loss_step=0.123, global_step=1802.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▎        | 816/5971 [07:45<49:00,  1.75it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0267, train/loss_vlb_step=0.000103, train/loss_step=0.0267, global_step=1802.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▎        | 817/5971 [07:46<49:01,  1.75it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0267, train/loss_vlb_step=0.000103, train/loss_step=0.0267, global_step=1802.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▎        | 817/5971 [07:46<49:01,  1.75it/s, loss=0.102, v_num=0, train/loss_simple_step=0.013, train/loss_vlb_step=5.67e-5, train/loss_step=0.013, global_step=1803.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  14%|█▎        | 818/5971 [07:47<49:03,  1.75it/s, loss=0.102, v_num=0, train/loss_simple_step=0.013, train/loss_vlb_step=5.67e-5, train/loss_step=0.013, global_step=1803.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▎        | 818/5971 [07:47<49:03,  1.75it/s, loss=0.0781, v_num=0, train/loss_simple_step=0.00162, train/loss_vlb_step=9.64e-6, train/loss_step=0.00162, global_step=1803.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▎        | 819/5971 [07:48<49:04,  1.75it/s, loss=0.0781, v_num=0, train/loss_simple_step=0.00162, train/loss_vlb_step=9.64e-6, train/loss_step=0.00162, global_step=1803.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▎        | 819/5971 [07:48<49:04,  1.75it/s, loss=0.106, v_num=0, train/loss_simple_step=0.567, train/loss_vlb_step=0.00461, train/loss_step=0.567, global_step=1803.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:  14%|█▎        | 820/5971 [07:51<49:15,  1.74it/s, loss=0.106, v_num=0, train/loss_simple_step=0.567, train/loss_vlb_step=0.00461, train/loss_step=0.567, global_step=1803.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▎        | 820/5971 [07:51<49:15,  1.74it/s, loss=0.114, v_num=0, train/loss_simple_step=0.212, train/loss_vlb_step=0.000853, train/loss_step=0.212, global_step=1803.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▎        | 821/5971 [07:51<49:16,  1.74it/s, loss=0.114, v_num=0, train/loss_simple_step=0.212, train/loss_vlb_step=0.000853, train/loss_step=0.212, global_step=1803.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▎        | 821/5971 [07:51<49:16,  1.74it/s, loss=0.12, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000405, train/loss_step=0.122, global_step=1804.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  14%|█▍        | 822/5971 [07:52<49:18,  1.74it/s, loss=0.12, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000405, train/loss_step=0.122, global_step=1804.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 822/5971 [07:52<49:18,  1.74it/s, loss=0.142, v_num=0, train/loss_simple_step=0.512, train/loss_vlb_step=0.00617, train/loss_step=0.512, global_step=1804.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 823/5971 [07:53<49:19,  1.74it/s, loss=0.142, v_num=0, train/loss_simple_step=0.512, train/loss_vlb_step=0.00617, train/loss_step=0.512, global_step=1804.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 823/5971 [07:53<49:19,  1.74it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0448, train/loss_vlb_step=0.000156, train/loss_step=0.0448, global_step=1804.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 824/5971 [07:56<49:30,  1.73it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0448, train/loss_vlb_step=0.000156, train/loss_step=0.0448, global_step=1804.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 824/5971 [07:56<49:30,  1.73it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00247, train/loss_vlb_step=1.42e-5, train/loss_step=0.00247, global_step=1804.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 825/5971 [07:57<49:32,  1.73it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00247, train/loss_vlb_step=1.42e-5, train/loss_step=0.00247, global_step=1804.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 825/5971 [07:57<49:32,  1.73it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.38e-5, train/loss_step=0.0125, global_step=1805.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  14%|█▍        | 826/5971 [07:57<49:33,  1.73it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.38e-5, train/loss_step=0.0125, global_step=1805.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 826/5971 [07:57<49:33,  1.73it/s, loss=0.146, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.00082, train/loss_step=0.218, global_step=1805.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  14%|█▍        | 827/5971 [07:58<49:34,  1.73it/s, loss=0.146, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.00082, train/loss_step=0.218, global_step=1805.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 827/5971 [07:58<49:34,  1.73it/s, loss=0.163, v_num=0, train/loss_simple_step=0.332, train/loss_vlb_step=0.00217, train/loss_step=0.332, global_step=1805.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 828/5971 [08:01<49:47,  1.72it/s, loss=0.163, v_num=0, train/loss_simple_step=0.332, train/loss_vlb_step=0.00217, train/loss_step=0.332, global_step=1805.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 828/5971 [08:01<49:47,  1.72it/s, loss=0.161, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00319, train/loss_step=0.452, global_step=1805.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 829/5971 [08:02<49:50,  1.72it/s, loss=0.161, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00319, train/loss_step=0.452, global_step=1805.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 829/5971 [08:02<49:50,  1.72it/s, loss=0.175, v_num=0, train/loss_simple_step=0.307, train/loss_vlb_step=0.00126, train/loss_step=0.307, global_step=1806.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 830/5971 [08:03<49:51,  1.72it/s, loss=0.175, v_num=0, train/loss_simple_step=0.307, train/loss_vlb_step=0.00126, train/loss_step=0.307, global_step=1806.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 830/5971 [08:03<49:51,  1.72it/s, loss=0.17, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000424, train/loss_step=0.129, global_step=1806.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 831/5971 [08:04<49:53,  1.72it/s, loss=0.17, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000424, train/loss_step=0.129, global_step=1806.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 831/5971 [08:04<49:53,  1.72it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0413, train/loss_vlb_step=0.000143, train/loss_step=0.0413, global_step=1806.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 832/5971 [08:07<50:04,  1.71it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0413, train/loss_vlb_step=0.000143, train/loss_step=0.0413, global_step=1806.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 832/5971 [08:07<50:04,  1.71it/s, loss=0.184, v_num=0, train/loss_simple_step=0.439, train/loss_vlb_step=0.00277, train/loss_step=0.439, global_step=1806.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 833/5971 [08:07<50:06,  1.71it/s, loss=0.184, v_num=0, train/loss_simple_step=0.439, train/loss_vlb_step=0.00277, train/loss_step=0.439, global_step=1806.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 833/5971 [08:07<50:06,  1.71it/s, loss=0.179, v_num=0, train/loss_simple_step=0.00325, train/loss_vlb_step=1.71e-5, train/loss_step=0.00325, global_step=1807.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 834/5971 [08:08<50:07,  1.71it/s, loss=0.179, v_num=0, train/loss_simple_step=0.00325, train/loss_vlb_step=1.71e-5, train/loss_step=0.00325, global_step=1807.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 834/5971 [08:08<50:07,  1.71it/s, loss=0.183, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000337, train/loss_step=0.101, global_step=1807.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 835/5971 [08:09<50:08,  1.71it/s, loss=0.183, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000337, train/loss_step=0.101, global_step=1807.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 835/5971 [08:09<50:08,  1.71it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0218, train/loss_vlb_step=8.4e-5, train/loss_step=0.0218, global_step=1807.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 836/5971 [08:12<50:19,  1.70it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0218, train/loss_vlb_step=8.4e-5, train/loss_step=0.0218, global_step=1807.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 836/5971 [08:12<50:19,  1.70it/s, loss=0.189, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.00116, train/loss_step=0.249, global_step=1807.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 837/5971 [08:13<50:21,  1.70it/s, loss=0.189, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.00116, train/loss_step=0.249, global_step=1807.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 837/5971 [08:13<50:21,  1.70it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0706, train/loss_vlb_step=0.000232, train/loss_step=0.0706, global_step=1808.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 838/5971 [08:14<50:22,  1.70it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0706, train/loss_vlb_step=0.000232, train/loss_step=0.0706, global_step=1808.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 838/5971 [08:14<50:22,  1.70it/s, loss=0.203, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.000741, train/loss_step=0.219, global_step=1808.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 839/5971 [08:14<50:23,  1.70it/s, loss=0.203, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.000741, train/loss_step=0.219, global_step=1808.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 839/5971 [08:14<50:23,  1.70it/s, loss=0.194, v_num=0, train/loss_simple_step=0.395, train/loss_vlb_step=0.00211, train/loss_step=0.395, global_step=1808.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 840/5971 [08:17<50:33,  1.69it/s, loss=0.194, v_num=0, train/loss_simple_step=0.395, train/loss_vlb_step=0.00211, train/loss_step=0.395, global_step=1808.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 840/5971 [08:17<50:33,  1.69it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=5.02e-5, train/loss_step=0.0105, global_step=1808.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 841/5971 [08:18<50:34,  1.69it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=5.02e-5, train/loss_step=0.0105, global_step=1808.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 841/5971 [08:18<50:34,  1.69it/s, loss=0.19, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000836, train/loss_step=0.233, global_step=1809.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 842/5971 [08:18<50:35,  1.69it/s, loss=0.19, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000836, train/loss_step=0.233, global_step=1809.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 842/5971 [08:18<50:35,  1.69it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0498, train/loss_vlb_step=0.000168, train/loss_step=0.0498, global_step=1809.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 843/5971 [08:19<50:37,  1.69it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0498, train/loss_vlb_step=0.000168, train/loss_step=0.0498, global_step=1809.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 843/5971 [08:19<50:37,  1.69it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.53e-5, train/loss_step=0.0104, global_step=1809.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 844/5971 [08:22<50:47,  1.68it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.53e-5, train/loss_step=0.0104, global_step=1809.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 844/5971 [08:22<50:47,  1.68it/s, loss=0.185, v_num=0, train/loss_simple_step=0.405, train/loss_vlb_step=0.00279, train/loss_step=0.405, global_step=1809.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 845/5971 [08:23<50:48,  1.68it/s, loss=0.185, v_num=0, train/loss_simple_step=0.405, train/loss_vlb_step=0.00279, train/loss_step=0.405, global_step=1809.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 845/5971 [08:23<50:48,  1.68it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0203, train/loss_vlb_step=8.25e-5, train/loss_step=0.0203, global_step=1810.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 846/5971 [08:23<50:49,  1.68it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0203, train/loss_vlb_step=8.25e-5, train/loss_step=0.0203, global_step=1810.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 846/5971 [08:23<50:49,  1.68it/s, loss=0.2, v_num=0, train/loss_simple_step=0.513, train/loss_vlb_step=0.00504, train/loss_step=0.513, global_step=1810.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 847/5971 [08:24<50:50,  1.68it/s, loss=0.2, v_num=0, train/loss_simple_step=0.513, train/loss_vlb_step=0.00504, train/loss_step=0.513, global_step=1810.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 847/5971 [08:24<50:50,  1.68it/s, loss=0.217, v_num=0, train/loss_simple_step=0.670, train/loss_vlb_step=0.0164, train/loss_step=0.670, global_step=1810.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 848/5971 [08:26<50:59,  1.67it/s, loss=0.217, v_num=0, train/loss_simple_step=0.670, train/loss_vlb_step=0.0164, train/loss_step=0.670, global_step=1810.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 848/5971 [08:26<50:59,  1.67it/s, loss=0.214, v_num=0, train/loss_simple_step=0.403, train/loss_vlb_step=0.00269, train/loss_step=0.403, global_step=1810.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 849/5971 [08:27<51:00,  1.67it/s, loss=0.214, v_num=0, train/loss_simple_step=0.403, train/loss_vlb_step=0.00269, train/loss_step=0.403, global_step=1810.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 849/5971 [08:27<51:00,  1.67it/s, loss=0.205, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000397, train/loss_step=0.120, global_step=1811.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 850/5971 [08:28<51:01,  1.67it/s, loss=0.205, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000397, train/loss_step=0.120, global_step=1811.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 850/5971 [08:28<51:01,  1.67it/s, loss=0.206, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.00046, train/loss_step=0.139, global_step=1811.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 851/5971 [08:29<51:02,  1.67it/s, loss=0.206, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.00046, train/loss_step=0.139, global_step=1811.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 851/5971 [08:29<51:02,  1.67it/s, loss=0.224, v_num=0, train/loss_simple_step=0.407, train/loss_vlb_step=0.00262, train/loss_step=0.407, global_step=1811.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 852/5971 [08:31<51:11,  1.67it/s, loss=0.224, v_num=0, train/loss_simple_step=0.407, train/loss_vlb_step=0.00262, train/loss_step=0.407, global_step=1811.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 852/5971 [08:31<51:11,  1.67it/s, loss=0.203, v_num=0, train/loss_simple_step=0.0219, train/loss_vlb_step=8.71e-5, train/loss_step=0.0219, global_step=1811.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 853/5971 [08:32<51:12,  1.67it/s, loss=0.203, v_num=0, train/loss_simple_step=0.0219, train/loss_vlb_step=8.71e-5, train/loss_step=0.0219, global_step=1811.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 853/5971 [08:32<51:12,  1.67it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0158, train/loss_vlb_step=6.74e-5, train/loss_step=0.0158, global_step=1812.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 854/5971 [08:33<51:13,  1.67it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0158, train/loss_vlb_step=6.74e-5, train/loss_step=0.0158, global_step=1812.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 854/5971 [08:33<51:13,  1.67it/s, loss=0.199, v_num=0, train/loss_simple_step=0.00352, train/loss_vlb_step=1.88e-5, train/loss_step=0.00352, global_step=1812.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 855/5971 [08:34<51:14,  1.66it/s, loss=0.199, v_num=0, train/loss_simple_step=0.00352, train/loss_vlb_step=1.88e-5, train/loss_step=0.00352, global_step=1812.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 855/5971 [08:34<51:14,  1.66it/s, loss=0.198, v_num=0, train/loss_simple_step=0.00435, train/loss_vlb_step=2.22e-5, train/loss_step=0.00435, global_step=1812.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 856/5971 [08:36<51:24,  1.66it/s, loss=0.198, v_num=0, train/loss_simple_step=0.00435, train/loss_vlb_step=2.22e-5, train/loss_step=0.00435, global_step=1812.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 856/5971 [08:36<51:24,  1.66it/s, loss=0.211, v_num=0, train/loss_simple_step=0.520, train/loss_vlb_step=0.00345, train/loss_step=0.520, global_step=1812.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 857/5971 [08:37<51:25,  1.66it/s, loss=0.211, v_num=0, train/loss_simple_step=0.520, train/loss_vlb_step=0.00345, train/loss_step=0.520, global_step=1812.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 857/5971 [08:37<51:25,  1.66it/s, loss=0.22, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.00109, train/loss_step=0.251, global_step=1813.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 858/5971 [08:38<51:26,  1.66it/s, loss=0.22, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.00109, train/loss_step=0.251, global_step=1813.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 858/5971 [08:38<51:26,  1.66it/s, loss=0.221, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000868, train/loss_step=0.228, global_step=1813.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 859/5971 [08:39<51:27,  1.66it/s, loss=0.221, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000868, train/loss_step=0.228, global_step=1813.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 859/5971 [08:39<51:27,  1.66it/s, loss=0.202, v_num=0, train/loss_simple_step=0.00606, train/loss_vlb_step=3.07e-5, train/loss_step=0.00606, global_step=1813.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 860/5971 [08:41<51:35,  1.65it/s, loss=0.202, v_num=0, train/loss_simple_step=0.00606, train/loss_vlb_step=3.07e-5, train/loss_step=0.00606, global_step=1813.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 860/5971 [08:41<51:35,  1.65it/s, loss=0.207, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.00042, train/loss_step=0.128, global_step=1813.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 861/5971 [08:42<51:36,  1.65it/s, loss=0.207, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.00042, train/loss_step=0.128, global_step=1813.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 861/5971 [08:42<51:36,  1.65it/s, loss=0.216, v_num=0, train/loss_simple_step=0.403, train/loss_vlb_step=0.00212, train/loss_step=0.403, global_step=1814.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 862/5971 [08:43<51:38,  1.65it/s, loss=0.216, v_num=0, train/loss_simple_step=0.403, train/loss_vlb_step=0.00212, train/loss_step=0.403, global_step=1814.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 862/5971 [08:43<51:38,  1.65it/s, loss=0.215, v_num=0, train/loss_simple_step=0.0373, train/loss_vlb_step=0.000142, train/loss_step=0.0373, global_step=1814.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 863/5971 [08:44<51:39,  1.65it/s, loss=0.215, v_num=0, train/loss_simple_step=0.0373, train/loss_vlb_step=0.000142, train/loss_step=0.0373, global_step=1814.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 863/5971 [08:44<51:39,  1.65it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0828, train/loss_vlb_step=0.000278, train/loss_step=0.0828, global_step=1814.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 864/5971 [08:46<51:48,  1.64it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0828, train/loss_vlb_step=0.000278, train/loss_step=0.0828, global_step=1814.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 864/5971 [08:46<51:48,  1.64it/s, loss=0.199, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.55e-5, train/loss_step=0.0126, global_step=1814.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 865/5971 [08:47<51:49,  1.64it/s, loss=0.199, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.55e-5, train/loss_step=0.0126, global_step=1814.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  14%|█▍        | 865/5971 [08:47<51:49,  1.64it/s, loss=0.21, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.000891, train/loss_step=0.237, global_step=1815.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 866/5971 [08:48<51:50,  1.64it/s, loss=0.21, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.000891, train/loss_step=0.237, global_step=1815.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 866/5971 [08:48<51:50,  1.64it/s, loss=0.185, v_num=0, train/loss_simple_step=0.00202, train/loss_vlb_step=1.11e-5, train/loss_step=0.00202, global_step=1815.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 867/5971 [08:49<51:51,  1.64it/s, loss=0.185, v_num=0, train/loss_simple_step=0.00202, train/loss_vlb_step=1.11e-5, train/loss_step=0.00202, global_step=1815.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 867/5971 [08:49<51:51,  1.64it/s, loss=0.181, v_num=0, train/loss_simple_step=0.595, train/loss_vlb_step=0.00576, train/loss_step=0.595, global_step=1815.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 868/5971 [08:51<51:59,  1.64it/s, loss=0.181, v_num=0, train/loss_simple_step=0.595, train/loss_vlb_step=0.00576, train/loss_step=0.595, global_step=1815.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 868/5971 [08:51<51:59,  1.64it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0613, train/loss_vlb_step=0.000211, train/loss_step=0.0613, global_step=1815.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 869/5971 [08:52<52:01,  1.63it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0613, train/loss_vlb_step=0.000211, train/loss_step=0.0613, global_step=1815.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 869/5971 [08:52<52:01,  1.63it/s, loss=0.16, v_num=0, train/loss_simple_step=0.041, train/loss_vlb_step=0.000153, train/loss_step=0.041, global_step=1816.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 870/5971 [08:53<52:02,  1.63it/s, loss=0.16, v_num=0, train/loss_simple_step=0.041, train/loss_vlb_step=0.000153, train/loss_step=0.041, global_step=1816.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 870/5971 [08:53<52:02,  1.63it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0055, train/loss_vlb_step=2.64e-5, train/loss_step=0.0055, global_step=1816.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 871/5971 [08:53<52:03,  1.63it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0055, train/loss_vlb_step=2.64e-5, train/loss_step=0.0055, global_step=1816.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 871/5971 [08:53<52:03,  1.63it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=5.91e-5, train/loss_step=0.0138, global_step=1816.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 872/5971 [08:56<52:11,  1.63it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=5.91e-5, train/loss_step=0.0138, global_step=1816.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 872/5971 [08:56<52:11,  1.63it/s, loss=0.136, v_num=0, train/loss_simple_step=0.080, train/loss_vlb_step=0.000275, train/loss_step=0.080, global_step=1816.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 873/5971 [08:56<52:12,  1.63it/s, loss=0.136, v_num=0, train/loss_simple_step=0.080, train/loss_vlb_step=0.000275, train/loss_step=0.080, global_step=1816.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 873/5971 [08:56<52:12,  1.63it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0233, train/loss_vlb_step=8.83e-5, train/loss_step=0.0233, global_step=1817.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 874/5971 [08:57<52:13,  1.63it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0233, train/loss_vlb_step=8.83e-5, train/loss_step=0.0233, global_step=1817.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 874/5971 [08:57<52:13,  1.63it/s, loss=0.144, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000522, train/loss_step=0.158, global_step=1817.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 875/5971 [08:58<52:14,  1.63it/s, loss=0.144, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000522, train/loss_step=0.158, global_step=1817.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 875/5971 [08:58<52:14,  1.63it/s, loss=0.16, v_num=0, train/loss_simple_step=0.310, train/loss_vlb_step=0.00181, train/loss_step=0.310, global_step=1817.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 876/5971 [09:01<52:23,  1.62it/s, loss=0.16, v_num=0, train/loss_simple_step=0.310, train/loss_vlb_step=0.00181, train/loss_step=0.310, global_step=1817.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 876/5971 [09:01<52:23,  1.62it/s, loss=0.141, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000508, train/loss_step=0.150, global_step=1817.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 877/5971 [09:01<52:24,  1.62it/s, loss=0.141, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000508, train/loss_step=0.150, global_step=1817.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 877/5971 [09:01<52:24,  1.62it/s, loss=0.13, v_num=0, train/loss_simple_step=0.022, train/loss_vlb_step=9.16e-5, train/loss_step=0.022, global_step=1818.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 878/5971 [09:02<52:25,  1.62it/s, loss=0.13, v_num=0, train/loss_simple_step=0.022, train/loss_vlb_step=9.16e-5, train/loss_step=0.022, global_step=1818.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 878/5971 [09:02<52:25,  1.62it/s, loss=0.13, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.000898, train/loss_step=0.237, global_step=1818.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 879/5971 [09:03<52:26,  1.62it/s, loss=0.13, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.000898, train/loss_step=0.237, global_step=1818.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 879/5971 [09:03<52:26,  1.62it/s, loss=0.13, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=4.66e-5, train/loss_step=0.011, global_step=1818.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 880/5971 [09:05<52:34,  1.61it/s, loss=0.13, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=4.66e-5, train/loss_step=0.011, global_step=1818.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 880/5971 [09:05<52:34,  1.61it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0215, train/loss_vlb_step=8.58e-5, train/loss_step=0.0215, global_step=1818.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 881/5971 [09:06<52:35,  1.61it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0215, train/loss_vlb_step=8.58e-5, train/loss_step=0.0215, global_step=1818.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 881/5971 [09:06<52:35,  1.61it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0383, train/loss_vlb_step=0.000142, train/loss_step=0.0383, global_step=1819.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 882/5971 [09:07<52:36,  1.61it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0383, train/loss_vlb_step=0.000142, train/loss_step=0.0383, global_step=1819.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 882/5971 [09:07<52:36,  1.61it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0077, train/loss_vlb_step=3.59e-5, train/loss_step=0.0077, global_step=1819.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 883/5971 [09:08<52:36,  1.61it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0077, train/loss_vlb_step=3.59e-5, train/loss_step=0.0077, global_step=1819.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 883/5971 [09:08<52:36,  1.61it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00352, train/loss_vlb_step=1.91e-5, train/loss_step=0.00352, global_step=1819.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 884/5971 [09:10<52:46,  1.61it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00352, train/loss_vlb_step=1.91e-5, train/loss_step=0.00352, global_step=1819.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 884/5971 [09:10<52:46,  1.61it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00492, train/loss_vlb_step=2.42e-5, train/loss_step=0.00492, global_step=1819.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 885/5971 [09:11<52:47,  1.61it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00492, train/loss_vlb_step=2.42e-5, train/loss_step=0.00492, global_step=1819.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 885/5971 [09:11<52:47,  1.61it/s, loss=0.0909, v_num=0, train/loss_simple_step=0.0338, train/loss_vlb_step=0.000123, train/loss_step=0.0338, global_step=1820.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 886/5971 [09:12<52:48,  1.60it/s, loss=0.0909, v_num=0, train/loss_simple_step=0.0338, train/loss_vlb_step=0.000123, train/loss_step=0.0338, global_step=1820.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 886/5971 [09:12<52:48,  1.60it/s, loss=0.113, v_num=0, train/loss_simple_step=0.443, train/loss_vlb_step=0.00318, train/loss_step=0.443, global_step=1820.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 887/5971 [09:13<52:49,  1.60it/s, loss=0.113, v_num=0, train/loss_simple_step=0.443, train/loss_vlb_step=0.00318, train/loss_step=0.443, global_step=1820.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 887/5971 [09:13<52:49,  1.60it/s, loss=0.104, v_num=0, train/loss_simple_step=0.421, train/loss_vlb_step=0.00335, train/loss_step=0.421, global_step=1820.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 888/5971 [09:15<52:57,  1.60it/s, loss=0.104, v_num=0, train/loss_simple_step=0.421, train/loss_vlb_step=0.00335, train/loss_step=0.421, global_step=1820.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 888/5971 [09:15<52:57,  1.60it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00265, train/loss_vlb_step=1.54e-5, train/loss_step=0.00265, global_step=1820.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 889/5971 [09:16<52:58,  1.60it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00265, train/loss_vlb_step=1.54e-5, train/loss_step=0.00265, global_step=1820.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 889/5971 [09:16<52:58,  1.60it/s, loss=0.0998, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.86e-5, train/loss_step=0.0102, global_step=1821.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 890/5971 [09:17<52:58,  1.60it/s, loss=0.0998, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.86e-5, train/loss_step=0.0102, global_step=1821.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 890/5971 [09:17<52:58,  1.60it/s, loss=0.109, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000639, train/loss_step=0.189, global_step=1821.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 891/5971 [09:18<52:59,  1.60it/s, loss=0.109, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000639, train/loss_step=0.189, global_step=1821.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 891/5971 [09:18<52:59,  1.60it/s, loss=0.116, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000522, train/loss_step=0.157, global_step=1821.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 892/5971 [09:20<53:07,  1.59it/s, loss=0.116, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000522, train/loss_step=0.157, global_step=1821.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 892/5971 [09:20<53:07,  1.59it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0383, train/loss_vlb_step=0.000143, train/loss_step=0.0383, global_step=1821.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 893/5971 [09:21<53:08,  1.59it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0383, train/loss_vlb_step=0.000143, train/loss_step=0.0383, global_step=1821.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 893/5971 [09:21<53:08,  1.59it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00298, train/loss_vlb_step=1.64e-5, train/loss_step=0.00298, global_step=1822.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 894/5971 [09:22<53:09,  1.59it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00298, train/loss_vlb_step=1.64e-5, train/loss_step=0.00298, global_step=1822.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 894/5971 [09:22<53:09,  1.59it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.37e-5, train/loss_step=0.0105, global_step=1822.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  15%|█▍        | 895/5971 [09:23<53:10,  1.59it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.37e-5, train/loss_step=0.0105, global_step=1822.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▍        | 895/5971 [09:23<53:10,  1.59it/s, loss=0.0933, v_num=0, train/loss_simple_step=0.0613, train/loss_vlb_step=0.000212, train/loss_step=0.0613, global_step=1822.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▌        | 896/5971 [09:25<53:18,  1.59it/s, loss=0.0933, v_num=0, train/loss_simple_step=0.0613, train/loss_vlb_step=0.000212, train/loss_step=0.0613, global_step=1822.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▌        | 896/5971 [09:25<53:18,  1.59it/s, loss=0.113, v_num=0, train/loss_simple_step=0.542, train/loss_vlb_step=0.00691, train/loss_step=0.542, global_step=1822.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  15%|█▌        | 897/5971 [09:26<53:19,  1.59it/s, loss=0.113, v_num=0, train/loss_simple_step=0.542, train/loss_vlb_step=0.00691, train/loss_step=0.542, global_step=1822.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▌        | 897/5971 [09:26<53:19,  1.59it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.91e-5, train/loss_step=0.0128, global_step=1823.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▌        | 898/5971 [09:27<53:20,  1.59it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.91e-5, train/loss_step=0.0128, global_step=1823.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▌        | 898/5971 [09:27<53:20,  1.59it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0492, train/loss_vlb_step=0.000162, train/loss_step=0.0492, global_step=1823.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▌        | 899/5971 [09:28<53:21,  1.58it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0492, train/loss_vlb_step=0.000162, train/loss_step=0.0492, global_step=1823.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▌        | 899/5971 [09:28<53:21,  1.58it/s, loss=0.109, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.00045, train/loss_step=0.135, global_step=1823.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  15%|█▌        | 900/5971 [09:30<53:29,  1.58it/s, loss=0.109, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.00045, train/loss_step=0.135, global_step=1823.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▌        | 900/5971 [09:30<53:29,  1.58it/s, loss=0.113, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1823.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▌        | 901/5971 [09:31<53:30,  1.58it/s, loss=0.113, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=1823.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▌        | 901/5971 [09:31<53:30,  1.58it/s, loss=0.121, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.000678, train/loss_step=0.197, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▌        | 902/5971 [09:31<53:30,  1.58it/s, loss=0.121, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.000678, train/loss_step=0.197, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▌        | 902/5971 [09:31<53:30,  1.58it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=5.84e-5, train/loss_step=0.0136, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▌        | 903/5971 [09:32<53:31,  1.58it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=5.84e-5, train/loss_step=0.0136, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▌        | 903/5971 [09:32<53:31,  1.58it/s, loss=0.156, v_num=0, train/loss_simple_step=0.683, train/loss_vlb_step=0.0103, train/loss_step=0.683, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  15%|█▌        | 904/5971 [09:35<53:41,  1.57it/s, loss=0.156, v_num=0, train/loss_simple_step=0.683, train/loss_vlb_step=0.0103, train/loss_step=0.683, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  15%|█▌        | 904/5971 [09:35<53:41,  1.57it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:11,  2.34it/s][A
Epoch 3:  15%|█▌        | 906/5971 [09:35<53:35,  1.57it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   1%|          | 2/167 [00:00<00:39,  4.20it/s][A
Epoch 3:  15%|█▌        | 908/5971 [09:36<53:28,  1.58it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   3%|▎         | 5/167 [00:00<00:15, 10.42it/s][A
Epoch 3:  15%|█▌        | 911/5971 [09:36<53:16,  1.58it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   5%|▍         | 8/167 [00:00<00:10, 14.67it/s][A
Epoch 3:  15%|█▌        | 914/5971 [09:36<53:04,  1.59it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   7%|▋         | 11/167 [00:00<00:08, 17.36it/s][A
Epoch 3:  15%|█▌        | 917/5971 [09:36<52:54,  1.59it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   8%|▊         | 14/167 [00:01<00:10, 15.00it/s][A
Epoch 3:  15%|█▌        | 921/5971 [09:36<52:38,  1.60it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  10%|█         | 17/167 [00:01<00:08, 17.58it/s][A

Validating:  12%|█▏        | 20/167 [00:01<00:07, 19.70it/s][A
Epoch 3:  15%|█▌        | 925/5971 [09:36<52:23,  1.61it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  14%|█▍        | 23/167 [00:01<00:07, 20.00it/s][A
Epoch 3:  16%|█▌        | 929/5971 [09:37<52:08,  1.61it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  16%|█▌        | 26/167 [00:01<00:06, 21.01it/s][A
Epoch 3:  16%|█▌        | 933/5971 [09:37<51:53,  1.62it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  17%|█▋        | 29/167 [00:01<00:06, 21.91it/s][A

Validating:  19%|█▉        | 32/167 [00:01<00:06, 22.39it/s][A
Epoch 3:  16%|█▌        | 937/5971 [09:37<51:38,  1.62it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  21%|██        | 35/167 [00:02<00:05, 23.48it/s][A
Epoch 3:  16%|█▌        | 941/5971 [09:37<51:23,  1.63it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  23%|██▎       | 38/167 [00:02<00:05, 24.04it/s][A
Epoch 3:  16%|█▌        | 945/5971 [09:37<51:09,  1.64it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  25%|██▍       | 41/167 [00:02<00:05, 24.28it/s][A

Validating:  26%|██▋       | 44/167 [00:02<00:04, 24.96it/s][A
Epoch 3:  16%|█▌        | 949/5971 [09:37<50:54,  1.64it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 25.21it/s][A
Epoch 3:  16%|█▌        | 953/5971 [09:37<50:40,  1.65it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 24.05it/s][A
Epoch 3:  16%|█▌        | 957/5971 [09:38<50:25,  1.66it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 24.64it/s][A

Validating:  34%|███▎      | 56/167 [00:02<00:04, 24.93it/s][A
Epoch 3:  16%|█▌        | 961/5971 [09:38<50:11,  1.66it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  35%|███▌      | 59/167 [00:02<00:04, 25.98it/s][A
Epoch 3:  16%|█▌        | 965/5971 [09:38<49:57,  1.67it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  37%|███▋      | 62/167 [00:03<00:03, 26.56it/s][A
Epoch 3:  16%|█▌        | 969/5971 [09:38<49:43,  1.68it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  39%|███▉      | 65/167 [00:03<00:03, 27.20it/s][A

Validating:  41%|████      | 68/167 [00:03<00:03, 26.77it/s][A
Epoch 3:  16%|█▋        | 973/5971 [09:38<49:29,  1.68it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 27.51it/s][A
Epoch 3:  16%|█▋        | 977/5971 [09:38<49:15,  1.69it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 27.41it/s][A
Epoch 3:  16%|█▋        | 981/5971 [09:39<49:02,  1.70it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 26.08it/s][A

Validating:  48%|████▊     | 80/167 [00:03<00:03, 26.37it/s][A
Epoch 3:  16%|█▋        | 985/5971 [09:39<48:48,  1.70it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 26.55it/s][A
Epoch 3:  17%|█▋        | 989/5971 [09:39<48:35,  1.71it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  52%|█████▏    | 87/167 [00:03<00:02, 27.51it/s][A
Epoch 3:  17%|█▋        | 993/5971 [09:39<48:22,  1.72it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  54%|█████▍    | 90/167 [00:04<00:02, 27.27it/s][A
Epoch 3:  17%|█▋        | 997/5971 [09:39<48:08,  1.72it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 27.12it/s][A

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 27.51it/s][A
Epoch 3:  17%|█▋        | 1001/5971 [09:39<47:55,  1.73it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 26.51it/s][A
Epoch 3:  17%|█▋        | 1005/5971 [09:39<47:42,  1.73it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  61%|██████    | 102/167 [00:04<00:02, 27.19it/s][A
Epoch 3:  17%|█▋        | 1009/5971 [09:40<47:29,  1.74it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 27.85it/s][A
Epoch 3:  17%|█▋        | 1013/5971 [09:40<47:16,  1.75it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 28.77it/s][A

Validating:  67%|██████▋   | 112/167 [00:04<00:02, 27.36it/s][A
Epoch 3:  17%|█▋        | 1017/5971 [09:40<47:04,  1.75it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  69%|██████▉   | 115/167 [00:05<00:01, 26.98it/s][A
Epoch 3:  17%|█▋        | 1021/5971 [09:40<46:51,  1.76it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  71%|███████   | 118/167 [00:05<00:01, 26.62it/s][A
Epoch 3:  17%|█▋        | 1025/5971 [09:40<46:39,  1.77it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 26.74it/s][A

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 25.95it/s][A
Epoch 3:  17%|█▋        | 1029/5971 [09:40<46:26,  1.77it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 25.81it/s][A
Epoch 3:  17%|█▋        | 1033/5971 [09:40<46:14,  1.78it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 25.40it/s][A
Epoch 3:  17%|█▋        | 1037/5971 [09:41<46:02,  1.79it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 25.82it/s][A

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 26.11it/s][A
Epoch 3:  17%|█▋        | 1041/5971 [09:41<45:50,  1.79it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 26.41it/s][A
Epoch 3:  18%|█▊        | 1045/5971 [09:41<45:38,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  85%|████████▌ | 142/167 [00:06<00:00, 26.05it/s][A
Epoch 3:  18%|█▊        | 1049/5971 [09:41<45:26,  1.81it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 26.45it/s][A

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 27.14it/s][A
Epoch 3:  18%|█▊        | 1053/5971 [09:41<45:14,  1.81it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 26.20it/s][A
Epoch 3:  18%|█▊        | 1057/5971 [09:41<45:02,  1.82it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 25.90it/s][A
Epoch 3:  18%|█▊        | 1061/5971 [09:42<44:50,  1.82it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 25.89it/s][A

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 25.23it/s][A
Epoch 3:  18%|█▊        | 1065/5971 [09:42<44:39,  1.83it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 25.61it/s][A
Epoch 3:  18%|█▊        | 1069/5971 [09:42<44:28,  1.84it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 25.22it/s][A
Epoch 3:  18%|█▊        | 1072/5971 [09:42<44:20,  1.84it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Epoch 3:  18%|█▊        | 1073/5971 [09:43<44:21,  1.84it/s, loss=0.156, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.39e-5, train/loss_step=0.021, global_step=1824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  18%|█▊        | 1073/5971 [09:43<44:21,  1.84it/s, loss=0.16, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000344, train/loss_step=0.105, global_step=1825.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  18%|█▊        | 1074/5971 [09:44<44:22,  1.84it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0242, train/loss_vlb_step=9.52e-5, train/loss_step=0.0242, global_step=1825.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  18%|█▊        | 1075/5971 [09:45<44:23,  1.84it/s, loss=0.128, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000705, train/loss_step=0.194, global_step=1825.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  18%|█▊        | 1076/5971 [09:48<44:33,  1.83it/s, loss=0.155, v_num=0, train/loss_simple_step=0.556, train/loss_vlb_step=0.00863, train/loss_step=0.556, global_step=1825.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  18%|█▊        | 1077/5971 [09:49<44:34,  1.83it/s, loss=0.155, v_num=0, train/loss_simple_step=0.556, train/loss_vlb_step=0.00863, train/loss_step=0.556, global_step=1825.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  18%|█▊        | 1077/5971 [09:49<44:34,  1.83it/s, loss=0.177, v_num=0, train/loss_simple_step=0.450, train/loss_vlb_step=0.00268, train/loss_step=0.450, global_step=1826.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  18%|█▊        | 1078/5971 [09:50<44:35,  1.83it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.56e-5, train/loss_step=0.0132, global_step=1826.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  18%|█▊        | 1079/5971 [09:50<44:36,  1.83it/s, loss=0.168, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000477, train/loss_step=0.145, global_step=1826.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  18%|█▊        | 1080/5971 [09:53<44:45,  1.82it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0221, train/loss_vlb_step=8.97e-5, train/loss_step=0.0221, global_step=1826.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  18%|█▊        | 1081/5971 [09:54<44:46,  1.82it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0221, train/loss_vlb_step=8.97e-5, train/loss_step=0.0221, global_step=1826.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  18%|█▊        | 1081/5971 [09:54<44:46,  1.82it/s, loss=0.181, v_num=0, train/loss_simple_step=0.286, train/loss_vlb_step=0.0012, train/loss_step=0.286, global_step=1827.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  18%|█▊        | 1082/5971 [09:55<44:47,  1.82it/s, loss=0.191, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000709, train/loss_step=0.200, global_step=1827.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  18%|█▊        | 1083/5971 [09:56<44:48,  1.82it/s, loss=0.2, v_num=0, train/loss_simple_step=0.245, train/loss_vlb_step=0.000914, train/loss_step=0.245, global_step=1827.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  18%|█▊        | 1084/5971 [09:58<44:56,  1.81it/s, loss=0.179, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000422, train/loss_step=0.127, global_step=1827.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  18%|█▊        | 1085/5971 [09:59<44:57,  1.81it/s, loss=0.179, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000422, train/loss_step=0.127, global_step=1827.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  18%|█▊        | 1085/5971 [09:59<44:57,  1.81it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0456, train/loss_vlb_step=0.000167, train/loss_step=0.0456, global_step=1828.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  18%|█▊        | 1086/5971 [10:00<44:58,  1.81it/s, loss=0.189, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000809, train/loss_step=0.217, global_step=1828.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  18%|█▊        | 1087/5971 [10:01<44:59,  1.81it/s, loss=0.188, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000391, train/loss_step=0.118, global_step=1828.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  18%|█▊        | 1088/5971 [10:03<45:06,  1.80it/s, loss=0.189, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000377, train/loss_step=0.115, global_step=1828.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  18%|█▊        | 1089/5971 [10:04<45:07,  1.80it/s, loss=0.189, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000377, train/loss_step=0.115, global_step=1828.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  18%|█▊        | 1089/5971 [10:04<45:07,  1.80it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.86e-5, train/loss_step=0.0102, global_step=1829.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  18%|█▊        | 1090/5971 [10:05<45:08,  1.80it/s, loss=0.195, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00159, train/loss_step=0.318, global_step=1829.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  18%|█▊        | 1091/5971 [10:06<45:09,  1.80it/s, loss=0.178, v_num=0, train/loss_simple_step=0.351, train/loss_vlb_step=0.00194, train/loss_step=0.351, global_step=1829.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  18%|█▊        | 1092/5971 [10:08<45:16,  1.80it/s, loss=0.199, v_num=0, train/loss_simple_step=0.446, train/loss_vlb_step=0.00268, train/loss_step=0.446, global_step=1829.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  18%|█▊        | 1093/5971 [10:09<45:17,  1.79it/s, loss=0.199, v_num=0, train/loss_simple_step=0.446, train/loss_vlb_step=0.00268, train/loss_step=0.446, global_step=1829.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  18%|█▊        | 1093/5971 [10:09<45:17,  1.79it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0592, train/loss_vlb_step=0.0002, train/loss_step=0.0592, global_step=1830.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  18%|█▊        | 1094/5971 [10:10<45:18,  1.79it/s, loss=0.22, v_num=0, train/loss_simple_step=0.481, train/loss_vlb_step=0.00395, train/loss_step=0.481, global_step=1830.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  18%|█▊        | 1095/5971 [10:11<45:19,  1.79it/s, loss=0.227, v_num=0, train/loss_simple_step=0.332, train/loss_vlb_step=0.00147, train/loss_step=0.332, global_step=1830.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  18%|█▊        | 1096/5971 [10:13<45:26,  1.79it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0999, train/loss_vlb_step=0.000329, train/loss_step=0.0999, global_step=1830.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  18%|█▊        | 1097/5971 [10:14<45:27,  1.79it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0999, train/loss_vlb_step=0.000329, train/loss_step=0.0999, global_step=1830.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  18%|█▊        | 1097/5971 [10:14<45:27,  1.79it/s, loss=0.204, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.0035, train/loss_step=0.448, global_step=1831.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  18%|█▊        | 1098/5971 [10:15<45:28,  1.79it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.65e-5, train/loss_step=0.0105, global_step=1831.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  18%|█▊        | 1099/5971 [10:16<45:29,  1.79it/s, loss=0.204, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000507, train/loss_step=0.154, global_step=1831.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  18%|█▊        | 1100/5971 [10:18<45:36,  1.78it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.87e-5, train/loss_step=0.0132, global_step=1831.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  18%|█▊        | 1101/5971 [10:19<45:37,  1.78it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.87e-5, train/loss_step=0.0132, global_step=1831.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  18%|█▊        | 1101/5971 [10:19<45:37,  1.78it/s, loss=0.19, v_num=0, train/loss_simple_step=0.00397, train/loss_vlb_step=2.13e-5, train/loss_step=0.00397, global_step=1832.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  18%|█▊        | 1102/5971 [10:20<45:38,  1.78it/s, loss=0.18, v_num=0, train/loss_simple_step=0.00613, train/loss_vlb_step=3.15e-5, train/loss_step=0.00613, global_step=1832.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  18%|█▊        | 1103/5971 [10:21<45:39,  1.78it/s, loss=0.169, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=7.07e-5, train/loss_step=0.016, global_step=1832.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  18%|█▊        | 1104/5971 [10:23<45:45,  1.77it/s, loss=0.195, v_num=0, train/loss_simple_step=0.664, train/loss_vlb_step=0.0313, train/loss_step=0.664, global_step=1832.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  19%|█▊        | 1105/5971 [10:24<45:46,  1.77it/s, loss=0.195, v_num=0, train/loss_simple_step=0.664, train/loss_vlb_step=0.0313, train/loss_step=0.664, global_step=1832.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▊        | 1105/5971 [10:24<45:46,  1.77it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.87e-5, train/loss_step=0.0173, global_step=1833.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▊        | 1106/5971 [10:25<45:47,  1.77it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0296, train/loss_vlb_step=0.000107, train/loss_step=0.0296, global_step=1833.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▊        | 1107/5971 [10:26<45:48,  1.77it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0864, train/loss_vlb_step=0.000284, train/loss_step=0.0864, global_step=1833.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▊        | 1108/5971 [10:28<45:55,  1.77it/s, loss=0.184, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000451, train/loss_step=0.137, global_step=1833.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  19%|█▊        | 1109/5971 [10:29<45:56,  1.76it/s, loss=0.184, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000451, train/loss_step=0.137, global_step=1833.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▊        | 1109/5971 [10:29<45:56,  1.76it/s, loss=0.199, v_num=0, train/loss_simple_step=0.303, train/loss_vlb_step=0.00174, train/loss_step=0.303, global_step=1834.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  19%|█▊        | 1110/5971 [10:30<45:57,  1.76it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0107, train/loss_vlb_step=4.82e-5, train/loss_step=0.0107, global_step=1834.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▊        | 1111/5971 [10:30<45:57,  1.76it/s, loss=0.178, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000876, train/loss_step=0.233, global_step=1834.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  19%|█▊        | 1112/5971 [10:33<46:05,  1.76it/s, loss=0.164, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000748, train/loss_step=0.175, global_step=1834.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▊        | 1113/5971 [10:34<46:06,  1.76it/s, loss=0.164, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000748, train/loss_step=0.175, global_step=1834.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▊        | 1113/5971 [10:34<46:06,  1.76it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0016, train/loss_vlb_step=9.05e-6, train/loss_step=0.0016, global_step=1835.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▊        | 1114/5971 [10:35<46:07,  1.76it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00841, train/loss_vlb_step=4.12e-5, train/loss_step=0.00841, global_step=1835.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▊        | 1115/5971 [10:36<46:07,  1.75it/s, loss=0.132, v_num=0, train/loss_simple_step=0.212, train/loss_vlb_step=0.000836, train/loss_step=0.212, global_step=1835.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  19%|█▊        | 1116/5971 [10:38<46:14,  1.75it/s, loss=0.14, v_num=0, train/loss_simple_step=0.272, train/loss_vlb_step=0.0012, train/loss_step=0.272, global_step=1835.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  19%|█▊        | 1117/5971 [10:39<46:14,  1.75it/s, loss=0.14, v_num=0, train/loss_simple_step=0.272, train/loss_vlb_step=0.0012, train/loss_step=0.272, global_step=1835.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▊        | 1117/5971 [10:39<46:14,  1.75it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0777, train/loss_vlb_step=0.000257, train/loss_step=0.0777, global_step=1836.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▊        | 1118/5971 [10:40<46:15,  1.75it/s, loss=0.136, v_num=0, train/loss_simple_step=0.302, train/loss_vlb_step=0.00146, train/loss_step=0.302, global_step=1836.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  19%|█▊        | 1119/5971 [10:40<46:16,  1.75it/s, loss=0.141, v_num=0, train/loss_simple_step=0.260, train/loss_vlb_step=0.00108, train/loss_step=0.260, global_step=1836.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1120/5971 [10:43<46:23,  1.74it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0916, train/loss_vlb_step=0.000309, train/loss_step=0.0916, global_step=1836.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1121/5971 [10:44<46:24,  1.74it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0916, train/loss_vlb_step=0.000309, train/loss_step=0.0916, global_step=1836.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1121/5971 [10:44<46:24,  1.74it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0365, train/loss_vlb_step=0.000138, train/loss_step=0.0365, global_step=1837.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1122/5971 [10:45<46:25,  1.74it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0372, train/loss_vlb_step=0.000134, train/loss_step=0.0372, global_step=1837.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1123/5971 [10:45<46:25,  1.74it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0381, train/loss_vlb_step=0.000142, train/loss_step=0.0381, global_step=1837.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  19%|█▉        | 1124/5971 [10:48<46:32,  1.74it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000157, train/loss_step=0.0406, global_step=1837.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1125/5971 [10:49<46:33,  1.73it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000157, train/loss_step=0.0406, global_step=1837.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1125/5971 [10:49<46:33,  1.73it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00396, train/loss_vlb_step=2.04e-5, train/loss_step=0.00396, global_step=1838.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1126/5971 [10:49<46:33,  1.73it/s, loss=0.128, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000819, train/loss_step=0.242, global_step=1838.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  19%|█▉        | 1127/5971 [10:50<46:34,  1.73it/s, loss=0.129, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000336, train/loss_step=0.102, global_step=1838.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1128/5971 [10:52<46:40,  1.73it/s, loss=0.129, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000419, train/loss_step=0.127, global_step=1838.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1129/5971 [10:53<46:41,  1.73it/s, loss=0.129, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000419, train/loss_step=0.127, global_step=1838.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1129/5971 [10:53<46:41,  1.73it/s, loss=0.114, v_num=0, train/loss_simple_step=0.008, train/loss_vlb_step=4.12e-5, train/loss_step=0.008, global_step=1839.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  19%|█▉        | 1130/5971 [10:54<46:42,  1.73it/s, loss=0.135, v_num=0, train/loss_simple_step=0.430, train/loss_vlb_step=0.00257, train/loss_step=0.430, global_step=1839.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1131/5971 [10:55<46:43,  1.73it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00262, train/loss_vlb_step=1.44e-5, train/loss_step=0.00262, global_step=1839.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1132/5971 [10:57<46:49,  1.72it/s, loss=0.135, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00216, train/loss_step=0.406, global_step=1839.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  19%|█▉        | 1133/5971 [10:58<46:50,  1.72it/s, loss=0.135, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00216, train/loss_step=0.406, global_step=1839.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1133/5971 [10:58<46:50,  1.72it/s, loss=0.142, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000451, train/loss_step=0.137, global_step=1840.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1134/5971 [10:59<46:50,  1.72it/s, loss=0.173, v_num=0, train/loss_simple_step=0.641, train/loss_vlb_step=0.00972, train/loss_step=0.641, global_step=1840.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  19%|█▉        | 1135/5971 [11:00<46:51,  1.72it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00422, train/loss_vlb_step=2.21e-5, train/loss_step=0.00422, global_step=1840.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1136/5971 [11:02<46:57,  1.72it/s, loss=0.175, v_num=0, train/loss_simple_step=0.507, train/loss_vlb_step=0.00392, train/loss_step=0.507, global_step=1840.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  19%|█▉        | 1137/5971 [11:03<46:58,  1.72it/s, loss=0.175, v_num=0, train/loss_simple_step=0.507, train/loss_vlb_step=0.00392, train/loss_step=0.507, global_step=1840.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1137/5971 [11:03<46:58,  1.72it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0019, train/loss_vlb_step=1.14e-5, train/loss_step=0.0019, global_step=1841.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1138/5971 [11:04<46:58,  1.71it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0487, train/loss_vlb_step=0.000168, train/loss_step=0.0487, global_step=1841.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1139/5971 [11:05<46:59,  1.71it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0753, train/loss_vlb_step=0.000248, train/loss_step=0.0753, global_step=1841.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1140/5971 [11:07<47:07,  1.71it/s, loss=0.158, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.00115, train/loss_step=0.278, global_step=1841.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  19%|█▉        | 1141/5971 [11:08<47:08,  1.71it/s, loss=0.158, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.00115, train/loss_step=0.278, global_step=1841.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1141/5971 [11:08<47:08,  1.71it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0213, train/loss_vlb_step=8.47e-5, train/loss_step=0.0213, global_step=1842.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1142/5971 [11:09<47:08,  1.71it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00253, train/loss_vlb_step=1.46e-5, train/loss_step=0.00253, global_step=1842.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1143/5971 [11:10<47:09,  1.71it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00171, train/loss_vlb_step=1.04e-5, train/loss_step=0.00171, global_step=1842.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1144/5971 [11:12<47:15,  1.70it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0378, train/loss_vlb_step=0.000144, train/loss_step=0.0378, global_step=1842.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  19%|█▉        | 1145/5971 [11:13<47:16,  1.70it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0378, train/loss_vlb_step=0.000144, train/loss_step=0.0378, global_step=1842.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1145/5971 [11:13<47:16,  1.70it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00215, train/loss_vlb_step=1.2e-5, train/loss_step=0.00215, global_step=1843.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1146/5971 [11:14<47:16,  1.70it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0546, train/loss_vlb_step=0.00019, train/loss_step=0.0546, global_step=1843.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  19%|█▉        | 1147/5971 [11:15<47:17,  1.70it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0193, train/loss_vlb_step=7.79e-5, train/loss_step=0.0193, global_step=1843.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  19%|█▉        | 1148/5971 [11:17<47:23,  1.70it/s, loss=0.135, v_num=0, train/loss_simple_step=0.022, train/loss_vlb_step=9.17e-5, train/loss_step=0.022, global_step=1843.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  19%|█▉        | 1149/5971 [11:18<47:24,  1.70it/s, loss=0.135, v_num=0, train/loss_simple_step=0.022, train/loss_vlb_step=9.17e-5, train/loss_step=0.022, global_step=1843.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1149/5971 [11:18<47:24,  1.70it/s, loss=0.148, v_num=0, train/loss_simple_step=0.275, train/loss_vlb_step=0.00124, train/loss_step=0.275, global_step=1844.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1150/5971 [11:19<47:24,  1.69it/s, loss=0.143, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.00172, train/loss_step=0.313, global_step=1844.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1151/5971 [11:20<47:25,  1.69it/s, loss=0.175, v_num=0, train/loss_simple_step=0.651, train/loss_vlb_step=0.0127, train/loss_step=0.651, global_step=1844.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  19%|█▉        | 1152/5971 [11:22<47:31,  1.69it/s, loss=0.16, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000373, train/loss_step=0.113, global_step=1844.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1153/5971 [11:23<47:32,  1.69it/s, loss=0.16, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000373, train/loss_step=0.113, global_step=1844.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1153/5971 [11:23<47:32,  1.69it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=4.51e-5, train/loss_step=0.0109, global_step=1845.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1154/5971 [11:24<47:32,  1.69it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00308, train/loss_vlb_step=1.68e-5, train/loss_step=0.00308, global_step=1845.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1155/5971 [11:24<47:33,  1.69it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0381, train/loss_vlb_step=0.000134, train/loss_step=0.0381, global_step=1845.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  19%|█▉        | 1156/5971 [11:27<47:39,  1.68it/s, loss=0.1, v_num=0, train/loss_simple_step=0.030, train/loss_vlb_step=0.000115, train/loss_step=0.030, global_step=1845.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  19%|█▉        | 1157/5971 [11:27<47:39,  1.68it/s, loss=0.1, v_num=0, train/loss_simple_step=0.030, train/loss_vlb_step=0.000115, train/loss_step=0.030, global_step=1845.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1157/5971 [11:27<47:39,  1.68it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0145, train/loss_vlb_step=6.47e-5, train/loss_step=0.0145, global_step=1846.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1158/5971 [11:28<47:40,  1.68it/s, loss=0.123, v_num=0, train/loss_simple_step=0.494, train/loss_vlb_step=0.00352, train/loss_step=0.494, global_step=1846.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  19%|█▉        | 1159/5971 [11:29<47:41,  1.68it/s, loss=0.125, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000402, train/loss_step=0.122, global_step=1846.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1160/5971 [11:32<47:48,  1.68it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.27e-5, train/loss_step=0.0119, global_step=1846.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1161/5971 [11:33<47:49,  1.68it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.27e-5, train/loss_step=0.0119, global_step=1846.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1161/5971 [11:33<47:49,  1.68it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0208, train/loss_vlb_step=8.22e-5, train/loss_step=0.0208, global_step=1847.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1162/5971 [11:34<47:50,  1.68it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0536, train/loss_vlb_step=0.000189, train/loss_step=0.0536, global_step=1847.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1163/5971 [11:34<47:50,  1.67it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0361, train/loss_vlb_step=0.000125, train/loss_step=0.0361, global_step=1847.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  19%|█▉        | 1164/5971 [11:37<47:57,  1.67it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0041, train/loss_vlb_step=2.14e-5, train/loss_step=0.0041, global_step=1847.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  20%|█▉        | 1165/5971 [11:38<47:57,  1.67it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0041, train/loss_vlb_step=2.14e-5, train/loss_step=0.0041, global_step=1847.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  20%|█▉        | 1165/5971 [11:38<47:57,  1.67it/s, loss=0.134, v_num=0, train/loss_simple_step=0.396, train/loss_vlb_step=0.00179, train/loss_step=0.396, global_step=1848.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  20%|█▉        | 1166/5971 [11:39<47:58,  1.67it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.96e-5, train/loss_step=0.0142, global_step=1848.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  20%|█▉        | 1167/5971 [11:39<47:58,  1.67it/s, loss=0.133, v_num=0, train/loss_simple_step=0.033, train/loss_vlb_step=0.000124, train/loss_step=0.033, global_step=1848.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  20%|█▉        | 1168/5971 [11:42<48:05,  1.66it/s, loss=0.155, v_num=0, train/loss_simple_step=0.465, train/loss_vlb_step=0.00362, train/loss_step=0.465, global_step=1848.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  20%|█▉        | 1169/5971 [11:43<48:05,  1.66it/s, loss=0.155, v_num=0, train/loss_simple_step=0.465, train/loss_vlb_step=0.00362, train/loss_step=0.465, global_step=1848.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  20%|█▉        | 1169/5971 [11:43<48:05,  1.66it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0019, train/loss_vlb_step=1.06e-5, train/loss_step=0.0019, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  20%|█▉        | 1170/5971 [11:44<48:06,  1.66it/s, loss=0.134, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000589, train/loss_step=0.158, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  20%|█▉        | 1171/5971 [11:44<48:07,  1.66it/s, loss=0.108, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.00048, train/loss_step=0.144, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  20%|█▉        | 1172/5971 [11:47<48:13,  1.66it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  20%|█▉        | 1173/5971 [11:47<48:10,  1.66it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:24,  1.96it/s][A

Validating:   1%|          | 2/167 [00:00<00:45,  3.60it/s][A
Epoch 3:  20%|█▉        | 1177/5971 [11:48<48:01,  1.66it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.15it/s][A

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.74it/s][A
Epoch 3:  20%|█▉        | 1181/5971 [11:48<47:49,  1.67it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   7%|▋         | 11/167 [00:00<00:08, 17.64it/s][A
Epoch 3:  20%|█▉        | 1185/5971 [11:48<47:38,  1.67it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   8%|▊         | 14/167 [00:01<00:07, 20.35it/s][A
Epoch 3:  20%|█▉        | 1189/5971 [11:48<47:27,  1.68it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  10%|█         | 17/167 [00:01<00:06, 21.77it/s][A

Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.97it/s][A
Epoch 3:  20%|█▉        | 1193/5971 [11:48<47:15,  1.68it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 22.86it/s][A
Epoch 3:  20%|██        | 1197/5971 [11:48<47:04,  1.69it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.53it/s][A
Epoch 3:  20%|██        | 1201/5971 [11:48<46:53,  1.70it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 25.37it/s][A

Validating:  19%|█▉        | 32/167 [00:01<00:05, 25.68it/s][A
Epoch 3:  20%|██        | 1205/5971 [11:49<46:42,  1.70it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  21%|██        | 35/167 [00:01<00:05, 25.54it/s][A
Epoch 3:  20%|██        | 1209/5971 [11:49<46:31,  1.71it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  23%|██▎       | 38/167 [00:02<00:04, 25.84it/s][A
Epoch 3:  20%|██        | 1213/5971 [11:49<46:20,  1.71it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 25.55it/s][A

Validating:  26%|██▋       | 44/167 [00:02<00:04, 25.89it/s][A
Epoch 3:  20%|██        | 1217/5971 [11:49<46:09,  1.72it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 25.31it/s][A
Epoch 3:  20%|██        | 1221/5971 [11:49<45:58,  1.72it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 26.00it/s][A
Epoch 3:  21%|██        | 1225/5971 [11:49<45:48,  1.73it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 26.60it/s][A

Validating:  34%|███▎      | 56/167 [00:02<00:04, 26.79it/s][A
Epoch 3:  21%|██        | 1229/5971 [11:50<45:37,  1.73it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  35%|███▌      | 59/167 [00:02<00:04, 24.89it/s][A
Epoch 3:  21%|██        | 1233/5971 [11:50<45:26,  1.74it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  37%|███▋      | 62/167 [00:02<00:04, 25.02it/s][A
Epoch 3:  21%|██        | 1237/5971 [11:50<45:16,  1.74it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  39%|███▉      | 65/167 [00:03<00:03, 25.87it/s][A

Validating:  41%|████      | 68/167 [00:03<00:03, 26.85it/s][A
Epoch 3:  21%|██        | 1241/5971 [11:50<45:05,  1.75it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 27.37it/s][A
Epoch 3:  21%|██        | 1245/5971 [11:50<44:55,  1.75it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 27.40it/s][A
Epoch 3:  21%|██        | 1249/5971 [11:50<44:45,  1.76it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 27.54it/s][A

Validating:  48%|████▊     | 80/167 [00:03<00:03, 27.80it/s][A
Epoch 3:  21%|██        | 1253/5971 [11:50<44:34,  1.76it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 27.84it/s]
Epoch 3:  21%|██        | 1257/5971 [11:51<44:24,  1.77it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  51%|█████▏    | 86/167 [00:03<00:02, 28.30it/s]
Epoch 3:  21%|██        | 1261/5971 [11:51<44:14,  1.77it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  53%|█████▎    | 89/167 [00:03<00:02, 28.09it/s]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 27.03it/s]
Epoch 3:  21%|██        | 1265/5971 [11:51<44:04,  1.78it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 26.15it/s]
Epoch 3:  21%|██▏       | 1269/5971 [11:51<43:54,  1.78it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 25.81it/s]
Epoch 3:  21%|██▏       | 1273/5971 [11:51<43:44,  1.79it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  60%|██████    | 101/167 [00:04<00:02, 25.66it/s]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 26.38it/s]
Epoch 3:  21%|██▏       | 1277/5971 [11:51<43:34,  1.80it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 26.18it/s]
Epoch 3:  21%|██▏       | 1281/5971 [11:51<43:24,  1.80it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 24.58it/s]
Epoch 3:  22%|██▏       | 1285/5971 [11:52<43:14,  1.81it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  68%|██████▊   | 114/167 [00:04<00:02, 26.27it/s]
Epoch 3:  22%|██▏       | 1289/5971 [11:52<43:05,  1.81it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  70%|███████   | 117/167 [00:04<00:01, 26.10it/s]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 27.00it/s]
Epoch 3:  22%|██▏       | 1293/5971 [11:52<42:55,  1.82it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 25.99it/s]
Epoch 3:  22%|██▏       | 1297/5971 [11:52<42:45,  1.82it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 25.62it/s]
Epoch 3:  22%|██▏       | 1301/5971 [11:52<42:36,  1.83it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 25.34it/s]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 25.31it/s]
Epoch 3:  22%|██▏       | 1305/5971 [11:52<42:27,  1.83it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  81%|████████  | 135/167 [00:05<00:01, 25.49it/s]
Epoch 3:  22%|██▏       | 1309/5971 [11:53<42:17,  1.84it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 26.49it/s]
Epoch 3:  22%|██▏       | 1313/5971 [11:53<42:08,  1.84it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  84%|████████▍ | 141/167 [00:05<00:00, 27.44it/s]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 27.88it/s]
Epoch 3:  22%|██▏       | 1317/5971 [11:53<41:58,  1.85it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 24.77it/s]
Epoch 3:  22%|██▏       | 1321/5971 [11:53<41:49,  1.85it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 25.69it/s]
Epoch 3:  22%|██▏       | 1325/5971 [11:53<41:40,  1.86it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 26.04it/s]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 26.29it/s]
Epoch 3:  22%|██▏       | 1329/5971 [11:53<41:31,  1.86it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 26.91it/s]
Epoch 3:  22%|██▏       | 1333/5971 [11:53<41:22,  1.87it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 26.55it/s]
Epoch 3:  22%|██▏       | 1337/5971 [11:54<41:13,  1.87it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  99%|█████████▉| 165/167 [00:06<00:00, 26.90it/s]
Epoch 3:  22%|██▏       | 1340/5971 [11:54<41:07,  1.88it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Epoch 3:  22%|██▏       | 1341/5971 [11:55<41:08,  1.88it/s, loss=0.137, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0324, train/loss_step=0.687, global_step=1849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  22%|██▏       | 1341/5971 [11:55<41:08,  1.88it/s, loss=0.143, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000436, train/loss_step=0.133, global_step=1850.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  22%|██▏       | 1342/5971 [11:56<41:09,  1.87it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0436, train/loss_vlb_step=0.000159, train/loss_step=0.0436, global_step=1850.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  22%|██▏       | 1343/5971 [11:57<41:09,  1.87it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000193, train/loss_step=0.0552, global_step=1850.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1344/5971 [11:59<41:14,  1.87it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0425, train/loss_vlb_step=0.000153, train/loss_step=0.0425, global_step=1850.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1345/5971 [12:00<41:15,  1.87it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0425, train/loss_vlb_step=0.000153, train/loss_step=0.0425, global_step=1850.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1345/5971 [12:00<41:15,  1.87it/s, loss=0.166, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00243, train/loss_step=0.406, global_step=1851.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  23%|██▎       | 1346/5971 [12:01<41:16,  1.87it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00228, train/loss_vlb_step=1.24e-5, train/loss_step=0.00228, global_step=1851.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1347/5971 [12:02<41:17,  1.87it/s, loss=0.144, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000609, train/loss_step=0.177, global_step=1851.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  23%|██▎       | 1348/5971 [12:04<41:23,  1.86it/s, loss=0.145, v_num=0, train/loss_simple_step=0.031, train/loss_vlb_step=0.000112, train/loss_step=0.031, global_step=1851.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1349/5971 [12:05<41:23,  1.86it/s, loss=0.145, v_num=0, train/loss_simple_step=0.031, train/loss_vlb_step=0.000112, train/loss_step=0.031, global_step=1851.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1349/5971 [12:05<41:23,  1.86it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0158, train/loss_vlb_step=6.27e-5, train/loss_step=0.0158, global_step=1852.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1350/5971 [12:06<41:24,  1.86it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0293, train/loss_vlb_step=0.000106, train/loss_step=0.0293, global_step=1852.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1351/5971 [12:07<41:25,  1.86it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00764, train/loss_vlb_step=3.57e-5, train/loss_step=0.00764, global_step=1852.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1352/5971 [12:09<41:30,  1.85it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00614, train/loss_vlb_step=3.03e-5, train/loss_step=0.00614, global_step=1852.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1353/5971 [12:10<41:31,  1.85it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00614, train/loss_vlb_step=3.03e-5, train/loss_step=0.00614, global_step=1852.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1353/5971 [12:10<41:31,  1.85it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00408, train/loss_vlb_step=2.29e-5, train/loss_step=0.00408, global_step=1853.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1354/5971 [12:11<41:32,  1.85it/s, loss=0.133, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000835, train/loss_step=0.228, global_step=1853.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  23%|██▎       | 1355/5971 [12:12<41:32,  1.85it/s, loss=0.14, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000559, train/loss_step=0.166, global_step=1853.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  23%|██▎       | 1356/5971 [12:14<41:39,  1.85it/s, loss=0.143, v_num=0, train/loss_simple_step=0.513, train/loss_vlb_step=0.00559, train/loss_step=0.513, global_step=1853.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1357/5971 [12:16<41:40,  1.84it/s, loss=0.143, v_num=0, train/loss_simple_step=0.513, train/loss_vlb_step=0.00559, train/loss_step=0.513, global_step=1853.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1357/5971 [12:16<41:40,  1.84it/s, loss=0.174, v_num=0, train/loss_simple_step=0.639, train/loss_vlb_step=0.00814, train/loss_step=0.639, global_step=1854.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1358/5971 [12:16<41:41,  1.84it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00333, train/loss_vlb_step=1.8e-5, train/loss_step=0.00333, global_step=1854.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1359/5971 [12:17<41:42,  1.84it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0211, train/loss_vlb_step=8.77e-5, train/loss_step=0.0211, global_step=1854.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  23%|██▎       | 1360/5971 [12:19<41:46,  1.84it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0516, train/loss_vlb_step=0.00018, train/loss_step=0.0516, global_step=1854.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1361/5971 [12:20<41:47,  1.84it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0516, train/loss_vlb_step=0.00018, train/loss_step=0.0516, global_step=1854.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1361/5971 [12:20<41:47,  1.84it/s, loss=0.153, v_num=0, train/loss_simple_step=0.626, train/loss_vlb_step=0.00661, train/loss_step=0.626, global_step=1855.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  23%|██▎       | 1362/5971 [12:21<41:48,  1.84it/s, loss=0.154, v_num=0, train/loss_simple_step=0.046, train/loss_vlb_step=0.00017, train/loss_step=0.046, global_step=1855.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1363/5971 [12:22<41:48,  1.84it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0204, train/loss_vlb_step=8.61e-5, train/loss_step=0.0204, global_step=1855.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1364/5971 [12:24<41:53,  1.83it/s, loss=0.163, v_num=0, train/loss_simple_step=0.273, train/loss_vlb_step=0.00115, train/loss_step=0.273, global_step=1855.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  23%|██▎       | 1365/5971 [12:25<41:54,  1.83it/s, loss=0.163, v_num=0, train/loss_simple_step=0.273, train/loss_vlb_step=0.00115, train/loss_step=0.273, global_step=1855.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1365/5971 [12:25<41:54,  1.83it/s, loss=0.155, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.000825, train/loss_step=0.236, global_step=1856.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1366/5971 [12:26<41:55,  1.83it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00233, train/loss_vlb_step=1.33e-5, train/loss_step=0.00233, global_step=1856.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1367/5971 [12:27<41:55,  1.83it/s, loss=0.163, v_num=0, train/loss_simple_step=0.339, train/loss_vlb_step=0.00183, train/loss_step=0.339, global_step=1856.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  23%|██▎       | 1368/5971 [12:30<42:02,  1.83it/s, loss=0.168, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.00041, train/loss_step=0.125, global_step=1856.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1369/5971 [12:30<42:02,  1.82it/s, loss=0.168, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.00041, train/loss_step=0.125, global_step=1856.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1369/5971 [12:30<42:02,  1.82it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0369, train/loss_vlb_step=0.000135, train/loss_step=0.0369, global_step=1857.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1370/5971 [12:31<42:03,  1.82it/s, loss=0.174, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000485, train/loss_step=0.144, global_step=1857.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  23%|██▎       | 1371/5971 [12:32<42:03,  1.82it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0286, train/loss_vlb_step=0.000101, train/loss_step=0.0286, global_step=1857.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1372/5971 [12:35<42:09,  1.82it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0631, train/loss_vlb_step=0.00021, train/loss_step=0.0631, global_step=1857.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  23%|██▎       | 1373/5971 [12:36<42:10,  1.82it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0631, train/loss_vlb_step=0.00021, train/loss_step=0.0631, global_step=1857.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1373/5971 [12:36<42:10,  1.82it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=5.48e-5, train/loss_step=0.0136, global_step=1858.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1374/5971 [12:37<42:10,  1.82it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0776, train/loss_vlb_step=0.000257, train/loss_step=0.0776, global_step=1858.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1375/5971 [12:37<42:11,  1.82it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0686, train/loss_vlb_step=0.00023, train/loss_step=0.0686, global_step=1858.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  23%|██▎       | 1376/5971 [12:40<42:17,  1.81it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0688, train/loss_vlb_step=0.000235, train/loss_step=0.0688, global_step=1858.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1377/5971 [12:41<42:18,  1.81it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0688, train/loss_vlb_step=0.000235, train/loss_step=0.0688, global_step=1858.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1377/5971 [12:41<42:18,  1.81it/s, loss=0.123, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000711, train/loss_step=0.213, global_step=1859.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  23%|██▎       | 1378/5971 [12:42<42:19,  1.81it/s, loss=0.142, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00255, train/loss_step=0.382, global_step=1859.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  23%|██▎       | 1379/5971 [12:43<42:19,  1.81it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00202, train/loss_vlb_step=1.15e-5, train/loss_step=0.00202, global_step=1859.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1380/5971 [12:45<42:24,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.360, train/loss_vlb_step=0.00148, train/loss_step=0.360, global_step=1859.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  23%|██▎       | 1381/5971 [12:46<42:24,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.360, train/loss_vlb_step=0.00148, train/loss_step=0.360, global_step=1859.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1381/5971 [12:46<42:24,  1.80it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0345, train/loss_vlb_step=0.000132, train/loss_step=0.0345, global_step=1860.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1382/5971 [12:47<42:25,  1.80it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00627, train/loss_vlb_step=2.94e-5, train/loss_step=0.00627, global_step=1860.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1383/5971 [12:48<42:26,  1.80it/s, loss=0.13, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.0004, train/loss_step=0.121, global_step=1860.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]      
Epoch 3:  23%|██▎       | 1384/5971 [12:50<42:30,  1.80it/s, loss=0.126, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000791, train/loss_step=0.203, global_step=1860.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1385/5971 [12:51<42:31,  1.80it/s, loss=0.126, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000791, train/loss_step=0.203, global_step=1860.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1385/5971 [12:51<42:31,  1.80it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0726, train/loss_vlb_step=0.000239, train/loss_step=0.0726, global_step=1861.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1386/5971 [12:51<42:31,  1.80it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00527, train/loss_vlb_step=2.7e-5, train/loss_step=0.00527, global_step=1861.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1387/5971 [12:52<42:32,  1.80it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00231, train/loss_vlb_step=1.31e-5, train/loss_step=0.00231, global_step=1861.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1388/5971 [12:55<42:37,  1.79it/s, loss=0.0977, v_num=0, train/loss_simple_step=0.050, train/loss_vlb_step=0.000172, train/loss_step=0.050, global_step=1861.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  23%|██▎       | 1389/5971 [12:56<42:38,  1.79it/s, loss=0.0977, v_num=0, train/loss_simple_step=0.050, train/loss_vlb_step=0.000172, train/loss_step=0.050, global_step=1861.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1389/5971 [12:56<42:38,  1.79it/s, loss=0.106, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000739, train/loss_step=0.209, global_step=1862.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  23%|██▎       | 1390/5971 [12:56<42:38,  1.79it/s, loss=0.108, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000622, train/loss_step=0.180, global_step=1862.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1391/5971 [12:57<42:39,  1.79it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0487, train/loss_vlb_step=0.00017, train/loss_step=0.0487, global_step=1862.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1392/5971 [12:59<42:43,  1.79it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0179, train/loss_vlb_step=7.61e-5, train/loss_step=0.0179, global_step=1862.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1393/5971 [13:00<42:44,  1.79it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0179, train/loss_vlb_step=7.61e-5, train/loss_step=0.0179, global_step=1862.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1393/5971 [13:00<42:44,  1.79it/s, loss=0.122, v_num=0, train/loss_simple_step=0.326, train/loss_vlb_step=0.00147, train/loss_step=0.326, global_step=1863.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  23%|██▎       | 1394/5971 [13:01<42:44,  1.78it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0427, train/loss_vlb_step=0.000146, train/loss_step=0.0427, global_step=1863.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1395/5971 [13:02<42:45,  1.78it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0583, train/loss_vlb_step=0.000198, train/loss_step=0.0583, global_step=1863.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  23%|██▎       | 1396/5971 [13:05<42:50,  1.78it/s, loss=0.122, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000371, train/loss_step=0.110, global_step=1863.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  23%|██▎       | 1397/5971 [13:05<42:51,  1.78it/s, loss=0.122, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000371, train/loss_step=0.110, global_step=1863.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1397/5971 [13:05<42:51,  1.78it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00237, train/loss_vlb_step=1.42e-5, train/loss_step=0.00237, global_step=1864.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1398/5971 [13:06<42:51,  1.78it/s, loss=0.101, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000537, train/loss_step=0.162, global_step=1864.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  23%|██▎       | 1399/5971 [13:07<42:52,  1.78it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00314, train/loss_vlb_step=1.75e-5, train/loss_step=0.00314, global_step=1864.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1400/5971 [13:09<42:56,  1.77it/s, loss=0.0856, v_num=0, train/loss_simple_step=0.0571, train/loss_vlb_step=0.0002, train/loss_step=0.0571, global_step=1864.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  23%|██▎       | 1401/5971 [13:10<42:57,  1.77it/s, loss=0.0856, v_num=0, train/loss_simple_step=0.0571, train/loss_vlb_step=0.0002, train/loss_step=0.0571, global_step=1864.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1401/5971 [13:10<42:57,  1.77it/s, loss=0.0923, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000574, train/loss_step=0.170, global_step=1865.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1402/5971 [13:11<42:58,  1.77it/s, loss=0.0923, v_num=0, train/loss_simple_step=0.00543, train/loss_vlb_step=2.7e-5, train/loss_step=0.00543, global_step=1865.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  23%|██▎       | 1403/5971 [13:12<42:58,  1.77it/s, loss=0.121, v_num=0, train/loss_simple_step=0.698, train/loss_vlb_step=0.0281, train/loss_step=0.698, global_step=1865.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:  24%|██▎       | 1404/5971 [13:14<43:02,  1.77it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0229, train/loss_vlb_step=9.26e-5, train/loss_step=0.0229, global_step=1865.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▎       | 1405/5971 [13:15<43:03,  1.77it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0229, train/loss_vlb_step=9.26e-5, train/loss_step=0.0229, global_step=1865.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▎       | 1405/5971 [13:15<43:03,  1.77it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0184, train/loss_vlb_step=7.78e-5, train/loss_step=0.0184, global_step=1866.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▎       | 1406/5971 [13:16<43:04,  1.77it/s, loss=0.125, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00166, train/loss_step=0.320, global_step=1866.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  24%|██▎       | 1407/5971 [13:17<43:04,  1.77it/s, loss=0.146, v_num=0, train/loss_simple_step=0.420, train/loss_vlb_step=0.0022, train/loss_step=0.420, global_step=1866.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  24%|██▎       | 1408/5971 [13:19<43:10,  1.76it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.14e-5, train/loss_step=0.0139, global_step=1866.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▎       | 1409/5971 [13:20<43:10,  1.76it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.14e-5, train/loss_step=0.0139, global_step=1866.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▎       | 1409/5971 [13:20<43:10,  1.76it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0014, train/loss_vlb_step=8.49e-6, train/loss_step=0.0014, global_step=1867.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▎       | 1410/5971 [13:21<43:11,  1.76it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0557, train/loss_vlb_step=0.000192, train/loss_step=0.0557, global_step=1867.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▎       | 1411/5971 [13:22<43:11,  1.76it/s, loss=0.162, v_num=0, train/loss_simple_step=0.737, train/loss_vlb_step=0.018, train/loss_step=0.737, global_step=1867.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:  24%|██▎       | 1412/5971 [13:24<43:16,  1.76it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00181, train/loss_vlb_step=1.1e-5, train/loss_step=0.00181, global_step=1867.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▎       | 1413/5971 [13:25<43:16,  1.76it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00181, train/loss_vlb_step=1.1e-5, train/loss_step=0.00181, global_step=1867.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▎       | 1413/5971 [13:25<43:16,  1.76it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0345, train/loss_vlb_step=0.000131, train/loss_step=0.0345, global_step=1868.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▎       | 1414/5971 [13:26<43:17,  1.75it/s, loss=0.15, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000357, train/loss_step=0.108, global_step=1868.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  24%|██▎       | 1415/5971 [13:27<43:17,  1.75it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0606, train/loss_vlb_step=0.000208, train/loss_step=0.0606, global_step=1868.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▎       | 1416/5971 [13:29<43:22,  1.75it/s, loss=0.161, v_num=0, train/loss_simple_step=0.333, train/loss_vlb_step=0.00165, train/loss_step=0.333, global_step=1868.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  24%|██▎       | 1417/5971 [13:30<43:23,  1.75it/s, loss=0.161, v_num=0, train/loss_simple_step=0.333, train/loss_vlb_step=0.00165, train/loss_step=0.333, global_step=1868.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▎       | 1417/5971 [13:30<43:23,  1.75it/s, loss=0.174, v_num=0, train/loss_simple_step=0.259, train/loss_vlb_step=0.00102, train/loss_step=0.259, global_step=1869.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▎       | 1418/5971 [13:31<43:23,  1.75it/s, loss=0.208, v_num=0, train/loss_simple_step=0.845, train/loss_vlb_step=0.107, train/loss_step=0.845, global_step=1869.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  24%|██▍       | 1419/5971 [13:32<43:24,  1.75it/s, loss=0.24, v_num=0, train/loss_simple_step=0.638, train/loss_vlb_step=0.0149, train/loss_step=0.638, global_step=1869.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▍       | 1420/5971 [13:34<43:28,  1.74it/s, loss=0.237, v_num=0, train/loss_simple_step=0.0026, train/loss_vlb_step=1.4e-5, train/loss_step=0.0026, global_step=1869.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▍       | 1421/5971 [13:35<43:29,  1.74it/s, loss=0.237, v_num=0, train/loss_simple_step=0.0026, train/loss_vlb_step=1.4e-5, train/loss_step=0.0026, global_step=1869.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▍       | 1421/5971 [13:35<43:29,  1.74it/s, loss=0.249, v_num=0, train/loss_simple_step=0.401, train/loss_vlb_step=0.00233, train/loss_step=0.401, global_step=1870.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  24%|██▍       | 1422/5971 [13:36<43:29,  1.74it/s, loss=0.271, v_num=0, train/loss_simple_step=0.455, train/loss_vlb_step=0.00295, train/loss_step=0.455, global_step=1870.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▍       | 1423/5971 [13:37<43:30,  1.74it/s, loss=0.242, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000358, train/loss_step=0.109, global_step=1870.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▍       | 1424/5971 [13:39<43:34,  1.74it/s, loss=0.243, v_num=0, train/loss_simple_step=0.0445, train/loss_vlb_step=0.000162, train/loss_step=0.0445, global_step=1870.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▍       | 1425/5971 [13:40<43:35,  1.74it/s, loss=0.243, v_num=0, train/loss_simple_step=0.0445, train/loss_vlb_step=0.000162, train/loss_step=0.0445, global_step=1870.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▍       | 1425/5971 [13:40<43:35,  1.74it/s, loss=0.245, v_num=0, train/loss_simple_step=0.0659, train/loss_vlb_step=0.000222, train/loss_step=0.0659, global_step=1871.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▍       | 1426/5971 [13:41<43:35,  1.74it/s, loss=0.23, v_num=0, train/loss_simple_step=0.0211, train/loss_vlb_step=8.36e-5, train/loss_step=0.0211, global_step=1871.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  24%|██▍       | 1427/5971 [13:42<43:35,  1.74it/s, loss=0.218, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000709, train/loss_step=0.180, global_step=1871.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▍       | 1428/5971 [13:44<43:40,  1.73it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0237, train/loss_vlb_step=8.85e-5, train/loss_step=0.0237, global_step=1871.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▍       | 1429/5971 [13:45<43:40,  1.73it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0237, train/loss_vlb_step=8.85e-5, train/loss_step=0.0237, global_step=1871.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▍       | 1429/5971 [13:45<43:40,  1.73it/s, loss=0.261, v_num=0, train/loss_simple_step=0.837, train/loss_vlb_step=0.0613, train/loss_step=0.837, global_step=1872.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  24%|██▍       | 1430/5971 [13:45<43:40,  1.73it/s, loss=0.259, v_num=0, train/loss_simple_step=0.0315, train/loss_vlb_step=0.000122, train/loss_step=0.0315, global_step=1872.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▍       | 1431/5971 [13:46<43:41,  1.73it/s, loss=0.223, v_num=0, train/loss_simple_step=0.0015, train/loss_vlb_step=9.13e-6, train/loss_step=0.0015, global_step=1872.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  24%|██▍       | 1432/5971 [13:48<43:45,  1.73it/s, loss=0.225, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000149, train/loss_step=0.0453, global_step=1872.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▍       | 1433/5971 [13:49<43:46,  1.73it/s, loss=0.225, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000149, train/loss_step=0.0453, global_step=1872.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▍       | 1433/5971 [13:49<43:46,  1.73it/s, loss=0.231, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.00057, train/loss_step=0.160, global_step=1873.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  24%|██▍       | 1434/5971 [13:50<43:46,  1.73it/s, loss=0.226, v_num=0, train/loss_simple_step=0.00786, train/loss_vlb_step=3.75e-5, train/loss_step=0.00786, global_step=1873.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▍       | 1435/5971 [13:51<43:46,  1.73it/s, loss=0.232, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000618, train/loss_step=0.181, global_step=1873.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  24%|██▍       | 1436/5971 [13:54<43:52,  1.72it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0366, train/loss_vlb_step=0.000138, train/loss_step=0.0366, global_step=1873.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▍       | 1437/5971 [13:54<43:52,  1.72it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0366, train/loss_vlb_step=0.000138, train/loss_step=0.0366, global_step=1873.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▍       | 1437/5971 [13:54<43:52,  1.72it/s, loss=0.214, v_num=0, train/loss_simple_step=0.195, train/loss_vlb_step=0.00081, train/loss_step=0.195, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  24%|██▍       | 1438/5971 [13:55<43:52,  1.72it/s, loss=0.195, v_num=0, train/loss_simple_step=0.457, train/loss_vlb_step=0.00312, train/loss_step=0.457, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▍       | 1439/5971 [13:56<43:53,  1.72it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00575, train/loss_vlb_step=3.04e-5, train/loss_step=0.00575, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  24%|██▍       | 1440/5971 [13:58<43:57,  1.72it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  24%|██▍       | 1441/5971 [13:58<43:55,  1.72it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:12,  2.28it/s][A

Validating:   1%|          | 2/167 [00:00<00:40,  4.03it/s][A
Epoch 3:  24%|██▍       | 1445/5971 [13:59<43:47,  1.72it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   3%|▎         | 5/167 [00:00<00:16,  9.71it/s][A

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.93it/s][A
Epoch 3:  24%|██▍       | 1449/5971 [13:59<43:38,  1.73it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   7%|▋         | 11/167 [00:00<00:08, 17.85it/s][A
Epoch 3:  24%|██▍       | 1453/5971 [13:59<43:29,  1.73it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   8%|▊         | 14/167 [00:01<00:07, 20.36it/s][A
Epoch 3:  24%|██▍       | 1457/5971 [14:00<43:20,  1.74it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  10%|█         | 17/167 [00:01<00:06, 21.48it/s][A

Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.92it/s][A
Epoch 3:  24%|██▍       | 1461/5971 [14:00<43:11,  1.74it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  14%|█▍        | 24/167 [00:01<00:05, 25.49it/s][A
Epoch 3:  25%|██▍       | 1465/5971 [14:00<43:02,  1.74it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 25.67it/s][A
Epoch 3:  25%|██▍       | 1469/5971 [14:00<42:54,  1.75it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 23.53it/s][A
Epoch 3:  25%|██▍       | 1473/5971 [14:00<42:45,  1.75it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 24.64it/s][A

Validating:  22%|██▏       | 36/167 [00:01<00:05, 25.26it/s][A
Epoch 3:  25%|██▍       | 1477/5971 [14:00<42:36,  1.76it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  23%|██▎       | 39/167 [00:01<00:05, 25.49it/s][A
Epoch 3:  25%|██▍       | 1481/5971 [14:00<42:27,  1.76it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 26.04it/s][A
Epoch 3:  25%|██▍       | 1485/5971 [14:01<42:19,  1.77it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 26.08it/s][A

Validating:  29%|██▊       | 48/167 [00:02<00:04, 25.96it/s][A
Epoch 3:  25%|██▍       | 1489/5971 [14:01<42:10,  1.77it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  31%|███       | 51/167 [00:02<00:04, 25.61it/s][A
Epoch 3:  25%|██▌       | 1493/5971 [14:01<42:02,  1.78it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 26.41it/s][A
Epoch 3:  25%|██▌       | 1497/5971 [14:01<41:53,  1.78it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 26.83it/s][A

Validating:  36%|███▌      | 60/167 [00:02<00:03, 27.16it/s][A
Epoch 3:  25%|██▌       | 1501/5971 [14:01<41:45,  1.78it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  38%|███▊      | 63/167 [00:02<00:04, 25.90it/s][A
Epoch 3:  25%|██▌       | 1505/5971 [14:01<41:36,  1.79it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  40%|███▉      | 66/167 [00:03<00:03, 26.09it/s][A
Epoch 3:  25%|██▌       | 1509/5971 [14:02<41:28,  1.79it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 26.67it/s][A

Validating:  43%|████▎     | 72/167 [00:03<00:03, 26.54it/s][A
Epoch 3:  25%|██▌       | 1513/5971 [14:02<41:19,  1.80it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 27.16it/s][A
Epoch 3:  25%|██▌       | 1517/5971 [14:02<41:11,  1.80it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 26.58it/s][A
Epoch 3:  25%|██▌       | 1521/5971 [14:02<41:03,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 26.03it/s][A

Validating:  50%|█████     | 84/167 [00:03<00:03, 26.35it/s][A
Epoch 3:  26%|██▌       | 1525/5971 [14:02<40:55,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  53%|█████▎    | 88/167 [00:03<00:02, 27.84it/s][A
Epoch 3:  26%|██▌       | 1529/5971 [14:02<40:46,  1.82it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  54%|█████▍    | 91/167 [00:03<00:02, 27.08it/s][A
Epoch 3:  26%|██▌       | 1533/5971 [14:02<40:38,  1.82it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 26.12it/s][A
Epoch 3:  26%|██▌       | 1537/5971 [14:03<40:30,  1.82it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 23.80it/s][A

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 25.17it/s][A
Epoch 3:  26%|██▌       | 1541/5971 [14:03<40:22,  1.83it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 25.52it/s][A
Epoch 3:  26%|██▌       | 1545/5971 [14:03<40:14,  1.83it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 25.04it/s][A
Epoch 3:  26%|██▌       | 1549/5971 [14:03<40:06,  1.84it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 25.34it/s][A

Validating:  67%|██████▋   | 112/167 [00:04<00:02, 25.57it/s][A
Epoch 3:  26%|██▌       | 1553/5971 [14:03<39:58,  1.84it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  69%|██████▉   | 115/167 [00:04<00:02, 25.75it/s][A
Epoch 3:  26%|██▌       | 1557/5971 [14:03<39:50,  1.85it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  71%|███████   | 118/167 [00:05<00:01, 26.52it/s][A
Epoch 3:  26%|██▌       | 1561/5971 [14:04<39:42,  1.85it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 26.74it/s][A
Epoch 3:  26%|██▌       | 1565/5971 [14:04<39:35,  1.86it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 28.24it/s][A

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 28.29it/s][A
Epoch 3:  26%|██▋       | 1569/5971 [14:04<39:27,  1.86it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 28.34it/s][A
Epoch 3:  26%|██▋       | 1573/5971 [14:04<39:19,  1.86it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  80%|████████  | 134/167 [00:05<00:01, 28.25it/s][A
Epoch 3:  26%|██▋       | 1577/5971 [14:04<39:11,  1.87it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 26.50it/s][A

Validating:  84%|████████▍ | 140/167 [00:05<00:01, 25.53it/s][A
Epoch 3:  26%|██▋       | 1581/5971 [14:04<39:04,  1.87it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  86%|████████▌ | 143/167 [00:05<00:00, 26.64it/s][A
Epoch 3:  27%|██▋       | 1585/5971 [14:04<38:56,  1.88it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 27.92it/s][A
Epoch 3:  27%|██▋       | 1589/5971 [14:05<38:48,  1.88it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 28.44it/s][A
Epoch 3:  27%|██▋       | 1593/5971 [14:05<38:41,  1.89it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 28.26it/s][A

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 27.67it/s][A
Epoch 3:  27%|██▋       | 1597/5971 [14:05<38:33,  1.89it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 28.64it/s][A
Epoch 3:  27%|██▋       | 1601/5971 [14:05<38:26,  1.89it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 29.62it/s][A
Epoch 3:  27%|██▋       | 1605/5971 [14:05<38:18,  1.90it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  27%|██▋       | 1608/5971 [14:06<38:14,  1.90it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Epoch 3:  27%|██▋       | 1609/5971 [14:07<38:14,  1.90it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=1874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  27%|██▋       | 1609/5971 [14:07<38:14,  1.90it/s, loss=0.154, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000614, train/loss_step=0.180, global_step=1875.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  27%|██▋       | 1610/5971 [14:07<38:15,  1.90it/s, loss=0.151, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00269, train/loss_step=0.406, global_step=1875.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  27%|██▋       | 1611/5971 [14:08<38:15,  1.90it/s, loss=0.148, v_num=0, train/loss_simple_step=0.049, train/loss_vlb_step=0.000178, train/loss_step=0.049, global_step=1875.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  27%|██▋       | 1612/5971 [14:11<38:21,  1.89it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.58e-5, train/loss_step=0.0127, global_step=1875.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  27%|██▋       | 1613/5971 [14:12<38:22,  1.89it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.58e-5, train/loss_step=0.0127, global_step=1875.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  27%|██▋       | 1613/5971 [14:12<38:22,  1.89it/s, loss=0.17, v_num=0, train/loss_simple_step=0.539, train/loss_vlb_step=0.00405, train/loss_step=0.539, global_step=1876.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  27%|██▋       | 1614/5971 [14:13<38:22,  1.89it/s, loss=0.185, v_num=0, train/loss_simple_step=0.317, train/loss_vlb_step=0.00148, train/loss_step=0.317, global_step=1876.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  27%|██▋       | 1615/5971 [14:14<38:23,  1.89it/s, loss=0.191, v_num=0, train/loss_simple_step=0.300, train/loss_vlb_step=0.00111, train/loss_step=0.300, global_step=1876.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  27%|██▋       | 1616/5971 [14:16<38:27,  1.89it/s, loss=0.19, v_num=0, train/loss_simple_step=0.00154, train/loss_vlb_step=9.36e-6, train/loss_step=0.00154, global_step=1876.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  27%|██▋       | 1617/5971 [14:17<38:27,  1.89it/s, loss=0.19, v_num=0, train/loss_simple_step=0.00154, train/loss_vlb_step=9.36e-6, train/loss_step=0.00154, global_step=1876.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  27%|██▋       | 1617/5971 [14:17<38:27,  1.89it/s, loss=0.168, v_num=0, train/loss_simple_step=0.392, train/loss_vlb_step=0.00204, train/loss_step=0.392, global_step=1877.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  27%|██▋       | 1618/5971 [14:18<38:28,  1.89it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0685, train/loss_vlb_step=0.00024, train/loss_step=0.0685, global_step=1877.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  27%|██▋       | 1619/5971 [14:19<38:28,  1.88it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00684, train/loss_vlb_step=3.29e-5, train/loss_step=0.00684, global_step=1877.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  27%|██▋       | 1620/5971 [14:22<38:35,  1.88it/s, loss=0.173, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000387, train/loss_step=0.117, global_step=1877.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  27%|██▋       | 1621/5971 [14:23<38:36,  1.88it/s, loss=0.173, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000387, train/loss_step=0.117, global_step=1877.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  27%|██▋       | 1621/5971 [14:23<38:36,  1.88it/s, loss=0.166, v_num=0, train/loss_simple_step=0.010, train/loss_vlb_step=4.58e-5, train/loss_step=0.010, global_step=1878.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  27%|██▋       | 1622/5971 [14:24<38:36,  1.88it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00721, train/loss_vlb_step=3.36e-5, train/loss_step=0.00721, global_step=1878.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  27%|██▋       | 1623/5971 [14:25<38:37,  1.88it/s, loss=0.16, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000182, train/loss_step=0.055, global_step=1878.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  27%|██▋       | 1624/5971 [14:27<38:41,  1.87it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0899, train/loss_vlb_step=0.0003, train/loss_step=0.0899, global_step=1878.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  27%|██▋       | 1625/5971 [14:28<38:42,  1.87it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0899, train/loss_vlb_step=0.0003, train/loss_step=0.0899, global_step=1878.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  27%|██▋       | 1625/5971 [14:28<38:42,  1.87it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00267, train/loss_vlb_step=1.46e-5, train/loss_step=0.00267, global_step=1879.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  27%|██▋       | 1626/5971 [14:29<38:42,  1.87it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0262, train/loss_vlb_step=0.000106, train/loss_step=0.0262, global_step=1879.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  27%|██▋       | 1627/5971 [14:30<38:42,  1.87it/s, loss=0.151, v_num=0, train/loss_simple_step=0.405, train/loss_vlb_step=0.00222, train/loss_step=0.405, global_step=1879.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  27%|██▋       | 1628/5971 [14:32<38:46,  1.87it/s, loss=0.168, v_num=0, train/loss_simple_step=0.377, train/loss_vlb_step=0.00212, train/loss_step=0.377, global_step=1879.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  27%|██▋       | 1629/5971 [14:33<38:47,  1.87it/s, loss=0.168, v_num=0, train/loss_simple_step=0.377, train/loss_vlb_step=0.00212, train/loss_step=0.377, global_step=1879.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  27%|██▋       | 1629/5971 [14:33<38:47,  1.87it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00906, train/loss_vlb_step=4.42e-5, train/loss_step=0.00906, global_step=1880.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  27%|██▋       | 1630/5971 [14:34<38:47,  1.86it/s, loss=0.149, v_num=0, train/loss_simple_step=0.191, train/loss_vlb_step=0.000677, train/loss_step=0.191, global_step=1880.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  27%|██▋       | 1631/5971 [14:35<38:48,  1.86it/s, loss=0.152, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000353, train/loss_step=0.107, global_step=1880.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  27%|██▋       | 1632/5971 [14:37<38:51,  1.86it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.56e-5, train/loss_step=0.0105, global_step=1880.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  27%|██▋       | 1633/5971 [14:38<38:52,  1.86it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.56e-5, train/loss_step=0.0105, global_step=1880.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  27%|██▋       | 1633/5971 [14:38<38:52,  1.86it/s, loss=0.135, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.000721, train/loss_step=0.197, global_step=1881.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  27%|██▋       | 1634/5971 [14:39<38:52,  1.86it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0172, train/loss_vlb_step=6.74e-5, train/loss_step=0.0172, global_step=1881.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  27%|██▋       | 1635/5971 [14:40<38:53,  1.86it/s, loss=0.105, v_num=0, train/loss_simple_step=0.00454, train/loss_vlb_step=2.23e-5, train/loss_step=0.00454, global_step=1881.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  27%|██▋       | 1636/5971 [14:42<38:56,  1.85it/s, loss=0.128, v_num=0, train/loss_simple_step=0.461, train/loss_vlb_step=0.00383, train/loss_step=0.461, global_step=1881.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  27%|██▋       | 1637/5971 [14:43<38:57,  1.85it/s, loss=0.128, v_num=0, train/loss_simple_step=0.461, train/loss_vlb_step=0.00383, train/loss_step=0.461, global_step=1881.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  27%|██▋       | 1637/5971 [14:43<38:57,  1.85it/s, loss=0.128, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.00274, train/loss_step=0.391, global_step=1882.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  27%|██▋       | 1638/5971 [14:44<38:57,  1.85it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00427, train/loss_vlb_step=2.23e-5, train/loss_step=0.00427, global_step=1882.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  27%|██▋       | 1639/5971 [14:45<38:58,  1.85it/s, loss=0.155, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0118, train/loss_step=0.606, global_step=1882.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:  27%|██▋       | 1640/5971 [14:47<39:01,  1.85it/s, loss=0.166, v_num=0, train/loss_simple_step=0.356, train/loss_vlb_step=0.00177, train/loss_step=0.356, global_step=1882.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  27%|██▋       | 1641/5971 [14:48<39:02,  1.85it/s, loss=0.166, v_num=0, train/loss_simple_step=0.356, train/loss_vlb_step=0.00177, train/loss_step=0.356, global_step=1882.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  27%|██▋       | 1641/5971 [14:48<39:02,  1.85it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00212, train/loss_vlb_step=1.16e-5, train/loss_step=0.00212, global_step=1883.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  27%|██▋       | 1642/5971 [14:49<39:02,  1.85it/s, loss=0.172, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000386, train/loss_step=0.117, global_step=1883.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  28%|██▊       | 1643/5971 [14:50<39:03,  1.85it/s, loss=0.2, v_num=0, train/loss_simple_step=0.623, train/loss_vlb_step=0.00901, train/loss_step=0.623, global_step=1883.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  28%|██▊       | 1644/5971 [14:52<39:08,  1.84it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0438, train/loss_vlb_step=0.000151, train/loss_step=0.0438, global_step=1883.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1645/5971 [14:53<39:08,  1.84it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0438, train/loss_vlb_step=0.000151, train/loss_step=0.0438, global_step=1883.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1645/5971 [14:53<39:08,  1.84it/s, loss=0.213, v_num=0, train/loss_simple_step=0.315, train/loss_vlb_step=0.00161, train/loss_step=0.315, global_step=1884.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  28%|██▊       | 1646/5971 [14:54<39:09,  1.84it/s, loss=0.22, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.00053, train/loss_step=0.155, global_step=1884.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  28%|██▊       | 1647/5971 [14:55<39:10,  1.84it/s, loss=0.2, v_num=0, train/loss_simple_step=0.00679, train/loss_vlb_step=3.28e-5, train/loss_step=0.00679, global_step=1884.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1648/5971 [14:57<39:13,  1.84it/s, loss=0.181, v_num=0, train/loss_simple_step=0.00362, train/loss_vlb_step=1.85e-5, train/loss_step=0.00362, global_step=1884.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1649/5971 [14:58<39:14,  1.84it/s, loss=0.181, v_num=0, train/loss_simple_step=0.00362, train/loss_vlb_step=1.85e-5, train/loss_step=0.00362, global_step=1884.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1649/5971 [14:58<39:14,  1.84it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0237, train/loss_vlb_step=9.38e-5, train/loss_step=0.0237, global_step=1885.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  28%|██▊       | 1650/5971 [14:59<39:14,  1.84it/s, loss=0.188, v_num=0, train/loss_simple_step=0.316, train/loss_vlb_step=0.00165, train/loss_step=0.316, global_step=1885.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  28%|██▊       | 1651/5971 [15:00<39:14,  1.83it/s, loss=0.197, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.00103, train/loss_step=0.283, global_step=1885.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1652/5971 [15:02<39:18,  1.83it/s, loss=0.197, v_num=0, train/loss_simple_step=0.00884, train/loss_vlb_step=3.95e-5, train/loss_step=0.00884, global_step=1885.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1653/5971 [15:03<39:19,  1.83it/s, loss=0.197, v_num=0, train/loss_simple_step=0.00884, train/loss_vlb_step=3.95e-5, train/loss_step=0.00884, global_step=1885.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1653/5971 [15:03<39:19,  1.83it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0375, train/loss_vlb_step=0.000134, train/loss_step=0.0375, global_step=1886.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  28%|██▊       | 1654/5971 [15:04<39:19,  1.83it/s, loss=0.221, v_num=0, train/loss_simple_step=0.667, train/loss_vlb_step=0.013, train/loss_step=0.667, global_step=1886.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:  28%|██▊       | 1655/5971 [15:05<39:20,  1.83it/s, loss=0.223, v_num=0, train/loss_simple_step=0.0299, train/loss_vlb_step=0.000115, train/loss_step=0.0299, global_step=1886.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1656/5971 [15:08<39:25,  1.82it/s, loss=0.212, v_num=0, train/loss_simple_step=0.239, train/loss_vlb_step=0.00105, train/loss_step=0.239, global_step=1886.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  28%|██▊       | 1657/5971 [15:09<39:25,  1.82it/s, loss=0.212, v_num=0, train/loss_simple_step=0.239, train/loss_vlb_step=0.00105, train/loss_step=0.239, global_step=1886.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1657/5971 [15:09<39:25,  1.82it/s, loss=0.206, v_num=0, train/loss_simple_step=0.280, train/loss_vlb_step=0.00112, train/loss_step=0.280, global_step=1887.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1658/5971 [15:10<39:25,  1.82it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.96e-5, train/loss_step=0.0108, global_step=1887.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1659/5971 [15:10<39:26,  1.82it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0904, train/loss_vlb_step=0.000305, train/loss_step=0.0904, global_step=1887.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1660/5971 [15:13<39:30,  1.82it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00283, train/loss_vlb_step=1.57e-5, train/loss_step=0.00283, global_step=1887.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1661/5971 [15:14<39:31,  1.82it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00283, train/loss_vlb_step=1.57e-5, train/loss_step=0.00283, global_step=1887.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1661/5971 [15:14<39:31,  1.82it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0402, train/loss_vlb_step=0.000151, train/loss_step=0.0402, global_step=1888.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  28%|██▊       | 1662/5971 [15:15<39:31,  1.82it/s, loss=0.178, v_num=0, train/loss_simple_step=0.384, train/loss_vlb_step=0.00159, train/loss_step=0.384, global_step=1888.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  28%|██▊       | 1663/5971 [15:16<39:31,  1.82it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00703, train/loss_vlb_step=3.32e-5, train/loss_step=0.00703, global_step=1888.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1664/5971 [15:18<39:36,  1.81it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00464, train/loss_vlb_step=2.39e-5, train/loss_step=0.00464, global_step=1888.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1665/5971 [15:19<39:36,  1.81it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00464, train/loss_vlb_step=2.39e-5, train/loss_step=0.00464, global_step=1888.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1665/5971 [15:19<39:36,  1.81it/s, loss=0.144, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.00121, train/loss_step=0.288, global_step=1889.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  28%|██▊       | 1666/5971 [15:20<39:36,  1.81it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0401, train/loss_vlb_step=0.000143, train/loss_step=0.0401, global_step=1889.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1667/5971 [15:21<39:37,  1.81it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00792, train/loss_vlb_step=3.47e-5, train/loss_step=0.00792, global_step=1889.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1668/5971 [15:23<39:40,  1.81it/s, loss=0.146, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.00058, train/loss_step=0.163, global_step=1889.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  28%|██▊       | 1669/5971 [15:24<39:41,  1.81it/s, loss=0.146, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.00058, train/loss_step=0.163, global_step=1889.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1669/5971 [15:24<39:41,  1.81it/s, loss=0.153, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000574, train/loss_step=0.154, global_step=1890.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1670/5971 [15:25<39:41,  1.81it/s, loss=0.168, v_num=0, train/loss_simple_step=0.626, train/loss_vlb_step=0.00703, train/loss_step=0.626, global_step=1890.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  28%|██▊       | 1671/5971 [15:26<39:41,  1.81it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0275, train/loss_vlb_step=0.00011, train/loss_step=0.0275, global_step=1890.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1672/5971 [15:28<39:45,  1.80it/s, loss=0.161, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000369, train/loss_step=0.112, global_step=1890.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  28%|██▊       | 1673/5971 [15:29<39:46,  1.80it/s, loss=0.161, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000369, train/loss_step=0.112, global_step=1890.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1673/5971 [15:29<39:46,  1.80it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00338, train/loss_vlb_step=1.84e-5, train/loss_step=0.00338, global_step=1891.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1674/5971 [15:30<39:46,  1.80it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.57e-5, train/loss_step=0.0124, global_step=1891.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  28%|██▊       | 1675/5971 [15:31<39:46,  1.80it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0622, train/loss_vlb_step=0.000207, train/loss_step=0.0622, global_step=1891.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1676/5971 [15:33<39:50,  1.80it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0702, train/loss_vlb_step=0.000231, train/loss_step=0.0702, global_step=1891.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1677/5971 [15:34<39:50,  1.80it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0702, train/loss_vlb_step=0.000231, train/loss_step=0.0702, global_step=1891.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1677/5971 [15:34<39:50,  1.80it/s, loss=0.119, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00102, train/loss_step=0.267, global_step=1892.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  28%|██▊       | 1678/5971 [15:35<39:51,  1.80it/s, loss=0.137, v_num=0, train/loss_simple_step=0.381, train/loss_vlb_step=0.00193, train/loss_step=0.381, global_step=1892.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1679/5971 [15:36<39:51,  1.79it/s, loss=0.147, v_num=0, train/loss_simple_step=0.284, train/loss_vlb_step=0.00119, train/loss_step=0.284, global_step=1892.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1680/5971 [15:38<39:55,  1.79it/s, loss=0.187, v_num=0, train/loss_simple_step=0.805, train/loss_vlb_step=0.0417, train/loss_step=0.805, global_step=1892.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  28%|██▊       | 1681/5971 [15:39<39:55,  1.79it/s, loss=0.187, v_num=0, train/loss_simple_step=0.805, train/loss_vlb_step=0.0417, train/loss_step=0.805, global_step=1892.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1681/5971 [15:39<39:55,  1.79it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0935, train/loss_vlb_step=0.00031, train/loss_step=0.0935, global_step=1893.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1682/5971 [15:40<39:55,  1.79it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0273, train/loss_vlb_step=0.000105, train/loss_step=0.0273, global_step=1893.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1683/5971 [15:41<39:56,  1.79it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0734, train/loss_vlb_step=0.000243, train/loss_step=0.0734, global_step=1893.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1684/5971 [15:43<40:01,  1.79it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0191, train/loss_vlb_step=7.92e-5, train/loss_step=0.0191, global_step=1893.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  28%|██▊       | 1685/5971 [15:44<40:01,  1.78it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0191, train/loss_vlb_step=7.92e-5, train/loss_step=0.0191, global_step=1893.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1685/5971 [15:44<40:01,  1.78it/s, loss=0.17, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000592, train/loss_step=0.173, global_step=1894.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  28%|██▊       | 1686/5971 [15:45<40:01,  1.78it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0306, train/loss_vlb_step=0.00012, train/loss_step=0.0306, global_step=1894.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1687/5971 [15:46<40:01,  1.78it/s, loss=0.178, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000959, train/loss_step=0.184, global_step=1894.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1688/5971 [15:48<40:05,  1.78it/s, loss=0.199, v_num=0, train/loss_simple_step=0.584, train/loss_vlb_step=0.00825, train/loss_step=0.584, global_step=1894.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  28%|██▊       | 1689/5971 [15:49<40:05,  1.78it/s, loss=0.199, v_num=0, train/loss_simple_step=0.584, train/loss_vlb_step=0.00825, train/loss_step=0.584, global_step=1894.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1689/5971 [15:49<40:05,  1.78it/s, loss=0.21, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00154, train/loss_step=0.365, global_step=1895.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  28%|██▊       | 1690/5971 [15:50<40:05,  1.78it/s, loss=0.184, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000371, train/loss_step=0.112, global_step=1895.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1691/5971 [15:51<40:06,  1.78it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0174, train/loss_vlb_step=7.05e-5, train/loss_step=0.0174, global_step=1895.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1692/5971 [15:53<40:10,  1.78it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00395, train/loss_vlb_step=2.07e-5, train/loss_step=0.00395, global_step=1895.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1693/5971 [15:54<40:10,  1.77it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00395, train/loss_vlb_step=2.07e-5, train/loss_step=0.00395, global_step=1895.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1693/5971 [15:54<40:10,  1.77it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000207, train/loss_step=0.0619, global_step=1896.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  28%|██▊       | 1694/5971 [15:55<40:11,  1.77it/s, loss=0.206, v_num=0, train/loss_simple_step=0.510, train/loss_vlb_step=0.00406, train/loss_step=0.510, global_step=1896.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  28%|██▊       | 1695/5971 [15:56<40:11,  1.77it/s, loss=0.232, v_num=0, train/loss_simple_step=0.583, train/loss_vlb_step=0.00884, train/loss_step=0.583, global_step=1896.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1696/5971 [15:59<40:17,  1.77it/s, loss=0.238, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000664, train/loss_step=0.188, global_step=1896.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1697/5971 [16:00<40:17,  1.77it/s, loss=0.238, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000664, train/loss_step=0.188, global_step=1896.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1697/5971 [16:00<40:17,  1.77it/s, loss=0.227, v_num=0, train/loss_simple_step=0.0503, train/loss_vlb_step=0.000175, train/loss_step=0.0503, global_step=1897.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1698/5971 [16:01<40:17,  1.77it/s, loss=0.214, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000384, train/loss_step=0.116, global_step=1897.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  28%|██▊       | 1699/5971 [16:02<40:18,  1.77it/s, loss=0.207, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000458, train/loss_step=0.140, global_step=1897.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1700/5971 [16:04<40:21,  1.76it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00322, train/loss_vlb_step=1.84e-5, train/loss_step=0.00322, global_step=1897.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1701/5971 [16:05<40:21,  1.76it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00322, train/loss_vlb_step=1.84e-5, train/loss_step=0.00322, global_step=1897.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  28%|██▊       | 1701/5971 [16:05<40:21,  1.76it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0597, train/loss_vlb_step=0.000202, train/loss_step=0.0597, global_step=1898.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  29%|██▊       | 1702/5971 [16:06<40:21,  1.76it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00335, train/loss_vlb_step=1.87e-5, train/loss_step=0.00335, global_step=1898.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  29%|██▊       | 1703/5971 [16:07<40:22,  1.76it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.48e-5, train/loss_step=0.0125, global_step=1898.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  29%|██▊       | 1704/5971 [16:09<40:25,  1.76it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00197, train/loss_vlb_step=1.14e-5, train/loss_step=0.00197, global_step=1898.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  29%|██▊       | 1705/5971 [16:10<40:26,  1.76it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00197, train/loss_vlb_step=1.14e-5, train/loss_step=0.00197, global_step=1898.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  29%|██▊       | 1705/5971 [16:10<40:26,  1.76it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00394, train/loss_vlb_step=2.17e-5, train/loss_step=0.00394, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  29%|██▊       | 1706/5971 [16:11<40:26,  1.76it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.22e-5, train/loss_step=0.00216, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  29%|██▊       | 1707/5971 [16:11<40:26,  1.76it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00387, train/loss_vlb_step=2.12e-5, train/loss_step=0.00387, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  29%|██▊       | 1708/5971 [16:14<40:29,  1.75it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  29%|██▊       | 1709/5971 [16:14<40:27,  1.76it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:19,  2.09it/s]

Validating:   1%|          | 2/167 [00:00<00:51,  3.20it/s]

Validating:   2%|▏         | 4/167 [00:00<00:25,  6.50it/s]
Epoch 3:  29%|██▊       | 1713/5971 [16:14<40:22,  1.76it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   4%|▍         | 7/167 [00:00<00:13, 11.60it/s]
Epoch 3:  29%|██▉       | 1717/5971 [16:15<40:14,  1.76it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   6%|▌         | 10/167 [00:01<00:10, 15.16it/s]
Epoch 3:  29%|██▉       | 1721/5971 [16:15<40:07,  1.77it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   8%|▊         | 13/167 [00:01<00:08, 18.13it/s]

Validating:  10%|▉         | 16/167 [00:01<00:07, 20.52it/s]
Epoch 3:  29%|██▉       | 1725/5971 [16:15<39:59,  1.77it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  11%|█▏        | 19/167 [00:01<00:06, 21.88it/s]
Epoch 3:  29%|██▉       | 1729/5971 [16:15<39:52,  1.77it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  13%|█▎        | 22/167 [00:01<00:06, 23.50it/s]
Epoch 3:  29%|██▉       | 1733/5971 [16:15<39:44,  1.78it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  15%|█▍        | 25/167 [00:01<00:05, 24.81it/s]

Validating:  17%|█▋        | 28/167 [00:01<00:05, 24.94it/s]
Epoch 3:  29%|██▉       | 1737/5971 [16:15<39:37,  1.78it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  19%|█▊        | 31/167 [00:01<00:05, 24.97it/s]
Epoch 3:  29%|██▉       | 1741/5971 [16:16<39:30,  1.78it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  20%|██        | 34/167 [00:01<00:05, 26.16it/s]
Epoch 3:  29%|██▉       | 1745/5971 [16:16<39:22,  1.79it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  22%|██▏       | 37/167 [00:02<00:05, 25.50it/s]

Validating:  24%|██▍       | 40/167 [00:02<00:04, 25.77it/s]
Epoch 3:  29%|██▉       | 1749/5971 [16:16<39:15,  1.79it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  26%|██▌       | 43/167 [00:02<00:04, 26.67it/s]
Epoch 3:  29%|██▉       | 1753/5971 [16:16<39:08,  1.80it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  28%|██▊       | 46/167 [00:02<00:04, 26.57it/s]
Epoch 3:  29%|██▉       | 1757/5971 [16:16<39:00,  1.80it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  29%|██▉       | 49/167 [00:02<00:04, 26.41it/s]

Validating:  31%|███       | 52/167 [00:02<00:04, 27.11it/s]
Epoch 3:  29%|██▉       | 1761/5971 [16:16<38:53,  1.80it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  33%|███▎      | 55/167 [00:02<00:04, 26.77it/s]
Epoch 3:  30%|██▉       | 1765/5971 [16:16<38:46,  1.81it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  35%|███▍      | 58/167 [00:02<00:04, 25.28it/s][A
Epoch 3:  30%|██▉       | 1769/5971 [16:17<38:39,  1.81it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  37%|███▋      | 61/167 [00:02<00:04, 26.03it/s][A

Validating:  38%|███▊      | 64/167 [00:03<00:03, 26.10it/s][A
Epoch 3:  30%|██▉       | 1773/5971 [16:17<38:32,  1.82it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  40%|████      | 67/167 [00:03<00:03, 25.14it/s][A
Epoch 3:  30%|██▉       | 1777/5971 [16:17<38:25,  1.82it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 26.12it/s][A
Epoch 3:  30%|██▉       | 1781/5971 [16:17<38:18,  1.82it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 25.24it/s][A

Validating:  46%|████▌     | 76/167 [00:03<00:03, 25.63it/s][A
Epoch 3:  30%|██▉       | 1785/5971 [16:17<38:11,  1.83it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 25.86it/s][A
Epoch 3:  30%|██▉       | 1789/5971 [16:17<38:04,  1.83it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  49%|████▉     | 82/167 [00:03<00:03, 26.84it/s][A
Epoch 3:  30%|███       | 1793/5971 [16:17<37:57,  1.83it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  51%|█████▏    | 86/167 [00:03<00:02, 27.36it/s][A
Epoch 3:  30%|███       | 1797/5971 [16:18<37:50,  1.84it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  53%|█████▎    | 89/167 [00:04<00:03, 25.57it/s][A

Validating:  55%|█████▌    | 92/167 [00:04<00:03, 24.77it/s][A
Epoch 3:  30%|███       | 1801/5971 [16:18<37:43,  1.84it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 25.77it/s][A
Epoch 3:  30%|███       | 1805/5971 [16:18<37:37,  1.85it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 25.16it/s][A
Epoch 3:  30%|███       | 1809/5971 [16:18<37:30,  1.85it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  60%|██████    | 101/167 [00:04<00:02, 24.70it/s][A

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 25.84it/s][A
Epoch 3:  30%|███       | 1813/5971 [16:18<37:23,  1.85it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 26.30it/s][A
Epoch 3:  30%|███       | 1817/5971 [16:18<37:16,  1.86it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 27.04it/s][A
Epoch 3:  30%|███       | 1821/5971 [16:19<37:10,  1.86it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  68%|██████▊   | 113/167 [00:04<00:02, 26.07it/s][A

Validating:  69%|██████▉   | 116/167 [00:05<00:01, 26.63it/s][A
Epoch 3:  31%|███       | 1825/5971 [16:19<37:03,  1.86it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 25.97it/s][A
Epoch 3:  31%|███       | 1829/5971 [16:19<36:56,  1.87it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 25.34it/s][A
Epoch 3:  31%|███       | 1833/5971 [16:19<36:50,  1.87it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 25.58it/s][A

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 25.52it/s][A
Epoch 3:  31%|███       | 1837/5971 [16:19<36:43,  1.88it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 26.16it/s][A
Epoch 3:  31%|███       | 1841/5971 [16:19<36:36,  1.88it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  80%|████████  | 134/167 [00:05<00:01, 25.77it/s][A
Epoch 3:  31%|███       | 1845/5971 [16:20<36:30,  1.88it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 25.59it/s][A

Validating:  84%|████████▍ | 140/167 [00:06<00:01, 25.77it/s][A
Epoch 3:  31%|███       | 1849/5971 [16:20<36:23,  1.89it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 26.12it/s][A
Epoch 3:  31%|███       | 1853/5971 [16:20<36:17,  1.89it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 25.34it/s][A
Epoch 3:  31%|███       | 1857/5971 [16:20<36:11,  1.89it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 25.03it/s][A

Validating:  91%|█████████ | 152/167 [00:06<00:00, 26.24it/s][A
Epoch 3:  31%|███       | 1861/5971 [16:20<36:04,  1.90it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 25.03it/s][A
Epoch 3:  31%|███       | 1865/5971 [16:20<35:58,  1.90it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 25.47it/s][A
Epoch 3:  31%|███▏      | 1869/5971 [16:20<35:51,  1.91it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 26.58it/s][A

Validating:  98%|█████████▊| 164/167 [00:07<00:00, 19.12it/s][A
Epoch 3:  31%|███▏      | 1873/5971 [16:21<35:45,  1.91it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating: 100%|██████████| 167/167 [00:07<00:00, 20.93it/s][A
Epoch 3:  31%|███▏      | 1876/5971 [16:21<35:41,  1.91it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.41it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.26it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.83it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.27it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.60it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.83it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.03it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.18it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.31it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.41it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.44it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.50it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.55it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.58it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.61it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.62it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.64it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.57it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.59it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.58it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.57it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.51it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.54it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.57it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.59it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.61it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.62it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.62it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.63it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.59it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.59it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.56it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.48it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.52it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.54it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.53it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.44it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.45it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.47it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.49it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.51it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.47it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.48it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.48it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.39it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.37it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.25it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.26it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.26it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.18it/s]

Epoch 3:  31%|███▏      | 1877/5971 [16:33<36:06,  1.89it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000214, train/loss_step=0.0642, global_step=1899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  31%|███▏      | 1877/5971 [16:33<36:06,  1.89it/s, loss=0.097, v_num=0, train/loss_simple_step=0.00337, train/loss_vlb_step=1.78e-5, train/loss_step=0.00337, global_step=1900.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.31it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.37it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.21it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.85it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.34it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.69it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.92it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.02it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  4.95it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.11it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.23it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.30it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.36it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.29it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.33it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.31it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.31it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.37it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.44it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.49it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.52it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.50it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.48it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.48it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.40it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.32it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.35it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.36it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.34it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.30it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.33it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.21it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.18it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:03,  5.19it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.21it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.22it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.24it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.27it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.29it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.31it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.28it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.32it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.38it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.41it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.37it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.40it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.42it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.45it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.47it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.37it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.05it/s]

Epoch 3:  31%|███▏      | 1878/5971 [16:46<36:32,  1.87it/s, loss=0.097, v_num=0, train/loss_simple_step=0.00337, train/loss_vlb_step=1.78e-5, train/loss_step=0.00337, global_step=1900.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  31%|███▏      | 1878/5971 [16:46<36:32,  1.87it/s, loss=0.0968, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000353, train/loss_step=0.107, global_step=1900.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.31it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.33it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.14it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.77it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.27it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.64it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.90it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.08it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.18it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.26it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.34it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.30it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:07,  5.26it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.29it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.41it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.24it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.25it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:06,  5.26it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.23it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.27it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.35it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.42it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.44it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.41it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.45it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.47it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.47it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.49it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.52it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.53it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.55it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.57it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.60it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.63it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.64it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.58it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.52it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.46it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.42it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.40it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.36it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.35it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.36it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.26it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.26it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.25it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.23it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.18it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.17it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.10it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.07it/s]

Epoch 3:  31%|███▏      | 1879/5971 [16:58<36:57,  1.85it/s, loss=0.0968, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000353, train/loss_step=0.107, global_step=1900.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  31%|███▏      | 1879/5971 [16:58<36:57,  1.85it/s, loss=0.114, v_num=0, train/loss_simple_step=0.358, train/loss_vlb_step=0.00161, train/loss_step=0.358, global_step=1900.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:25,  1.90it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:15,  3.03it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:00<00:12,  3.76it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:10,  4.24it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:09,  4.66it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:08,  4.89it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.04it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.15it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.23it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.28it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.32it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.33it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:07,  5.28it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:02<00:06,  5.33it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.41it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.47it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.51it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.53it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.56it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.54it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.48it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.52it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.57it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.49it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:04<00:04,  5.25it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.20it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.26it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.29it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:04,  5.23it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.24it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.31it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.41it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.49it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.55it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.51it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.49it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.45it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.41it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.35it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.26it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.23it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.34it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.37it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.43it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.40it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.29it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.19it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.13it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.10it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.10it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.17it/s]

Epoch 3:  31%|███▏      | 1880/5971 [17:12<37:24,  1.82it/s, loss=0.114, v_num=0, train/loss_simple_step=0.358, train/loss_vlb_step=0.00161, train/loss_step=0.358, global_step=1900.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  31%|███▏      | 1880/5971 [17:12<37:24,  1.82it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00171, train/loss_vlb_step=1.02e-5, train/loss_step=0.00171, global_step=1900.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1881/5971 [17:13<37:25,  1.82it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00171, train/loss_vlb_step=1.02e-5, train/loss_step=0.00171, global_step=1900.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1881/5971 [17:13<37:25,  1.82it/s, loss=0.112, v_num=0, train/loss_simple_step=0.035, train/loss_vlb_step=0.000132, train/loss_step=0.035, global_step=1901.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  32%|███▏      | 1882/5971 [17:13<37:25,  1.82it/s, loss=0.112, v_num=0, train/loss_simple_step=0.035, train/loss_vlb_step=0.000132, train/loss_step=0.035, global_step=1901.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1882/5971 [17:13<37:25,  1.82it/s, loss=0.0883, v_num=0, train/loss_simple_step=0.0278, train/loss_vlb_step=0.00011, train/loss_step=0.0278, global_step=1901.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1883/5971 [17:14<37:25,  1.82it/s, loss=0.0883, v_num=0, train/loss_simple_step=0.0278, train/loss_vlb_step=0.00011, train/loss_step=0.0278, global_step=1901.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1883/5971 [17:14<37:25,  1.82it/s, loss=0.0594, v_num=0, train/loss_simple_step=0.00569, train/loss_vlb_step=2.88e-5, train/loss_step=0.00569, global_step=1901.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1884/5971 [17:17<37:28,  1.82it/s, loss=0.0594, v_num=0, train/loss_simple_step=0.00569, train/loss_vlb_step=2.88e-5, train/loss_step=0.00569, global_step=1901.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1884/5971 [17:17<37:28,  1.82it/s, loss=0.0504, v_num=0, train/loss_simple_step=0.00733, train/loss_vlb_step=3.55e-5, train/loss_step=0.00733, global_step=1901.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1885/5971 [17:18<37:28,  1.82it/s, loss=0.0504, v_num=0, train/loss_simple_step=0.00733, train/loss_vlb_step=3.55e-5, train/loss_step=0.00733, global_step=1901.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1885/5971 [17:18<37:28,  1.82it/s, loss=0.049, v_num=0, train/loss_simple_step=0.0219, train/loss_vlb_step=7.9e-5, train/loss_step=0.0219, global_step=1902.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  32%|███▏      | 1886/5971 [17:18<37:29,  1.82it/s, loss=0.049, v_num=0, train/loss_simple_step=0.0219, train/loss_vlb_step=7.9e-5, train/loss_step=0.0219, global_step=1902.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1886/5971 [17:18<37:29,  1.82it/s, loss=0.0506, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000495, train/loss_step=0.149, global_step=1902.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1887/5971 [17:19<37:29,  1.82it/s, loss=0.0506, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000495, train/loss_step=0.149, global_step=1902.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1887/5971 [17:19<37:29,  1.82it/s, loss=0.0492, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000372, train/loss_step=0.112, global_step=1902.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1888/5971 [17:22<37:32,  1.81it/s, loss=0.0492, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000372, train/loss_step=0.112, global_step=1902.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1888/5971 [17:22<37:32,  1.81it/s, loss=0.0509, v_num=0, train/loss_simple_step=0.0373, train/loss_vlb_step=0.000137, train/loss_step=0.0373, global_step=1902.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1889/5971 [17:23<37:33,  1.81it/s, loss=0.0509, v_num=0, train/loss_simple_step=0.0373, train/loss_vlb_step=0.000137, train/loss_step=0.0373, global_step=1902.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1889/5971 [17:23<37:33,  1.81it/s, loss=0.0812, v_num=0, train/loss_simple_step=0.666, train/loss_vlb_step=0.0196, train/loss_step=0.666, global_step=1903.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  32%|███▏      | 1890/5971 [17:24<37:33,  1.81it/s, loss=0.0812, v_num=0, train/loss_simple_step=0.666, train/loss_vlb_step=0.0196, train/loss_step=0.666, global_step=1903.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1890/5971 [17:24<37:33,  1.81it/s, loss=0.0814, v_num=0, train/loss_simple_step=0.00592, train/loss_vlb_step=2.96e-5, train/loss_step=0.00592, global_step=1903.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1891/5971 [17:25<37:33,  1.81it/s, loss=0.0814, v_num=0, train/loss_simple_step=0.00592, train/loss_vlb_step=2.96e-5, train/loss_step=0.00592, global_step=1903.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1891/5971 [17:25<37:33,  1.81it/s, loss=0.083, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000162, train/loss_step=0.0453, global_step=1903.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  32%|███▏      | 1892/5971 [17:27<37:37,  1.81it/s, loss=0.083, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000162, train/loss_step=0.0453, global_step=1903.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1892/5971 [17:27<37:37,  1.81it/s, loss=0.0839, v_num=0, train/loss_simple_step=0.0204, train/loss_vlb_step=7.6e-5, train/loss_step=0.0204, global_step=1903.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  32%|███▏      | 1893/5971 [17:28<37:37,  1.81it/s, loss=0.0839, v_num=0, train/loss_simple_step=0.0204, train/loss_vlb_step=7.6e-5, train/loss_step=0.0204, global_step=1903.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1893/5971 [17:28<37:37,  1.81it/s, loss=0.0858, v_num=0, train/loss_simple_step=0.0418, train/loss_vlb_step=0.000148, train/loss_step=0.0418, global_step=1904.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1894/5971 [17:29<37:37,  1.81it/s, loss=0.0858, v_num=0, train/loss_simple_step=0.0418, train/loss_vlb_step=0.000148, train/loss_step=0.0418, global_step=1904.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1894/5971 [17:29<37:37,  1.81it/s, loss=0.093, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000568, train/loss_step=0.145, global_step=1904.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  32%|███▏      | 1895/5971 [17:30<37:37,  1.81it/s, loss=0.093, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000568, train/loss_step=0.145, global_step=1904.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1895/5971 [17:30<37:37,  1.81it/s, loss=0.0946, v_num=0, train/loss_simple_step=0.0362, train/loss_vlb_step=0.000132, train/loss_step=0.0362, global_step=1904.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1896/5971 [17:32<37:41,  1.80it/s, loss=0.0946, v_num=0, train/loss_simple_step=0.0362, train/loss_vlb_step=0.000132, train/loss_step=0.0362, global_step=1904.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1896/5971 [17:32<37:41,  1.80it/s, loss=0.0936, v_num=0, train/loss_simple_step=0.0455, train/loss_vlb_step=0.00017, train/loss_step=0.0455, global_step=1904.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  32%|███▏      | 1897/5971 [17:33<37:41,  1.80it/s, loss=0.0936, v_num=0, train/loss_simple_step=0.0455, train/loss_vlb_step=0.00017, train/loss_step=0.0455, global_step=1904.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1897/5971 [17:33<37:41,  1.80it/s, loss=0.123, v_num=0, train/loss_simple_step=0.593, train/loss_vlb_step=0.0128, train/loss_step=0.593, global_step=1905.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  32%|███▏      | 1898/5971 [17:34<37:41,  1.80it/s, loss=0.123, v_num=0, train/loss_simple_step=0.593, train/loss_vlb_step=0.0128, train/loss_step=0.593, global_step=1905.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1898/5971 [17:34<37:41,  1.80it/s, loss=0.123, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.00034, train/loss_step=0.103, global_step=1905.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1899/5971 [17:35<37:42,  1.80it/s, loss=0.123, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.00034, train/loss_step=0.103, global_step=1905.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1899/5971 [17:35<37:42,  1.80it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0673, train/loss_vlb_step=0.000236, train/loss_step=0.0673, global_step=1905.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1900/5971 [17:37<37:45,  1.80it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0673, train/loss_vlb_step=0.000236, train/loss_step=0.0673, global_step=1905.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1900/5971 [17:37<37:45,  1.80it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0826, train/loss_vlb_step=0.000273, train/loss_step=0.0826, global_step=1905.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1901/5971 [17:38<37:45,  1.80it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0826, train/loss_vlb_step=0.000273, train/loss_step=0.0826, global_step=1905.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1901/5971 [17:38<37:45,  1.80it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00897, train/loss_vlb_step=3.99e-5, train/loss_step=0.00897, global_step=1906.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1902/5971 [17:39<37:45,  1.80it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00897, train/loss_vlb_step=3.99e-5, train/loss_step=0.00897, global_step=1906.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1902/5971 [17:39<37:45,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.595, train/loss_vlb_step=0.0166, train/loss_step=0.595, global_step=1906.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:  32%|███▏      | 1903/5971 [17:40<37:45,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.595, train/loss_vlb_step=0.0166, train/loss_step=0.595, global_step=1906.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1903/5971 [17:40<37:45,  1.80it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0807, train/loss_vlb_step=0.000265, train/loss_step=0.0807, global_step=1906.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1904/5971 [17:42<37:48,  1.79it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0807, train/loss_vlb_step=0.000265, train/loss_step=0.0807, global_step=1906.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1904/5971 [17:42<37:48,  1.79it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0334, train/loss_vlb_step=0.000113, train/loss_step=0.0334, global_step=1906.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1905/5971 [17:43<37:49,  1.79it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0334, train/loss_vlb_step=0.000113, train/loss_step=0.0334, global_step=1906.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1905/5971 [17:43<37:49,  1.79it/s, loss=0.169, v_num=0, train/loss_simple_step=0.512, train/loss_vlb_step=0.00438, train/loss_step=0.512, global_step=1907.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  32%|███▏      | 1906/5971 [17:44<37:49,  1.79it/s, loss=0.169, v_num=0, train/loss_simple_step=0.512, train/loss_vlb_step=0.00438, train/loss_step=0.512, global_step=1907.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1906/5971 [17:44<37:49,  1.79it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0654, train/loss_vlb_step=0.00022, train/loss_step=0.0654, global_step=1907.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1907/5971 [17:45<37:49,  1.79it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0654, train/loss_vlb_step=0.00022, train/loss_step=0.0654, global_step=1907.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1907/5971 [17:45<37:49,  1.79it/s, loss=0.197, v_num=0, train/loss_simple_step=0.759, train/loss_vlb_step=0.0435, train/loss_step=0.759, global_step=1907.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  32%|███▏      | 1908/5971 [17:47<37:52,  1.79it/s, loss=0.197, v_num=0, train/loss_simple_step=0.759, train/loss_vlb_step=0.0435, train/loss_step=0.759, global_step=1907.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1908/5971 [17:47<37:52,  1.79it/s, loss=0.204, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000628, train/loss_step=0.174, global_step=1907.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1909/5971 [17:48<37:52,  1.79it/s, loss=0.204, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000628, train/loss_step=0.174, global_step=1907.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1909/5971 [17:48<37:52,  1.79it/s, loss=0.171, v_num=0, train/loss_simple_step=0.00638, train/loss_vlb_step=3.21e-5, train/loss_step=0.00638, global_step=1908.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1910/5971 [17:49<37:52,  1.79it/s, loss=0.171, v_num=0, train/loss_simple_step=0.00638, train/loss_vlb_step=3.21e-5, train/loss_step=0.00638, global_step=1908.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1910/5971 [17:49<37:52,  1.79it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0179, train/loss_vlb_step=7.59e-5, train/loss_step=0.0179, global_step=1908.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  32%|███▏      | 1911/5971 [17:50<37:52,  1.79it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0179, train/loss_vlb_step=7.59e-5, train/loss_step=0.0179, global_step=1908.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1911/5971 [17:50<37:53,  1.79it/s, loss=0.176, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000437, train/loss_step=0.133, global_step=1908.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  32%|███▏      | 1912/5971 [17:52<37:55,  1.78it/s, loss=0.176, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000437, train/loss_step=0.133, global_step=1908.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1912/5971 [17:52<37:55,  1.78it/s, loss=0.193, v_num=0, train/loss_simple_step=0.356, train/loss_vlb_step=0.00138, train/loss_step=0.356, global_step=1908.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  32%|███▏      | 1913/5971 [17:53<37:56,  1.78it/s, loss=0.193, v_num=0, train/loss_simple_step=0.356, train/loss_vlb_step=0.00138, train/loss_step=0.356, global_step=1908.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1913/5971 [17:53<37:56,  1.78it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0175, train/loss_vlb_step=6.98e-5, train/loss_step=0.0175, global_step=1909.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1914/5971 [17:54<37:56,  1.78it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0175, train/loss_vlb_step=6.98e-5, train/loss_step=0.0175, global_step=1909.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1914/5971 [17:54<37:56,  1.78it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.28e-5, train/loss_step=0.0125, global_step=1909.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1915/5971 [17:55<37:56,  1.78it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.28e-5, train/loss_step=0.0125, global_step=1909.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1915/5971 [17:55<37:56,  1.78it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0785, train/loss_vlb_step=0.000263, train/loss_step=0.0785, global_step=1909.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1916/5971 [17:57<37:59,  1.78it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0785, train/loss_vlb_step=0.000263, train/loss_step=0.0785, global_step=1909.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1916/5971 [17:57<37:59,  1.78it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0411, train/loss_vlb_step=0.000155, train/loss_step=0.0411, global_step=1909.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1917/5971 [17:58<37:59,  1.78it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0411, train/loss_vlb_step=0.000155, train/loss_step=0.0411, global_step=1909.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1917/5971 [17:58<37:59,  1.78it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.000262, train/loss_step=0.0771, global_step=1910.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1918/5971 [17:59<37:59,  1.78it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.000262, train/loss_step=0.0771, global_step=1910.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1918/5971 [17:59<37:59,  1.78it/s, loss=0.16, v_num=0, train/loss_simple_step=0.074, train/loss_vlb_step=0.000255, train/loss_step=0.074, global_step=1910.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  32%|███▏      | 1919/5971 [18:00<38:00,  1.78it/s, loss=0.16, v_num=0, train/loss_simple_step=0.074, train/loss_vlb_step=0.000255, train/loss_step=0.074, global_step=1910.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1919/5971 [18:00<38:00,  1.78it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.39e-5, train/loss_step=0.0124, global_step=1910.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1920/5971 [18:03<38:03,  1.77it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.39e-5, train/loss_step=0.0124, global_step=1910.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1920/5971 [18:03<38:03,  1.77it/s, loss=0.16, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.0005, train/loss_step=0.142, global_step=1910.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  32%|███▏      | 1921/5971 [18:03<38:04,  1.77it/s, loss=0.16, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.0005, train/loss_step=0.142, global_step=1910.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1921/5971 [18:03<38:04,  1.77it/s, loss=0.165, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000372, train/loss_step=0.113, global_step=1911.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1922/5971 [18:04<38:04,  1.77it/s, loss=0.165, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000372, train/loss_step=0.113, global_step=1911.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1922/5971 [18:04<38:04,  1.77it/s, loss=0.163, v_num=0, train/loss_simple_step=0.561, train/loss_vlb_step=0.00417, train/loss_step=0.561, global_step=1911.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  32%|███▏      | 1923/5971 [18:05<38:04,  1.77it/s, loss=0.163, v_num=0, train/loss_simple_step=0.561, train/loss_vlb_step=0.00417, train/loss_step=0.561, global_step=1911.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1923/5971 [18:05<38:04,  1.77it/s, loss=0.207, v_num=0, train/loss_simple_step=0.951, train/loss_vlb_step=0.478, train/loss_step=0.951, global_step=1911.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  32%|███▏      | 1924/5971 [18:09<38:09,  1.77it/s, loss=0.207, v_num=0, train/loss_simple_step=0.951, train/loss_vlb_step=0.478, train/loss_step=0.951, global_step=1911.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1924/5971 [18:09<38:09,  1.77it/s, loss=0.219, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00119, train/loss_step=0.287, global_step=1911.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1925/5971 [18:10<38:09,  1.77it/s, loss=0.219, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00119, train/loss_step=0.287, global_step=1911.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1925/5971 [18:10<38:09,  1.77it/s, loss=0.194, v_num=0, train/loss_simple_step=0.00711, train/loss_vlb_step=3.51e-5, train/loss_step=0.00711, global_step=1912.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1926/5971 [18:10<38:09,  1.77it/s, loss=0.194, v_num=0, train/loss_simple_step=0.00711, train/loss_vlb_step=3.51e-5, train/loss_step=0.00711, global_step=1912.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1926/5971 [18:10<38:09,  1.77it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0428, train/loss_vlb_step=0.000158, train/loss_step=0.0428, global_step=1912.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  32%|███▏      | 1927/5971 [18:11<38:10,  1.77it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0428, train/loss_vlb_step=0.000158, train/loss_step=0.0428, global_step=1912.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1927/5971 [18:11<38:10,  1.77it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00847, train/loss_vlb_step=4.07e-5, train/loss_step=0.00847, global_step=1912.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1928/5971 [18:14<38:13,  1.76it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00847, train/loss_vlb_step=4.07e-5, train/loss_step=0.00847, global_step=1912.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1928/5971 [18:14<38:13,  1.76it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00603, train/loss_vlb_step=3e-5, train/loss_step=0.00603, global_step=1912.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  32%|███▏      | 1929/5971 [18:15<38:13,  1.76it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00603, train/loss_vlb_step=3e-5, train/loss_step=0.00603, global_step=1912.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1929/5971 [18:15<38:13,  1.76it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=5.13e-5, train/loss_step=0.0111, global_step=1913.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1930/5971 [18:16<38:13,  1.76it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=5.13e-5, train/loss_step=0.0111, global_step=1913.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1930/5971 [18:16<38:13,  1.76it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0171, train/loss_vlb_step=6.81e-5, train/loss_step=0.0171, global_step=1913.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1931/5971 [18:16<38:13,  1.76it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0171, train/loss_vlb_step=6.81e-5, train/loss_step=0.0171, global_step=1913.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1931/5971 [18:16<38:13,  1.76it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00252, train/loss_vlb_step=1.44e-5, train/loss_step=0.00252, global_step=1913.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1932/5971 [18:19<38:16,  1.76it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00252, train/loss_vlb_step=1.44e-5, train/loss_step=0.00252, global_step=1913.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1932/5971 [18:19<38:16,  1.76it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0606, train/loss_vlb_step=0.000204, train/loss_step=0.0606, global_step=1913.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  32%|███▏      | 1933/5971 [18:20<38:16,  1.76it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0606, train/loss_vlb_step=0.000204, train/loss_step=0.0606, global_step=1913.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1933/5971 [18:20<38:16,  1.76it/s, loss=0.143, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00204, train/loss_step=0.345, global_step=1914.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  32%|███▏      | 1934/5971 [18:21<38:17,  1.76it/s, loss=0.143, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00204, train/loss_step=0.345, global_step=1914.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1934/5971 [18:21<38:17,  1.76it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00348, train/loss_vlb_step=1.82e-5, train/loss_step=0.00348, global_step=1914.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1935/5971 [18:21<38:17,  1.76it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00348, train/loss_vlb_step=1.82e-5, train/loss_step=0.00348, global_step=1914.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1935/5971 [18:21<38:17,  1.76it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00651, train/loss_vlb_step=3.32e-5, train/loss_step=0.00651, global_step=1914.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1936/5971 [18:25<38:22,  1.75it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00651, train/loss_vlb_step=3.32e-5, train/loss_step=0.00651, global_step=1914.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1936/5971 [18:25<38:22,  1.75it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=7.32e-5, train/loss_step=0.0173, global_step=1914.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  32%|███▏      | 1937/5971 [18:26<38:22,  1.75it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=7.32e-5, train/loss_step=0.0173, global_step=1914.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1937/5971 [18:26<38:22,  1.75it/s, loss=0.15, v_num=0, train/loss_simple_step=0.333, train/loss_vlb_step=0.00149, train/loss_step=0.333, global_step=1915.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  32%|███▏      | 1938/5971 [18:27<38:22,  1.75it/s, loss=0.15, v_num=0, train/loss_simple_step=0.333, train/loss_vlb_step=0.00149, train/loss_step=0.333, global_step=1915.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1938/5971 [18:27<38:22,  1.75it/s, loss=0.152, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000389, train/loss_step=0.118, global_step=1915.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1939/5971 [18:28<38:22,  1.75it/s, loss=0.152, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000389, train/loss_step=0.118, global_step=1915.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1939/5971 [18:28<38:22,  1.75it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00317, train/loss_vlb_step=1.72e-5, train/loss_step=0.00317, global_step=1915.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1940/5971 [18:30<38:26,  1.75it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00317, train/loss_vlb_step=1.72e-5, train/loss_step=0.00317, global_step=1915.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  32%|███▏      | 1940/5971 [18:30<38:26,  1.75it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0606, train/loss_vlb_step=0.0002, train/loss_step=0.0606, global_step=1915.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  33%|███▎      | 1941/5971 [18:31<38:26,  1.75it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0606, train/loss_vlb_step=0.0002, train/loss_step=0.0606, global_step=1915.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1941/5971 [18:31<38:26,  1.75it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00144, train/loss_vlb_step=8.77e-6, train/loss_step=0.00144, global_step=1916.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1942/5971 [18:32<38:26,  1.75it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00144, train/loss_vlb_step=8.77e-6, train/loss_step=0.00144, global_step=1916.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1942/5971 [18:32<38:26,  1.75it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00535, train/loss_vlb_step=2.69e-5, train/loss_step=0.00535, global_step=1916.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1943/5971 [18:33<38:26,  1.75it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00535, train/loss_vlb_step=2.69e-5, train/loss_step=0.00535, global_step=1916.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1943/5971 [18:33<38:26,  1.75it/s, loss=0.0978, v_num=0, train/loss_simple_step=0.619, train/loss_vlb_step=0.00746, train/loss_step=0.619, global_step=1916.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  33%|███▎      | 1944/5971 [18:35<38:30,  1.74it/s, loss=0.0978, v_num=0, train/loss_simple_step=0.619, train/loss_vlb_step=0.00746, train/loss_step=0.619, global_step=1916.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1944/5971 [18:35<38:30,  1.74it/s, loss=0.084, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=5.21e-5, train/loss_step=0.0116, global_step=1916.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1945/5971 [18:36<38:30,  1.74it/s, loss=0.084, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=5.21e-5, train/loss_step=0.0116, global_step=1916.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1945/5971 [18:36<38:30,  1.74it/s, loss=0.0839, v_num=0, train/loss_simple_step=0.00427, train/loss_vlb_step=2.23e-5, train/loss_step=0.00427, global_step=1917.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1946/5971 [18:37<38:30,  1.74it/s, loss=0.0839, v_num=0, train/loss_simple_step=0.00427, train/loss_vlb_step=2.23e-5, train/loss_step=0.00427, global_step=1917.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1946/5971 [18:37<38:30,  1.74it/s, loss=0.0871, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000354, train/loss_step=0.108, global_step=1917.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  33%|███▎      | 1947/5971 [18:38<38:30,  1.74it/s, loss=0.0871, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000354, train/loss_step=0.108, global_step=1917.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1947/5971 [18:38<38:30,  1.74it/s, loss=0.127, v_num=0, train/loss_simple_step=0.811, train/loss_vlb_step=0.0522, train/loss_step=0.811, global_step=1917.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  33%|███▎      | 1948/5971 [18:40<38:33,  1.74it/s, loss=0.127, v_num=0, train/loss_simple_step=0.811, train/loss_vlb_step=0.0522, train/loss_step=0.811, global_step=1917.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1948/5971 [18:40<38:33,  1.74it/s, loss=0.137, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000755, train/loss_step=0.209, global_step=1917.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1949/5971 [18:41<38:33,  1.74it/s, loss=0.137, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000755, train/loss_step=0.209, global_step=1917.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1949/5971 [18:41<38:33,  1.74it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0561, train/loss_vlb_step=0.000192, train/loss_step=0.0561, global_step=1918.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1950/5971 [18:42<38:33,  1.74it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0561, train/loss_vlb_step=0.000192, train/loss_step=0.0561, global_step=1918.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1950/5971 [18:42<38:33,  1.74it/s, loss=0.15, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000973, train/loss_step=0.227, global_step=1918.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  33%|███▎      | 1951/5971 [18:43<38:33,  1.74it/s, loss=0.15, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000973, train/loss_step=0.227, global_step=1918.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1951/5971 [18:43<38:33,  1.74it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0797, train/loss_vlb_step=0.000266, train/loss_step=0.0797, global_step=1918.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1952/5971 [18:45<38:36,  1.74it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0797, train/loss_vlb_step=0.000266, train/loss_step=0.0797, global_step=1918.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1952/5971 [18:45<38:36,  1.74it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00189, train/loss_vlb_step=1.1e-5, train/loss_step=0.00189, global_step=1918.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1953/5971 [18:46<38:36,  1.73it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00189, train/loss_vlb_step=1.1e-5, train/loss_step=0.00189, global_step=1918.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1953/5971 [18:46<38:36,  1.73it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.95e-5, train/loss_step=0.0118, global_step=1919.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  33%|███▎      | 1954/5971 [18:47<38:36,  1.73it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.95e-5, train/loss_step=0.0118, global_step=1919.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1954/5971 [18:47<38:36,  1.73it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00443, train/loss_vlb_step=2.4e-5, train/loss_step=0.00443, global_step=1919.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1955/5971 [18:48<38:36,  1.73it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00443, train/loss_vlb_step=2.4e-5, train/loss_step=0.00443, global_step=1919.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1955/5971 [18:48<38:36,  1.73it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0298, train/loss_vlb_step=0.000106, train/loss_step=0.0298, global_step=1919.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1956/5971 [18:50<38:39,  1.73it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0298, train/loss_vlb_step=0.000106, train/loss_step=0.0298, global_step=1919.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1956/5971 [18:50<38:39,  1.73it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00591, train/loss_vlb_step=2.96e-5, train/loss_step=0.00591, global_step=1919.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1957/5971 [18:51<38:39,  1.73it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00591, train/loss_vlb_step=2.96e-5, train/loss_step=0.00591, global_step=1919.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1957/5971 [18:51<38:39,  1.73it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0022, train/loss_vlb_step=1.23e-5, train/loss_step=0.0022, global_step=1920.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  33%|███▎      | 1958/5971 [18:52<38:39,  1.73it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0022, train/loss_vlb_step=1.23e-5, train/loss_step=0.0022, global_step=1920.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1958/5971 [18:52<38:39,  1.73it/s, loss=0.113, v_num=0, train/loss_simple_step=0.003, train/loss_vlb_step=1.68e-5, train/loss_step=0.003, global_step=1920.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  33%|███▎      | 1959/5971 [18:53<38:39,  1.73it/s, loss=0.113, v_num=0, train/loss_simple_step=0.003, train/loss_vlb_step=1.68e-5, train/loss_step=0.003, global_step=1920.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1959/5971 [18:53<38:39,  1.73it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00218, train/loss_vlb_step=1.29e-5, train/loss_step=0.00218, global_step=1920.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1960/5971 [18:55<38:42,  1.73it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00218, train/loss_vlb_step=1.29e-5, train/loss_step=0.00218, global_step=1920.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1960/5971 [18:55<38:42,  1.73it/s, loss=0.141, v_num=0, train/loss_simple_step=0.631, train/loss_vlb_step=0.00855, train/loss_step=0.631, global_step=1920.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  33%|███▎      | 1961/5971 [18:56<38:43,  1.73it/s, loss=0.141, v_num=0, train/loss_simple_step=0.631, train/loss_vlb_step=0.00855, train/loss_step=0.631, global_step=1920.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1961/5971 [18:56<38:43,  1.73it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0942, train/loss_vlb_step=0.000311, train/loss_step=0.0942, global_step=1921.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1962/5971 [18:57<38:43,  1.73it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0942, train/loss_vlb_step=0.000311, train/loss_step=0.0942, global_step=1921.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1962/5971 [18:57<38:43,  1.73it/s, loss=0.151, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000386, train/loss_step=0.117, global_step=1921.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  33%|███▎      | 1963/5971 [18:58<38:43,  1.73it/s, loss=0.151, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000386, train/loss_step=0.117, global_step=1921.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1963/5971 [18:58<38:43,  1.73it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00189, train/loss_vlb_step=1.1e-5, train/loss_step=0.00189, global_step=1921.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1964/5971 [19:01<38:47,  1.72it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00189, train/loss_vlb_step=1.1e-5, train/loss_step=0.00189, global_step=1921.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1964/5971 [19:01<38:47,  1.72it/s, loss=0.127, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000427, train/loss_step=0.130, global_step=1921.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  33%|███▎      | 1965/5971 [19:02<38:47,  1.72it/s, loss=0.127, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000427, train/loss_step=0.130, global_step=1921.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1965/5971 [19:02<38:47,  1.72it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.38e-5, train/loss_step=0.0101, global_step=1922.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1966/5971 [19:03<38:47,  1.72it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.38e-5, train/loss_step=0.0101, global_step=1922.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1966/5971 [19:03<38:47,  1.72it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0266, train/loss_vlb_step=0.000103, train/loss_step=0.0266, global_step=1922.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1967/5971 [19:04<38:47,  1.72it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0266, train/loss_vlb_step=0.000103, train/loss_step=0.0266, global_step=1922.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1967/5971 [19:04<38:47,  1.72it/s, loss=0.115, v_num=0, train/loss_simple_step=0.649, train/loss_vlb_step=0.00844, train/loss_step=0.649, global_step=1922.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  33%|███▎      | 1968/5971 [19:07<38:51,  1.72it/s, loss=0.115, v_num=0, train/loss_simple_step=0.649, train/loss_vlb_step=0.00844, train/loss_step=0.649, global_step=1922.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1968/5971 [19:07<38:51,  1.72it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0219, train/loss_vlb_step=8.6e-5, train/loss_step=0.0219, global_step=1922.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1969/5971 [19:07<38:51,  1.72it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0219, train/loss_vlb_step=8.6e-5, train/loss_step=0.0219, global_step=1922.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1969/5971 [19:07<38:51,  1.72it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00212, train/loss_vlb_step=1.26e-5, train/loss_step=0.00212, global_step=1923.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1970/5971 [19:08<38:51,  1.72it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00212, train/loss_vlb_step=1.26e-5, train/loss_step=0.00212, global_step=1923.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1970/5971 [19:08<38:51,  1.72it/s, loss=0.0967, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000363, train/loss_step=0.109, global_step=1923.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  33%|███▎      | 1971/5971 [19:09<38:52,  1.72it/s, loss=0.0967, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000363, train/loss_step=0.109, global_step=1923.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1971/5971 [19:09<38:52,  1.72it/s, loss=0.0942, v_num=0, train/loss_simple_step=0.0303, train/loss_vlb_step=0.00012, train/loss_step=0.0303, global_step=1923.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1972/5971 [19:12<38:55,  1.71it/s, loss=0.0942, v_num=0, train/loss_simple_step=0.0303, train/loss_vlb_step=0.00012, train/loss_step=0.0303, global_step=1923.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1972/5971 [19:12<38:55,  1.71it/s, loss=0.119, v_num=0, train/loss_simple_step=0.495, train/loss_vlb_step=0.00424, train/loss_step=0.495, global_step=1923.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  33%|███▎      | 1973/5971 [19:12<38:55,  1.71it/s, loss=0.119, v_num=0, train/loss_simple_step=0.495, train/loss_vlb_step=0.00424, train/loss_step=0.495, global_step=1923.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1973/5971 [19:12<38:55,  1.71it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00262, train/loss_vlb_step=1.52e-5, train/loss_step=0.00262, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1974/5971 [19:13<38:55,  1.71it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00262, train/loss_vlb_step=1.52e-5, train/loss_step=0.00262, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1974/5971 [19:13<38:55,  1.71it/s, loss=0.127, v_num=0, train/loss_simple_step=0.183, train/loss_vlb_step=0.000624, train/loss_step=0.183, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  33%|███▎      | 1975/5971 [19:14<38:55,  1.71it/s, loss=0.127, v_num=0, train/loss_simple_step=0.183, train/loss_vlb_step=0.000624, train/loss_step=0.183, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1975/5971 [19:14<38:55,  1.71it/s, loss=0.13, v_num=0, train/loss_simple_step=0.076, train/loss_vlb_step=0.00026, train/loss_step=0.076, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  33%|███▎      | 1976/5971 [19:16<38:57,  1.71it/s, loss=0.13, v_num=0, train/loss_simple_step=0.076, train/loss_vlb_step=0.00026, train/loss_step=0.076, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  33%|███▎      | 1976/5971 [19:16<38:57,  1.71it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<02:40,  1.03it/s][A
Epoch 3:  33%|███▎      | 1978/5971 [19:17<38:56,  1.71it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   1%|          | 2/167 [00:01<01:25,  1.92it/s][A
Epoch 3:  33%|███▎      | 1980/5971 [19:18<38:53,  1.71it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   3%|▎         | 5/167 [00:01<00:28,  5.65it/s][A
Epoch 3:  33%|███▎      | 1983/5971 [19:18<38:48,  1.71it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   5%|▍         | 8/167 [00:01<00:16,  9.37it/s][A
Epoch 3:  33%|███▎      | 1986/5971 [19:18<38:43,  1.72it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   7%|▋         | 11/167 [00:01<00:12, 12.75it/s][A
Epoch 3:  33%|███▎      | 1989/5971 [19:18<38:38,  1.72it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   8%|▊         | 14/167 [00:01<00:10, 15.21it/s][A
Epoch 3:  33%|███▎      | 1992/5971 [19:18<38:33,  1.72it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  10%|█         | 17/167 [00:01<00:08, 18.28it/s][A
Epoch 3:  33%|███▎      | 1995/5971 [19:18<38:28,  1.72it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  12%|█▏        | 20/167 [00:01<00:07, 19.72it/s][A
Epoch 3:  33%|███▎      | 1998/5971 [19:18<38:23,  1.73it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  14%|█▍        | 23/167 [00:02<00:06, 20.80it/s][A
Epoch 3:  34%|███▎      | 2001/5971 [19:18<38:18,  1.73it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  16%|█▌        | 26/167 [00:02<00:06, 22.72it/s][A
Epoch 3:  34%|███▎      | 2004/5971 [19:19<38:13,  1.73it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  17%|█▋        | 29/167 [00:02<00:05, 23.71it/s][A
Epoch 3:  34%|███▎      | 2007/5971 [19:19<38:08,  1.73it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  19%|█▉        | 32/167 [00:02<00:05, 24.40it/s][A
Epoch 3:  34%|███▎      | 2010/5971 [19:19<38:03,  1.73it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  21%|██        | 35/167 [00:02<00:05, 24.29it/s][A
Epoch 3:  34%|███▎      | 2013/5971 [19:19<37:58,  1.74it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  23%|██▎       | 38/167 [00:02<00:05, 25.00it/s][A
Epoch 3:  34%|███▍      | 2016/5971 [19:19<37:53,  1.74it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 26.13it/s][A
Epoch 3:  34%|███▍      | 2019/5971 [19:19<37:48,  1.74it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 26.64it/s][A
Epoch 3:  34%|███▍      | 2022/5971 [19:19<37:43,  1.74it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 26.52it/s][A
Epoch 3:  34%|███▍      | 2025/5971 [19:19<37:38,  1.75it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  30%|██▉       | 50/167 [00:03<00:04, 27.29it/s][A
Epoch 3:  34%|███▍      | 2028/5971 [19:19<37:34,  1.75it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  32%|███▏      | 53/167 [00:03<00:04, 25.35it/s][A
Epoch 3:  34%|███▍      | 2031/5971 [19:20<37:29,  1.75it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  34%|███▎      | 56/167 [00:03<00:04, 26.46it/s][A
Epoch 3:  34%|███▍      | 2034/5971 [19:20<37:24,  1.75it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  35%|███▌      | 59/167 [00:03<00:04, 25.72it/s][A
Epoch 3:  34%|███▍      | 2037/5971 [19:20<37:19,  1.76it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  37%|███▋      | 62/167 [00:03<00:03, 26.27it/s][A
Epoch 3:  34%|███▍      | 2040/5971 [19:20<37:14,  1.76it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  40%|███▉      | 66/167 [00:03<00:03, 27.81it/s][A
Epoch 3:  34%|███▍      | 2044/5971 [19:20<37:08,  1.76it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 28.04it/s][A
Epoch 3:  34%|███▍      | 2048/5971 [19:20<37:02,  1.77it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 28.41it/s][A

Validating:  45%|████▍     | 75/167 [00:03<00:03, 26.74it/s][A
Epoch 3:  34%|███▍      | 2052/5971 [19:20<36:55,  1.77it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  47%|████▋     | 79/167 [00:04<00:03, 28.01it/s][A
Epoch 3:  34%|███▍      | 2056/5971 [19:20<36:49,  1.77it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  49%|████▉     | 82/167 [00:04<00:03, 27.80it/s][A
Epoch 3:  35%|███▍      | 2060/5971 [19:21<36:43,  1.78it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  51%|█████     | 85/167 [00:04<00:02, 27.39it/s][A
Epoch 3:  35%|███▍      | 2064/5971 [19:21<36:37,  1.78it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  53%|█████▎    | 88/167 [00:04<00:03, 26.23it/s][A

Validating:  54%|█████▍    | 91/167 [00:04<00:02, 25.88it/s][A
Epoch 3:  35%|███▍      | 2068/5971 [19:21<36:30,  1.78it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 26.14it/s][A
Epoch 3:  35%|███▍      | 2072/5971 [19:21<36:24,  1.78it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 25.43it/s][A
Epoch 3:  35%|███▍      | 2076/5971 [19:21<36:18,  1.79it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  60%|██████    | 101/167 [00:04<00:02, 26.89it/s][A
Epoch 3:  35%|███▍      | 2080/5971 [19:21<36:12,  1.79it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  62%|██████▏   | 104/167 [00:05<00:02, 25.84it/s][A

Validating:  64%|██████▍   | 107/167 [00:05<00:02, 26.38it/s][A
Epoch 3:  35%|███▍      | 2084/5971 [19:22<36:06,  1.79it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  66%|██████▌   | 110/167 [00:05<00:02, 27.16it/s][A
Epoch 3:  35%|███▍      | 2088/5971 [19:22<36:00,  1.80it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  68%|██████▊   | 113/167 [00:05<00:02, 26.08it/s][A
Epoch 3:  35%|███▌      | 2092/5971 [19:22<35:54,  1.80it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  69%|██████▉   | 116/167 [00:05<00:01, 26.01it/s][A

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 25.31it/s][A
Epoch 3:  35%|███▌      | 2096/5971 [19:22<35:48,  1.80it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 25.67it/s][A
Epoch 3:  35%|███▌      | 2100/5971 [19:22<35:42,  1.81it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 26.23it/s][A
Epoch 3:  35%|███▌      | 2104/5971 [19:22<35:36,  1.81it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 25.50it/s][A

Validating:  78%|███████▊  | 131/167 [00:06<00:01, 26.51it/s][A
Epoch 3:  35%|███▌      | 2108/5971 [19:22<35:30,  1.81it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  80%|████████  | 134/167 [00:06<00:01, 26.62it/s][A
Epoch 3:  35%|███▌      | 2112/5971 [19:23<35:24,  1.82it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  82%|████████▏ | 137/167 [00:06<00:01, 26.40it/s][A
Epoch 3:  35%|███▌      | 2116/5971 [19:23<35:18,  1.82it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  84%|████████▍ | 140/167 [00:06<00:01, 26.83it/s][A

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 26.34it/s][A
Epoch 3:  36%|███▌      | 2120/5971 [19:23<35:12,  1.82it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 26.40it/s][A
Epoch 3:  36%|███▌      | 2124/5971 [19:23<35:06,  1.83it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 24.17it/s][A
Epoch 3:  36%|███▌      | 2128/5971 [19:23<35:00,  1.83it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 26.03it/s][A
Epoch 3:  36%|███▌      | 2132/5971 [19:23<34:54,  1.83it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  93%|█████████▎| 156/167 [00:07<00:00, 26.35it/s][A

Validating:  95%|█████████▌| 159/167 [00:07<00:00, 26.72it/s][A
Epoch 3:  36%|███▌      | 2136/5971 [19:24<34:48,  1.84it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  97%|█████████▋| 162/167 [00:07<00:00, 26.91it/s][A
Epoch 3:  36%|███▌      | 2140/5971 [19:24<34:43,  1.84it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  99%|█████████▉| 165/167 [00:07<00:00, 27.53it/s][A
Epoch 3:  36%|███▌      | 2144/5971 [19:24<34:37,  1.84it/s, loss=0.146, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00146, train/loss_step=0.329, global_step=1924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▌      | 2145/5971 [19:25<34:37,  1.84it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0673, train/loss_vlb_step=0.000224, train/loss_step=0.0673, global_step=1925.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▌      | 2146/5971 [19:26<34:37,  1.84it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0033, train/loss_vlb_step=1.83e-5, train/loss_step=0.0033, global_step=1925.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  36%|███▌      | 2147/5971 [19:27<34:38,  1.84it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0761, train/loss_vlb_step=0.000253, train/loss_step=0.0761, global_step=1925.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▌      | 2148/5971 [19:29<34:40,  1.84it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0761, train/loss_vlb_step=0.000253, train/loss_step=0.0761, global_step=1925.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▌      | 2148/5971 [19:29<34:40,  1.84it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0182, train/loss_vlb_step=7.61e-5, train/loss_step=0.0182, global_step=1925.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  36%|███▌      | 2149/5971 [19:30<34:41,  1.84it/s, loss=0.152, v_num=0, train/loss_simple_step=0.698, train/loss_vlb_step=0.0196, train/loss_step=0.698, global_step=1926.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  36%|███▌      | 2150/5971 [19:31<34:41,  1.84it/s, loss=0.156, v_num=0, train/loss_simple_step=0.182, train/loss_vlb_step=0.000667, train/loss_step=0.182, global_step=1926.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▌      | 2151/5971 [19:32<34:41,  1.84it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0502, train/loss_vlb_step=0.000179, train/loss_step=0.0502, global_step=1926.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▌      | 2152/5971 [19:34<34:43,  1.83it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0502, train/loss_vlb_step=0.000179, train/loss_step=0.0502, global_step=1926.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▌      | 2152/5971 [19:34<34:43,  1.83it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0829, train/loss_vlb_step=0.000278, train/loss_step=0.0829, global_step=1926.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▌      | 2153/5971 [19:35<34:43,  1.83it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00206, train/loss_vlb_step=1.24e-5, train/loss_step=0.00206, global_step=1927.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▌      | 2154/5971 [19:36<34:43,  1.83it/s, loss=0.156, v_num=0, train/loss_simple_step=0.039, train/loss_vlb_step=0.000144, train/loss_step=0.039, global_step=1927.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  36%|███▌      | 2155/5971 [19:37<34:43,  1.83it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0694, train/loss_vlb_step=0.000235, train/loss_step=0.0694, global_step=1927.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▌      | 2156/5971 [19:39<34:46,  1.83it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0694, train/loss_vlb_step=0.000235, train/loss_step=0.0694, global_step=1927.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▌      | 2156/5971 [19:39<34:46,  1.83it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00557, train/loss_vlb_step=2.75e-5, train/loss_step=0.00557, global_step=1927.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▌      | 2157/5971 [19:40<34:46,  1.83it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0056, train/loss_vlb_step=2.83e-5, train/loss_step=0.0056, global_step=1928.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  36%|███▌      | 2158/5971 [19:41<34:46,  1.83it/s, loss=0.133, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.00102, train/loss_step=0.249, global_step=1928.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  36%|███▌      | 2159/5971 [19:42<34:46,  1.83it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0098, train/loss_vlb_step=4.66e-5, train/loss_step=0.0098, global_step=1928.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▌      | 2160/5971 [19:44<34:48,  1.82it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0098, train/loss_vlb_step=4.66e-5, train/loss_step=0.0098, global_step=1928.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▌      | 2160/5971 [19:44<34:48,  1.82it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=5.92e-5, train/loss_step=0.0149, global_step=1928.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▌      | 2161/5971 [19:45<34:48,  1.82it/s, loss=0.122, v_num=0, train/loss_simple_step=0.273, train/loss_vlb_step=0.00104, train/loss_step=0.273, global_step=1929.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  36%|███▌      | 2162/5971 [19:46<34:48,  1.82it/s, loss=0.122, v_num=0, train/loss_simple_step=0.191, train/loss_vlb_step=0.000707, train/loss_step=0.191, global_step=1929.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▌      | 2163/5971 [19:47<34:48,  1.82it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00322, train/loss_vlb_step=1.73e-5, train/loss_step=0.00322, global_step=1929.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▌      | 2164/5971 [19:49<34:51,  1.82it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00322, train/loss_vlb_step=1.73e-5, train/loss_step=0.00322, global_step=1929.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▌      | 2164/5971 [19:49<34:51,  1.82it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0038, train/loss_vlb_step=2.03e-5, train/loss_step=0.0038, global_step=1929.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  36%|███▋      | 2165/5971 [19:50<34:51,  1.82it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0527, train/loss_vlb_step=0.00018, train/loss_step=0.0527, global_step=1930.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▋      | 2166/5971 [19:51<34:51,  1.82it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0309, train/loss_vlb_step=0.000117, train/loss_step=0.0309, global_step=1930.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▋      | 2167/5971 [19:51<34:51,  1.82it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.18e-5, train/loss_step=0.0202, global_step=1930.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  36%|███▋      | 2168/5971 [19:54<34:54,  1.82it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.18e-5, train/loss_step=0.0202, global_step=1930.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▋      | 2168/5971 [19:54<34:54,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.0011, train/loss_step=0.269, global_step=1930.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  36%|███▋      | 2169/5971 [19:55<34:54,  1.82it/s, loss=0.0843, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000443, train/loss_step=0.133, global_step=1931.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▋      | 2170/5971 [19:56<34:54,  1.82it/s, loss=0.0895, v_num=0, train/loss_simple_step=0.286, train/loss_vlb_step=0.00125, train/loss_step=0.286, global_step=1931.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  36%|███▋      | 2171/5971 [19:57<34:54,  1.81it/s, loss=0.0878, v_num=0, train/loss_simple_step=0.0163, train/loss_vlb_step=7.17e-5, train/loss_step=0.0163, global_step=1931.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▋      | 2172/5971 [19:59<34:56,  1.81it/s, loss=0.0878, v_num=0, train/loss_simple_step=0.0163, train/loss_vlb_step=7.17e-5, train/loss_step=0.0163, global_step=1931.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▋      | 2172/5971 [19:59<34:56,  1.81it/s, loss=0.0873, v_num=0, train/loss_simple_step=0.0715, train/loss_vlb_step=0.000242, train/loss_step=0.0715, global_step=1931.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▋      | 2173/5971 [20:00<34:56,  1.81it/s, loss=0.0897, v_num=0, train/loss_simple_step=0.0517, train/loss_vlb_step=0.000176, train/loss_step=0.0517, global_step=1932.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▋      | 2174/5971 [20:00<34:56,  1.81it/s, loss=0.0908, v_num=0, train/loss_simple_step=0.0606, train/loss_vlb_step=0.000203, train/loss_step=0.0606, global_step=1932.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▋      | 2175/5971 [20:01<34:56,  1.81it/s, loss=0.0878, v_num=0, train/loss_simple_step=0.00952, train/loss_vlb_step=4.43e-5, train/loss_step=0.00952, global_step=1932.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▋      | 2176/5971 [20:04<34:59,  1.81it/s, loss=0.0878, v_num=0, train/loss_simple_step=0.00952, train/loss_vlb_step=4.43e-5, train/loss_step=0.00952, global_step=1932.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▋      | 2176/5971 [20:04<34:59,  1.81it/s, loss=0.11, v_num=0, train/loss_simple_step=0.446, train/loss_vlb_step=0.00212, train/loss_step=0.446, global_step=1932.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]      
Epoch 3:  36%|███▋      | 2177/5971 [20:05<34:59,  1.81it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0107, train/loss_vlb_step=4.77e-5, train/loss_step=0.0107, global_step=1933.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▋      | 2178/5971 [20:05<34:59,  1.81it/s, loss=0.0978, v_num=0, train/loss_simple_step=0.00193, train/loss_vlb_step=1.15e-5, train/loss_step=0.00193, global_step=1933.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  36%|███▋      | 2179/5971 [20:06<34:59,  1.81it/s, loss=0.0993, v_num=0, train/loss_simple_step=0.0393, train/loss_vlb_step=0.00015, train/loss_step=0.0393, global_step=1933.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  37%|███▋      | 2180/5971 [20:08<35:01,  1.80it/s, loss=0.0993, v_num=0, train/loss_simple_step=0.0393, train/loss_vlb_step=0.00015, train/loss_step=0.0393, global_step=1933.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2180/5971 [20:08<35:01,  1.80it/s, loss=0.0991, v_num=0, train/loss_simple_step=0.012, train/loss_vlb_step=5.37e-5, train/loss_step=0.012, global_step=1933.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  37%|███▋      | 2181/5971 [20:09<35:01,  1.80it/s, loss=0.0918, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000433, train/loss_step=0.127, global_step=1934.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2182/5971 [20:10<35:01,  1.80it/s, loss=0.0824, v_num=0, train/loss_simple_step=0.00224, train/loss_vlb_step=1.21e-5, train/loss_step=0.00224, global_step=1934.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2183/5971 [20:11<35:01,  1.80it/s, loss=0.102, v_num=0, train/loss_simple_step=0.403, train/loss_vlb_step=0.00258, train/loss_step=0.403, global_step=1934.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:  37%|███▋      | 2184/5971 [20:13<35:03,  1.80it/s, loss=0.102, v_num=0, train/loss_simple_step=0.403, train/loss_vlb_step=0.00258, train/loss_step=0.403, global_step=1934.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2184/5971 [20:13<35:03,  1.80it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.12e-5, train/loss_step=0.0138, global_step=1934.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2185/5971 [20:14<35:03,  1.80it/s, loss=0.112, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000908, train/loss_step=0.240, global_step=1935.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  37%|███▋      | 2186/5971 [20:15<35:03,  1.80it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00362, train/loss_vlb_step=1.9e-5, train/loss_step=0.00362, global_step=1935.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2187/5971 [20:16<35:03,  1.80it/s, loss=0.116, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000385, train/loss_step=0.116, global_step=1935.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  37%|███▋      | 2188/5971 [20:18<35:05,  1.80it/s, loss=0.116, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000385, train/loss_step=0.116, global_step=1935.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2188/5971 [20:18<35:05,  1.80it/s, loss=0.114, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000864, train/loss_step=0.240, global_step=1935.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2189/5971 [20:19<35:05,  1.80it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0303, train/loss_vlb_step=0.000115, train/loss_step=0.0303, global_step=1936.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2190/5971 [20:20<35:05,  1.80it/s, loss=0.0953, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.67e-5, train/loss_step=0.0104, global_step=1936.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2191/5971 [20:21<35:05,  1.80it/s, loss=0.0982, v_num=0, train/loss_simple_step=0.0747, train/loss_vlb_step=0.000252, train/loss_step=0.0747, global_step=1936.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2192/5971 [20:23<35:07,  1.79it/s, loss=0.0982, v_num=0, train/loss_simple_step=0.0747, train/loss_vlb_step=0.000252, train/loss_step=0.0747, global_step=1936.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2192/5971 [20:23<35:07,  1.79it/s, loss=0.11, v_num=0, train/loss_simple_step=0.301, train/loss_vlb_step=0.00131, train/loss_step=0.301, global_step=1936.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:  37%|███▋      | 2193/5971 [20:24<35:08,  1.79it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0207, train/loss_vlb_step=8.33e-5, train/loss_step=0.0207, global_step=1937.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2194/5971 [20:25<35:08,  1.79it/s, loss=0.112, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000482, train/loss_step=0.145, global_step=1937.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  37%|███▋      | 2195/5971 [20:25<35:07,  1.79it/s, loss=0.12, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000537, train/loss_step=0.157, global_step=1937.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  37%|███▋      | 2196/5971 [20:28<35:11,  1.79it/s, loss=0.12, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000537, train/loss_step=0.157, global_step=1937.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2196/5971 [20:28<35:11,  1.79it/s, loss=0.109, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000799, train/loss_step=0.228, global_step=1937.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2197/5971 [20:29<35:11,  1.79it/s, loss=0.154, v_num=0, train/loss_simple_step=0.913, train/loss_vlb_step=0.230, train/loss_step=0.913, global_step=1938.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  37%|███▋      | 2198/5971 [20:30<35:11,  1.79it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00144, train/loss_vlb_step=8.64e-6, train/loss_step=0.00144, global_step=1938.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2199/5971 [20:31<35:11,  1.79it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0023, train/loss_vlb_step=1.33e-5, train/loss_step=0.0023, global_step=1938.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  37%|███▋      | 2200/5971 [20:33<35:13,  1.78it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0023, train/loss_vlb_step=1.33e-5, train/loss_step=0.0023, global_step=1938.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2200/5971 [20:33<35:13,  1.78it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00522, train/loss_vlb_step=2.52e-5, train/loss_step=0.00522, global_step=1938.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2201/5971 [20:34<35:13,  1.78it/s, loss=0.152, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000462, train/loss_step=0.138, global_step=1939.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  37%|███▋      | 2202/5971 [20:35<35:13,  1.78it/s, loss=0.155, v_num=0, train/loss_simple_step=0.059, train/loss_vlb_step=0.000194, train/loss_step=0.059, global_step=1939.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2203/5971 [20:36<35:13,  1.78it/s, loss=0.143, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000558, train/loss_step=0.158, global_step=1939.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2204/5971 [20:38<35:15,  1.78it/s, loss=0.143, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000558, train/loss_step=0.158, global_step=1939.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2204/5971 [20:38<35:15,  1.78it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0594, train/loss_vlb_step=0.000208, train/loss_step=0.0594, global_step=1939.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2205/5971 [20:39<35:15,  1.78it/s, loss=0.139, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000357, train/loss_step=0.109, global_step=1940.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  37%|███▋      | 2206/5971 [20:40<35:15,  1.78it/s, loss=0.167, v_num=0, train/loss_simple_step=0.575, train/loss_vlb_step=0.00536, train/loss_step=0.575, global_step=1940.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  37%|███▋      | 2207/5971 [20:41<35:15,  1.78it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=1940.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2208/5971 [20:43<35:17,  1.78it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=1940.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2208/5971 [20:43<35:17,  1.78it/s, loss=0.163, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00097, train/loss_step=0.269, global_step=1940.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  37%|███▋      | 2209/5971 [20:44<35:17,  1.78it/s, loss=0.169, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000479, train/loss_step=0.143, global_step=1941.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2210/5971 [20:45<35:17,  1.78it/s, loss=0.183, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00123, train/loss_step=0.296, global_step=1941.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  37%|███▋      | 2211/5971 [20:45<35:17,  1.78it/s, loss=0.191, v_num=0, train/loss_simple_step=0.232, train/loss_vlb_step=0.000906, train/loss_step=0.232, global_step=1941.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2212/5971 [20:48<35:19,  1.77it/s, loss=0.191, v_num=0, train/loss_simple_step=0.232, train/loss_vlb_step=0.000906, train/loss_step=0.232, global_step=1941.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2212/5971 [20:48<35:19,  1.77it/s, loss=0.191, v_num=0, train/loss_simple_step=0.310, train/loss_vlb_step=0.0013, train/loss_step=0.310, global_step=1941.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  37%|███▋      | 2213/5971 [20:49<35:20,  1.77it/s, loss=0.197, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000464, train/loss_step=0.138, global_step=1942.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2214/5971 [20:50<35:20,  1.77it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0944, train/loss_vlb_step=0.00031, train/loss_step=0.0944, global_step=1942.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2215/5971 [20:50<35:20,  1.77it/s, loss=0.208, v_num=0, train/loss_simple_step=0.433, train/loss_vlb_step=0.00285, train/loss_step=0.433, global_step=1942.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  37%|███▋      | 2216/5971 [20:53<35:22,  1.77it/s, loss=0.208, v_num=0, train/loss_simple_step=0.433, train/loss_vlb_step=0.00285, train/loss_step=0.433, global_step=1942.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2216/5971 [20:53<35:22,  1.77it/s, loss=0.197, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.24e-5, train/loss_step=0.00216, global_step=1942.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2217/5971 [20:54<35:22,  1.77it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00539, train/loss_vlb_step=2.67e-5, train/loss_step=0.00539, global_step=1943.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2218/5971 [20:55<35:22,  1.77it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0586, train/loss_vlb_step=0.000209, train/loss_step=0.0586, global_step=1943.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  37%|███▋      | 2219/5971 [20:55<35:22,  1.77it/s, loss=0.162, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000512, train/loss_step=0.143, global_step=1943.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  37%|███▋      | 2220/5971 [20:58<35:24,  1.77it/s, loss=0.162, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000512, train/loss_step=0.143, global_step=1943.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2220/5971 [20:58<35:24,  1.77it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00366, train/loss_vlb_step=1.86e-5, train/loss_step=0.00366, global_step=1943.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2221/5971 [20:58<35:24,  1.76it/s, loss=0.16, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000355, train/loss_step=0.108, global_step=1944.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  37%|███▋      | 2222/5971 [20:59<35:24,  1.76it/s, loss=0.167, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000635, train/loss_step=0.190, global_step=1944.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2223/5971 [21:00<35:24,  1.76it/s, loss=0.187, v_num=0, train/loss_simple_step=0.564, train/loss_vlb_step=0.0078, train/loss_step=0.564, global_step=1944.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  37%|███▋      | 2224/5971 [21:03<35:27,  1.76it/s, loss=0.187, v_num=0, train/loss_simple_step=0.564, train/loss_vlb_step=0.0078, train/loss_step=0.564, global_step=1944.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2224/5971 [21:03<35:27,  1.76it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0299, train/loss_vlb_step=0.000109, train/loss_step=0.0299, global_step=1944.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2225/5971 [21:03<35:27,  1.76it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0059, train/loss_vlb_step=3.04e-5, train/loss_step=0.0059, global_step=1945.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  37%|███▋      | 2226/5971 [21:04<35:27,  1.76it/s, loss=0.158, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000417, train/loss_step=0.126, global_step=1945.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2227/5971 [21:05<35:27,  1.76it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0239, train/loss_vlb_step=9.65e-5, train/loss_step=0.0239, global_step=1945.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2228/5971 [21:07<35:29,  1.76it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0239, train/loss_vlb_step=9.65e-5, train/loss_step=0.0239, global_step=1945.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2228/5971 [21:07<35:29,  1.76it/s, loss=0.177, v_num=0, train/loss_simple_step=0.630, train/loss_vlb_step=0.00911, train/loss_step=0.630, global_step=1945.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  37%|███▋      | 2229/5971 [21:08<35:29,  1.76it/s, loss=0.18, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000783, train/loss_step=0.201, global_step=1946.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2230/5971 [21:09<35:29,  1.76it/s, loss=0.172, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000463, train/loss_step=0.139, global_step=1946.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2231/5971 [21:10<35:29,  1.76it/s, loss=0.166, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000379, train/loss_step=0.115, global_step=1946.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2232/5971 [21:12<35:31,  1.75it/s, loss=0.166, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000379, train/loss_step=0.115, global_step=1946.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2232/5971 [21:12<35:31,  1.75it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00251, train/loss_vlb_step=1.43e-5, train/loss_step=0.00251, global_step=1946.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2233/5971 [21:13<35:31,  1.75it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0621, train/loss_vlb_step=0.00021, train/loss_step=0.0621, global_step=1947.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  37%|███▋      | 2234/5971 [21:14<35:31,  1.75it/s, loss=0.149, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000482, train/loss_step=0.145, global_step=1947.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  37%|███▋      | 2235/5971 [21:15<35:31,  1.75it/s, loss=0.146, v_num=0, train/loss_simple_step=0.367, train/loss_vlb_step=0.0016, train/loss_step=0.367, global_step=1947.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  37%|███▋      | 2236/5971 [21:17<35:33,  1.75it/s, loss=0.146, v_num=0, train/loss_simple_step=0.367, train/loss_vlb_step=0.0016, train/loss_step=0.367, global_step=1947.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2236/5971 [21:17<35:33,  1.75it/s, loss=0.159, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.00109, train/loss_step=0.266, global_step=1947.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2237/5971 [21:18<35:33,  1.75it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00409, train/loss_vlb_step=2.15e-5, train/loss_step=0.00409, global_step=1948.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2238/5971 [21:19<35:33,  1.75it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00837, train/loss_vlb_step=3.73e-5, train/loss_step=0.00837, global_step=1948.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  37%|███▋      | 2239/5971 [21:20<35:33,  1.75it/s, loss=0.159, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000621, train/loss_step=0.184, global_step=1948.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  38%|███▊      | 2240/5971 [21:22<35:35,  1.75it/s, loss=0.159, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000621, train/loss_step=0.184, global_step=1948.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  38%|███▊      | 2240/5971 [21:22<35:35,  1.75it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00455, train/loss_vlb_step=2.33e-5, train/loss_step=0.00455, global_step=1948.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  38%|███▊      | 2241/5971 [21:23<35:35,  1.75it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00253, train/loss_vlb_step=1.44e-5, train/loss_step=0.00253, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  38%|███▊      | 2242/5971 [21:24<35:35,  1.75it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0171, train/loss_vlb_step=7.36e-5, train/loss_step=0.0171, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  38%|███▊      | 2243/5971 [21:25<35:35,  1.75it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00288, train/loss_vlb_step=1.63e-5, train/loss_step=0.00288, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  38%|███▊      | 2244/5971 [21:27<35:37,  1.74it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00288, train/loss_vlb_step=1.63e-5, train/loss_step=0.00288, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  38%|███▊      | 2244/5971 [21:27<35:37,  1.74it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:15,  2.19it/s][A

Validating:   1%|          | 2/167 [00:00<00:44,  3.68it/s][A
Epoch 3:  38%|███▊      | 2248/5971 [21:28<35:32,  1.75it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   2%|▏         | 4/167 [00:00<00:22,  7.12it/s][A

Validating:   4%|▍         | 7/167 [00:00<00:13, 11.57it/s][A
Epoch 3:  38%|███▊      | 2252/5971 [21:28<35:26,  1.75it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   6%|▌         | 10/167 [00:00<00:10, 15.12it/s][A
Epoch 3:  38%|███▊      | 2256/5971 [21:28<35:21,  1.75it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   8%|▊         | 13/167 [00:01<00:08, 18.34it/s][A
Epoch 3:  38%|███▊      | 2260/5971 [21:28<35:15,  1.75it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  10%|▉         | 16/167 [00:01<00:08, 18.85it/s][A

Validating:  11%|█▏        | 19/167 [00:01<00:08, 18.30it/s][A
Epoch 3:  38%|███▊      | 2264/5971 [21:29<35:09,  1.76it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  13%|█▎        | 21/167 [00:01<00:07, 18.42it/s][A
Epoch 3:  38%|███▊      | 2268/5971 [21:29<35:04,  1.76it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  14%|█▍        | 24/167 [00:01<00:06, 20.56it/s][A

Validating:  16%|█▌        | 27/167 [00:01<00:06, 21.09it/s][A
Epoch 3:  38%|███▊      | 2272/5971 [21:29<34:58,  1.76it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  18%|█▊        | 30/167 [00:01<00:06, 22.34it/s][A
Epoch 3:  38%|███▊      | 2276/5971 [21:29<34:52,  1.77it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 24.02it/s][A
Epoch 3:  38%|███▊      | 2280/5971 [21:29<34:46,  1.77it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  22%|██▏       | 36/167 [00:02<00:05, 24.85it/s][A

Validating:  23%|██▎       | 39/167 [00:02<00:04, 25.61it/s][A
Epoch 3:  38%|███▊      | 2284/5971 [21:29<34:41,  1.77it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  25%|██▌       | 42/167 [00:02<00:05, 24.90it/s][A
Epoch 3:  38%|███▊      | 2288/5971 [21:30<34:35,  1.77it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 25.74it/s][A
Epoch 3:  38%|███▊      | 2292/5971 [21:30<34:30,  1.78it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 26.81it/s][A

Validating:  31%|███       | 51/167 [00:02<00:04, 26.51it/s][A
Epoch 3:  38%|███▊      | 2296/5971 [21:30<34:24,  1.78it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 27.28it/s][A
Epoch 3:  39%|███▊      | 2300/5971 [21:30<34:18,  1.78it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  34%|███▍      | 57/167 [00:02<00:03, 27.80it/s][A
Epoch 3:  39%|███▊      | 2304/5971 [21:30<34:13,  1.79it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  36%|███▌      | 60/167 [00:02<00:04, 26.68it/s][A

Validating:  38%|███▊      | 63/167 [00:03<00:04, 24.55it/s][A
Epoch 3:  39%|███▊      | 2308/5971 [21:30<34:07,  1.79it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  40%|███▉      | 66/167 [00:03<00:04, 23.58it/s][A
Epoch 3:  39%|███▊      | 2312/5971 [21:30<34:02,  1.79it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  41%|████▏     | 69/167 [00:03<00:04, 24.06it/s][A
Epoch 3:  39%|███▉      | 2316/5971 [21:31<33:56,  1.79it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 25.04it/s][A

Validating:  45%|████▍     | 75/167 [00:03<00:03, 26.31it/s][A
Epoch 3:  39%|███▉      | 2320/5971 [21:31<33:51,  1.80it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 26.63it/s][A
Epoch 3:  39%|███▉      | 2324/5971 [21:31<33:45,  1.80it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 26.79it/s][A
Epoch 3:  39%|███▉      | 2328/5971 [21:31<33:40,  1.80it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  50%|█████     | 84/167 [00:03<00:03, 27.20it/s][A
Epoch 3:  39%|███▉      | 2332/5971 [21:31<33:34,  1.81it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  53%|█████▎    | 88/167 [00:04<00:02, 27.84it/s][A

Validating:  54%|█████▍    | 91/167 [00:04<00:02, 27.98it/s][A
Epoch 3:  39%|███▉      | 2336/5971 [21:31<33:29,  1.81it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 27.58it/s][A
Epoch 3:  39%|███▉      | 2340/5971 [21:31<33:23,  1.81it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 27.53it/s][A
Epoch 3:  39%|███▉      | 2344/5971 [21:32<33:18,  1.81it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 26.73it/s][A

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 27.41it/s][A
Epoch 3:  39%|███▉      | 2348/5971 [21:32<33:13,  1.82it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 27.40it/s][A
Epoch 3:  39%|███▉      | 2352/5971 [21:32<33:07,  1.82it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 27.82it/s][A
Epoch 3:  39%|███▉      | 2356/5971 [21:32<33:02,  1.82it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  67%|██████▋   | 112/167 [00:04<00:01, 27.90it/s][A

Validating:  69%|██████▉   | 115/167 [00:05<00:01, 27.75it/s][A
Epoch 3:  40%|███▉      | 2360/5971 [21:32<32:57,  1.83it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  71%|███████   | 118/167 [00:05<00:01, 26.09it/s][A
Epoch 3:  40%|███▉      | 2364/5971 [21:32<32:51,  1.83it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 26.90it/s][A
Epoch 3:  40%|███▉      | 2368/5971 [21:33<32:46,  1.83it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 26.43it/s][A

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 25.51it/s][A
Epoch 3:  40%|███▉      | 2372/5971 [21:33<32:41,  1.84it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 26.01it/s][A
Epoch 3:  40%|███▉      | 2376/5971 [21:33<32:36,  1.84it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 26.36it/s][A
Epoch 3:  40%|███▉      | 2380/5971 [21:33<32:30,  1.84it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 26.82it/s][A

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 26.70it/s][A
Epoch 3:  40%|███▉      | 2384/5971 [21:33<32:25,  1.84it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  85%|████████▌ | 142/167 [00:06<00:00, 27.51it/s][A
Epoch 3:  40%|███▉      | 2388/5971 [21:33<32:20,  1.85it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 27.00it/s][A
Epoch 3:  40%|████      | 2392/5971 [21:33<32:15,  1.85it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 26.47it/s][A

Validating:  90%|█████████ | 151/167 [00:06<00:00, 25.95it/s][A
Epoch 3:  40%|████      | 2396/5971 [21:34<32:10,  1.85it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 26.29it/s][A
Epoch 3:  40%|████      | 2400/5971 [21:34<32:04,  1.86it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 26.86it/s][A
Epoch 3:  40%|████      | 2404/5971 [21:34<31:59,  1.86it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 26.62it/s][A

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 25.17it/s][A
Epoch 3:  40%|████      | 2408/5971 [21:34<31:54,  1.86it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  99%|█████████▉| 166/167 [00:07<00:00, 26.38it/s][A
Epoch 3:  40%|████      | 2412/5971 [21:34<31:49,  1.86it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  40%|████      | 2412/5971 [21:35<31:50,  1.86it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=1949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Epoch 3:  40%|████      | 2413/5971 [21:36<31:50,  1.86it/s, loss=0.158, v_num=0, train/loss_simple_step=0.773, train/loss_vlb_step=0.0566, train/loss_step=0.773, global_step=1950.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  40%|████      | 2414/5971 [21:36<31:50,  1.86it/s, loss=0.174, v_num=0, train/loss_simple_step=0.440, train/loss_vlb_step=0.00294, train/loss_step=0.440, global_step=1950.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  40%|████      | 2415/5971 [21:37<31:50,  1.86it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=1950.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  40%|████      | 2416/5971 [21:40<31:53,  1.86it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=1950.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  40%|████      | 2416/5971 [21:40<31:53,  1.86it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00182, train/loss_vlb_step=1.08e-5, train/loss_step=0.00182, global_step=1950.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  40%|████      | 2417/5971 [21:41<31:53,  1.86it/s, loss=0.142, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000422, train/loss_step=0.128, global_step=1951.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  40%|████      | 2418/5971 [21:42<31:53,  1.86it/s, loss=0.16, v_num=0, train/loss_simple_step=0.496, train/loss_vlb_step=0.00291, train/loss_step=0.496, global_step=1951.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  41%|████      | 2419/5971 [21:43<31:53,  1.86it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00906, train/loss_vlb_step=4.17e-5, train/loss_step=0.00906, global_step=1951.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2420/5971 [21:45<31:55,  1.85it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00906, train/loss_vlb_step=4.17e-5, train/loss_step=0.00906, global_step=1951.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2420/5971 [21:45<31:55,  1.85it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0204, train/loss_vlb_step=8.38e-5, train/loss_step=0.0204, global_step=1951.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  41%|████      | 2421/5971 [21:46<31:55,  1.85it/s, loss=0.161, v_num=0, train/loss_simple_step=0.169, train/loss_vlb_step=0.000561, train/loss_step=0.169, global_step=1952.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  41%|████      | 2422/5971 [21:47<31:55,  1.85it/s, loss=0.171, v_num=0, train/loss_simple_step=0.351, train/loss_vlb_step=0.00185, train/loss_step=0.351, global_step=1952.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  41%|████      | 2423/5971 [21:48<31:55,  1.85it/s, loss=0.162, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000673, train/loss_step=0.180, global_step=1952.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2424/5971 [21:51<31:57,  1.85it/s, loss=0.162, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000673, train/loss_step=0.180, global_step=1952.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2424/5971 [21:51<31:57,  1.85it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.51e-5, train/loss_step=0.0127, global_step=1952.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2425/5971 [21:52<31:57,  1.85it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0337, train/loss_vlb_step=0.000129, train/loss_step=0.0337, global_step=1953.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2426/5971 [21:52<31:57,  1.85it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.44e-5, train/loss_step=0.0102, global_step=1953.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  41%|████      | 2427/5971 [21:53<31:57,  1.85it/s, loss=0.155, v_num=0, train/loss_simple_step=0.256, train/loss_vlb_step=0.00107, train/loss_step=0.256, global_step=1953.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  41%|████      | 2428/5971 [21:56<31:59,  1.85it/s, loss=0.155, v_num=0, train/loss_simple_step=0.256, train/loss_vlb_step=0.00107, train/loss_step=0.256, global_step=1953.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2428/5971 [21:56<31:59,  1.85it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0349, train/loss_vlb_step=0.000138, train/loss_step=0.0349, global_step=1953.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2429/5971 [21:57<31:59,  1.85it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=0.000104, train/loss_step=0.0272, global_step=1954.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2430/5971 [21:57<31:59,  1.84it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0953, train/loss_vlb_step=0.000316, train/loss_step=0.0953, global_step=1954.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2431/5971 [21:58<31:59,  1.84it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00503, train/loss_vlb_step=2.53e-5, train/loss_step=0.00503, global_step=1954.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2432/5971 [22:01<32:01,  1.84it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00503, train/loss_vlb_step=2.53e-5, train/loss_step=0.00503, global_step=1954.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2432/5971 [22:01<32:01,  1.84it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00228, train/loss_vlb_step=1.33e-5, train/loss_step=0.00228, global_step=1954.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2433/5971 [22:02<32:01,  1.84it/s, loss=0.125, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000416, train/loss_step=0.126, global_step=1955.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  41%|████      | 2434/5971 [22:03<32:01,  1.84it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0228, train/loss_vlb_step=9.08e-5, train/loss_step=0.0228, global_step=1955.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2435/5971 [22:03<32:01,  1.84it/s, loss=0.104, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000334, train/loss_step=0.101, global_step=1955.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  41%|████      | 2436/5971 [22:06<32:03,  1.84it/s, loss=0.104, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000334, train/loss_step=0.101, global_step=1955.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2436/5971 [22:06<32:03,  1.84it/s, loss=0.11, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000367, train/loss_step=0.111, global_step=1955.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  41%|████      | 2437/5971 [22:07<32:03,  1.84it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0409, train/loss_vlb_step=0.000157, train/loss_step=0.0409, global_step=1956.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2438/5971 [22:08<32:03,  1.84it/s, loss=0.0828, v_num=0, train/loss_simple_step=0.0466, train/loss_vlb_step=0.000157, train/loss_step=0.0466, global_step=1956.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2439/5971 [22:08<32:03,  1.84it/s, loss=0.0839, v_num=0, train/loss_simple_step=0.0308, train/loss_vlb_step=0.000119, train/loss_step=0.0308, global_step=1956.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2440/5971 [22:11<32:05,  1.83it/s, loss=0.0839, v_num=0, train/loss_simple_step=0.0308, train/loss_vlb_step=0.000119, train/loss_step=0.0308, global_step=1956.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2440/5971 [22:11<32:05,  1.83it/s, loss=0.0888, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=1956.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  41%|████      | 2441/5971 [22:12<32:05,  1.83it/s, loss=0.128, v_num=0, train/loss_simple_step=0.945, train/loss_vlb_step=0.239, train/loss_step=0.945, global_step=1957.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  41%|████      | 2442/5971 [22:13<32:05,  1.83it/s, loss=0.133, v_num=0, train/loss_simple_step=0.465, train/loss_vlb_step=0.00297, train/loss_step=0.465, global_step=1957.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2443/5971 [22:13<32:05,  1.83it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0304, train/loss_vlb_step=0.000122, train/loss_step=0.0304, global_step=1957.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2444/5971 [22:16<32:07,  1.83it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0304, train/loss_vlb_step=0.000122, train/loss_step=0.0304, global_step=1957.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2444/5971 [22:16<32:07,  1.83it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00315, train/loss_vlb_step=1.73e-5, train/loss_step=0.00315, global_step=1957.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2445/5971 [22:16<32:07,  1.83it/s, loss=0.136, v_num=0, train/loss_simple_step=0.238, train/loss_vlb_step=0.000964, train/loss_step=0.238, global_step=1958.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  41%|████      | 2446/5971 [22:17<32:07,  1.83it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0397, train/loss_vlb_step=0.00015, train/loss_step=0.0397, global_step=1958.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2447/5971 [22:18<32:07,  1.83it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0437, train/loss_vlb_step=0.000163, train/loss_step=0.0437, global_step=1958.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2448/5971 [22:20<32:08,  1.83it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0437, train/loss_vlb_step=0.000163, train/loss_step=0.0437, global_step=1958.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2448/5971 [22:20<32:08,  1.83it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00503, train/loss_vlb_step=2.5e-5, train/loss_step=0.00503, global_step=1958.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2449/5971 [22:21<32:08,  1.83it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0179, train/loss_vlb_step=7.03e-5, train/loss_step=0.0179, global_step=1959.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  41%|████      | 2450/5971 [22:22<32:08,  1.83it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0277, train/loss_vlb_step=0.000103, train/loss_step=0.0277, global_step=1959.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2451/5971 [22:23<32:08,  1.83it/s, loss=0.141, v_num=0, train/loss_simple_step=0.408, train/loss_vlb_step=0.00249, train/loss_step=0.408, global_step=1959.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  41%|████      | 2452/5971 [22:26<32:11,  1.82it/s, loss=0.141, v_num=0, train/loss_simple_step=0.408, train/loss_vlb_step=0.00249, train/loss_step=0.408, global_step=1959.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2452/5971 [22:26<32:11,  1.82it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0922, train/loss_vlb_step=0.00031, train/loss_step=0.0922, global_step=1959.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2453/5971 [22:27<32:11,  1.82it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0783, train/loss_vlb_step=0.00026, train/loss_step=0.0783, global_step=1960.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2454/5971 [22:28<32:11,  1.82it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0148, train/loss_vlb_step=6.08e-5, train/loss_step=0.0148, global_step=1960.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2455/5971 [22:29<32:11,  1.82it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0873, train/loss_vlb_step=0.000291, train/loss_step=0.0873, global_step=1960.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2456/5971 [22:31<32:13,  1.82it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0873, train/loss_vlb_step=0.000291, train/loss_step=0.0873, global_step=1960.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2456/5971 [22:31<32:13,  1.82it/s, loss=0.158, v_num=0, train/loss_simple_step=0.425, train/loss_vlb_step=0.00237, train/loss_step=0.425, global_step=1960.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  41%|████      | 2457/5971 [22:32<32:13,  1.82it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0419, train/loss_vlb_step=0.000151, train/loss_step=0.0419, global_step=1961.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2458/5971 [22:33<32:12,  1.82it/s, loss=0.159, v_num=0, train/loss_simple_step=0.077, train/loss_vlb_step=0.000255, train/loss_step=0.077, global_step=1961.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  41%|████      | 2459/5971 [22:33<32:12,  1.82it/s, loss=0.18, v_num=0, train/loss_simple_step=0.440, train/loss_vlb_step=0.00595, train/loss_step=0.440, global_step=1961.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  41%|████      | 2460/5971 [22:36<32:14,  1.81it/s, loss=0.18, v_num=0, train/loss_simple_step=0.440, train/loss_vlb_step=0.00595, train/loss_step=0.440, global_step=1961.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2460/5971 [22:36<32:14,  1.81it/s, loss=0.175, v_num=0, train/loss_simple_step=0.012, train/loss_vlb_step=5.09e-5, train/loss_step=0.012, global_step=1961.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2461/5971 [22:37<32:14,  1.81it/s, loss=0.156, v_num=0, train/loss_simple_step=0.569, train/loss_vlb_step=0.0077, train/loss_step=0.569, global_step=1962.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  41%|████      | 2462/5971 [22:37<32:14,  1.81it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00268, train/loss_vlb_step=1.55e-5, train/loss_step=0.00268, global_step=1962.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████      | 2463/5971 [22:38<32:14,  1.81it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00269, train/loss_vlb_step=1.55e-5, train/loss_step=0.00269, global_step=1962.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████▏     | 2464/5971 [22:41<32:16,  1.81it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00269, train/loss_vlb_step=1.55e-5, train/loss_step=0.00269, global_step=1962.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████▏     | 2464/5971 [22:41<32:16,  1.81it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0032, train/loss_vlb_step=1.72e-5, train/loss_step=0.0032, global_step=1962.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  41%|████▏     | 2465/5971 [22:41<32:16,  1.81it/s, loss=0.129, v_num=0, train/loss_simple_step=0.198, train/loss_vlb_step=0.000701, train/loss_step=0.198, global_step=1963.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  41%|████▏     | 2466/5971 [22:42<32:16,  1.81it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.72e-5, train/loss_step=0.0132, global_step=1963.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████▏     | 2467/5971 [22:43<32:16,  1.81it/s, loss=0.135, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000646, train/loss_step=0.192, global_step=1963.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  41%|████▏     | 2468/5971 [22:45<32:17,  1.81it/s, loss=0.135, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000646, train/loss_step=0.192, global_step=1963.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████▏     | 2468/5971 [22:45<32:17,  1.81it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00426, train/loss_vlb_step=2.13e-5, train/loss_step=0.00426, global_step=1963.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████▏     | 2469/5971 [22:46<32:17,  1.81it/s, loss=0.167, v_num=0, train/loss_simple_step=0.644, train/loss_vlb_step=0.0122, train/loss_step=0.644, global_step=1964.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:  41%|████▏     | 2470/5971 [22:47<32:17,  1.81it/s, loss=0.184, v_num=0, train/loss_simple_step=0.378, train/loss_vlb_step=0.00317, train/loss_step=0.378, global_step=1964.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████▏     | 2471/5971 [22:48<32:17,  1.81it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00206, train/loss_vlb_step=1.21e-5, train/loss_step=0.00206, global_step=1964.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████▏     | 2472/5971 [22:50<32:19,  1.80it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00206, train/loss_vlb_step=1.21e-5, train/loss_step=0.00206, global_step=1964.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████▏     | 2472/5971 [22:50<32:19,  1.80it/s, loss=0.17, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000723, train/loss_step=0.211, global_step=1964.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  41%|████▏     | 2473/5971 [22:51<32:19,  1.80it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0309, train/loss_vlb_step=0.000121, train/loss_step=0.0309, global_step=1965.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████▏     | 2474/5971 [22:52<32:19,  1.80it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0486, train/loss_vlb_step=0.000177, train/loss_step=0.0486, global_step=1965.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████▏     | 2475/5971 [22:53<32:19,  1.80it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0371, train/loss_vlb_step=0.000133, train/loss_step=0.0371, global_step=1965.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████▏     | 2476/5971 [22:56<32:21,  1.80it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0371, train/loss_vlb_step=0.000133, train/loss_step=0.0371, global_step=1965.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  41%|████▏     | 2476/5971 [22:56<32:21,  1.80it/s, loss=0.164, v_num=0, train/loss_simple_step=0.368, train/loss_vlb_step=0.0022, train/loss_step=0.368, global_step=1965.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  41%|████▏     | 2477/5971 [22:56<32:21,  1.80it/s, loss=0.167, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000353, train/loss_step=0.108, global_step=1966.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  42%|████▏     | 2478/5971 [22:57<32:21,  1.80it/s, loss=0.189, v_num=0, train/loss_simple_step=0.518, train/loss_vlb_step=0.00369, train/loss_step=0.518, global_step=1966.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  42%|████▏     | 2479/5971 [22:58<32:21,  1.80it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0031, train/loss_vlb_step=1.68e-5, train/loss_step=0.0031, global_step=1966.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  42%|████▏     | 2480/5971 [23:01<32:23,  1.80it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0031, train/loss_vlb_step=1.68e-5, train/loss_step=0.0031, global_step=1966.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  42%|████▏     | 2480/5971 [23:01<32:23,  1.80it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00773, train/loss_vlb_step=3.66e-5, train/loss_step=0.00773, global_step=1966.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  42%|████▏     | 2481/5971 [23:02<32:23,  1.80it/s, loss=0.152, v_num=0, train/loss_simple_step=0.270, train/loss_vlb_step=0.00112, train/loss_step=0.270, global_step=1967.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  42%|████▏     | 2482/5971 [23:02<32:23,  1.80it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0556, train/loss_vlb_step=0.000188, train/loss_step=0.0556, global_step=1967.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  42%|████▏     | 2483/5971 [23:03<32:23,  1.80it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00825, train/loss_vlb_step=4.18e-5, train/loss_step=0.00825, global_step=1967.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  42%|████▏     | 2484/5971 [23:05<32:24,  1.79it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00825, train/loss_vlb_step=4.18e-5, train/loss_step=0.00825, global_step=1967.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  42%|████▏     | 2484/5971 [23:05<32:24,  1.79it/s, loss=0.161, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000405, train/loss_step=0.122, global_step=1967.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  42%|████▏     | 2485/5971 [23:06<32:24,  1.79it/s, loss=0.182, v_num=0, train/loss_simple_step=0.628, train/loss_vlb_step=0.00908, train/loss_step=0.628, global_step=1968.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  42%|████▏     | 2486/5971 [23:07<32:24,  1.79it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0758, train/loss_vlb_step=0.000259, train/loss_step=0.0758, global_step=1968.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  42%|████▏     | 2487/5971 [23:08<32:24,  1.79it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0027, train/loss_vlb_step=1.54e-5, train/loss_step=0.0027, global_step=1968.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  42%|████▏     | 2488/5971 [23:10<32:26,  1.79it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0027, train/loss_vlb_step=1.54e-5, train/loss_step=0.0027, global_step=1968.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  42%|████▏     | 2488/5971 [23:10<32:26,  1.79it/s, loss=0.189, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.001, train/loss_step=0.267, global_step=1968.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  42%|████▏     | 2489/5971 [23:11<32:25,  1.79it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000315, train/loss_step=0.0958, global_step=1969.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  42%|████▏     | 2490/5971 [23:12<32:25,  1.79it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0148, train/loss_vlb_step=6.23e-5, train/loss_step=0.0148, global_step=1969.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  42%|████▏     | 2491/5971 [23:13<32:25,  1.79it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00546, train/loss_vlb_step=2.82e-5, train/loss_step=0.00546, global_step=1969.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  42%|████▏     | 2492/5971 [23:15<32:27,  1.79it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00546, train/loss_vlb_step=2.82e-5, train/loss_step=0.00546, global_step=1969.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  42%|████▏     | 2492/5971 [23:15<32:27,  1.79it/s, loss=0.151, v_num=0, train/loss_simple_step=0.344, train/loss_vlb_step=0.00145, train/loss_step=0.344, global_step=1969.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  42%|████▏     | 2493/5971 [23:16<32:27,  1.79it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0329, train/loss_vlb_step=0.000116, train/loss_step=0.0329, global_step=1970.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  42%|████▏     | 2494/5971 [23:17<32:27,  1.79it/s, loss=0.15, v_num=0, train/loss_simple_step=0.039, train/loss_vlb_step=0.000142, train/loss_step=0.039, global_step=1970.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  42%|████▏     | 2495/5971 [23:18<32:27,  1.79it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000158, train/loss_step=0.0453, global_step=1970.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  42%|████▏     | 2496/5971 [23:20<32:28,  1.78it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000158, train/loss_step=0.0453, global_step=1970.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  42%|████▏     | 2496/5971 [23:20<32:28,  1.78it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0454, train/loss_vlb_step=0.00016, train/loss_step=0.0454, global_step=1970.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  42%|████▏     | 2497/5971 [23:21<32:28,  1.78it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0153, train/loss_vlb_step=6.65e-5, train/loss_step=0.0153, global_step=1971.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  42%|████▏     | 2498/5971 [23:22<32:28,  1.78it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000145, train/loss_step=0.0406, global_step=1971.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  42%|████▏     | 2499/5971 [23:23<32:28,  1.78it/s, loss=0.13, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00302, train/loss_step=0.476, global_step=1971.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  42%|████▏     | 2500/5971 [23:25<32:30,  1.78it/s, loss=0.13, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00302, train/loss_step=0.476, global_step=1971.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  42%|████▏     | 2500/5971 [23:25<32:30,  1.78it/s, loss=0.137, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.00054, train/loss_step=0.160, global_step=1971.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  42%|████▏     | 2501/5971 [23:26<32:30,  1.78it/s, loss=0.129, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.00036, train/loss_step=0.110, global_step=1972.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  42%|████▏     | 2502/5971 [23:27<32:30,  1.78it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0374, train/loss_vlb_step=0.000138, train/loss_step=0.0374, global_step=1972.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  42%|████▏     | 2503/5971 [23:28<32:30,  1.78it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0755, train/loss_vlb_step=0.000261, train/loss_step=0.0755, global_step=1972.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  42%|████▏     | 2504/5971 [23:30<32:31,  1.78it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0755, train/loss_vlb_step=0.000261, train/loss_step=0.0755, global_step=1972.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  42%|████▏     | 2504/5971 [23:30<32:31,  1.78it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00223, train/loss_vlb_step=1.29e-5, train/loss_step=0.00223, global_step=1972.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  42%|████▏     | 2505/5971 [23:31<32:31,  1.78it/s, loss=0.12, v_num=0, train/loss_simple_step=0.521, train/loss_vlb_step=0.00462, train/loss_step=0.521, global_step=1973.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:  42%|████▏     | 2506/5971 [23:32<32:31,  1.78it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0791, train/loss_vlb_step=0.000263, train/loss_step=0.0791, global_step=1973.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  42%|████▏     | 2507/5971 [23:33<32:31,  1.77it/s, loss=0.151, v_num=0, train/loss_simple_step=0.620, train/loss_vlb_step=0.0129, train/loss_step=0.620, global_step=1973.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  42%|████▏     | 2508/5971 [23:35<32:33,  1.77it/s, loss=0.151, v_num=0, train/loss_simple_step=0.620, train/loss_vlb_step=0.0129, train/loss_step=0.620, global_step=1973.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  42%|████▏     | 2508/5971 [23:35<32:33,  1.77it/s, loss=0.146, v_num=0, train/loss_simple_step=0.167, train/loss_vlb_step=0.000575, train/loss_step=0.167, global_step=1973.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  42%|████▏     | 2509/5971 [23:36<32:33,  1.77it/s, loss=0.153, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.00101, train/loss_step=0.231, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  42%|████▏     | 2510/5971 [23:37<32:33,  1.77it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0462, train/loss_vlb_step=0.000158, train/loss_step=0.0462, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  42%|████▏     | 2511/5971 [23:37<32:33,  1.77it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00292, train/loss_vlb_step=1.64e-5, train/loss_step=0.00292, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  42%|████▏     | 2512/5971 [23:40<32:34,  1.77it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00292, train/loss_vlb_step=1.64e-5, train/loss_step=0.00292, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  42%|████▏     | 2512/5971 [23:40<32:34,  1.77it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:12,  2.28it/s]

Validating:   1%|          | 2/167 [00:00<00:48,  3.41it/s]
Epoch 3:  42%|████▏     | 2516/5971 [23:40<32:30,  1.77it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   3%|▎         | 5/167 [00:00<00:18,  8.99it/s]
Epoch 3:  42%|████▏     | 2520/5971 [23:41<32:25,  1.77it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.73it/s]

Validating:   7%|▋         | 11/167 [00:00<00:08, 17.34it/s]
Epoch 3:  42%|████▏     | 2524/5971 [23:41<32:20,  1.78it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.40it/s]
Epoch 3:  42%|████▏     | 2528/5971 [23:41<32:15,  1.78it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  10%|█         | 17/167 [00:01<00:08, 17.38it/s]
Epoch 3:  42%|████▏     | 2532/5971 [23:41<32:10,  1.78it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  12%|█▏        | 20/167 [00:01<00:07, 18.84it/s]

Validating:  14%|█▍        | 23/167 [00:01<00:08, 16.70it/s]
Epoch 3:  42%|████▏     | 2536/5971 [23:41<32:05,  1.78it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  15%|█▍        | 25/167 [00:01<00:08, 17.09it/s]

Validating:  16%|█▌        | 27/167 [00:01<00:09, 15.12it/s]
Epoch 3:  43%|████▎     | 2540/5971 [23:42<32:00,  1.79it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  18%|█▊        | 30/167 [00:02<00:07, 17.50it/s]
Epoch 3:  43%|████▎     | 2544/5971 [23:42<31:55,  1.79it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  20%|█▉        | 33/167 [00:02<00:09, 14.30it/s]

Validating:  21%|██        | 35/167 [00:02<00:08, 15.28it/s]
Epoch 3:  43%|████▎     | 2548/5971 [23:42<31:50,  1.79it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  22%|██▏       | 37/167 [00:02<00:08, 16.08it/s]

Validating:  23%|██▎       | 39/167 [00:02<00:08, 14.75it/s]
Epoch 3:  43%|████▎     | 2552/5971 [23:43<31:45,  1.79it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  25%|██▍       | 41/167 [00:02<00:08, 14.89it/s]

Validating:  26%|██▌       | 43/167 [00:03<00:09, 13.04it/s]
Epoch 3:  43%|████▎     | 2556/5971 [23:43<31:40,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  27%|██▋       | 45/167 [00:03<00:08, 13.58it/s]

Validating:  28%|██▊       | 47/167 [00:03<00:08, 14.18it/s]
Epoch 3:  43%|████▎     | 2560/5971 [23:43<31:36,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  29%|██▉       | 49/167 [00:03<00:07, 14.80it/s]

Validating:  31%|███       | 51/167 [00:03<00:07, 15.03it/s]
Epoch 3:  43%|████▎     | 2564/5971 [23:43<31:31,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  32%|███▏      | 53/167 [00:03<00:07, 14.52it/s]

Validating:  33%|███▎      | 55/167 [00:03<00:07, 15.56it/s]
Epoch 3:  43%|████▎     | 2568/5971 [23:44<31:26,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  34%|███▍      | 57/167 [00:03<00:06, 16.62it/s]
Epoch 3:  43%|████▎     | 2572/5971 [23:44<31:21,  1.81it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  36%|███▌      | 60/167 [00:04<00:06, 17.78it/s]

Validating:  37%|███▋      | 62/167 [00:04<00:06, 16.98it/s]
Epoch 3:  43%|████▎     | 2576/5971 [23:44<31:16,  1.81it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  38%|███▊      | 64/167 [00:04<00:06, 16.37it/s]

Validating:  40%|███▉      | 66/167 [00:04<00:05, 17.05it/s]
Epoch 3:  43%|████▎     | 2580/5971 [23:44<31:11,  1.81it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  41%|████      | 68/167 [00:04<00:05, 17.80it/s]

Validating:  42%|████▏     | 70/167 [00:04<00:06, 15.07it/s]
Epoch 3:  43%|████▎     | 2584/5971 [23:45<31:07,  1.81it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  43%|████▎     | 72/167 [00:04<00:06, 14.08it/s]

Validating:  44%|████▍     | 74/167 [00:04<00:06, 15.31it/s]
Epoch 3:  43%|████▎     | 2588/5971 [23:45<31:02,  1.82it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  46%|████▌     | 76/167 [00:05<00:05, 16.41it/s]

Validating:  47%|████▋     | 78/167 [00:05<00:05, 16.94it/s]
Epoch 3:  43%|████▎     | 2592/5971 [23:45<30:57,  1.82it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  48%|████▊     | 80/167 [00:05<00:05, 15.35it/s]

Validating:  49%|████▉     | 82/167 [00:05<00:06, 13.37it/s]
Epoch 3:  43%|████▎     | 2596/5971 [23:46<30:53,  1.82it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  50%|█████     | 84/167 [00:05<00:06, 12.44it/s]

Validating:  51%|█████▏    | 86/167 [00:05<00:06, 13.24it/s]
Epoch 3:  44%|████▎     | 2600/5971 [23:46<30:48,  1.82it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  53%|█████▎    | 88/167 [00:06<00:07, 10.83it/s]

Validating:  54%|█████▍    | 90/167 [00:06<00:06, 12.37it/s]
Epoch 3:  44%|████▎     | 2604/5971 [23:46<30:43,  1.83it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  55%|█████▌    | 92/167 [00:06<00:06, 12.18it/s]

Validating:  57%|█████▋    | 95/167 [00:06<00:04, 14.66it/s]
Epoch 3:  44%|████▎     | 2608/5971 [23:46<30:39,  1.83it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  58%|█████▊    | 97/167 [00:06<00:05, 12.63it/s]

Validating:  59%|█████▉    | 99/167 [00:06<00:05, 12.19it/s]
Epoch 3:  44%|████▎     | 2612/5971 [23:47<30:34,  1.83it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  60%|██████    | 101/167 [00:07<00:05, 13.14it/s]

Validating:  62%|██████▏   | 103/167 [00:07<00:06,  9.24it/s]
Epoch 3:  44%|████▍     | 2616/5971 [23:47<30:30,  1.83it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  63%|██████▎   | 105/167 [00:07<00:05, 10.79it/s]

Validating:  64%|██████▍   | 107/167 [00:07<00:05, 11.96it/s]
Epoch 3:  44%|████▍     | 2620/5971 [23:47<30:25,  1.84it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  65%|██████▌   | 109/167 [00:07<00:04, 13.12it/s]

Validating:  66%|██████▋   | 111/167 [00:08<00:05, 10.58it/s]
Epoch 3:  44%|████▍     | 2624/5971 [23:48<30:21,  1.84it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  68%|██████▊   | 113/167 [00:08<00:04, 11.56it/s]

Validating:  69%|██████▉   | 115/167 [00:08<00:04, 11.75it/s]
Epoch 3:  44%|████▍     | 2628/5971 [23:48<30:16,  1.84it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  70%|███████   | 117/167 [00:08<00:03, 12.52it/s]

Validating:  71%|███████▏  | 119/167 [00:08<00:03, 13.74it/s]
Epoch 3:  44%|████▍     | 2632/5971 [23:48<30:12,  1.84it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  73%|███████▎  | 122/167 [00:08<00:02, 15.94it/s]
Epoch 3:  44%|████▍     | 2636/5971 [23:49<30:07,  1.85it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  74%|███████▍  | 124/167 [00:08<00:02, 16.63it/s]

Validating:  75%|███████▌  | 126/167 [00:08<00:02, 16.09it/s]
Epoch 3:  44%|████▍     | 2640/5971 [23:49<30:02,  1.85it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  77%|███████▋  | 128/167 [00:09<00:02, 15.36it/s]

Validating:  78%|███████▊  | 130/167 [00:09<00:02, 16.00it/s]
Epoch 3:  44%|████▍     | 2644/5971 [23:49<29:58,  1.85it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  79%|███████▉  | 132/167 [00:09<00:02, 14.29it/s]

Validating:  80%|████████  | 134/167 [00:09<00:02, 14.22it/s]
Epoch 3:  44%|████▍     | 2648/5971 [23:49<29:53,  1.85it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  81%|████████▏ | 136/167 [00:09<00:02, 14.02it/s]

Validating:  83%|████████▎ | 138/167 [00:09<00:01, 15.04it/s]
Epoch 3:  44%|████▍     | 2652/5971 [23:50<29:49,  1.85it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  84%|████████▍ | 140/167 [00:09<00:01, 15.49it/s]

Validating:  85%|████████▌ | 142/167 [00:10<00:01, 16.55it/s]
Epoch 3:  44%|████▍     | 2656/5971 [23:50<29:44,  1.86it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  86%|████████▌ | 144/167 [00:10<00:01, 16.09it/s]

Validating:  87%|████████▋ | 146/167 [00:10<00:01, 14.72it/s]
Epoch 3:  45%|████▍     | 2660/5971 [23:50<29:40,  1.86it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  89%|████████▊ | 148/167 [00:10<00:01, 13.32it/s]

Validating:  90%|████████▉ | 150/167 [00:10<00:01, 11.41it/s]
Epoch 3:  45%|████▍     | 2664/5971 [23:51<29:36,  1.86it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  91%|█████████ | 152/167 [00:11<00:01,  9.80it/s]

Validating:  93%|█████████▎| 155/167 [00:11<00:00, 12.57it/s]
Epoch 3:  45%|████▍     | 2668/5971 [23:51<29:31,  1.86it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  94%|█████████▍| 157/167 [00:11<00:00, 12.78it/s]

Validating:  95%|█████████▌| 159/167 [00:11<00:00, 11.31it/s]
Epoch 3:  45%|████▍     | 2672/5971 [23:51<29:27,  1.87it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  96%|█████████▋| 161/167 [00:11<00:00, 10.69it/s]

Validating:  98%|█████████▊| 163/167 [00:11<00:00, 10.90it/s]
Epoch 3:  45%|████▍     | 2676/5971 [23:52<29:22,  1.87it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  99%|█████████▉| 165/167 [00:12<00:00, 12.17it/s]

Validating: 100%|██████████| 167/167 [00:12<00:00, 12.65it/s]
Epoch 3:  45%|████▍     | 2680/5971 [23:52<29:18,  1.87it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▍     | 2680/5971 [23:53<29:19,  1.87it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000142, train/loss_step=0.0406, global_step=1974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Epoch 3:  45%|████▍     | 2681/5971 [23:54<29:20,  1.87it/s, loss=0.155, v_num=0, train/loss_simple_step=0.336, train/loss_vlb_step=0.00217, train/loss_step=0.336, global_step=1975.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  45%|████▍     | 2682/5971 [23:56<29:20,  1.87it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00807, train/loss_vlb_step=3.51e-5, train/loss_step=0.00807, global_step=1975.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▍     | 2683/5971 [23:57<29:21,  1.87it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00396, train/loss_vlb_step=2.14e-5, train/loss_step=0.00396, global_step=1975.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▍     | 2683/5971 [24:07<29:33,  1.85it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00396, train/loss_vlb_step=2.14e-5, train/loss_step=0.00396, global_step=1975.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▍     | 2684/5971 [24:08<29:33,  1.85it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00396, train/loss_vlb_step=2.14e-5, train/loss_step=0.00396, global_step=1975.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▍     | 2684/5971 [24:08<29:33,  1.85it/s, loss=0.16, v_num=0, train/loss_simple_step=0.223, train/loss_vlb_step=0.000886, train/loss_step=0.223, global_step=1975.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  45%|████▍     | 2685/5971 [24:09<29:33,  1.85it/s, loss=0.16, v_num=0, train/loss_simple_step=0.223, train/loss_vlb_step=0.000886, train/loss_step=0.223, global_step=1975.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▍     | 2685/5971 [24:09<29:33,  1.85it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0145, train/loss_vlb_step=5.89e-5, train/loss_step=0.0145, global_step=1976.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▍     | 2686/5971 [24:11<29:34,  1.85it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0145, train/loss_vlb_step=5.89e-5, train/loss_step=0.0145, global_step=1976.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▍     | 2686/5971 [24:11<29:34,  1.85it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0221, train/loss_vlb_step=9.16e-5, train/loss_step=0.0221, global_step=1976.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2687/5971 [24:12<29:34,  1.85it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0221, train/loss_vlb_step=9.16e-5, train/loss_step=0.0221, global_step=1976.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2687/5971 [24:12<29:34,  1.85it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00144, train/loss_vlb_step=8.73e-6, train/loss_step=0.00144, global_step=1976.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2688/5971 [24:16<29:37,  1.85it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00144, train/loss_vlb_step=8.73e-6, train/loss_step=0.00144, global_step=1976.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2688/5971 [24:16<29:37,  1.85it/s, loss=0.135, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000583, train/loss_step=0.166, global_step=1976.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  45%|████▌     | 2689/5971 [24:17<29:38,  1.85it/s, loss=0.135, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000583, train/loss_step=0.166, global_step=1976.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2689/5971 [24:17<29:38,  1.85it/s, loss=0.152, v_num=0, train/loss_simple_step=0.433, train/loss_vlb_step=0.00287, train/loss_step=0.433, global_step=1977.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  45%|████▌     | 2690/5971 [24:19<29:39,  1.84it/s, loss=0.152, v_num=0, train/loss_simple_step=0.433, train/loss_vlb_step=0.00287, train/loss_step=0.433, global_step=1977.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2690/5971 [24:19<29:39,  1.84it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.29e-6, train/loss_step=0.00156, global_step=1977.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2691/5971 [24:21<29:40,  1.84it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.29e-6, train/loss_step=0.00156, global_step=1977.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2691/5971 [24:21<29:40,  1.84it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00405, train/loss_vlb_step=2.14e-5, train/loss_step=0.00405, global_step=1977.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2692/5971 [24:25<29:44,  1.84it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00405, train/loss_vlb_step=2.14e-5, train/loss_step=0.00405, global_step=1977.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2692/5971 [24:25<29:44,  1.84it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0587, train/loss_vlb_step=0.000199, train/loss_step=0.0587, global_step=1977.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  45%|████▌     | 2693/5971 [24:27<29:45,  1.84it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0587, train/loss_vlb_step=0.000199, train/loss_step=0.0587, global_step=1977.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2693/5971 [24:27<29:45,  1.84it/s, loss=0.132, v_num=0, train/loss_simple_step=0.182, train/loss_vlb_step=0.000667, train/loss_step=0.182, global_step=1978.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  45%|████▌     | 2694/5971 [24:29<29:46,  1.83it/s, loss=0.132, v_num=0, train/loss_simple_step=0.182, train/loss_vlb_step=0.000667, train/loss_step=0.182, global_step=1978.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2694/5971 [24:29<29:46,  1.83it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0034, train/loss_vlb_step=1.87e-5, train/loss_step=0.0034, global_step=1978.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2695/5971 [24:30<29:46,  1.83it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0034, train/loss_vlb_step=1.87e-5, train/loss_step=0.0034, global_step=1978.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2695/5971 [24:30<29:46,  1.83it/s, loss=0.127, v_num=0, train/loss_simple_step=0.595, train/loss_vlb_step=0.00948, train/loss_step=0.595, global_step=1978.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  45%|████▌     | 2696/5971 [24:36<29:53,  1.83it/s, loss=0.127, v_num=0, train/loss_simple_step=0.595, train/loss_vlb_step=0.00948, train/loss_step=0.595, global_step=1978.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2696/5971 [24:36<29:53,  1.83it/s, loss=0.136, v_num=0, train/loss_simple_step=0.339, train/loss_vlb_step=0.00156, train/loss_step=0.339, global_step=1978.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2697/5971 [24:38<29:53,  1.83it/s, loss=0.136, v_num=0, train/loss_simple_step=0.339, train/loss_vlb_step=0.00156, train/loss_step=0.339, global_step=1978.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2697/5971 [24:38<29:53,  1.83it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=4.94e-5, train/loss_step=0.0112, global_step=1979.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2698/5971 [24:39<29:53,  1.82it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=4.94e-5, train/loss_step=0.0112, global_step=1979.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2698/5971 [24:39<29:53,  1.82it/s, loss=0.13, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000509, train/loss_step=0.151, global_step=1979.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  45%|████▌     | 2699/5971 [24:40<29:53,  1.82it/s, loss=0.13, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000509, train/loss_step=0.151, global_step=1979.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2699/5971 [24:40<29:53,  1.82it/s, loss=0.136, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000414, train/loss_step=0.125, global_step=1979.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2700/5971 [24:43<29:57,  1.82it/s, loss=0.136, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000414, train/loss_step=0.125, global_step=1979.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2700/5971 [24:43<29:57,  1.82it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00182, train/loss_vlb_step=1.08e-5, train/loss_step=0.00182, global_step=1979.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2701/5971 [24:45<29:57,  1.82it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00182, train/loss_vlb_step=1.08e-5, train/loss_step=0.00182, global_step=1979.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2701/5971 [24:45<29:57,  1.82it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0162, train/loss_vlb_step=6.87e-5, train/loss_step=0.0162, global_step=1980.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  45%|████▌     | 2702/5971 [24:46<29:58,  1.82it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0162, train/loss_vlb_step=6.87e-5, train/loss_step=0.0162, global_step=1980.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2702/5971 [24:46<29:58,  1.82it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00678, train/loss_vlb_step=3.4e-5, train/loss_step=0.00678, global_step=1980.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2703/5971 [24:48<29:58,  1.82it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00678, train/loss_vlb_step=3.4e-5, train/loss_step=0.00678, global_step=1980.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2703/5971 [24:48<29:58,  1.82it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00475, train/loss_vlb_step=2.28e-5, train/loss_step=0.00475, global_step=1980.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2704/5971 [24:52<30:02,  1.81it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00475, train/loss_vlb_step=2.28e-5, train/loss_step=0.00475, global_step=1980.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2704/5971 [24:52<30:02,  1.81it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0637, train/loss_vlb_step=0.00021, train/loss_step=0.0637, global_step=1980.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  45%|████▌     | 2705/5971 [24:54<30:03,  1.81it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0637, train/loss_vlb_step=0.00021, train/loss_step=0.0637, global_step=1980.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2705/5971 [24:54<30:04,  1.81it/s, loss=0.139, v_num=0, train/loss_simple_step=0.593, train/loss_vlb_step=0.00787, train/loss_step=0.593, global_step=1981.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  45%|████▌     | 2706/5971 [24:56<30:04,  1.81it/s, loss=0.139, v_num=0, train/loss_simple_step=0.593, train/loss_vlb_step=0.00787, train/loss_step=0.593, global_step=1981.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2706/5971 [24:56<30:04,  1.81it/s, loss=0.149, v_num=0, train/loss_simple_step=0.223, train/loss_vlb_step=0.000852, train/loss_step=0.223, global_step=1981.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2707/5971 [24:57<30:05,  1.81it/s, loss=0.149, v_num=0, train/loss_simple_step=0.223, train/loss_vlb_step=0.000852, train/loss_step=0.223, global_step=1981.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2707/5971 [24:57<30:05,  1.81it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0188, train/loss_vlb_step=7.22e-5, train/loss_step=0.0188, global_step=1981.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2708/5971 [25:01<30:09,  1.80it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0188, train/loss_vlb_step=7.22e-5, train/loss_step=0.0188, global_step=1981.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2708/5971 [25:01<30:09,  1.80it/s, loss=0.149, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000451, train/loss_step=0.137, global_step=1981.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2709/5971 [25:03<30:09,  1.80it/s, loss=0.149, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000451, train/loss_step=0.137, global_step=1981.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2709/5971 [25:03<30:09,  1.80it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00618, train/loss_vlb_step=3.02e-5, train/loss_step=0.00618, global_step=1982.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2710/5971 [25:04<30:10,  1.80it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00618, train/loss_vlb_step=3.02e-5, train/loss_step=0.00618, global_step=1982.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2710/5971 [25:04<30:10,  1.80it/s, loss=0.132, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000342, train/loss_step=0.104, global_step=1982.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  45%|████▌     | 2711/5971 [25:06<30:10,  1.80it/s, loss=0.132, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000342, train/loss_step=0.104, global_step=1982.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2711/5971 [25:06<30:10,  1.80it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0424, train/loss_vlb_step=0.000156, train/loss_step=0.0424, global_step=1982.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2712/5971 [25:10<30:14,  1.80it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0424, train/loss_vlb_step=0.000156, train/loss_step=0.0424, global_step=1982.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2712/5971 [25:10<30:14,  1.80it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0417, train/loss_vlb_step=0.000148, train/loss_step=0.0417, global_step=1982.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2713/5971 [25:12<30:15,  1.79it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0417, train/loss_vlb_step=0.000148, train/loss_step=0.0417, global_step=1982.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2713/5971 [25:12<30:15,  1.79it/s, loss=0.136, v_num=0, train/loss_simple_step=0.239, train/loss_vlb_step=0.00107, train/loss_step=0.239, global_step=1983.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  45%|████▌     | 2714/5971 [25:13<30:16,  1.79it/s, loss=0.136, v_num=0, train/loss_simple_step=0.239, train/loss_vlb_step=0.00107, train/loss_step=0.239, global_step=1983.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2714/5971 [25:13<30:16,  1.79it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00198, train/loss_vlb_step=1.17e-5, train/loss_step=0.00198, global_step=1983.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2715/5971 [25:15<30:16,  1.79it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00198, train/loss_vlb_step=1.17e-5, train/loss_step=0.00198, global_step=1983.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2715/5971 [25:15<30:16,  1.79it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00205, train/loss_vlb_step=1.23e-5, train/loss_step=0.00205, global_step=1983.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2716/5971 [25:18<30:19,  1.79it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00205, train/loss_vlb_step=1.23e-5, train/loss_step=0.00205, global_step=1983.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  45%|████▌     | 2716/5971 [25:18<30:19,  1.79it/s, loss=0.0899, v_num=0, train/loss_simple_step=0.00868, train/loss_vlb_step=3.93e-5, train/loss_step=0.00868, global_step=1983.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2717/5971 [25:20<30:20,  1.79it/s, loss=0.0899, v_num=0, train/loss_simple_step=0.00868, train/loss_vlb_step=3.93e-5, train/loss_step=0.00868, global_step=1983.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2717/5971 [25:20<30:20,  1.79it/s, loss=0.0903, v_num=0, train/loss_simple_step=0.0174, train/loss_vlb_step=7.03e-5, train/loss_step=0.0174, global_step=1984.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  46%|████▌     | 2718/5971 [25:21<30:20,  1.79it/s, loss=0.0903, v_num=0, train/loss_simple_step=0.0174, train/loss_vlb_step=7.03e-5, train/loss_step=0.0174, global_step=1984.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2718/5971 [25:21<30:20,  1.79it/s, loss=0.0964, v_num=0, train/loss_simple_step=0.275, train/loss_vlb_step=0.0011, train/loss_step=0.275, global_step=1984.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  46%|████▌     | 2719/5971 [25:23<30:21,  1.79it/s, loss=0.0964, v_num=0, train/loss_simple_step=0.275, train/loss_vlb_step=0.0011, train/loss_step=0.275, global_step=1984.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2719/5971 [25:23<30:21,  1.79it/s, loss=0.11, v_num=0, train/loss_simple_step=0.398, train/loss_vlb_step=0.00203, train/loss_step=0.398, global_step=1984.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  46%|████▌     | 2720/5971 [25:26<30:23,  1.78it/s, loss=0.11, v_num=0, train/loss_simple_step=0.398, train/loss_vlb_step=0.00203, train/loss_step=0.398, global_step=1984.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2720/5971 [25:26<30:23,  1.78it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0331, train/loss_vlb_step=0.000124, train/loss_step=0.0331, global_step=1984.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2721/5971 [25:27<30:23,  1.78it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0331, train/loss_vlb_step=0.000124, train/loss_step=0.0331, global_step=1984.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2721/5971 [25:27<30:23,  1.78it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0352, train/loss_vlb_step=0.000137, train/loss_step=0.0352, global_step=1985.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2722/5971 [25:29<30:24,  1.78it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0352, train/loss_vlb_step=0.000137, train/loss_step=0.0352, global_step=1985.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2722/5971 [25:29<30:24,  1.78it/s, loss=0.138, v_num=0, train/loss_simple_step=0.516, train/loss_vlb_step=0.00917, train/loss_step=0.516, global_step=1985.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  46%|████▌     | 2723/5971 [25:30<30:25,  1.78it/s, loss=0.138, v_num=0, train/loss_simple_step=0.516, train/loss_vlb_step=0.00917, train/loss_step=0.516, global_step=1985.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2723/5971 [25:30<30:25,  1.78it/s, loss=0.151, v_num=0, train/loss_simple_step=0.258, train/loss_vlb_step=0.00107, train/loss_step=0.258, global_step=1985.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2724/5971 [25:33<30:27,  1.78it/s, loss=0.151, v_num=0, train/loss_simple_step=0.258, train/loss_vlb_step=0.00107, train/loss_step=0.258, global_step=1985.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2724/5971 [25:33<30:27,  1.78it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.41e-5, train/loss_step=0.0132, global_step=1985.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2725/5971 [25:35<30:28,  1.78it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.41e-5, train/loss_step=0.0132, global_step=1985.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2725/5971 [25:35<30:28,  1.78it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00342, train/loss_vlb_step=1.87e-5, train/loss_step=0.00342, global_step=1986.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2726/5971 [25:37<30:29,  1.77it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00342, train/loss_vlb_step=1.87e-5, train/loss_step=0.00342, global_step=1986.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2726/5971 [25:37<30:29,  1.77it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0743, train/loss_vlb_step=0.000259, train/loss_step=0.0743, global_step=1986.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  46%|████▌     | 2727/5971 [25:38<30:29,  1.77it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0743, train/loss_vlb_step=0.000259, train/loss_step=0.0743, global_step=1986.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2727/5971 [25:38<30:29,  1.77it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00286, train/loss_vlb_step=1.65e-5, train/loss_step=0.00286, global_step=1986.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2728/5971 [25:42<30:32,  1.77it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00286, train/loss_vlb_step=1.65e-5, train/loss_step=0.00286, global_step=1986.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2728/5971 [25:42<30:32,  1.77it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00256, train/loss_vlb_step=1.44e-5, train/loss_step=0.00256, global_step=1986.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2729/5971 [25:43<30:33,  1.77it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00256, train/loss_vlb_step=1.44e-5, train/loss_step=0.00256, global_step=1986.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2729/5971 [25:43<30:33,  1.77it/s, loss=0.111, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.00049, train/loss_step=0.142, global_step=1987.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  46%|████▌     | 2730/5971 [25:45<30:33,  1.77it/s, loss=0.111, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.00049, train/loss_step=0.142, global_step=1987.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2730/5971 [25:45<30:33,  1.77it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0131, train/loss_vlb_step=5.78e-5, train/loss_step=0.0131, global_step=1987.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2731/5971 [25:46<30:34,  1.77it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0131, train/loss_vlb_step=5.78e-5, train/loss_step=0.0131, global_step=1987.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2731/5971 [25:46<30:34,  1.77it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00141, train/loss_vlb_step=8.48e-6, train/loss_step=0.00141, global_step=1987.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2732/5971 [25:49<30:36,  1.76it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00141, train/loss_vlb_step=8.48e-6, train/loss_step=0.00141, global_step=1987.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2732/5971 [25:49<30:36,  1.76it/s, loss=0.109, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.00049, train/loss_step=0.148, global_step=1987.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  46%|████▌     | 2733/5971 [25:51<30:37,  1.76it/s, loss=0.109, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.00049, train/loss_step=0.148, global_step=1987.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2733/5971 [25:51<30:37,  1.76it/s, loss=0.107, v_num=0, train/loss_simple_step=0.198, train/loss_vlb_step=0.000662, train/loss_step=0.198, global_step=1988.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2734/5971 [25:52<30:37,  1.76it/s, loss=0.107, v_num=0, train/loss_simple_step=0.198, train/loss_vlb_step=0.000662, train/loss_step=0.198, global_step=1988.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2734/5971 [25:52<30:37,  1.76it/s, loss=0.113, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000359, train/loss_step=0.108, global_step=1988.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2735/5971 [25:53<30:37,  1.76it/s, loss=0.113, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000359, train/loss_step=0.108, global_step=1988.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2735/5971 [25:53<30:37,  1.76it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00378, train/loss_vlb_step=2.04e-5, train/loss_step=0.00378, global_step=1988.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2736/5971 [25:57<30:41,  1.76it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00378, train/loss_vlb_step=2.04e-5, train/loss_step=0.00378, global_step=1988.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2736/5971 [25:57<30:41,  1.76it/s, loss=0.119, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000417, train/loss_step=0.127, global_step=1988.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  46%|████▌     | 2737/5971 [25:59<30:41,  1.76it/s, loss=0.119, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000417, train/loss_step=0.127, global_step=1988.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2737/5971 [25:59<30:41,  1.76it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00279, train/loss_vlb_step=1.6e-5, train/loss_step=0.00279, global_step=1989.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2738/5971 [26:00<30:42,  1.76it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00279, train/loss_vlb_step=1.6e-5, train/loss_step=0.00279, global_step=1989.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2738/5971 [26:00<30:42,  1.76it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0693, train/loss_vlb_step=0.000234, train/loss_step=0.0693, global_step=1989.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2739/5971 [26:01<30:42,  1.75it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0693, train/loss_vlb_step=0.000234, train/loss_step=0.0693, global_step=1989.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2739/5971 [26:01<30:42,  1.75it/s, loss=0.107, v_num=0, train/loss_simple_step=0.383, train/loss_vlb_step=0.00273, train/loss_step=0.383, global_step=1989.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  46%|████▌     | 2740/5971 [26:05<30:45,  1.75it/s, loss=0.107, v_num=0, train/loss_simple_step=0.383, train/loss_vlb_step=0.00273, train/loss_step=0.383, global_step=1989.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2740/5971 [26:05<30:45,  1.75it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00877, train/loss_vlb_step=4.07e-5, train/loss_step=0.00877, global_step=1989.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2741/5971 [26:10<30:49,  1.75it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00877, train/loss_vlb_step=4.07e-5, train/loss_step=0.00877, global_step=1989.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2741/5971 [26:10<30:49,  1.75it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0326, train/loss_vlb_step=0.000113, train/loss_step=0.0326, global_step=1990.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  46%|████▌     | 2742/5971 [26:11<30:50,  1.75it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0326, train/loss_vlb_step=0.000113, train/loss_step=0.0326, global_step=1990.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2742/5971 [26:11<30:50,  1.75it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.0129, train/loss_vlb_step=5.83e-5, train/loss_step=0.0129, global_step=1990.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2743/5971 [26:13<30:50,  1.74it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.0129, train/loss_vlb_step=5.83e-5, train/loss_step=0.0129, global_step=1990.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2743/5971 [26:13<30:50,  1.74it/s, loss=0.0705, v_num=0, train/loss_simple_step=0.063, train/loss_vlb_step=0.000217, train/loss_step=0.063, global_step=1990.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  46%|████▌     | 2744/5971 [26:16<30:53,  1.74it/s, loss=0.0705, v_num=0, train/loss_simple_step=0.063, train/loss_vlb_step=0.000217, train/loss_step=0.063, global_step=1990.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2744/5971 [26:16<30:53,  1.74it/s, loss=0.0737, v_num=0, train/loss_simple_step=0.078, train/loss_vlb_step=0.000258, train/loss_step=0.078, global_step=1990.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2745/5971 [26:17<30:53,  1.74it/s, loss=0.0737, v_num=0, train/loss_simple_step=0.078, train/loss_vlb_step=0.000258, train/loss_step=0.078, global_step=1990.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2745/5971 [26:17<30:53,  1.74it/s, loss=0.0766, v_num=0, train/loss_simple_step=0.0606, train/loss_vlb_step=0.000205, train/loss_step=0.0606, global_step=1991.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2746/5971 [26:19<30:54,  1.74it/s, loss=0.0766, v_num=0, train/loss_simple_step=0.0606, train/loss_vlb_step=0.000205, train/loss_step=0.0606, global_step=1991.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2746/5971 [26:19<30:54,  1.74it/s, loss=0.0819, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000634, train/loss_step=0.181, global_step=1991.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  46%|████▌     | 2747/5971 [26:20<30:54,  1.74it/s, loss=0.0819, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000634, train/loss_step=0.181, global_step=1991.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2747/5971 [26:20<30:54,  1.74it/s, loss=0.0948, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.00103, train/loss_step=0.262, global_step=1991.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  46%|████▌     | 2748/5971 [26:23<30:56,  1.74it/s, loss=0.0948, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.00103, train/loss_step=0.262, global_step=1991.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2748/5971 [26:23<30:56,  1.74it/s, loss=0.0948, v_num=0, train/loss_simple_step=0.00222, train/loss_vlb_step=1.26e-5, train/loss_step=0.00222, global_step=1991.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2749/5971 [26:25<30:57,  1.73it/s, loss=0.0948, v_num=0, train/loss_simple_step=0.00222, train/loss_vlb_step=1.26e-5, train/loss_step=0.00222, global_step=1991.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2749/5971 [26:25<30:57,  1.73it/s, loss=0.128, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.042, train/loss_step=0.812, global_step=1992.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]       
Epoch 3:  46%|████▌     | 2750/5971 [26:26<30:57,  1.73it/s, loss=0.128, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.042, train/loss_step=0.812, global_step=1992.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2750/5971 [26:26<30:57,  1.73it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0134, train/loss_vlb_step=5.71e-5, train/loss_step=0.0134, global_step=1992.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2751/5971 [26:27<30:57,  1.73it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0134, train/loss_vlb_step=5.71e-5, train/loss_step=0.0134, global_step=1992.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2751/5971 [26:27<30:57,  1.73it/s, loss=0.14, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.00085, train/loss_step=0.230, global_step=1992.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  46%|████▌     | 2752/5971 [26:31<31:00,  1.73it/s, loss=0.14, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.00085, train/loss_step=0.230, global_step=1992.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2752/5971 [26:31<31:00,  1.73it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0413, train/loss_vlb_step=0.000146, train/loss_step=0.0413, global_step=1992.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2753/5971 [26:33<31:01,  1.73it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0413, train/loss_vlb_step=0.000146, train/loss_step=0.0413, global_step=1992.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2753/5971 [26:33<31:01,  1.73it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0843, train/loss_vlb_step=0.000284, train/loss_step=0.0843, global_step=1993.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2754/5971 [26:34<31:02,  1.73it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0843, train/loss_vlb_step=0.000284, train/loss_step=0.0843, global_step=1993.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2754/5971 [26:34<31:02,  1.73it/s, loss=0.134, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000745, train/loss_step=0.204, global_step=1993.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  46%|████▌     | 2755/5971 [26:36<31:02,  1.73it/s, loss=0.134, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000745, train/loss_step=0.204, global_step=1993.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2755/5971 [26:36<31:02,  1.73it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.26e-5, train/loss_step=0.00216, global_step=1993.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2756/5971 [26:39<31:05,  1.72it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.26e-5, train/loss_step=0.00216, global_step=1993.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2756/5971 [26:39<31:05,  1.72it/s, loss=0.162, v_num=0, train/loss_simple_step=0.706, train/loss_vlb_step=0.0142, train/loss_step=0.706, global_step=1993.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:  46%|████▌     | 2757/5971 [26:41<31:05,  1.72it/s, loss=0.162, v_num=0, train/loss_simple_step=0.706, train/loss_vlb_step=0.0142, train/loss_step=0.706, global_step=1993.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2757/5971 [26:41<31:05,  1.72it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0735, train/loss_vlb_step=0.000249, train/loss_step=0.0735, global_step=1994.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2758/5971 [26:42<31:06,  1.72it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0735, train/loss_vlb_step=0.000249, train/loss_step=0.0735, global_step=1994.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2758/5971 [26:42<31:06,  1.72it/s, loss=0.163, v_num=0, train/loss_simple_step=0.012, train/loss_vlb_step=5.43e-5, train/loss_step=0.012, global_step=1994.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  46%|████▌     | 2759/5971 [26:43<31:06,  1.72it/s, loss=0.163, v_num=0, train/loss_simple_step=0.012, train/loss_vlb_step=5.43e-5, train/loss_step=0.012, global_step=1994.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2759/5971 [26:43<31:06,  1.72it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0475, train/loss_vlb_step=0.000162, train/loss_step=0.0475, global_step=1994.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2760/5971 [26:46<31:08,  1.72it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0475, train/loss_vlb_step=0.000162, train/loss_step=0.0475, global_step=1994.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2760/5971 [26:46<31:08,  1.72it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000178, train/loss_step=0.0497, global_step=1994.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2761/5971 [26:48<31:09,  1.72it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000178, train/loss_step=0.0497, global_step=1994.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▌     | 2761/5971 [26:48<31:09,  1.72it/s, loss=0.154, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000501, train/loss_step=0.152, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  46%|████▋     | 2762/5971 [26:49<31:09,  1.72it/s, loss=0.154, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000501, train/loss_step=0.152, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▋     | 2762/5971 [26:49<31:09,  1.72it/s, loss=0.168, v_num=0, train/loss_simple_step=0.290, train/loss_vlb_step=0.00121, train/loss_step=0.290, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  46%|████▋     | 2763/5971 [26:50<31:09,  1.72it/s, loss=0.168, v_num=0, train/loss_simple_step=0.290, train/loss_vlb_step=0.00121, train/loss_step=0.290, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▋     | 2763/5971 [26:50<31:09,  1.72it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0641, train/loss_vlb_step=0.00022, train/loss_step=0.0641, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▋     | 2764/5971 [26:54<31:12,  1.71it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0641, train/loss_vlb_step=0.00022, train/loss_step=0.0641, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▋     | 2764/5971 [26:54<31:12,  1.71it/s, loss=0.183, v_num=0, train/loss_simple_step=0.368, train/loss_vlb_step=0.00212, train/loss_step=0.368, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  46%|████▋     | 2765/5971 [26:55<31:13,  1.71it/s, loss=0.183, v_num=0, train/loss_simple_step=0.368, train/loss_vlb_step=0.00212, train/loss_step=0.368, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▋     | 2765/5971 [26:55<31:13,  1.71it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0208, train/loss_vlb_step=8.25e-5, train/loss_step=0.0208, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▋     | 2766/5971 [26:57<31:13,  1.71it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0208, train/loss_vlb_step=8.25e-5, train/loss_step=0.0208, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▋     | 2766/5971 [26:57<31:13,  1.71it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00704, train/loss_vlb_step=3.41e-5, train/loss_step=0.00704, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▋     | 2767/5971 [26:58<31:13,  1.71it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00704, train/loss_vlb_step=3.41e-5, train/loss_step=0.00704, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▋     | 2767/5971 [26:58<31:13,  1.71it/s, loss=0.183, v_num=0, train/loss_simple_step=0.471, train/loss_vlb_step=0.00521, train/loss_step=0.471, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  46%|████▋     | 2768/5971 [27:02<31:16,  1.71it/s, loss=0.183, v_num=0, train/loss_simple_step=0.471, train/loss_vlb_step=0.00521, train/loss_step=0.471, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▋     | 2768/5971 [27:02<31:16,  1.71it/s, loss=0.189, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000422, train/loss_step=0.128, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▋     | 2769/5971 [27:03<31:16,  1.71it/s, loss=0.189, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000422, train/loss_step=0.128, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▋     | 2769/5971 [27:03<31:16,  1.71it/s, loss=0.157, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000604, train/loss_step=0.174, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▋     | 2770/5971 [27:04<31:16,  1.71it/s, loss=0.157, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000604, train/loss_step=0.174, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▋     | 2770/5971 [27:04<31:16,  1.71it/s, loss=0.158, v_num=0, train/loss_simple_step=0.040, train/loss_vlb_step=0.000141, train/loss_step=0.040, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▋     | 2771/5971 [27:05<31:16,  1.70it/s, loss=0.158, v_num=0, train/loss_simple_step=0.040, train/loss_vlb_step=0.000141, train/loss_step=0.040, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▋     | 2771/5971 [27:05<31:16,  1.70it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.97e-5, train/loss_step=0.0127, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▋     | 2772/5971 [27:10<31:20,  1.70it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.97e-5, train/loss_step=0.0127, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▋     | 2772/5971 [27:10<31:20,  1.70it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00255, train/loss_vlb_step=1.48e-5, train/loss_step=0.00255, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▋     | 2773/5971 [27:11<31:21,  1.70it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00255, train/loss_vlb_step=1.48e-5, train/loss_step=0.00255, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▋     | 2773/5971 [27:11<31:21,  1.70it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0605, train/loss_vlb_step=0.000208, train/loss_step=0.0605, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  46%|████▋     | 2774/5971 [27:12<31:21,  1.70it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0605, train/loss_vlb_step=0.000208, train/loss_step=0.0605, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▋     | 2774/5971 [27:12<31:21,  1.70it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0556, train/loss_vlb_step=0.000196, train/loss_step=0.0556, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▋     | 2775/5971 [27:14<31:21,  1.70it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0556, train/loss_vlb_step=0.000196, train/loss_step=0.0556, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▋     | 2775/5971 [27:14<31:21,  1.70it/s, loss=0.143, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000397, train/loss_step=0.121, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  46%|████▋     | 2776/5971 [27:19<31:26,  1.69it/s, loss=0.143, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000397, train/loss_step=0.121, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  46%|████▋     | 2776/5971 [27:19<31:26,  1.69it/s, loss=0.119, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.00084, train/loss_step=0.227, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  47%|████▋     | 2777/5971 [27:21<31:26,  1.69it/s, loss=0.119, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.00084, train/loss_step=0.227, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  47%|████▋     | 2777/5971 [27:21<31:26,  1.69it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0704, train/loss_vlb_step=0.000254, train/loss_step=0.0704, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  47%|████▋     | 2778/5971 [28:06<32:17,  1.65it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0704, train/loss_vlb_step=0.000254, train/loss_step=0.0704, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  47%|████▋     | 2778/5971 [28:06<32:17,  1.65it/s, loss=0.123, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000343, train/loss_step=0.103, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  47%|████▋     | 2779/5971 [28:08<32:18,  1.65it/s, loss=0.123, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000343, train/loss_step=0.103, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  47%|████▋     | 2779/5971 [28:08<32:18,  1.65it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0228, train/loss_vlb_step=9.19e-5, train/loss_step=0.0228, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  47%|████▋     | 2780/5971 [28:11<32:21,  1.64it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0228, train/loss_vlb_step=9.19e-5, train/loss_step=0.0228, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  47%|████▋     | 2780/5971 [28:11<32:21,  1.64it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 

Validating: 0it [00:00, ?it/s]
Validating:   0%|          | 0/167 [00:00<?, ?it/s]
Validating:   1%|          | 1/167 [00:00<01:37,  1.70it/s]
Epoch 3:  47%|████▋     | 2782/5971 [28:12<32:19,  1.64it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   1%|          | 2/167 [00:01<01:24,  1.96it/s]
Epoch 3:  47%|████▋     | 2784/5971 [28:13<32:17,  1.64it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   3%|▎         | 5/167 [00:01<00:28,  5.63it/s]
Epoch 3:  47%|████▋     | 2787/5971 [28:13<32:13,  1.65it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   5%|▍         | 8/167 [00:01<00:17,  9.23it/s]
Epoch 3:  47%|████▋     | 2790/5971 [28:13<32:10,  1.65it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   7%|▋         | 11/167 [00:01<00:12, 12.14it/s]
Epoch 3:  47%|████▋     | 2793/5971 [28:13<32:06,  1.65it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   8%|▊         | 14/167 [00:01<00:10, 14.59it/s]
Epoch 3:  47%|████▋     | 2796/5971 [28:13<32:02,  1.65it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  10%|▉         | 16/167 [00:01<00:09, 15.46it/s]
Validating:  11%|█         | 18/167 [00:01<00:09, 15.96it/s]
Epoch 3:  47%|████▋     | 2799/5971 [28:13<31:58,  1.65it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  13%|█▎        | 21/167 [00:01<00:08, 17.94it/s]
Epoch 3:  47%|████▋     | 2802/5971 [28:13<31:55,  1.65it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  14%|█▍        | 24/167 [00:02<00:07, 18.64it/s]
Epoch 3:  47%|████▋     | 2805/5971 [28:14<31:51,  1.66it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  16%|█▌        | 27/167 [00:02<00:07, 18.93it/s]
Epoch 3:  47%|████▋     | 2808/5971 [28:14<31:47,  1.66it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  18%|█▊        | 30/167 [00:02<00:06, 20.21it/s]
Epoch 3:  47%|████▋     | 2811/5971 [28:14<31:44,  1.66it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  20%|█▉        | 33/167 [00:02<00:06, 21.38it/s]
Epoch 3:  47%|████▋     | 2814/5971 [28:14<31:40,  1.66it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  22%|██▏       | 36/167 [00:02<00:06, 20.15it/s]
Epoch 3:  47%|████▋     | 2817/5971 [28:14<31:36,  1.66it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  23%|██▎       | 39/167 [00:02<00:06, 20.31it/s]
Epoch 3:  47%|████▋     | 2820/5971 [28:14<31:33,  1.66it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  25%|██▌       | 42/167 [00:02<00:06, 19.07it/s]
Epoch 3:  47%|████▋     | 2823/5971 [28:15<31:29,  1.67it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  26%|██▋       | 44/167 [00:03<00:06, 17.80it/s]
Epoch 3:  47%|████▋     | 2826/5971 [28:15<31:25,  1.67it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  28%|██▊       | 46/167 [00:03<00:06, 17.81it/s]
Epoch 3:  47%|████▋     | 2829/5971 [28:15<31:22,  1.67it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  29%|██▉       | 49/167 [00:03<00:06, 19.64it/s]
Epoch 3:  47%|████▋     | 2832/5971 [28:15<31:18,  1.67it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  31%|███       | 52/167 [00:03<00:05, 20.97it/s][A
Epoch 3:  47%|████▋     | 2835/5971 [28:15<31:14,  1.67it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  33%|███▎      | 55/167 [00:03<00:05, 21.84it/s][A
Epoch 3:  48%|████▊     | 2838/5971 [28:15<31:11,  1.67it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  35%|███▍      | 58/167 [00:03<00:05, 21.03it/s][A
Epoch 3:  48%|████▊     | 2841/5971 [28:15<31:07,  1.68it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  37%|███▋      | 61/167 [00:03<00:04, 21.85it/s][A
Epoch 3:  48%|████▊     | 2844/5971 [28:16<31:04,  1.68it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  38%|███▊      | 64/167 [00:04<00:05, 19.70it/s][A
Epoch 3:  48%|████▊     | 2847/5971 [28:16<31:00,  1.68it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  40%|████      | 67/167 [00:04<00:05, 19.05it/s][A

Validating:  41%|████▏     | 69/167 [00:04<00:05, 19.18it/s][A
Epoch 3:  48%|████▊     | 2850/5971 [28:16<30:57,  1.68it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  43%|████▎     | 71/167 [00:04<00:05, 16.51it/s][A
Epoch 3:  48%|████▊     | 2853/5971 [28:16<30:53,  1.68it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  44%|████▎     | 73/167 [00:04<00:05, 16.81it/s][A
Epoch 3:  48%|████▊     | 2856/5971 [28:16<30:49,  1.68it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  46%|████▌     | 76/167 [00:04<00:04, 18.42it/s][A
Epoch 3:  48%|████▊     | 2859/5971 [28:16<30:46,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  47%|████▋     | 79/167 [00:04<00:04, 19.19it/s][A

Validating:  49%|████▊     | 81/167 [00:05<00:04, 19.31it/s][A
Epoch 3:  48%|████▊     | 2862/5971 [28:17<30:42,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  50%|████▉     | 83/167 [00:05<00:04, 19.04it/s][A
Epoch 3:  48%|████▊     | 2865/5971 [28:17<30:39,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  51%|█████▏    | 86/167 [00:05<00:04, 20.22it/s][A
Epoch 3:  48%|████▊     | 2868/5971 [28:17<30:35,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  53%|█████▎    | 89/167 [00:05<00:03, 20.01it/s][A
Epoch 3:  48%|████▊     | 2871/5971 [28:17<30:32,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  55%|█████▌    | 92/167 [00:05<00:03, 20.12it/s][A
Epoch 3:  48%|████▊     | 2874/5971 [28:17<30:28,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  57%|█████▋    | 95/167 [00:05<00:03, 21.20it/s][A
Epoch 3:  48%|████▊     | 2877/5971 [28:17<30:25,  1.70it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  59%|█████▊    | 98/167 [00:05<00:03, 21.42it/s][A
Epoch 3:  48%|████▊     | 2880/5971 [28:17<30:21,  1.70it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  60%|██████    | 101/167 [00:05<00:03, 20.71it/s][A
Epoch 3:  48%|████▊     | 2883/5971 [28:18<30:18,  1.70it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  62%|██████▏   | 104/167 [00:06<00:02, 21.74it/s][A
Epoch 3:  48%|████▊     | 2886/5971 [28:18<30:14,  1.70it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  64%|██████▍   | 107/167 [00:06<00:02, 21.20it/s][A
Epoch 3:  48%|████▊     | 2889/5971 [28:18<30:11,  1.70it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  66%|██████▌   | 110/167 [00:06<00:02, 20.73it/s][A
Epoch 3:  48%|████▊     | 2892/5971 [28:18<30:07,  1.70it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  68%|██████▊   | 113/167 [00:06<00:02, 20.57it/s][A
Epoch 3:  48%|████▊     | 2895/5971 [28:18<30:04,  1.70it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  69%|██████▉   | 116/167 [00:06<00:02, 21.48it/s][A
Epoch 3:  49%|████▊     | 2898/5971 [28:18<30:00,  1.71it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  71%|███████▏  | 119/167 [00:06<00:02, 20.17it/s][A
Epoch 3:  49%|████▊     | 2901/5971 [28:18<29:57,  1.71it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  73%|███████▎  | 122/167 [00:06<00:02, 19.81it/s][A
Epoch 3:  49%|████▊     | 2904/5971 [28:19<29:53,  1.71it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  75%|███████▍  | 125/167 [00:07<00:02, 19.84it/s][A
Epoch 3:  49%|████▊     | 2907/5971 [28:19<29:50,  1.71it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  77%|███████▋  | 128/167 [00:07<00:01, 20.30it/s][A
Epoch 3:  49%|████▊     | 2910/5971 [28:19<29:46,  1.71it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  78%|███████▊  | 131/167 [00:07<00:01, 20.04it/s][A
Epoch 3:  49%|████▉     | 2913/5971 [28:19<29:43,  1.71it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  80%|████████  | 134/167 [00:07<00:01, 20.59it/s][A
Epoch 3:  49%|████▉     | 2916/5971 [28:19<29:40,  1.72it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  82%|████████▏ | 137/167 [00:07<00:01, 20.72it/s][A
Epoch 3:  49%|████▉     | 2919/5971 [28:19<29:36,  1.72it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  84%|████████▍ | 140/167 [00:07<00:01, 19.49it/s][A
Epoch 3:  49%|████▉     | 2922/5971 [28:19<29:33,  1.72it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  86%|████████▌ | 143/167 [00:08<00:01, 20.35it/s][A
Epoch 3:  49%|████▉     | 2925/5971 [28:20<29:29,  1.72it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  87%|████████▋ | 146/167 [00:08<00:00, 21.13it/s][A
Epoch 3:  49%|████▉     | 2928/5971 [28:20<29:26,  1.72it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  89%|████████▉ | 149/167 [00:08<00:00, 21.30it/s][A
Epoch 3:  49%|████▉     | 2931/5971 [28:20<29:23,  1.72it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  91%|█████████ | 152/167 [00:08<00:00, 21.30it/s][A
Epoch 3:  49%|████▉     | 2934/5971 [28:20<29:19,  1.73it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  93%|█████████▎| 155/167 [00:08<00:00, 21.15it/s][A
Epoch 3:  49%|████▉     | 2937/5971 [28:20<29:16,  1.73it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  95%|█████████▍| 158/167 [00:08<00:00, 20.38it/s][A
Epoch 3:  49%|████▉     | 2940/5971 [28:20<29:12,  1.73it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  96%|█████████▋| 161/167 [00:08<00:00, 20.56it/s][A
Epoch 3:  49%|████▉     | 2943/5971 [28:20<29:09,  1.73it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  98%|█████████▊| 164/167 [00:08<00:00, 21.39it/s][A
Epoch 3:  49%|████▉     | 2946/5971 [28:21<29:06,  1.73it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating: 100%|██████████| 167/167 [00:09<00:00, 18.89it/s][A
Epoch 3:  49%|████▉     | 2948/5971 [28:21<29:04,  1.73it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:01<00:50,  1.03s/it][A

Spaced Sampler:   4%|▍         | 2/50 [00:01<00:34,  1.40it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:02<00:28,  1.63it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:02<00:26,  1.75it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:03<00:24,  1.80it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:03<00:20,  2.17it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:03<00:17,  2.45it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:04<00:16,  2.49it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:04<00:16,  2.53it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:04<00:15,  2.61it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:05<00:14,  2.62it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:05<00:13,  2.88it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:05<00:11,  3.15it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:05<00:10,  3.30it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:06<00:10,  3.45it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:06<00:09,  3.62it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:06<00:09,  3.65it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:06<00:08,  3.73it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:07<00:08,  3.70it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:07<00:07,  3.81it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:07<00:07,  3.93it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:07<00:07,  3.95it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:08<00:08,  3.23it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:08<00:08,  3.24it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:09<00:08,  3.04it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:09<00:08,  2.84it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:09<00:07,  2.96it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:10<00:07,  3.07it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:10<00:06,  3.11it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:10<00:06,  3.12it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:11<00:06,  3.12it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:11<00:05,  3.17it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:11<00:05,  3.22it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:11<00:04,  3.32it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:12<00:04,  3.41it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:12<00:04,  3.47it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:12<00:03,  3.25it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:13<00:03,  3.13it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:13<00:03,  3.13it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:13<00:03,  3.26it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:14<00:02,  3.41it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:14<00:02,  3.45it/s][A
Epoch 3:  49%|████▉     | 2948/5971 [28:37<29:20,  1.72it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Spaced Sampler:  86%|████████▌ | 43/50 [00:14<00:02,  3.14it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:15<00:02,  2.88it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:15<00:01,  2.78it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:15<00:01,  2.77it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:16<00:01,  2.66it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:16<00:00,  2.56it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:16<00:00,  2.75it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:17<00:00,  2.89it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:17<00:00,  2.89it/s]

Epoch 3:  49%|████▉     | 2949/5971 [28:44<29:26,  1.71it/s, loss=0.129, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000627, train/loss_step=0.188, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  49%|████▉     | 2949/5971 [28:44<29:26,  1.71it/s, loss=0.129, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:38,  1.26it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:01<00:25,  1.85it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:21,  2.16it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:19,  2.38it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:02<00:18,  2.45it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:02<00:17,  2.50it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:03<00:19,  2.26it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:03<00:18,  2.24it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:03<00:16,  2.43it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:04<00:17,  2.34it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:04<00:17,  2.17it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:05<00:16,  2.28it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:05<00:15,  2.35it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:06<00:14,  2.40it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:06<00:14,  2.45it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:06<00:13,  2.52it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:07<00:13,  2.51it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:07<00:12,  2.55it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:08<00:12,  2.54it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:08<00:11,  2.53it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:08<00:11,  2.53it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:09<00:11,  2.51it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:09<00:10,  2.53it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:09<00:09,  2.70it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:10<00:08,  2.87it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:10<00:08,  2.96it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:10<00:07,  2.96it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:11<00:07,  2.76it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:11<00:07,  2.73it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:12<00:07,  2.78it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:12<00:06,  2.85it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:12<00:06,  2.77it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:13<00:06,  2.69it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:13<00:05,  2.69it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:13<00:05,  2.71it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:14<00:05,  2.66it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:14<00:04,  2.62it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:15<00:04,  2.63it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:15<00:04,  2.61it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:15<00:03,  2.54it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:16<00:03,  2.50it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:16<00:03,  2.49it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:17<00:02,  2.46it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:17<00:02,  2.48it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:17<00:02,  2.46it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:18<00:01,  2.45it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:18<00:01,  2.44it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:19<00:00,  2.53it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:19<00:00,  2.67it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:19<00:00,  2.83it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:19<00:00,  2.53it/s]

Epoch 3:  49%|████▉     | 2950/5971 [29:07<29:49,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  49%|████▉     | 2950/5971 [29:07<29:49,  1.69it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0515, train/loss_vlb_step=0.000183, train/loss_step=0.0515, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:43,  1.12it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:01<00:29,  1.61it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:23,  1.96it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:02<00:20,  2.20it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:02<00:19,  2.30it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:02<00:18,  2.44it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:03<00:16,  2.53it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:03<00:16,  2.54it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:03<00:15,  2.61it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:04<00:15,  2.63it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:04<00:14,  2.62it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:05<00:14,  2.67it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:05<00:14,  2.64it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:05<00:13,  2.63it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:06<00:13,  2.63it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:06<00:12,  2.64it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:06<00:12,  2.64it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:07<00:12,  2.63it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:07<00:11,  2.63it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:08<00:11,  2.63it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:08<00:11,  2.64it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:08<00:10,  2.67it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:09<00:10,  2.67it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:09<00:09,  2.65it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:09<00:09,  2.68it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:10<00:08,  2.69it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:10<00:08,  2.72it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:11<00:08,  2.69it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:11<00:07,  2.70it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:11<00:07,  2.67it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:12<00:07,  2.58it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:12<00:07,  2.57it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:13<00:06,  2.54it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:13<00:06,  2.57it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:13<00:05,  2.58it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:14<00:05,  2.60it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:14<00:04,  2.63it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:14<00:04,  2.66it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:15<00:04,  2.70it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:15<00:03,  2.63it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:16<00:03,  2.67it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:16<00:03,  2.64it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:16<00:02,  2.66it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:17<00:02,  2.65it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:17<00:01,  2.69it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:17<00:01,  2.70it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:18<00:01,  2.68it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:18<00:00,  2.67it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:19<00:00,  2.73it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:19<00:00,  2.71it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:19<00:00,  2.58it/s]

Epoch 3:  49%|████▉     | 2951/5971 [29:30<30:11,  1.67it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0515, train/loss_vlb_step=0.000183, train/loss_step=0.0515, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  49%|████▉     | 2951/5971 [29:30<30:11,  1.67it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0499, train/loss_vlb_step=0.000168, train/loss_step=0.0499, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:30,  1.59it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.38it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:16,  2.82it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:15,  3.05it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:13,  3.27it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:12,  3.42it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:02<00:12,  3.55it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:12,  3.41it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:12,  3.17it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:03<00:13,  2.96it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:03<00:12,  3.07it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:03<00:11,  3.22it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:04<00:11,  3.28it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:04<00:10,  3.36it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:04<00:10,  3.43it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:05<00:09,  3.50it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:05<00:09,  3.56it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:05<00:09,  3.56it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:05<00:08,  3.51it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:06<00:08,  3.46it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:06<00:08,  3.49it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:06<00:07,  3.56it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:06<00:07,  3.58it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:07<00:07,  3.42it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:07<00:07,  3.14it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:08<00:07,  3.05it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:08<00:07,  3.08it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:08<00:06,  3.28it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:08<00:06,  3.33it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:09<00:06,  3.33it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:09<00:05,  3.35it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:09<00:05,  3.41it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:10<00:04,  3.49it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:10<00:04,  3.48it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:10<00:04,  3.56it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:10<00:03,  3.63it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:11<00:03,  3.73it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:11<00:03,  3.79it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:11<00:02,  3.84it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:11<00:02,  3.56it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:12<00:02,  3.36it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:12<00:02,  3.21it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:12<00:02,  3.14it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:13<00:01,  3.10it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:13<00:01,  3.14it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:13<00:01,  3.15it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:14<00:00,  3.07it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:14<00:00,  3.01it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:14<00:00,  2.99it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:15<00:00,  3.03it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:15<00:00,  3.27it/s]

Epoch 3:  49%|████▉     | 2952/5971 [29:51<30:31,  1.65it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0499, train/loss_vlb_step=0.000168, train/loss_step=0.0499, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  49%|████▉     | 2952/5971 [29:51<30:31,  1.65it/s, loss=0.0996, v_num=0, train/loss_simple_step=0.0326, train/loss_vlb_step=0.000124, train/loss_step=0.0326, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  49%|████▉     | 2953/5971 [29:53<30:31,  1.65it/s, loss=0.0996, v_num=0, train/loss_simple_step=0.0326, train/loss_vlb_step=0.000124, train/loss_step=0.0326, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  49%|████▉     | 2953/5971 [29:53<30:31,  1.65it/s, loss=0.0994, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.88e-5, train/loss_step=0.016, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  49%|████▉     | 2954/5971 [29:54<30:32,  1.65it/s, loss=0.0994, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.88e-5, train/loss_step=0.016, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  49%|████▉     | 2954/5971 [29:54<30:32,  1.65it/s, loss=0.0994, v_num=0, train/loss_simple_step=0.00821, train/loss_vlb_step=3.98e-5, train/loss_step=0.00821, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  49%|████▉     | 2955/5971 [29:55<30:32,  1.65it/s, loss=0.0994, v_num=0, train/loss_simple_step=0.00821, train/loss_vlb_step=3.98e-5, train/loss_step=0.00821, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  49%|████▉     | 2955/5971 [29:55<30:32,  1.65it/s, loss=0.0776, v_num=0, train/loss_simple_step=0.0345, train/loss_vlb_step=0.000127, train/loss_step=0.0345, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  50%|████▉     | 2956/5971 [29:59<30:34,  1.64it/s, loss=0.0776, v_num=0, train/loss_simple_step=0.0345, train/loss_vlb_step=0.000127, train/loss_step=0.0345, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2956/5971 [29:59<30:34,  1.64it/s, loss=0.0764, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000344, train/loss_step=0.104, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  50%|████▉     | 2957/5971 [30:01<30:35,  1.64it/s, loss=0.0764, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000344, train/loss_step=0.104, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2957/5971 [30:01<30:35,  1.64it/s, loss=0.0933, v_num=0, train/loss_simple_step=0.514, train/loss_vlb_step=0.00485, train/loss_step=0.514, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  50%|████▉     | 2958/5971 [30:02<30:35,  1.64it/s, loss=0.0933, v_num=0, train/loss_simple_step=0.514, train/loss_vlb_step=0.00485, train/loss_step=0.514, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2958/5971 [30:02<30:35,  1.64it/s, loss=0.0934, v_num=0, train/loss_simple_step=0.0403, train/loss_vlb_step=0.000154, train/loss_step=0.0403, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2959/5971 [30:04<30:35,  1.64it/s, loss=0.0934, v_num=0, train/loss_simple_step=0.0403, train/loss_vlb_step=0.000154, train/loss_step=0.0403, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2959/5971 [30:04<30:35,  1.64it/s, loss=0.0984, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000372, train/loss_step=0.113, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  50%|████▉     | 2960/5971 [30:07<30:37,  1.64it/s, loss=0.0984, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000372, train/loss_step=0.113, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2960/5971 [30:07<30:37,  1.64it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0428, train/loss_vlb_step=0.000163, train/loss_step=0.0428, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  50%|████▉     | 2961/5971 [30:08<30:37,  1.64it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0428, train/loss_vlb_step=0.000163, train/loss_step=0.0428, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2961/5971 [30:08<30:37,  1.64it/s, loss=0.0993, v_num=0, train/loss_simple_step=0.0395, train/loss_vlb_step=0.000143, train/loss_step=0.0395, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2962/5971 [30:09<30:37,  1.64it/s, loss=0.0993, v_num=0, train/loss_simple_step=0.0395, train/loss_vlb_step=0.000143, train/loss_step=0.0395, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2962/5971 [30:09<30:37,  1.64it/s, loss=0.0969, v_num=0, train/loss_simple_step=0.00728, train/loss_vlb_step=3.54e-5, train/loss_step=0.00728, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2963/5971 [30:10<30:37,  1.64it/s, loss=0.0969, v_num=0, train/loss_simple_step=0.00728, train/loss_vlb_step=3.54e-5, train/loss_step=0.00728, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2963/5971 [30:10<30:37,  1.64it/s, loss=0.0925, v_num=0, train/loss_simple_step=0.0317, train/loss_vlb_step=0.000118, train/loss_step=0.0317, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  50%|████▉     | 2964/5971 [30:15<30:41,  1.63it/s, loss=0.0925, v_num=0, train/loss_simple_step=0.0317, train/loss_vlb_step=0.000118, train/loss_step=0.0317, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2964/5971 [30:15<30:41,  1.63it/s, loss=0.0815, v_num=0, train/loss_simple_step=0.00685, train/loss_vlb_step=3.31e-5, train/loss_step=0.00685, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2965/5971 [30:16<30:41,  1.63it/s, loss=0.0815, v_num=0, train/loss_simple_step=0.00685, train/loss_vlb_step=3.31e-5, train/loss_step=0.00685, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2965/5971 [30:16<30:41,  1.63it/s, loss=0.0833, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000352, train/loss_step=0.107, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  50%|████▉     | 2966/5971 [30:17<30:41,  1.63it/s, loss=0.0833, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000352, train/loss_step=0.107, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2966/5971 [30:17<30:41,  1.63it/s, loss=0.0788, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.05e-5, train/loss_step=0.0139, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2967/5971 [30:18<30:40,  1.63it/s, loss=0.0788, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.05e-5, train/loss_step=0.0139, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2967/5971 [30:18<30:40,  1.63it/s, loss=0.117, v_num=0, train/loss_simple_step=0.795, train/loss_vlb_step=0.0345, train/loss_step=0.795, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  50%|████▉     | 2968/5971 [30:22<30:43,  1.63it/s, loss=0.117, v_num=0, train/loss_simple_step=0.795, train/loss_vlb_step=0.0345, train/loss_step=0.795, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2968/5971 [30:22<30:43,  1.63it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0159, train/loss_vlb_step=6.66e-5, train/loss_step=0.0159, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2969/5971 [30:23<30:42,  1.63it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0159, train/loss_vlb_step=6.66e-5, train/loss_step=0.0159, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2969/5971 [30:23<30:42,  1.63it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0254, train/loss_vlb_step=9.89e-5, train/loss_step=0.0254, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2970/5971 [30:24<30:42,  1.63it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0254, train/loss_vlb_step=9.89e-5, train/loss_step=0.0254, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2970/5971 [30:24<30:42,  1.63it/s, loss=0.1, v_num=0, train/loss_simple_step=0.00792, train/loss_vlb_step=3.89e-5, train/loss_step=0.00792, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2971/5971 [30:25<30:42,  1.63it/s, loss=0.1, v_num=0, train/loss_simple_step=0.00792, train/loss_vlb_step=3.89e-5, train/loss_step=0.00792, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2971/5971 [30:25<30:42,  1.63it/s, loss=0.118, v_num=0, train/loss_simple_step=0.412, train/loss_vlb_step=0.00301, train/loss_step=0.412, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  50%|████▉     | 2972/5971 [30:29<30:45,  1.63it/s, loss=0.118, v_num=0, train/loss_simple_step=0.412, train/loss_vlb_step=0.00301, train/loss_step=0.412, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2972/5971 [30:29<30:45,  1.63it/s, loss=0.135, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00164, train/loss_step=0.375, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2973/5971 [30:30<30:44,  1.62it/s, loss=0.135, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00164, train/loss_step=0.375, global_step=2e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2973/5971 [30:30<30:44,  1.62it/s, loss=0.173, v_num=0, train/loss_simple_step=0.766, train/loss_vlb_step=0.0492, train/loss_step=0.766, global_step=2006.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2974/5971 [30:31<30:44,  1.62it/s, loss=0.173, v_num=0, train/loss_simple_step=0.766, train/loss_vlb_step=0.0492, train/loss_step=0.766, global_step=2006.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2974/5971 [30:31<30:44,  1.62it/s, loss=0.179, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000447, train/loss_step=0.136, global_step=2006.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2975/5971 [30:32<30:44,  1.62it/s, loss=0.179, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000447, train/loss_step=0.136, global_step=2006.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2975/5971 [30:32<30:44,  1.62it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00375, train/loss_vlb_step=1.96e-5, train/loss_step=0.00375, global_step=2006.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2976/5971 [30:37<30:48,  1.62it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00375, train/loss_vlb_step=1.96e-5, train/loss_step=0.00375, global_step=2006.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2976/5971 [30:37<30:48,  1.62it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0252, train/loss_vlb_step=9.89e-5, train/loss_step=0.0252, global_step=2006.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  50%|████▉     | 2977/5971 [30:38<30:48,  1.62it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0252, train/loss_vlb_step=9.89e-5, train/loss_step=0.0252, global_step=2006.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2977/5971 [30:38<30:48,  1.62it/s, loss=0.165, v_num=0, train/loss_simple_step=0.344, train/loss_vlb_step=0.00196, train/loss_step=0.344, global_step=2007.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  50%|████▉     | 2978/5971 [30:39<30:48,  1.62it/s, loss=0.165, v_num=0, train/loss_simple_step=0.344, train/loss_vlb_step=0.00196, train/loss_step=0.344, global_step=2007.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2978/5971 [30:39<30:48,  1.62it/s, loss=0.171, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000489, train/loss_step=0.144, global_step=2007.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2979/5971 [30:40<30:48,  1.62it/s, loss=0.171, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000489, train/loss_step=0.144, global_step=2007.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2979/5971 [30:40<30:48,  1.62it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0258, train/loss_vlb_step=9.67e-5, train/loss_step=0.0258, global_step=2007.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2980/5971 [30:44<30:51,  1.62it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0258, train/loss_vlb_step=9.67e-5, train/loss_step=0.0258, global_step=2007.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2980/5971 [30:44<30:51,  1.62it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0579, train/loss_vlb_step=0.0002, train/loss_step=0.0579, global_step=2007.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  50%|████▉     | 2981/5971 [30:46<30:51,  1.62it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0579, train/loss_vlb_step=0.0002, train/loss_step=0.0579, global_step=2007.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2981/5971 [30:46<30:51,  1.62it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.6e-5, train/loss_step=0.0125, global_step=2008.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2982/5971 [30:47<30:51,  1.61it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.6e-5, train/loss_step=0.0125, global_step=2008.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2982/5971 [30:47<30:51,  1.61it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0191, train/loss_vlb_step=7.71e-5, train/loss_step=0.0191, global_step=2008.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2983/5971 [30:48<30:50,  1.61it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0191, train/loss_vlb_step=7.71e-5, train/loss_step=0.0191, global_step=2008.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2983/5971 [30:48<30:50,  1.61it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.64e-5, train/loss_step=0.00303, global_step=2008.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2984/5971 [30:52<30:53,  1.61it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.64e-5, train/loss_step=0.00303, global_step=2008.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2984/5971 [30:52<30:53,  1.61it/s, loss=0.186, v_num=0, train/loss_simple_step=0.421, train/loss_vlb_step=0.00279, train/loss_step=0.421, global_step=2008.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  50%|████▉     | 2985/5971 [30:53<30:53,  1.61it/s, loss=0.186, v_num=0, train/loss_simple_step=0.421, train/loss_vlb_step=0.00279, train/loss_step=0.421, global_step=2008.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|████▉     | 2985/5971 [30:53<30:53,  1.61it/s, loss=0.196, v_num=0, train/loss_simple_step=0.315, train/loss_vlb_step=0.0013, train/loss_step=0.315, global_step=2009.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  50%|█████     | 2986/5971 [30:54<30:53,  1.61it/s, loss=0.196, v_num=0, train/loss_simple_step=0.315, train/loss_vlb_step=0.0013, train/loss_step=0.315, global_step=2009.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 2986/5971 [30:54<30:53,  1.61it/s, loss=0.211, v_num=0, train/loss_simple_step=0.312, train/loss_vlb_step=0.00127, train/loss_step=0.312, global_step=2009.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 2987/5971 [30:55<30:53,  1.61it/s, loss=0.211, v_num=0, train/loss_simple_step=0.312, train/loss_vlb_step=0.00127, train/loss_step=0.312, global_step=2009.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 2987/5971 [30:55<30:53,  1.61it/s, loss=0.202, v_num=0, train/loss_simple_step=0.626, train/loss_vlb_step=0.0108, train/loss_step=0.626, global_step=2009.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  50%|█████     | 2988/5971 [30:59<30:56,  1.61it/s, loss=0.202, v_num=0, train/loss_simple_step=0.626, train/loss_vlb_step=0.0108, train/loss_step=0.626, global_step=2009.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 2988/5971 [30:59<30:56,  1.61it/s, loss=0.202, v_num=0, train/loss_simple_step=0.00165, train/loss_vlb_step=9.39e-6, train/loss_step=0.00165, global_step=2009.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 2989/5971 [31:01<30:56,  1.61it/s, loss=0.202, v_num=0, train/loss_simple_step=0.00165, train/loss_vlb_step=9.39e-6, train/loss_step=0.00165, global_step=2009.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 2989/5971 [31:01<30:56,  1.61it/s, loss=0.209, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000739, train/loss_step=0.175, global_step=2010.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  50%|█████     | 2990/5971 [31:02<30:56,  1.61it/s, loss=0.209, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000739, train/loss_step=0.175, global_step=2010.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 2990/5971 [31:02<30:56,  1.61it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0382, train/loss_vlb_step=0.000135, train/loss_step=0.0382, global_step=2010.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 2991/5971 [31:04<30:56,  1.61it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0382, train/loss_vlb_step=0.000135, train/loss_step=0.0382, global_step=2010.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 2991/5971 [31:04<30:56,  1.61it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0377, train/loss_vlb_step=0.000139, train/loss_step=0.0377, global_step=2010.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 2992/5971 [31:07<30:58,  1.60it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0377, train/loss_vlb_step=0.000139, train/loss_step=0.0377, global_step=2010.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 2992/5971 [31:07<30:58,  1.60it/s, loss=0.179, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.00038, train/loss_step=0.115, global_step=2010.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  50%|█████     | 2993/5971 [31:09<30:59,  1.60it/s, loss=0.179, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.00038, train/loss_step=0.115, global_step=2010.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 2993/5971 [31:09<30:59,  1.60it/s, loss=0.142, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=6.62e-5, train/loss_step=0.017, global_step=2011.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 2994/5971 [31:10<30:59,  1.60it/s, loss=0.142, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=6.62e-5, train/loss_step=0.017, global_step=2011.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 2994/5971 [31:10<30:59,  1.60it/s, loss=0.14, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000374, train/loss_step=0.113, global_step=2011.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 2995/5971 [31:11<30:59,  1.60it/s, loss=0.14, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000374, train/loss_step=0.113, global_step=2011.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 2995/5971 [31:11<30:59,  1.60it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00418, train/loss_vlb_step=2.11e-5, train/loss_step=0.00418, global_step=2011.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 2996/5971 [31:15<31:01,  1.60it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00418, train/loss_vlb_step=2.11e-5, train/loss_step=0.00418, global_step=2011.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 2996/5971 [31:15<31:01,  1.60it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0179, train/loss_vlb_step=7.31e-5, train/loss_step=0.0179, global_step=2011.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  50%|█████     | 2997/5971 [31:16<31:01,  1.60it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0179, train/loss_vlb_step=7.31e-5, train/loss_step=0.0179, global_step=2011.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 2997/5971 [31:16<31:01,  1.60it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0456, train/loss_vlb_step=0.000167, train/loss_step=0.0456, global_step=2012.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 2998/5971 [31:18<31:02,  1.60it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0456, train/loss_vlb_step=0.000167, train/loss_step=0.0456, global_step=2012.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 2998/5971 [31:18<31:02,  1.60it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00351, train/loss_vlb_step=1.92e-5, train/loss_step=0.00351, global_step=2012.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 2999/5971 [31:19<31:02,  1.60it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00351, train/loss_vlb_step=1.92e-5, train/loss_step=0.00351, global_step=2012.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 2999/5971 [31:19<31:02,  1.60it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00273, train/loss_vlb_step=1.48e-5, train/loss_step=0.00273, global_step=2012.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 3000/5971 [31:23<31:04,  1.59it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00273, train/loss_vlb_step=1.48e-5, train/loss_step=0.00273, global_step=2012.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 3000/5971 [31:23<31:04,  1.59it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00348, train/loss_vlb_step=1.84e-5, train/loss_step=0.00348, global_step=2012.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 3001/5971 [31:24<31:04,  1.59it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00348, train/loss_vlb_step=1.84e-5, train/loss_step=0.00348, global_step=2012.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 3001/5971 [31:24<31:04,  1.59it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.42e-5, train/loss_step=0.0119, global_step=2013.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  50%|█████     | 3002/5971 [31:25<31:04,  1.59it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.42e-5, train/loss_step=0.0119, global_step=2013.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 3002/5971 [31:25<31:04,  1.59it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0987, train/loss_vlb_step=0.000325, train/loss_step=0.0987, global_step=2013.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 3003/5971 [31:27<31:04,  1.59it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0987, train/loss_vlb_step=0.000325, train/loss_step=0.0987, global_step=2013.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 3003/5971 [31:27<31:04,  1.59it/s, loss=0.137, v_num=0, train/loss_simple_step=0.374, train/loss_vlb_step=0.0022, train/loss_step=0.374, global_step=2013.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  50%|█████     | 3004/5971 [31:30<31:06,  1.59it/s, loss=0.137, v_num=0, train/loss_simple_step=0.374, train/loss_vlb_step=0.0022, train/loss_step=0.374, global_step=2013.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 3004/5971 [31:30<31:06,  1.59it/s, loss=0.122, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000459, train/loss_step=0.134, global_step=2013.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 3005/5971 [31:32<31:06,  1.59it/s, loss=0.122, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000459, train/loss_step=0.134, global_step=2013.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 3005/5971 [31:32<31:06,  1.59it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0197, train/loss_vlb_step=8.17e-5, train/loss_step=0.0197, global_step=2014.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 3006/5971 [31:33<31:06,  1.59it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0197, train/loss_vlb_step=8.17e-5, train/loss_step=0.0197, global_step=2014.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 3006/5971 [31:33<31:06,  1.59it/s, loss=0.107, v_num=0, train/loss_simple_step=0.307, train/loss_vlb_step=0.00155, train/loss_step=0.307, global_step=2014.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  50%|█████     | 3007/5971 [31:34<31:06,  1.59it/s, loss=0.107, v_num=0, train/loss_simple_step=0.307, train/loss_vlb_step=0.00155, train/loss_step=0.307, global_step=2014.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 3007/5971 [31:34<31:06,  1.59it/s, loss=0.0778, v_num=0, train/loss_simple_step=0.0339, train/loss_vlb_step=0.000121, train/loss_step=0.0339, global_step=2014.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 3008/5971 [31:39<31:10,  1.58it/s, loss=0.0778, v_num=0, train/loss_simple_step=0.0339, train/loss_vlb_step=0.000121, train/loss_step=0.0339, global_step=2014.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 3008/5971 [31:39<31:10,  1.58it/s, loss=0.0778, v_num=0, train/loss_simple_step=0.00227, train/loss_vlb_step=1.29e-5, train/loss_step=0.00227, global_step=2014.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 3009/5971 [31:40<31:10,  1.58it/s, loss=0.0778, v_num=0, train/loss_simple_step=0.00227, train/loss_vlb_step=1.29e-5, train/loss_step=0.00227, global_step=2014.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 3009/5971 [31:40<31:10,  1.58it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000445, train/loss_step=0.135, global_step=2015.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  50%|█████     | 3010/5971 [31:41<31:10,  1.58it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000445, train/loss_step=0.135, global_step=2015.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 3010/5971 [31:41<31:10,  1.58it/s, loss=0.0741, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.97e-5, train/loss_step=0.0037, global_step=2015.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 3011/5971 [31:43<31:10,  1.58it/s, loss=0.0741, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.97e-5, train/loss_step=0.0037, global_step=2015.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 3011/5971 [31:43<31:10,  1.58it/s, loss=0.0776, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000357, train/loss_step=0.109, global_step=2015.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  50%|█████     | 3012/5971 [31:46<31:11,  1.58it/s, loss=0.0776, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000357, train/loss_step=0.109, global_step=2015.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 3012/5971 [31:46<31:11,  1.58it/s, loss=0.0737, v_num=0, train/loss_simple_step=0.0377, train/loss_vlb_step=0.000137, train/loss_step=0.0377, global_step=2015.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 3013/5971 [31:47<31:12,  1.58it/s, loss=0.0737, v_num=0, train/loss_simple_step=0.0377, train/loss_vlb_step=0.000137, train/loss_step=0.0377, global_step=2015.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 3013/5971 [31:47<31:12,  1.58it/s, loss=0.0729, v_num=0, train/loss_simple_step=0.00112, train/loss_vlb_step=6.65e-6, train/loss_step=0.00112, global_step=2016.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 3014/5971 [31:48<31:12,  1.58it/s, loss=0.0729, v_num=0, train/loss_simple_step=0.00112, train/loss_vlb_step=6.65e-6, train/loss_step=0.00112, global_step=2016.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 3014/5971 [31:48<31:12,  1.58it/s, loss=0.0786, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000816, train/loss_step=0.227, global_step=2016.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  50%|█████     | 3015/5971 [31:50<31:12,  1.58it/s, loss=0.0786, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000816, train/loss_step=0.227, global_step=2016.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  50%|█████     | 3015/5971 [31:50<31:12,  1.58it/s, loss=0.0806, v_num=0, train/loss_simple_step=0.0438, train/loss_vlb_step=0.000151, train/loss_step=0.0438, global_step=2016.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3016/5971 [31:52<31:13,  1.58it/s, loss=0.0806, v_num=0, train/loss_simple_step=0.0438, train/loss_vlb_step=0.000151, train/loss_step=0.0438, global_step=2016.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3016/5971 [31:52<31:13,  1.58it/s, loss=0.0874, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000508, train/loss_step=0.154, global_step=2016.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  51%|█████     | 3017/5971 [31:53<31:13,  1.58it/s, loss=0.0874, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000508, train/loss_step=0.154, global_step=2016.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3017/5971 [31:53<31:13,  1.58it/s, loss=0.0904, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.00035, train/loss_step=0.106, global_step=2017.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  51%|█████     | 3018/5971 [31:55<31:13,  1.58it/s, loss=0.0904, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.00035, train/loss_step=0.106, global_step=2017.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3018/5971 [31:55<31:13,  1.58it/s, loss=0.0921, v_num=0, train/loss_simple_step=0.0371, train/loss_vlb_step=0.000134, train/loss_step=0.0371, global_step=2017.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3019/5971 [31:56<31:13,  1.58it/s, loss=0.0921, v_num=0, train/loss_simple_step=0.0371, train/loss_vlb_step=0.000134, train/loss_step=0.0371, global_step=2017.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3019/5971 [31:56<31:13,  1.58it/s, loss=0.0979, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000394, train/loss_step=0.119, global_step=2017.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  51%|█████     | 3020/5971 [31:59<31:14,  1.57it/s, loss=0.0979, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000394, train/loss_step=0.119, global_step=2017.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3020/5971 [31:59<31:14,  1.57it/s, loss=0.0988, v_num=0, train/loss_simple_step=0.0204, train/loss_vlb_step=8.35e-5, train/loss_step=0.0204, global_step=2017.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3021/5971 [32:00<31:14,  1.57it/s, loss=0.0988, v_num=0, train/loss_simple_step=0.0204, train/loss_vlb_step=8.35e-5, train/loss_step=0.0204, global_step=2017.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3021/5971 [32:00<31:14,  1.57it/s, loss=0.099, v_num=0, train/loss_simple_step=0.0151, train/loss_vlb_step=6.44e-5, train/loss_step=0.0151, global_step=2018.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  51%|█████     | 3022/5971 [32:01<31:14,  1.57it/s, loss=0.099, v_num=0, train/loss_simple_step=0.0151, train/loss_vlb_step=6.44e-5, train/loss_step=0.0151, global_step=2018.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3022/5971 [32:01<31:14,  1.57it/s, loss=0.103, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.000672, train/loss_step=0.186, global_step=2018.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  51%|█████     | 3023/5971 [32:02<31:14,  1.57it/s, loss=0.103, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.000672, train/loss_step=0.186, global_step=2018.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3023/5971 [32:02<31:14,  1.57it/s, loss=0.0891, v_num=0, train/loss_simple_step=0.0894, train/loss_vlb_step=0.000294, train/loss_step=0.0894, global_step=2018.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3024/5971 [32:05<31:15,  1.57it/s, loss=0.0891, v_num=0, train/loss_simple_step=0.0894, train/loss_vlb_step=0.000294, train/loss_step=0.0894, global_step=2018.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3024/5971 [32:05<31:15,  1.57it/s, loss=0.0827, v_num=0, train/loss_simple_step=0.00669, train/loss_vlb_step=3.24e-5, train/loss_step=0.00669, global_step=2018.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3025/5971 [32:06<31:15,  1.57it/s, loss=0.0827, v_num=0, train/loss_simple_step=0.00669, train/loss_vlb_step=3.24e-5, train/loss_step=0.00669, global_step=2018.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3025/5971 [32:06<31:15,  1.57it/s, loss=0.082, v_num=0, train/loss_simple_step=0.00473, train/loss_vlb_step=2.46e-5, train/loss_step=0.00473, global_step=2019.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  51%|█████     | 3026/5971 [32:07<31:15,  1.57it/s, loss=0.082, v_num=0, train/loss_simple_step=0.00473, train/loss_vlb_step=2.46e-5, train/loss_step=0.00473, global_step=2019.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3026/5971 [32:07<31:15,  1.57it/s, loss=0.0671, v_num=0, train/loss_simple_step=0.0106, train/loss_vlb_step=4.76e-5, train/loss_step=0.0106, global_step=2019.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  51%|█████     | 3027/5971 [32:09<31:15,  1.57it/s, loss=0.0671, v_num=0, train/loss_simple_step=0.0106, train/loss_vlb_step=4.76e-5, train/loss_step=0.0106, global_step=2019.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3027/5971 [32:09<31:15,  1.57it/s, loss=0.1, v_num=0, train/loss_simple_step=0.701, train/loss_vlb_step=0.0451, train/loss_step=0.701, global_step=2019.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]      
Epoch 3:  51%|█████     | 3028/5971 [32:12<31:17,  1.57it/s, loss=0.1, v_num=0, train/loss_simple_step=0.701, train/loss_vlb_step=0.0451, train/loss_step=0.701, global_step=2019.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3028/5971 [32:12<31:17,  1.57it/s, loss=0.107, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000435, train/loss_step=0.131, global_step=2019.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3029/5971 [32:13<31:17,  1.57it/s, loss=0.107, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000435, train/loss_step=0.131, global_step=2019.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3029/5971 [32:13<31:17,  1.57it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00839, train/loss_vlb_step=3.94e-5, train/loss_step=0.00839, global_step=2020.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3030/5971 [32:14<31:16,  1.57it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00839, train/loss_vlb_step=3.94e-5, train/loss_step=0.00839, global_step=2020.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3030/5971 [32:14<31:16,  1.57it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0229, train/loss_vlb_step=8.33e-5, train/loss_step=0.0229, global_step=2020.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  51%|█████     | 3031/5971 [32:15<31:16,  1.57it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0229, train/loss_vlb_step=8.33e-5, train/loss_step=0.0229, global_step=2020.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3031/5971 [32:15<31:16,  1.57it/s, loss=0.115, v_num=0, train/loss_simple_step=0.372, train/loss_vlb_step=0.00195, train/loss_step=0.372, global_step=2020.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  51%|█████     | 3032/5971 [32:19<31:19,  1.56it/s, loss=0.115, v_num=0, train/loss_simple_step=0.372, train/loss_vlb_step=0.00195, train/loss_step=0.372, global_step=2020.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3032/5971 [32:19<31:19,  1.56it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0788, train/loss_vlb_step=0.000261, train/loss_step=0.0788, global_step=2020.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3033/5971 [32:20<31:19,  1.56it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0788, train/loss_vlb_step=0.000261, train/loss_step=0.0788, global_step=2020.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3033/5971 [32:20<31:19,  1.56it/s, loss=0.132, v_num=0, train/loss_simple_step=0.310, train/loss_vlb_step=0.00176, train/loss_step=0.310, global_step=2021.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  51%|█████     | 3034/5971 [32:22<31:19,  1.56it/s, loss=0.132, v_num=0, train/loss_simple_step=0.310, train/loss_vlb_step=0.00176, train/loss_step=0.310, global_step=2021.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3034/5971 [32:22<31:19,  1.56it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00227, train/loss_vlb_step=1.31e-5, train/loss_step=0.00227, global_step=2021.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3035/5971 [32:23<31:19,  1.56it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00227, train/loss_vlb_step=1.31e-5, train/loss_step=0.00227, global_step=2021.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3035/5971 [32:23<31:19,  1.56it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00942, train/loss_vlb_step=4.34e-5, train/loss_step=0.00942, global_step=2021.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3036/5971 [32:26<31:21,  1.56it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00942, train/loss_vlb_step=4.34e-5, train/loss_step=0.00942, global_step=2021.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3036/5971 [32:26<31:21,  1.56it/s, loss=0.129, v_num=0, train/loss_simple_step=0.352, train/loss_vlb_step=0.00153, train/loss_step=0.352, global_step=2021.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  51%|█████     | 3037/5971 [32:27<31:21,  1.56it/s, loss=0.129, v_num=0, train/loss_simple_step=0.352, train/loss_vlb_step=0.00153, train/loss_step=0.352, global_step=2021.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3037/5971 [32:27<31:21,  1.56it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0912, train/loss_vlb_step=0.000302, train/loss_step=0.0912, global_step=2022.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3038/5971 [32:29<31:21,  1.56it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0912, train/loss_vlb_step=0.000302, train/loss_step=0.0912, global_step=2022.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3038/5971 [32:29<31:21,  1.56it/s, loss=0.138, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.000813, train/loss_step=0.220, global_step=2022.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  51%|█████     | 3039/5971 [32:30<31:21,  1.56it/s, loss=0.138, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.000813, train/loss_step=0.220, global_step=2022.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3039/5971 [32:30<31:21,  1.56it/s, loss=0.146, v_num=0, train/loss_simple_step=0.286, train/loss_vlb_step=0.00144, train/loss_step=0.286, global_step=2022.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  51%|█████     | 3040/5971 [32:33<31:22,  1.56it/s, loss=0.146, v_num=0, train/loss_simple_step=0.286, train/loss_vlb_step=0.00144, train/loss_step=0.286, global_step=2022.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3040/5971 [32:33<31:22,  1.56it/s, loss=0.158, v_num=0, train/loss_simple_step=0.261, train/loss_vlb_step=0.000988, train/loss_step=0.261, global_step=2022.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3041/5971 [32:34<31:22,  1.56it/s, loss=0.158, v_num=0, train/loss_simple_step=0.261, train/loss_vlb_step=0.000988, train/loss_step=0.261, global_step=2022.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3041/5971 [32:34<31:22,  1.56it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0814, train/loss_vlb_step=0.000268, train/loss_step=0.0814, global_step=2023.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3042/5971 [32:35<31:22,  1.56it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0814, train/loss_vlb_step=0.000268, train/loss_step=0.0814, global_step=2023.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3042/5971 [32:35<31:22,  1.56it/s, loss=0.161, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.00064, train/loss_step=0.190, global_step=2023.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  51%|█████     | 3043/5971 [32:36<31:21,  1.56it/s, loss=0.161, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.00064, train/loss_step=0.190, global_step=2023.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3043/5971 [32:36<31:21,  1.56it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00581, train/loss_vlb_step=2.95e-5, train/loss_step=0.00581, global_step=2023.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3044/5971 [32:40<31:24,  1.55it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00581, train/loss_vlb_step=2.95e-5, train/loss_step=0.00581, global_step=2023.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3044/5971 [32:40<31:24,  1.55it/s, loss=0.167, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000725, train/loss_step=0.203, global_step=2023.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  51%|█████     | 3045/5971 [32:41<31:24,  1.55it/s, loss=0.167, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000725, train/loss_step=0.203, global_step=2023.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3045/5971 [32:41<31:24,  1.55it/s, loss=0.176, v_num=0, train/loss_simple_step=0.191, train/loss_vlb_step=0.000731, train/loss_step=0.191, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3046/5971 [32:42<31:23,  1.55it/s, loss=0.176, v_num=0, train/loss_simple_step=0.191, train/loss_vlb_step=0.000731, train/loss_step=0.191, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3046/5971 [32:42<31:23,  1.55it/s, loss=0.192, v_num=0, train/loss_simple_step=0.330, train/loss_vlb_step=0.00146, train/loss_step=0.330, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  51%|█████     | 3047/5971 [32:43<31:23,  1.55it/s, loss=0.192, v_num=0, train/loss_simple_step=0.330, train/loss_vlb_step=0.00146, train/loss_step=0.330, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3047/5971 [32:43<31:23,  1.55it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0239, train/loss_vlb_step=9.23e-5, train/loss_step=0.0239, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3048/5971 [32:47<31:26,  1.55it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0239, train/loss_vlb_step=9.23e-5, train/loss_step=0.0239, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  51%|█████     | 3048/5971 [32:47<31:26,  1.55it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:23,  1.99it/s]
Epoch 3:  51%|█████     | 3050/5971 [32:48<31:24,  1.55it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   1%|          | 2/167 [00:00<00:58,  2.82it/s]
Epoch 3:  51%|█████     | 3052/5971 [32:48<31:21,  1.55it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   2%|▏         | 4/167 [00:00<00:27,  5.90it/s]
Epoch 3:  51%|█████     | 3054/5971 [32:48<31:19,  1.55it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   4%|▎         | 6/167 [00:00<00:18,  8.77it/s]
Epoch 3:  51%|█████     | 3056/5971 [32:48<31:17,  1.55it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   5%|▍         | 8/167 [00:01<00:14, 10.75it/s]
Epoch 3:  51%|█████     | 3058/5971 [32:48<31:14,  1.55it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   7%|▋         | 11/167 [00:01<00:10, 14.35it/s]
Epoch 3:  51%|█████▏    | 3061/5971 [32:48<31:11,  1.56it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   8%|▊         | 14/167 [00:01<00:09, 16.79it/s]
Epoch 3:  51%|█████▏    | 3064/5971 [32:48<31:07,  1.56it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  10%|█         | 17/167 [00:01<00:08, 18.32it/s]
Epoch 3:  51%|█████▏    | 3067/5971 [32:49<31:03,  1.56it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  12%|█▏        | 20/167 [00:01<00:07, 19.86it/s]
Epoch 3:  51%|█████▏    | 3070/5971 [32:49<31:00,  1.56it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 21.15it/s]
Epoch 3:  51%|█████▏    | 3073/5971 [32:49<30:56,  1.56it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  16%|█▌        | 26/167 [00:01<00:06, 21.85it/s]
Epoch 3:  52%|█████▏    | 3076/5971 [32:49<30:52,  1.56it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  17%|█▋        | 29/167 [00:02<00:06, 21.72it/s]
Epoch 3:  52%|█████▏    | 3079/5971 [32:49<30:49,  1.56it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  19%|█▉        | 32/167 [00:02<00:05, 22.56it/s]
Epoch 3:  52%|█████▏    | 3082/5971 [32:49<30:45,  1.57it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  21%|██        | 35/167 [00:02<00:05, 22.85it/s]
Epoch 3:  52%|█████▏    | 3085/5971 [32:49<30:42,  1.57it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  23%|██▎       | 38/167 [00:02<00:06, 21.49it/s]
Epoch 3:  52%|█████▏    | 3088/5971 [32:49<30:38,  1.57it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  25%|██▍       | 41/167 [00:02<00:05, 21.81it/s]
Epoch 3:  52%|█████▏    | 3091/5971 [32:50<30:35,  1.57it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  26%|██▋       | 44/167 [00:02<00:05, 21.77it/s]
Epoch 3:  52%|█████▏    | 3094/5971 [32:50<30:31,  1.57it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  28%|██▊       | 47/167 [00:02<00:05, 22.92it/s]
Epoch 3:  52%|█████▏    | 3097/5971 [32:50<30:27,  1.57it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  30%|██▉       | 50/167 [00:02<00:05, 22.30it/s]
Epoch 3:  52%|█████▏    | 3100/5971 [32:50<30:24,  1.57it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  32%|███▏      | 53/167 [00:03<00:05, 22.01it/s]
Epoch 3:  52%|█████▏    | 3103/5971 [32:50<30:20,  1.58it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  34%|███▎      | 56/167 [00:03<00:04, 22.45it/s]
Epoch 3:  52%|█████▏    | 3106/5971 [32:50<30:17,  1.58it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  35%|███▌      | 59/167 [00:03<00:04, 21.86it/s]
Epoch 3:  52%|█████▏    | 3109/5971 [32:50<30:13,  1.58it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  37%|███▋      | 62/167 [00:03<00:04, 22.77it/s]
Epoch 3:  52%|█████▏    | 3112/5971 [32:51<30:10,  1.58it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  39%|███▉      | 65/167 [00:03<00:04, 22.46it/s]
Epoch 3:  52%|█████▏    | 3115/5971 [32:51<30:06,  1.58it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  41%|████      | 68/167 [00:03<00:04, 23.40it/s]
Epoch 3:  52%|█████▏    | 3118/5971 [32:51<30:03,  1.58it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 24.35it/s]
Epoch 3:  52%|█████▏    | 3121/5971 [32:51<29:59,  1.58it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 23.93it/s]
Epoch 3:  52%|█████▏    | 3124/5971 [32:51<29:56,  1.59it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  46%|████▌     | 77/167 [00:04<00:03, 24.49it/s]
Epoch 3:  52%|█████▏    | 3127/5971 [32:51<29:52,  1.59it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  48%|████▊     | 80/167 [00:04<00:03, 23.59it/s]
Epoch 3:  52%|█████▏    | 3130/5971 [32:51<29:49,  1.59it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  50%|████▉     | 83/167 [00:04<00:03, 22.53it/s]
Epoch 3:  52%|█████▏    | 3133/5971 [32:51<29:45,  1.59it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  51%|█████▏    | 86/167 [00:04<00:03, 22.44it/s]
Epoch 3:  53%|█████▎    | 3136/5971 [32:52<29:42,  1.59it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  53%|█████▎    | 89/167 [00:04<00:03, 21.53it/s]
Epoch 3:  53%|█████▎    | 3139/5971 [32:52<29:38,  1.59it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  55%|█████▌    | 92/167 [00:04<00:03, 21.28it/s]
Epoch 3:  53%|█████▎    | 3142/5971 [32:52<29:35,  1.59it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  57%|█████▋    | 95/167 [00:04<00:03, 21.25it/s]
Epoch 3:  53%|█████▎    | 3145/5971 [32:52<29:31,  1.59it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  59%|█████▊    | 98/167 [00:05<00:03, 19.94it/s]
Epoch 3:  53%|█████▎    | 3148/5971 [32:52<29:28,  1.60it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  60%|██████    | 101/167 [00:05<00:03, 19.71it/s]
Epoch 3:  53%|█████▎    | 3151/5971 [32:52<29:25,  1.60it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  62%|██████▏   | 104/167 [00:05<00:03, 19.58it/s]
Epoch 3:  53%|█████▎    | 3154/5971 [32:53<29:21,  1.60it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  63%|██████▎   | 106/167 [00:05<00:03, 19.14it/s]
Epoch 3:  53%|█████▎    | 3157/5971 [32:53<29:18,  1.60it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  65%|██████▌   | 109/167 [00:05<00:02, 20.79it/s]
Epoch 3:  53%|█████▎    | 3160/5971 [32:53<29:14,  1.60it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  67%|██████▋   | 112/167 [00:05<00:02, 21.80it/s]
Epoch 3:  53%|█████▎    | 3163/5971 [32:53<29:11,  1.60it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  69%|██████▉   | 115/167 [00:05<00:02, 22.24it/s]
Epoch 3:  53%|█████▎    | 3166/5971 [32:53<29:07,  1.60it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  71%|███████   | 118/167 [00:06<00:02, 22.46it/s][A
Epoch 3:  53%|█████▎    | 3169/5971 [32:53<29:04,  1.61it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  72%|███████▏  | 121/167 [00:06<00:02, 19.04it/s][A
Epoch 3:  53%|█████▎    | 3172/5971 [32:53<29:01,  1.61it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  74%|███████▍  | 124/167 [00:06<00:02, 19.65it/s][A
Epoch 3:  53%|█████▎    | 3175/5971 [32:54<28:57,  1.61it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  76%|███████▌  | 127/167 [00:06<00:01, 21.35it/s][A
Epoch 3:  53%|█████▎    | 3178/5971 [32:54<28:54,  1.61it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  78%|███████▊  | 130/167 [00:06<00:01, 22.89it/s][A
Epoch 3:  53%|█████▎    | 3181/5971 [32:54<28:51,  1.61it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  80%|███████▉  | 133/167 [00:06<00:01, 23.84it/s][A
Epoch 3:  53%|█████▎    | 3184/5971 [32:54<28:47,  1.61it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  81%|████████▏ | 136/167 [00:06<00:01, 23.64it/s][A
Epoch 3:  53%|█████▎    | 3187/5971 [32:54<28:44,  1.61it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  83%|████████▎ | 139/167 [00:07<00:01, 23.47it/s][A
Epoch 3:  53%|█████▎    | 3190/5971 [32:54<28:40,  1.62it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  85%|████████▌ | 142/167 [00:07<00:01, 23.14it/s][A
Epoch 3:  53%|█████▎    | 3193/5971 [32:54<28:37,  1.62it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  87%|████████▋ | 145/167 [00:07<00:00, 23.01it/s][A
Epoch 3:  54%|█████▎    | 3196/5971 [32:54<28:34,  1.62it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  89%|████████▊ | 148/167 [00:07<00:00, 22.31it/s][A
Epoch 3:  54%|█████▎    | 3199/5971 [32:55<28:30,  1.62it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  90%|█████████ | 151/167 [00:07<00:00, 22.15it/s][A
Epoch 3:  54%|█████▎    | 3202/5971 [32:55<28:27,  1.62it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  92%|█████████▏| 154/167 [00:07<00:00, 21.36it/s][A
Epoch 3:  54%|█████▎    | 3205/5971 [32:55<28:24,  1.62it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  94%|█████████▍| 157/167 [00:07<00:00, 20.32it/s][A
Epoch 3:  54%|█████▎    | 3208/5971 [32:55<28:20,  1.62it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  96%|█████████▌| 160/167 [00:07<00:00, 20.96it/s][A
Epoch 3:  54%|█████▍    | 3211/5971 [32:55<28:17,  1.63it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  98%|█████████▊| 163/167 [00:08<00:00, 21.65it/s][A
Epoch 3:  54%|█████▍    | 3214/5971 [32:55<28:14,  1.63it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  99%|█████████▉| 166/167 [00:08<00:00, 21.59it/s][A
Epoch 3:  54%|█████▍    | 3216/5971 [32:56<28:12,  1.63it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Epoch 3:  54%|█████▍    | 3217/5971 [32:57<28:12,  1.63it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.76e-5, train/loss_step=0.0173, global_step=2024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3217/5971 [32:57<28:12,  1.63it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=5.21e-5, train/loss_step=0.0118, global_step=2025.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3218/5971 [32:58<28:12,  1.63it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0279, train/loss_vlb_step=0.000102, train/loss_step=0.0279, global_step=2025.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3219/5971 [32:59<28:11,  1.63it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00813, train/loss_vlb_step=3.84e-5, train/loss_step=0.00813, global_step=2025.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3220/5971 [33:02<28:13,  1.62it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00813, train/loss_vlb_step=3.84e-5, train/loss_step=0.00813, global_step=2025.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3220/5971 [33:02<28:13,  1.62it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00825, train/loss_vlb_step=3.7e-5, train/loss_step=0.00825, global_step=2025.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  54%|█████▍    | 3221/5971 [33:03<28:13,  1.62it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00315, train/loss_vlb_step=1.73e-5, train/loss_step=0.00315, global_step=2026.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3222/5971 [33:05<28:13,  1.62it/s, loss=0.138, v_num=0, train/loss_simple_step=0.445, train/loss_vlb_step=0.00334, train/loss_step=0.445, global_step=2026.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  54%|█████▍    | 3223/5971 [33:06<28:12,  1.62it/s, loss=0.138, v_num=0, train/loss_simple_step=0.445, train/loss_vlb_step=0.00334, train/loss_step=0.445, global_step=2026.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3223/5971 [33:06<28:12,  1.62it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0248, train/loss_vlb_step=9.66e-5, train/loss_step=0.0248, global_step=2026.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3224/5971 [33:09<28:14,  1.62it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0404, train/loss_vlb_step=0.000139, train/loss_step=0.0404, global_step=2026.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3225/5971 [33:10<28:13,  1.62it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.38e-5, train/loss_step=0.0139, global_step=2027.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  54%|█████▍    | 3226/5971 [33:11<28:13,  1.62it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.38e-5, train/loss_step=0.0139, global_step=2027.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3226/5971 [33:11<28:13,  1.62it/s, loss=0.116, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000487, train/loss_step=0.148, global_step=2027.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3227/5971 [33:11<28:13,  1.62it/s, loss=0.115, v_num=0, train/loss_simple_step=0.256, train/loss_vlb_step=0.00102, train/loss_step=0.256, global_step=2027.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  54%|█████▍    | 3228/5971 [33:15<28:15,  1.62it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0311, train/loss_vlb_step=0.000116, train/loss_step=0.0311, global_step=2027.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3229/5971 [33:16<28:15,  1.62it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0311, train/loss_vlb_step=0.000116, train/loss_step=0.0311, global_step=2027.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3229/5971 [33:16<28:15,  1.62it/s, loss=0.119, v_num=0, train/loss_simple_step=0.400, train/loss_vlb_step=0.00232, train/loss_step=0.400, global_step=2028.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  54%|█████▍    | 3230/5971 [33:17<28:14,  1.62it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00283, train/loss_vlb_step=1.62e-5, train/loss_step=0.00283, global_step=2028.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3231/5971 [33:18<28:14,  1.62it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0042, train/loss_vlb_step=2.27e-5, train/loss_step=0.0042, global_step=2028.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  54%|█████▍    | 3232/5971 [33:21<28:15,  1.62it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0042, train/loss_vlb_step=2.27e-5, train/loss_step=0.0042, global_step=2028.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3232/5971 [33:21<28:15,  1.62it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0508, train/loss_vlb_step=0.000183, train/loss_step=0.0508, global_step=2028.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3233/5971 [33:22<28:15,  1.62it/s, loss=0.0927, v_num=0, train/loss_simple_step=0.00526, train/loss_vlb_step=2.73e-5, train/loss_step=0.00526, global_step=2029.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3234/5971 [33:23<28:14,  1.61it/s, loss=0.0907, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00133, train/loss_step=0.292, global_step=2029.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  54%|█████▍    | 3235/5971 [33:24<28:14,  1.61it/s, loss=0.0907, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00133, train/loss_step=0.292, global_step=2029.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3235/5971 [33:24<28:14,  1.61it/s, loss=0.0921, v_num=0, train/loss_simple_step=0.0512, train/loss_vlb_step=0.000183, train/loss_step=0.0512, global_step=2029.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3236/5971 [33:27<28:16,  1.61it/s, loss=0.111, v_num=0, train/loss_simple_step=0.396, train/loss_vlb_step=0.00219, train/loss_step=0.396, global_step=2029.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  54%|█████▍    | 3237/5971 [33:28<28:15,  1.61it/s, loss=0.122, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.000839, train/loss_step=0.235, global_step=2030.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3238/5971 [33:29<28:15,  1.61it/s, loss=0.122, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.000839, train/loss_step=0.235, global_step=2030.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3238/5971 [33:29<28:15,  1.61it/s, loss=0.142, v_num=0, train/loss_simple_step=0.431, train/loss_vlb_step=0.00369, train/loss_step=0.431, global_step=2030.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  54%|█████▍    | 3239/5971 [33:30<28:15,  1.61it/s, loss=0.172, v_num=0, train/loss_simple_step=0.594, train/loss_vlb_step=0.00822, train/loss_step=0.594, global_step=2030.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3240/5971 [33:34<28:17,  1.61it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0258, train/loss_vlb_step=9.69e-5, train/loss_step=0.0258, global_step=2030.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3241/5971 [33:35<28:16,  1.61it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0258, train/loss_vlb_step=9.69e-5, train/loss_step=0.0258, global_step=2030.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3241/5971 [33:35<28:16,  1.61it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.62e-5, train/loss_step=0.0122, global_step=2031.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3242/5971 [33:36<28:16,  1.61it/s, loss=0.162, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000805, train/loss_step=0.217, global_step=2031.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  54%|█████▍    | 3243/5971 [33:37<28:16,  1.61it/s, loss=0.169, v_num=0, train/loss_simple_step=0.182, train/loss_vlb_step=0.000696, train/loss_step=0.182, global_step=2031.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3244/5971 [33:39<28:17,  1.61it/s, loss=0.169, v_num=0, train/loss_simple_step=0.182, train/loss_vlb_step=0.000696, train/loss_step=0.182, global_step=2031.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3244/5971 [33:39<28:17,  1.61it/s, loss=0.178, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.00083, train/loss_step=0.217, global_step=2031.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  54%|█████▍    | 3245/5971 [33:40<28:17,  1.61it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.68e-5, train/loss_step=0.0105, global_step=2032.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3246/5971 [33:41<28:16,  1.61it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0267, train/loss_vlb_step=0.0001, train/loss_step=0.0267, global_step=2032.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  54%|█████▍    | 3247/5971 [33:42<28:16,  1.61it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0267, train/loss_vlb_step=0.0001, train/loss_step=0.0267, global_step=2032.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3247/5971 [33:42<28:16,  1.61it/s, loss=0.165, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000384, train/loss_step=0.117, global_step=2032.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3248/5971 [33:45<28:17,  1.60it/s, loss=0.168, v_num=0, train/loss_simple_step=0.098, train/loss_vlb_step=0.000322, train/loss_step=0.098, global_step=2032.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3249/5971 [33:46<28:17,  1.60it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0274, train/loss_vlb_step=9.97e-5, train/loss_step=0.0274, global_step=2033.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3250/5971 [33:47<28:16,  1.60it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0274, train/loss_vlb_step=9.97e-5, train/loss_step=0.0274, global_step=2033.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3250/5971 [33:47<28:16,  1.60it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0444, train/loss_vlb_step=0.000152, train/loss_step=0.0444, global_step=2033.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3251/5971 [33:48<28:16,  1.60it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0231, train/loss_vlb_step=9.44e-5, train/loss_step=0.0231, global_step=2033.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  54%|█████▍    | 3252/5971 [33:51<28:18,  1.60it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0027, train/loss_vlb_step=1.48e-5, train/loss_step=0.0027, global_step=2033.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  54%|█████▍    | 3253/5971 [33:52<28:17,  1.60it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0027, train/loss_vlb_step=1.48e-5, train/loss_step=0.0027, global_step=2033.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3253/5971 [33:52<28:17,  1.60it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0103, train/loss_vlb_step=4.54e-5, train/loss_step=0.0103, global_step=2034.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  54%|█████▍    | 3254/5971 [33:53<28:17,  1.60it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0338, train/loss_vlb_step=0.00012, train/loss_step=0.0338, global_step=2034.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▍    | 3255/5971 [33:54<28:17,  1.60it/s, loss=0.142, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000451, train/loss_step=0.135, global_step=2034.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  55%|█████▍    | 3256/5971 [33:57<28:18,  1.60it/s, loss=0.142, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000451, train/loss_step=0.135, global_step=2034.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▍    | 3256/5971 [33:57<28:18,  1.60it/s, loss=0.123, v_num=0, train/loss_simple_step=0.014, train/loss_vlb_step=6.38e-5, train/loss_step=0.014, global_step=2034.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  55%|█████▍    | 3257/5971 [33:58<28:18,  1.60it/s, loss=0.119, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000554, train/loss_step=0.162, global_step=2035.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▍    | 3258/5971 [33:59<28:17,  1.60it/s, loss=0.0983, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=6.04e-5, train/loss_step=0.0142, global_step=2035.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▍    | 3259/5971 [34:00<28:17,  1.60it/s, loss=0.0983, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=6.04e-5, train/loss_step=0.0142, global_step=2035.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▍    | 3259/5971 [34:00<28:17,  1.60it/s, loss=0.0693, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.17e-5, train/loss_step=0.0147, global_step=2035.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▍    | 3260/5971 [34:03<28:18,  1.60it/s, loss=0.101, v_num=0, train/loss_simple_step=0.653, train/loss_vlb_step=0.0099, train/loss_step=0.653, global_step=2035.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  55%|█████▍    | 3261/5971 [34:04<28:18,  1.60it/s, loss=0.1, v_num=0, train/loss_simple_step=0.00172, train/loss_vlb_step=1.04e-5, train/loss_step=0.00172, global_step=2036.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▍    | 3262/5971 [34:05<28:18,  1.60it/s, loss=0.1, v_num=0, train/loss_simple_step=0.00172, train/loss_vlb_step=1.04e-5, train/loss_step=0.00172, global_step=2036.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▍    | 3262/5971 [34:05<28:18,  1.60it/s, loss=0.0949, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000363, train/loss_step=0.111, global_step=2036.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▍    | 3263/5971 [34:06<28:17,  1.59it/s, loss=0.107, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00351, train/loss_step=0.424, global_step=2036.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  55%|█████▍    | 3264/5971 [34:08<28:18,  1.59it/s, loss=0.0982, v_num=0, train/loss_simple_step=0.0407, train/loss_vlb_step=0.000144, train/loss_step=0.0407, global_step=2036.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▍    | 3265/5971 [34:10<28:18,  1.59it/s, loss=0.0982, v_num=0, train/loss_simple_step=0.0407, train/loss_vlb_step=0.000144, train/loss_step=0.0407, global_step=2036.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▍    | 3265/5971 [34:10<28:18,  1.59it/s, loss=0.0985, v_num=0, train/loss_simple_step=0.0177, train/loss_vlb_step=6.74e-5, train/loss_step=0.0177, global_step=2037.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  55%|█████▍    | 3266/5971 [34:11<28:18,  1.59it/s, loss=0.116, v_num=0, train/loss_simple_step=0.386, train/loss_vlb_step=0.00194, train/loss_step=0.386, global_step=2037.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  55%|█████▍    | 3267/5971 [34:12<28:17,  1.59it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0409, train/loss_vlb_step=0.000151, train/loss_step=0.0409, global_step=2037.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▍    | 3268/5971 [34:14<28:18,  1.59it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0409, train/loss_vlb_step=0.000151, train/loss_step=0.0409, global_step=2037.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▍    | 3268/5971 [34:14<28:18,  1.59it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00226, train/loss_vlb_step=1.29e-5, train/loss_step=0.00226, global_step=2037.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▍    | 3269/5971 [34:15<28:18,  1.59it/s, loss=0.123, v_num=0, train/loss_simple_step=0.338, train/loss_vlb_step=0.00144, train/loss_step=0.338, global_step=2038.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  55%|█████▍    | 3270/5971 [34:16<28:18,  1.59it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00567, train/loss_vlb_step=2.86e-5, train/loss_step=0.00567, global_step=2038.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▍    | 3271/5971 [34:17<28:17,  1.59it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00567, train/loss_vlb_step=2.86e-5, train/loss_step=0.00567, global_step=2038.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▍    | 3271/5971 [34:17<28:17,  1.59it/s, loss=0.132, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.000844, train/loss_step=0.235, global_step=2038.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  55%|█████▍    | 3272/5971 [34:20<28:19,  1.59it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00661, train/loss_vlb_step=3.23e-5, train/loss_step=0.00661, global_step=2038.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▍    | 3273/5971 [34:21<28:18,  1.59it/s, loss=0.142, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.00079, train/loss_step=0.211, global_step=2039.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  55%|█████▍    | 3274/5971 [34:22<28:18,  1.59it/s, loss=0.142, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.00079, train/loss_step=0.211, global_step=2039.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▍    | 3274/5971 [34:22<28:18,  1.59it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00724, train/loss_vlb_step=3.5e-5, train/loss_step=0.00724, global_step=2039.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▍    | 3275/5971 [34:23<28:18,  1.59it/s, loss=0.158, v_num=0, train/loss_simple_step=0.481, train/loss_vlb_step=0.00427, train/loss_step=0.481, global_step=2039.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  55%|█████▍    | 3276/5971 [34:26<28:19,  1.59it/s, loss=0.192, v_num=0, train/loss_simple_step=0.682, train/loss_vlb_step=0.0182, train/loss_step=0.682, global_step=2039.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  55%|█████▍    | 3277/5971 [34:27<28:18,  1.59it/s, loss=0.192, v_num=0, train/loss_simple_step=0.682, train/loss_vlb_step=0.0182, train/loss_step=0.682, global_step=2039.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▍    | 3277/5971 [34:27<28:18,  1.59it/s, loss=0.19, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000402, train/loss_step=0.118, global_step=2040.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▍    | 3278/5971 [34:28<28:18,  1.59it/s, loss=0.19, v_num=0, train/loss_simple_step=0.015, train/loss_vlb_step=6e-5, train/loss_step=0.015, global_step=2040.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  55%|█████▍    | 3279/5971 [34:29<28:18,  1.59it/s, loss=0.189, v_num=0, train/loss_simple_step=0.00977, train/loss_vlb_step=4.52e-5, train/loss_step=0.00977, global_step=2040.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▍    | 3280/5971 [34:31<28:19,  1.58it/s, loss=0.189, v_num=0, train/loss_simple_step=0.00977, train/loss_vlb_step=4.52e-5, train/loss_step=0.00977, global_step=2040.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▍    | 3280/5971 [34:31<28:19,  1.58it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0593, train/loss_vlb_step=0.000204, train/loss_step=0.0593, global_step=2040.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  55%|█████▍    | 3281/5971 [34:32<28:18,  1.58it/s, loss=0.185, v_num=0, train/loss_simple_step=0.500, train/loss_vlb_step=0.00347, train/loss_step=0.500, global_step=2041.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  55%|█████▍    | 3282/5971 [34:33<28:18,  1.58it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0098, train/loss_vlb_step=4.41e-5, train/loss_step=0.0098, global_step=2041.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▍    | 3283/5971 [34:34<28:18,  1.58it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0098, train/loss_vlb_step=4.41e-5, train/loss_step=0.0098, global_step=2041.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▍    | 3283/5971 [34:34<28:18,  1.58it/s, loss=0.174, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00134, train/loss_step=0.305, global_step=2041.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  55%|█████▍    | 3284/5971 [34:37<28:19,  1.58it/s, loss=0.205, v_num=0, train/loss_simple_step=0.665, train/loss_vlb_step=0.0288, train/loss_step=0.665, global_step=2041.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  55%|█████▌    | 3285/5971 [34:38<28:19,  1.58it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0431, train/loss_vlb_step=0.000154, train/loss_step=0.0431, global_step=2042.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▌    | 3286/5971 [34:39<28:18,  1.58it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0431, train/loss_vlb_step=0.000154, train/loss_step=0.0431, global_step=2042.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▌    | 3286/5971 [34:39<28:18,  1.58it/s, loss=0.187, v_num=0, train/loss_simple_step=0.00892, train/loss_vlb_step=4.08e-5, train/loss_step=0.00892, global_step=2042.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▌    | 3287/5971 [34:40<28:18,  1.58it/s, loss=0.229, v_num=0, train/loss_simple_step=0.883, train/loss_vlb_step=0.149, train/loss_step=0.883, global_step=2042.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]      
Epoch 3:  55%|█████▌    | 3288/5971 [34:43<28:19,  1.58it/s, loss=0.23, v_num=0, train/loss_simple_step=0.00852, train/loss_vlb_step=4.04e-5, train/loss_step=0.00852, global_step=2042.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▌    | 3289/5971 [34:44<28:19,  1.58it/s, loss=0.23, v_num=0, train/loss_simple_step=0.00852, train/loss_vlb_step=4.04e-5, train/loss_step=0.00852, global_step=2042.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▌    | 3289/5971 [34:44<28:19,  1.58it/s, loss=0.219, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000445, train/loss_step=0.135, global_step=2043.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  55%|█████▌    | 3290/5971 [34:45<28:18,  1.58it/s, loss=0.238, v_num=0, train/loss_simple_step=0.379, train/loss_vlb_step=0.00247, train/loss_step=0.379, global_step=2043.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  55%|█████▌    | 3291/5971 [34:46<28:18,  1.58it/s, loss=0.227, v_num=0, train/loss_simple_step=0.00548, train/loss_vlb_step=2.75e-5, train/loss_step=0.00548, global_step=2043.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▌    | 3292/5971 [34:48<28:19,  1.58it/s, loss=0.227, v_num=0, train/loss_simple_step=0.00548, train/loss_vlb_step=2.75e-5, train/loss_step=0.00548, global_step=2043.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▌    | 3292/5971 [34:48<28:19,  1.58it/s, loss=0.231, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.00034, train/loss_step=0.102, global_step=2043.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  55%|█████▌    | 3293/5971 [34:49<28:18,  1.58it/s, loss=0.225, v_num=0, train/loss_simple_step=0.0748, train/loss_vlb_step=0.000252, train/loss_step=0.0748, global_step=2044.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▌    | 3294/5971 [34:50<28:18,  1.58it/s, loss=0.247, v_num=0, train/loss_simple_step=0.464, train/loss_vlb_step=0.00352, train/loss_step=0.464, global_step=2044.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  55%|█████▌    | 3295/5971 [34:51<28:18,  1.58it/s, loss=0.247, v_num=0, train/loss_simple_step=0.464, train/loss_vlb_step=0.00352, train/loss_step=0.464, global_step=2044.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▌    | 3295/5971 [34:51<28:18,  1.58it/s, loss=0.224, v_num=0, train/loss_simple_step=0.00907, train/loss_vlb_step=4.07e-5, train/loss_step=0.00907, global_step=2044.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▌    | 3296/5971 [34:55<28:19,  1.57it/s, loss=0.202, v_num=0, train/loss_simple_step=0.248, train/loss_vlb_step=0.000871, train/loss_step=0.248, global_step=2044.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  55%|█████▌    | 3297/5971 [34:56<28:19,  1.57it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0412, train/loss_vlb_step=0.000143, train/loss_step=0.0412, global_step=2045.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▌    | 3298/5971 [34:57<28:19,  1.57it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0412, train/loss_vlb_step=0.000143, train/loss_step=0.0412, global_step=2045.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▌    | 3298/5971 [34:57<28:19,  1.57it/s, loss=0.198, v_num=0, train/loss_simple_step=0.00346, train/loss_vlb_step=1.97e-5, train/loss_step=0.00346, global_step=2045.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▌    | 3299/5971 [34:58<28:18,  1.57it/s, loss=0.226, v_num=0, train/loss_simple_step=0.581, train/loss_vlb_step=0.00804, train/loss_step=0.581, global_step=2045.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  55%|█████▌    | 3300/5971 [35:00<28:19,  1.57it/s, loss=0.252, v_num=0, train/loss_simple_step=0.579, train/loss_vlb_step=0.0052, train/loss_step=0.579, global_step=2045.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  55%|█████▌    | 3301/5971 [35:01<28:19,  1.57it/s, loss=0.252, v_num=0, train/loss_simple_step=0.579, train/loss_vlb_step=0.0052, train/loss_step=0.579, global_step=2045.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▌    | 3301/5971 [35:01<28:19,  1.57it/s, loss=0.229, v_num=0, train/loss_simple_step=0.0294, train/loss_vlb_step=0.000117, train/loss_step=0.0294, global_step=2046.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▌    | 3302/5971 [35:02<28:19,  1.57it/s, loss=0.241, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.000859, train/loss_step=0.247, global_step=2046.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  55%|█████▌    | 3303/5971 [35:03<28:18,  1.57it/s, loss=0.242, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00184, train/loss_step=0.329, global_step=2046.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  55%|█████▌    | 3304/5971 [35:07<28:20,  1.57it/s, loss=0.242, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00184, train/loss_step=0.329, global_step=2046.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▌    | 3304/5971 [35:07<28:20,  1.57it/s, loss=0.209, v_num=0, train/loss_simple_step=0.00643, train/loss_vlb_step=3.13e-5, train/loss_step=0.00643, global_step=2046.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▌    | 3305/5971 [35:08<28:19,  1.57it/s, loss=0.214, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000531, train/loss_step=0.155, global_step=2047.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  55%|█████▌    | 3306/5971 [35:09<28:19,  1.57it/s, loss=0.214, v_num=0, train/loss_simple_step=0.00484, train/loss_vlb_step=2.57e-5, train/loss_step=0.00484, global_step=2047.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▌    | 3307/5971 [35:10<28:19,  1.57it/s, loss=0.214, v_num=0, train/loss_simple_step=0.00484, train/loss_vlb_step=2.57e-5, train/loss_step=0.00484, global_step=2047.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▌    | 3307/5971 [35:10<28:19,  1.57it/s, loss=0.175, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000339, train/loss_step=0.103, global_step=2047.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  55%|█████▌    | 3308/5971 [35:13<28:20,  1.57it/s, loss=0.175, v_num=0, train/loss_simple_step=0.00768, train/loss_vlb_step=3.66e-5, train/loss_step=0.00768, global_step=2047.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▌    | 3309/5971 [35:14<28:20,  1.57it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0749, train/loss_vlb_step=0.000248, train/loss_step=0.0749, global_step=2048.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  55%|█████▌    | 3310/5971 [35:15<28:19,  1.57it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0749, train/loss_vlb_step=0.000248, train/loss_step=0.0749, global_step=2048.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▌    | 3310/5971 [35:15<28:19,  1.57it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0233, train/loss_vlb_step=9.32e-5, train/loss_step=0.0233, global_step=2048.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  55%|█████▌    | 3311/5971 [35:16<28:19,  1.57it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0155, train/loss_vlb_step=6.55e-5, train/loss_step=0.0155, global_step=2048.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▌    | 3312/5971 [35:18<28:20,  1.56it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0612, train/loss_vlb_step=0.000204, train/loss_step=0.0612, global_step=2048.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▌    | 3313/5971 [35:19<28:19,  1.56it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0612, train/loss_vlb_step=0.000204, train/loss_step=0.0612, global_step=2048.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  55%|█████▌    | 3313/5971 [35:19<28:19,  1.56it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00453, train/loss_vlb_step=2.3e-5, train/loss_step=0.00453, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  56%|█████▌    | 3314/5971 [35:20<28:19,  1.56it/s, loss=0.141, v_num=0, train/loss_simple_step=0.297, train/loss_vlb_step=0.00173, train/loss_step=0.297, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  56%|█████▌    | 3315/5971 [35:21<28:18,  1.56it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00636, train/loss_vlb_step=3.08e-5, train/loss_step=0.00636, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  56%|█████▌    | 3316/5971 [35:24<28:20,  1.56it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00636, train/loss_vlb_step=3.08e-5, train/loss_step=0.00636, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  56%|█████▌    | 3316/5971 [35:24<28:20,  1.56it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:07,  2.45it/s][A

Validating:   1%|          | 2/167 [00:01<01:59,  1.38it/s][A
Epoch 3:  56%|█████▌    | 3319/5971 [35:25<28:18,  1.56it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   2%|▏         | 4/167 [00:01<00:49,  3.30it/s][A
Epoch 3:  56%|█████▌    | 3322/5971 [35:25<28:14,  1.56it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   4%|▎         | 6/167 [00:01<00:29,  5.40it/s][A
Epoch 3:  56%|█████▌    | 3325/5971 [35:26<28:11,  1.56it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   5%|▌         | 9/167 [00:01<00:17,  8.92it/s][A
Epoch 3:  56%|█████▌    | 3328/5971 [35:26<28:08,  1.57it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   7%|▋         | 12/167 [00:01<00:12, 12.21it/s][A
Epoch 3:  56%|█████▌    | 3331/5971 [35:26<28:04,  1.57it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   9%|▉         | 15/167 [00:01<00:10, 14.53it/s][A
Epoch 3:  56%|█████▌    | 3334/5971 [35:26<28:01,  1.57it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  11%|█         | 18/167 [00:02<00:08, 17.26it/s][A
Epoch 3:  56%|█████▌    | 3337/5971 [35:26<27:58,  1.57it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  13%|█▎        | 21/167 [00:02<00:07, 19.21it/s][A
Epoch 3:  56%|█████▌    | 3340/5971 [35:26<27:54,  1.57it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  14%|█▍        | 24/167 [00:02<00:06, 20.94it/s][A
Epoch 3:  56%|█████▌    | 3343/5971 [35:26<27:51,  1.57it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  17%|█▋        | 28/167 [00:02<00:05, 23.68it/s][A
Epoch 3:  56%|█████▌    | 3346/5971 [35:26<27:48,  1.57it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  19%|█▊        | 31/167 [00:02<00:05, 22.79it/s][A
Epoch 3:  56%|█████▌    | 3349/5971 [35:27<27:44,  1.57it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  20%|██        | 34/167 [00:02<00:05, 24.19it/s][A
Epoch 3:  56%|█████▌    | 3352/5971 [35:27<27:41,  1.58it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  22%|██▏       | 37/167 [00:02<00:05, 24.60it/s][A
Epoch 3:  56%|█████▌    | 3355/5971 [35:27<27:38,  1.58it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  24%|██▍       | 40/167 [00:02<00:04, 25.48it/s][A
Epoch 3:  56%|█████▌    | 3358/5971 [35:27<27:34,  1.58it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  26%|██▌       | 43/167 [00:03<00:04, 25.91it/s][A
Epoch 3:  56%|█████▋    | 3361/5971 [35:27<27:31,  1.58it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  28%|██▊       | 46/167 [00:03<00:04, 26.69it/s][A
Epoch 3:  56%|█████▋    | 3364/5971 [35:27<27:28,  1.58it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  29%|██▉       | 49/167 [00:03<00:04, 25.51it/s][A
Epoch 3:  56%|█████▋    | 3367/5971 [35:27<27:25,  1.58it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  31%|███       | 52/167 [00:03<00:04, 24.80it/s][A
Epoch 3:  56%|█████▋    | 3370/5971 [35:27<27:21,  1.58it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  33%|███▎      | 55/167 [00:03<00:04, 25.42it/s][A
Epoch 3:  56%|█████▋    | 3373/5971 [35:27<27:18,  1.59it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  35%|███▍      | 58/167 [00:03<00:04, 25.56it/s][A
Epoch 3:  57%|█████▋    | 3376/5971 [35:28<27:15,  1.59it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  37%|███▋      | 61/167 [00:03<00:04, 25.54it/s][A
Epoch 3:  57%|█████▋    | 3379/5971 [35:28<27:12,  1.59it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  38%|███▊      | 64/167 [00:03<00:03, 26.34it/s][A
Epoch 3:  57%|█████▋    | 3382/5971 [35:28<27:08,  1.59it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  40%|████      | 67/167 [00:03<00:03, 26.21it/s][A
Epoch 3:  57%|█████▋    | 3385/5971 [35:28<27:05,  1.59it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  42%|████▏     | 70/167 [00:04<00:03, 24.57it/s][A
Epoch 3:  57%|█████▋    | 3388/5971 [35:28<27:02,  1.59it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  44%|████▎     | 73/167 [00:04<00:03, 23.99it/s][A
Epoch 3:  57%|█████▋    | 3391/5971 [35:28<26:59,  1.59it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  46%|████▌     | 76/167 [00:04<00:03, 23.78it/s][A
Epoch 3:  57%|█████▋    | 3394/5971 [35:28<26:55,  1.59it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  47%|████▋     | 79/167 [00:04<00:03, 23.93it/s][A
Epoch 3:  57%|█████▋    | 3397/5971 [35:28<26:52,  1.60it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  49%|████▉     | 82/167 [00:04<00:03, 23.23it/s][A
Epoch 3:  57%|█████▋    | 3400/5971 [35:29<26:49,  1.60it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  51%|█████     | 85/167 [00:04<00:03, 24.00it/s][A
Epoch 3:  57%|█████▋    | 3403/5971 [35:29<26:46,  1.60it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  53%|█████▎    | 88/167 [00:04<00:03, 23.57it/s][A
Epoch 3:  57%|█████▋    | 3406/5971 [35:29<26:43,  1.60it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  54%|█████▍    | 91/167 [00:05<00:03, 22.43it/s][A
Epoch 3:  57%|█████▋    | 3409/5971 [35:29<26:39,  1.60it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  56%|█████▋    | 94/167 [00:05<00:03, 23.81it/s][A
Epoch 3:  57%|█████▋    | 3412/5971 [35:29<26:36,  1.60it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  58%|█████▊    | 97/167 [00:05<00:02, 24.94it/s][A
Epoch 3:  57%|█████▋    | 3415/5971 [35:29<26:33,  1.60it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  60%|█████▉    | 100/167 [00:05<00:02, 26.13it/s][A
Epoch 3:  57%|█████▋    | 3418/5971 [35:29<26:30,  1.61it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  62%|██████▏   | 103/167 [00:05<00:02, 24.48it/s][A
Epoch 3:  57%|█████▋    | 3421/5971 [35:29<26:27,  1.61it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  63%|██████▎   | 106/167 [00:05<00:02, 23.72it/s][A
Epoch 3:  57%|█████▋    | 3424/5971 [35:30<26:24,  1.61it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  65%|██████▌   | 109/167 [00:05<00:02, 23.80it/s][A
Epoch 3:  57%|█████▋    | 3427/5971 [35:30<26:20,  1.61it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  67%|██████▋   | 112/167 [00:05<00:02, 24.44it/s][A
Epoch 3:  57%|█████▋    | 3430/5971 [35:30<26:17,  1.61it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  69%|██████▉   | 115/167 [00:05<00:02, 24.80it/s][A
Epoch 3:  57%|█████▋    | 3433/5971 [35:30<26:14,  1.61it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  71%|███████   | 118/167 [00:06<00:01, 24.55it/s][A
Epoch 3:  58%|█████▊    | 3436/5971 [35:30<26:11,  1.61it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  72%|███████▏  | 121/167 [00:06<00:01, 24.86it/s][A
Epoch 3:  58%|█████▊    | 3439/5971 [35:30<26:08,  1.61it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  74%|███████▍  | 124/167 [00:06<00:01, 25.06it/s][A
Epoch 3:  58%|█████▊    | 3442/5971 [35:30<26:05,  1.62it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  76%|███████▌  | 127/167 [00:06<00:01, 25.12it/s][A
Epoch 3:  58%|█████▊    | 3445/5971 [35:30<26:01,  1.62it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  78%|███████▊  | 130/167 [00:06<00:01, 25.55it/s][A
Epoch 3:  58%|█████▊    | 3448/5971 [35:30<25:58,  1.62it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  80%|███████▉  | 133/167 [00:06<00:01, 25.21it/s][A
Epoch 3:  58%|█████▊    | 3451/5971 [35:31<25:55,  1.62it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  81%|████████▏ | 136/167 [00:06<00:01, 24.87it/s][A
Epoch 3:  58%|█████▊    | 3454/5971 [35:31<25:52,  1.62it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  83%|████████▎ | 139/167 [00:06<00:01, 24.99it/s][A
Epoch 3:  58%|█████▊    | 3457/5971 [35:31<25:49,  1.62it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  85%|████████▌ | 142/167 [00:07<00:01, 24.80it/s][A
Epoch 3:  58%|█████▊    | 3460/5971 [35:31<25:46,  1.62it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  87%|████████▋ | 145/167 [00:07<00:00, 24.49it/s][A
Epoch 3:  58%|█████▊    | 3463/5971 [35:31<25:43,  1.63it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  89%|████████▊ | 148/167 [00:07<00:00, 24.81it/s][A
Epoch 3:  58%|█████▊    | 3466/5971 [35:31<25:40,  1.63it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  90%|█████████ | 151/167 [00:07<00:00, 24.76it/s][A
Epoch 3:  58%|█████▊    | 3469/5971 [35:31<25:37,  1.63it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  92%|█████████▏| 154/167 [00:07<00:00, 25.16it/s][A
Epoch 3:  58%|█████▊    | 3472/5971 [35:31<25:34,  1.63it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  94%|█████████▍| 157/167 [00:07<00:00, 24.04it/s][A
Epoch 3:  58%|█████▊    | 3475/5971 [35:32<25:30,  1.63it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  96%|█████████▌| 160/167 [00:07<00:00, 24.30it/s][A
Epoch 3:  58%|█████▊    | 3478/5971 [35:32<25:27,  1.63it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  98%|█████████▊| 163/167 [00:07<00:00, 25.06it/s][A
Epoch 3:  58%|█████▊    | 3481/5971 [35:32<25:24,  1.63it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  99%|█████████▉| 166/167 [00:07<00:00, 26.15it/s][A
Epoch 3:  58%|█████▊    | 3484/5971 [35:32<25:21,  1.63it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  58%|█████▊    | 3484/5971 [35:32<25:22,  1.63it/s, loss=0.143, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00109, train/loss_step=0.282, global_step=2049.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Epoch 3:  58%|█████▊    | 3485/5971 [35:33<25:21,  1.63it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.23e-5, train/loss_step=0.0126, global_step=2050.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  58%|█████▊    | 3486/5971 [35:34<25:21,  1.63it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0162, train/loss_vlb_step=6.68e-5, train/loss_step=0.0162, global_step=2050.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  58%|█████▊    | 3487/5971 [35:35<25:21,  1.63it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0162, train/loss_vlb_step=6.68e-5, train/loss_step=0.0162, global_step=2050.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  58%|█████▊    | 3487/5971 [35:35<25:21,  1.63it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00968, train/loss_vlb_step=4.35e-5, train/loss_step=0.00968, global_step=2050.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  58%|█████▊    | 3488/5971 [35:40<25:23,  1.63it/s, loss=0.0917, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000489, train/loss_step=0.149, global_step=2050.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  58%|█████▊    | 3489/5971 [35:41<25:23,  1.63it/s, loss=0.0976, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000492, train/loss_step=0.149, global_step=2051.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  58%|█████▊    | 3490/5971 [35:42<25:22,  1.63it/s, loss=0.0976, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000492, train/loss_step=0.149, global_step=2051.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  58%|█████▊    | 3490/5971 [35:42<25:22,  1.63it/s, loss=0.0876, v_num=0, train/loss_simple_step=0.0452, train/loss_vlb_step=0.000159, train/loss_step=0.0452, global_step=2051.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  58%|█████▊    | 3491/5971 [35:43<25:22,  1.63it/s, loss=0.0722, v_num=0, train/loss_simple_step=0.0221, train/loss_vlb_step=9.25e-5, train/loss_step=0.0221, global_step=2051.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  58%|█████▊    | 3492/5971 [35:46<25:23,  1.63it/s, loss=0.0852, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.00117, train/loss_step=0.266, global_step=2051.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  58%|█████▊    | 3493/5971 [35:47<25:23,  1.63it/s, loss=0.0852, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.00117, train/loss_step=0.266, global_step=2051.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  58%|█████▊    | 3493/5971 [35:47<25:23,  1.63it/s, loss=0.0864, v_num=0, train/loss_simple_step=0.178, train/loss_vlb_step=0.000634, train/loss_step=0.178, global_step=2052.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▊    | 3494/5971 [35:48<25:22,  1.63it/s, loss=0.0865, v_num=0, train/loss_simple_step=0.00778, train/loss_vlb_step=3.67e-5, train/loss_step=0.00778, global_step=2052.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▊    | 3495/5971 [35:49<25:22,  1.63it/s, loss=0.0954, v_num=0, train/loss_simple_step=0.281, train/loss_vlb_step=0.00125, train/loss_step=0.281, global_step=2052.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  59%|█████▊    | 3496/5971 [35:53<25:23,  1.62it/s, loss=0.0954, v_num=0, train/loss_simple_step=0.281, train/loss_vlb_step=0.00125, train/loss_step=0.281, global_step=2052.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▊    | 3496/5971 [35:53<25:23,  1.62it/s, loss=0.125, v_num=0, train/loss_simple_step=0.592, train/loss_vlb_step=0.0102, train/loss_step=0.592, global_step=2052.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  59%|█████▊    | 3497/5971 [35:54<25:23,  1.62it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.48e-5, train/loss_step=0.0115, global_step=2053.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▊    | 3498/5971 [35:55<25:23,  1.62it/s, loss=0.132, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000884, train/loss_step=0.233, global_step=2053.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  59%|█████▊    | 3499/5971 [35:55<25:22,  1.62it/s, loss=0.132, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000884, train/loss_step=0.233, global_step=2053.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▊    | 3499/5971 [35:55<25:22,  1.62it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00687, train/loss_vlb_step=3.3e-5, train/loss_step=0.00687, global_step=2053.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▊    | 3500/5971 [35:58<25:23,  1.62it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.63e-5, train/loss_step=0.0125, global_step=2053.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  59%|█████▊    | 3501/5971 [35:59<25:23,  1.62it/s, loss=0.13, v_num=0, train/loss_simple_step=0.030, train/loss_vlb_step=0.000111, train/loss_step=0.030, global_step=2054.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  59%|█████▊    | 3502/5971 [36:00<25:22,  1.62it/s, loss=0.13, v_num=0, train/loss_simple_step=0.030, train/loss_vlb_step=0.000111, train/loss_step=0.030, global_step=2054.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▊    | 3502/5971 [36:00<25:22,  1.62it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0773, train/loss_vlb_step=0.000254, train/loss_step=0.0773, global_step=2054.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▊    | 3503/5971 [36:01<25:22,  1.62it/s, loss=0.123, v_num=0, train/loss_simple_step=0.085, train/loss_vlb_step=0.00028, train/loss_step=0.085, global_step=2054.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  59%|█████▊    | 3504/5971 [36:04<25:23,  1.62it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00811, train/loss_vlb_step=3.77e-5, train/loss_step=0.00811, global_step=2054.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▊    | 3505/5971 [36:05<25:23,  1.62it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00811, train/loss_vlb_step=3.77e-5, train/loss_step=0.00811, global_step=2054.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▊    | 3505/5971 [36:05<25:23,  1.62it/s, loss=0.123, v_num=0, train/loss_simple_step=0.275, train/loss_vlb_step=0.00151, train/loss_step=0.275, global_step=2055.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  59%|█████▊    | 3506/5971 [36:06<25:22,  1.62it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00962, train/loss_vlb_step=4.38e-5, train/loss_step=0.00962, global_step=2055.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▊    | 3507/5971 [36:07<25:22,  1.62it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00299, train/loss_vlb_step=1.68e-5, train/loss_step=0.00299, global_step=2055.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3508/5971 [36:10<25:23,  1.62it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00299, train/loss_vlb_step=1.68e-5, train/loss_step=0.00299, global_step=2055.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3508/5971 [36:10<25:23,  1.62it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00221, train/loss_vlb_step=1.28e-5, train/loss_step=0.00221, global_step=2055.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3509/5971 [36:11<25:23,  1.62it/s, loss=0.118, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000776, train/loss_step=0.213, global_step=2056.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  59%|█████▉    | 3510/5971 [36:12<25:22,  1.62it/s, loss=0.143, v_num=0, train/loss_simple_step=0.546, train/loss_vlb_step=0.010, train/loss_step=0.546, global_step=2056.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  59%|█████▉    | 3511/5971 [36:13<25:22,  1.62it/s, loss=0.143, v_num=0, train/loss_simple_step=0.546, train/loss_vlb_step=0.010, train/loss_step=0.546, global_step=2056.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3511/5971 [36:13<25:22,  1.62it/s, loss=0.147, v_num=0, train/loss_simple_step=0.094, train/loss_vlb_step=0.000313, train/loss_step=0.094, global_step=2056.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3512/5971 [36:16<25:23,  1.61it/s, loss=0.14, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000425, train/loss_step=0.129, global_step=2056.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  59%|█████▉    | 3513/5971 [36:17<25:23,  1.61it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0454, train/loss_vlb_step=0.00016, train/loss_step=0.0454, global_step=2057.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3514/5971 [36:18<25:22,  1.61it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0454, train/loss_vlb_step=0.00016, train/loss_step=0.0454, global_step=2057.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3514/5971 [36:18<25:22,  1.61it/s, loss=0.162, v_num=0, train/loss_simple_step=0.582, train/loss_vlb_step=0.00606, train/loss_step=0.582, global_step=2057.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  59%|█████▉    | 3515/5971 [36:19<25:22,  1.61it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0353, train/loss_vlb_step=0.00012, train/loss_step=0.0353, global_step=2057.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3516/5971 [36:22<25:23,  1.61it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0579, train/loss_vlb_step=0.000198, train/loss_step=0.0579, global_step=2057.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3517/5971 [36:23<25:22,  1.61it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0579, train/loss_vlb_step=0.000198, train/loss_step=0.0579, global_step=2057.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3517/5971 [36:23<25:22,  1.61it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0384, train/loss_vlb_step=0.000132, train/loss_step=0.0384, global_step=2058.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3518/5971 [36:23<25:22,  1.61it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=7.16e-5, train/loss_step=0.0166, global_step=2058.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  59%|█████▉    | 3519/5971 [36:24<25:21,  1.61it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00232, train/loss_vlb_step=1.31e-5, train/loss_step=0.00232, global_step=2058.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3520/5971 [36:27<25:22,  1.61it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00232, train/loss_vlb_step=1.31e-5, train/loss_step=0.00232, global_step=2058.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3520/5971 [36:27<25:22,  1.61it/s, loss=0.13, v_num=0, train/loss_simple_step=0.350, train/loss_vlb_step=0.00185, train/loss_step=0.350, global_step=2058.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:  59%|█████▉    | 3521/5971 [36:28<25:22,  1.61it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0547, train/loss_vlb_step=0.000186, train/loss_step=0.0547, global_step=2059.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3522/5971 [36:29<25:21,  1.61it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00495, train/loss_vlb_step=2.56e-5, train/loss_step=0.00495, global_step=2059.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3523/5971 [36:30<25:21,  1.61it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00495, train/loss_vlb_step=2.56e-5, train/loss_step=0.00495, global_step=2059.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3523/5971 [36:30<25:21,  1.61it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00273, train/loss_vlb_step=1.57e-5, train/loss_step=0.00273, global_step=2059.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3524/5971 [36:32<25:22,  1.61it/s, loss=0.14, v_num=0, train/loss_simple_step=0.330, train/loss_vlb_step=0.00134, train/loss_step=0.330, global_step=2059.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:  59%|█████▉    | 3525/5971 [36:33<25:21,  1.61it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.55e-5, train/loss_step=0.0121, global_step=2060.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3526/5971 [36:34<25:21,  1.61it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.55e-5, train/loss_step=0.0121, global_step=2060.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3526/5971 [36:34<25:21,  1.61it/s, loss=0.153, v_num=0, train/loss_simple_step=0.542, train/loss_vlb_step=0.00392, train/loss_step=0.542, global_step=2060.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  59%|█████▉    | 3527/5971 [36:35<25:20,  1.61it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0836, train/loss_vlb_step=0.000283, train/loss_step=0.0836, global_step=2060.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3528/5971 [36:37<25:21,  1.61it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0877, train/loss_vlb_step=0.000289, train/loss_step=0.0877, global_step=2060.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3529/5971 [36:38<25:21,  1.61it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0877, train/loss_vlb_step=0.000289, train/loss_step=0.0877, global_step=2060.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3529/5971 [36:38<25:21,  1.61it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00965, train/loss_vlb_step=4.47e-5, train/loss_step=0.00965, global_step=2061.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3530/5971 [36:39<25:20,  1.61it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00501, train/loss_vlb_step=2.57e-5, train/loss_step=0.00501, global_step=2061.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3531/5971 [36:40<25:20,  1.60it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00478, train/loss_vlb_step=2.46e-5, train/loss_step=0.00478, global_step=2061.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  59%|█████▉    | 3532/5971 [36:43<25:21,  1.60it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00478, train/loss_vlb_step=2.46e-5, train/loss_step=0.00478, global_step=2061.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3532/5971 [36:43<25:21,  1.60it/s, loss=0.131, v_num=0, train/loss_simple_step=0.362, train/loss_vlb_step=0.00197, train/loss_step=0.362, global_step=2061.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  59%|█████▉    | 3533/5971 [36:44<25:20,  1.60it/s, loss=0.136, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000477, train/loss_step=0.138, global_step=2062.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3534/5971 [36:45<25:20,  1.60it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0553, train/loss_vlb_step=0.000193, train/loss_step=0.0553, global_step=2062.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3535/5971 [36:46<25:19,  1.60it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0553, train/loss_vlb_step=0.000193, train/loss_step=0.0553, global_step=2062.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3535/5971 [36:46<25:19,  1.60it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0894, train/loss_vlb_step=0.000299, train/loss_step=0.0894, global_step=2062.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3536/5971 [36:48<25:20,  1.60it/s, loss=0.116, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000425, train/loss_step=0.129, global_step=2062.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  59%|█████▉    | 3537/5971 [36:49<25:19,  1.60it/s, loss=0.145, v_num=0, train/loss_simple_step=0.615, train/loss_vlb_step=0.0144, train/loss_step=0.615, global_step=2063.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  59%|█████▉    | 3538/5971 [36:50<25:19,  1.60it/s, loss=0.145, v_num=0, train/loss_simple_step=0.615, train/loss_vlb_step=0.0144, train/loss_step=0.615, global_step=2063.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3538/5971 [36:50<25:19,  1.60it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0161, train/loss_vlb_step=6.95e-5, train/loss_step=0.0161, global_step=2063.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3539/5971 [36:51<25:19,  1.60it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0427, train/loss_vlb_step=0.000145, train/loss_step=0.0427, global_step=2063.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3540/5971 [36:53<25:19,  1.60it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00808, train/loss_vlb_step=3.83e-5, train/loss_step=0.00808, global_step=2063.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3541/5971 [36:54<25:19,  1.60it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00808, train/loss_vlb_step=3.83e-5, train/loss_step=0.00808, global_step=2063.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3541/5971 [36:54<25:19,  1.60it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00877, train/loss_vlb_step=4.26e-5, train/loss_step=0.00877, global_step=2064.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3542/5971 [36:55<25:19,  1.60it/s, loss=0.134, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000481, train/loss_step=0.140, global_step=2064.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  59%|█████▉    | 3543/5971 [36:56<25:18,  1.60it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00308, train/loss_vlb_step=1.72e-5, train/loss_step=0.00308, global_step=2064.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3544/5971 [36:58<25:19,  1.60it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00308, train/loss_vlb_step=1.72e-5, train/loss_step=0.00308, global_step=2064.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3544/5971 [36:58<25:19,  1.60it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0236, train/loss_vlb_step=8.97e-5, train/loss_step=0.0236, global_step=2064.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  59%|█████▉    | 3545/5971 [36:59<25:18,  1.60it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0831, train/loss_vlb_step=0.000281, train/loss_step=0.0831, global_step=2065.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3546/5971 [37:00<25:18,  1.60it/s, loss=0.0981, v_num=0, train/loss_simple_step=0.0567, train/loss_vlb_step=0.000196, train/loss_step=0.0567, global_step=2065.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3547/5971 [37:01<25:17,  1.60it/s, loss=0.0981, v_num=0, train/loss_simple_step=0.0567, train/loss_vlb_step=0.000196, train/loss_step=0.0567, global_step=2065.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3547/5971 [37:01<25:17,  1.60it/s, loss=0.094, v_num=0, train/loss_simple_step=0.00224, train/loss_vlb_step=1.3e-5, train/loss_step=0.00224, global_step=2065.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  59%|█████▉    | 3548/5971 [37:03<25:18,  1.60it/s, loss=0.0936, v_num=0, train/loss_simple_step=0.0794, train/loss_vlb_step=0.000269, train/loss_step=0.0794, global_step=2065.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3549/5971 [37:04<25:17,  1.60it/s, loss=0.0944, v_num=0, train/loss_simple_step=0.0251, train/loss_vlb_step=9.77e-5, train/loss_step=0.0251, global_step=2066.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  59%|█████▉    | 3550/5971 [37:05<25:17,  1.60it/s, loss=0.0944, v_num=0, train/loss_simple_step=0.0251, train/loss_vlb_step=9.77e-5, train/loss_step=0.0251, global_step=2066.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  59%|█████▉    | 3550/5971 [37:05<25:17,  1.60it/s, loss=0.103, v_num=0, train/loss_simple_step=0.182, train/loss_vlb_step=0.000679, train/loss_step=0.182, global_step=2066.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  59%|█████▉    | 3551/5971 [37:06<25:17,  1.60it/s, loss=0.108, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.00034, train/loss_step=0.103, global_step=2066.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  59%|█████▉    | 3552/5971 [37:09<25:17,  1.59it/s, loss=0.11, v_num=0, train/loss_simple_step=0.394, train/loss_vlb_step=0.00175, train/loss_step=0.394, global_step=2066.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  60%|█████▉    | 3553/5971 [37:10<25:17,  1.59it/s, loss=0.11, v_num=0, train/loss_simple_step=0.394, train/loss_vlb_step=0.00175, train/loss_step=0.394, global_step=2066.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  60%|█████▉    | 3553/5971 [37:10<25:17,  1.59it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00328, train/loss_vlb_step=1.78e-5, train/loss_step=0.00328, global_step=2067.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  60%|█████▉    | 3554/5971 [37:10<25:16,  1.59it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.07e-5, train/loss_step=0.0147, global_step=2067.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  60%|█████▉    | 3555/5971 [37:11<25:16,  1.59it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0733, train/loss_vlb_step=0.000241, train/loss_step=0.0733, global_step=2067.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  60%|█████▉    | 3556/5971 [37:14<25:16,  1.59it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0733, train/loss_vlb_step=0.000241, train/loss_step=0.0733, global_step=2067.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  60%|█████▉    | 3556/5971 [37:14<25:16,  1.59it/s, loss=0.0961, v_num=0, train/loss_simple_step=0.0479, train/loss_vlb_step=0.000161, train/loss_step=0.0479, global_step=2067.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  60%|█████▉    | 3557/5971 [37:15<25:16,  1.59it/s, loss=0.0697, v_num=0, train/loss_simple_step=0.0876, train/loss_vlb_step=0.000289, train/loss_step=0.0876, global_step=2068.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  60%|█████▉    | 3558/5971 [37:16<25:16,  1.59it/s, loss=0.0703, v_num=0, train/loss_simple_step=0.0268, train/loss_vlb_step=9.92e-5, train/loss_step=0.0268, global_step=2068.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  60%|█████▉    | 3559/5971 [37:17<25:15,  1.59it/s, loss=0.0703, v_num=0, train/loss_simple_step=0.0268, train/loss_vlb_step=9.92e-5, train/loss_step=0.0268, global_step=2068.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  60%|█████▉    | 3559/5971 [37:17<25:15,  1.59it/s, loss=0.0682, v_num=0, train/loss_simple_step=0.00206, train/loss_vlb_step=1.13e-5, train/loss_step=0.00206, global_step=2068.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  60%|█████▉    | 3560/5971 [37:19<25:16,  1.59it/s, loss=0.0683, v_num=0, train/loss_simple_step=0.00968, train/loss_vlb_step=4.36e-5, train/loss_step=0.00968, global_step=2068.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  60%|█████▉    | 3561/5971 [37:20<25:15,  1.59it/s, loss=0.104, v_num=0, train/loss_simple_step=0.731, train/loss_vlb_step=0.0186, train/loss_step=0.731, global_step=2069.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]      
Epoch 3:  60%|█████▉    | 3562/5971 [37:21<25:15,  1.59it/s, loss=0.104, v_num=0, train/loss_simple_step=0.731, train/loss_vlb_step=0.0186, train/loss_step=0.731, global_step=2069.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  60%|█████▉    | 3562/5971 [37:21<25:15,  1.59it/s, loss=0.105, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000532, train/loss_step=0.151, global_step=2069.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  60%|█████▉    | 3563/5971 [37:21<25:14,  1.59it/s, loss=0.125, v_num=0, train/loss_simple_step=0.399, train/loss_vlb_step=0.00328, train/loss_step=0.399, global_step=2069.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  60%|█████▉    | 3564/5971 [37:24<25:15,  1.59it/s, loss=0.147, v_num=0, train/loss_simple_step=0.461, train/loss_vlb_step=0.00334, train/loss_step=0.461, global_step=2069.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  60%|█████▉    | 3565/5971 [37:25<25:14,  1.59it/s, loss=0.147, v_num=0, train/loss_simple_step=0.461, train/loss_vlb_step=0.00334, train/loss_step=0.461, global_step=2069.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  60%|█████▉    | 3565/5971 [37:25<25:14,  1.59it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0671, train/loss_vlb_step=0.000233, train/loss_step=0.0671, global_step=2070.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  60%|█████▉    | 3566/5971 [37:26<25:14,  1.59it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0184, train/loss_vlb_step=7.7e-5, train/loss_step=0.0184, global_step=2070.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  60%|█████▉    | 3567/5971 [37:27<25:14,  1.59it/s, loss=0.152, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000524, train/loss_step=0.160, global_step=2070.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  60%|█████▉    | 3568/5971 [37:29<25:14,  1.59it/s, loss=0.152, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000524, train/loss_step=0.160, global_step=2070.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  60%|█████▉    | 3568/5971 [37:29<25:14,  1.59it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0926, train/loss_vlb_step=0.000306, train/loss_step=0.0926, global_step=2070.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  60%|█████▉    | 3569/5971 [37:30<25:14,  1.59it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0562, train/loss_vlb_step=0.000189, train/loss_step=0.0562, global_step=2071.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  60%|█████▉    | 3570/5971 [37:31<25:13,  1.59it/s, loss=0.153, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000537, train/loss_step=0.156, global_step=2071.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  60%|█████▉    | 3571/5971 [37:32<25:13,  1.59it/s, loss=0.153, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000537, train/loss_step=0.156, global_step=2071.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  60%|█████▉    | 3571/5971 [37:32<25:13,  1.59it/s, loss=0.178, v_num=0, train/loss_simple_step=0.601, train/loss_vlb_step=0.00889, train/loss_step=0.601, global_step=2071.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  60%|█████▉    | 3572/5971 [37:34<25:13,  1.58it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00333, train/loss_vlb_step=1.83e-5, train/loss_step=0.00333, global_step=2071.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  60%|█████▉    | 3573/5971 [37:35<25:13,  1.58it/s, loss=0.169, v_num=0, train/loss_simple_step=0.226, train/loss_vlb_step=0.000816, train/loss_step=0.226, global_step=2072.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  60%|█████▉    | 3574/5971 [37:36<25:12,  1.58it/s, loss=0.169, v_num=0, train/loss_simple_step=0.226, train/loss_vlb_step=0.000816, train/loss_step=0.226, global_step=2072.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  60%|█████▉    | 3574/5971 [37:36<25:12,  1.58it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0448, train/loss_vlb_step=0.000156, train/loss_step=0.0448, global_step=2072.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  60%|█████▉    | 3575/5971 [37:37<25:12,  1.58it/s, loss=0.194, v_num=0, train/loss_simple_step=0.530, train/loss_vlb_step=0.00465, train/loss_step=0.530, global_step=2072.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  60%|█████▉    | 3576/5971 [37:39<25:12,  1.58it/s, loss=0.208, v_num=0, train/loss_simple_step=0.340, train/loss_vlb_step=0.0018, train/loss_step=0.340, global_step=2072.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  60%|█████▉    | 3577/5971 [37:40<25:12,  1.58it/s, loss=0.208, v_num=0, train/loss_simple_step=0.340, train/loss_vlb_step=0.0018, train/loss_step=0.340, global_step=2072.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  60%|█████▉    | 3577/5971 [37:40<25:12,  1.58it/s, loss=0.207, v_num=0, train/loss_simple_step=0.0662, train/loss_vlb_step=0.000229, train/loss_step=0.0662, global_step=2073.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  60%|█████▉    | 3578/5971 [37:41<25:11,  1.58it/s, loss=0.207, v_num=0, train/loss_simple_step=0.0226, train/loss_vlb_step=9.85e-5, train/loss_step=0.0226, global_step=2073.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  60%|█████▉    | 3579/5971 [37:42<25:11,  1.58it/s, loss=0.207, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=4.8e-5, train/loss_step=0.0109, global_step=2073.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  60%|█████▉    | 3580/5971 [37:44<25:11,  1.58it/s, loss=0.207, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=4.8e-5, train/loss_step=0.0109, global_step=2073.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  60%|█████▉    | 3580/5971 [37:44<25:11,  1.58it/s, loss=0.229, v_num=0, train/loss_simple_step=0.434, train/loss_vlb_step=0.00372, train/loss_step=0.434, global_step=2073.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  60%|█████▉    | 3581/5971 [37:45<25:11,  1.58it/s, loss=0.198, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000384, train/loss_step=0.117, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  60%|█████▉    | 3582/5971 [37:46<25:10,  1.58it/s, loss=0.2, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000619, train/loss_step=0.188, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  60%|██████    | 3583/5971 [37:47<25:10,  1.58it/s, loss=0.2, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000619, train/loss_step=0.188, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  60%|██████    | 3583/5971 [37:47<25:10,  1.58it/s, loss=0.189, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000617, train/loss_step=0.181, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  60%|██████    | 3584/5971 [37:49<25:10,  1.58it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:13,  2.27it/s][A
Epoch 3:  60%|██████    | 3586/5971 [37:49<25:09,  1.58it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   1%|          | 2/167 [00:00<00:44,  3.67it/s][A
Epoch 3:  60%|██████    | 3589/5971 [37:49<25:06,  1.58it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.21it/s][A
Epoch 3:  60%|██████    | 3592/5971 [37:50<25:03,  1.58it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.97it/s][A
Epoch 3:  60%|██████    | 3595/5971 [37:50<24:59,  1.58it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   7%|▋         | 11/167 [00:00<00:08, 17.55it/s][A
Epoch 3:  60%|██████    | 3598/5971 [37:50<24:56,  1.59it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   8%|▊         | 14/167 [00:01<00:07, 20.64it/s][A
Epoch 3:  60%|██████    | 3601/5971 [37:50<24:53,  1.59it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  10%|█         | 17/167 [00:01<00:06, 22.84it/s][A
Epoch 3:  60%|██████    | 3604/5971 [37:50<24:50,  1.59it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  13%|█▎        | 21/167 [00:01<00:05, 25.05it/s][A
Epoch 3:  60%|██████    | 3607/5971 [37:50<24:47,  1.59it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  14%|█▍        | 24/167 [00:01<00:05, 26.11it/s][A
Epoch 3:  60%|██████    | 3611/5971 [37:50<24:43,  1.59it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 26.10it/s][A

Validating:  18%|█▊        | 30/167 [00:01<00:05, 26.54it/s][A
Epoch 3:  61%|██████    | 3615/5971 [37:50<24:39,  1.59it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 26.18it/s][A
Epoch 3:  61%|██████    | 3619/5971 [37:51<24:35,  1.59it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  22%|██▏       | 36/167 [00:01<00:04, 26.80it/s][A
Epoch 3:  61%|██████    | 3623/5971 [37:51<24:31,  1.60it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  23%|██▎       | 39/167 [00:01<00:04, 27.22it/s][A

Validating:  25%|██▌       | 42/167 [00:02<00:04, 27.71it/s][A
Epoch 3:  61%|██████    | 3627/5971 [37:51<24:27,  1.60it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 26.34it/s][A
Epoch 3:  61%|██████    | 3631/5971 [37:51<24:23,  1.60it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 27.18it/s][A
Epoch 3:  61%|██████    | 3635/5971 [37:51<24:19,  1.60it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  31%|███       | 51/167 [00:02<00:04, 26.38it/s][A

Validating:  32%|███▏      | 54/167 [00:02<00:04, 26.58it/s][A
Epoch 3:  61%|██████    | 3639/5971 [37:51<24:15,  1.60it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  35%|███▍      | 58/167 [00:02<00:03, 27.86it/s][A
Epoch 3:  61%|██████    | 3643/5971 [37:51<24:11,  1.60it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  37%|███▋      | 61/167 [00:02<00:03, 27.37it/s][A
Epoch 3:  61%|██████    | 3647/5971 [37:52<24:07,  1.61it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  38%|███▊      | 64/167 [00:02<00:03, 27.76it/s][A
Epoch 3:  61%|██████    | 3651/5971 [37:52<24:03,  1.61it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  40%|████      | 67/167 [00:02<00:03, 27.40it/s][A

Validating:  42%|████▏     | 70/167 [00:03<00:03, 26.32it/s][A
Epoch 3:  61%|██████    | 3655/5971 [37:52<23:59,  1.61it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 25.65it/s][A
Epoch 3:  61%|██████▏   | 3659/5971 [37:52<23:55,  1.61it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 26.26it/s][A
Epoch 3:  61%|██████▏   | 3663/5971 [37:52<23:51,  1.61it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 26.09it/s][A

Validating:  49%|████▉     | 82/167 [00:03<00:03, 26.16it/s][A
Epoch 3:  61%|██████▏   | 3667/5971 [37:52<23:47,  1.61it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  51%|█████     | 85/167 [00:03<00:03, 26.70it/s][A
Epoch 3:  61%|██████▏   | 3671/5971 [37:52<23:43,  1.62it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  53%|█████▎    | 88/167 [00:03<00:02, 26.47it/s][A
Epoch 3:  62%|██████▏   | 3675/5971 [37:53<23:39,  1.62it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  54%|█████▍    | 91/167 [00:03<00:02, 26.02it/s][A

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 26.33it/s][A
Epoch 3:  62%|██████▏   | 3679/5971 [37:53<23:35,  1.62it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 27.20it/s][A
Epoch 3:  62%|██████▏   | 3683/5971 [37:53<23:31,  1.62it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  60%|██████    | 101/167 [00:04<00:02, 27.62it/s][A
Epoch 3:  62%|██████▏   | 3687/5971 [37:53<23:28,  1.62it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 26.04it/s][A
Epoch 3:  62%|██████▏   | 3691/5971 [37:53<23:24,  1.62it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 26.09it/s][A

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 25.92it/s][A
Epoch 3:  62%|██████▏   | 3695/5971 [37:53<23:20,  1.63it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  68%|██████▊   | 113/167 [00:04<00:02, 26.61it/s][A
Epoch 3:  62%|██████▏   | 3699/5971 [37:54<23:16,  1.63it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  69%|██████▉   | 116/167 [00:04<00:01, 26.39it/s][A
Epoch 3:  62%|██████▏   | 3703/5971 [37:54<23:12,  1.63it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  71%|███████▏  | 119/167 [00:04<00:01, 27.32it/s][A

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 27.93it/s][A
Epoch 3:  62%|██████▏   | 3707/5971 [37:54<23:08,  1.63it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 27.45it/s][A
Epoch 3:  62%|██████▏   | 3711/5971 [37:54<23:04,  1.63it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 25.26it/s][A
Epoch 3:  62%|██████▏   | 3715/5971 [37:54<23:00,  1.63it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 25.89it/s][A

Validating:  80%|████████  | 134/167 [00:05<00:01, 25.06it/s][A
Epoch 3:  62%|██████▏   | 3719/5971 [37:54<22:57,  1.64it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 25.88it/s][A
Epoch 3:  62%|██████▏   | 3723/5971 [37:54<22:53,  1.64it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  84%|████████▍ | 141/167 [00:05<00:01, 25.85it/s][A
Epoch 3:  62%|██████▏   | 3727/5971 [37:55<22:49,  1.64it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  86%|████████▌ | 144/167 [00:05<00:00, 24.64it/s][A
Epoch 3:  62%|██████▏   | 3731/5971 [37:55<22:45,  1.64it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 23.70it/s][A

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 24.13it/s][A
Epoch 3:  63%|██████▎   | 3735/5971 [37:55<22:41,  1.64it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 24.40it/s][A
Epoch 3:  63%|██████▎   | 3739/5971 [37:55<22:38,  1.64it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 24.99it/s][A
Epoch 3:  63%|██████▎   | 3743/5971 [37:55<22:34,  1.65it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 24.77it/s][A

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 25.43it/s][A
Epoch 3:  63%|██████▎   | 3747/5971 [37:55<22:30,  1.65it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  99%|█████████▉| 165/167 [00:06<00:00, 25.38it/s][A
Epoch 3:  63%|██████▎   | 3751/5971 [37:56<22:26,  1.65it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3752/5971 [37:57<22:26,  1.65it/s, loss=0.17, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000273, train/loss_step=0.083, global_step=2074.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Epoch 3:  63%|██████▎   | 3753/5971 [37:58<22:25,  1.65it/s, loss=0.174, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000525, train/loss_step=0.158, global_step=2075.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3754/5971 [37:59<22:25,  1.65it/s, loss=0.181, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000495, train/loss_step=0.148, global_step=2075.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3755/5971 [37:59<22:25,  1.65it/s, loss=0.181, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000495, train/loss_step=0.148, global_step=2075.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3755/5971 [37:59<22:25,  1.65it/s, loss=0.183, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000764, train/loss_step=0.209, global_step=2075.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3756/5971 [38:02<22:25,  1.65it/s, loss=0.18, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.57e-5, train/loss_step=0.017, global_step=2075.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  63%|██████▎   | 3757/5971 [38:03<22:25,  1.65it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0513, train/loss_vlb_step=0.000182, train/loss_step=0.0513, global_step=2076.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3758/5971 [38:04<22:24,  1.65it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0867, train/loss_vlb_step=0.000285, train/loss_step=0.0867, global_step=2076.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3759/5971 [38:05<22:24,  1.65it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0867, train/loss_vlb_step=0.000285, train/loss_step=0.0867, global_step=2076.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3759/5971 [38:05<22:24,  1.65it/s, loss=0.163, v_num=0, train/loss_simple_step=0.342, train/loss_vlb_step=0.00149, train/loss_step=0.342, global_step=2076.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  63%|██████▎   | 3760/5971 [38:07<22:24,  1.64it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0387, train/loss_vlb_step=0.000151, train/loss_step=0.0387, global_step=2076.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3761/5971 [38:08<22:24,  1.64it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00104, train/loss_vlb_step=6.32e-6, train/loss_step=0.00104, global_step=2077.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3762/5971 [38:09<22:23,  1.64it/s, loss=0.157, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000411, train/loss_step=0.125, global_step=2077.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  63%|██████▎   | 3763/5971 [38:10<22:23,  1.64it/s, loss=0.157, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000411, train/loss_step=0.125, global_step=2077.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3763/5971 [38:10<22:23,  1.64it/s, loss=0.158, v_num=0, train/loss_simple_step=0.552, train/loss_vlb_step=0.00631, train/loss_step=0.552, global_step=2077.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  63%|██████▎   | 3764/5971 [38:13<22:24,  1.64it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00189, train/loss_vlb_step=1.1e-5, train/loss_step=0.00189, global_step=2077.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3765/5971 [38:13<22:23,  1.64it/s, loss=0.168, v_num=0, train/loss_simple_step=0.600, train/loss_vlb_step=0.00932, train/loss_step=0.600, global_step=2078.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  63%|██████▎   | 3766/5971 [38:14<22:23,  1.64it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=5.16e-5, train/loss_step=0.0116, global_step=2078.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3767/5971 [38:15<22:22,  1.64it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=5.16e-5, train/loss_step=0.0116, global_step=2078.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3767/5971 [38:15<22:22,  1.64it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00915, train/loss_vlb_step=4.01e-5, train/loss_step=0.00915, global_step=2078.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3768/5971 [38:18<22:23,  1.64it/s, loss=0.154, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000529, train/loss_step=0.154, global_step=2078.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  63%|██████▎   | 3769/5971 [38:19<22:23,  1.64it/s, loss=0.181, v_num=0, train/loss_simple_step=0.666, train/loss_vlb_step=0.0249, train/loss_step=0.666, global_step=2079.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  63%|██████▎   | 3770/5971 [38:20<22:22,  1.64it/s, loss=0.195, v_num=0, train/loss_simple_step=0.467, train/loss_vlb_step=0.00308, train/loss_step=0.467, global_step=2079.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3771/5971 [38:21<22:22,  1.64it/s, loss=0.195, v_num=0, train/loss_simple_step=0.467, train/loss_vlb_step=0.00308, train/loss_step=0.467, global_step=2079.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3771/5971 [38:21<22:22,  1.64it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0878, train/loss_vlb_step=0.000289, train/loss_step=0.0878, global_step=2079.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3772/5971 [38:24<22:22,  1.64it/s, loss=0.194, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000591, train/loss_step=0.163, global_step=2079.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  63%|██████▎   | 3773/5971 [38:25<22:22,  1.64it/s, loss=0.213, v_num=0, train/loss_simple_step=0.529, train/loss_vlb_step=0.00802, train/loss_step=0.529, global_step=2080.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  63%|██████▎   | 3774/5971 [38:25<22:22,  1.64it/s, loss=0.206, v_num=0, train/loss_simple_step=0.00541, train/loss_vlb_step=2.84e-5, train/loss_step=0.00541, global_step=2080.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3775/5971 [38:26<22:21,  1.64it/s, loss=0.206, v_num=0, train/loss_simple_step=0.00541, train/loss_vlb_step=2.84e-5, train/loss_step=0.00541, global_step=2080.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3775/5971 [38:26<22:21,  1.64it/s, loss=0.204, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000581, train/loss_step=0.172, global_step=2080.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  63%|██████▎   | 3776/5971 [38:29<22:22,  1.64it/s, loss=0.207, v_num=0, train/loss_simple_step=0.0685, train/loss_vlb_step=0.00023, train/loss_step=0.0685, global_step=2080.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3777/5971 [38:30<22:21,  1.63it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=4.95e-5, train/loss_step=0.0109, global_step=2081.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3778/5971 [38:31<22:21,  1.63it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0403, train/loss_vlb_step=0.000143, train/loss_step=0.0403, global_step=2081.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3779/5971 [38:32<22:20,  1.63it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0403, train/loss_vlb_step=0.000143, train/loss_step=0.0403, global_step=2081.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3779/5971 [38:32<22:20,  1.63it/s, loss=0.207, v_num=0, train/loss_simple_step=0.441, train/loss_vlb_step=0.00374, train/loss_step=0.441, global_step=2081.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  63%|██████▎   | 3780/5971 [38:34<22:21,  1.63it/s, loss=0.218, v_num=0, train/loss_simple_step=0.252, train/loss_vlb_step=0.00108, train/loss_step=0.252, global_step=2081.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3781/5971 [38:35<22:20,  1.63it/s, loss=0.237, v_num=0, train/loss_simple_step=0.379, train/loss_vlb_step=0.00206, train/loss_step=0.379, global_step=2082.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3782/5971 [38:36<22:20,  1.63it/s, loss=0.233, v_num=0, train/loss_simple_step=0.0567, train/loss_vlb_step=0.000193, train/loss_step=0.0567, global_step=2082.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3783/5971 [38:37<22:20,  1.63it/s, loss=0.233, v_num=0, train/loss_simple_step=0.0567, train/loss_vlb_step=0.000193, train/loss_step=0.0567, global_step=2082.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3783/5971 [38:37<22:20,  1.63it/s, loss=0.209, v_num=0, train/loss_simple_step=0.0576, train/loss_vlb_step=0.000202, train/loss_step=0.0576, global_step=2082.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3784/5971 [38:39<22:20,  1.63it/s, loss=0.229, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00319, train/loss_step=0.406, global_step=2082.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  63%|██████▎   | 3785/5971 [38:40<22:20,  1.63it/s, loss=0.215, v_num=0, train/loss_simple_step=0.326, train/loss_vlb_step=0.00149, train/loss_step=0.326, global_step=2083.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3786/5971 [38:41<22:19,  1.63it/s, loss=0.232, v_num=0, train/loss_simple_step=0.340, train/loss_vlb_step=0.00183, train/loss_step=0.340, global_step=2083.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3787/5971 [38:42<22:19,  1.63it/s, loss=0.232, v_num=0, train/loss_simple_step=0.340, train/loss_vlb_step=0.00183, train/loss_step=0.340, global_step=2083.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3787/5971 [38:42<22:19,  1.63it/s, loss=0.231, v_num=0, train/loss_simple_step=0.00161, train/loss_vlb_step=9.69e-6, train/loss_step=0.00161, global_step=2083.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3788/5971 [38:45<22:19,  1.63it/s, loss=0.227, v_num=0, train/loss_simple_step=0.065, train/loss_vlb_step=0.00023, train/loss_step=0.065, global_step=2083.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  63%|██████▎   | 3789/5971 [38:45<22:19,  1.63it/s, loss=0.206, v_num=0, train/loss_simple_step=0.252, train/loss_vlb_step=0.000928, train/loss_step=0.252, global_step=2084.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3790/5971 [38:46<22:18,  1.63it/s, loss=0.189, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000388, train/loss_step=0.117, global_step=2084.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3791/5971 [38:47<22:18,  1.63it/s, loss=0.189, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000388, train/loss_step=0.117, global_step=2084.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  63%|██████▎   | 3791/5971 [38:47<22:18,  1.63it/s, loss=0.184, v_num=0, train/loss_simple_step=0.00507, train/loss_vlb_step=2.61e-5, train/loss_step=0.00507, global_step=2084.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▎   | 3792/5971 [38:50<22:18,  1.63it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00154, train/loss_vlb_step=9.39e-6, train/loss_step=0.00154, global_step=2084.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▎   | 3793/5971 [38:51<22:18,  1.63it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00802, train/loss_vlb_step=3.67e-5, train/loss_step=0.00802, global_step=2085.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  64%|██████▎   | 3794/5971 [38:52<22:17,  1.63it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0422, train/loss_vlb_step=0.000146, train/loss_step=0.0422, global_step=2085.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▎   | 3795/5971 [38:53<22:17,  1.63it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0422, train/loss_vlb_step=0.000146, train/loss_step=0.0422, global_step=2085.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▎   | 3795/5971 [38:53<22:17,  1.63it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=7.1e-5, train/loss_step=0.0166, global_step=2085.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  64%|██████▎   | 3796/5971 [38:56<22:18,  1.63it/s, loss=0.151, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000727, train/loss_step=0.204, global_step=2085.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▎   | 3797/5971 [38:57<22:17,  1.62it/s, loss=0.165, v_num=0, train/loss_simple_step=0.286, train/loss_vlb_step=0.00113, train/loss_step=0.286, global_step=2086.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  64%|██████▎   | 3798/5971 [38:58<22:17,  1.62it/s, loss=0.171, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000528, train/loss_step=0.156, global_step=2086.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▎   | 3799/5971 [38:59<22:17,  1.62it/s, loss=0.171, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000528, train/loss_step=0.156, global_step=2086.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▎   | 3799/5971 [38:59<22:17,  1.62it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00721, train/loss_vlb_step=3.42e-5, train/loss_step=0.00721, global_step=2086.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▎   | 3800/5971 [39:01<22:17,  1.62it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00186, train/loss_vlb_step=1.12e-5, train/loss_step=0.00186, global_step=2086.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▎   | 3801/5971 [39:02<22:16,  1.62it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0152, train/loss_vlb_step=6.38e-5, train/loss_step=0.0152, global_step=2087.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  64%|██████▎   | 3802/5971 [39:03<22:16,  1.62it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0185, train/loss_vlb_step=7.87e-5, train/loss_step=0.0185, global_step=2087.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▎   | 3803/5971 [39:04<22:16,  1.62it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0185, train/loss_vlb_step=7.87e-5, train/loss_step=0.0185, global_step=2087.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▎   | 3803/5971 [39:04<22:16,  1.62it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0355, train/loss_vlb_step=0.000134, train/loss_step=0.0355, global_step=2087.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▎   | 3804/5971 [39:07<22:16,  1.62it/s, loss=0.0963, v_num=0, train/loss_simple_step=0.0281, train/loss_vlb_step=0.000107, train/loss_step=0.0281, global_step=2087.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▎   | 3805/5971 [39:08<22:16,  1.62it/s, loss=0.0823, v_num=0, train/loss_simple_step=0.046, train/loss_vlb_step=0.00016, train/loss_step=0.046, global_step=2088.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  64%|██████▎   | 3806/5971 [39:08<22:15,  1.62it/s, loss=0.101, v_num=0, train/loss_simple_step=0.709, train/loss_vlb_step=0.0285, train/loss_step=0.709, global_step=2088.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  64%|██████▍   | 3807/5971 [39:09<22:15,  1.62it/s, loss=0.101, v_num=0, train/loss_simple_step=0.709, train/loss_vlb_step=0.0285, train/loss_step=0.709, global_step=2088.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3807/5971 [39:09<22:15,  1.62it/s, loss=0.144, v_num=0, train/loss_simple_step=0.864, train/loss_vlb_step=0.218, train/loss_step=0.864, global_step=2088.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  64%|██████▍   | 3808/5971 [39:12<22:15,  1.62it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0986, train/loss_vlb_step=0.000329, train/loss_step=0.0986, global_step=2088.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3809/5971 [39:13<22:15,  1.62it/s, loss=0.139, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000393, train/loss_step=0.119, global_step=2089.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  64%|██████▍   | 3810/5971 [39:13<22:14,  1.62it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0193, train/loss_vlb_step=8.03e-5, train/loss_step=0.0193, global_step=2089.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3811/5971 [39:14<22:14,  1.62it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0193, train/loss_vlb_step=8.03e-5, train/loss_step=0.0193, global_step=2089.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3811/5971 [39:14<22:14,  1.62it/s, loss=0.139, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.000331, train/loss_step=0.100, global_step=2089.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  64%|██████▍   | 3812/5971 [39:17<22:14,  1.62it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0172, train/loss_vlb_step=7.4e-5, train/loss_step=0.0172, global_step=2089.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  64%|██████▍   | 3813/5971 [39:18<22:14,  1.62it/s, loss=0.14, v_num=0, train/loss_simple_step=0.014, train/loss_vlb_step=5.74e-5, train/loss_step=0.014, global_step=2090.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  64%|██████▍   | 3814/5971 [39:18<22:13,  1.62it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0555, train/loss_vlb_step=0.000186, train/loss_step=0.0555, global_step=2090.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3815/5971 [39:19<22:13,  1.62it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0555, train/loss_vlb_step=0.000186, train/loss_step=0.0555, global_step=2090.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3815/5971 [39:19<22:13,  1.62it/s, loss=0.155, v_num=0, train/loss_simple_step=0.297, train/loss_vlb_step=0.00116, train/loss_step=0.297, global_step=2090.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  64%|██████▍   | 3816/5971 [39:22<22:13,  1.62it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.7e-5, train/loss_step=0.0127, global_step=2090.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3817/5971 [39:22<22:13,  1.62it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0686, train/loss_vlb_step=0.000235, train/loss_step=0.0686, global_step=2091.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3818/5971 [39:23<22:12,  1.62it/s, loss=0.144, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00145, train/loss_step=0.349, global_step=2091.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  64%|██████▍   | 3819/5971 [39:24<22:12,  1.62it/s, loss=0.144, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00145, train/loss_step=0.349, global_step=2091.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3819/5971 [39:24<22:12,  1.62it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00239, train/loss_vlb_step=1.42e-5, train/loss_step=0.00239, global_step=2091.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3820/5971 [39:27<22:12,  1.61it/s, loss=0.152, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.000687, train/loss_step=0.176, global_step=2091.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  64%|██████▍   | 3821/5971 [39:27<22:12,  1.61it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00233, train/loss_vlb_step=1.3e-5, train/loss_step=0.00233, global_step=2092.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3822/5971 [39:28<22:11,  1.61it/s, loss=0.162, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.000974, train/loss_step=0.231, global_step=2092.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  64%|██████▍   | 3823/5971 [39:29<22:11,  1.61it/s, loss=0.162, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.000974, train/loss_step=0.231, global_step=2092.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3823/5971 [39:29<22:11,  1.61it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0874, train/loss_vlb_step=0.000289, train/loss_step=0.0874, global_step=2092.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3824/5971 [39:32<22:11,  1.61it/s, loss=0.173, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000735, train/loss_step=0.196, global_step=2092.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  64%|██████▍   | 3825/5971 [39:32<22:10,  1.61it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0647, train/loss_vlb_step=0.000216, train/loss_step=0.0647, global_step=2093.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3826/5971 [39:33<22:10,  1.61it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.55e-5, train/loss_step=0.0127, global_step=2093.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  64%|██████▍   | 3827/5971 [39:34<22:10,  1.61it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.55e-5, train/loss_step=0.0127, global_step=2093.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3827/5971 [39:34<22:10,  1.61it/s, loss=0.108, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00091, train/loss_step=0.234, global_step=2093.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  64%|██████▍   | 3828/5971 [39:37<22:10,  1.61it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0227, train/loss_vlb_step=8.8e-5, train/loss_step=0.0227, global_step=2093.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3829/5971 [39:37<22:09,  1.61it/s, loss=0.0985, v_num=0, train/loss_simple_step=0.00725, train/loss_vlb_step=3.56e-5, train/loss_step=0.00725, global_step=2094.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3830/5971 [39:38<22:09,  1.61it/s, loss=0.0996, v_num=0, train/loss_simple_step=0.041, train/loss_vlb_step=0.000158, train/loss_step=0.041, global_step=2094.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  64%|██████▍   | 3831/5971 [39:39<22:09,  1.61it/s, loss=0.0996, v_num=0, train/loss_simple_step=0.041, train/loss_vlb_step=0.000158, train/loss_step=0.041, global_step=2094.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3831/5971 [39:39<22:09,  1.61it/s, loss=0.0971, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000182, train/loss_step=0.0497, global_step=2094.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3832/5971 [39:41<22:09,  1.61it/s, loss=0.0966, v_num=0, train/loss_simple_step=0.00737, train/loss_vlb_step=3.58e-5, train/loss_step=0.00737, global_step=2094.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3833/5971 [39:42<22:08,  1.61it/s, loss=0.0961, v_num=0, train/loss_simple_step=0.00372, train/loss_vlb_step=1.99e-5, train/loss_step=0.00372, global_step=2095.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3834/5971 [39:43<22:08,  1.61it/s, loss=0.112, v_num=0, train/loss_simple_step=0.367, train/loss_vlb_step=0.00167, train/loss_step=0.367, global_step=2095.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:  64%|██████▍   | 3835/5971 [39:44<22:07,  1.61it/s, loss=0.112, v_num=0, train/loss_simple_step=0.367, train/loss_vlb_step=0.00167, train/loss_step=0.367, global_step=2095.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3835/5971 [39:44<22:07,  1.61it/s, loss=0.0992, v_num=0, train/loss_simple_step=0.047, train/loss_vlb_step=0.000162, train/loss_step=0.047, global_step=2095.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3836/5971 [39:47<22:08,  1.61it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0477, train/loss_vlb_step=0.000163, train/loss_step=0.0477, global_step=2095.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3837/5971 [39:47<22:07,  1.61it/s, loss=0.104, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000447, train/loss_step=0.135, global_step=2096.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  64%|██████▍   | 3838/5971 [39:48<22:07,  1.61it/s, loss=0.0918, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.000328, train/loss_step=0.100, global_step=2096.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3839/5971 [39:49<22:06,  1.61it/s, loss=0.0918, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.000328, train/loss_step=0.100, global_step=2096.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3839/5971 [39:49<22:06,  1.61it/s, loss=0.1, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000607, train/loss_step=0.175, global_step=2096.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  64%|██████▍   | 3840/5971 [39:52<22:07,  1.61it/s, loss=0.109, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00158, train/loss_step=0.349, global_step=2096.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3841/5971 [39:53<22:06,  1.61it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0853, train/loss_vlb_step=0.00028, train/loss_step=0.0853, global_step=2097.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3842/5971 [39:53<22:06,  1.61it/s, loss=0.117, v_num=0, train/loss_simple_step=0.301, train/loss_vlb_step=0.00155, train/loss_step=0.301, global_step=2097.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  64%|██████▍   | 3843/5971 [39:54<22:05,  1.61it/s, loss=0.117, v_num=0, train/loss_simple_step=0.301, train/loss_vlb_step=0.00155, train/loss_step=0.301, global_step=2097.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3843/5971 [39:54<22:05,  1.61it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0331, train/loss_vlb_step=0.000119, train/loss_step=0.0331, global_step=2097.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3844/5971 [39:57<22:05,  1.60it/s, loss=0.132, v_num=0, train/loss_simple_step=0.557, train/loss_vlb_step=0.005, train/loss_step=0.557, global_step=2097.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:  64%|██████▍   | 3845/5971 [39:57<22:05,  1.60it/s, loss=0.146, v_num=0, train/loss_simple_step=0.341, train/loss_vlb_step=0.00153, train/loss_step=0.341, global_step=2098.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3846/5971 [39:58<22:05,  1.60it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0773, train/loss_vlb_step=0.000255, train/loss_step=0.0773, global_step=2098.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3847/5971 [39:59<22:04,  1.60it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0773, train/loss_vlb_step=0.000255, train/loss_step=0.0773, global_step=2098.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3847/5971 [39:59<22:04,  1.60it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0606, train/loss_vlb_step=0.00021, train/loss_step=0.0606, global_step=2098.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  64%|██████▍   | 3848/5971 [40:02<22:04,  1.60it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00172, train/loss_vlb_step=1.04e-5, train/loss_step=0.00172, global_step=2098.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3849/5971 [40:03<22:04,  1.60it/s, loss=0.14, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.69e-5, train/loss_step=0.021, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:  64%|██████▍   | 3850/5971 [40:03<22:03,  1.60it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0632, train/loss_vlb_step=0.000229, train/loss_step=0.0632, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3851/5971 [40:04<22:03,  1.60it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0632, train/loss_vlb_step=0.000229, train/loss_step=0.0632, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  64%|██████▍   | 3851/5971 [40:04<22:03,  1.60it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00817, train/loss_vlb_step=3.86e-5, train/loss_step=0.00817, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  65%|██████▍   | 3852/5971 [40:06<22:03,  1.60it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:09,  2.38it/s]

Validating:   1%|          | 2/167 [00:00<00:53,  3.09it/s]
Epoch 3:  65%|██████▍   | 3855/5971 [40:07<22:01,  1.60it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   3%|▎         | 5/167 [00:00<00:19,  8.28it/s]
Epoch 3:  65%|██████▍   | 3859/5971 [40:07<21:57,  1.60it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   5%|▍         | 8/167 [00:00<00:12, 12.37it/s]
Epoch 3:  65%|██████▍   | 3863/5971 [40:08<21:53,  1.60it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   7%|▋         | 11/167 [00:01<00:09, 16.09it/s]

Validating:   8%|▊         | 14/167 [00:01<00:08, 18.08it/s]
Epoch 3:  65%|██████▍   | 3867/5971 [40:08<21:49,  1.61it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  10%|█         | 17/167 [00:01<00:07, 20.28it/s]
Epoch 3:  65%|██████▍   | 3871/5971 [40:08<21:46,  1.61it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  12%|█▏        | 20/167 [00:01<00:07, 20.98it/s]
Epoch 3:  65%|██████▍   | 3875/5971 [40:08<21:42,  1.61it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 21.92it/s]

Validating:  16%|█▌        | 26/167 [00:01<00:07, 19.43it/s]
Epoch 3:  65%|██████▍   | 3879/5971 [40:08<21:38,  1.61it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  17%|█▋        | 29/167 [00:01<00:06, 21.40it/s]
Epoch 3:  65%|██████▌   | 3883/5971 [40:08<21:35,  1.61it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  19%|█▉        | 32/167 [00:01<00:06, 22.27it/s]
Epoch 3:  65%|██████▌   | 3887/5971 [40:09<21:31,  1.61it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  21%|██        | 35/167 [00:02<00:05, 23.16it/s]

Validating:  23%|██▎       | 38/167 [00:02<00:07, 16.57it/s]
Epoch 3:  65%|██████▌   | 3891/5971 [40:09<21:27,  1.62it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  25%|██▍       | 41/167 [00:02<00:06, 18.34it/s]
Epoch 3:  65%|██████▌   | 3895/5971 [40:09<21:23,  1.62it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  26%|██▋       | 44/167 [00:02<00:06, 19.17it/s]
Epoch 3:  65%|██████▌   | 3899/5971 [40:09<21:20,  1.62it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  28%|██▊       | 47/167 [00:02<00:05, 20.74it/s]

Validating:  30%|██▉       | 50/167 [00:02<00:05, 22.28it/s]
Epoch 3:  65%|██████▌   | 3903/5971 [40:09<21:16,  1.62it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  32%|███▏      | 53/167 [00:03<00:05, 21.56it/s]
Epoch 3:  65%|██████▌   | 3907/5971 [40:10<21:12,  1.62it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  34%|███▎      | 56/167 [00:03<00:04, 23.09it/s]
Epoch 3:  65%|██████▌   | 3911/5971 [40:10<21:09,  1.62it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  35%|███▌      | 59/167 [00:03<00:04, 23.47it/s]

Validating:  37%|███▋      | 62/167 [00:03<00:04, 24.67it/s]
Epoch 3:  66%|██████▌   | 3915/5971 [40:10<21:05,  1.62it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  40%|███▉      | 66/167 [00:03<00:03, 26.20it/s]
Epoch 3:  66%|██████▌   | 3919/5971 [40:10<21:01,  1.63it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 26.90it/s]
Epoch 3:  66%|██████▌   | 3923/5971 [40:10<20:58,  1.63it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 27.21it/s]
Epoch 3:  66%|██████▌   | 3927/5971 [40:10<20:54,  1.63it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 27.79it/s]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 27.81it/s]
Epoch 3:  66%|██████▌   | 3931/5971 [40:10<20:50,  1.63it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  49%|████▊     | 81/167 [00:04<00:03, 27.14it/s]
Epoch 3:  66%|██████▌   | 3935/5971 [40:11<20:47,  1.63it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  50%|█████     | 84/167 [00:04<00:03, 27.65it/s]
Epoch 3:  66%|██████▌   | 3939/5971 [40:11<20:43,  1.63it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  52%|█████▏    | 87/167 [00:04<00:02, 27.33it/s]

Validating:  54%|█████▍    | 90/167 [00:04<00:02, 26.16it/s]
Epoch 3:  66%|██████▌   | 3943/5971 [40:11<20:39,  1.64it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 26.74it/s]
Epoch 3:  66%|██████▌   | 3947/5971 [40:11<20:36,  1.64it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 24.92it/s]
Epoch 3:  66%|██████▌   | 3951/5971 [40:11<20:32,  1.64it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 25.70it/s]

Validating:  61%|██████    | 102/167 [00:04<00:02, 26.34it/s]
Epoch 3:  66%|██████▌   | 3955/5971 [40:11<20:29,  1.64it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 25.78it/s]
Epoch 3:  66%|██████▋   | 3959/5971 [40:12<20:25,  1.64it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  65%|██████▍   | 108/167 [00:05<00:02, 26.28it/s]
Epoch 3:  66%|██████▋   | 3963/5971 [40:12<20:21,  1.64it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  66%|██████▋   | 111/167 [00:05<00:02, 27.10it/s]

Validating:  68%|██████▊   | 114/167 [00:05<00:02, 25.28it/s]
Epoch 3:  66%|██████▋   | 3967/5971 [40:12<20:18,  1.64it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  70%|███████   | 117/167 [00:05<00:01, 26.38it/s]
Epoch 3:  67%|██████▋   | 3971/5971 [40:12<20:14,  1.65it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 27.77it/s]
Epoch 3:  67%|██████▋   | 3975/5971 [40:12<20:11,  1.65it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 28.10it/s]
Epoch 3:  67%|██████▋   | 3979/5971 [40:12<20:07,  1.65it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 26.96it/s]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 27.06it/s]
Epoch 3:  67%|██████▋   | 3983/5971 [40:12<20:04,  1.65it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 26.65it/s]
Epoch 3:  67%|██████▋   | 3987/5971 [40:13<20:00,  1.65it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  81%|████████▏ | 136/167 [00:06<00:01, 27.35it/s]
Epoch 3:  67%|██████▋   | 3991/5971 [40:13<19:56,  1.65it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  83%|████████▎ | 139/167 [00:06<00:01, 26.47it/s]

Validating:  85%|████████▌ | 142/167 [00:06<00:00, 26.55it/s]
Epoch 3:  67%|██████▋   | 3995/5971 [40:13<19:53,  1.66it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 25.86it/s]
Epoch 3:  67%|██████▋   | 3999/5971 [40:13<19:49,  1.66it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 26.78it/s][A
Epoch 3:  67%|██████▋   | 4003/5971 [40:13<19:46,  1.66it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 28.09it/s][A
Epoch 3:  67%|██████▋   | 4007/5971 [40:13<19:42,  1.66it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 26.60it/s][A

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 27.27it/s][A
Epoch 3:  67%|██████▋   | 4011/5971 [40:13<19:39,  1.66it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  96%|█████████▋| 161/167 [00:07<00:00, 26.31it/s][A
Epoch 3:  67%|██████▋   | 4015/5971 [40:14<19:35,  1.66it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  98%|█████████▊| 164/167 [00:07<00:00, 26.04it/s][A
Epoch 3:  67%|██████▋   | 4019/5971 [40:14<19:32,  1.67it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  67%|██████▋   | 4020/5971 [40:14<19:31,  1.67it/s, loss=0.158, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.0018, train/loss_step=0.391, global_step=2099.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.32it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.34it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.15it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.73it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:11,  4.04it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:10,  4.28it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.46it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:09,  4.56it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  4.70it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:08,  4.79it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  4.88it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  4.95it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:07,  5.03it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:07,  5.09it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.13it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.21it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.32it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:04<00:05,  5.40it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.46it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.50it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.54it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.41it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.34it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.36it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.39it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.41it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.32it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.33it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:06<00:03,  5.31it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.34it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.26it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.26it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.23it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:07<00:03,  5.22it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.18it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.15it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.13it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.05it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:08<00:02,  5.07it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.09it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.09it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.01it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.02it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:09<00:01,  5.12it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:00,  5.22it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.23it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.28it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.36it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.44it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  5.44it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  4.93it/s]

Epoch 3:  67%|██████▋   | 4021/5971 [40:27<19:36,  1.66it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0045, train/loss_vlb_step=2.26e-5, train/loss_step=0.0045, global_step=2100.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  67%|██████▋   | 4021/5971 [40:27<19:37,  1.66it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0045, train/loss_vlb_step=2.26e-5, train/loss_step=0.0045, global_step=2100.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.31it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.34it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.14it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.77it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.26it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.54it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.83it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.95it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  4.97it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.07it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.13it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.15it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:07,  5.22it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.24it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.32it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.34it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.43it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.47it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.50it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.51it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.50it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.49it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.51it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.51it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.52it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.53it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.55it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.23it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:04,  5.24it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.36it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.45it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.36it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.32it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.35it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.42it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.45it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.49it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.50it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.41it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.39it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.41it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.43it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.43it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.41it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.43it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.49it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.50it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.51it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.46it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.40it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.09it/s]

Epoch 3:  67%|██████▋   | 4022/5971 [40:39<19:41,  1.65it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0045, train/loss_vlb_step=2.26e-5, train/loss_step=0.0045, global_step=2100.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  67%|██████▋   | 4022/5971 [40:39<19:41,  1.65it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00856, train/loss_vlb_step=4.06e-5, train/loss_step=0.00856, global_step=2100.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:29,  1.64it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:17,  2.79it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:00<00:13,  3.60it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  4.16it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:09,  4.52it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.79it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.86it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  4.94it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.01it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.06it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.14it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.28it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:07,  5.22it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.23it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.26it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.25it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.32it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.33it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:06,  4.97it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.06it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.15it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.24it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.29it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.35it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.29it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.26it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.28it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.18it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.31it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.41it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.45it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.51it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.41it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.37it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.38it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.43it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.49it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.43it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.37it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.27it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.23it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.18it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.12it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.19it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.30it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.41it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.47it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.50it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.44it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.33it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.08it/s]

Epoch 3:  67%|██████▋   | 4023/5971 [40:51<19:46,  1.64it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00856, train/loss_vlb_step=4.06e-5, train/loss_step=0.00856, global_step=2100.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  67%|██████▋   | 4023/5971 [40:51<19:46,  1.64it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00938, train/loss_vlb_step=4.45e-5, train/loss_step=0.00938, global_step=2100.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:39,  1.24it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:01<00:21,  2.22it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:15,  2.99it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.60it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.12it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.52it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.84it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.04it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.11it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.16it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.17it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.30it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:06,  5.39it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.43it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.48it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.51it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.53it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.57it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.42it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.22it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.02it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  4.94it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.06it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:05,  4.90it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:05,  4.92it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.00it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.05it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.19it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:06<00:03,  5.31it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.32it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.42it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.49it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.44it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:07<00:03,  5.32it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.37it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.41it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.37it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.39it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.41it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.33it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.38it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.33it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.36it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.40it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:00,  5.44it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.45it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.39it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.42it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.44it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.46it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.00it/s]

Epoch 3:  67%|██████▋   | 4024/5971 [41:06<19:53,  1.63it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00938, train/loss_vlb_step=4.45e-5, train/loss_step=0.00938, global_step=2100.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  67%|██████▋   | 4024/5971 [41:06<19:53,  1.63it/s, loss=0.147, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000764, train/loss_step=0.211, global_step=2100.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  67%|██████▋   | 4025/5971 [41:07<19:52,  1.63it/s, loss=0.147, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000764, train/loss_step=0.211, global_step=2100.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  67%|██████▋   | 4025/5971 [41:07<19:52,  1.63it/s, loss=0.153, v_num=0, train/loss_simple_step=0.263, train/loss_vlb_step=0.001, train/loss_step=0.263, global_step=2101.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  67%|██████▋   | 4026/5971 [41:08<19:52,  1.63it/s, loss=0.153, v_num=0, train/loss_simple_step=0.263, train/loss_vlb_step=0.001, train/loss_step=0.263, global_step=2101.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  67%|██████▋   | 4026/5971 [41:08<19:52,  1.63it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=6.24e-5, train/loss_step=0.0144, global_step=2101.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  67%|██████▋   | 4027/5971 [41:08<19:51,  1.63it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=6.24e-5, train/loss_step=0.0144, global_step=2101.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  67%|██████▋   | 4027/5971 [41:08<19:51,  1.63it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00269, train/loss_vlb_step=1.48e-5, train/loss_step=0.00269, global_step=2101.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  67%|██████▋   | 4028/5971 [41:11<19:51,  1.63it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00269, train/loss_vlb_step=1.48e-5, train/loss_step=0.00269, global_step=2101.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  67%|██████▋   | 4028/5971 [41:11<19:51,  1.63it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00345, train/loss_vlb_step=1.82e-5, train/loss_step=0.00345, global_step=2101.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  67%|██████▋   | 4029/5971 [41:12<19:51,  1.63it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00345, train/loss_vlb_step=1.82e-5, train/loss_step=0.00345, global_step=2101.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  67%|██████▋   | 4029/5971 [41:12<19:51,  1.63it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0963, train/loss_vlb_step=0.000319, train/loss_step=0.0963, global_step=2102.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  67%|██████▋   | 4030/5971 [41:13<19:50,  1.63it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0963, train/loss_vlb_step=0.000319, train/loss_step=0.0963, global_step=2102.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  67%|██████▋   | 4030/5971 [41:13<19:50,  1.63it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0215, train/loss_vlb_step=8.49e-5, train/loss_step=0.0215, global_step=2102.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  68%|██████▊   | 4031/5971 [41:13<19:50,  1.63it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0215, train/loss_vlb_step=8.49e-5, train/loss_step=0.0215, global_step=2102.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4031/5971 [41:13<19:50,  1.63it/s, loss=0.114, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000397, train/loss_step=0.120, global_step=2102.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  68%|██████▊   | 4032/5971 [41:16<19:50,  1.63it/s, loss=0.114, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000397, train/loss_step=0.120, global_step=2102.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4032/5971 [41:16<19:50,  1.63it/s, loss=0.0861, v_num=0, train/loss_simple_step=0.00157, train/loss_vlb_step=9.24e-6, train/loss_step=0.00157, global_step=2102.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4033/5971 [41:16<19:49,  1.63it/s, loss=0.0861, v_num=0, train/loss_simple_step=0.00157, train/loss_vlb_step=9.24e-6, train/loss_step=0.00157, global_step=2102.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4033/5971 [41:16<19:49,  1.63it/s, loss=0.0709, v_num=0, train/loss_simple_step=0.0369, train/loss_vlb_step=0.000136, train/loss_step=0.0369, global_step=2103.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  68%|██████▊   | 4034/5971 [41:17<19:49,  1.63it/s, loss=0.0709, v_num=0, train/loss_simple_step=0.0369, train/loss_vlb_step=0.000136, train/loss_step=0.0369, global_step=2103.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4034/5971 [41:17<19:49,  1.63it/s, loss=0.0674, v_num=0, train/loss_simple_step=0.00896, train/loss_vlb_step=4.15e-5, train/loss_step=0.00896, global_step=2103.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4035/5971 [41:18<19:49,  1.63it/s, loss=0.0674, v_num=0, train/loss_simple_step=0.00896, train/loss_vlb_step=4.15e-5, train/loss_step=0.00896, global_step=2103.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4035/5971 [41:18<19:49,  1.63it/s, loss=0.077, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.000958, train/loss_step=0.251, global_step=2103.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  68%|██████▊   | 4036/5971 [41:20<19:49,  1.63it/s, loss=0.077, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.000958, train/loss_step=0.251, global_step=2103.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4036/5971 [41:20<19:49,  1.63it/s, loss=0.0772, v_num=0, train/loss_simple_step=0.00572, train/loss_vlb_step=3e-5, train/loss_step=0.00572, global_step=2103.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4037/5971 [41:21<19:48,  1.63it/s, loss=0.0772, v_num=0, train/loss_simple_step=0.00572, train/loss_vlb_step=3e-5, train/loss_step=0.00572, global_step=2103.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4037/5971 [41:21<19:48,  1.63it/s, loss=0.0989, v_num=0, train/loss_simple_step=0.456, train/loss_vlb_step=0.00302, train/loss_step=0.456, global_step=2104.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  68%|██████▊   | 4038/5971 [41:22<19:48,  1.63it/s, loss=0.0989, v_num=0, train/loss_simple_step=0.456, train/loss_vlb_step=0.00302, train/loss_step=0.456, global_step=2104.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4038/5971 [41:22<19:48,  1.63it/s, loss=0.103, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.00051, train/loss_step=0.152, global_step=2104.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  68%|██████▊   | 4039/5971 [41:23<19:47,  1.63it/s, loss=0.103, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.00051, train/loss_step=0.152, global_step=2104.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4039/5971 [41:23<19:47,  1.63it/s, loss=0.128, v_num=0, train/loss_simple_step=0.500, train/loss_vlb_step=0.00479, train/loss_step=0.500, global_step=2104.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4040/5971 [41:25<19:47,  1.63it/s, loss=0.128, v_num=0, train/loss_simple_step=0.500, train/loss_vlb_step=0.00479, train/loss_step=0.500, global_step=2104.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4040/5971 [41:25<19:47,  1.63it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00496, train/loss_vlb_step=2.58e-5, train/loss_step=0.00496, global_step=2104.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4041/5971 [41:26<19:47,  1.63it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00496, train/loss_vlb_step=2.58e-5, train/loss_step=0.00496, global_step=2104.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4041/5971 [41:26<19:47,  1.63it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0396, train/loss_vlb_step=0.000141, train/loss_step=0.0396, global_step=2105.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  68%|██████▊   | 4042/5971 [41:27<19:46,  1.63it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0396, train/loss_vlb_step=0.000141, train/loss_step=0.0396, global_step=2105.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4042/5971 [41:27<19:46,  1.63it/s, loss=0.117, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000464, train/loss_step=0.139, global_step=2105.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  68%|██████▊   | 4043/5971 [41:28<19:46,  1.63it/s, loss=0.117, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000464, train/loss_step=0.139, global_step=2105.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4043/5971 [41:28<19:46,  1.63it/s, loss=0.128, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000787, train/loss_step=0.228, global_step=2105.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4044/5971 [41:30<19:46,  1.62it/s, loss=0.128, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000787, train/loss_step=0.228, global_step=2105.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4044/5971 [41:30<19:46,  1.62it/s, loss=0.152, v_num=0, train/loss_simple_step=0.698, train/loss_vlb_step=0.0206, train/loss_step=0.698, global_step=2105.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  68%|██████▊   | 4045/5971 [41:31<19:46,  1.62it/s, loss=0.152, v_num=0, train/loss_simple_step=0.698, train/loss_vlb_step=0.0206, train/loss_step=0.698, global_step=2105.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4045/5971 [41:31<19:46,  1.62it/s, loss=0.15, v_num=0, train/loss_simple_step=0.226, train/loss_vlb_step=0.000875, train/loss_step=0.226, global_step=2106.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4046/5971 [41:32<19:45,  1.62it/s, loss=0.15, v_num=0, train/loss_simple_step=0.226, train/loss_vlb_step=0.000875, train/loss_step=0.226, global_step=2106.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4046/5971 [41:32<19:45,  1.62it/s, loss=0.156, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000455, train/loss_step=0.132, global_step=2106.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4047/5971 [41:33<19:45,  1.62it/s, loss=0.156, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000455, train/loss_step=0.132, global_step=2106.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4047/5971 [41:33<19:45,  1.62it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00166, train/loss_vlb_step=9.89e-6, train/loss_step=0.00166, global_step=2106.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4048/5971 [41:35<19:45,  1.62it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00166, train/loss_vlb_step=9.89e-6, train/loss_step=0.00166, global_step=2106.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4048/5971 [41:35<19:45,  1.62it/s, loss=0.183, v_num=0, train/loss_simple_step=0.549, train/loss_vlb_step=0.00714, train/loss_step=0.549, global_step=2106.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  68%|██████▊   | 4049/5971 [41:36<19:44,  1.62it/s, loss=0.183, v_num=0, train/loss_simple_step=0.549, train/loss_vlb_step=0.00714, train/loss_step=0.549, global_step=2106.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4049/5971 [41:36<19:44,  1.62it/s, loss=0.179, v_num=0, train/loss_simple_step=0.00408, train/loss_vlb_step=2.07e-5, train/loss_step=0.00408, global_step=2107.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4050/5971 [41:37<19:44,  1.62it/s, loss=0.179, v_num=0, train/loss_simple_step=0.00408, train/loss_vlb_step=2.07e-5, train/loss_step=0.00408, global_step=2107.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4050/5971 [41:37<19:44,  1.62it/s, loss=0.189, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.000821, train/loss_step=0.219, global_step=2107.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  68%|██████▊   | 4051/5971 [41:38<19:43,  1.62it/s, loss=0.189, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.000821, train/loss_step=0.219, global_step=2107.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4051/5971 [41:38<19:43,  1.62it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00522, train/loss_vlb_step=2.61e-5, train/loss_step=0.00522, global_step=2107.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4052/5971 [41:40<19:43,  1.62it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00522, train/loss_vlb_step=2.61e-5, train/loss_step=0.00522, global_step=2107.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4052/5971 [41:40<19:43,  1.62it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0317, train/loss_vlb_step=0.000125, train/loss_step=0.0317, global_step=2107.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  68%|██████▊   | 4053/5971 [41:41<19:43,  1.62it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0317, train/loss_vlb_step=0.000125, train/loss_step=0.0317, global_step=2107.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4053/5971 [41:41<19:43,  1.62it/s, loss=0.203, v_num=0, train/loss_simple_step=0.407, train/loss_vlb_step=0.00232, train/loss_step=0.407, global_step=2108.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  68%|██████▊   | 4054/5971 [41:42<19:42,  1.62it/s, loss=0.203, v_num=0, train/loss_simple_step=0.407, train/loss_vlb_step=0.00232, train/loss_step=0.407, global_step=2108.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4054/5971 [41:42<19:42,  1.62it/s, loss=0.203, v_num=0, train/loss_simple_step=0.00366, train/loss_vlb_step=1.95e-5, train/loss_step=0.00366, global_step=2108.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4055/5971 [41:43<19:42,  1.62it/s, loss=0.203, v_num=0, train/loss_simple_step=0.00366, train/loss_vlb_step=1.95e-5, train/loss_step=0.00366, global_step=2108.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4055/5971 [41:43<19:42,  1.62it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00995, train/loss_vlb_step=4.61e-5, train/loss_step=0.00995, global_step=2108.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4056/5971 [41:45<19:42,  1.62it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00995, train/loss_vlb_step=4.61e-5, train/loss_step=0.00995, global_step=2108.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4056/5971 [41:45<19:42,  1.62it/s, loss=0.201, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000868, train/loss_step=0.209, global_step=2108.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  68%|██████▊   | 4057/5971 [41:46<19:42,  1.62it/s, loss=0.201, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000868, train/loss_step=0.209, global_step=2108.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4057/5971 [41:46<19:42,  1.62it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0241, train/loss_vlb_step=9.21e-5, train/loss_step=0.0241, global_step=2109.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4058/5971 [41:47<19:41,  1.62it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0241, train/loss_vlb_step=9.21e-5, train/loss_step=0.0241, global_step=2109.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4058/5971 [41:47<19:41,  1.62it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00626, train/loss_vlb_step=3.08e-5, train/loss_step=0.00626, global_step=2109.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4059/5971 [41:47<19:41,  1.62it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00626, train/loss_vlb_step=3.08e-5, train/loss_step=0.00626, global_step=2109.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4059/5971 [41:47<19:41,  1.62it/s, loss=0.182, v_num=0, train/loss_simple_step=0.706, train/loss_vlb_step=0.0159, train/loss_step=0.706, global_step=2109.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:  68%|██████▊   | 4060/5971 [41:49<19:41,  1.62it/s, loss=0.182, v_num=0, train/loss_simple_step=0.706, train/loss_vlb_step=0.0159, train/loss_step=0.706, global_step=2109.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4060/5971 [41:49<19:41,  1.62it/s, loss=0.183, v_num=0, train/loss_simple_step=0.030, train/loss_vlb_step=0.00012, train/loss_step=0.030, global_step=2109.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4061/5971 [41:50<19:40,  1.62it/s, loss=0.183, v_num=0, train/loss_simple_step=0.030, train/loss_vlb_step=0.00012, train/loss_step=0.030, global_step=2109.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4061/5971 [41:50<19:40,  1.62it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0615, train/loss_vlb_step=0.000206, train/loss_step=0.0615, global_step=2110.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4062/5971 [41:51<19:40,  1.62it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0615, train/loss_vlb_step=0.000206, train/loss_step=0.0615, global_step=2110.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4062/5971 [41:51<19:40,  1.62it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0426, train/loss_vlb_step=0.00015, train/loss_step=0.0426, global_step=2110.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  68%|██████▊   | 4063/5971 [41:52<19:39,  1.62it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0426, train/loss_vlb_step=0.00015, train/loss_step=0.0426, global_step=2110.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4063/5971 [41:52<19:39,  1.62it/s, loss=0.184, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00133, train/loss_step=0.320, global_step=2110.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  68%|██████▊   | 4064/5971 [41:54<19:39,  1.62it/s, loss=0.184, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00133, train/loss_step=0.320, global_step=2110.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4064/5971 [41:54<19:39,  1.62it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00398, train/loss_vlb_step=2.13e-5, train/loss_step=0.00398, global_step=2110.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4065/5971 [41:55<19:39,  1.62it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00398, train/loss_vlb_step=2.13e-5, train/loss_step=0.00398, global_step=2110.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4065/5971 [41:55<19:39,  1.62it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0566, train/loss_vlb_step=0.000194, train/loss_step=0.0566, global_step=2111.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4066/5971 [41:56<19:38,  1.62it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0566, train/loss_vlb_step=0.000194, train/loss_step=0.0566, global_step=2111.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4066/5971 [41:56<19:38,  1.62it/s, loss=0.155, v_num=0, train/loss_simple_step=0.407, train/loss_vlb_step=0.0026, train/loss_step=0.407, global_step=2111.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  68%|██████▊   | 4067/5971 [41:57<19:38,  1.62it/s, loss=0.155, v_num=0, train/loss_simple_step=0.407, train/loss_vlb_step=0.0026, train/loss_step=0.407, global_step=2111.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4067/5971 [41:57<19:38,  1.62it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0095, train/loss_vlb_step=4.29e-5, train/loss_step=0.0095, global_step=2111.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4068/5971 [41:59<19:38,  1.61it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0095, train/loss_vlb_step=4.29e-5, train/loss_step=0.0095, global_step=2111.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4068/5971 [41:59<19:38,  1.61it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00308, train/loss_vlb_step=1.71e-5, train/loss_step=0.00308, global_step=2111.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4069/5971 [42:00<19:37,  1.61it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00308, train/loss_vlb_step=1.71e-5, train/loss_step=0.00308, global_step=2111.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4069/5971 [42:00<19:37,  1.61it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.46e-5, train/loss_step=0.0128, global_step=2112.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  68%|██████▊   | 4070/5971 [42:01<19:37,  1.61it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.46e-5, train/loss_step=0.0128, global_step=2112.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4070/5971 [42:01<19:37,  1.61it/s, loss=0.126, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000607, train/loss_step=0.174, global_step=2112.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  68%|██████▊   | 4071/5971 [42:02<19:36,  1.61it/s, loss=0.126, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000607, train/loss_step=0.174, global_step=2112.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4071/5971 [42:02<19:36,  1.61it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0764, train/loss_vlb_step=0.000251, train/loss_step=0.0764, global_step=2112.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4072/5971 [42:04<19:36,  1.61it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0764, train/loss_vlb_step=0.000251, train/loss_step=0.0764, global_step=2112.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4072/5971 [42:04<19:36,  1.61it/s, loss=0.155, v_num=0, train/loss_simple_step=0.534, train/loss_vlb_step=0.00667, train/loss_step=0.534, global_step=2112.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  68%|██████▊   | 4073/5971 [42:05<19:36,  1.61it/s, loss=0.155, v_num=0, train/loss_simple_step=0.534, train/loss_vlb_step=0.00667, train/loss_step=0.534, global_step=2112.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4073/5971 [42:05<19:36,  1.61it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00227, train/loss_vlb_step=1.33e-5, train/loss_step=0.00227, global_step=2113.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4074/5971 [42:06<19:36,  1.61it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00227, train/loss_vlb_step=1.33e-5, train/loss_step=0.00227, global_step=2113.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4074/5971 [42:06<19:36,  1.61it/s, loss=0.153, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00201, train/loss_step=0.376, global_step=2113.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  68%|██████▊   | 4075/5971 [42:07<19:35,  1.61it/s, loss=0.153, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00201, train/loss_step=0.376, global_step=2113.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4075/5971 [42:07<19:35,  1.61it/s, loss=0.168, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.00134, train/loss_step=0.313, global_step=2113.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4076/5971 [42:09<19:35,  1.61it/s, loss=0.168, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.00134, train/loss_step=0.313, global_step=2113.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4076/5971 [42:09<19:35,  1.61it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0352, train/loss_vlb_step=0.00013, train/loss_step=0.0352, global_step=2113.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4077/5971 [42:10<19:35,  1.61it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0352, train/loss_vlb_step=0.00013, train/loss_step=0.0352, global_step=2113.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4077/5971 [42:10<19:35,  1.61it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0966, train/loss_vlb_step=0.000318, train/loss_step=0.0966, global_step=2114.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4078/5971 [42:11<19:34,  1.61it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0966, train/loss_vlb_step=0.000318, train/loss_step=0.0966, global_step=2114.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4078/5971 [42:11<19:34,  1.61it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00192, train/loss_vlb_step=1.13e-5, train/loss_step=0.00192, global_step=2114.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4079/5971 [42:11<19:34,  1.61it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00192, train/loss_vlb_step=1.13e-5, train/loss_step=0.00192, global_step=2114.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4079/5971 [42:11<19:34,  1.61it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0775, train/loss_vlb_step=0.000258, train/loss_step=0.0775, global_step=2114.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  68%|██████▊   | 4080/5971 [42:14<19:34,  1.61it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0775, train/loss_vlb_step=0.000258, train/loss_step=0.0775, global_step=2114.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4080/5971 [42:14<19:34,  1.61it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0257, train/loss_vlb_step=9.88e-5, train/loss_step=0.0257, global_step=2114.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  68%|██████▊   | 4081/5971 [42:14<19:33,  1.61it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0257, train/loss_vlb_step=9.88e-5, train/loss_step=0.0257, global_step=2114.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4081/5971 [42:14<19:33,  1.61it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0871, train/loss_vlb_step=0.000287, train/loss_step=0.0871, global_step=2115.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4082/5971 [42:15<19:33,  1.61it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0871, train/loss_vlb_step=0.000287, train/loss_step=0.0871, global_step=2115.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4082/5971 [42:15<19:33,  1.61it/s, loss=0.133, v_num=0, train/loss_simple_step=0.043, train/loss_vlb_step=0.000145, train/loss_step=0.043, global_step=2115.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  68%|██████▊   | 4083/5971 [42:16<19:32,  1.61it/s, loss=0.133, v_num=0, train/loss_simple_step=0.043, train/loss_vlb_step=0.000145, train/loss_step=0.043, global_step=2115.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4083/5971 [42:16<19:32,  1.61it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0997, train/loss_vlb_step=0.000329, train/loss_step=0.0997, global_step=2115.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4084/5971 [42:18<19:32,  1.61it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0997, train/loss_vlb_step=0.000329, train/loss_step=0.0997, global_step=2115.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4084/5971 [42:18<19:32,  1.61it/s, loss=0.129, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.00051, train/loss_step=0.150, global_step=2115.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  68%|██████▊   | 4085/5971 [42:19<19:32,  1.61it/s, loss=0.129, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.00051, train/loss_step=0.150, global_step=2115.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4085/5971 [42:19<19:32,  1.61it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0885, train/loss_vlb_step=0.000291, train/loss_step=0.0885, global_step=2116.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4086/5971 [42:20<19:31,  1.61it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0885, train/loss_vlb_step=0.000291, train/loss_step=0.0885, global_step=2116.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4086/5971 [42:20<19:31,  1.61it/s, loss=0.133, v_num=0, train/loss_simple_step=0.458, train/loss_vlb_step=0.00287, train/loss_step=0.458, global_step=2116.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  68%|██████▊   | 4087/5971 [42:21<19:31,  1.61it/s, loss=0.133, v_num=0, train/loss_simple_step=0.458, train/loss_vlb_step=0.00287, train/loss_step=0.458, global_step=2116.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4087/5971 [42:21<19:31,  1.61it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=5.15e-5, train/loss_step=0.0118, global_step=2116.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4088/5971 [42:23<19:31,  1.61it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=5.15e-5, train/loss_step=0.0118, global_step=2116.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4088/5971 [42:23<19:31,  1.61it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00373, train/loss_vlb_step=1.95e-5, train/loss_step=0.00373, global_step=2116.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4089/5971 [42:24<19:30,  1.61it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00373, train/loss_vlb_step=1.95e-5, train/loss_step=0.00373, global_step=2116.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4089/5971 [42:24<19:30,  1.61it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00172, train/loss_vlb_step=1.04e-5, train/loss_step=0.00172, global_step=2117.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4090/5971 [42:25<19:30,  1.61it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00172, train/loss_vlb_step=1.04e-5, train/loss_step=0.00172, global_step=2117.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  68%|██████▊   | 4090/5971 [42:25<19:30,  1.61it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00226, train/loss_vlb_step=1.29e-5, train/loss_step=0.00226, global_step=2117.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▊   | 4091/5971 [42:27<19:30,  1.61it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00226, train/loss_vlb_step=1.29e-5, train/loss_step=0.00226, global_step=2117.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▊   | 4091/5971 [42:27<19:30,  1.61it/s, loss=0.132, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.000992, train/loss_step=0.235, global_step=2117.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  69%|██████▊   | 4092/5971 [42:30<19:30,  1.60it/s, loss=0.132, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.000992, train/loss_step=0.235, global_step=2117.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▊   | 4092/5971 [42:30<19:30,  1.60it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.85e-5, train/loss_step=0.0128, global_step=2117.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▊   | 4093/5971 [42:32<19:30,  1.60it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.85e-5, train/loss_step=0.0128, global_step=2117.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▊   | 4093/5971 [42:32<19:30,  1.60it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0245, train/loss_vlb_step=9.75e-5, train/loss_step=0.0245, global_step=2118.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▊   | 4094/5971 [42:33<19:30,  1.60it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0245, train/loss_vlb_step=9.75e-5, train/loss_step=0.0245, global_step=2118.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▊   | 4094/5971 [42:33<19:30,  1.60it/s, loss=0.0985, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000711, train/loss_step=0.202, global_step=2118.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▊   | 4095/5971 [42:35<19:30,  1.60it/s, loss=0.0985, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000711, train/loss_step=0.202, global_step=2118.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▊   | 4095/5971 [42:35<19:30,  1.60it/s, loss=0.0952, v_num=0, train/loss_simple_step=0.246, train/loss_vlb_step=0.000931, train/loss_step=0.246, global_step=2118.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▊   | 4096/5971 [42:38<19:30,  1.60it/s, loss=0.0952, v_num=0, train/loss_simple_step=0.246, train/loss_vlb_step=0.000931, train/loss_step=0.246, global_step=2118.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▊   | 4096/5971 [42:38<19:31,  1.60it/s, loss=0.0949, v_num=0, train/loss_simple_step=0.0301, train/loss_vlb_step=0.000119, train/loss_step=0.0301, global_step=2118.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▊   | 4097/5971 [42:40<19:30,  1.60it/s, loss=0.0949, v_num=0, train/loss_simple_step=0.0301, train/loss_vlb_step=0.000119, train/loss_step=0.0301, global_step=2118.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▊   | 4097/5971 [42:40<19:30,  1.60it/s, loss=0.091, v_num=0, train/loss_simple_step=0.0177, train/loss_vlb_step=7.72e-5, train/loss_step=0.0177, global_step=2119.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  69%|██████▊   | 4098/5971 [42:41<19:30,  1.60it/s, loss=0.091, v_num=0, train/loss_simple_step=0.0177, train/loss_vlb_step=7.72e-5, train/loss_step=0.0177, global_step=2119.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▊   | 4098/5971 [42:41<19:30,  1.60it/s, loss=0.092, v_num=0, train/loss_simple_step=0.0235, train/loss_vlb_step=9.23e-5, train/loss_step=0.0235, global_step=2119.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▊   | 4099/5971 [42:42<19:29,  1.60it/s, loss=0.092, v_num=0, train/loss_simple_step=0.0235, train/loss_vlb_step=9.23e-5, train/loss_step=0.0235, global_step=2119.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▊   | 4099/5971 [42:42<19:29,  1.60it/s, loss=0.0885, v_num=0, train/loss_simple_step=0.00704, train/loss_vlb_step=3.49e-5, train/loss_step=0.00704, global_step=2119.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▊   | 4100/5971 [42:45<19:30,  1.60it/s, loss=0.0885, v_num=0, train/loss_simple_step=0.00704, train/loss_vlb_step=3.49e-5, train/loss_step=0.00704, global_step=2119.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▊   | 4100/5971 [42:45<19:30,  1.60it/s, loss=0.101, v_num=0, train/loss_simple_step=0.276, train/loss_vlb_step=0.00125, train/loss_step=0.276, global_step=2119.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:  69%|██████▊   | 4101/5971 [42:46<19:29,  1.60it/s, loss=0.101, v_num=0, train/loss_simple_step=0.276, train/loss_vlb_step=0.00125, train/loss_step=0.276, global_step=2119.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▊   | 4101/5971 [42:46<19:29,  1.60it/s, loss=0.114, v_num=0, train/loss_simple_step=0.350, train/loss_vlb_step=0.00216, train/loss_step=0.350, global_step=2120.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▊   | 4102/5971 [42:47<19:29,  1.60it/s, loss=0.114, v_num=0, train/loss_simple_step=0.350, train/loss_vlb_step=0.00216, train/loss_step=0.350, global_step=2120.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▊   | 4102/5971 [42:47<19:29,  1.60it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00456, train/loss_vlb_step=2.49e-5, train/loss_step=0.00456, global_step=2120.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▊   | 4103/5971 [42:48<19:29,  1.60it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00456, train/loss_vlb_step=2.49e-5, train/loss_step=0.00456, global_step=2120.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▊   | 4103/5971 [42:48<19:29,  1.60it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0973, train/loss_vlb_step=0.00032, train/loss_step=0.0973, global_step=2120.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  69%|██████▊   | 4104/5971 [42:51<19:29,  1.60it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0973, train/loss_vlb_step=0.00032, train/loss_step=0.0973, global_step=2120.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▊   | 4104/5971 [42:51<19:29,  1.60it/s, loss=0.125, v_num=0, train/loss_simple_step=0.408, train/loss_vlb_step=0.00271, train/loss_step=0.408, global_step=2120.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  69%|██████▊   | 4105/5971 [42:52<19:29,  1.60it/s, loss=0.125, v_num=0, train/loss_simple_step=0.408, train/loss_vlb_step=0.00271, train/loss_step=0.408, global_step=2120.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▊   | 4105/5971 [42:52<19:29,  1.60it/s, loss=0.122, v_num=0, train/loss_simple_step=0.025, train/loss_vlb_step=9.86e-5, train/loss_step=0.025, global_step=2121.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▉   | 4106/5971 [42:53<19:28,  1.60it/s, loss=0.122, v_num=0, train/loss_simple_step=0.025, train/loss_vlb_step=9.86e-5, train/loss_step=0.025, global_step=2121.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▉   | 4106/5971 [42:53<19:28,  1.60it/s, loss=0.135, v_num=0, train/loss_simple_step=0.718, train/loss_vlb_step=0.0223, train/loss_step=0.718, global_step=2121.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  69%|██████▉   | 4107/5971 [42:54<19:28,  1.60it/s, loss=0.135, v_num=0, train/loss_simple_step=0.718, train/loss_vlb_step=0.0223, train/loss_step=0.718, global_step=2121.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▉   | 4107/5971 [42:54<19:28,  1.60it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00194, train/loss_vlb_step=1.14e-5, train/loss_step=0.00194, global_step=2121.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▉   | 4108/5971 [42:56<19:28,  1.59it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00194, train/loss_vlb_step=1.14e-5, train/loss_step=0.00194, global_step=2121.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▉   | 4108/5971 [42:56<19:28,  1.59it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0299, train/loss_vlb_step=0.000114, train/loss_step=0.0299, global_step=2121.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  69%|██████▉   | 4109/5971 [42:57<19:27,  1.59it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0299, train/loss_vlb_step=0.000114, train/loss_step=0.0299, global_step=2121.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▉   | 4109/5971 [42:57<19:27,  1.59it/s, loss=0.136, v_num=0, train/loss_simple_step=0.002, train/loss_vlb_step=1.17e-5, train/loss_step=0.002, global_step=2122.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  69%|██████▉   | 4110/5971 [42:58<19:27,  1.59it/s, loss=0.136, v_num=0, train/loss_simple_step=0.002, train/loss_vlb_step=1.17e-5, train/loss_step=0.002, global_step=2122.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▉   | 4110/5971 [42:58<19:27,  1.59it/s, loss=0.162, v_num=0, train/loss_simple_step=0.538, train/loss_vlb_step=0.00495, train/loss_step=0.538, global_step=2122.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▉   | 4111/5971 [42:59<19:26,  1.59it/s, loss=0.162, v_num=0, train/loss_simple_step=0.538, train/loss_vlb_step=0.00495, train/loss_step=0.538, global_step=2122.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▉   | 4111/5971 [42:59<19:26,  1.59it/s, loss=0.177, v_num=0, train/loss_simple_step=0.519, train/loss_vlb_step=0.0054, train/loss_step=0.519, global_step=2122.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  69%|██████▉   | 4112/5971 [43:02<19:27,  1.59it/s, loss=0.177, v_num=0, train/loss_simple_step=0.519, train/loss_vlb_step=0.0054, train/loss_step=0.519, global_step=2122.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▉   | 4112/5971 [43:02<19:27,  1.59it/s, loss=0.19, v_num=0, train/loss_simple_step=0.275, train/loss_vlb_step=0.00103, train/loss_step=0.275, global_step=2122.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▉   | 4113/5971 [43:03<19:26,  1.59it/s, loss=0.19, v_num=0, train/loss_simple_step=0.275, train/loss_vlb_step=0.00103, train/loss_step=0.275, global_step=2122.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▉   | 4113/5971 [43:03<19:26,  1.59it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0802, train/loss_vlb_step=0.000268, train/loss_step=0.0802, global_step=2123.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▉   | 4114/5971 [43:04<19:26,  1.59it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0802, train/loss_vlb_step=0.000268, train/loss_step=0.0802, global_step=2123.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▉   | 4114/5971 [43:04<19:26,  1.59it/s, loss=0.187, v_num=0, train/loss_simple_step=0.096, train/loss_vlb_step=0.000316, train/loss_step=0.096, global_step=2123.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  69%|██████▉   | 4115/5971 [43:05<19:25,  1.59it/s, loss=0.187, v_num=0, train/loss_simple_step=0.096, train/loss_vlb_step=0.000316, train/loss_step=0.096, global_step=2123.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▉   | 4115/5971 [43:05<19:25,  1.59it/s, loss=0.18, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.00036, train/loss_step=0.109, global_step=2123.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  69%|██████▉   | 4116/5971 [43:07<19:25,  1.59it/s, loss=0.18, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.00036, train/loss_step=0.109, global_step=2123.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▉   | 4116/5971 [43:07<19:25,  1.59it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.92e-5, train/loss_step=0.0105, global_step=2123.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▉   | 4117/5971 [43:08<19:25,  1.59it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.92e-5, train/loss_step=0.0105, global_step=2123.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▉   | 4117/5971 [43:08<19:25,  1.59it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0467, train/loss_vlb_step=0.000175, train/loss_step=0.0467, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▉   | 4118/5971 [43:09<19:24,  1.59it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0467, train/loss_vlb_step=0.000175, train/loss_step=0.0467, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▉   | 4118/5971 [43:09<19:24,  1.59it/s, loss=0.201, v_num=0, train/loss_simple_step=0.423, train/loss_vlb_step=0.00253, train/loss_step=0.423, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  69%|██████▉   | 4119/5971 [43:10<19:24,  1.59it/s, loss=0.201, v_num=0, train/loss_simple_step=0.423, train/loss_vlb_step=0.00253, train/loss_step=0.423, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▉   | 4119/5971 [43:10<19:24,  1.59it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0379, train/loss_vlb_step=0.000143, train/loss_step=0.0379, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▉   | 4120/5971 [43:12<19:24,  1.59it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0379, train/loss_vlb_step=0.000143, train/loss_step=0.0379, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  69%|██████▉   | 4120/5971 [43:12<19:24,  1.59it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:10,  2.37it/s][A
Epoch 3:  69%|██████▉   | 4122/5971 [43:13<19:23,  1.59it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   1%|          | 2/167 [00:00<00:46,  3.52it/s][A
Epoch 3:  69%|██████▉   | 4124/5971 [43:13<19:21,  1.59it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   2%|▏         | 4/167 [00:00<00:23,  6.99it/s][A
Epoch 3:  69%|██████▉   | 4127/5971 [43:13<19:18,  1.59it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   4%|▍         | 7/167 [00:00<00:13, 12.18it/s][A
Epoch 3:  69%|██████▉   | 4130/5971 [43:13<19:15,  1.59it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   6%|▌         | 10/167 [00:00<00:09, 16.06it/s][A
Epoch 3:  69%|██████▉   | 4133/5971 [43:14<19:13,  1.59it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   8%|▊         | 13/167 [00:01<00:08, 17.75it/s][A
Epoch 3:  69%|██████▉   | 4136/5971 [43:14<19:10,  1.59it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  10%|▉         | 16/167 [00:01<00:07, 19.84it/s][A
Epoch 3:  69%|██████▉   | 4139/5971 [43:14<19:08,  1.60it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  11%|█▏        | 19/167 [00:01<00:06, 21.29it/s][A
Epoch 3:  69%|██████▉   | 4142/5971 [43:14<19:05,  1.60it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  13%|█▎        | 22/167 [00:01<00:06, 21.53it/s][A
Epoch 3:  69%|██████▉   | 4145/5971 [43:14<19:02,  1.60it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  15%|█▍        | 25/167 [00:01<00:06, 23.13it/s][A
Epoch 3:  69%|██████▉   | 4148/5971 [43:14<19:00,  1.60it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  17%|█▋        | 28/167 [00:01<00:05, 24.33it/s][A
Epoch 3:  70%|██████▉   | 4151/5971 [43:14<18:57,  1.60it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  19%|█▊        | 31/167 [00:01<00:05, 24.16it/s][A
Epoch 3:  70%|██████▉   | 4154/5971 [43:14<18:54,  1.60it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  20%|██        | 34/167 [00:01<00:05, 24.06it/s][A
Epoch 3:  70%|██████▉   | 4157/5971 [43:15<18:52,  1.60it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  22%|██▏       | 37/167 [00:02<00:05, 24.76it/s][A
Epoch 3:  70%|██████▉   | 4160/5971 [43:15<18:49,  1.60it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  24%|██▍       | 40/167 [00:02<00:04, 25.72it/s][A
Epoch 3:  70%|██████▉   | 4163/5971 [43:15<18:46,  1.60it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  26%|██▌       | 43/167 [00:02<00:04, 25.87it/s][A
Epoch 3:  70%|██████▉   | 4166/5971 [43:15<18:44,  1.61it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  28%|██▊       | 46/167 [00:02<00:04, 25.46it/s][A
Epoch 3:  70%|██████▉   | 4169/5971 [43:15<18:41,  1.61it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  29%|██▉       | 49/167 [00:02<00:04, 25.45it/s][A
Epoch 3:  70%|██████▉   | 4172/5971 [43:15<18:38,  1.61it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  31%|███       | 52/167 [00:02<00:04, 25.95it/s][A
Epoch 3:  70%|██████▉   | 4175/5971 [43:15<18:36,  1.61it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  33%|███▎      | 55/167 [00:02<00:04, 25.66it/s][A
Epoch 3:  70%|██████▉   | 4178/5971 [43:15<18:33,  1.61it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  35%|███▍      | 58/167 [00:02<00:04, 25.24it/s][A
Epoch 3:  70%|███████   | 4181/5971 [43:15<18:31,  1.61it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  37%|███▋      | 61/167 [00:02<00:04, 26.19it/s][A
Epoch 3:  70%|███████   | 4184/5971 [43:16<18:28,  1.61it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  38%|███▊      | 64/167 [00:03<00:03, 25.88it/s][A
Epoch 3:  70%|███████   | 4187/5971 [43:16<18:25,  1.61it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  40%|████      | 67/167 [00:03<00:03, 25.83it/s][A
Epoch 3:  70%|███████   | 4190/5971 [43:16<18:23,  1.61it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 25.33it/s][A
Epoch 3:  70%|███████   | 4193/5971 [43:16<18:20,  1.62it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 25.50it/s][A
Epoch 3:  70%|███████   | 4196/5971 [43:16<18:18,  1.62it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 25.08it/s][A
Epoch 3:  70%|███████   | 4199/5971 [43:16<18:15,  1.62it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 25.21it/s][A
Epoch 3:  70%|███████   | 4202/5971 [43:16<18:12,  1.62it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  49%|████▉     | 82/167 [00:03<00:03, 24.61it/s][A
Epoch 3:  70%|███████   | 4205/5971 [43:16<18:10,  1.62it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  51%|█████     | 85/167 [00:03<00:03, 24.44it/s][A
Epoch 3:  70%|███████   | 4208/5971 [43:17<18:07,  1.62it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  53%|█████▎    | 88/167 [00:04<00:03, 23.52it/s][A
Epoch 3:  71%|███████   | 4211/5971 [43:17<18:05,  1.62it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  54%|█████▍    | 91/167 [00:04<00:03, 24.42it/s][A
Epoch 3:  71%|███████   | 4214/5971 [43:17<18:02,  1.62it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 24.50it/s][A
Epoch 3:  71%|███████   | 4217/5971 [43:17<18:00,  1.62it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 24.62it/s][A
Epoch 3:  71%|███████   | 4220/5971 [43:17<17:57,  1.63it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 25.11it/s][A
Epoch 3:  71%|███████   | 4223/5971 [43:17<17:54,  1.63it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 26.15it/s][A
Epoch 3:  71%|███████   | 4226/5971 [43:17<17:52,  1.63it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 26.53it/s][A
Epoch 3:  71%|███████   | 4229/5971 [43:17<17:49,  1.63it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 26.45it/s][A
Epoch 3:  71%|███████   | 4232/5971 [43:17<17:47,  1.63it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  67%|██████▋   | 112/167 [00:04<00:02, 26.75it/s][A
Epoch 3:  71%|███████   | 4235/5971 [43:18<17:44,  1.63it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  69%|██████▉   | 115/167 [00:05<00:01, 27.12it/s][A
Epoch 3:  71%|███████   | 4238/5971 [43:18<17:42,  1.63it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  71%|███████   | 118/167 [00:05<00:01, 26.43it/s][A
Epoch 3:  71%|███████   | 4241/5971 [43:18<17:39,  1.63it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 26.19it/s][A
Epoch 3:  71%|███████   | 4244/5971 [43:18<17:37,  1.63it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 26.05it/s][A
Epoch 3:  71%|███████   | 4247/5971 [43:18<17:34,  1.63it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 24.64it/s][A
Epoch 3:  71%|███████   | 4250/5971 [43:18<17:32,  1.64it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 25.39it/s][A
Epoch 3:  71%|███████   | 4253/5971 [43:18<17:29,  1.64it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 24.56it/s][A
Epoch 3:  71%|███████▏  | 4256/5971 [43:18<17:27,  1.64it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 24.73it/s][A
Epoch 3:  71%|███████▏  | 4259/5971 [43:19<17:24,  1.64it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  83%|████████▎ | 139/167 [00:06<00:01, 24.77it/s][A
Epoch 3:  71%|███████▏  | 4262/5971 [43:19<17:21,  1.64it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  85%|████████▌ | 142/167 [00:06<00:01, 24.10it/s][A
Epoch 3:  71%|███████▏  | 4265/5971 [43:19<17:19,  1.64it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 24.02it/s][A
Epoch 3:  71%|███████▏  | 4268/5971 [43:19<17:16,  1.64it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 24.44it/s][A
Epoch 3:  72%|███████▏  | 4271/5971 [43:19<17:14,  1.64it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 23.98it/s][A
Epoch 3:  72%|███████▏  | 4274/5971 [43:19<17:11,  1.64it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 23.98it/s][A
Epoch 3:  72%|███████▏  | 4277/5971 [43:19<17:09,  1.65it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 22.96it/s][A
Epoch 3:  72%|███████▏  | 4280/5971 [43:19<17:06,  1.65it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 23.72it/s][A
Epoch 3:  72%|███████▏  | 4283/5971 [43:20<17:04,  1.65it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  98%|█████████▊| 163/167 [00:07<00:00, 24.15it/s][A
Epoch 3:  72%|███████▏  | 4286/5971 [43:20<17:01,  1.65it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  99%|█████████▉| 166/167 [00:07<00:00, 25.50it/s][A
Epoch 3:  72%|███████▏  | 4288/5971 [43:20<17:00,  1.65it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Epoch 3:  72%|███████▏  | 4289/5971 [43:21<16:59,  1.65it/s, loss=0.198, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000663, train/loss_step=0.194, global_step=2124.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4289/5971 [43:21<16:59,  1.65it/s, loss=0.181, v_num=0, train/loss_simple_step=0.00251, train/loss_vlb_step=1.4e-5, train/loss_step=0.00251, global_step=2125.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4290/5971 [43:22<16:59,  1.65it/s, loss=0.19, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000601, train/loss_step=0.177, global_step=2125.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  72%|███████▏  | 4291/5971 [43:23<16:59,  1.65it/s, loss=0.199, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.00109, train/loss_step=0.283, global_step=2125.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4292/5971 [43:26<16:59,  1.65it/s, loss=0.199, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.00109, train/loss_step=0.283, global_step=2125.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4292/5971 [43:26<16:59,  1.65it/s, loss=0.179, v_num=0, train/loss_simple_step=0.00374, train/loss_vlb_step=1.96e-5, train/loss_step=0.00374, global_step=2125.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4293/5971 [43:26<16:58,  1.65it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.52e-5, train/loss_step=0.0028, global_step=2126.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  72%|███████▏  | 4294/5971 [43:27<16:58,  1.65it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0323, train/loss_vlb_step=0.000123, train/loss_step=0.0323, global_step=2126.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4295/5971 [43:28<16:57,  1.65it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0323, train/loss_vlb_step=0.000123, train/loss_step=0.0323, global_step=2126.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4295/5971 [43:28<16:57,  1.65it/s, loss=0.159, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.00145, train/loss_step=0.322, global_step=2126.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  72%|███████▏  | 4296/5971 [43:31<16:58,  1.65it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0682, train/loss_vlb_step=0.00023, train/loss_step=0.0682, global_step=2126.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4297/5971 [43:32<16:57,  1.65it/s, loss=0.19, v_num=0, train/loss_simple_step=0.586, train/loss_vlb_step=0.00481, train/loss_step=0.586, global_step=2127.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  72%|███████▏  | 4298/5971 [43:33<16:57,  1.64it/s, loss=0.19, v_num=0, train/loss_simple_step=0.586, train/loss_vlb_step=0.00481, train/loss_step=0.586, global_step=2127.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4298/5971 [43:33<16:57,  1.64it/s, loss=0.171, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000486, train/loss_step=0.147, global_step=2127.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4299/5971 [43:34<16:56,  1.64it/s, loss=0.146, v_num=0, train/loss_simple_step=0.028, train/loss_vlb_step=0.000103, train/loss_step=0.028, global_step=2127.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4300/5971 [43:37<16:56,  1.64it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00754, train/loss_vlb_step=3.51e-5, train/loss_step=0.00754, global_step=2127.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4301/5971 [43:38<16:56,  1.64it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00754, train/loss_vlb_step=3.51e-5, train/loss_step=0.00754, global_step=2127.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4301/5971 [43:38<16:56,  1.64it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0279, train/loss_vlb_step=0.000115, train/loss_step=0.0279, global_step=2128.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  72%|███████▏  | 4302/5971 [43:39<16:55,  1.64it/s, loss=0.171, v_num=0, train/loss_simple_step=0.912, train/loss_vlb_step=0.154, train/loss_step=0.912, global_step=2128.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  72%|███████▏  | 4303/5971 [43:39<16:55,  1.64it/s, loss=0.186, v_num=0, train/loss_simple_step=0.410, train/loss_vlb_step=0.00234, train/loss_step=0.410, global_step=2128.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4304/5971 [43:42<16:55,  1.64it/s, loss=0.186, v_num=0, train/loss_simple_step=0.410, train/loss_vlb_step=0.00234, train/loss_step=0.410, global_step=2128.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4304/5971 [43:42<16:55,  1.64it/s, loss=0.213, v_num=0, train/loss_simple_step=0.548, train/loss_vlb_step=0.0057, train/loss_step=0.548, global_step=2128.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  72%|███████▏  | 4305/5971 [43:43<16:54,  1.64it/s, loss=0.223, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.000999, train/loss_step=0.247, global_step=2129.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4306/5971 [43:44<16:54,  1.64it/s, loss=0.215, v_num=0, train/loss_simple_step=0.258, train/loss_vlb_step=0.00124, train/loss_step=0.258, global_step=2129.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  72%|███████▏  | 4307/5971 [43:44<16:53,  1.64it/s, loss=0.215, v_num=0, train/loss_simple_step=0.258, train/loss_vlb_step=0.00124, train/loss_step=0.258, global_step=2129.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4307/5971 [43:44<16:53,  1.64it/s, loss=0.233, v_num=0, train/loss_simple_step=0.408, train/loss_vlb_step=0.00324, train/loss_step=0.408, global_step=2129.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4308/5971 [43:47<16:53,  1.64it/s, loss=0.224, v_num=0, train/loss_simple_step=0.00431, train/loss_vlb_step=2.25e-5, train/loss_step=0.00431, global_step=2129.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4309/5971 [43:48<16:53,  1.64it/s, loss=0.226, v_num=0, train/loss_simple_step=0.0493, train/loss_vlb_step=0.000173, train/loss_step=0.0493, global_step=2130.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  72%|███████▏  | 4310/5971 [43:49<16:52,  1.64it/s, loss=0.226, v_num=0, train/loss_simple_step=0.0493, train/loss_vlb_step=0.000173, train/loss_step=0.0493, global_step=2130.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4310/5971 [43:49<16:52,  1.64it/s, loss=0.226, v_num=0, train/loss_simple_step=0.178, train/loss_vlb_step=0.000738, train/loss_step=0.178, global_step=2130.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  72%|███████▏  | 4311/5971 [43:50<16:52,  1.64it/s, loss=0.212, v_num=0, train/loss_simple_step=0.00409, train/loss_vlb_step=2.16e-5, train/loss_step=0.00409, global_step=2130.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4312/5971 [43:52<16:52,  1.64it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0161, train/loss_vlb_step=7.25e-5, train/loss_step=0.0161, global_step=2130.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  72%|███████▏  | 4313/5971 [43:53<16:52,  1.64it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0161, train/loss_vlb_step=7.25e-5, train/loss_step=0.0161, global_step=2130.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4313/5971 [43:53<16:52,  1.64it/s, loss=0.23, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00169, train/loss_step=0.345, global_step=2131.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  72%|███████▏  | 4314/5971 [43:54<16:51,  1.64it/s, loss=0.232, v_num=0, train/loss_simple_step=0.0793, train/loss_vlb_step=0.00026, train/loss_step=0.0793, global_step=2131.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4315/5971 [43:55<16:51,  1.64it/s, loss=0.22, v_num=0, train/loss_simple_step=0.0829, train/loss_vlb_step=0.000273, train/loss_step=0.0829, global_step=2131.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4316/5971 [43:57<16:51,  1.64it/s, loss=0.22, v_num=0, train/loss_simple_step=0.0829, train/loss_vlb_step=0.000273, train/loss_step=0.0829, global_step=2131.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4316/5971 [43:57<16:51,  1.64it/s, loss=0.241, v_num=0, train/loss_simple_step=0.483, train/loss_vlb_step=0.00414, train/loss_step=0.483, global_step=2131.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  72%|███████▏  | 4317/5971 [43:58<16:50,  1.64it/s, loss=0.212, v_num=0, train/loss_simple_step=0.00698, train/loss_vlb_step=3.61e-5, train/loss_step=0.00698, global_step=2132.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4318/5971 [43:58<16:50,  1.64it/s, loss=0.218, v_num=0, train/loss_simple_step=0.271, train/loss_vlb_step=0.00124, train/loss_step=0.271, global_step=2132.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  72%|███████▏  | 4319/5971 [43:59<16:49,  1.64it/s, loss=0.218, v_num=0, train/loss_simple_step=0.271, train/loss_vlb_step=0.00124, train/loss_step=0.271, global_step=2132.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4319/5971 [43:59<16:49,  1.64it/s, loss=0.252, v_num=0, train/loss_simple_step=0.712, train/loss_vlb_step=0.0235, train/loss_step=0.712, global_step=2132.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  72%|███████▏  | 4320/5971 [44:03<16:50,  1.63it/s, loss=0.266, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.0012, train/loss_step=0.283, global_step=2132.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4321/5971 [44:04<16:49,  1.63it/s, loss=0.265, v_num=0, train/loss_simple_step=0.00469, train/loss_vlb_step=2.49e-5, train/loss_step=0.00469, global_step=2133.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4322/5971 [44:05<16:49,  1.63it/s, loss=0.265, v_num=0, train/loss_simple_step=0.00469, train/loss_vlb_step=2.49e-5, train/loss_step=0.00469, global_step=2133.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4322/5971 [44:05<16:49,  1.63it/s, loss=0.22, v_num=0, train/loss_simple_step=0.00796, train/loss_vlb_step=3.89e-5, train/loss_step=0.00796, global_step=2133.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  72%|███████▏  | 4323/5971 [44:06<16:48,  1.63it/s, loss=0.206, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000409, train/loss_step=0.124, global_step=2133.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  72%|███████▏  | 4324/5971 [44:09<16:49,  1.63it/s, loss=0.197, v_num=0, train/loss_simple_step=0.369, train/loss_vlb_step=0.00181, train/loss_step=0.369, global_step=2133.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  72%|███████▏  | 4325/5971 [44:10<16:48,  1.63it/s, loss=0.197, v_num=0, train/loss_simple_step=0.369, train/loss_vlb_step=0.00181, train/loss_step=0.369, global_step=2133.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4325/5971 [44:10<16:48,  1.63it/s, loss=0.192, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000516, train/loss_step=0.155, global_step=2134.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4326/5971 [44:11<16:48,  1.63it/s, loss=0.188, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.000584, train/loss_step=0.171, global_step=2134.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4327/5971 [44:12<16:47,  1.63it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00773, train/loss_vlb_step=3.75e-5, train/loss_step=0.00773, global_step=2134.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4328/5971 [44:15<16:47,  1.63it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00773, train/loss_vlb_step=3.75e-5, train/loss_step=0.00773, global_step=2134.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  72%|███████▏  | 4328/5971 [44:15<16:47,  1.63it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0369, train/loss_vlb_step=0.000136, train/loss_step=0.0369, global_step=2134.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  73%|███████▎  | 4329/5971 [44:16<16:47,  1.63it/s, loss=0.178, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000863, train/loss_step=0.227, global_step=2135.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  73%|███████▎  | 4330/5971 [44:17<16:46,  1.63it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00384, train/loss_vlb_step=2.03e-5, train/loss_step=0.00384, global_step=2135.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4331/5971 [44:17<16:46,  1.63it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00384, train/loss_vlb_step=2.03e-5, train/loss_step=0.00384, global_step=2135.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4331/5971 [44:17<16:46,  1.63it/s, loss=0.179, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000736, train/loss_step=0.196, global_step=2135.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  73%|███████▎  | 4332/5971 [44:20<16:46,  1.63it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.36e-6, train/loss_step=0.00156, global_step=2135.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4333/5971 [44:21<16:45,  1.63it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0232, train/loss_vlb_step=8.49e-5, train/loss_step=0.0232, global_step=2136.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  73%|███████▎  | 4334/5971 [44:22<16:45,  1.63it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0232, train/loss_vlb_step=8.49e-5, train/loss_step=0.0232, global_step=2136.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4334/5971 [44:22<16:45,  1.63it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0831, train/loss_vlb_step=0.000275, train/loss_step=0.0831, global_step=2136.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4335/5971 [44:23<16:44,  1.63it/s, loss=0.164, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2136.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  73%|███████▎  | 4336/5971 [44:25<16:44,  1.63it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00344, train/loss_vlb_step=1.89e-5, train/loss_step=0.00344, global_step=2136.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4337/5971 [44:26<16:44,  1.63it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00344, train/loss_vlb_step=1.89e-5, train/loss_step=0.00344, global_step=2136.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4337/5971 [44:26<16:44,  1.63it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00762, train/loss_vlb_step=3.48e-5, train/loss_step=0.00762, global_step=2137.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4338/5971 [44:27<16:43,  1.63it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00209, train/loss_vlb_step=1.2e-5, train/loss_step=0.00209, global_step=2137.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4339/5971 [44:28<16:43,  1.63it/s, loss=0.0964, v_num=0, train/loss_simple_step=0.0999, train/loss_vlb_step=0.000332, train/loss_step=0.0999, global_step=2137.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4340/5971 [44:30<16:43,  1.63it/s, loss=0.0964, v_num=0, train/loss_simple_step=0.0999, train/loss_vlb_step=0.000332, train/loss_step=0.0999, global_step=2137.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4340/5971 [44:30<16:43,  1.63it/s, loss=0.108, v_num=0, train/loss_simple_step=0.512, train/loss_vlb_step=0.00412, train/loss_step=0.512, global_step=2137.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  73%|███████▎  | 4341/5971 [44:31<16:42,  1.63it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=4.89e-5, train/loss_step=0.0109, global_step=2138.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4342/5971 [44:32<16:42,  1.63it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.74e-5, train/loss_step=0.0125, global_step=2138.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4343/5971 [44:33<16:41,  1.62it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.74e-5, train/loss_step=0.0125, global_step=2138.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4343/5971 [44:33<16:41,  1.62it/s, loss=0.125, v_num=0, train/loss_simple_step=0.463, train/loss_vlb_step=0.00538, train/loss_step=0.463, global_step=2138.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  73%|███████▎  | 4344/5971 [44:35<16:41,  1.62it/s, loss=0.107, v_num=0, train/loss_simple_step=0.00323, train/loss_vlb_step=1.8e-5, train/loss_step=0.00323, global_step=2138.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4345/5971 [44:36<16:41,  1.62it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0257, train/loss_vlb_step=0.000102, train/loss_step=0.0257, global_step=2139.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4346/5971 [44:37<16:40,  1.62it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0257, train/loss_vlb_step=0.000102, train/loss_step=0.0257, global_step=2139.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4346/5971 [44:37<16:40,  1.62it/s, loss=0.102, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.00074, train/loss_step=0.208, global_step=2139.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  73%|███████▎  | 4347/5971 [44:38<16:40,  1.62it/s, loss=0.109, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000457, train/loss_step=0.138, global_step=2139.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4348/5971 [44:40<16:40,  1.62it/s, loss=0.107, v_num=0, train/loss_simple_step=0.003, train/loss_vlb_step=1.64e-5, train/loss_step=0.003, global_step=2139.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  73%|███████▎  | 4349/5971 [44:41<16:39,  1.62it/s, loss=0.107, v_num=0, train/loss_simple_step=0.003, train/loss_vlb_step=1.64e-5, train/loss_step=0.003, global_step=2139.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4349/5971 [44:41<16:39,  1.62it/s, loss=0.102, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.00041, train/loss_step=0.122, global_step=2140.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4350/5971 [44:42<16:39,  1.62it/s, loss=0.109, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000469, train/loss_step=0.141, global_step=2140.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4351/5971 [44:43<16:38,  1.62it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000214, train/loss_step=0.0596, global_step=2140.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4352/5971 [44:46<16:39,  1.62it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000214, train/loss_step=0.0596, global_step=2140.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4352/5971 [44:46<16:39,  1.62it/s, loss=0.125, v_num=0, train/loss_simple_step=0.451, train/loss_vlb_step=0.00267, train/loss_step=0.451, global_step=2140.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  73%|███████▎  | 4353/5971 [44:46<16:38,  1.62it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0534, train/loss_vlb_step=0.000178, train/loss_step=0.0534, global_step=2141.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4354/5971 [44:47<16:37,  1.62it/s, loss=0.124, v_num=0, train/loss_simple_step=0.037, train/loss_vlb_step=0.000139, train/loss_step=0.037, global_step=2141.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  73%|███████▎  | 4355/5971 [44:48<16:37,  1.62it/s, loss=0.124, v_num=0, train/loss_simple_step=0.037, train/loss_vlb_step=0.000139, train/loss_step=0.037, global_step=2141.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4355/5971 [44:48<16:37,  1.62it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00556, train/loss_vlb_step=2.89e-5, train/loss_step=0.00556, global_step=2141.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4356/5971 [44:50<16:37,  1.62it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00238, train/loss_vlb_step=1.37e-5, train/loss_step=0.00238, global_step=2141.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4357/5971 [44:51<16:36,  1.62it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0537, train/loss_vlb_step=0.000197, train/loss_step=0.0537, global_step=2142.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  73%|███████▎  | 4358/5971 [44:52<16:36,  1.62it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0537, train/loss_vlb_step=0.000197, train/loss_step=0.0537, global_step=2142.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4358/5971 [44:52<16:36,  1.62it/s, loss=0.147, v_num=0, train/loss_simple_step=0.547, train/loss_vlb_step=0.00625, train/loss_step=0.547, global_step=2142.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  73%|███████▎  | 4359/5971 [44:53<16:35,  1.62it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0171, train/loss_vlb_step=6.68e-5, train/loss_step=0.0171, global_step=2142.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4360/5971 [44:55<16:35,  1.62it/s, loss=0.126, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000557, train/loss_step=0.163, global_step=2142.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  73%|███████▎  | 4361/5971 [44:56<16:35,  1.62it/s, loss=0.126, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000557, train/loss_step=0.163, global_step=2142.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4361/5971 [44:56<16:35,  1.62it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00277, train/loss_vlb_step=1.57e-5, train/loss_step=0.00277, global_step=2143.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4362/5971 [44:57<16:34,  1.62it/s, loss=0.14, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00155, train/loss_step=0.305, global_step=2143.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:  73%|███████▎  | 4363/5971 [44:58<16:34,  1.62it/s, loss=0.123, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000412, train/loss_step=0.124, global_step=2143.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4364/5971 [45:01<16:34,  1.62it/s, loss=0.123, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000412, train/loss_step=0.124, global_step=2143.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4364/5971 [45:01<16:34,  1.62it/s, loss=0.132, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000598, train/loss_step=0.181, global_step=2143.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4365/5971 [45:02<16:34,  1.62it/s, loss=0.143, v_num=0, train/loss_simple_step=0.252, train/loss_vlb_step=0.000992, train/loss_step=0.252, global_step=2144.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4366/5971 [45:03<16:33,  1.62it/s, loss=0.149, v_num=0, train/loss_simple_step=0.315, train/loss_vlb_step=0.00162, train/loss_step=0.315, global_step=2144.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  73%|███████▎  | 4367/5971 [45:04<16:32,  1.62it/s, loss=0.149, v_num=0, train/loss_simple_step=0.315, train/loss_vlb_step=0.00162, train/loss_step=0.315, global_step=2144.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4367/5971 [45:04<16:32,  1.62it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0777, train/loss_vlb_step=0.000263, train/loss_step=0.0777, global_step=2144.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4368/5971 [45:06<16:32,  1.61it/s, loss=0.155, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000645, train/loss_step=0.190, global_step=2144.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  73%|███████▎  | 4369/5971 [45:07<16:32,  1.61it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0235, train/loss_vlb_step=9.36e-5, train/loss_step=0.0235, global_step=2145.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4370/5971 [45:07<16:31,  1.61it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0235, train/loss_vlb_step=9.36e-5, train/loss_step=0.0235, global_step=2145.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4370/5971 [45:07<16:31,  1.61it/s, loss=0.155, v_num=0, train/loss_simple_step=0.238, train/loss_vlb_step=0.00109, train/loss_step=0.238, global_step=2145.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  73%|███████▎  | 4371/5971 [45:08<16:31,  1.61it/s, loss=0.16, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000537, train/loss_step=0.160, global_step=2145.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4372/5971 [45:12<16:31,  1.61it/s, loss=0.159, v_num=0, train/loss_simple_step=0.433, train/loss_vlb_step=0.00403, train/loss_step=0.433, global_step=2145.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4373/5971 [45:13<16:31,  1.61it/s, loss=0.159, v_num=0, train/loss_simple_step=0.433, train/loss_vlb_step=0.00403, train/loss_step=0.433, global_step=2145.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4373/5971 [45:13<16:31,  1.61it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00197, train/loss_vlb_step=1.16e-5, train/loss_step=0.00197, global_step=2146.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4374/5971 [45:14<16:30,  1.61it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.18e-5, train/loss_step=0.0139, global_step=2146.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  73%|███████▎  | 4375/5971 [45:14<16:30,  1.61it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0484, train/loss_vlb_step=0.000175, train/loss_step=0.0484, global_step=2146.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4376/5971 [45:17<16:30,  1.61it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0484, train/loss_vlb_step=0.000175, train/loss_step=0.0484, global_step=2146.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4376/5971 [45:17<16:30,  1.61it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000224, train/loss_step=0.0655, global_step=2146.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4377/5971 [45:18<16:29,  1.61it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0857, train/loss_vlb_step=0.000288, train/loss_step=0.0857, global_step=2147.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4378/5971 [45:18<16:29,  1.61it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00361, train/loss_vlb_step=1.97e-5, train/loss_step=0.00361, global_step=2147.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4379/5971 [45:19<16:28,  1.61it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00361, train/loss_vlb_step=1.97e-5, train/loss_step=0.00361, global_step=2147.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4379/5971 [45:19<16:28,  1.61it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0762, train/loss_vlb_step=0.000251, train/loss_step=0.0762, global_step=2147.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  73%|███████▎  | 4380/5971 [45:22<16:28,  1.61it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00228, train/loss_vlb_step=1.35e-5, train/loss_step=0.00228, global_step=2147.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4381/5971 [45:23<16:28,  1.61it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00264, train/loss_vlb_step=1.44e-5, train/loss_step=0.00264, global_step=2148.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4382/5971 [45:24<16:27,  1.61it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00264, train/loss_vlb_step=1.44e-5, train/loss_step=0.00264, global_step=2148.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4382/5971 [45:24<16:27,  1.61it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0177, train/loss_vlb_step=7.59e-5, train/loss_step=0.0177, global_step=2148.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  73%|███████▎  | 4383/5971 [45:24<16:27,  1.61it/s, loss=0.122, v_num=0, train/loss_simple_step=0.257, train/loss_vlb_step=0.00119, train/loss_step=0.257, global_step=2148.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  73%|███████▎  | 4384/5971 [45:27<16:27,  1.61it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0174, train/loss_vlb_step=7.09e-5, train/loss_step=0.0174, global_step=2148.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4385/5971 [45:28<16:26,  1.61it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0174, train/loss_vlb_step=7.09e-5, train/loss_step=0.0174, global_step=2148.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4385/5971 [45:28<16:26,  1.61it/s, loss=0.14, v_num=0, train/loss_simple_step=0.768, train/loss_vlb_step=0.0309, train/loss_step=0.768, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  73%|███████▎  | 4386/5971 [45:29<16:26,  1.61it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0204, train/loss_vlb_step=7.76e-5, train/loss_step=0.0204, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4387/5971 [45:29<16:25,  1.61it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0164, train/loss_vlb_step=7.05e-5, train/loss_step=0.0164, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4388/5971 [45:32<16:25,  1.61it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0164, train/loss_vlb_step=7.05e-5, train/loss_step=0.0164, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  73%|███████▎  | 4388/5971 [45:32<16:25,  1.61it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:08,  2.43it/s]

Validating:   1%|          | 2/167 [00:00<00:45,  3.63it/s]
Epoch 3:  74%|███████▎  | 4391/5971 [45:32<16:23,  1.61it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   4%|▎         | 6/167 [00:00<00:14, 11.27it/s]
Epoch 3:  74%|███████▎  | 4395/5971 [45:32<16:19,  1.61it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   6%|▌         | 10/167 [00:00<00:09, 16.89it/s]
Epoch 3:  74%|███████▎  | 4399/5971 [45:33<16:16,  1.61it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   8%|▊         | 13/167 [00:00<00:07, 19.85it/s][A
Epoch 3:  74%|███████▎  | 4403/5971 [45:33<16:13,  1.61it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  10%|▉         | 16/167 [00:01<00:06, 22.04it/s][A
Epoch 3:  74%|███████▍  | 4407/5971 [45:33<16:09,  1.61it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  11%|█▏        | 19/167 [00:01<00:06, 23.62it/s][A

Validating:  13%|█▎        | 22/167 [00:01<00:05, 24.81it/s][A
Epoch 3:  74%|███████▍  | 4411/5971 [45:33<16:06,  1.61it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  15%|█▍        | 25/167 [00:01<00:05, 25.86it/s][A
Epoch 3:  74%|███████▍  | 4415/5971 [45:33<16:03,  1.62it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  17%|█▋        | 28/167 [00:01<00:05, 26.07it/s][A
Epoch 3:  74%|███████▍  | 4419/5971 [45:33<15:59,  1.62it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  19%|█▊        | 31/167 [00:01<00:05, 25.48it/s][A

Validating:  20%|██        | 34/167 [00:01<00:05, 25.32it/s][A
Epoch 3:  74%|███████▍  | 4423/5971 [45:33<15:56,  1.62it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  22%|██▏       | 37/167 [00:01<00:04, 26.04it/s][A
Epoch 3:  74%|███████▍  | 4427/5971 [45:34<15:53,  1.62it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  24%|██▍       | 40/167 [00:01<00:05, 24.82it/s][A
Epoch 3:  74%|███████▍  | 4431/5971 [45:34<15:50,  1.62it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  26%|██▌       | 43/167 [00:02<00:04, 25.97it/s][A

Validating:  28%|██▊       | 46/167 [00:02<00:04, 26.56it/s][A
Epoch 3:  74%|███████▍  | 4435/5971 [45:34<15:46,  1.62it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  29%|██▉       | 49/167 [00:02<00:04, 26.56it/s][A
Epoch 3:  74%|███████▍  | 4439/5971 [45:34<15:43,  1.62it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  31%|███       | 52/167 [00:02<00:04, 26.73it/s][A
Epoch 3:  74%|███████▍  | 4443/5971 [45:34<15:40,  1.63it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  33%|███▎      | 55/167 [00:02<00:04, 26.15it/s][A

Validating:  35%|███▍      | 58/167 [00:02<00:04, 26.46it/s][A
Epoch 3:  74%|███████▍  | 4447/5971 [45:34<15:37,  1.63it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  37%|███▋      | 61/167 [00:02<00:04, 25.56it/s][A
Epoch 3:  75%|███████▍  | 4451/5971 [45:35<15:33,  1.63it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  38%|███▊      | 64/167 [00:02<00:03, 26.01it/s][A
Epoch 3:  75%|███████▍  | 4455/5971 [45:35<15:30,  1.63it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  40%|████      | 67/167 [00:03<00:04, 24.58it/s][A

Validating:  42%|████▏     | 70/167 [00:03<00:03, 25.71it/s][A
Epoch 3:  75%|███████▍  | 4459/5971 [45:35<15:27,  1.63it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 26.02it/s][A
Epoch 3:  75%|███████▍  | 4463/5971 [45:35<15:24,  1.63it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 23.92it/s][A
Epoch 3:  75%|███████▍  | 4467/5971 [45:35<15:20,  1.63it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 23.02it/s][A

Validating:  49%|████▉     | 82/167 [00:03<00:03, 23.16it/s][A
Epoch 3:  75%|███████▍  | 4471/5971 [45:35<15:17,  1.63it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  51%|█████     | 85/167 [00:03<00:03, 24.16it/s][A
Epoch 3:  75%|███████▍  | 4475/5971 [45:36<15:14,  1.64it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  53%|█████▎    | 88/167 [00:03<00:03, 24.76it/s][A
Epoch 3:  75%|███████▌  | 4479/5971 [45:36<15:11,  1.64it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  54%|█████▍    | 91/167 [00:03<00:02, 25.34it/s][A
Epoch 3:  75%|███████▌  | 4483/5971 [45:36<15:08,  1.64it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 26.58it/s][A

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 26.40it/s][A
Epoch 3:  75%|███████▌  | 4487/5971 [45:36<15:04,  1.64it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  60%|██████    | 101/167 [00:04<00:02, 26.03it/s][A
Epoch 3:  75%|███████▌  | 4491/5971 [45:36<15:01,  1.64it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 26.75it/s][A
Epoch 3:  75%|███████▌  | 4495/5971 [45:36<14:58,  1.64it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 25.49it/s][A

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 25.51it/s][A
Epoch 3:  75%|███████▌  | 4499/5971 [45:36<14:55,  1.64it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  68%|██████▊   | 113/167 [00:04<00:02, 26.59it/s][A
Epoch 3:  75%|███████▌  | 4503/5971 [45:37<14:52,  1.65it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  69%|██████▉   | 116/167 [00:04<00:02, 22.85it/s][A
Epoch 3:  75%|███████▌  | 4507/5971 [45:37<14:48,  1.65it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  71%|███████▏  | 119/167 [00:05<00:02, 23.70it/s][A

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 24.78it/s][A
Epoch 3:  76%|███████▌  | 4511/5971 [45:37<14:45,  1.65it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 25.71it/s][A
Epoch 3:  76%|███████▌  | 4515/5971 [45:37<14:42,  1.65it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 26.30it/s][A
Epoch 3:  76%|███████▌  | 4519/5971 [45:37<14:39,  1.65it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 27.19it/s][A

Validating:  80%|████████  | 134/167 [00:05<00:01, 27.83it/s][A
Epoch 3:  76%|███████▌  | 4523/5971 [45:37<14:36,  1.65it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 28.24it/s][A
Epoch 3:  76%|███████▌  | 4527/5971 [45:37<14:33,  1.65it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  84%|████████▍ | 140/167 [00:05<00:00, 28.68it/s][A
Epoch 3:  76%|███████▌  | 4531/5971 [45:38<14:30,  1.66it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  86%|████████▌ | 143/167 [00:05<00:00, 28.81it/s][A

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 28.52it/s][A
Epoch 3:  76%|███████▌  | 4535/5971 [45:38<14:26,  1.66it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 28.70it/s][A
Epoch 3:  76%|███████▌  | 4539/5971 [45:38<14:23,  1.66it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 27.91it/s][A
Epoch 3:  76%|███████▌  | 4543/5971 [45:38<14:20,  1.66it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 26.92it/s][A

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 25.33it/s][A
Epoch 3:  76%|███████▌  | 4547/5971 [45:38<14:17,  1.66it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 25.26it/s][A
Epoch 3:  76%|███████▌  | 4551/5971 [45:38<14:14,  1.66it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 24.59it/s][A
Epoch 3:  76%|███████▋  | 4555/5971 [45:39<14:11,  1.66it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating: 100%|██████████| 167/167 [00:06<00:00, 25.71it/s][A
Epoch 3:  76%|███████▋  | 4556/5971 [45:39<14:10,  1.66it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2149.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Epoch 3:  76%|███████▋  | 4557/5971 [45:40<14:10,  1.66it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0515, train/loss_vlb_step=0.000177, train/loss_step=0.0515, global_step=2150.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  76%|███████▋  | 4558/5971 [45:41<14:09,  1.66it/s, loss=0.116, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.00145, train/loss_step=0.283, global_step=2150.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  76%|███████▋  | 4559/5971 [45:42<14:09,  1.66it/s, loss=0.116, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.00145, train/loss_step=0.283, global_step=2150.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  76%|███████▋  | 4559/5971 [45:42<14:09,  1.66it/s, loss=0.116, v_num=0, train/loss_simple_step=0.159, train/loss_vlb_step=0.000524, train/loss_step=0.159, global_step=2150.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  76%|███████▋  | 4560/5971 [45:45<14:09,  1.66it/s, loss=0.103, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000611, train/loss_step=0.177, global_step=2150.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  76%|███████▋  | 4561/5971 [45:46<14:08,  1.66it/s, loss=0.119, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00142, train/loss_step=0.320, global_step=2151.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  76%|███████▋  | 4562/5971 [45:47<14:08,  1.66it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00268, train/loss_vlb_step=1.42e-5, train/loss_step=0.00268, global_step=2151.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  76%|███████▋  | 4563/5971 [45:48<14:07,  1.66it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00268, train/loss_vlb_step=1.42e-5, train/loss_step=0.00268, global_step=2151.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  76%|███████▋  | 4563/5971 [45:48<14:07,  1.66it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0176, train/loss_vlb_step=8.11e-5, train/loss_step=0.0176, global_step=2151.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  76%|███████▋  | 4564/5971 [45:50<14:07,  1.66it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00132, train/loss_vlb_step=7.99e-6, train/loss_step=0.00132, global_step=2151.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  76%|███████▋  | 4565/5971 [45:51<14:07,  1.66it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=6.16e-5, train/loss_step=0.0156, global_step=2152.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  76%|███████▋  | 4566/5971 [45:52<14:06,  1.66it/s, loss=0.116, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.00035, train/loss_step=0.106, global_step=2152.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  76%|███████▋  | 4567/5971 [45:53<14:06,  1.66it/s, loss=0.116, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.00035, train/loss_step=0.106, global_step=2152.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  76%|███████▋  | 4567/5971 [45:53<14:06,  1.66it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0365, train/loss_vlb_step=0.00013, train/loss_step=0.0365, global_step=2152.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4568/5971 [45:55<14:06,  1.66it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00261, train/loss_vlb_step=1.46e-5, train/loss_step=0.00261, global_step=2152.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4569/5971 [45:56<14:05,  1.66it/s, loss=0.131, v_num=0, train/loss_simple_step=0.357, train/loss_vlb_step=0.00227, train/loss_step=0.357, global_step=2153.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  77%|███████▋  | 4570/5971 [45:57<14:05,  1.66it/s, loss=0.131, v_num=0, train/loss_simple_step=0.015, train/loss_vlb_step=6.11e-5, train/loss_step=0.015, global_step=2153.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4571/5971 [45:58<14:04,  1.66it/s, loss=0.131, v_num=0, train/loss_simple_step=0.015, train/loss_vlb_step=6.11e-5, train/loss_step=0.015, global_step=2153.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4571/5971 [45:58<14:04,  1.66it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00594, train/loss_vlb_step=3.07e-5, train/loss_step=0.00594, global_step=2153.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4572/5971 [46:00<14:04,  1.66it/s, loss=0.129, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000717, train/loss_step=0.214, global_step=2153.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  77%|███████▋  | 4573/5971 [46:01<14:04,  1.66it/s, loss=0.0916, v_num=0, train/loss_simple_step=0.0291, train/loss_vlb_step=0.000119, train/loss_step=0.0291, global_step=2154.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4574/5971 [46:02<14:03,  1.66it/s, loss=0.116, v_num=0, train/loss_simple_step=0.517, train/loss_vlb_step=0.00572, train/loss_step=0.517, global_step=2154.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  77%|███████▋  | 4575/5971 [46:03<14:03,  1.66it/s, loss=0.116, v_num=0, train/loss_simple_step=0.517, train/loss_vlb_step=0.00572, train/loss_step=0.517, global_step=2154.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4575/5971 [46:03<14:03,  1.66it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0918, train/loss_vlb_step=0.000305, train/loss_step=0.0918, global_step=2154.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4576/5971 [46:06<14:03,  1.65it/s, loss=0.127, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000459, train/loss_step=0.139, global_step=2154.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  77%|███████▋  | 4577/5971 [46:07<14:02,  1.65it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0563, train/loss_vlb_step=0.000186, train/loss_step=0.0563, global_step=2155.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4578/5971 [46:07<14:02,  1.65it/s, loss=0.134, v_num=0, train/loss_simple_step=0.420, train/loss_vlb_step=0.00296, train/loss_step=0.420, global_step=2155.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  77%|███████▋  | 4579/5971 [46:08<14:01,  1.65it/s, loss=0.134, v_num=0, train/loss_simple_step=0.420, train/loss_vlb_step=0.00296, train/loss_step=0.420, global_step=2155.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4579/5971 [46:08<14:01,  1.65it/s, loss=0.133, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000437, train/loss_step=0.128, global_step=2155.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4580/5971 [46:11<14:01,  1.65it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0842, train/loss_vlb_step=0.000286, train/loss_step=0.0842, global_step=2155.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4581/5971 [46:12<14:00,  1.65it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00149, train/loss_vlb_step=9.03e-6, train/loss_step=0.00149, global_step=2156.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4582/5971 [46:13<14:00,  1.65it/s, loss=0.118, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2156.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  77%|███████▋  | 4583/5971 [46:13<13:59,  1.65it/s, loss=0.118, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2156.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4583/5971 [46:13<13:59,  1.65it/s, loss=0.124, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000445, train/loss_step=0.135, global_step=2156.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4584/5971 [46:16<13:59,  1.65it/s, loss=0.133, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000627, train/loss_step=0.177, global_step=2156.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4585/5971 [46:17<13:59,  1.65it/s, loss=0.141, v_num=0, train/loss_simple_step=0.182, train/loss_vlb_step=0.000633, train/loss_step=0.182, global_step=2157.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4586/5971 [46:17<13:58,  1.65it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0504, train/loss_vlb_step=0.000174, train/loss_step=0.0504, global_step=2157.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4587/5971 [46:18<13:58,  1.65it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0504, train/loss_vlb_step=0.000174, train/loss_step=0.0504, global_step=2157.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4587/5971 [46:18<13:58,  1.65it/s, loss=0.143, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000427, train/loss_step=0.130, global_step=2157.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  77%|███████▋  | 4588/5971 [46:20<13:58,  1.65it/s, loss=0.149, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000368, train/loss_step=0.112, global_step=2157.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4589/5971 [46:21<13:57,  1.65it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000206, train/loss_step=0.0596, global_step=2158.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4590/5971 [46:22<13:57,  1.65it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0158, train/loss_vlb_step=6.6e-5, train/loss_step=0.0158, global_step=2158.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  77%|███████▋  | 4591/5971 [46:23<13:56,  1.65it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0158, train/loss_vlb_step=6.6e-5, train/loss_step=0.0158, global_step=2158.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4591/5971 [46:23<13:56,  1.65it/s, loss=0.164, v_num=0, train/loss_simple_step=0.622, train/loss_vlb_step=0.00603, train/loss_step=0.622, global_step=2158.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  77%|███████▋  | 4592/5971 [46:26<13:56,  1.65it/s, loss=0.169, v_num=0, train/loss_simple_step=0.314, train/loss_vlb_step=0.0014, train/loss_step=0.314, global_step=2158.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  77%|███████▋  | 4593/5971 [46:27<13:56,  1.65it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000183, train/loss_step=0.0497, global_step=2159.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4594/5971 [46:28<13:55,  1.65it/s, loss=0.16, v_num=0, train/loss_simple_step=0.301, train/loss_vlb_step=0.00154, train/loss_step=0.301, global_step=2159.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  77%|███████▋  | 4595/5971 [46:28<13:54,  1.65it/s, loss=0.16, v_num=0, train/loss_simple_step=0.301, train/loss_vlb_step=0.00154, train/loss_step=0.301, global_step=2159.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4595/5971 [46:28<13:54,  1.65it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00291, train/loss_vlb_step=1.61e-5, train/loss_step=0.00291, global_step=2159.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4596/5971 [46:31<13:54,  1.65it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0778, train/loss_vlb_step=0.00026, train/loss_step=0.0778, global_step=2159.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  77%|███████▋  | 4597/5971 [46:32<13:54,  1.65it/s, loss=0.174, v_num=0, train/loss_simple_step=0.500, train/loss_vlb_step=0.00512, train/loss_step=0.500, global_step=2160.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  77%|███████▋  | 4598/5971 [46:32<13:53,  1.65it/s, loss=0.159, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000403, train/loss_step=0.123, global_step=2160.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4599/5971 [46:33<13:53,  1.65it/s, loss=0.159, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000403, train/loss_step=0.123, global_step=2160.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4599/5971 [46:33<13:53,  1.65it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00498, train/loss_vlb_step=2.6e-5, train/loss_step=0.00498, global_step=2160.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4600/5971 [46:36<13:53,  1.65it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0208, train/loss_vlb_step=8.19e-5, train/loss_step=0.0208, global_step=2160.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  77%|███████▋  | 4601/5971 [46:37<13:52,  1.65it/s, loss=0.156, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000385, train/loss_step=0.117, global_step=2161.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4602/5971 [46:38<13:52,  1.65it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00494, train/loss_vlb_step=2.48e-5, train/loss_step=0.00494, global_step=2161.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4603/5971 [46:38<13:51,  1.64it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00494, train/loss_vlb_step=2.48e-5, train/loss_step=0.00494, global_step=2161.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4603/5971 [46:38<13:51,  1.64it/s, loss=0.16, v_num=0, train/loss_simple_step=0.336, train/loss_vlb_step=0.00216, train/loss_step=0.336, global_step=2161.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  77%|███████▋  | 4604/5971 [46:41<13:51,  1.64it/s, loss=0.156, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000336, train/loss_step=0.101, global_step=2161.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4605/5971 [46:42<13:50,  1.64it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00355, train/loss_vlb_step=2e-5, train/loss_step=0.00355, global_step=2162.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4606/5971 [46:42<13:50,  1.64it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00483, train/loss_vlb_step=2.43e-5, train/loss_step=0.00483, global_step=2162.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4607/5971 [46:43<13:49,  1.64it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00483, train/loss_vlb_step=2.43e-5, train/loss_step=0.00483, global_step=2162.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4607/5971 [46:43<13:49,  1.64it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00652, train/loss_vlb_step=3.15e-5, train/loss_step=0.00652, global_step=2162.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4608/5971 [46:46<13:49,  1.64it/s, loss=0.146, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.0011, train/loss_step=0.255, global_step=2162.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:  77%|███████▋  | 4609/5971 [46:47<13:49,  1.64it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0355, train/loss_vlb_step=0.000129, train/loss_step=0.0355, global_step=2163.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4610/5971 [46:48<13:48,  1.64it/s, loss=0.157, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00127, train/loss_step=0.269, global_step=2163.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  77%|███████▋  | 4611/5971 [46:49<13:48,  1.64it/s, loss=0.157, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00127, train/loss_step=0.269, global_step=2163.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4611/5971 [46:49<13:48,  1.64it/s, loss=0.132, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000376, train/loss_step=0.114, global_step=2163.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4612/5971 [46:51<13:48,  1.64it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0079, train/loss_vlb_step=3.65e-5, train/loss_step=0.0079, global_step=2163.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4613/5971 [46:52<13:47,  1.64it/s, loss=0.134, v_num=0, train/loss_simple_step=0.388, train/loss_vlb_step=0.00227, train/loss_step=0.388, global_step=2164.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  77%|███████▋  | 4614/5971 [46:53<13:47,  1.64it/s, loss=0.124, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000361, train/loss_step=0.110, global_step=2164.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4615/5971 [46:54<13:46,  1.64it/s, loss=0.124, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000361, train/loss_step=0.110, global_step=2164.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4615/5971 [46:54<13:46,  1.64it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00294, train/loss_vlb_step=1.62e-5, train/loss_step=0.00294, global_step=2164.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4616/5971 [46:56<13:46,  1.64it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0458, train/loss_vlb_step=0.000159, train/loss_step=0.0458, global_step=2164.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  77%|███████▋  | 4617/5971 [46:57<13:46,  1.64it/s, loss=0.103, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000387, train/loss_step=0.116, global_step=2165.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  77%|███████▋  | 4618/5971 [46:58<13:45,  1.64it/s, loss=0.0976, v_num=0, train/loss_simple_step=0.00671, train/loss_vlb_step=3.29e-5, train/loss_step=0.00671, global_step=2165.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4619/5971 [46:59<13:45,  1.64it/s, loss=0.0976, v_num=0, train/loss_simple_step=0.00671, train/loss_vlb_step=3.29e-5, train/loss_step=0.00671, global_step=2165.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4619/5971 [46:59<13:45,  1.64it/s, loss=0.0988, v_num=0, train/loss_simple_step=0.0288, train/loss_vlb_step=0.000113, train/loss_step=0.0288, global_step=2165.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  77%|███████▋  | 4620/5971 [47:02<13:45,  1.64it/s, loss=0.114, v_num=0, train/loss_simple_step=0.326, train/loss_vlb_step=0.00153, train/loss_step=0.326, global_step=2165.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  77%|███████▋  | 4621/5971 [47:03<13:44,  1.64it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0856, train/loss_vlb_step=0.000286, train/loss_step=0.0856, global_step=2166.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4622/5971 [47:04<13:44,  1.64it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0062, train/loss_vlb_step=3.04e-5, train/loss_step=0.0062, global_step=2166.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  77%|███████▋  | 4623/5971 [47:04<13:43,  1.64it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0062, train/loss_vlb_step=3.04e-5, train/loss_step=0.0062, global_step=2166.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4623/5971 [47:04<13:43,  1.64it/s, loss=0.0973, v_num=0, train/loss_simple_step=0.0321, train/loss_vlb_step=0.000125, train/loss_step=0.0321, global_step=2166.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4624/5971 [47:07<13:43,  1.64it/s, loss=0.0965, v_num=0, train/loss_simple_step=0.0833, train/loss_vlb_step=0.000274, train/loss_step=0.0833, global_step=2166.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4625/5971 [47:07<13:42,  1.64it/s, loss=0.0974, v_num=0, train/loss_simple_step=0.022, train/loss_vlb_step=8.69e-5, train/loss_step=0.022, global_step=2167.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  77%|███████▋  | 4626/5971 [47:08<13:42,  1.64it/s, loss=0.103, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000422, train/loss_step=0.126, global_step=2167.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4627/5971 [47:09<13:41,  1.64it/s, loss=0.103, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000422, train/loss_step=0.126, global_step=2167.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  77%|███████▋  | 4627/5971 [47:09<13:41,  1.64it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00405, train/loss_vlb_step=2.16e-5, train/loss_step=0.00405, global_step=2167.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  78%|███████▊  | 4628/5971 [47:11<13:41,  1.63it/s, loss=0.0907, v_num=0, train/loss_simple_step=0.00248, train/loss_vlb_step=1.43e-5, train/loss_step=0.00248, global_step=2167.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  78%|███████▊  | 4629/5971 [47:12<13:41,  1.63it/s, loss=0.101, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.000921, train/loss_step=0.247, global_step=2168.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  78%|███████▊  | 4630/5971 [47:13<13:40,  1.63it/s, loss=0.0879, v_num=0, train/loss_simple_step=0.00262, train/loss_vlb_step=1.45e-5, train/loss_step=0.00262, global_step=2168.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  78%|███████▊  | 4631/5971 [47:14<13:40,  1.63it/s, loss=0.0879, v_num=0, train/loss_simple_step=0.00262, train/loss_vlb_step=1.45e-5, train/loss_step=0.00262, global_step=2168.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  78%|███████▊  | 4631/5971 [47:14<13:40,  1.63it/s, loss=0.113, v_num=0, train/loss_simple_step=0.608, train/loss_vlb_step=0.0086, train/loss_step=0.608, global_step=2168.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]      
Epoch 3:  78%|███████▊  | 4632/5971 [47:16<13:39,  1.63it/s, loss=0.132, v_num=0, train/loss_simple_step=0.399, train/loss_vlb_step=0.0025, train/loss_step=0.399, global_step=2168.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  78%|███████▊  | 4633/5971 [47:17<13:39,  1.63it/s, loss=0.133, v_num=0, train/loss_simple_step=0.396, train/loss_vlb_step=0.00214, train/loss_step=0.396, global_step=2169.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  78%|███████▊  | 4634/5971 [47:18<13:38,  1.63it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0575, train/loss_vlb_step=0.000198, train/loss_step=0.0575, global_step=2169.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  78%|███████▊  | 4635/5971 [47:19<13:38,  1.63it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0575, train/loss_vlb_step=0.000198, train/loss_step=0.0575, global_step=2169.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  78%|███████▊  | 4635/5971 [47:19<13:38,  1.63it/s, loss=0.143, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.00102, train/loss_step=0.266, global_step=2169.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  78%|███████▊  | 4636/5971 [47:21<13:38,  1.63it/s, loss=0.149, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000588, train/loss_step=0.173, global_step=2169.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  78%|███████▊  | 4637/5971 [47:22<13:37,  1.63it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0233, train/loss_vlb_step=8.63e-5, train/loss_step=0.0233, global_step=2170.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  78%|███████▊  | 4638/5971 [47:23<13:37,  1.63it/s, loss=0.157, v_num=0, train/loss_simple_step=0.259, train/loss_vlb_step=0.00109, train/loss_step=0.259, global_step=2170.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  78%|███████▊  | 4639/5971 [47:24<13:36,  1.63it/s, loss=0.157, v_num=0, train/loss_simple_step=0.259, train/loss_vlb_step=0.00109, train/loss_step=0.259, global_step=2170.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  78%|███████▊  | 4639/5971 [47:24<13:36,  1.63it/s, loss=0.156, v_num=0, train/loss_simple_step=0.003, train/loss_vlb_step=1.71e-5, train/loss_step=0.003, global_step=2170.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  78%|███████▊  | 4640/5971 [47:26<13:36,  1.63it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0341, train/loss_vlb_step=0.000125, train/loss_step=0.0341, global_step=2170.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  78%|███████▊  | 4641/5971 [47:27<13:35,  1.63it/s, loss=0.162, v_num=0, train/loss_simple_step=0.489, train/loss_vlb_step=0.00405, train/loss_step=0.489, global_step=2171.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  78%|███████▊  | 4642/5971 [47:28<13:35,  1.63it/s, loss=0.167, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000342, train/loss_step=0.104, global_step=2171.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  78%|███████▊  | 4643/5971 [47:29<13:34,  1.63it/s, loss=0.167, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000342, train/loss_step=0.104, global_step=2171.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  78%|███████▊  | 4643/5971 [47:29<13:34,  1.63it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00655, train/loss_vlb_step=2.93e-5, train/loss_step=0.00655, global_step=2171.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  78%|███████▊  | 4644/5971 [47:31<13:34,  1.63it/s, loss=0.179, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.0028, train/loss_step=0.365, global_step=2171.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:  78%|███████▊  | 4645/5971 [47:32<13:34,  1.63it/s, loss=0.198, v_num=0, train/loss_simple_step=0.390, train/loss_vlb_step=0.00228, train/loss_step=0.390, global_step=2172.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  78%|███████▊  | 4646/5971 [47:33<13:33,  1.63it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00637, train/loss_vlb_step=3.12e-5, train/loss_step=0.00637, global_step=2172.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  78%|███████▊  | 4647/5971 [47:34<13:33,  1.63it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00637, train/loss_vlb_step=3.12e-5, train/loss_step=0.00637, global_step=2172.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  78%|███████▊  | 4647/5971 [47:34<13:33,  1.63it/s, loss=0.216, v_num=0, train/loss_simple_step=0.486, train/loss_vlb_step=0.00403, train/loss_step=0.486, global_step=2172.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  78%|███████▊  | 4648/5971 [47:37<13:33,  1.63it/s, loss=0.227, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000885, train/loss_step=0.227, global_step=2172.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  78%|███████▊  | 4649/5971 [47:38<13:32,  1.63it/s, loss=0.215, v_num=0, train/loss_simple_step=0.00881, train/loss_vlb_step=4.19e-5, train/loss_step=0.00881, global_step=2173.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  78%|███████▊  | 4650/5971 [47:38<13:32,  1.63it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000153, train/loss_step=0.0406, global_step=2173.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  78%|███████▊  | 4651/5971 [47:39<13:31,  1.63it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000153, train/loss_step=0.0406, global_step=2173.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  78%|███████▊  | 4651/5971 [47:39<13:31,  1.63it/s, loss=0.196, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.000687, train/loss_step=0.187, global_step=2173.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  78%|███████▊  | 4652/5971 [47:42<13:31,  1.63it/s, loss=0.191, v_num=0, train/loss_simple_step=0.307, train/loss_vlb_step=0.00119, train/loss_step=0.307, global_step=2173.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  78%|███████▊  | 4653/5971 [47:43<13:30,  1.63it/s, loss=0.185, v_num=0, train/loss_simple_step=0.270, train/loss_vlb_step=0.000966, train/loss_step=0.270, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  78%|███████▊  | 4654/5971 [47:44<13:30,  1.63it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0512, train/loss_vlb_step=0.00018, train/loss_step=0.0512, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  78%|███████▊  | 4655/5971 [47:44<13:29,  1.63it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0512, train/loss_vlb_step=0.00018, train/loss_step=0.0512, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  78%|███████▊  | 4655/5971 [47:44<13:29,  1.63it/s, loss=0.181, v_num=0, train/loss_simple_step=0.198, train/loss_vlb_step=0.00084, train/loss_step=0.198, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  78%|███████▊  | 4656/5971 [47:48<13:29,  1.62it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:07,  2.47it/s][A

Validating:   1%|          | 2/167 [00:00<00:40,  4.05it/s][A
Epoch 3:  78%|███████▊  | 4659/5971 [47:48<13:27,  1.62it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   3%|▎         | 5/167 [00:00<00:16,  9.82it/s][A
Epoch 3:  78%|███████▊  | 4663/5971 [47:48<13:24,  1.63it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   4%|▍         | 7/167 [00:00<00:13, 12.28it/s][A

Validating:   6%|▌         | 10/167 [00:00<00:09, 16.54it/s][A
Epoch 3:  78%|███████▊  | 4667/5971 [47:49<13:21,  1.63it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   8%|▊         | 13/167 [00:01<00:08, 19.23it/s][A
Epoch 3:  78%|███████▊  | 4671/5971 [47:49<13:18,  1.63it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  10%|▉         | 16/167 [00:01<00:07, 20.50it/s][A
Epoch 3:  78%|███████▊  | 4675/5971 [47:49<13:15,  1.63it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  11%|█▏        | 19/167 [00:01<00:06, 21.56it/s][A

Validating:  13%|█▎        | 22/167 [00:01<00:06, 23.28it/s][A
Epoch 3:  78%|███████▊  | 4679/5971 [47:49<13:12,  1.63it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  15%|█▍        | 25/167 [00:01<00:06, 23.65it/s][A
Epoch 3:  78%|███████▊  | 4683/5971 [47:49<13:09,  1.63it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  17%|█▋        | 28/167 [00:01<00:05, 24.59it/s][A
Epoch 3:  78%|███████▊  | 4687/5971 [47:49<13:06,  1.63it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  19%|█▊        | 31/167 [00:01<00:05, 24.79it/s][A

Validating:  20%|██        | 34/167 [00:01<00:05, 24.81it/s][A
Epoch 3:  79%|███████▊  | 4691/5971 [47:49<13:02,  1.63it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  22%|██▏       | 37/167 [00:01<00:05, 24.57it/s][A
Epoch 3:  79%|███████▊  | 4695/5971 [47:50<12:59,  1.64it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  24%|██▍       | 40/167 [00:02<00:05, 25.07it/s][A
Epoch 3:  79%|███████▊  | 4699/5971 [47:50<12:56,  1.64it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  26%|██▌       | 43/167 [00:02<00:04, 24.99it/s][A

Validating:  28%|██▊       | 46/167 [00:02<00:04, 25.62it/s][A
Epoch 3:  79%|███████▉  | 4703/5971 [47:50<12:53,  1.64it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  29%|██▉       | 49/167 [00:02<00:04, 26.19it/s][A
Epoch 3:  79%|███████▉  | 4707/5971 [47:50<12:50,  1.64it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  31%|███       | 52/167 [00:02<00:04, 26.49it/s][A
Epoch 3:  79%|███████▉  | 4711/5971 [47:50<12:47,  1.64it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  33%|███▎      | 55/167 [00:02<00:04, 26.31it/s][A

Validating:  35%|███▍      | 58/167 [00:02<00:04, 25.86it/s][A
Epoch 3:  79%|███████▉  | 4715/5971 [47:50<12:44,  1.64it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  37%|███▋      | 61/167 [00:02<00:03, 26.87it/s][A
Epoch 3:  79%|███████▉  | 4719/5971 [47:51<12:41,  1.64it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  38%|███▊      | 64/167 [00:02<00:03, 26.67it/s][A
Epoch 3:  79%|███████▉  | 4723/5971 [47:51<12:38,  1.65it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  40%|████      | 67/167 [00:03<00:03, 27.31it/s][A

Validating:  42%|████▏     | 70/167 [00:03<00:03, 27.93it/s][A
Epoch 3:  79%|███████▉  | 4727/5971 [47:51<12:35,  1.65it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 28.11it/s][A
Epoch 3:  79%|███████▉  | 4731/5971 [47:51<12:32,  1.65it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 28.12it/s][A
Epoch 3:  79%|███████▉  | 4735/5971 [47:51<12:29,  1.65it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 27.82it/s][A

Validating:  49%|████▉     | 82/167 [00:03<00:03, 27.21it/s][A
Epoch 3:  79%|███████▉  | 4739/5971 [47:51<12:26,  1.65it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  51%|█████     | 85/167 [00:03<00:02, 27.46it/s][A
Epoch 3:  79%|███████▉  | 4743/5971 [47:51<12:23,  1.65it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  53%|█████▎    | 88/167 [00:03<00:02, 27.47it/s][A
Epoch 3:  80%|███████▉  | 4747/5971 [47:52<12:20,  1.65it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  54%|█████▍    | 91/167 [00:03<00:02, 27.26it/s][A

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 25.48it/s][A
Epoch 3:  80%|███████▉  | 4751/5971 [47:52<12:17,  1.65it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 24.77it/s][A
Epoch 3:  80%|███████▉  | 4755/5971 [47:52<12:14,  1.66it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 25.44it/s][A
Epoch 3:  80%|███████▉  | 4759/5971 [47:52<12:11,  1.66it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 25.49it/s][A

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 26.11it/s][A
Epoch 3:  80%|███████▉  | 4763/5971 [47:52<12:08,  1.66it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 27.07it/s][A
Epoch 3:  80%|███████▉  | 4767/5971 [47:52<12:05,  1.66it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  67%|██████▋   | 112/167 [00:04<00:01, 27.83it/s][A
Epoch 3:  80%|███████▉  | 4771/5971 [47:52<12:02,  1.66it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  69%|██████▉   | 115/167 [00:04<00:01, 28.31it/s][A

Validating:  71%|███████   | 118/167 [00:04<00:01, 25.99it/s][A
Epoch 3:  80%|███████▉  | 4775/5971 [47:53<11:59,  1.66it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 26.28it/s][A
Epoch 3:  80%|████████  | 4779/5971 [47:53<11:56,  1.66it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 25.75it/s][A
Epoch 3:  80%|████████  | 4783/5971 [47:53<11:53,  1.66it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 26.02it/s][A

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 26.84it/s][A
Epoch 3:  80%|████████  | 4787/5971 [47:53<11:50,  1.67it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 26.42it/s][A
Epoch 3:  80%|████████  | 4791/5971 [47:53<11:47,  1.67it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 26.88it/s][A
Epoch 3:  80%|████████  | 4795/5971 [47:53<11:44,  1.67it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  84%|████████▍ | 140/167 [00:05<00:00, 27.67it/s][A
Epoch 3:  80%|████████  | 4799/5971 [47:54<11:41,  1.67it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  86%|████████▌ | 143/167 [00:05<00:00, 27.77it/s][A

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 28.00it/s][A
Epoch 3:  80%|████████  | 4803/5971 [47:54<11:38,  1.67it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 28.27it/s][A
Epoch 3:  81%|████████  | 4807/5971 [47:54<11:35,  1.67it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 28.22it/s][A
Epoch 3:  81%|████████  | 4811/5971 [47:54<11:32,  1.67it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 27.91it/s][A

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 27.65it/s][A
Epoch 3:  81%|████████  | 4815/5971 [47:54<11:29,  1.68it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 27.89it/s][A
Epoch 3:  81%|████████  | 4819/5971 [47:54<11:27,  1.68it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  99%|█████████▉| 165/167 [00:06<00:00, 29.27it/s][A
Epoch 3:  81%|████████  | 4823/5971 [47:54<11:24,  1.68it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████  | 4824/5971 [47:55<11:23,  1.68it/s, loss=0.182, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000733, train/loss_step=0.177, global_step=2174.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Epoch 3:  81%|████████  | 4825/5971 [47:56<11:22,  1.68it/s, loss=0.191, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000728, train/loss_step=0.208, global_step=2175.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████  | 4826/5971 [47:57<11:22,  1.68it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00301, train/loss_vlb_step=1.61e-5, train/loss_step=0.00301, global_step=2175.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████  | 4827/5971 [47:57<11:21,  1.68it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00301, train/loss_vlb_step=1.61e-5, train/loss_step=0.00301, global_step=2175.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████  | 4827/5971 [47:57<11:21,  1.68it/s, loss=0.186, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000544, train/loss_step=0.155, global_step=2175.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  81%|████████  | 4828/5971 [48:00<11:21,  1.68it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0222, train/loss_vlb_step=8.47e-5, train/loss_step=0.0222, global_step=2175.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████  | 4829/5971 [48:01<11:21,  1.68it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0383, train/loss_vlb_step=0.00014, train/loss_step=0.0383, global_step=2176.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████  | 4830/5971 [48:02<11:20,  1.68it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000213, train/loss_step=0.0625, global_step=2176.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████  | 4831/5971 [48:03<11:20,  1.68it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000213, train/loss_step=0.0625, global_step=2176.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████  | 4831/5971 [48:03<11:20,  1.68it/s, loss=0.183, v_num=0, train/loss_simple_step=0.462, train/loss_vlb_step=0.00547, train/loss_step=0.462, global_step=2176.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  81%|████████  | 4832/5971 [48:05<11:20,  1.67it/s, loss=0.174, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.000643, train/loss_step=0.187, global_step=2176.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████  | 4833/5971 [48:06<11:19,  1.67it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00588, train/loss_vlb_step=2.65e-5, train/loss_step=0.00588, global_step=2177.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████  | 4834/5971 [48:07<11:18,  1.67it/s, loss=0.157, v_num=0, train/loss_simple_step=0.046, train/loss_vlb_step=0.000163, train/loss_step=0.046, global_step=2177.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  81%|████████  | 4835/5971 [48:08<11:18,  1.67it/s, loss=0.157, v_num=0, train/loss_simple_step=0.046, train/loss_vlb_step=0.000163, train/loss_step=0.046, global_step=2177.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████  | 4835/5971 [48:08<11:18,  1.67it/s, loss=0.138, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000335, train/loss_step=0.102, global_step=2177.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████  | 4836/5971 [48:10<11:18,  1.67it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.73e-5, train/loss_step=0.0132, global_step=2177.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████  | 4837/5971 [48:11<11:17,  1.67it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0703, train/loss_vlb_step=0.000241, train/loss_step=0.0703, global_step=2178.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████  | 4838/5971 [48:12<11:17,  1.67it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0043, train/loss_vlb_step=2.25e-5, train/loss_step=0.0043, global_step=2178.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████  | 4839/5971 [48:12<11:16,  1.67it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0043, train/loss_vlb_step=2.25e-5, train/loss_step=0.0043, global_step=2178.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████  | 4839/5971 [48:12<11:16,  1.67it/s, loss=0.13, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.0009, train/loss_step=0.217, global_step=2178.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  81%|████████  | 4840/5971 [48:15<11:16,  1.67it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0691, train/loss_vlb_step=0.000228, train/loss_step=0.0691, global_step=2178.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████  | 4841/5971 [48:15<11:15,  1.67it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000277, train/loss_step=0.0844, global_step=2179.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████  | 4842/5971 [48:16<11:15,  1.67it/s, loss=0.139, v_num=0, train/loss_simple_step=0.664, train/loss_vlb_step=0.0381, train/loss_step=0.664, global_step=2179.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  81%|████████  | 4843/5971 [48:17<11:14,  1.67it/s, loss=0.139, v_num=0, train/loss_simple_step=0.664, train/loss_vlb_step=0.0381, train/loss_step=0.664, global_step=2179.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████  | 4843/5971 [48:17<11:14,  1.67it/s, loss=0.132, v_num=0, train/loss_simple_step=0.040, train/loss_vlb_step=0.000148, train/loss_step=0.040, global_step=2179.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████  | 4844/5971 [48:19<11:14,  1.67it/s, loss=0.135, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.0011, train/loss_step=0.250, global_step=2179.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  81%|████████  | 4845/5971 [48:20<11:14,  1.67it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0289, train/loss_vlb_step=0.000117, train/loss_step=0.0289, global_step=2180.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████  | 4846/5971 [48:21<11:13,  1.67it/s, loss=0.146, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.00216, train/loss_step=0.391, global_step=2180.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  81%|████████  | 4847/5971 [48:22<11:12,  1.67it/s, loss=0.146, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.00216, train/loss_step=0.391, global_step=2180.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████  | 4847/5971 [48:22<11:12,  1.67it/s, loss=0.158, v_num=0, train/loss_simple_step=0.394, train/loss_vlb_step=0.00234, train/loss_step=0.394, global_step=2180.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████  | 4848/5971 [48:25<11:12,  1.67it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0324, train/loss_vlb_step=0.000117, train/loss_step=0.0324, global_step=2180.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████  | 4849/5971 [48:25<11:12,  1.67it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00587, train/loss_vlb_step=2.99e-5, train/loss_step=0.00587, global_step=2181.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████  | 4850/5971 [48:26<11:11,  1.67it/s, loss=0.159, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000359, train/loss_step=0.108, global_step=2181.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  81%|████████  | 4851/5971 [48:27<11:11,  1.67it/s, loss=0.159, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000359, train/loss_step=0.108, global_step=2181.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████  | 4851/5971 [48:27<11:11,  1.67it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0329, train/loss_vlb_step=0.000125, train/loss_step=0.0329, global_step=2181.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████▏ | 4852/5971 [48:29<11:10,  1.67it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00242, train/loss_vlb_step=1.37e-5, train/loss_step=0.00242, global_step=2181.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████▏ | 4853/5971 [48:30<11:10,  1.67it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00384, train/loss_vlb_step=2.04e-5, train/loss_step=0.00384, global_step=2182.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████▏ | 4854/5971 [48:31<11:09,  1.67it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.000989, train/loss_step=0.264, global_step=2182.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  81%|████████▏ | 4855/5971 [48:32<11:09,  1.67it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.000989, train/loss_step=0.264, global_step=2182.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████▏ | 4855/5971 [48:32<11:09,  1.67it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=2.02e-5, train/loss_step=0.0036, global_step=2182.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████▏ | 4856/5971 [48:34<11:09,  1.67it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0759, train/loss_vlb_step=0.000254, train/loss_step=0.0759, global_step=2182.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████▏ | 4857/5971 [48:35<11:08,  1.67it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0337, train/loss_vlb_step=0.000129, train/loss_step=0.0337, global_step=2183.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████▏ | 4858/5971 [48:36<11:08,  1.67it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0761, train/loss_vlb_step=0.000253, train/loss_step=0.0761, global_step=2183.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████▏ | 4859/5971 [48:37<11:07,  1.67it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0761, train/loss_vlb_step=0.000253, train/loss_step=0.0761, global_step=2183.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████▏ | 4859/5971 [48:37<11:07,  1.67it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=7.08e-5, train/loss_step=0.0166, global_step=2183.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  81%|████████▏ | 4860/5971 [48:39<11:07,  1.67it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00299, train/loss_vlb_step=1.64e-5, train/loss_step=0.00299, global_step=2183.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████▏ | 4861/5971 [48:40<11:06,  1.66it/s, loss=0.138, v_num=0, train/loss_simple_step=0.341, train/loss_vlb_step=0.00152, train/loss_step=0.341, global_step=2184.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  81%|████████▏ | 4862/5971 [48:41<11:06,  1.66it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00781, train/loss_vlb_step=3.74e-5, train/loss_step=0.00781, global_step=2184.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████▏ | 4863/5971 [48:42<11:05,  1.66it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00781, train/loss_vlb_step=3.74e-5, train/loss_step=0.00781, global_step=2184.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████▏ | 4863/5971 [48:42<11:05,  1.66it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00443, train/loss_vlb_step=2.34e-5, train/loss_step=0.00443, global_step=2184.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  81%|████████▏ | 4864/5971 [48:44<11:05,  1.66it/s, loss=0.0923, v_num=0, train/loss_simple_step=0.0207, train/loss_vlb_step=8.47e-5, train/loss_step=0.0207, global_step=2184.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  81%|████████▏ | 4865/5971 [48:45<11:04,  1.66it/s, loss=0.102, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.00104, train/loss_step=0.230, global_step=2185.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  81%|████████▏ | 4866/5971 [48:46<11:04,  1.66it/s, loss=0.0837, v_num=0, train/loss_simple_step=0.0177, train/loss_vlb_step=6.83e-5, train/loss_step=0.0177, global_step=2185.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4867/5971 [48:46<11:03,  1.66it/s, loss=0.0837, v_num=0, train/loss_simple_step=0.0177, train/loss_vlb_step=6.83e-5, train/loss_step=0.0177, global_step=2185.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4867/5971 [48:46<11:03,  1.66it/s, loss=0.067, v_num=0, train/loss_simple_step=0.0611, train/loss_vlb_step=0.000218, train/loss_step=0.0611, global_step=2185.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4868/5971 [48:49<11:03,  1.66it/s, loss=0.103, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0325, train/loss_step=0.750, global_step=2185.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  82%|████████▏ | 4869/5971 [48:50<11:03,  1.66it/s, loss=0.134, v_num=0, train/loss_simple_step=0.632, train/loss_vlb_step=0.0101, train/loss_step=0.632, global_step=2186.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4870/5971 [48:51<11:02,  1.66it/s, loss=0.142, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00104, train/loss_step=0.269, global_step=2186.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4871/5971 [48:51<11:01,  1.66it/s, loss=0.142, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00104, train/loss_step=0.269, global_step=2186.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4871/5971 [48:51<11:01,  1.66it/s, loss=0.154, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.000952, train/loss_step=0.262, global_step=2186.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4872/5971 [48:54<11:01,  1.66it/s, loss=0.168, v_num=0, train/loss_simple_step=0.297, train/loss_vlb_step=0.00162, train/loss_step=0.297, global_step=2186.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  82%|████████▏ | 4873/5971 [48:54<11:01,  1.66it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0221, train/loss_vlb_step=8.7e-5, train/loss_step=0.0221, global_step=2187.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4874/5971 [48:55<11:00,  1.66it/s, loss=0.175, v_num=0, train/loss_simple_step=0.387, train/loss_vlb_step=0.00214, train/loss_step=0.387, global_step=2187.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  82%|████████▏ | 4875/5971 [48:56<11:00,  1.66it/s, loss=0.175, v_num=0, train/loss_simple_step=0.387, train/loss_vlb_step=0.00214, train/loss_step=0.387, global_step=2187.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4875/5971 [48:56<11:00,  1.66it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0821, train/loss_vlb_step=0.000275, train/loss_step=0.0821, global_step=2187.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4876/5971 [48:59<10:59,  1.66it/s, loss=0.184, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000586, train/loss_step=0.174, global_step=2187.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  82%|████████▏ | 4877/5971 [48:59<10:59,  1.66it/s, loss=0.184, v_num=0, train/loss_simple_step=0.033, train/loss_vlb_step=0.000121, train/loss_step=0.033, global_step=2188.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4878/5971 [49:00<10:58,  1.66it/s, loss=0.181, v_num=0, train/loss_simple_step=0.00224, train/loss_vlb_step=1.3e-5, train/loss_step=0.00224, global_step=2188.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4879/5971 [49:01<10:58,  1.66it/s, loss=0.181, v_num=0, train/loss_simple_step=0.00224, train/loss_vlb_step=1.3e-5, train/loss_step=0.00224, global_step=2188.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4879/5971 [49:01<10:58,  1.66it/s, loss=0.185, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.00036, train/loss_step=0.108, global_step=2188.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  82%|████████▏ | 4880/5971 [49:03<10:57,  1.66it/s, loss=0.216, v_num=0, train/loss_simple_step=0.614, train/loss_vlb_step=0.00753, train/loss_step=0.614, global_step=2188.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4881/5971 [49:04<10:57,  1.66it/s, loss=0.199, v_num=0, train/loss_simple_step=0.00459, train/loss_vlb_step=2.42e-5, train/loss_step=0.00459, global_step=2189.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4882/5971 [49:05<10:56,  1.66it/s, loss=0.232, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0383, train/loss_step=0.668, global_step=2189.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:  82%|████████▏ | 4883/5971 [49:06<10:56,  1.66it/s, loss=0.232, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0383, train/loss_step=0.668, global_step=2189.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4883/5971 [49:06<10:56,  1.66it/s, loss=0.243, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.000848, train/loss_step=0.231, global_step=2189.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4884/5971 [49:08<10:56,  1.66it/s, loss=0.242, v_num=0, train/loss_simple_step=0.00165, train/loss_vlb_step=9.98e-6, train/loss_step=0.00165, global_step=2189.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4885/5971 [49:09<10:55,  1.66it/s, loss=0.234, v_num=0, train/loss_simple_step=0.0616, train/loss_vlb_step=0.00021, train/loss_step=0.0616, global_step=2190.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  82%|████████▏ | 4886/5971 [49:10<10:55,  1.66it/s, loss=0.26, v_num=0, train/loss_simple_step=0.546, train/loss_vlb_step=0.0103, train/loss_step=0.546, global_step=2190.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  82%|████████▏ | 4887/5971 [49:11<10:54,  1.66it/s, loss=0.26, v_num=0, train/loss_simple_step=0.546, train/loss_vlb_step=0.0103, train/loss_step=0.546, global_step=2190.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4887/5971 [49:11<10:54,  1.66it/s, loss=0.26, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000189, train/loss_step=0.055, global_step=2190.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4888/5971 [49:13<10:54,  1.66it/s, loss=0.234, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.000858, train/loss_step=0.234, global_step=2190.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4889/5971 [49:14<10:53,  1.65it/s, loss=0.233, v_num=0, train/loss_simple_step=0.609, train/loss_vlb_step=0.00733, train/loss_step=0.609, global_step=2191.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  82%|████████▏ | 4890/5971 [49:15<10:53,  1.65it/s, loss=0.231, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000955, train/loss_step=0.228, global_step=2191.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4891/5971 [49:16<10:52,  1.65it/s, loss=0.231, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000955, train/loss_step=0.228, global_step=2191.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4891/5971 [49:16<10:52,  1.65it/s, loss=0.218, v_num=0, train/loss_simple_step=0.00159, train/loss_vlb_step=9.32e-6, train/loss_step=0.00159, global_step=2191.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4892/5971 [49:18<10:52,  1.65it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0134, train/loss_vlb_step=5.63e-5, train/loss_step=0.0134, global_step=2191.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  82%|████████▏ | 4893/5971 [49:19<10:51,  1.65it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0581, train/loss_vlb_step=0.000198, train/loss_step=0.0581, global_step=2192.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4894/5971 [49:20<10:51,  1.65it/s, loss=0.206, v_num=0, train/loss_simple_step=0.393, train/loss_vlb_step=0.00275, train/loss_step=0.393, global_step=2192.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  82%|████████▏ | 4895/5971 [49:21<10:50,  1.65it/s, loss=0.206, v_num=0, train/loss_simple_step=0.393, train/loss_vlb_step=0.00275, train/loss_step=0.393, global_step=2192.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4895/5971 [49:21<10:50,  1.65it/s, loss=0.208, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000388, train/loss_step=0.118, global_step=2192.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4896/5971 [49:23<10:50,  1.65it/s, loss=0.234, v_num=0, train/loss_simple_step=0.700, train/loss_vlb_step=0.0187, train/loss_step=0.700, global_step=2192.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  82%|████████▏ | 4897/5971 [49:24<10:50,  1.65it/s, loss=0.232, v_num=0, train/loss_simple_step=0.00274, train/loss_vlb_step=1.44e-5, train/loss_step=0.00274, global_step=2193.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4898/5971 [49:25<10:49,  1.65it/s, loss=0.233, v_num=0, train/loss_simple_step=0.0134, train/loss_vlb_step=5.22e-5, train/loss_step=0.0134, global_step=2193.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  82%|████████▏ | 4899/5971 [49:26<10:48,  1.65it/s, loss=0.233, v_num=0, train/loss_simple_step=0.0134, train/loss_vlb_step=5.22e-5, train/loss_step=0.0134, global_step=2193.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4899/5971 [49:26<10:48,  1.65it/s, loss=0.237, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.000758, train/loss_step=0.197, global_step=2193.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  82%|████████▏ | 4900/5971 [49:28<10:48,  1.65it/s, loss=0.215, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000599, train/loss_step=0.173, global_step=2193.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4901/5971 [49:29<10:48,  1.65it/s, loss=0.216, v_num=0, train/loss_simple_step=0.0155, train/loss_vlb_step=6.91e-5, train/loss_step=0.0155, global_step=2194.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4902/5971 [49:30<10:47,  1.65it/s, loss=0.201, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00215, train/loss_step=0.375, global_step=2194.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  82%|████████▏ | 4903/5971 [49:31<10:47,  1.65it/s, loss=0.201, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00215, train/loss_step=0.375, global_step=2194.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4903/5971 [49:31<10:47,  1.65it/s, loss=0.213, v_num=0, train/loss_simple_step=0.463, train/loss_vlb_step=0.00568, train/loss_step=0.463, global_step=2194.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4904/5971 [49:33<10:46,  1.65it/s, loss=0.214, v_num=0, train/loss_simple_step=0.0334, train/loss_vlb_step=0.000127, train/loss_step=0.0334, global_step=2194.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4905/5971 [49:34<10:46,  1.65it/s, loss=0.248, v_num=0, train/loss_simple_step=0.722, train/loss_vlb_step=0.027, train/loss_step=0.722, global_step=2195.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:  82%|████████▏ | 4906/5971 [49:35<10:45,  1.65it/s, loss=0.225, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000331, train/loss_step=0.101, global_step=2195.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4907/5971 [49:35<10:45,  1.65it/s, loss=0.225, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000331, train/loss_step=0.101, global_step=2195.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4907/5971 [49:35<10:45,  1.65it/s, loss=0.257, v_num=0, train/loss_simple_step=0.690, train/loss_vlb_step=0.0215, train/loss_step=0.690, global_step=2195.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  82%|████████▏ | 4908/5971 [49:38<10:44,  1.65it/s, loss=0.261, v_num=0, train/loss_simple_step=0.321, train/loss_vlb_step=0.00164, train/loss_step=0.321, global_step=2195.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4909/5971 [49:39<10:44,  1.65it/s, loss=0.233, v_num=0, train/loss_simple_step=0.0368, train/loss_vlb_step=0.000142, train/loss_step=0.0368, global_step=2196.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4910/5971 [49:39<10:43,  1.65it/s, loss=0.242, v_num=0, train/loss_simple_step=0.408, train/loss_vlb_step=0.00272, train/loss_step=0.408, global_step=2196.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  82%|████████▏ | 4911/5971 [49:40<10:43,  1.65it/s, loss=0.242, v_num=0, train/loss_simple_step=0.408, train/loss_vlb_step=0.00272, train/loss_step=0.408, global_step=2196.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4911/5971 [49:40<10:43,  1.65it/s, loss=0.242, v_num=0, train/loss_simple_step=0.00902, train/loss_vlb_step=4.04e-5, train/loss_step=0.00902, global_step=2196.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4912/5971 [49:42<10:42,  1.65it/s, loss=0.282, v_num=0, train/loss_simple_step=0.810, train/loss_vlb_step=0.0284, train/loss_step=0.810, global_step=2196.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:  82%|████████▏ | 4913/5971 [49:43<10:42,  1.65it/s, loss=0.279, v_num=0, train/loss_simple_step=0.00489, train/loss_vlb_step=2.61e-5, train/loss_step=0.00489, global_step=2197.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4914/5971 [49:44<10:41,  1.65it/s, loss=0.262, v_num=0, train/loss_simple_step=0.0419, train/loss_vlb_step=0.000146, train/loss_step=0.0419, global_step=2197.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  82%|████████▏ | 4915/5971 [49:45<10:41,  1.65it/s, loss=0.262, v_num=0, train/loss_simple_step=0.0419, train/loss_vlb_step=0.000146, train/loss_step=0.0419, global_step=2197.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4915/5971 [49:45<10:41,  1.65it/s, loss=0.256, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.06e-5, train/loss_step=0.00174, global_step=2197.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4916/5971 [49:48<10:41,  1.65it/s, loss=0.243, v_num=0, train/loss_simple_step=0.437, train/loss_vlb_step=0.00398, train/loss_step=0.437, global_step=2197.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  82%|████████▏ | 4917/5971 [49:49<10:40,  1.65it/s, loss=0.249, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000437, train/loss_step=0.132, global_step=2198.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4918/5971 [49:50<10:40,  1.65it/s, loss=0.265, v_num=0, train/loss_simple_step=0.335, train/loss_vlb_step=0.00163, train/loss_step=0.335, global_step=2198.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  82%|████████▏ | 4919/5971 [49:50<10:39,  1.64it/s, loss=0.265, v_num=0, train/loss_simple_step=0.335, train/loss_vlb_step=0.00163, train/loss_step=0.335, global_step=2198.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4919/5971 [49:50<10:39,  1.64it/s, loss=0.256, v_num=0, train/loss_simple_step=0.00768, train/loss_vlb_step=3.72e-5, train/loss_step=0.00768, global_step=2198.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4920/5971 [49:53<10:39,  1.64it/s, loss=0.289, v_num=0, train/loss_simple_step=0.843, train/loss_vlb_step=0.0366, train/loss_step=0.843, global_step=2198.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:  82%|████████▏ | 4921/5971 [49:54<10:38,  1.64it/s, loss=0.291, v_num=0, train/loss_simple_step=0.0503, train/loss_vlb_step=0.000172, train/loss_step=0.0503, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4922/5971 [49:54<10:38,  1.64it/s, loss=0.277, v_num=0, train/loss_simple_step=0.0875, train/loss_vlb_step=0.000288, train/loss_step=0.0875, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4923/5971 [49:55<10:37,  1.64it/s, loss=0.277, v_num=0, train/loss_simple_step=0.0875, train/loss_vlb_step=0.000288, train/loss_step=0.0875, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  82%|████████▏ | 4923/5971 [49:55<10:37,  1.64it/s, loss=0.262, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000555, train/loss_step=0.161, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  82%|████████▏ | 4924/5971 [49:58<10:37,  1.64it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:07,  2.46it/s][A

Validating:   1%|          | 2/167 [00:00<00:46,  3.54it/s][A
Epoch 3:  83%|████████▎ | 4927/5971 [49:58<10:35,  1.64it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.06it/s][A
Epoch 3:  83%|████████▎ | 4931/5971 [49:58<10:32,  1.64it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.27it/s][A
Epoch 3:  83%|████████▎ | 4935/5971 [49:59<10:29,  1.65it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.03it/s][A

Validating:   8%|▊         | 14/167 [00:01<00:08, 17.95it/s][A
Epoch 3:  83%|████████▎ | 4939/5971 [49:59<10:26,  1.65it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  10%|█         | 17/167 [00:01<00:07, 20.22it/s][A
Epoch 3:  83%|████████▎ | 4943/5971 [49:59<10:23,  1.65it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 21.70it/s][A
Epoch 3:  83%|████████▎ | 4947/5971 [49:59<10:20,  1.65it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 23.29it/s][A

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.19it/s][A
Epoch 3:  83%|████████▎ | 4951/5971 [49:59<10:17,  1.65it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 25.20it/s][A
Epoch 3:  83%|████████▎ | 4955/5971 [49:59<10:14,  1.65it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 23.97it/s][A
Epoch 3:  83%|████████▎ | 4959/5971 [50:00<10:12,  1.65it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  21%|██        | 35/167 [00:01<00:05, 24.55it/s][A

Validating:  23%|██▎       | 38/167 [00:02<00:05, 25.28it/s][A
Epoch 3:  83%|████████▎ | 4963/5971 [50:00<10:09,  1.65it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 27.09it/s][A
Epoch 3:  83%|████████▎ | 4967/5971 [50:00<10:06,  1.66it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 27.48it/s][A
Epoch 3:  83%|████████▎ | 4971/5971 [50:00<10:03,  1.66it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 26.36it/s][A
Epoch 3:  83%|████████▎ | 4975/5971 [50:00<10:00,  1.66it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  31%|███       | 51/167 [00:02<00:04, 26.18it/s][A

Validating:  32%|███▏      | 54/167 [00:02<00:04, 27.01it/s][A
Epoch 3:  83%|████████▎ | 4979/5971 [50:00<09:57,  1.66it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 27.28it/s][A
Epoch 3:  83%|████████▎ | 4983/5971 [50:00<09:54,  1.66it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  36%|███▌      | 60/167 [00:02<00:04, 26.36it/s][A
Epoch 3:  84%|████████▎ | 4987/5971 [50:01<09:52,  1.66it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  38%|███▊      | 63/167 [00:02<00:04, 25.19it/s][A
Epoch 3:  84%|████████▎ | 4991/5971 [50:01<09:49,  1.66it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  40%|████      | 67/167 [00:03<00:03, 25.99it/s][A

Validating:  42%|████▏     | 70/167 [00:03<00:03, 25.45it/s][A
Epoch 3:  84%|████████▎ | 4995/5971 [50:01<09:46,  1.66it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 25.50it/s][A
Epoch 3:  84%|████████▎ | 4999/5971 [50:01<09:43,  1.67it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 25.41it/s][A
Epoch 3:  84%|████████▍ | 5003/5971 [50:01<09:40,  1.67it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 24.92it/s][A

Validating:  49%|████▉     | 82/167 [00:03<00:03, 22.63it/s][A
Epoch 3:  84%|████████▍ | 5007/5971 [50:01<09:37,  1.67it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  51%|█████     | 85/167 [00:03<00:03, 23.52it/s][A
Epoch 3:  84%|████████▍ | 5011/5971 [50:02<09:35,  1.67it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  53%|█████▎    | 88/167 [00:04<00:03, 24.32it/s][A
Epoch 3:  84%|████████▍ | 5015/5971 [50:02<09:32,  1.67it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  54%|█████▍    | 91/167 [00:04<00:04, 18.66it/s][A

Validating:  56%|█████▋    | 94/167 [00:04<00:03, 20.83it/s][A
Epoch 3:  84%|████████▍ | 5019/5971 [50:02<09:29,  1.67it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  58%|█████▊    | 97/167 [00:04<00:03, 21.63it/s][A
Epoch 3:  84%|████████▍ | 5023/5971 [50:02<09:26,  1.67it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 22.39it/s][A
Epoch 3:  84%|████████▍ | 5027/5971 [50:02<09:23,  1.67it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 22.83it/s][A

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 23.10it/s][A
Epoch 3:  84%|████████▍ | 5031/5971 [50:03<09:20,  1.68it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 23.59it/s][A
Epoch 3:  84%|████████▍ | 5035/5971 [50:03<09:18,  1.68it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  67%|██████▋   | 112/167 [00:05<00:02, 24.71it/s][A
Epoch 3:  84%|████████▍ | 5039/5971 [50:03<09:15,  1.68it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  69%|██████▉   | 115/167 [00:05<00:02, 25.50it/s][A

Validating:  71%|███████   | 118/167 [00:05<00:01, 25.47it/s][A
Epoch 3:  84%|████████▍ | 5043/5971 [50:03<09:12,  1.68it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 25.89it/s][A
Epoch 3:  85%|████████▍ | 5047/5971 [50:03<09:09,  1.68it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 26.13it/s][A
Epoch 3:  85%|████████▍ | 5051/5971 [50:03<09:07,  1.68it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 25.55it/s][A

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 24.90it/s][A
Epoch 3:  85%|████████▍ | 5055/5971 [50:03<09:04,  1.68it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 25.29it/s][A
Epoch 3:  85%|████████▍ | 5059/5971 [50:04<09:01,  1.68it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  81%|████████▏ | 136/167 [00:06<00:01, 25.97it/s][A
Epoch 3:  85%|████████▍ | 5063/5971 [50:04<08:58,  1.69it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  83%|████████▎ | 139/167 [00:06<00:01, 25.84it/s][A

Validating:  85%|████████▌ | 142/167 [00:06<00:00, 25.39it/s][A
Epoch 3:  85%|████████▍ | 5067/5971 [50:04<08:55,  1.69it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 24.88it/s][A
Epoch 3:  85%|████████▍ | 5071/5971 [50:04<08:53,  1.69it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 26.50it/s][A
Epoch 3:  85%|████████▍ | 5075/5971 [50:04<08:50,  1.69it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 26.49it/s][A
Epoch 3:  85%|████████▌ | 5079/5971 [50:04<08:47,  1.69it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 26.26it/s][A

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 25.63it/s][A
Epoch 3:  85%|████████▌ | 5083/5971 [50:05<08:44,  1.69it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 25.39it/s][A
Epoch 3:  85%|████████▌ | 5087/5971 [50:05<08:42,  1.69it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  98%|█████████▊| 164/167 [00:07<00:00, 26.47it/s][A
Epoch 3:  85%|████████▌ | 5091/5971 [50:05<08:39,  1.69it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating: 100%|██████████| 167/167 [00:07<00:00, 27.12it/s][A
Epoch 3:  85%|████████▌ | 5092/5971 [50:05<08:38,  1.69it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.32it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.39it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.25it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.90it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.34it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.69it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.96it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.13it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.28it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.39it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.46it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.51it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.54it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.56it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.38it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.34it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.36it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:06,  5.33it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.31it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.29it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.35it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.39it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.41it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.43it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.32it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.27it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.26it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.21it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:04,  5.13it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.23it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.33it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.41it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.48it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.51it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.45it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.47it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.49it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.52it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.55it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.58it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.52it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.43it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.39it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.26it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.26it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.24it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.24it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.26it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.22it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.27it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.09it/s]

Epoch 3:  85%|████████▌ | 5092/5971 [50:17<08:40,  1.69it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  85%|████████▌ | 5093/5971 [50:17<08:40,  1.69it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000262, train/loss_step=0.0766, global_step=2199.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  85%|████████▌ | 5093/5971 [50:17<08:40,  1.69it/s, loss=0.228, v_num=0, train/loss_simple_step=0.0106, train/loss_vlb_step=4.85e-5, train/loss_step=0.0106, global_step=2200.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.32it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.38it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.18it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.83it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.32it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.65it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.93it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.96it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.09it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.15it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.17it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.20it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:07,  5.25it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.28it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.36it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.46it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.53it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.58it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.61it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.64it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.66it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.69it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.59it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.60it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.61it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.62it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.64it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.64it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.62it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.59it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.58it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.56it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.56it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.45it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.47it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.48it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.49it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.49it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.49it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.50it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.49it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.50it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.50it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.51it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.50it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.51it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.50it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.50it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.50it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.49it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.18it/s]

Epoch 3:  85%|████████▌ | 5094/5971 [50:29<08:41,  1.68it/s, loss=0.228, v_num=0, train/loss_simple_step=0.0106, train/loss_vlb_step=4.85e-5, train/loss_step=0.0106, global_step=2200.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  85%|████████▌ | 5094/5971 [50:29<08:41,  1.68it/s, loss=0.225, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=0.000109, train/loss_step=0.0272, global_step=2200.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.33it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.41it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.26it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.91it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.40it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.77it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.05it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.24it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.37it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.47it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.53it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.59it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.63it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.65it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.65it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.56it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.47it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.42it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.38it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.34it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.33it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.32it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.32it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.32it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.30it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.33it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.42it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.50it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.56it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.61it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.61it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.63it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.65it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.68it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.70it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.71it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.71it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.70it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.71it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.71it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.70it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.67it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.68it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.69it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.70it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.67it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:08<00:00,  5.67it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.68it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.69it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.69it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.26it/s]

Epoch 3:  85%|████████▌ | 5095/5971 [50:41<08:42,  1.68it/s, loss=0.225, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=0.000109, train/loss_step=0.0272, global_step=2200.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  85%|████████▌ | 5095/5971 [50:41<08:42,  1.68it/s, loss=0.197, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000498, train/loss_step=0.147, global_step=2200.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.16it/s]

Epoch 3:  85%|████████▌ | 5096/5971 [50:55<08:44,  1.67it/s, loss=0.197, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000498, train/loss_step=0.147, global_step=2200.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  85%|████████▌ | 5096/5971 [50:55<08:44,  1.67it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0621, train/loss_vlb_step=0.000209, train/loss_step=0.0621, global_step=2200.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  85%|████████▌ | 5097/5971 [50:56<08:43,  1.67it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0621, train/loss_vlb_step=0.000209, train/loss_step=0.0621, global_step=2200.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  85%|████████▌ | 5097/5971 [50:56<08:43,  1.67it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00656, train/loss_vlb_step=3.15e-5, train/loss_step=0.00656, global_step=2201.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  85%|████████▌ | 5098/5971 [50:56<08:43,  1.67it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00656, train/loss_vlb_step=3.15e-5, train/loss_step=0.00656, global_step=2201.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  85%|████████▌ | 5098/5971 [50:56<08:43,  1.67it/s, loss=0.169, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.00039, train/loss_step=0.119, global_step=2201.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  85%|████████▌ | 5099/5971 [50:57<08:42,  1.67it/s, loss=0.169, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.00039, train/loss_step=0.119, global_step=2201.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  85%|████████▌ | 5099/5971 [50:57<08:42,  1.67it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00727, train/loss_vlb_step=3.55e-5, train/loss_step=0.00727, global_step=2201.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  85%|████████▌ | 5100/5971 [50:59<08:42,  1.67it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00727, train/loss_vlb_step=3.55e-5, train/loss_step=0.00727, global_step=2201.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  85%|████████▌ | 5100/5971 [50:59<08:42,  1.67it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000171, train/loss_step=0.0453, global_step=2201.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  85%|████████▌ | 5101/5971 [51:00<08:41,  1.67it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000171, train/loss_step=0.0453, global_step=2201.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  85%|████████▌ | 5101/5971 [51:00<08:41,  1.67it/s, loss=0.136, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000407, train/loss_step=0.123, global_step=2202.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  85%|████████▌ | 5102/5971 [51:01<08:41,  1.67it/s, loss=0.136, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000407, train/loss_step=0.123, global_step=2202.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  85%|████████▌ | 5102/5971 [51:01<08:41,  1.67it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0358, train/loss_vlb_step=0.000132, train/loss_step=0.0358, global_step=2202.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  85%|████████▌ | 5103/5971 [51:02<08:40,  1.67it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0358, train/loss_vlb_step=0.000132, train/loss_step=0.0358, global_step=2202.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  85%|████████▌ | 5103/5971 [51:02<08:40,  1.67it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.67e-5, train/loss_step=0.0104, global_step=2202.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  85%|████████▌ | 5104/5971 [51:04<08:40,  1.67it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.67e-5, train/loss_step=0.0104, global_step=2202.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  85%|████████▌ | 5104/5971 [51:04<08:40,  1.67it/s, loss=0.12, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000358, train/loss_step=0.108, global_step=2202.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  85%|████████▌ | 5105/5971 [51:05<08:39,  1.67it/s, loss=0.12, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000358, train/loss_step=0.108, global_step=2202.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  85%|████████▌ | 5105/5971 [51:05<08:39,  1.67it/s, loss=0.124, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000875, train/loss_step=0.225, global_step=2203.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5106/5971 [51:06<08:39,  1.67it/s, loss=0.124, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000875, train/loss_step=0.225, global_step=2203.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5106/5971 [51:06<08:39,  1.67it/s, loss=0.119, v_num=0, train/loss_simple_step=0.232, train/loss_vlb_step=0.000992, train/loss_step=0.232, global_step=2203.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5107/5971 [51:07<08:38,  1.67it/s, loss=0.119, v_num=0, train/loss_simple_step=0.232, train/loss_vlb_step=0.000992, train/loss_step=0.232, global_step=2203.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5107/5971 [51:07<08:38,  1.67it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0682, train/loss_vlb_step=0.000225, train/loss_step=0.0682, global_step=2203.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5108/5971 [51:09<08:38,  1.66it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0682, train/loss_vlb_step=0.000225, train/loss_step=0.0682, global_step=2203.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5108/5971 [51:09<08:38,  1.66it/s, loss=0.0822, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000145, train/loss_step=0.0406, global_step=2203.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5109/5971 [51:10<08:37,  1.66it/s, loss=0.0822, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000145, train/loss_step=0.0406, global_step=2203.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5109/5971 [51:10<08:37,  1.66it/s, loss=0.0811, v_num=0, train/loss_simple_step=0.0299, train/loss_vlb_step=0.000116, train/loss_step=0.0299, global_step=2204.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5110/5971 [51:11<08:37,  1.66it/s, loss=0.0811, v_num=0, train/loss_simple_step=0.0299, train/loss_vlb_step=0.000116, train/loss_step=0.0299, global_step=2204.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5110/5971 [51:11<08:37,  1.66it/s, loss=0.0879, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000791, train/loss_step=0.222, global_step=2204.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  86%|████████▌ | 5111/5971 [51:12<08:36,  1.66it/s, loss=0.0879, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000791, train/loss_step=0.222, global_step=2204.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5111/5971 [51:12<08:36,  1.66it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0216, train/loss_vlb_step=8.33e-5, train/loss_step=0.0216, global_step=2204.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5112/5971 [51:15<08:36,  1.66it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0216, train/loss_vlb_step=8.33e-5, train/loss_step=0.0216, global_step=2204.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5112/5971 [51:15<08:36,  1.66it/s, loss=0.0787, v_num=0, train/loss_simple_step=0.0334, train/loss_vlb_step=0.000124, train/loss_step=0.0334, global_step=2204.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5113/5971 [51:16<08:36,  1.66it/s, loss=0.0787, v_num=0, train/loss_simple_step=0.0334, train/loss_vlb_step=0.000124, train/loss_step=0.0334, global_step=2204.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5113/5971 [51:16<08:36,  1.66it/s, loss=0.0967, v_num=0, train/loss_simple_step=0.370, train/loss_vlb_step=0.0018, train/loss_step=0.370, global_step=2205.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  86%|████████▌ | 5114/5971 [51:16<08:35,  1.66it/s, loss=0.0967, v_num=0, train/loss_simple_step=0.370, train/loss_vlb_step=0.0018, train/loss_step=0.370, global_step=2205.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5114/5971 [51:16<08:35,  1.66it/s, loss=0.109, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.00238, train/loss_step=0.278, global_step=2205.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5115/5971 [51:17<08:34,  1.66it/s, loss=0.109, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.00238, train/loss_step=0.278, global_step=2205.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5115/5971 [51:17<08:34,  1.66it/s, loss=0.103, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.42e-5, train/loss_step=0.018, global_step=2205.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5116/5971 [51:20<08:34,  1.66it/s, loss=0.103, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.42e-5, train/loss_step=0.018, global_step=2205.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5116/5971 [51:20<08:34,  1.66it/s, loss=0.0998, v_num=0, train/loss_simple_step=0.00274, train/loss_vlb_step=1.57e-5, train/loss_step=0.00274, global_step=2205.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5117/5971 [51:21<08:34,  1.66it/s, loss=0.0998, v_num=0, train/loss_simple_step=0.00274, train/loss_vlb_step=1.57e-5, train/loss_step=0.00274, global_step=2205.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5117/5971 [51:21<08:34,  1.66it/s, loss=0.119, v_num=0, train/loss_simple_step=0.388, train/loss_vlb_step=0.00219, train/loss_step=0.388, global_step=2206.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:  86%|████████▌ | 5118/5971 [51:21<08:33,  1.66it/s, loss=0.119, v_num=0, train/loss_simple_step=0.388, train/loss_vlb_step=0.00219, train/loss_step=0.388, global_step=2206.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5118/5971 [51:21<08:33,  1.66it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00375, train/loss_vlb_step=2.03e-5, train/loss_step=0.00375, global_step=2206.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5119/5971 [51:22<08:32,  1.66it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00375, train/loss_vlb_step=2.03e-5, train/loss_step=0.00375, global_step=2206.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5119/5971 [51:22<08:32,  1.66it/s, loss=0.132, v_num=0, train/loss_simple_step=0.380, train/loss_vlb_step=0.00172, train/loss_step=0.380, global_step=2206.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  86%|████████▌ | 5120/5971 [51:25<08:32,  1.66it/s, loss=0.132, v_num=0, train/loss_simple_step=0.380, train/loss_vlb_step=0.00172, train/loss_step=0.380, global_step=2206.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5120/5971 [51:25<08:32,  1.66it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0538, train/loss_vlb_step=0.000189, train/loss_step=0.0538, global_step=2206.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5121/5971 [51:26<08:32,  1.66it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0538, train/loss_vlb_step=0.000189, train/loss_step=0.0538, global_step=2206.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5121/5971 [51:26<08:32,  1.66it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0531, train/loss_vlb_step=0.000188, train/loss_step=0.0531, global_step=2207.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5122/5971 [51:27<08:31,  1.66it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0531, train/loss_vlb_step=0.000188, train/loss_step=0.0531, global_step=2207.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5122/5971 [51:27<08:31,  1.66it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00252, train/loss_vlb_step=1.51e-5, train/loss_step=0.00252, global_step=2207.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5123/5971 [51:28<08:31,  1.66it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00252, train/loss_vlb_step=1.51e-5, train/loss_step=0.00252, global_step=2207.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5123/5971 [51:28<08:31,  1.66it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0358, train/loss_vlb_step=0.000123, train/loss_step=0.0358, global_step=2207.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  86%|████████▌ | 5124/5971 [51:30<08:30,  1.66it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0358, train/loss_vlb_step=0.000123, train/loss_step=0.0358, global_step=2207.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5124/5971 [51:30<08:30,  1.66it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00766, train/loss_vlb_step=3.68e-5, train/loss_step=0.00766, global_step=2207.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5125/5971 [51:31<08:30,  1.66it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00766, train/loss_vlb_step=3.68e-5, train/loss_step=0.00766, global_step=2207.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5125/5971 [51:31<08:30,  1.66it/s, loss=0.116, v_num=0, train/loss_simple_step=0.075, train/loss_vlb_step=0.000253, train/loss_step=0.075, global_step=2208.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  86%|████████▌ | 5126/5971 [51:32<08:29,  1.66it/s, loss=0.116, v_num=0, train/loss_simple_step=0.075, train/loss_vlb_step=0.000253, train/loss_step=0.075, global_step=2208.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5126/5971 [51:32<08:29,  1.66it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0744, train/loss_vlb_step=0.000249, train/loss_step=0.0744, global_step=2208.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5127/5971 [51:32<08:29,  1.66it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0744, train/loss_vlb_step=0.000249, train/loss_step=0.0744, global_step=2208.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5127/5971 [51:32<08:29,  1.66it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0265, train/loss_vlb_step=9.92e-5, train/loss_step=0.0265, global_step=2208.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  86%|████████▌ | 5128/5971 [51:35<08:28,  1.66it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0265, train/loss_vlb_step=9.92e-5, train/loss_step=0.0265, global_step=2208.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5128/5971 [51:35<08:28,  1.66it/s, loss=0.114, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000734, train/loss_step=0.194, global_step=2208.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  86%|████████▌ | 5129/5971 [51:35<08:28,  1.66it/s, loss=0.114, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000734, train/loss_step=0.194, global_step=2208.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5129/5971 [51:35<08:28,  1.66it/s, loss=0.123, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000778, train/loss_step=0.218, global_step=2209.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5130/5971 [51:36<08:27,  1.66it/s, loss=0.123, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000778, train/loss_step=0.218, global_step=2209.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5130/5971 [51:36<08:27,  1.66it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00903, train/loss_vlb_step=4.13e-5, train/loss_step=0.00903, global_step=2209.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5131/5971 [51:37<08:27,  1.66it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00903, train/loss_vlb_step=4.13e-5, train/loss_step=0.00903, global_step=2209.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5131/5971 [51:37<08:27,  1.66it/s, loss=0.129, v_num=0, train/loss_simple_step=0.352, train/loss_vlb_step=0.00191, train/loss_step=0.352, global_step=2209.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  86%|████████▌ | 5132/5971 [51:40<08:26,  1.66it/s, loss=0.129, v_num=0, train/loss_simple_step=0.352, train/loss_vlb_step=0.00191, train/loss_step=0.352, global_step=2209.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5132/5971 [51:40<08:26,  1.66it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00299, train/loss_vlb_step=1.63e-5, train/loss_step=0.00299, global_step=2209.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5133/5971 [51:41<08:26,  1.66it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00299, train/loss_vlb_step=1.63e-5, train/loss_step=0.00299, global_step=2209.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5133/5971 [51:41<08:26,  1.66it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0053, train/loss_vlb_step=2.76e-5, train/loss_step=0.0053, global_step=2210.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  86%|████████▌ | 5134/5971 [51:42<08:25,  1.66it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0053, train/loss_vlb_step=2.76e-5, train/loss_step=0.0053, global_step=2210.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5134/5971 [51:42<08:25,  1.66it/s, loss=0.107, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000965, train/loss_step=0.240, global_step=2210.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  86%|████████▌ | 5135/5971 [51:43<08:25,  1.66it/s, loss=0.107, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000965, train/loss_step=0.240, global_step=2210.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5135/5971 [51:43<08:25,  1.66it/s, loss=0.115, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000635, train/loss_step=0.172, global_step=2210.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5136/5971 [51:45<08:24,  1.65it/s, loss=0.115, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000635, train/loss_step=0.172, global_step=2210.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5136/5971 [51:45<08:24,  1.65it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0319, train/loss_vlb_step=0.000121, train/loss_step=0.0319, global_step=2210.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5137/5971 [51:46<08:24,  1.65it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0319, train/loss_vlb_step=0.000121, train/loss_step=0.0319, global_step=2210.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5137/5971 [51:46<08:24,  1.65it/s, loss=0.12, v_num=0, train/loss_simple_step=0.456, train/loss_vlb_step=0.0029, train/loss_step=0.456, global_step=2211.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:  86%|████████▌ | 5138/5971 [51:47<08:23,  1.65it/s, loss=0.12, v_num=0, train/loss_simple_step=0.456, train/loss_vlb_step=0.0029, train/loss_step=0.456, global_step=2211.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5138/5971 [51:47<08:23,  1.65it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0234, train/loss_vlb_step=9.32e-5, train/loss_step=0.0234, global_step=2211.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5139/5971 [51:48<08:23,  1.65it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0234, train/loss_vlb_step=9.32e-5, train/loss_step=0.0234, global_step=2211.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5139/5971 [51:48<08:23,  1.65it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0307, train/loss_vlb_step=0.000119, train/loss_step=0.0307, global_step=2211.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5140/5971 [51:50<08:22,  1.65it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0307, train/loss_vlb_step=0.000119, train/loss_step=0.0307, global_step=2211.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5140/5971 [51:50<08:22,  1.65it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0542, train/loss_vlb_step=0.000195, train/loss_step=0.0542, global_step=2211.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5141/5971 [51:51<08:22,  1.65it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0542, train/loss_vlb_step=0.000195, train/loss_step=0.0542, global_step=2211.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5141/5971 [51:51<08:22,  1.65it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0259, train/loss_vlb_step=0.000101, train/loss_step=0.0259, global_step=2212.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5142/5971 [51:51<08:21,  1.65it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0259, train/loss_vlb_step=0.000101, train/loss_step=0.0259, global_step=2212.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5142/5971 [51:51<08:21,  1.65it/s, loss=0.108, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000407, train/loss_step=0.122, global_step=2212.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  86%|████████▌ | 5143/5971 [51:52<08:21,  1.65it/s, loss=0.108, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000407, train/loss_step=0.122, global_step=2212.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5143/5971 [51:52<08:21,  1.65it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0175, train/loss_vlb_step=7.19e-5, train/loss_step=0.0175, global_step=2212.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5144/5971 [51:55<08:20,  1.65it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0175, train/loss_vlb_step=7.19e-5, train/loss_step=0.0175, global_step=2212.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5144/5971 [51:55<08:20,  1.65it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.96e-5, train/loss_step=0.0154, global_step=2212.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5145/5971 [51:56<08:20,  1.65it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.96e-5, train/loss_step=0.0154, global_step=2212.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5145/5971 [51:56<08:20,  1.65it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00331, train/loss_vlb_step=1.84e-5, train/loss_step=0.00331, global_step=2213.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5146/5971 [51:56<08:19,  1.65it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00331, train/loss_vlb_step=1.84e-5, train/loss_step=0.00331, global_step=2213.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5146/5971 [51:56<08:19,  1.65it/s, loss=0.11, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000696, train/loss_step=0.203, global_step=2213.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  86%|████████▌ | 5147/5971 [51:57<08:19,  1.65it/s, loss=0.11, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000696, train/loss_step=0.203, global_step=2213.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5147/5971 [51:57<08:19,  1.65it/s, loss=0.124, v_num=0, train/loss_simple_step=0.312, train/loss_vlb_step=0.00159, train/loss_step=0.312, global_step=2213.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5148/5971 [51:59<08:18,  1.65it/s, loss=0.124, v_num=0, train/loss_simple_step=0.312, train/loss_vlb_step=0.00159, train/loss_step=0.312, global_step=2213.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5148/5971 [51:59<08:18,  1.65it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00685, train/loss_vlb_step=3.44e-5, train/loss_step=0.00685, global_step=2213.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5149/5971 [52:00<08:18,  1.65it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00685, train/loss_vlb_step=3.44e-5, train/loss_step=0.00685, global_step=2213.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▌ | 5149/5971 [52:00<08:18,  1.65it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00514, train/loss_vlb_step=2.72e-5, train/loss_step=0.00514, global_step=2214.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▋ | 5150/5971 [52:01<08:17,  1.65it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00514, train/loss_vlb_step=2.72e-5, train/loss_step=0.00514, global_step=2214.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▋ | 5150/5971 [52:01<08:17,  1.65it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00326, train/loss_vlb_step=1.76e-5, train/loss_step=0.00326, global_step=2214.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▋ | 5151/5971 [52:02<08:17,  1.65it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00326, train/loss_vlb_step=1.76e-5, train/loss_step=0.00326, global_step=2214.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▋ | 5151/5971 [52:02<08:17,  1.65it/s, loss=0.108, v_num=0, train/loss_simple_step=0.436, train/loss_vlb_step=0.00318, train/loss_step=0.436, global_step=2214.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  86%|████████▋ | 5152/5971 [52:05<08:16,  1.65it/s, loss=0.108, v_num=0, train/loss_simple_step=0.436, train/loss_vlb_step=0.00318, train/loss_step=0.436, global_step=2214.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▋ | 5152/5971 [52:05<08:16,  1.65it/s, loss=0.117, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000582, train/loss_step=0.175, global_step=2214.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▋ | 5153/5971 [52:06<08:16,  1.65it/s, loss=0.117, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000582, train/loss_step=0.175, global_step=2214.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▋ | 5153/5971 [52:06<08:16,  1.65it/s, loss=0.131, v_num=0, train/loss_simple_step=0.284, train/loss_vlb_step=0.00116, train/loss_step=0.284, global_step=2215.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  86%|████████▋ | 5154/5971 [52:07<08:15,  1.65it/s, loss=0.131, v_num=0, train/loss_simple_step=0.284, train/loss_vlb_step=0.00116, train/loss_step=0.284, global_step=2215.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▋ | 5154/5971 [52:07<08:15,  1.65it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00189, train/loss_vlb_step=1.13e-5, train/loss_step=0.00189, global_step=2215.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▋ | 5155/5971 [52:07<08:15,  1.65it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00189, train/loss_vlb_step=1.13e-5, train/loss_step=0.00189, global_step=2215.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▋ | 5155/5971 [52:07<08:15,  1.65it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0664, train/loss_vlb_step=0.000224, train/loss_step=0.0664, global_step=2215.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  86%|████████▋ | 5156/5971 [52:10<08:14,  1.65it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0664, train/loss_vlb_step=0.000224, train/loss_step=0.0664, global_step=2215.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▋ | 5156/5971 [52:10<08:14,  1.65it/s, loss=0.121, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000646, train/loss_step=0.179, global_step=2215.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  86%|████████▋ | 5157/5971 [52:11<08:14,  1.65it/s, loss=0.121, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000646, train/loss_step=0.179, global_step=2215.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▋ | 5157/5971 [52:11<08:14,  1.65it/s, loss=0.106, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000497, train/loss_step=0.150, global_step=2216.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▋ | 5158/5971 [52:12<08:13,  1.65it/s, loss=0.106, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000497, train/loss_step=0.150, global_step=2216.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▋ | 5158/5971 [52:12<08:13,  1.65it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0214, train/loss_vlb_step=8.91e-5, train/loss_step=0.0214, global_step=2216.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▋ | 5159/5971 [52:12<08:13,  1.65it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0214, train/loss_vlb_step=8.91e-5, train/loss_step=0.0214, global_step=2216.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▋ | 5159/5971 [52:12<08:13,  1.65it/s, loss=0.109, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000355, train/loss_step=0.107, global_step=2216.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  86%|████████▋ | 5160/5971 [52:15<08:12,  1.65it/s, loss=0.109, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000355, train/loss_step=0.107, global_step=2216.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▋ | 5160/5971 [52:15<08:12,  1.65it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0633, train/loss_vlb_step=0.000215, train/loss_step=0.0633, global_step=2216.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▋ | 5161/5971 [52:15<08:12,  1.65it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0633, train/loss_vlb_step=0.000215, train/loss_step=0.0633, global_step=2216.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▋ | 5161/5971 [52:16<08:12,  1.65it/s, loss=0.123, v_num=0, train/loss_simple_step=0.279, train/loss_vlb_step=0.00143, train/loss_step=0.279, global_step=2217.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  86%|████████▋ | 5162/5971 [52:16<08:11,  1.65it/s, loss=0.123, v_num=0, train/loss_simple_step=0.279, train/loss_vlb_step=0.00143, train/loss_step=0.279, global_step=2217.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▋ | 5162/5971 [52:16<08:11,  1.65it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0722, train/loss_vlb_step=0.000242, train/loss_step=0.0722, global_step=2217.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▋ | 5163/5971 [52:17<08:10,  1.65it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0722, train/loss_vlb_step=0.000242, train/loss_step=0.0722, global_step=2217.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▋ | 5163/5971 [52:17<08:10,  1.65it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.8e-5, train/loss_step=0.0108, global_step=2217.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  86%|████████▋ | 5164/5971 [52:20<08:10,  1.64it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.8e-5, train/loss_step=0.0108, global_step=2217.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  86%|████████▋ | 5164/5971 [52:20<08:10,  1.64it/s, loss=0.13, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.000731, train/loss_step=0.219, global_step=2217.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5165/5971 [52:21<08:10,  1.64it/s, loss=0.13, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.000731, train/loss_step=0.219, global_step=2217.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5165/5971 [52:21<08:10,  1.64it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.69e-5, train/loss_step=0.0104, global_step=2218.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5166/5971 [52:22<08:09,  1.64it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.69e-5, train/loss_step=0.0104, global_step=2218.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5166/5971 [52:22<08:09,  1.64it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=4.92e-5, train/loss_step=0.0112, global_step=2218.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5167/5971 [52:23<08:09,  1.64it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=4.92e-5, train/loss_step=0.0112, global_step=2218.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5167/5971 [52:23<08:09,  1.64it/s, loss=0.105, v_num=0, train/loss_simple_step=0.00231, train/loss_vlb_step=1.37e-5, train/loss_step=0.00231, global_step=2218.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5168/5971 [52:25<08:08,  1.64it/s, loss=0.105, v_num=0, train/loss_simple_step=0.00231, train/loss_vlb_step=1.37e-5, train/loss_step=0.00231, global_step=2218.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5168/5971 [52:25<08:08,  1.64it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.18e-5, train/loss_step=0.0147, global_step=2218.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  87%|████████▋ | 5169/5971 [52:26<08:08,  1.64it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.18e-5, train/loss_step=0.0147, global_step=2218.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5169/5971 [52:26<08:08,  1.64it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00573, train/loss_vlb_step=2.93e-5, train/loss_step=0.00573, global_step=2219.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5170/5971 [52:27<08:07,  1.64it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00573, train/loss_vlb_step=2.93e-5, train/loss_step=0.00573, global_step=2219.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5170/5971 [52:27<08:07,  1.64it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00778, train/loss_vlb_step=3.79e-5, train/loss_step=0.00778, global_step=2219.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5171/5971 [52:28<08:06,  1.64it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00778, train/loss_vlb_step=3.79e-5, train/loss_step=0.00778, global_step=2219.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5171/5971 [52:28<08:06,  1.64it/s, loss=0.0993, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00118, train/loss_step=0.305, global_step=2219.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  87%|████████▋ | 5172/5971 [52:30<08:06,  1.64it/s, loss=0.0993, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00118, train/loss_step=0.305, global_step=2219.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5172/5971 [52:30<08:06,  1.64it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.00119, train/loss_step=0.288, global_step=2219.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  87%|████████▋ | 5173/5971 [52:31<08:06,  1.64it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.00119, train/loss_step=0.288, global_step=2219.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5173/5971 [52:31<08:06,  1.64it/s, loss=0.0912, v_num=0, train/loss_simple_step=0.00868, train/loss_vlb_step=4.07e-5, train/loss_step=0.00868, global_step=2220.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5174/5971 [52:32<08:05,  1.64it/s, loss=0.0912, v_num=0, train/loss_simple_step=0.00868, train/loss_vlb_step=4.07e-5, train/loss_step=0.00868, global_step=2220.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5174/5971 [52:32<08:05,  1.64it/s, loss=0.0921, v_num=0, train/loss_simple_step=0.0204, train/loss_vlb_step=8.2e-5, train/loss_step=0.0204, global_step=2220.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  87%|████████▋ | 5175/5971 [52:33<08:04,  1.64it/s, loss=0.0921, v_num=0, train/loss_simple_step=0.0204, train/loss_vlb_step=8.2e-5, train/loss_step=0.0204, global_step=2220.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5175/5971 [52:33<08:04,  1.64it/s, loss=0.0891, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.86e-5, train/loss_step=0.00565, global_step=2220.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5176/5971 [52:36<08:04,  1.64it/s, loss=0.0891, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.86e-5, train/loss_step=0.00565, global_step=2220.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5176/5971 [52:36<08:04,  1.64it/s, loss=0.0928, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00109, train/loss_step=0.253, global_step=2220.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  87%|████████▋ | 5177/5971 [52:37<08:04,  1.64it/s, loss=0.0928, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00109, train/loss_step=0.253, global_step=2220.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5177/5971 [52:37<08:04,  1.64it/s, loss=0.0881, v_num=0, train/loss_simple_step=0.0568, train/loss_vlb_step=0.000193, train/loss_step=0.0568, global_step=2221.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5178/5971 [52:37<08:03,  1.64it/s, loss=0.0881, v_num=0, train/loss_simple_step=0.0568, train/loss_vlb_step=0.000193, train/loss_step=0.0568, global_step=2221.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5178/5971 [52:37<08:03,  1.64it/s, loss=0.0871, v_num=0, train/loss_simple_step=0.00125, train/loss_vlb_step=7.61e-6, train/loss_step=0.00125, global_step=2221.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5179/5971 [52:38<08:02,  1.64it/s, loss=0.0871, v_num=0, train/loss_simple_step=0.00125, train/loss_vlb_step=7.61e-6, train/loss_step=0.00125, global_step=2221.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5179/5971 [52:38<08:02,  1.64it/s, loss=0.0878, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000399, train/loss_step=0.121, global_step=2221.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  87%|████████▋ | 5180/5971 [52:40<08:02,  1.64it/s, loss=0.0878, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000399, train/loss_step=0.121, global_step=2221.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5180/5971 [52:40<08:02,  1.64it/s, loss=0.0899, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000356, train/loss_step=0.105, global_step=2221.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5181/5971 [52:41<08:02,  1.64it/s, loss=0.0899, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000356, train/loss_step=0.105, global_step=2221.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5181/5971 [52:41<08:02,  1.64it/s, loss=0.0783, v_num=0, train/loss_simple_step=0.0463, train/loss_vlb_step=0.000168, train/loss_step=0.0463, global_step=2222.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5182/5971 [52:42<08:01,  1.64it/s, loss=0.0783, v_num=0, train/loss_simple_step=0.0463, train/loss_vlb_step=0.000168, train/loss_step=0.0463, global_step=2222.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5182/5971 [52:42<08:01,  1.64it/s, loss=0.0778, v_num=0, train/loss_simple_step=0.0634, train/loss_vlb_step=0.000214, train/loss_step=0.0634, global_step=2222.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5183/5971 [52:43<08:00,  1.64it/s, loss=0.0778, v_num=0, train/loss_simple_step=0.0634, train/loss_vlb_step=0.000214, train/loss_step=0.0634, global_step=2222.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5183/5971 [52:43<08:00,  1.64it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.285, train/loss_vlb_step=0.00114, train/loss_step=0.285, global_step=2222.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  87%|████████▋ | 5184/5971 [52:45<08:00,  1.64it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.285, train/loss_vlb_step=0.00114, train/loss_step=0.285, global_step=2222.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5184/5971 [52:45<08:00,  1.64it/s, loss=0.082, v_num=0, train/loss_simple_step=0.0293, train/loss_vlb_step=0.00011, train/loss_step=0.0293, global_step=2222.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5185/5971 [52:46<07:59,  1.64it/s, loss=0.082, v_num=0, train/loss_simple_step=0.0293, train/loss_vlb_step=0.00011, train/loss_step=0.0293, global_step=2222.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5185/5971 [52:46<07:59,  1.64it/s, loss=0.0817, v_num=0, train/loss_simple_step=0.00447, train/loss_vlb_step=2.42e-5, train/loss_step=0.00447, global_step=2223.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5186/5971 [52:47<07:59,  1.64it/s, loss=0.0817, v_num=0, train/loss_simple_step=0.00447, train/loss_vlb_step=2.42e-5, train/loss_step=0.00447, global_step=2223.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5186/5971 [52:47<07:59,  1.64it/s, loss=0.0812, v_num=0, train/loss_simple_step=0.00162, train/loss_vlb_step=9.41e-6, train/loss_step=0.00162, global_step=2223.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5187/5971 [52:48<07:58,  1.64it/s, loss=0.0812, v_num=0, train/loss_simple_step=0.00162, train/loss_vlb_step=9.41e-6, train/loss_step=0.00162, global_step=2223.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5187/5971 [52:48<07:58,  1.64it/s, loss=0.0847, v_num=0, train/loss_simple_step=0.0716, train/loss_vlb_step=0.000239, train/loss_step=0.0716, global_step=2223.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  87%|████████▋ | 5188/5971 [52:50<07:58,  1.64it/s, loss=0.0847, v_num=0, train/loss_simple_step=0.0716, train/loss_vlb_step=0.000239, train/loss_step=0.0716, global_step=2223.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5188/5971 [52:50<07:58,  1.64it/s, loss=0.0845, v_num=0, train/loss_simple_step=0.00973, train/loss_vlb_step=4.22e-5, train/loss_step=0.00973, global_step=2223.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5189/5971 [52:51<07:57,  1.64it/s, loss=0.0845, v_num=0, train/loss_simple_step=0.00973, train/loss_vlb_step=4.22e-5, train/loss_step=0.00973, global_step=2223.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5189/5971 [52:51<07:57,  1.64it/s, loss=0.111, v_num=0, train/loss_simple_step=0.541, train/loss_vlb_step=0.00749, train/loss_step=0.541, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:  87%|████████▋ | 5190/5971 [52:52<07:57,  1.64it/s, loss=0.111, v_num=0, train/loss_simple_step=0.541, train/loss_vlb_step=0.00749, train/loss_step=0.541, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5190/5971 [52:52<07:57,  1.64it/s, loss=0.117, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000435, train/loss_step=0.132, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5191/5971 [52:53<07:56,  1.64it/s, loss=0.117, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000435, train/loss_step=0.132, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5191/5971 [52:53<07:56,  1.64it/s, loss=0.105, v_num=0, train/loss_simple_step=0.065, train/loss_vlb_step=0.000219, train/loss_step=0.065, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5192/5971 [52:55<07:56,  1.64it/s, loss=0.105, v_num=0, train/loss_simple_step=0.065, train/loss_vlb_step=0.000219, train/loss_step=0.065, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  87%|████████▋ | 5192/5971 [52:55<07:56,  1.64it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:00,  2.75it/s]
Epoch 3:  87%|████████▋ | 5194/5971 [52:55<07:55,  1.64it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   1%|          | 2/167 [00:00<00:49,  3.33it/s]
Epoch 3:  87%|████████▋ | 5196/5971 [52:56<07:53,  1.64it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   3%|▎         | 5/167 [00:00<00:18,  8.85it/s]
Epoch 3:  87%|████████▋ | 5199/5971 [52:56<07:51,  1.64it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   5%|▍         | 8/167 [00:00<00:12, 12.90it/s]
Epoch 3:  87%|████████▋ | 5202/5971 [52:56<07:49,  1.64it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.16it/s]
Epoch 3:  87%|████████▋ | 5205/5971 [52:56<07:47,  1.64it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   8%|▊         | 14/167 [00:01<00:08, 17.99it/s]
Epoch 3:  87%|████████▋ | 5208/5971 [52:56<07:45,  1.64it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  10%|█         | 17/167 [00:01<00:07, 19.39it/s]
Epoch 3:  87%|████████▋ | 5211/5971 [52:56<07:43,  1.64it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 21.71it/s]
Epoch 3:  87%|████████▋ | 5214/5971 [52:56<07:41,  1.64it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 22.63it/s]
Epoch 3:  87%|████████▋ | 5217/5971 [52:57<07:39,  1.64it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 23.68it/s]
Epoch 3:  87%|████████▋ | 5220/5971 [52:57<07:37,  1.64it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 24.66it/s]
Epoch 3:  87%|████████▋ | 5223/5971 [52:57<07:34,  1.64it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 23.56it/s]
Epoch 3:  88%|████████▊ | 5226/5971 [52:57<07:32,  1.65it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  22%|██▏       | 36/167 [00:01<00:05, 25.19it/s]
Epoch 3:  88%|████████▊ | 5229/5971 [52:57<07:30,  1.65it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  24%|██▍       | 40/167 [00:02<00:04, 25.80it/s]
Epoch 3:  88%|████████▊ | 5233/5971 [52:57<07:28,  1.65it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  26%|██▌       | 43/167 [00:02<00:04, 26.71it/s]
Epoch 3:  88%|████████▊ | 5237/5971 [52:57<07:25,  1.65it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  28%|██▊       | 46/167 [00:02<00:04, 26.57it/s]
Epoch 3:  88%|████████▊ | 5241/5971 [52:57<07:22,  1.65it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  29%|██▉       | 49/167 [00:02<00:04, 26.36it/s]

Validating:  31%|███       | 52/167 [00:02<00:04, 26.89it/s]
Epoch 3:  88%|████████▊ | 5245/5971 [52:58<07:19,  1.65it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  33%|███▎      | 55/167 [00:02<00:04, 27.65it/s]
Epoch 3:  88%|████████▊ | 5249/5971 [52:58<07:17,  1.65it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  35%|███▌      | 59/167 [00:02<00:03, 28.80it/s]
Epoch 3:  88%|████████▊ | 5253/5971 [52:58<07:14,  1.65it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  37%|███▋      | 62/167 [00:02<00:03, 28.50it/s]
Epoch 3:  88%|████████▊ | 5257/5971 [52:58<07:11,  1.65it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  39%|███▉      | 65/167 [00:03<00:03, 28.28it/s]

Validating:  41%|████      | 68/167 [00:03<00:03, 28.10it/s]
Epoch 3:  88%|████████▊ | 5261/5971 [52:58<07:08,  1.66it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 28.91it/s]
Epoch 3:  88%|████████▊ | 5265/5971 [52:58<07:06,  1.66it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 27.54it/s]
Epoch 3:  88%|████████▊ | 5269/5971 [52:58<07:03,  1.66it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 28.57it/s]
Epoch 3:  88%|████████▊ | 5273/5971 [52:59<07:00,  1.66it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  49%|████▉     | 82/167 [00:03<00:03, 28.00it/s]
Epoch 3:  88%|████████▊ | 5277/5971 [52:59<06:58,  1.66it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  51%|█████     | 85/167 [00:03<00:02, 28.29it/s]

Validating:  53%|█████▎    | 88/167 [00:03<00:02, 28.13it/s]
Epoch 3:  88%|████████▊ | 5281/5971 [52:59<06:55,  1.66it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  55%|█████▌    | 92/167 [00:03<00:02, 28.89it/s]
Epoch 3:  89%|████████▊ | 5285/5971 [52:59<06:52,  1.66it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 28.74it/s]
Epoch 3:  89%|████████▊ | 5289/5971 [52:59<06:49,  1.66it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 28.97it/s]
Epoch 3:  89%|████████▊ | 5293/5971 [52:59<06:47,  1.66it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 29.00it/s][A
Epoch 3:  89%|████████▊ | 5297/5971 [52:59<06:44,  1.67it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 29.08it/s][A
Epoch 3:  89%|████████▉ | 5301/5971 [53:00<06:41,  1.67it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  66%|██████▋   | 111/167 [00:04<00:01, 28.73it/s][A
Epoch 3:  89%|████████▉ | 5305/5971 [53:00<06:39,  1.67it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  68%|██████▊   | 114/167 [00:04<00:01, 27.69it/s][A
Epoch 3:  89%|████████▉ | 5309/5971 [53:00<06:36,  1.67it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  70%|███████   | 117/167 [00:04<00:01, 27.48it/s][A

Validating:  72%|███████▏  | 120/167 [00:04<00:01, 27.57it/s][A
Epoch 3:  89%|████████▉ | 5313/5971 [53:00<06:33,  1.67it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 26.37it/s][A
Epoch 3:  89%|████████▉ | 5317/5971 [53:00<06:31,  1.67it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 26.56it/s][A
Epoch 3:  89%|████████▉ | 5321/5971 [53:00<06:28,  1.67it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 26.62it/s][A

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 25.67it/s][A
Epoch 3:  89%|████████▉ | 5325/5971 [53:00<06:25,  1.67it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  81%|████████  | 135/167 [00:05<00:01, 26.73it/s][A
Epoch 3:  89%|████████▉ | 5329/5971 [53:01<06:23,  1.68it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 27.19it/s][A
Epoch 3:  89%|████████▉ | 5333/5971 [53:01<06:20,  1.68it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  84%|████████▍ | 141/167 [00:05<00:00, 27.92it/s][A
Epoch 3:  89%|████████▉ | 5337/5971 [53:01<06:17,  1.68it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  87%|████████▋ | 145/167 [00:05<00:00, 29.71it/s][A

Validating:  89%|████████▊ | 148/167 [00:05<00:00, 27.23it/s][A
Epoch 3:  89%|████████▉ | 5341/5971 [53:01<06:15,  1.68it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 26.63it/s][A
Epoch 3:  90%|████████▉ | 5345/5971 [53:01<06:12,  1.68it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 27.17it/s][A
Epoch 3:  90%|████████▉ | 5349/5971 [53:01<06:09,  1.68it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 26.10it/s][A

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 26.33it/s][A
Epoch 3:  90%|████████▉ | 5353/5971 [53:01<06:07,  1.68it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 26.77it/s][A
Epoch 3:  90%|████████▉ | 5357/5971 [53:02<06:04,  1.68it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 26.16it/s][A
Epoch 3:  90%|████████▉ | 5360/5971 [53:02<06:02,  1.68it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Epoch 3:  90%|████████▉ | 5361/5971 [53:03<06:02,  1.68it/s, loss=0.105, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.0012, train/loss_step=0.288, global_step=2224.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|████████▉ | 5361/5971 [53:03<06:02,  1.68it/s, loss=0.105, v_num=0, train/loss_simple_step=0.00689, train/loss_vlb_step=3.36e-5, train/loss_step=0.00689, global_step=2225.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|████████▉ | 5362/5971 [53:04<06:01,  1.68it/s, loss=0.122, v_num=0, train/loss_simple_step=0.354, train/loss_vlb_step=0.00171, train/loss_step=0.354, global_step=2225.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  90%|████████▉ | 5363/5971 [53:05<06:01,  1.68it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00984, train/loss_vlb_step=4.44e-5, train/loss_step=0.00984, global_step=2225.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|████████▉ | 5364/5971 [53:07<06:00,  1.68it/s, loss=0.118, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000539, train/loss_step=0.163, global_step=2225.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  90%|████████▉ | 5365/5971 [53:08<06:00,  1.68it/s, loss=0.118, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000539, train/loss_step=0.163, global_step=2225.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|████████▉ | 5365/5971 [53:08<06:00,  1.68it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0471, train/loss_vlb_step=0.000171, train/loss_step=0.0471, global_step=2226.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|████████▉ | 5366/5971 [53:09<05:59,  1.68it/s, loss=0.129, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000976, train/loss_step=0.228, global_step=2226.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  90%|████████▉ | 5367/5971 [53:10<05:58,  1.68it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0182, train/loss_vlb_step=7.72e-5, train/loss_step=0.0182, global_step=2226.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|████████▉ | 5368/5971 [53:12<05:58,  1.68it/s, loss=0.124, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000401, train/loss_step=0.122, global_step=2226.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  90%|████████▉ | 5369/5971 [53:13<05:58,  1.68it/s, loss=0.124, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000401, train/loss_step=0.122, global_step=2226.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|████████▉ | 5369/5971 [53:13<05:58,  1.68it/s, loss=0.137, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.0014, train/loss_step=0.308, global_step=2227.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  90%|████████▉ | 5370/5971 [53:14<05:57,  1.68it/s, loss=0.167, v_num=0, train/loss_simple_step=0.646, train/loss_vlb_step=0.0151, train/loss_step=0.646, global_step=2227.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|████████▉ | 5371/5971 [53:15<05:56,  1.68it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0316, train/loss_vlb_step=0.000127, train/loss_step=0.0316, global_step=2227.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|████████▉ | 5372/5971 [53:17<05:56,  1.68it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0674, train/loss_vlb_step=0.000229, train/loss_step=0.0674, global_step=2227.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|████████▉ | 5373/5971 [53:18<05:55,  1.68it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0674, train/loss_vlb_step=0.000229, train/loss_step=0.0674, global_step=2227.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|████████▉ | 5373/5971 [53:18<05:55,  1.68it/s, loss=0.164, v_num=0, train/loss_simple_step=0.178, train/loss_vlb_step=0.000595, train/loss_step=0.178, global_step=2228.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  90%|█████████ | 5374/5971 [53:19<05:55,  1.68it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.85e-5, train/loss_step=0.00347, global_step=2228.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|█████████ | 5375/5971 [53:20<05:54,  1.68it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0287, train/loss_vlb_step=0.000112, train/loss_step=0.0287, global_step=2228.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  90%|█████████ | 5376/5971 [53:22<05:54,  1.68it/s, loss=0.173, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.00085, train/loss_step=0.220, global_step=2228.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  90%|█████████ | 5377/5971 [53:23<05:53,  1.68it/s, loss=0.173, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.00085, train/loss_step=0.220, global_step=2228.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|█████████ | 5377/5971 [53:23<05:53,  1.68it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0146, train/loss_vlb_step=6.25e-5, train/loss_step=0.0146, global_step=2229.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|█████████ | 5378/5971 [53:24<05:53,  1.68it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00555, train/loss_vlb_step=2.86e-5, train/loss_step=0.00555, global_step=2229.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|█████████ | 5379/5971 [53:24<05:52,  1.68it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0468, train/loss_vlb_step=0.000155, train/loss_step=0.0468, global_step=2229.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|█████████ | 5380/5971 [53:27<05:52,  1.68it/s, loss=0.126, v_num=0, train/loss_simple_step=0.022, train/loss_vlb_step=8.1e-5, train/loss_step=0.022, global_step=2229.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  90%|█████████ | 5381/5971 [53:28<05:51,  1.68it/s, loss=0.126, v_num=0, train/loss_simple_step=0.022, train/loss_vlb_step=8.1e-5, train/loss_step=0.022, global_step=2229.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|█████████ | 5381/5971 [53:28<05:51,  1.68it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0903, train/loss_vlb_step=0.000298, train/loss_step=0.0903, global_step=2230.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|█████████ | 5382/5971 [53:29<05:51,  1.68it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0464, train/loss_vlb_step=0.000169, train/loss_step=0.0464, global_step=2230.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|█████████ | 5383/5971 [53:30<05:50,  1.68it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00329, train/loss_vlb_step=1.79e-5, train/loss_step=0.00329, global_step=2230.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|█████████ | 5384/5971 [53:32<05:50,  1.68it/s, loss=0.127, v_num=0, train/loss_simple_step=0.404, train/loss_vlb_step=0.00256, train/loss_step=0.404, global_step=2230.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  90%|█████████ | 5385/5971 [53:33<05:49,  1.68it/s, loss=0.127, v_num=0, train/loss_simple_step=0.404, train/loss_vlb_step=0.00256, train/loss_step=0.404, global_step=2230.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|█████████ | 5385/5971 [53:33<05:49,  1.68it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0208, train/loss_vlb_step=8.73e-5, train/loss_step=0.0208, global_step=2231.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|█████████ | 5386/5971 [53:33<05:49,  1.68it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0664, train/loss_vlb_step=0.000229, train/loss_step=0.0664, global_step=2231.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|█████████ | 5387/5971 [53:34<05:48,  1.68it/s, loss=0.126, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.000654, train/loss_step=0.187, global_step=2231.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  90%|█████████ | 5388/5971 [53:37<05:48,  1.68it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00546, train/loss_vlb_step=2.79e-5, train/loss_step=0.00546, global_step=2231.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|█████████ | 5389/5971 [53:38<05:47,  1.67it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00546, train/loss_vlb_step=2.79e-5, train/loss_step=0.00546, global_step=2231.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|█████████ | 5389/5971 [53:38<05:47,  1.67it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0284, train/loss_vlb_step=0.000111, train/loss_step=0.0284, global_step=2232.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|█████████ | 5390/5971 [53:38<05:46,  1.67it/s, loss=0.096, v_num=0, train/loss_simple_step=0.450, train/loss_vlb_step=0.00321, train/loss_step=0.450, global_step=2232.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  90%|█████████ | 5391/5971 [53:39<05:46,  1.67it/s, loss=0.0946, v_num=0, train/loss_simple_step=0.00288, train/loss_vlb_step=1.66e-5, train/loss_step=0.00288, global_step=2232.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|█████████ | 5392/5971 [53:41<05:45,  1.67it/s, loss=0.0916, v_num=0, train/loss_simple_step=0.00721, train/loss_vlb_step=3.52e-5, train/loss_step=0.00721, global_step=2232.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|█████████ | 5393/5971 [53:42<05:45,  1.67it/s, loss=0.0916, v_num=0, train/loss_simple_step=0.00721, train/loss_vlb_step=3.52e-5, train/loss_step=0.00721, global_step=2232.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|█████████ | 5393/5971 [53:42<05:45,  1.67it/s, loss=0.106, v_num=0, train/loss_simple_step=0.461, train/loss_vlb_step=0.00346, train/loss_step=0.461, global_step=2233.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]     
Epoch 3:  90%|█████████ | 5394/5971 [53:43<05:44,  1.67it/s, loss=0.117, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000849, train/loss_step=0.222, global_step=2233.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|█████████ | 5395/5971 [53:44<05:44,  1.67it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0634, train/loss_vlb_step=0.000229, train/loss_step=0.0634, global_step=2233.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|█████████ | 5396/5971 [53:47<05:43,  1.67it/s, loss=0.12, v_num=0, train/loss_simple_step=0.259, train/loss_vlb_step=0.00118, train/loss_step=0.259, global_step=2233.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  90%|█████████ | 5397/5971 [53:48<05:43,  1.67it/s, loss=0.12, v_num=0, train/loss_simple_step=0.259, train/loss_vlb_step=0.00118, train/loss_step=0.259, global_step=2233.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|█████████ | 5397/5971 [53:48<05:43,  1.67it/s, loss=0.126, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000429, train/loss_step=0.129, global_step=2234.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|█████████ | 5398/5971 [53:49<05:42,  1.67it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0106, train/loss_vlb_step=4.72e-5, train/loss_step=0.0106, global_step=2234.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|█████████ | 5399/5971 [53:49<05:42,  1.67it/s, loss=0.134, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000731, train/loss_step=0.209, global_step=2234.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  90%|█████████ | 5400/5971 [53:52<05:41,  1.67it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00163, train/loss_vlb_step=9.76e-6, train/loss_step=0.00163, global_step=2234.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|█████████ | 5401/5971 [53:53<05:41,  1.67it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00163, train/loss_vlb_step=9.76e-6, train/loss_step=0.00163, global_step=2234.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|█████████ | 5401/5971 [53:53<05:41,  1.67it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00635, train/loss_vlb_step=3.19e-5, train/loss_step=0.00635, global_step=2235.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  90%|█████████ | 5402/5971 [53:54<05:40,  1.67it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000223, train/loss_step=0.0626, global_step=2235.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  90%|█████████ | 5403/5971 [53:55<05:40,  1.67it/s, loss=0.151, v_num=0, train/loss_simple_step=0.420, train/loss_vlb_step=0.00277, train/loss_step=0.420, global_step=2235.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  91%|█████████ | 5404/5971 [53:57<05:39,  1.67it/s, loss=0.14, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.00064, train/loss_step=0.181, global_step=2235.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  91%|█████████ | 5405/5971 [53:58<05:39,  1.67it/s, loss=0.14, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.00064, train/loss_step=0.181, global_step=2235.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5405/5971 [53:58<05:39,  1.67it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0171, train/loss_vlb_step=7.23e-5, train/loss_step=0.0171, global_step=2236.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5406/5971 [53:59<05:38,  1.67it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00252, train/loss_vlb_step=1.35e-5, train/loss_step=0.00252, global_step=2236.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5407/5971 [53:59<05:37,  1.67it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0123, train/loss_vlb_step=5.34e-5, train/loss_step=0.0123, global_step=2236.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  91%|█████████ | 5408/5971 [54:02<05:37,  1.67it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0623, train/loss_vlb_step=0.00021, train/loss_step=0.0623, global_step=2236.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  91%|█████████ | 5409/5971 [54:03<05:36,  1.67it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0623, train/loss_vlb_step=0.00021, train/loss_step=0.0623, global_step=2236.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5409/5971 [54:03<05:36,  1.67it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00455, train/loss_vlb_step=2.38e-5, train/loss_step=0.00455, global_step=2237.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5410/5971 [54:04<05:36,  1.67it/s, loss=0.113, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.00038, train/loss_step=0.116, global_step=2237.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  91%|█████████ | 5411/5971 [54:04<05:35,  1.67it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0336, train/loss_vlb_step=0.000125, train/loss_step=0.0336, global_step=2237.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5412/5971 [54:07<05:35,  1.67it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0746, train/loss_vlb_step=0.000251, train/loss_step=0.0746, global_step=2237.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5413/5971 [54:07<05:34,  1.67it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0746, train/loss_vlb_step=0.000251, train/loss_step=0.0746, global_step=2237.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5413/5971 [54:07<05:34,  1.67it/s, loss=0.116, v_num=0, train/loss_simple_step=0.423, train/loss_vlb_step=0.00183, train/loss_step=0.423, global_step=2238.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  91%|█████████ | 5414/5971 [54:08<05:34,  1.67it/s, loss=0.113, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000555, train/loss_step=0.166, global_step=2238.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5415/5971 [54:09<05:33,  1.67it/s, loss=0.132, v_num=0, train/loss_simple_step=0.442, train/loss_vlb_step=0.00236, train/loss_step=0.442, global_step=2238.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  91%|█████████ | 5416/5971 [54:11<05:33,  1.67it/s, loss=0.124, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.000336, train/loss_step=0.100, global_step=2238.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5417/5971 [54:12<05:32,  1.67it/s, loss=0.124, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.000336, train/loss_step=0.100, global_step=2238.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5417/5971 [54:12<05:32,  1.67it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0674, train/loss_vlb_step=0.000222, train/loss_step=0.0674, global_step=2239.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5418/5971 [54:13<05:32,  1.67it/s, loss=0.13, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000676, train/loss_step=0.194, global_step=2239.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  91%|█████████ | 5419/5971 [54:14<05:31,  1.67it/s, loss=0.13, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000847, train/loss_step=0.213, global_step=2239.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5420/5971 [54:17<05:31,  1.66it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0317, train/loss_vlb_step=0.000127, train/loss_step=0.0317, global_step=2239.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5421/5971 [54:18<05:30,  1.66it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0317, train/loss_vlb_step=0.000127, train/loss_step=0.0317, global_step=2239.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5421/5971 [54:18<05:30,  1.66it/s, loss=0.168, v_num=0, train/loss_simple_step=0.741, train/loss_vlb_step=0.0145, train/loss_step=0.741, global_step=2240.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  91%|█████████ | 5422/5971 [54:18<05:29,  1.66it/s, loss=0.173, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000501, train/loss_step=0.151, global_step=2240.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5423/5971 [54:19<05:29,  1.66it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0307, train/loss_vlb_step=0.00011, train/loss_step=0.0307, global_step=2240.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5424/5971 [54:22<05:28,  1.66it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0779, train/loss_vlb_step=0.000259, train/loss_step=0.0779, global_step=2240.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5425/5971 [54:22<05:28,  1.66it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0779, train/loss_vlb_step=0.000259, train/loss_step=0.0779, global_step=2240.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5425/5971 [54:22<05:28,  1.66it/s, loss=0.176, v_num=0, train/loss_simple_step=0.572, train/loss_vlb_step=0.00665, train/loss_step=0.572, global_step=2241.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  91%|█████████ | 5426/5971 [54:23<05:27,  1.66it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0388, train/loss_vlb_step=0.000141, train/loss_step=0.0388, global_step=2241.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5427/5971 [54:24<05:27,  1.66it/s, loss=0.192, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00144, train/loss_step=0.308, global_step=2241.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  91%|█████████ | 5428/5971 [54:27<05:26,  1.66it/s, loss=0.19, v_num=0, train/loss_simple_step=0.00633, train/loss_vlb_step=3.2e-5, train/loss_step=0.00633, global_step=2241.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5429/5971 [54:28<05:26,  1.66it/s, loss=0.19, v_num=0, train/loss_simple_step=0.00633, train/loss_vlb_step=3.2e-5, train/loss_step=0.00633, global_step=2241.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5429/5971 [54:28<05:26,  1.66it/s, loss=0.221, v_num=0, train/loss_simple_step=0.632, train/loss_vlb_step=0.0109, train/loss_step=0.632, global_step=2242.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  91%|█████████ | 5430/5971 [54:28<05:25,  1.66it/s, loss=0.237, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00348, train/loss_step=0.438, global_step=2242.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5431/5971 [54:29<05:25,  1.66it/s, loss=0.25, v_num=0, train/loss_simple_step=0.285, train/loss_vlb_step=0.00116, train/loss_step=0.285, global_step=2242.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  91%|█████████ | 5432/5971 [54:31<05:24,  1.66it/s, loss=0.249, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000216, train/loss_step=0.0655, global_step=2242.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5433/5971 [54:32<05:24,  1.66it/s, loss=0.249, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000216, train/loss_step=0.0655, global_step=2242.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5433/5971 [54:32<05:24,  1.66it/s, loss=0.228, v_num=0, train/loss_simple_step=0.00165, train/loss_vlb_step=9.3e-6, train/loss_step=0.00165, global_step=2243.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5434/5971 [54:33<05:23,  1.66it/s, loss=0.237, v_num=0, train/loss_simple_step=0.338, train/loss_vlb_step=0.00213, train/loss_step=0.338, global_step=2243.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  91%|█████████ | 5435/5971 [54:34<05:22,  1.66it/s, loss=0.231, v_num=0, train/loss_simple_step=0.321, train/loss_vlb_step=0.00192, train/loss_step=0.321, global_step=2243.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5436/5971 [54:36<05:22,  1.66it/s, loss=0.229, v_num=0, train/loss_simple_step=0.0707, train/loss_vlb_step=0.000245, train/loss_step=0.0707, global_step=2243.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5437/5971 [54:37<05:21,  1.66it/s, loss=0.229, v_num=0, train/loss_simple_step=0.0707, train/loss_vlb_step=0.000245, train/loss_step=0.0707, global_step=2243.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5437/5971 [54:37<05:21,  1.66it/s, loss=0.229, v_num=0, train/loss_simple_step=0.0694, train/loss_vlb_step=0.000228, train/loss_step=0.0694, global_step=2244.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5438/5971 [54:38<05:21,  1.66it/s, loss=0.221, v_num=0, train/loss_simple_step=0.0279, train/loss_vlb_step=9.92e-5, train/loss_step=0.0279, global_step=2244.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  91%|█████████ | 5439/5971 [54:39<05:20,  1.66it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0622, train/loss_vlb_step=0.00021, train/loss_step=0.0622, global_step=2244.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5440/5971 [54:41<05:20,  1.66it/s, loss=0.214, v_num=0, train/loss_simple_step=0.037, train/loss_vlb_step=0.000138, train/loss_step=0.037, global_step=2244.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  91%|█████████ | 5441/5971 [54:42<05:19,  1.66it/s, loss=0.214, v_num=0, train/loss_simple_step=0.037, train/loss_vlb_step=0.000138, train/loss_step=0.037, global_step=2244.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5441/5971 [54:42<05:19,  1.66it/s, loss=0.198, v_num=0, train/loss_simple_step=0.433, train/loss_vlb_step=0.00319, train/loss_step=0.433, global_step=2245.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  91%|█████████ | 5442/5971 [54:43<05:19,  1.66it/s, loss=0.211, v_num=0, train/loss_simple_step=0.403, train/loss_vlb_step=0.00191, train/loss_step=0.403, global_step=2245.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5443/5971 [54:44<05:18,  1.66it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0259, train/loss_vlb_step=9.5e-5, train/loss_step=0.0259, global_step=2245.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5444/5971 [54:46<05:18,  1.66it/s, loss=0.214, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000461, train/loss_step=0.140, global_step=2245.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5445/5971 [54:47<05:17,  1.66it/s, loss=0.214, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000461, train/loss_step=0.140, global_step=2245.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5445/5971 [54:47<05:17,  1.66it/s, loss=0.186, v_num=0, train/loss_simple_step=0.022, train/loss_vlb_step=8.64e-5, train/loss_step=0.022, global_step=2246.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  91%|█████████ | 5446/5971 [54:48<05:16,  1.66it/s, loss=0.199, v_num=0, train/loss_simple_step=0.295, train/loss_vlb_step=0.00126, train/loss_step=0.295, global_step=2246.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5447/5971 [54:49<05:16,  1.66it/s, loss=0.187, v_num=0, train/loss_simple_step=0.072, train/loss_vlb_step=0.000237, train/loss_step=0.072, global_step=2246.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████ | 5448/5971 [54:51<05:15,  1.66it/s, loss=0.197, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000778, train/loss_step=0.207, global_step=2246.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████▏| 5449/5971 [54:52<05:15,  1.66it/s, loss=0.197, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000778, train/loss_step=0.207, global_step=2246.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████▏| 5449/5971 [54:52<05:15,  1.66it/s, loss=0.167, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.27e-5, train/loss_step=0.018, global_step=2247.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  91%|█████████▏| 5450/5971 [54:53<05:14,  1.66it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0598, train/loss_vlb_step=0.000205, train/loss_step=0.0598, global_step=2247.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████▏| 5451/5971 [54:54<05:14,  1.66it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=5.09e-5, train/loss_step=0.0111, global_step=2247.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  91%|█████████▏| 5452/5971 [54:56<05:13,  1.65it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00249, train/loss_vlb_step=1.42e-5, train/loss_step=0.00249, global_step=2247.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████▏| 5453/5971 [54:57<05:13,  1.65it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00249, train/loss_vlb_step=1.42e-5, train/loss_step=0.00249, global_step=2247.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████▏| 5453/5971 [54:57<05:13,  1.65it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.17e-5, train/loss_step=0.0112, global_step=2248.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  91%|█████████▏| 5454/5971 [54:57<05:12,  1.65it/s, loss=0.123, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000598, train/loss_step=0.177, global_step=2248.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  91%|█████████▏| 5455/5971 [54:58<05:11,  1.65it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0842, train/loss_vlb_step=0.000279, train/loss_step=0.0842, global_step=2248.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████▏| 5456/5971 [55:01<05:11,  1.65it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00229, train/loss_vlb_step=1.29e-5, train/loss_step=0.00229, global_step=2248.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████▏| 5457/5971 [55:02<05:10,  1.65it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00229, train/loss_vlb_step=1.29e-5, train/loss_step=0.00229, global_step=2248.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████▏| 5457/5971 [55:02<05:10,  1.65it/s, loss=0.131, v_num=0, train/loss_simple_step=0.534, train/loss_vlb_step=0.00443, train/loss_step=0.534, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  91%|█████████▏| 5458/5971 [55:03<05:10,  1.65it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0319, train/loss_vlb_step=0.000112, train/loss_step=0.0319, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████▏| 5459/5971 [55:03<05:09,  1.65it/s, loss=0.154, v_num=0, train/loss_simple_step=0.515, train/loss_vlb_step=0.0057, train/loss_step=0.515, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  91%|█████████▏| 5460/5971 [55:06<05:09,  1.65it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  91%|█████████▏| 5461/5971 [55:06<05:08,  1.65it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<00:57,  2.90it/s][A

Validating:   1%|          | 2/167 [00:00<00:51,  3.21it/s][A
Epoch 3:  92%|█████████▏| 5465/5971 [55:06<05:06,  1.65it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   3%|▎         | 5/167 [00:00<00:19,  8.47it/s][A

Validating:   4%|▍         | 7/167 [00:00<00:14, 10.75it/s][A
Epoch 3:  92%|█████████▏| 5469/5971 [55:07<05:03,  1.65it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   5%|▌         | 9/167 [00:01<00:14, 11.25it/s][A

Validating:   7%|▋         | 11/167 [00:01<00:11, 13.01it/s][A
Epoch 3:  92%|█████████▏| 5473/5971 [55:07<05:00,  1.66it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   8%|▊         | 14/167 [00:01<00:09, 16.29it/s][A
Epoch 3:  92%|█████████▏| 5477/5971 [55:07<04:58,  1.66it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  10%|█         | 17/167 [00:01<00:07, 19.31it/s][A

Validating:  12%|█▏        | 20/167 [00:01<00:07, 20.11it/s][A
Epoch 3:  92%|█████████▏| 5481/5971 [55:07<04:55,  1.66it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  14%|█▍        | 23/167 [00:02<00:13, 10.55it/s][A
Epoch 3:  92%|█████████▏| 5485/5971 [55:08<04:53,  1.66it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  15%|█▍        | 25/167 [00:02<00:14,  9.94it/s][A

Validating:  17%|█▋        | 28/167 [00:02<00:11, 12.54it/s][A
Epoch 3:  92%|█████████▏| 5489/5971 [55:08<04:50,  1.66it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  19%|█▊        | 31/167 [00:02<00:08, 15.29it/s][A
Epoch 3:  92%|█████████▏| 5493/5971 [55:08<04:47,  1.66it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  20%|██        | 34/167 [00:02<00:07, 17.40it/s][A
Epoch 3:  92%|█████████▏| 5497/5971 [55:08<04:45,  1.66it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  22%|██▏       | 37/167 [00:02<00:06, 20.01it/s][A

Validating:  24%|██▍       | 40/167 [00:02<00:05, 21.44it/s][A
Epoch 3:  92%|█████████▏| 5501/5971 [55:08<04:42,  1.66it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  26%|██▌       | 43/167 [00:02<00:05, 22.33it/s][A
Epoch 3:  92%|█████████▏| 5505/5971 [55:09<04:40,  1.66it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  28%|██▊       | 46/167 [00:03<00:05, 24.02it/s][A
Epoch 3:  92%|█████████▏| 5509/5971 [55:09<04:37,  1.67it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  29%|██▉       | 49/167 [00:03<00:04, 24.72it/s][A

Validating:  31%|███       | 52/167 [00:03<00:04, 24.98it/s][A
Epoch 3:  92%|█████████▏| 5513/5971 [55:09<04:34,  1.67it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  33%|███▎      | 55/167 [00:03<00:04, 25.28it/s][A
Epoch 3:  92%|█████████▏| 5517/5971 [55:09<04:32,  1.67it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  35%|███▍      | 58/167 [00:03<00:04, 25.77it/s][A
Epoch 3:  92%|█████████▏| 5521/5971 [55:09<04:29,  1.67it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  37%|███▋      | 61/167 [00:03<00:04, 24.05it/s][A

Validating:  38%|███▊      | 64/167 [00:03<00:04, 24.10it/s][A
Epoch 3:  93%|█████████▎| 5525/5971 [55:09<04:27,  1.67it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  40%|████      | 67/167 [00:03<00:04, 24.72it/s][A
Epoch 3:  93%|█████████▎| 5529/5971 [55:10<04:24,  1.67it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  42%|████▏     | 70/167 [00:04<00:04, 23.44it/s][A
Epoch 3:  93%|█████████▎| 5533/5971 [55:10<04:21,  1.67it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  44%|████▎     | 73/167 [00:04<00:03, 24.88it/s][A

Validating:  46%|████▌     | 76/167 [00:04<00:03, 25.78it/s][A
Epoch 3:  93%|█████████▎| 5537/5971 [55:10<04:19,  1.67it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  47%|████▋     | 79/167 [00:04<00:03, 25.56it/s][A
Epoch 3:  93%|█████████▎| 5541/5971 [55:10<04:16,  1.67it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  49%|████▉     | 82/167 [00:04<00:03, 25.74it/s][A
Epoch 3:  93%|█████████▎| 5545/5971 [55:10<04:14,  1.68it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  51%|█████     | 85/167 [00:04<00:03, 24.40it/s][A

Validating:  53%|█████▎    | 88/167 [00:04<00:03, 24.75it/s][A
Epoch 3:  93%|█████████▎| 5549/5971 [55:10<04:11,  1.68it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  54%|█████▍    | 91/167 [00:04<00:02, 25.83it/s][A
Epoch 3:  93%|█████████▎| 5553/5971 [55:10<04:09,  1.68it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  56%|█████▋    | 94/167 [00:05<00:02, 25.32it/s][A
Epoch 3:  93%|█████████▎| 5557/5971 [55:11<04:06,  1.68it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  58%|█████▊    | 97/167 [00:05<00:02, 25.73it/s][A

Validating:  60%|█████▉    | 100/167 [00:05<00:02, 23.83it/s][A
Epoch 3:  93%|█████████▎| 5561/5971 [55:11<04:04,  1.68it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  62%|██████▏   | 103/167 [00:05<00:02, 24.16it/s][A
Epoch 3:  93%|█████████▎| 5565/5971 [55:11<04:01,  1.68it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  63%|██████▎   | 106/167 [00:05<00:02, 24.96it/s][A
Epoch 3:  93%|█████████▎| 5569/5971 [55:11<03:59,  1.68it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  65%|██████▌   | 109/167 [00:05<00:02, 25.77it/s][A

Validating:  67%|██████▋   | 112/167 [00:05<00:02, 26.81it/s][A
Epoch 3:  93%|█████████▎| 5573/5971 [55:11<03:56,  1.68it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  69%|██████▉   | 115/167 [00:05<00:01, 27.14it/s][A
Epoch 3:  93%|█████████▎| 5577/5971 [55:11<03:53,  1.68it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  71%|███████   | 118/167 [00:05<00:01, 27.30it/s][A
Epoch 3:  93%|█████████▎| 5581/5971 [55:12<03:51,  1.69it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  72%|███████▏  | 121/167 [00:06<00:01, 27.85it/s][A

Validating:  74%|███████▍  | 124/167 [00:06<00:01, 26.10it/s][A
Epoch 3:  94%|█████████▎| 5585/5971 [55:12<03:48,  1.69it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  76%|███████▌  | 127/167 [00:06<00:01, 24.66it/s][A
Epoch 3:  94%|█████████▎| 5589/5971 [55:12<03:46,  1.69it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  78%|███████▊  | 130/167 [00:06<00:01, 25.08it/s][A
Epoch 3:  94%|█████████▎| 5593/5971 [55:12<03:43,  1.69it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  80%|████████  | 134/167 [00:06<00:01, 26.38it/s][A
Epoch 3:  94%|█████████▎| 5597/5971 [55:12<03:41,  1.69it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  82%|████████▏ | 137/167 [00:06<00:01, 26.55it/s][A

Validating:  84%|████████▍ | 140/167 [00:06<00:01, 26.12it/s][A
Epoch 3:  94%|█████████▍| 5601/5971 [55:12<03:38,  1.69it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 24.87it/s][A
Epoch 3:  94%|█████████▍| 5605/5971 [55:13<03:36,  1.69it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  87%|████████▋ | 146/167 [00:07<00:00, 25.12it/s][A
Epoch 3:  94%|█████████▍| 5609/5971 [55:13<03:33,  1.69it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  89%|████████▉ | 149/167 [00:07<00:00, 25.09it/s][A

Validating:  91%|█████████ | 152/167 [00:07<00:00, 25.37it/s][A
Epoch 3:  94%|█████████▍| 5613/5971 [55:13<03:31,  1.69it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  93%|█████████▎| 155/167 [00:07<00:00, 26.56it/s][A
Epoch 3:  94%|█████████▍| 5617/5971 [55:13<03:28,  1.70it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  95%|█████████▍| 158/167 [00:07<00:00, 26.84it/s][A
Epoch 3:  94%|█████████▍| 5621/5971 [55:13<03:26,  1.70it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  96%|█████████▋| 161/167 [00:07<00:00, 26.88it/s][A

Validating:  98%|█████████▊| 164/167 [00:07<00:00, 27.10it/s][A
Epoch 3:  94%|█████████▍| 5625/5971 [55:13<03:23,  1.70it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating: 100%|██████████| 167/167 [00:07<00:00, 27.00it/s][A
Epoch 3:  94%|█████████▍| 5628/5971 [55:14<03:21,  1.70it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Epoch 3:  94%|█████████▍| 5629/5971 [55:15<03:21,  1.70it/s, loss=0.158, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2249.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  94%|█████████▍| 5629/5971 [55:15<03:21,  1.70it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0178, train/loss_vlb_step=7.17e-5, train/loss_step=0.0178, global_step=2250.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  94%|█████████▍| 5630/5971 [55:16<03:20,  1.70it/s, loss=0.128, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000727, train/loss_step=0.204, global_step=2250.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  94%|█████████▍| 5631/5971 [55:16<03:20,  1.70it/s, loss=0.154, v_num=0, train/loss_simple_step=0.564, train/loss_vlb_step=0.00763, train/loss_step=0.564, global_step=2250.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  94%|█████████▍| 5632/5971 [55:19<03:19,  1.70it/s, loss=0.162, v_num=0, train/loss_simple_step=0.301, train/loss_vlb_step=0.00165, train/loss_step=0.301, global_step=2250.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  94%|█████████▍| 5633/5971 [55:19<03:19,  1.70it/s, loss=0.162, v_num=0, train/loss_simple_step=0.301, train/loss_vlb_step=0.00165, train/loss_step=0.301, global_step=2250.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  94%|█████████▍| 5633/5971 [55:19<03:19,  1.70it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.79e-6, train/loss_step=0.00164, global_step=2251.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  94%|█████████▍| 5634/5971 [55:20<03:18,  1.70it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00401, train/loss_vlb_step=2.09e-5, train/loss_step=0.00401, global_step=2251.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  94%|█████████▍| 5635/5971 [55:21<03:18,  1.70it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0394, train/loss_vlb_step=0.000152, train/loss_step=0.0394, global_step=2251.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  94%|█████████▍| 5636/5971 [55:23<03:17,  1.70it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00701, train/loss_vlb_step=3.44e-5, train/loss_step=0.00701, global_step=2251.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  94%|█████████▍| 5637/5971 [55:24<03:16,  1.70it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00701, train/loss_vlb_step=3.44e-5, train/loss_step=0.00701, global_step=2251.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  94%|█████████▍| 5637/5971 [55:24<03:16,  1.70it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00228, train/loss_vlb_step=1.3e-5, train/loss_step=0.00228, global_step=2252.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  94%|█████████▍| 5638/5971 [55:25<03:16,  1.70it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0194, train/loss_vlb_step=8.25e-5, train/loss_step=0.0194, global_step=2252.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  94%|█████████▍| 5639/5971 [55:26<03:15,  1.70it/s, loss=0.161, v_num=0, train/loss_simple_step=0.572, train/loss_vlb_step=0.00889, train/loss_step=0.572, global_step=2252.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  94%|█████████▍| 5640/5971 [55:28<03:15,  1.69it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0508, train/loss_vlb_step=0.000175, train/loss_step=0.0508, global_step=2252.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  94%|█████████▍| 5641/5971 [55:29<03:14,  1.69it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0508, train/loss_vlb_step=0.000175, train/loss_step=0.0508, global_step=2252.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  94%|█████████▍| 5641/5971 [55:29<03:14,  1.69it/s, loss=0.17, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000582, train/loss_step=0.161, global_step=2253.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  94%|█████████▍| 5642/5971 [55:30<03:14,  1.69it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00637, train/loss_vlb_step=3.1e-5, train/loss_step=0.00637, global_step=2253.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▍| 5643/5971 [55:31<03:13,  1.69it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0152, train/loss_vlb_step=6.57e-5, train/loss_step=0.0152, global_step=2253.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  95%|█████████▍| 5644/5971 [55:34<03:13,  1.69it/s, loss=0.167, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000613, train/loss_step=0.177, global_step=2253.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  95%|█████████▍| 5645/5971 [55:35<03:12,  1.69it/s, loss=0.167, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000613, train/loss_step=0.177, global_step=2253.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▍| 5645/5971 [55:35<03:12,  1.69it/s, loss=0.157, v_num=0, train/loss_simple_step=0.337, train/loss_vlb_step=0.00222, train/loss_step=0.337, global_step=2254.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  95%|█████████▍| 5646/5971 [55:35<03:11,  1.69it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00821, train/loss_vlb_step=3.74e-5, train/loss_step=0.00821, global_step=2254.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▍| 5647/5971 [55:36<03:11,  1.69it/s, loss=0.14, v_num=0, train/loss_simple_step=0.191, train/loss_vlb_step=0.000704, train/loss_step=0.191, global_step=2254.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  95%|█████████▍| 5648/5971 [55:38<03:10,  1.69it/s, loss=0.145, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.000882, train/loss_step=0.230, global_step=2254.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▍| 5649/5971 [55:39<03:10,  1.69it/s, loss=0.145, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.000882, train/loss_step=0.230, global_step=2254.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▍| 5649/5971 [55:39<03:10,  1.69it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0462, train/loss_vlb_step=0.000167, train/loss_step=0.0462, global_step=2255.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▍| 5650/5971 [55:40<03:09,  1.69it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0833, train/loss_vlb_step=0.000282, train/loss_step=0.0833, global_step=2255.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▍| 5651/5971 [55:41<03:09,  1.69it/s, loss=0.123, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000781, train/loss_step=0.211, global_step=2255.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  95%|█████████▍| 5652/5971 [55:43<03:08,  1.69it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0311, train/loss_vlb_step=0.000115, train/loss_step=0.0311, global_step=2255.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▍| 5653/5971 [55:44<03:08,  1.69it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0311, train/loss_vlb_step=0.000115, train/loss_step=0.0311, global_step=2255.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▍| 5653/5971 [55:44<03:08,  1.69it/s, loss=0.118, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000544, train/loss_step=0.160, global_step=2256.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  95%|█████████▍| 5654/5971 [55:45<03:07,  1.69it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0224, train/loss_vlb_step=8.5e-5, train/loss_step=0.0224, global_step=2256.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▍| 5655/5971 [55:46<03:06,  1.69it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.94e-5, train/loss_step=0.00347, global_step=2256.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▍| 5656/5971 [55:48<03:06,  1.69it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00616, train/loss_vlb_step=3.08e-5, train/loss_step=0.00616, global_step=2256.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▍| 5657/5971 [55:49<03:05,  1.69it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00616, train/loss_vlb_step=3.08e-5, train/loss_step=0.00616, global_step=2256.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▍| 5657/5971 [55:49<03:05,  1.69it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0584, train/loss_vlb_step=0.000202, train/loss_step=0.0584, global_step=2257.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  95%|█████████▍| 5658/5971 [55:50<03:05,  1.69it/s, loss=0.125, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000415, train/loss_step=0.126, global_step=2257.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  95%|█████████▍| 5659/5971 [55:51<03:04,  1.69it/s, loss=0.114, v_num=0, train/loss_simple_step=0.351, train/loss_vlb_step=0.00164, train/loss_step=0.351, global_step=2257.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  95%|█████████▍| 5660/5971 [55:53<03:04,  1.69it/s, loss=0.118, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000463, train/loss_step=0.141, global_step=2257.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▍| 5661/5971 [55:54<03:03,  1.69it/s, loss=0.118, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000463, train/loss_step=0.141, global_step=2257.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▍| 5661/5971 [55:54<03:03,  1.69it/s, loss=0.159, v_num=0, train/loss_simple_step=0.968, train/loss_vlb_step=0.487, train/loss_step=0.968, global_step=2258.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  95%|█████████▍| 5662/5971 [55:55<03:03,  1.69it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00949, train/loss_vlb_step=4.37e-5, train/loss_step=0.00949, global_step=2258.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▍| 5663/5971 [55:56<03:02,  1.69it/s, loss=0.17, v_num=0, train/loss_simple_step=0.238, train/loss_vlb_step=0.000829, train/loss_step=0.238, global_step=2258.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  95%|█████████▍| 5664/5971 [55:58<03:02,  1.69it/s, loss=0.166, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=2258.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▍| 5665/5971 [55:59<03:01,  1.69it/s, loss=0.166, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=2258.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▍| 5665/5971 [55:59<03:01,  1.69it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00221, train/loss_vlb_step=1.3e-5, train/loss_step=0.00221, global_step=2259.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▍| 5666/5971 [56:00<03:00,  1.69it/s, loss=0.158, v_num=0, train/loss_simple_step=0.183, train/loss_vlb_step=0.000635, train/loss_step=0.183, global_step=2259.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  95%|█████████▍| 5667/5971 [56:01<03:00,  1.69it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0561, train/loss_vlb_step=0.000192, train/loss_step=0.0561, global_step=2259.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▍| 5668/5971 [56:03<02:59,  1.69it/s, loss=0.146, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=2259.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  95%|█████████▍| 5669/5971 [56:04<02:59,  1.69it/s, loss=0.146, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=2259.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▍| 5669/5971 [56:04<02:59,  1.69it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0947, train/loss_vlb_step=0.000311, train/loss_step=0.0947, global_step=2260.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▍| 5670/5971 [56:05<02:58,  1.69it/s, loss=0.169, v_num=0, train/loss_simple_step=0.495, train/loss_vlb_step=0.00467, train/loss_step=0.495, global_step=2260.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  95%|█████████▍| 5671/5971 [56:06<02:58,  1.69it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00179, train/loss_vlb_step=1.06e-5, train/loss_step=0.00179, global_step=2260.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▍| 5672/5971 [56:08<02:57,  1.68it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0268, train/loss_vlb_step=0.000109, train/loss_step=0.0268, global_step=2260.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  95%|█████████▌| 5673/5971 [56:09<02:56,  1.68it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0268, train/loss_vlb_step=0.000109, train/loss_step=0.0268, global_step=2260.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▌| 5673/5971 [56:09<02:56,  1.68it/s, loss=0.157, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.00042, train/loss_step=0.124, global_step=2261.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  95%|█████████▌| 5674/5971 [56:10<02:56,  1.68it/s, loss=0.162, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000418, train/loss_step=0.127, global_step=2261.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▌| 5675/5971 [56:11<02:55,  1.68it/s, loss=0.189, v_num=0, train/loss_simple_step=0.535, train/loss_vlb_step=0.00425, train/loss_step=0.535, global_step=2261.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  95%|█████████▌| 5676/5971 [56:13<02:55,  1.68it/s, loss=0.209, v_num=0, train/loss_simple_step=0.421, train/loss_vlb_step=0.0023, train/loss_step=0.421, global_step=2261.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  95%|█████████▌| 5677/5971 [56:14<02:54,  1.68it/s, loss=0.209, v_num=0, train/loss_simple_step=0.421, train/loss_vlb_step=0.0023, train/loss_step=0.421, global_step=2261.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▌| 5677/5971 [56:14<02:54,  1.68it/s, loss=0.214, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000543, train/loss_step=0.161, global_step=2262.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▌| 5678/5971 [56:15<02:54,  1.68it/s, loss=0.22, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.000931, train/loss_step=0.231, global_step=2262.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  95%|█████████▌| 5679/5971 [56:16<02:53,  1.68it/s, loss=0.22, v_num=0, train/loss_simple_step=0.351, train/loss_vlb_step=0.00209, train/loss_step=0.351, global_step=2262.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  95%|█████████▌| 5680/5971 [56:18<02:53,  1.68it/s, loss=0.219, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.00042, train/loss_step=0.128, global_step=2262.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▌| 5681/5971 [56:19<02:52,  1.68it/s, loss=0.219, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.00042, train/loss_step=0.128, global_step=2262.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▌| 5681/5971 [56:19<02:52,  1.68it/s, loss=0.177, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000401, train/loss_step=0.122, global_step=2263.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▌| 5682/5971 [56:20<02:51,  1.68it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00146, train/loss_vlb_step=8.79e-6, train/loss_step=0.00146, global_step=2263.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▌| 5683/5971 [56:20<02:51,  1.68it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=7.16e-5, train/loss_step=0.0166, global_step=2263.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  95%|█████████▌| 5684/5971 [56:23<02:50,  1.68it/s, loss=0.185, v_num=0, train/loss_simple_step=0.507, train/loss_vlb_step=0.00473, train/loss_step=0.507, global_step=2263.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  95%|█████████▌| 5685/5971 [56:24<02:50,  1.68it/s, loss=0.185, v_num=0, train/loss_simple_step=0.507, train/loss_vlb_step=0.00473, train/loss_step=0.507, global_step=2263.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▌| 5685/5971 [56:24<02:50,  1.68it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0259, train/loss_vlb_step=0.000103, train/loss_step=0.0259, global_step=2264.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▌| 5686/5971 [56:24<02:49,  1.68it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0656, train/loss_vlb_step=0.000226, train/loss_step=0.0656, global_step=2264.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▌| 5687/5971 [56:25<02:49,  1.68it/s, loss=0.2, v_num=0, train/loss_simple_step=0.440, train/loss_vlb_step=0.0041, train/loss_step=0.440, global_step=2264.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]      
Epoch 3:  95%|█████████▌| 5688/5971 [56:27<02:48,  1.68it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0273, train/loss_vlb_step=0.000101, train/loss_step=0.0273, global_step=2264.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▌| 5689/5971 [56:28<02:47,  1.68it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0273, train/loss_vlb_step=0.000101, train/loss_step=0.0273, global_step=2264.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▌| 5689/5971 [56:28<02:47,  1.68it/s, loss=0.212, v_num=0, train/loss_simple_step=0.432, train/loss_vlb_step=0.00249, train/loss_step=0.432, global_step=2265.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  95%|█████████▌| 5690/5971 [56:29<02:47,  1.68it/s, loss=0.206, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00237, train/loss_step=0.376, global_step=2265.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▌| 5691/5971 [56:30<02:46,  1.68it/s, loss=0.209, v_num=0, train/loss_simple_step=0.0659, train/loss_vlb_step=0.000226, train/loss_step=0.0659, global_step=2265.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▌| 5692/5971 [56:33<02:46,  1.68it/s, loss=0.216, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000568, train/loss_step=0.166, global_step=2265.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  95%|█████████▌| 5693/5971 [56:34<02:45,  1.68it/s, loss=0.216, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000568, train/loss_step=0.166, global_step=2265.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▌| 5693/5971 [56:34<02:45,  1.68it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.92e-5, train/loss_step=0.0108, global_step=2266.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▌| 5694/5971 [56:34<02:45,  1.68it/s, loss=0.208, v_num=0, train/loss_simple_step=0.0745, train/loss_vlb_step=0.000246, train/loss_step=0.0745, global_step=2266.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▌| 5695/5971 [56:35<02:44,  1.68it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0641, train/loss_vlb_step=0.000217, train/loss_step=0.0641, global_step=2266.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▌| 5696/5971 [56:37<02:44,  1.68it/s, loss=0.166, v_num=0, train/loss_simple_step=0.062, train/loss_vlb_step=0.000217, train/loss_step=0.062, global_step=2266.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  95%|█████████▌| 5697/5971 [56:38<02:43,  1.68it/s, loss=0.166, v_num=0, train/loss_simple_step=0.062, train/loss_vlb_step=0.000217, train/loss_step=0.062, global_step=2266.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▌| 5697/5971 [56:38<02:43,  1.68it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00403, train/loss_vlb_step=2.2e-5, train/loss_step=0.00403, global_step=2267.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▌| 5698/5971 [56:39<02:42,  1.68it/s, loss=0.16, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00102, train/loss_step=0.269, global_step=2267.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  95%|█████████▌| 5699/5971 [56:40<02:42,  1.68it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0471, train/loss_vlb_step=0.000171, train/loss_step=0.0471, global_step=2267.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▌| 5700/5971 [56:42<02:41,  1.68it/s, loss=0.15, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000861, train/loss_step=0.228, global_step=2267.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  95%|█████████▌| 5701/5971 [56:43<02:41,  1.68it/s, loss=0.15, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000861, train/loss_step=0.228, global_step=2267.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▌| 5701/5971 [56:43<02:41,  1.68it/s, loss=0.155, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000793, train/loss_step=0.222, global_step=2268.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  95%|█████████▌| 5702/5971 [56:44<02:40,  1.68it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0337, train/loss_vlb_step=0.000126, train/loss_step=0.0337, global_step=2268.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  96%|█████████▌| 5703/5971 [56:45<02:39,  1.68it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0195, train/loss_vlb_step=7.71e-5, train/loss_step=0.0195, global_step=2268.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  96%|█████████▌| 5704/5971 [56:48<02:39,  1.67it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0284, train/loss_vlb_step=0.00011, train/loss_step=0.0284, global_step=2268.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  96%|█████████▌| 5705/5971 [56:49<02:38,  1.67it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0284, train/loss_vlb_step=0.00011, train/loss_step=0.0284, global_step=2268.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  96%|█████████▌| 5705/5971 [56:49<02:38,  1.67it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00641, train/loss_vlb_step=3.07e-5, train/loss_step=0.00641, global_step=2269.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  96%|█████████▌| 5706/5971 [56:50<02:38,  1.67it/s, loss=0.137, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.000657, train/loss_step=0.171, global_step=2269.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  96%|█████████▌| 5707/5971 [56:51<02:37,  1.67it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0295, train/loss_vlb_step=0.00011, train/loss_step=0.0295, global_step=2269.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  96%|█████████▌| 5708/5971 [56:53<02:37,  1.67it/s, loss=0.132, v_num=0, train/loss_simple_step=0.331, train/loss_vlb_step=0.00438, train/loss_step=0.331, global_step=2269.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  96%|█████████▌| 5709/5971 [56:54<02:36,  1.67it/s, loss=0.132, v_num=0, train/loss_simple_step=0.331, train/loss_vlb_step=0.00438, train/loss_step=0.331, global_step=2269.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  96%|█████████▌| 5709/5971 [56:54<02:36,  1.67it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0499, train/loss_vlb_step=0.000181, train/loss_step=0.0499, global_step=2270.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  96%|█████████▌| 5710/5971 [56:55<02:36,  1.67it/s, loss=0.0956, v_num=0, train/loss_simple_step=0.0284, train/loss_vlb_step=0.000109, train/loss_step=0.0284, global_step=2270.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  96%|█████████▌| 5711/5971 [56:56<02:35,  1.67it/s, loss=0.0923, v_num=0, train/loss_simple_step=0.00173, train/loss_vlb_step=1.04e-5, train/loss_step=0.00173, global_step=2270.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  96%|█████████▌| 5712/5971 [56:58<02:34,  1.67it/s, loss=0.0848, v_num=0, train/loss_simple_step=0.0148, train/loss_vlb_step=6.35e-5, train/loss_step=0.0148, global_step=2270.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  96%|█████████▌| 5713/5971 [56:59<02:34,  1.67it/s, loss=0.0848, v_num=0, train/loss_simple_step=0.0148, train/loss_vlb_step=6.35e-5, train/loss_step=0.0148, global_step=2270.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  96%|█████████▌| 5713/5971 [56:59<02:34,  1.67it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.0513, train/loss_vlb_step=0.000177, train/loss_step=0.0513, global_step=2271.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  96%|█████████▌| 5714/5971 [57:00<02:33,  1.67it/s, loss=0.104, v_num=0, train/loss_simple_step=0.413, train/loss_vlb_step=0.00319, train/loss_step=0.413, global_step=2271.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  96%|█████████▌| 5715/5971 [57:01<02:33,  1.67it/s, loss=0.115, v_num=0, train/loss_simple_step=0.298, train/loss_vlb_step=0.00147, train/loss_step=0.298, global_step=2271.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  96%|█████████▌| 5716/5971 [57:03<02:32,  1.67it/s, loss=0.119, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000425, train/loss_step=0.128, global_step=2271.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  96%|█████████▌| 5717/5971 [57:04<02:32,  1.67it/s, loss=0.119, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000425, train/loss_step=0.128, global_step=2271.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  96%|█████████▌| 5717/5971 [57:04<02:32,  1.67it/s, loss=0.149, v_num=0, train/loss_simple_step=0.604, train/loss_vlb_step=0.00801, train/loss_step=0.604, global_step=2272.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  96%|█████████▌| 5718/5971 [57:05<02:31,  1.67it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00149, train/loss_vlb_step=8.89e-6, train/loss_step=0.00149, global_step=2272.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  96%|█████████▌| 5719/5971 [57:06<02:30,  1.67it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00338, train/loss_vlb_step=1.77e-5, train/loss_step=0.00338, global_step=2272.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  96%|█████████▌| 5720/5971 [57:10<02:30,  1.67it/s, loss=0.129, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000459, train/loss_step=0.138, global_step=2272.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  96%|█████████▌| 5721/5971 [57:11<02:29,  1.67it/s, loss=0.129, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000459, train/loss_step=0.138, global_step=2272.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  96%|█████████▌| 5721/5971 [57:11<02:29,  1.67it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00743, train/loss_vlb_step=3.7e-5, train/loss_step=0.00743, global_step=2273.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  96%|█████████▌| 5722/5971 [57:13<02:29,  1.67it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0168, train/loss_vlb_step=7.25e-5, train/loss_step=0.0168, global_step=2273.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  96%|█████████▌| 5723/5971 [57:14<02:28,  1.67it/s, loss=0.132, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.00138, train/loss_step=0.313, global_step=2273.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  96%|█████████▌| 5724/5971 [57:18<02:28,  1.67it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00902, train/loss_vlb_step=4.18e-5, train/loss_step=0.00902, global_step=2273.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  96%|█████████▌| 5725/5971 [57:19<02:27,  1.66it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00902, train/loss_vlb_step=4.18e-5, train/loss_step=0.00902, global_step=2273.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  96%|█████████▌| 5725/5971 [57:19<02:27,  1.66it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0803, train/loss_vlb_step=0.000264, train/loss_step=0.0803, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  96%|█████████▌| 5726/5971 [57:21<02:27,  1.66it/s, loss=0.139, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00127, train/loss_step=0.267, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  96%|█████████▌| 5727/5971 [57:23<02:26,  1.66it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0031, train/loss_vlb_step=1.68e-5, train/loss_step=0.0031, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  96%|█████████▌| 5728/5971 [57:26<02:26,  1.66it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  96%|█████████▌| 5729/5971 [57:26<02:25,  1.66it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:29,  1.85it/s]

Validating:   1%|          | 2/167 [00:01<01:23,  1.97it/s]

Validating:   2%|▏         | 4/167 [00:01<00:38,  4.25it/s]
Epoch 3:  96%|█████████▌| 5733/5971 [57:27<02:23,  1.66it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   4%|▎         | 6/167 [00:01<00:24,  6.52it/s]

Validating:   5%|▍         | 8/167 [00:01<00:18,  8.41it/s]
Epoch 3:  96%|█████████▌| 5737/5971 [57:27<02:20,  1.66it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   6%|▌         | 10/167 [00:01<00:15, 10.33it/s]

Validating:   7%|▋         | 12/167 [00:01<00:12, 12.21it/s]
Epoch 3:  96%|█████████▌| 5741/5971 [57:28<02:18,  1.67it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:   8%|▊         | 14/167 [00:01<00:11, 12.94it/s]

Validating:  10%|▉         | 16/167 [00:01<00:11, 13.70it/s]
Epoch 3:  96%|█████████▌| 5745/5971 [57:28<02:15,  1.67it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  11%|█         | 18/167 [00:02<00:10, 13.57it/s]
Epoch 3:  96%|█████████▋| 5749/5971 [57:28<02:13,  1.67it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  13%|█▎        | 21/167 [00:02<00:09, 15.64it/s]

Validating:  14%|█▍        | 23/167 [00:02<00:10, 14.13it/s]
Epoch 3:  96%|█████████▋| 5753/5971 [57:28<02:10,  1.67it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  15%|█▍        | 25/167 [00:02<00:11, 12.23it/s]

Validating:  16%|█▌        | 27/167 [00:02<00:10, 12.85it/s]
Epoch 3:  96%|█████████▋| 5757/5971 [57:29<02:08,  1.67it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  17%|█▋        | 29/167 [00:02<00:10, 13.35it/s]

Validating:  19%|█▊        | 31/167 [00:03<00:11, 12.18it/s]
Epoch 3:  96%|█████████▋| 5761/5971 [57:29<02:05,  1.67it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  20%|█▉        | 33/167 [00:03<00:09, 13.41it/s]

Validating:  21%|██        | 35/167 [00:03<00:09, 14.52it/s]
Epoch 3:  97%|█████████▋| 5765/5971 [57:29<02:03,  1.67it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  22%|██▏       | 37/167 [00:03<00:08, 14.97it/s]

Validating:  23%|██▎       | 39/167 [00:03<00:08, 14.43it/s]
Epoch 3:  97%|█████████▋| 5769/5971 [57:30<02:00,  1.67it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  25%|██▍       | 41/167 [00:03<00:09, 13.65it/s]

Validating:  26%|██▌       | 43/167 [00:03<00:08, 14.80it/s]
Epoch 3:  97%|█████████▋| 5773/5971 [57:30<01:58,  1.67it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  27%|██▋       | 45/167 [00:03<00:07, 15.77it/s]

Validating:  29%|██▊       | 48/167 [00:04<00:07, 16.36it/s]
Epoch 3:  97%|█████████▋| 5777/5971 [57:30<01:55,  1.67it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  31%|███       | 51/167 [00:04<00:06, 18.37it/s]
Epoch 3:  97%|█████████▋| 5781/5971 [57:30<01:53,  1.68it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  32%|███▏      | 53/167 [00:04<00:06, 17.51it/s]

Validating:  33%|███▎      | 55/167 [00:04<00:06, 17.09it/s]
Epoch 3:  97%|█████████▋| 5785/5971 [57:30<01:50,  1.68it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  34%|███▍      | 57/167 [00:04<00:06, 17.62it/s]

Validating:  36%|███▌      | 60/167 [00:04<00:05, 18.28it/s]
Epoch 3:  97%|█████████▋| 5789/5971 [57:31<01:48,  1.68it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  38%|███▊      | 63/167 [00:04<00:05, 18.09it/s]
Epoch 3:  97%|█████████▋| 5793/5971 [57:31<01:46,  1.68it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  39%|███▉      | 65/167 [00:05<00:05, 17.63it/s]

Validating:  41%|████      | 68/167 [00:05<00:05, 17.86it/s]
Epoch 3:  97%|█████████▋| 5797/5971 [57:31<01:43,  1.68it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  42%|████▏     | 70/167 [00:05<00:07, 13.69it/s][A

Validating:  43%|████▎     | 72/167 [00:05<00:06, 13.66it/s][A
Epoch 3:  97%|█████████▋| 5801/5971 [57:32<01:41,  1.68it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  44%|████▍     | 74/167 [00:05<00:06, 13.50it/s][A

Validating:  46%|████▌     | 76/167 [00:05<00:06, 14.56it/s][A
Epoch 3:  97%|█████████▋| 5805/5971 [57:32<01:38,  1.68it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  47%|████▋     | 78/167 [00:06<00:06, 14.32it/s][A

Validating:  48%|████▊     | 80/167 [00:06<00:05, 15.46it/s][A
Epoch 3:  97%|█████████▋| 5809/5971 [57:32<01:36,  1.68it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  50%|████▉     | 83/167 [00:06<00:05, 15.91it/s][A
Epoch 3:  97%|█████████▋| 5813/5971 [57:32<01:33,  1.68it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  51%|█████     | 85/167 [00:06<00:05, 15.66it/s][A

Validating:  52%|█████▏    | 87/167 [00:06<00:06, 12.15it/s][A
Epoch 3:  97%|█████████▋| 5817/5971 [57:33<01:31,  1.68it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  53%|█████▎    | 89/167 [00:07<00:07,  9.93it/s][A

Validating:  54%|█████▍    | 91/167 [00:07<00:06, 11.39it/s][A
Epoch 3:  97%|█████████▋| 5821/5971 [57:33<01:28,  1.69it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  56%|█████▌    | 93/167 [00:07<00:06, 12.14it/s][A

Validating:  57%|█████▋    | 95/167 [00:07<00:05, 13.17it/s][A
Epoch 3:  98%|█████████▊| 5825/5971 [57:33<01:26,  1.69it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  58%|█████▊    | 97/167 [00:07<00:05, 13.56it/s][A

Validating:  60%|█████▉    | 100/167 [00:07<00:04, 15.82it/s][A
Epoch 3:  98%|█████████▊| 5829/5971 [57:34<01:24,  1.69it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  62%|██████▏   | 103/167 [00:07<00:03, 16.88it/s][A
Epoch 3:  98%|█████████▊| 5833/5971 [57:34<01:21,  1.69it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  63%|██████▎   | 106/167 [00:07<00:03, 17.66it/s][A

Validating:  65%|██████▍   | 108/167 [00:08<00:03, 17.94it/s][A
Epoch 3:  98%|█████████▊| 5837/5971 [57:34<01:19,  1.69it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  66%|██████▌   | 110/167 [00:08<00:03, 18.32it/s][A

Validating:  67%|██████▋   | 112/167 [00:08<00:02, 18.56it/s][A
Epoch 3:  98%|█████████▊| 5841/5971 [57:34<01:16,  1.69it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  69%|██████▉   | 115/167 [00:08<00:02, 18.79it/s][A
Epoch 3:  98%|█████████▊| 5845/5971 [57:34<01:14,  1.69it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  70%|███████   | 117/167 [00:08<00:02, 18.48it/s][A

Validating:  72%|███████▏  | 120/167 [00:08<00:02, 19.51it/s][A
Epoch 3:  98%|█████████▊| 5849/5971 [57:35<01:12,  1.69it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  74%|███████▎  | 123/167 [00:08<00:02, 20.21it/s][A
Epoch 3:  98%|█████████▊| 5853/5971 [57:35<01:09,  1.69it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  75%|███████▌  | 126/167 [00:08<00:02, 20.18it/s][A
Epoch 3:  98%|█████████▊| 5857/5971 [57:35<01:07,  1.70it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  77%|███████▋  | 129/167 [00:09<00:01, 19.20it/s][A

Validating:  79%|███████▉  | 132/167 [00:09<00:01, 19.78it/s][A
Epoch 3:  98%|█████████▊| 5861/5971 [57:35<01:04,  1.70it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  81%|████████  | 135/167 [00:09<00:01, 19.46it/s][A
Epoch 3:  98%|█████████▊| 5865/5971 [57:35<01:02,  1.70it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  82%|████████▏ | 137/167 [00:09<00:01, 18.86it/s][A

Validating:  83%|████████▎ | 139/167 [00:09<00:01, 18.40it/s][A
Epoch 3:  98%|█████████▊| 5869/5971 [57:36<01:00,  1.70it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  84%|████████▍ | 141/167 [00:09<00:01, 17.97it/s][A

Validating:  86%|████████▌ | 143/167 [00:09<00:01, 16.18it/s][A
Epoch 3:  98%|█████████▊| 5873/5971 [57:36<00:57,  1.70it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  87%|████████▋ | 145/167 [00:10<00:01, 15.22it/s][A

Validating:  88%|████████▊ | 147/167 [00:10<00:01, 15.47it/s][A
Epoch 3:  98%|█████████▊| 5877/5971 [57:36<00:55,  1.70it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  89%|████████▉ | 149/167 [00:10<00:01, 15.62it/s][A

Validating:  91%|█████████ | 152/167 [00:10<00:00, 16.85it/s][A
Epoch 3:  98%|█████████▊| 5881/5971 [57:36<00:52,  1.70it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  92%|█████████▏| 154/167 [00:10<00:00, 17.24it/s][A

Validating:  93%|█████████▎| 156/167 [00:10<00:00, 14.90it/s][A
Epoch 3:  99%|█████████▊| 5885/5971 [57:37<00:50,  1.70it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  95%|█████████▍| 158/167 [00:10<00:00, 15.43it/s][A

Validating:  96%|█████████▌| 160/167 [00:11<00:00, 14.81it/s][A
Epoch 3:  99%|█████████▊| 5889/5971 [57:37<00:48,  1.70it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating:  97%|█████████▋| 162/167 [00:11<00:00, 15.62it/s][A

Validating:  98%|█████████▊| 164/167 [00:11<00:00, 16.47it/s][A
Epoch 3:  99%|█████████▊| 5893/5971 [57:37<00:45,  1.70it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Validating: 100%|██████████| 167/167 [00:11<00:00, 18.58it/s][A
Epoch 3:  99%|█████████▊| 5896/5971 [57:39<00:43,  1.70it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]

Epoch 3:  99%|█████████▉| 5897/5971 [57:40<00:43,  1.70it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=2274.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5897/5971 [57:40<00:43,  1.70it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.26e-5, train/loss_step=0.0124, global_step=2275.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  99%|█████████▉| 5898/5971 [57:42<00:42,  1.70it/s, loss=0.135, v_num=0, train/loss_simple_step=0.277, train/loss_vlb_step=0.00103, train/loss_step=0.277, global_step=2275.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  99%|█████████▉| 5899/5971 [57:43<00:42,  1.70it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0915, train/loss_vlb_step=0.000307, train/loss_step=0.0915, global_step=2275.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5900/5971 [57:48<00:41,  1.70it/s, loss=0.156, v_num=0, train/loss_simple_step=0.341, train/loss_vlb_step=0.00141, train/loss_step=0.341, global_step=2275.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  99%|█████████▉| 5901/5971 [57:50<00:41,  1.70it/s, loss=0.156, v_num=0, train/loss_simple_step=0.341, train/loss_vlb_step=0.00141, train/loss_step=0.341, global_step=2275.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5901/5971 [57:50<00:41,  1.70it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0788, train/loss_vlb_step=0.000261, train/loss_step=0.0788, global_step=2276.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5902/5971 [57:51<00:40,  1.70it/s, loss=0.163, v_num=0, train/loss_simple_step=0.525, train/loss_vlb_step=0.00445, train/loss_step=0.525, global_step=2276.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  99%|█████████▉| 5903/5971 [57:52<00:39,  1.70it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00595, train/loss_vlb_step=2.86e-5, train/loss_step=0.00595, global_step=2276.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5904/5971 [57:56<00:39,  1.70it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=5.96e-5, train/loss_step=0.0138, global_step=2276.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  99%|█████████▉| 5905/5971 [57:57<00:38,  1.70it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=5.96e-5, train/loss_step=0.0138, global_step=2276.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5905/5971 [57:57<00:38,  1.70it/s, loss=0.135, v_num=0, train/loss_simple_step=0.461, train/loss_vlb_step=0.00281, train/loss_step=0.461, global_step=2277.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  99%|█████████▉| 5906/5971 [57:58<00:38,  1.70it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0191, train/loss_vlb_step=7.8e-5, train/loss_step=0.0191, global_step=2277.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5907/5971 [58:00<00:37,  1.70it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00263, train/loss_vlb_step=1.5e-5, train/loss_step=0.00263, global_step=2277.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5908/5971 [58:03<00:37,  1.70it/s, loss=0.141, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.000856, train/loss_step=0.243, global_step=2277.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  99%|█████████▉| 5909/5971 [58:04<00:36,  1.70it/s, loss=0.141, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.000856, train/loss_step=0.243, global_step=2277.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5909/5971 [58:04<00:36,  1.70it/s, loss=0.158, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00151, train/loss_step=0.345, global_step=2278.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  99%|█████████▉| 5910/5971 [58:06<00:35,  1.70it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0824, train/loss_vlb_step=0.000275, train/loss_step=0.0824, global_step=2278.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5911/5971 [58:07<00:35,  1.70it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0083, train/loss_vlb_step=3.96e-5, train/loss_step=0.0083, global_step=2278.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  99%|█████████▉| 5912/5971 [58:11<00:34,  1.69it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00566, train/loss_vlb_step=2.83e-5, train/loss_step=0.00566, global_step=2278.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5913/5971 [58:12<00:34,  1.69it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00566, train/loss_vlb_step=2.83e-5, train/loss_step=0.00566, global_step=2278.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5913/5971 [58:12<00:34,  1.69it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0433, train/loss_vlb_step=0.00016, train/loss_step=0.0433, global_step=2279.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  99%|█████████▉| 5914/5971 [58:13<00:33,  1.69it/s, loss=0.146, v_num=0, train/loss_simple_step=0.304, train/loss_vlb_step=0.00137, train/loss_step=0.304, global_step=2279.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  99%|█████████▉| 5915/5971 [58:14<00:33,  1.69it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0211, train/loss_vlb_step=8.78e-5, train/loss_step=0.0211, global_step=2279.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5916/5971 [58:19<00:32,  1.69it/s, loss=0.18, v_num=0, train/loss_simple_step=0.712, train/loss_vlb_step=0.0135, train/loss_step=0.712, global_step=2279.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  99%|█████████▉| 5917/5971 [58:21<00:31,  1.69it/s, loss=0.18, v_num=0, train/loss_simple_step=0.712, train/loss_vlb_step=0.0135, train/loss_step=0.712, global_step=2279.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5917/5971 [58:21<00:31,  1.69it/s, loss=0.179, v_num=0, train/loss_simple_step=0.00533, train/loss_vlb_step=2.78e-5, train/loss_step=0.00533, global_step=2280.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5918/5971 [58:22<00:31,  1.69it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0204, train/loss_vlb_step=8.2e-5, train/loss_step=0.0204, global_step=2280.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  99%|█████████▉| 5919/5971 [58:23<00:30,  1.69it/s, loss=0.168, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000401, train/loss_step=0.121, global_step=2280.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5920/5971 [58:27<00:30,  1.69it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0725, train/loss_vlb_step=0.000238, train/loss_step=0.0725, global_step=2280.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5921/5971 [58:28<00:29,  1.69it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0725, train/loss_vlb_step=0.000238, train/loss_step=0.0725, global_step=2280.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5921/5971 [58:28<00:29,  1.69it/s, loss=0.17, v_num=0, train/loss_simple_step=0.383, train/loss_vlb_step=0.00171, train/loss_step=0.383, global_step=2281.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  99%|█████████▉| 5922/5971 [58:30<00:29,  1.69it/s, loss=0.15, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000413, train/loss_step=0.125, global_step=2281.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5923/5971 [58:31<00:28,  1.69it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0494, train/loss_vlb_step=0.000184, train/loss_step=0.0494, global_step=2281.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5924/5971 [58:34<00:27,  1.69it/s, loss=0.171, v_num=0, train/loss_simple_step=0.398, train/loss_vlb_step=0.00289, train/loss_step=0.398, global_step=2281.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  99%|█████████▉| 5925/5971 [58:36<00:27,  1.69it/s, loss=0.171, v_num=0, train/loss_simple_step=0.398, train/loss_vlb_step=0.00289, train/loss_step=0.398, global_step=2281.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5925/5971 [58:36<00:27,  1.69it/s, loss=0.161, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.00108, train/loss_step=0.249, global_step=2282.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5926/5971 [58:37<00:26,  1.68it/s, loss=0.16, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=7.04e-5, train/loss_step=0.016, global_step=2282.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  99%|█████████▉| 5927/5971 [58:38<00:26,  1.68it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000179, train/loss_step=0.0497, global_step=2282.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5928/5971 [58:43<00:25,  1.68it/s, loss=0.164, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.00167, train/loss_step=0.278, global_step=2282.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3:  99%|█████████▉| 5929/5971 [58:44<00:24,  1.68it/s, loss=0.164, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.00167, train/loss_step=0.278, global_step=2282.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5929/5971 [58:44<00:24,  1.68it/s, loss=0.161, v_num=0, train/loss_simple_step=0.276, train/loss_vlb_step=0.00105, train/loss_step=0.276, global_step=2283.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5930/5971 [58:45<00:24,  1.68it/s, loss=0.165, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000506, train/loss_step=0.153, global_step=2283.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5931/5971 [58:47<00:23,  1.68it/s, loss=0.193, v_num=0, train/loss_simple_step=0.573, train/loss_vlb_step=0.00666, train/loss_step=0.573, global_step=2283.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  99%|█████████▉| 5932/5971 [58:51<00:23,  1.68it/s, loss=0.198, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000369, train/loss_step=0.112, global_step=2283.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5933/5971 [58:52<00:22,  1.68it/s, loss=0.198, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000369, train/loss_step=0.112, global_step=2283.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5933/5971 [58:52<00:22,  1.68it/s, loss=0.223, v_num=0, train/loss_simple_step=0.550, train/loss_vlb_step=0.00629, train/loss_step=0.550, global_step=2284.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3:  99%|█████████▉| 5934/5971 [58:53<00:22,  1.68it/s, loss=0.218, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000688, train/loss_step=0.194, global_step=2284.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5935/5971 [58:55<00:21,  1.68it/s, loss=0.224, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000505, train/loss_step=0.150, global_step=2284.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5936/5971 [58:59<00:20,  1.68it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0107, train/loss_vlb_step=4.84e-5, train/loss_step=0.0107, global_step=2284.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5937/5971 [59:00<00:20,  1.68it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0107, train/loss_vlb_step=4.84e-5, train/loss_step=0.0107, global_step=2284.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5937/5971 [59:00<00:20,  1.68it/s, loss=0.219, v_num=0, train/loss_simple_step=0.600, train/loss_vlb_step=0.00664, train/loss_step=0.600, global_step=2285.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:  99%|█████████▉| 5938/5971 [59:01<00:19,  1.68it/s, loss=0.218, v_num=0, train/loss_simple_step=0.00741, train/loss_vlb_step=3.43e-5, train/loss_step=0.00741, global_step=2285.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5939/5971 [59:02<00:19,  1.68it/s, loss=0.235, v_num=0, train/loss_simple_step=0.450, train/loss_vlb_step=0.00425, train/loss_step=0.450, global_step=2285.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3:  99%|█████████▉| 5940/5971 [59:06<00:18,  1.68it/s, loss=0.236, v_num=0, train/loss_simple_step=0.0946, train/loss_vlb_step=0.000312, train/loss_step=0.0946, global_step=2285.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5941/5971 [59:07<00:17,  1.67it/s, loss=0.236, v_num=0, train/loss_simple_step=0.0946, train/loss_vlb_step=0.000312, train/loss_step=0.0946, global_step=2285.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3:  99%|█████████▉| 5941/5971 [59:07<00:17,  1.67it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0492, train/loss_vlb_step=0.000175, train/loss_step=0.0492, global_step=2286.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|█████████▉| 5942/5971 [59:08<00:17,  1.67it/s, loss=0.213, v_num=0, train/loss_simple_step=0.00738, train/loss_vlb_step=3.31e-5, train/loss_step=0.00738, global_step=2286.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|█████████▉| 5943/5971 [59:09<00:16,  1.67it/s, loss=0.221, v_num=0, train/loss_simple_step=0.195, train/loss_vlb_step=0.000676, train/loss_step=0.195, global_step=2286.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3: 100%|█████████▉| 5944/5971 [59:13<00:16,  1.67it/s, loss=0.207, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000381, train/loss_step=0.115, global_step=2286.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|█████████▉| 5945/5971 [59:14<00:15,  1.67it/s, loss=0.207, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000381, train/loss_step=0.115, global_step=2286.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|█████████▉| 5945/5971 [59:14<00:15,  1.67it/s, loss=0.194, v_num=0, train/loss_simple_step=0.00414, train/loss_vlb_step=2.18e-5, train/loss_step=0.00414, global_step=2287.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|█████████▉| 5946/5971 [59:16<00:14,  1.67it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0178, train/loss_vlb_step=7.74e-5, train/loss_step=0.0178, global_step=2287.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3: 100%|█████████▉| 5947/5971 [59:17<00:14,  1.67it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0408, train/loss_vlb_step=0.000145, train/loss_step=0.0408, global_step=2287.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|█████████▉| 5948/5971 [59:20<00:13,  1.67it/s, loss=0.18, v_num=0, train/loss_simple_step=0.00461, train/loss_vlb_step=2.22e-5, train/loss_step=0.00461, global_step=2287.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|█████████▉| 5949/5971 [59:21<00:13,  1.67it/s, loss=0.18, v_num=0, train/loss_simple_step=0.00461, train/loss_vlb_step=2.22e-5, train/loss_step=0.00461, global_step=2287.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|█████████▉| 5949/5971 [59:21<00:13,  1.67it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00284, train/loss_vlb_step=1.56e-5, train/loss_step=0.00284, global_step=2288.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|█████████▉| 5950/5971 [59:23<00:12,  1.67it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00283, train/loss_vlb_step=1.6e-5, train/loss_step=0.00283, global_step=2288.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3: 100%|█████████▉| 5951/5971 [59:24<00:11,  1.67it/s, loss=0.136, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000342, train/loss_step=0.104, global_step=2288.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3: 100%|█████████▉| 5952/5971 [59:28<00:11,  1.67it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0355, train/loss_vlb_step=0.000128, train/loss_step=0.0355, global_step=2288.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|█████████▉| 5953/5971 [59:29<00:10,  1.67it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0355, train/loss_vlb_step=0.000128, train/loss_step=0.0355, global_step=2288.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|█████████▉| 5953/5971 [59:29<00:10,  1.67it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00309, train/loss_vlb_step=1.71e-5, train/loss_step=0.00309, global_step=2289.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|█████████▉| 5954/5971 [59:30<00:10,  1.67it/s, loss=0.107, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.00114, train/loss_step=0.243, global_step=2289.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]    
Epoch 3: 100%|█████████▉| 5955/5971 [59:31<00:09,  1.67it/s, loss=0.115, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00146, train/loss_step=0.320, global_step=2289.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|█████████▉| 5956/5971 [59:35<00:09,  1.67it/s, loss=0.145, v_num=0, train/loss_simple_step=0.603, train/loss_vlb_step=0.011, train/loss_step=0.603, global_step=2289.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3: 100%|█████████▉| 5957/5971 [59:36<00:08,  1.67it/s, loss=0.145, v_num=0, train/loss_simple_step=0.603, train/loss_vlb_step=0.011, train/loss_step=0.603, global_step=2289.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|█████████▉| 5957/5971 [59:36<00:08,  1.67it/s, loss=0.124, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000662, train/loss_step=0.185, global_step=2290.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|█████████▉| 5958/5971 [59:38<00:07,  1.67it/s, loss=0.147, v_num=0, train/loss_simple_step=0.463, train/loss_vlb_step=0.00431, train/loss_step=0.463, global_step=2290.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3: 100%|█████████▉| 5959/5971 [59:39<00:07,  1.67it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0364, train/loss_vlb_step=0.00014, train/loss_step=0.0364, global_step=2290.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|█████████▉| 5960/5971 [59:42<00:06,  1.66it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0479, train/loss_vlb_step=0.000164, train/loss_step=0.0479, global_step=2290.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|█████████▉| 5961/5971 [59:44<00:06,  1.66it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0479, train/loss_vlb_step=0.000164, train/loss_step=0.0479, global_step=2290.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|█████████▉| 5961/5971 [59:44<00:06,  1.66it/s, loss=0.136, v_num=0, train/loss_simple_step=0.284, train/loss_vlb_step=0.00106, train/loss_step=0.284, global_step=2291.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3: 100%|█████████▉| 5962/5971 [59:46<00:05,  1.66it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00488, train/loss_vlb_step=2.5e-5, train/loss_step=0.00488, global_step=2291.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|█████████▉| 5963/5971 [59:47<00:04,  1.66it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00397, train/loss_vlb_step=2.08e-5, train/loss_step=0.00397, global_step=2291.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|█████████▉| 5964/5971 [59:50<00:04,  1.66it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.11e-5, train/loss_step=0.0119, global_step=2291.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3: 100%|█████████▉| 5965/5971 [59:52<00:03,  1.66it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.11e-5, train/loss_step=0.0119, global_step=2291.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|█████████▉| 5965/5971 [59:52<00:03,  1.66it/s, loss=0.126, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000359, train/loss_step=0.109, global_step=2292.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3: 100%|█████████▉| 5966/5971 [59:53<00:03,  1.66it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0188, train/loss_vlb_step=7.77e-5, train/loss_step=0.0188, global_step=2292.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|█████████▉| 5967/5971 [59:55<00:02,  1.66it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0551, train/loss_vlb_step=0.00019, train/loss_step=0.0551, global_step=2292.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|█████████▉| 5968/5971 [59:58<00:01,  1.66it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0989, train/loss_vlb_step=0.000325, train/loss_step=0.0989, global_step=2292.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|█████████▉| 5969/5971 [59:59<00:01,  1.66it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0989, train/loss_vlb_step=0.000325, train/loss_step=0.0989, global_step=2292.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|█████████▉| 5969/5971 [59:59<00:01,  1.66it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0881, train/loss_vlb_step=0.000292, train/loss_step=0.0881, global_step=2293.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|█████████▉| 5970/5971 [1:00:01<00:00,  1.66it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0493, train/loss_vlb_step=0.00017, train/loss_step=0.0493, global_step=2293.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|██████████| 5971/5971 [1:00:02<00:00,  1.66it/s, loss=0.14, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000438, train/loss_step=0.133, global_step=2293.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3: 100%|██████████| 5971/5971 [1:00:06<00:00,  1.66it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0397, train/loss_vlb_step=0.000144, train/loss_step=0.0397, global_step=2293.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|██████████| 5971/5971 [1:00:07<00:00,  1.66it/s, loss=0.145, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000376, train/loss_step=0.114, global_step=2294.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3: 100%|██████████| 5971/5971 [1:00:09<00:00,  1.65it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00262, train/loss_vlb_step=1.41e-5, train/loss_step=0.00262, global_step=2294.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|██████████| 5971/5971 [1:00:10<00:00,  1.65it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0669, train/loss_vlb_step=0.000222, train/loss_step=0.0669, global_step=2294.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3: 100%|██████████| 5971/5971 [1:00:14<00:00,  1.65it/s, loss=0.0935, v_num=0, train/loss_simple_step=0.0579, train/loss_vlb_step=0.000196, train/loss_step=0.0579, global_step=2294.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|██████████| 5971/5971 [1:00:16<00:00,  1.65it/s, loss=0.0948, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.00079, train/loss_step=0.211, global_step=2295.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3: 100%|██████████| 5971/5971 [1:00:17<00:00,  1.65it/s, loss=0.0718, v_num=0, train/loss_simple_step=0.00155, train/loss_vlb_step=9.37e-6, train/loss_step=0.00155, global_step=2295.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|██████████| 5971/5971 [1:00:17<00:00,  1.65it/s, loss=0.0718, v_num=0, train/loss_simple_step=0.00155, train/loss_vlb_step=9.37e-6, train/loss_step=0.00155, global_step=2295.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|██████████| 5971/5971 [1:00:18<00:00,  1.65it/s, loss=0.0777, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000518, train/loss_step=0.156, global_step=2295.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3: 100%|██████████| 5971/5971 [1:00:21<00:00,  1.65it/s, loss=0.0767, v_num=0, train/loss_simple_step=0.0276, train/loss_vlb_step=0.000101, train/loss_step=0.0276, global_step=2295.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|██████████| 5971/5971 [1:00:23<00:00,  1.65it/s, loss=0.072, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000691, train/loss_step=0.189, global_step=2296.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3: 100%|██████████| 5971/5971 [1:00:24<00:00,  1.65it/s, loss=0.0759, v_num=0, train/loss_simple_step=0.0834, train/loss_vlb_step=0.000274, train/loss_step=0.0834, global_step=2296.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|██████████| 5971/5971 [1:00:25<00:00,  1.65it/s, loss=0.087, v_num=0, train/loss_simple_step=0.226, train/loss_vlb_step=0.000843, train/loss_step=0.226, global_step=2296.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3: 100%|██████████| 5971/5971 [1:00:29<00:00,  1.65it/s, loss=0.0939, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000492, train/loss_step=0.150, global_step=2296.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|██████████| 5971/5971 [1:00:30<00:00,  1.64it/s, loss=0.0886, v_num=0, train/loss_simple_step=0.00353, train/loss_vlb_step=1.87e-5, train/loss_step=0.00353, global_step=2297.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|██████████| 5971/5971 [1:00:31<00:00,  1.64it/s, loss=0.0966, v_num=0, train/loss_simple_step=0.178, train/loss_vlb_step=0.000602, train/loss_step=0.178, global_step=2297.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 3: 100%|██████████| 5971/5971 [1:00:32<00:00,  1.64it/s, loss=0.094, v_num=0, train/loss_simple_step=0.0024, train/loss_vlb_step=1.4e-5, train/loss_step=0.0024, global_step=2297.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151] 
Epoch 3: 100%|██████████| 5971/5971 [1:00:35<00:00,  1.64it/s, loss=0.101, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.000856, train/loss_step=0.247, global_step=2297.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|██████████| 5971/5971 [1:00:37<00:00,  1.64it/s, loss=0.0986, v_num=0, train/loss_simple_step=0.0322, train/loss_vlb_step=0.00012, train/loss_step=0.0322, global_step=2298.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|██████████| 5971/5971 [1:00:38<00:00,  1.64it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0786, train/loss_vlb_step=0.000258, train/loss_step=0.0786, global_step=2298.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3: 100%|██████████| 5971/5971 [1:00:40<00:00,  1.64it/s, loss=0.0939, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.63e-5, train/loss_step=0.0102, global_step=2298.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|██████████| 5971/5971 [1:00:42<00:00,  1.64it/s, loss=0.0965, v_num=0, train/loss_simple_step=0.0911, train/loss_vlb_step=0.000302, train/loss_step=0.0911, global_step=2298.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 3: 100%|██████████| 5971/5971 [1:00:47<00:00,  1.64it/s, loss=0.0966, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000393, train/loss_step=0.118, global_step=2299.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]  
Epoch 3:   0%|          | 0/5971 [00:00<00:00, 8542.37it/s, loss=0.0966, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000393, train/loss_step=0.118, global_step=2299.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]   
Epoch 4:   0%|          | 0/5971 [00:00<00:03, 1536.38it/s, loss=0.0966, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000393, train/loss_step=0.118, global_step=2299.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler: 100%|██████████| 50/50 [00:19<00:00,  2.59it/s]

Epoch 4:   0%|          | 1/5971 [00:24<20:12:36, 12.19s/it, loss=0.0966, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000393, train/loss_step=0.118, global_step=2299.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.00238, train/loss_epoch=0.151]
Epoch 4:   0%|          | 1/5971 [00:24<20:12:42, 12.19s/it, loss=0.144, v_num=0, train/loss_simple_step=0.957, train/loss_vlb_step=0.482, train/loss_step=0.957, global_step=2300.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler: 100%|██████████| 50/50 [00:15<00:00,  3.18it/s]

Epoch 4:   0%|          | 2/5971 [00:43<23:54:58, 14.42s/it, loss=0.144, v_num=0, train/loss_simple_step=0.957, train/loss_vlb_step=0.482, train/loss_step=0.957, global_step=2300.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 2/5971 [00:43<23:55:01, 14.42s/it, loss=0.141, v_num=0, train/loss_simple_step=0.0014, train/loss_vlb_step=8.5e-6, train/loss_step=0.0014, global_step=2300.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler: 100%|██████████| 50/50 [00:16<00:00,  2.98it/s]

Epoch 4:   0%|          | 3/5971 [01:05<27:03:27, 16.32s/it, loss=0.141, v_num=0, train/loss_simple_step=0.0014, train/loss_vlb_step=8.5e-6, train/loss_step=0.0014, global_step=2300.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 3/5971 [01:05<27:03:29, 16.32s/it, loss=0.164, v_num=0, train/loss_simple_step=0.520, train/loss_vlb_step=0.00462, train/loss_step=0.520, global_step=2300.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler: 100%|██████████| 50/50 [00:15<00:00,  3.33it/s]

Epoch 4:   0%|          | 4/5971 [01:24<28:06:51, 16.96s/it, loss=0.164, v_num=0, train/loss_simple_step=0.520, train/loss_vlb_step=0.00462, train/loss_step=0.520, global_step=2300.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 4/5971 [01:24<28:06:54, 16.96s/it, loss=0.167, v_num=0, train/loss_simple_step=0.275, train/loss_vlb_step=0.00117, train/loss_step=0.275, global_step=2300.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 5/5971 [01:26<23:45:36, 14.34s/it, loss=0.167, v_num=0, train/loss_simple_step=0.275, train/loss_vlb_step=0.00117, train/loss_step=0.275, global_step=2300.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 5/5971 [01:26<23:45:37, 14.34s/it, loss=0.168, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=7.11e-5, train/loss_step=0.0173, global_step=2301.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 6/5971 [01:27<20:40:01, 12.47s/it, loss=0.168, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=7.11e-5, train/loss_step=0.0173, global_step=2301.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 6/5971 [01:27<20:40:02, 12.47s/it, loss=0.161, v_num=0, train/loss_simple_step=0.00237, train/loss_vlb_step=1.38e-5, train/loss_step=0.00237, global_step=2301.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 7/5971 [01:28<18:19:57, 11.07s/it, loss=0.161, v_num=0, train/loss_simple_step=0.00237, train/loss_vlb_step=1.38e-5, train/loss_step=0.00237, global_step=2301.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 7/5971 [01:28<18:19:59, 11.07s/it, loss=0.199, v_num=0, train/loss_simple_step=0.806, train/loss_vlb_step=0.038, train/loss_step=0.806, global_step=2301.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]      
Epoch 4:   0%|          | 8/5971 [01:32<16:56:00, 10.22s/it, loss=0.199, v_num=0, train/loss_simple_step=0.806, train/loss_vlb_step=0.038, train/loss_step=0.806, global_step=2301.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 8/5971 [01:32<16:56:01, 10.22s/it, loss=0.19, v_num=0, train/loss_simple_step=0.00541, train/loss_vlb_step=2.82e-5, train/loss_step=0.00541, global_step=2301.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 9/5971 [01:33<15:28:52,  9.35s/it, loss=0.19, v_num=0, train/loss_simple_step=0.00541, train/loss_vlb_step=2.82e-5, train/loss_step=0.00541, global_step=2301.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 9/5971 [01:33<15:28:53,  9.35s/it, loss=0.196, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000745, train/loss_step=0.205, global_step=2302.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:   0%|          | 10/5971 [01:34<14:13:28,  8.59s/it, loss=0.196, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000745, train/loss_step=0.205, global_step=2302.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 10/5971 [01:34<14:13:28,  8.59s/it, loss=0.2, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00121, train/loss_step=0.296, global_step=2302.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:   0%|          | 11/5971 [01:35<13:10:53,  7.96s/it, loss=0.2, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00121, train/loss_step=0.296, global_step=2302.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 11/5971 [01:35<13:10:54,  7.96s/it, loss=0.2, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000557, train/loss_step=0.161, global_step=2302.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 12/5971 [01:38<12:34:40,  7.60s/it, loss=0.2, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000557, train/loss_step=0.161, global_step=2302.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 12/5971 [01:38<12:34:41,  7.60s/it, loss=0.202, v_num=0, train/loss_simple_step=0.0288, train/loss_vlb_step=0.000114, train/loss_step=0.0288, global_step=2302.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 13/5971 [01:40<11:49:22,  7.14s/it, loss=0.202, v_num=0, train/loss_simple_step=0.0288, train/loss_vlb_step=0.000114, train/loss_step=0.0288, global_step=2302.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 13/5971 [01:40<11:49:22,  7.14s/it, loss=0.205, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00109, train/loss_step=0.253, global_step=2303.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:   0%|          | 14/5971 [01:41<11:09:48,  6.75s/it, loss=0.205, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00109, train/loss_step=0.253, global_step=2303.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 14/5971 [01:41<11:09:49,  6.75s/it, loss=0.205, v_num=0, train/loss_simple_step=0.00364, train/loss_vlb_step=1.79e-5, train/loss_step=0.00364, global_step=2303.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 15/5971 [01:42<10:35:17,  6.40s/it, loss=0.205, v_num=0, train/loss_simple_step=0.00364, train/loss_vlb_step=1.79e-5, train/loss_step=0.00364, global_step=2303.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 15/5971 [01:42<10:35:18,  6.40s/it, loss=0.196, v_num=0, train/loss_simple_step=0.0648, train/loss_vlb_step=0.000229, train/loss_step=0.0648, global_step=2303.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:   0%|          | 16/5971 [01:45<10:18:36,  6.23s/it, loss=0.196, v_num=0, train/loss_simple_step=0.0648, train/loss_vlb_step=0.000229, train/loss_step=0.0648, global_step=2303.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 16/5971 [01:45<10:18:36,  6.23s/it, loss=0.196, v_num=0, train/loss_simple_step=0.0234, train/loss_vlb_step=9.76e-5, train/loss_step=0.0234, global_step=2303.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:   0%|          | 17/5971 [01:47<9:52:25,  5.97s/it, loss=0.196, v_num=0, train/loss_simple_step=0.0234, train/loss_vlb_step=9.76e-5, train/loss_step=0.0234, global_step=2303.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:   0%|          | 17/5971 [01:47<9:52:25,  5.97s/it, loss=0.192, v_num=0, train/loss_simple_step=0.00386, train/loss_vlb_step=2.1e-5, train/loss_step=0.00386, global_step=2304.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 18/5971 [01:48<9:27:51,  5.72s/it, loss=0.192, v_num=0, train/loss_simple_step=0.00386, train/loss_vlb_step=2.1e-5, train/loss_step=0.00386, global_step=2304.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 18/5971 [01:48<9:27:51,  5.72s/it, loss=0.195, v_num=0, train/loss_simple_step=0.071, train/loss_vlb_step=0.000234, train/loss_step=0.071, global_step=2304.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:   0%|          | 19/5971 [01:49<9:05:04,  5.49s/it, loss=0.195, v_num=0, train/loss_simple_step=0.071, train/loss_vlb_step=0.000234, train/loss_step=0.071, global_step=2304.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 19/5971 [01:49<9:05:04,  5.49s/it, loss=0.21, v_num=0, train/loss_simple_step=0.390, train/loss_vlb_step=0.0021, train/loss_step=0.390, global_step=2304.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:   0%|          | 20/5971 [01:53<8:54:53,  5.39s/it, loss=0.21, v_num=0, train/loss_simple_step=0.390, train/loss_vlb_step=0.0021, train/loss_step=0.390, global_step=2304.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 20/5971 [01:53<8:54:53,  5.39s/it, loss=0.204, v_num=0, train/loss_simple_step=0.00215, train/loss_vlb_step=1.3e-5, train/loss_step=0.00215, global_step=2304.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 21/5971 [01:54<8:36:34,  5.21s/it, loss=0.204, v_num=0, train/loss_simple_step=0.00215, train/loss_vlb_step=1.3e-5, train/loss_step=0.00215, global_step=2304.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 21/5971 [01:54<8:36:35,  5.21s/it, loss=0.157, v_num=0, train/loss_simple_step=0.00755, train/loss_vlb_step=3.62e-5, train/loss_step=0.00755, global_step=2305.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 22/5971 [01:56<8:20:13,  5.05s/it, loss=0.157, v_num=0, train/loss_simple_step=0.00755, train/loss_vlb_step=3.62e-5, train/loss_step=0.00755, global_step=2305.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 22/5971 [01:56<8:20:13,  5.05s/it, loss=0.163, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000401, train/loss_step=0.122, global_step=2305.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:   0%|          | 23/5971 [01:57<8:05:35,  4.90s/it, loss=0.163, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000401, train/loss_step=0.122, global_step=2305.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 23/5971 [01:57<8:05:35,  4.90s/it, loss=0.137, v_num=0, train/loss_simple_step=0.00262, train/loss_vlb_step=1.48e-5, train/loss_step=0.00262, global_step=2305.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 24/5971 [02:00<7:59:10,  4.83s/it, loss=0.137, v_num=0, train/loss_simple_step=0.00262, train/loss_vlb_step=1.48e-5, train/loss_step=0.00262, global_step=2305.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 24/5971 [02:00<7:59:10,  4.83s/it, loss=0.137, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.00364, train/loss_step=0.274, global_step=2305.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:   0%|          | 25/5971 [02:02<7:45:06,  4.69s/it, loss=0.137, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.00364, train/loss_step=0.274, global_step=2305.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 25/5971 [02:02<7:45:06,  4.69s/it, loss=0.137, v_num=0, train/loss_simple_step=0.0175, train/loss_vlb_step=6.79e-5, train/loss_step=0.0175, global_step=2306.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 26/5971 [02:03<7:32:03,  4.56s/it, loss=0.137, v_num=0, train/loss_simple_step=0.0175, train/loss_vlb_step=6.79e-5, train/loss_step=0.0175, global_step=2306.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 26/5971 [02:03<7:32:03,  4.56s/it, loss=0.145, v_num=0, train/loss_simple_step=0.159, train/loss_vlb_step=0.000562, train/loss_step=0.159, global_step=2306.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:   0%|          | 27/5971 [02:04<7:20:51,  4.45s/it, loss=0.145, v_num=0, train/loss_simple_step=0.159, train/loss_vlb_step=0.000562, train/loss_step=0.159, global_step=2306.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 27/5971 [02:04<7:20:51,  4.45s/it, loss=0.105, v_num=0, train/loss_simple_step=0.00175, train/loss_vlb_step=1.02e-5, train/loss_step=0.00175, global_step=2306.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 28/5971 [02:08<7:18:22,  4.43s/it, loss=0.105, v_num=0, train/loss_simple_step=0.00175, train/loss_vlb_step=1.02e-5, train/loss_step=0.00175, global_step=2306.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 28/5971 [02:08<7:18:23,  4.43s/it, loss=0.109, v_num=0, train/loss_simple_step=0.0929, train/loss_vlb_step=0.000314, train/loss_step=0.0929, global_step=2306.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:   0%|          | 29/5971 [02:09<7:08:29,  4.33s/it, loss=0.109, v_num=0, train/loss_simple_step=0.0929, train/loss_vlb_step=0.000314, train/loss_step=0.0929, global_step=2306.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   0%|          | 29/5971 [02:09<7:08:29,  4.33s/it, loss=0.103, v_num=0, train/loss_simple_step=0.0757, train/loss_vlb_step=0.000255, train/loss_step=0.0757, global_step=2307.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 30/5971 [02:11<6:58:51,  4.23s/it, loss=0.103, v_num=0, train/loss_simple_step=0.0757, train/loss_vlb_step=0.000255, train/loss_step=0.0757, global_step=2307.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 30/5971 [02:11<6:58:52,  4.23s/it, loss=0.0937, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000395, train/loss_step=0.120, global_step=2307.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:   1%|          | 31/5971 [02:12<6:49:58,  4.14s/it, loss=0.0937, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000395, train/loss_step=0.120, global_step=2307.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 31/5971 [02:12<6:49:58,  4.14s/it, loss=0.0858, v_num=0, train/loss_simple_step=0.00202, train/loss_vlb_step=1.19e-5, train/loss_step=0.00202, global_step=2307.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 32/5971 [02:16<6:50:42,  4.15s/it, loss=0.0858, v_num=0, train/loss_simple_step=0.00202, train/loss_vlb_step=1.19e-5, train/loss_step=0.00202, global_step=2307.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 32/5971 [02:16<6:50:43,  4.15s/it, loss=0.0844, v_num=0, train/loss_simple_step=0.00194, train/loss_vlb_step=1.17e-5, train/loss_step=0.00194, global_step=2307.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 33/5971 [02:18<6:42:48,  4.07s/it, loss=0.0844, v_num=0, train/loss_simple_step=0.00194, train/loss_vlb_step=1.17e-5, train/loss_step=0.00194, global_step=2307.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 33/5971 [02:18<6:42:48,  4.07s/it, loss=0.0726, v_num=0, train/loss_simple_step=0.0165, train/loss_vlb_step=7.14e-5, train/loss_step=0.0165, global_step=2308.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:   1%|          | 34/5971 [02:19<6:35:12,  3.99s/it, loss=0.0726, v_num=0, train/loss_simple_step=0.0165, train/loss_vlb_step=7.14e-5, train/loss_step=0.0165, global_step=2308.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 34/5971 [02:19<6:35:13,  3.99s/it, loss=0.0865, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00117, train/loss_step=0.282, global_step=2308.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:   1%|          | 35/5971 [02:21<6:28:08,  3.92s/it, loss=0.0865, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00117, train/loss_step=0.282, global_step=2308.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 35/5971 [02:21<6:28:08,  3.92s/it, loss=0.0834, v_num=0, train/loss_simple_step=0.00253, train/loss_vlb_step=1.44e-5, train/loss_step=0.00253, global_step=2308.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 36/5971 [02:25<6:28:10,  3.92s/it, loss=0.0834, v_num=0, train/loss_simple_step=0.00253, train/loss_vlb_step=1.44e-5, train/loss_step=0.00253, global_step=2308.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 36/5971 [02:25<6:28:10,  3.92s/it, loss=0.095, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000913, train/loss_step=0.255, global_step=2308.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:   1%|          | 37/5971 [02:26<6:22:16,  3.87s/it, loss=0.095, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000913, train/loss_step=0.255, global_step=2308.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 37/5971 [02:26<6:22:16,  3.87s/it, loss=0.11, v_num=0, train/loss_simple_step=0.302, train/loss_vlb_step=0.00125, train/loss_step=0.302, global_step=2309.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:   1%|          | 38/5971 [02:27<6:15:08,  3.79s/it, loss=0.11, v_num=0, train/loss_simple_step=0.302, train/loss_vlb_step=0.00125, train/loss_step=0.302, global_step=2309.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 38/5971 [02:27<6:15:08,  3.79s/it, loss=0.107, v_num=0, train/loss_simple_step=0.00912, train/loss_vlb_step=4.13e-5, train/loss_step=0.00912, global_step=2309.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 39/5971 [02:29<6:09:02,  3.73s/it, loss=0.107, v_num=0, train/loss_simple_step=0.00912, train/loss_vlb_step=4.13e-5, train/loss_step=0.00912, global_step=2309.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 39/5971 [02:29<6:09:02,  3.73s/it, loss=0.0877, v_num=0, train/loss_simple_step=0.00845, train/loss_vlb_step=4e-5, train/loss_step=0.00845, global_step=2309.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:   1%|          | 40/5971 [02:32<6:06:52,  3.71s/it, loss=0.0877, v_num=0, train/loss_simple_step=0.00845, train/loss_vlb_step=4e-5, train/loss_step=0.00845, global_step=2309.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 40/5971 [02:32<6:06:52,  3.71s/it, loss=0.0881, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.42e-5, train/loss_step=0.0101, global_step=2309.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 41/5971 [02:33<6:01:04,  3.65s/it, loss=0.0881, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.42e-5, train/loss_step=0.0101, global_step=2309.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 41/5971 [02:33<6:01:04,  3.65s/it, loss=0.114, v_num=0, train/loss_simple_step=0.518, train/loss_vlb_step=0.00416, train/loss_step=0.518, global_step=2310.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:   1%|          | 42/5971 [02:34<5:55:45,  3.60s/it, loss=0.114, v_num=0, train/loss_simple_step=0.518, train/loss_vlb_step=0.00416, train/loss_step=0.518, global_step=2310.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 42/5971 [02:34<5:55:45,  3.60s/it, loss=0.112, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.00029, train/loss_step=0.0854, global_step=2310.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 43/5971 [02:36<5:50:39,  3.55s/it, loss=0.112, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.00029, train/loss_step=0.0854, global_step=2310.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 43/5971 [02:36<5:50:39,  3.55s/it, loss=0.114, v_num=0, train/loss_simple_step=0.0395, train/loss_vlb_step=0.000145, train/loss_step=0.0395, global_step=2310.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 44/5971 [02:40<5:52:39,  3.57s/it, loss=0.114, v_num=0, train/loss_simple_step=0.0395, train/loss_vlb_step=0.000145, train/loss_step=0.0395, global_step=2310.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 44/5971 [02:40<5:52:39,  3.57s/it, loss=0.1, v_num=0, train/loss_simple_step=0.00574, train/loss_vlb_step=2.92e-5, train/loss_step=0.00574, global_step=2310.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:   1%|          | 45/5971 [02:42<5:48:11,  3.53s/it, loss=0.1, v_num=0, train/loss_simple_step=0.00574, train/loss_vlb_step=2.92e-5, train/loss_step=0.00574, global_step=2310.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 45/5971 [02:42<5:48:11,  3.53s/it, loss=0.105, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000366, train/loss_step=0.111, global_step=2311.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:   1%|          | 46/5971 [02:43<5:43:35,  3.48s/it, loss=0.105, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000366, train/loss_step=0.111, global_step=2311.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 46/5971 [02:43<5:43:35,  3.48s/it, loss=0.0984, v_num=0, train/loss_simple_step=0.0278, train/loss_vlb_step=0.000111, train/loss_step=0.0278, global_step=2311.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 47/5971 [02:45<5:39:40,  3.44s/it, loss=0.0984, v_num=0, train/loss_simple_step=0.0278, train/loss_vlb_step=0.000111, train/loss_step=0.0278, global_step=2311.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 47/5971 [02:45<5:39:40,  3.44s/it, loss=0.0983, v_num=0, train/loss_simple_step=0.00132, train/loss_vlb_step=7.86e-6, train/loss_step=0.00132, global_step=2311.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 48/5971 [02:47<5:38:11,  3.43s/it, loss=0.0983, v_num=0, train/loss_simple_step=0.00132, train/loss_vlb_step=7.86e-6, train/loss_step=0.00132, global_step=2311.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 48/5971 [02:47<5:38:11,  3.43s/it, loss=0.101, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000469, train/loss_step=0.141, global_step=2311.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:   1%|          | 49/5971 [02:49<5:33:58,  3.38s/it, loss=0.101, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000469, train/loss_step=0.141, global_step=2311.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 49/5971 [02:49<5:33:58,  3.38s/it, loss=0.106, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.00058, train/loss_step=0.174, global_step=2312.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:   1%|          | 50/5971 [02:50<5:30:10,  3.35s/it, loss=0.106, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.00058, train/loss_step=0.174, global_step=2312.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 50/5971 [02:50<5:30:10,  3.35s/it, loss=0.105, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000333, train/loss_step=0.101, global_step=2312.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 51/5971 [02:51<5:26:11,  3.31s/it, loss=0.105, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000333, train/loss_step=0.101, global_step=2312.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 51/5971 [02:51<5:26:12,  3.31s/it, loss=0.106, v_num=0, train/loss_simple_step=0.0213, train/loss_vlb_step=8.55e-5, train/loss_step=0.0213, global_step=2312.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 52/5971 [02:54<5:24:57,  3.29s/it, loss=0.106, v_num=0, train/loss_simple_step=0.0213, train/loss_vlb_step=8.55e-5, train/loss_step=0.0213, global_step=2312.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 52/5971 [02:54<5:24:57,  3.29s/it, loss=0.109, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000214, train/loss_step=0.0625, global_step=2312.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 53/5971 [02:55<5:21:22,  3.26s/it, loss=0.109, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000214, train/loss_step=0.0625, global_step=2312.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 53/5971 [02:55<5:21:22,  3.26s/it, loss=0.108, v_num=0, train/loss_simple_step=0.00553, train/loss_vlb_step=2.81e-5, train/loss_step=0.00553, global_step=2313.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 54/5971 [02:57<5:18:07,  3.23s/it, loss=0.108, v_num=0, train/loss_simple_step=0.00553, train/loss_vlb_step=2.81e-5, train/loss_step=0.00553, global_step=2313.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 54/5971 [02:57<5:18:07,  3.23s/it, loss=0.1, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000413, train/loss_step=0.126, global_step=2313.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]     
Epoch 4:   1%|          | 55/5971 [02:58<5:14:43,  3.19s/it, loss=0.1, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000413, train/loss_step=0.126, global_step=2313.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 55/5971 [02:58<5:14:43,  3.19s/it, loss=0.102, v_num=0, train/loss_simple_step=0.0363, train/loss_vlb_step=0.000124, train/loss_step=0.0363, global_step=2313.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 56/5971 [03:01<5:13:55,  3.18s/it, loss=0.102, v_num=0, train/loss_simple_step=0.0363, train/loss_vlb_step=0.000124, train/loss_step=0.0363, global_step=2313.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 56/5971 [03:01<5:13:55,  3.18s/it, loss=0.11, v_num=0, train/loss_simple_step=0.411, train/loss_vlb_step=0.00219, train/loss_step=0.411, global_step=2313.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:   1%|          | 57/5971 [03:02<5:10:38,  3.15s/it, loss=0.11, v_num=0, train/loss_simple_step=0.411, train/loss_vlb_step=0.00219, train/loss_step=0.411, global_step=2313.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 57/5971 [03:02<5:10:38,  3.15s/it, loss=0.109, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.00117, train/loss_step=0.289, global_step=2314.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 58/5971 [03:03<5:07:12,  3.12s/it, loss=0.109, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.00117, train/loss_step=0.289, global_step=2314.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 58/5971 [03:03<5:07:12,  3.12s/it, loss=0.121, v_num=0, train/loss_simple_step=0.254, train/loss_vlb_step=0.000953, train/loss_step=0.254, global_step=2314.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 59/5971 [03:05<5:03:50,  3.08s/it, loss=0.121, v_num=0, train/loss_simple_step=0.254, train/loss_vlb_step=0.000953, train/loss_step=0.254, global_step=2314.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 59/5971 [03:05<5:03:50,  3.08s/it, loss=0.134, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00104, train/loss_step=0.267, global_step=2314.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:   1%|          | 60/5971 [03:07<5:03:23,  3.08s/it, loss=0.134, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00104, train/loss_step=0.267, global_step=2314.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 60/5971 [03:07<5:03:24,  3.08s/it, loss=0.144, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000763, train/loss_step=0.204, global_step=2314.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 61/5971 [03:09<5:00:16,  3.05s/it, loss=0.144, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000763, train/loss_step=0.204, global_step=2314.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 61/5971 [03:09<5:00:16,  3.05s/it, loss=0.128, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000727, train/loss_step=0.201, global_step=2315.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 62/5971 [03:10<4:57:25,  3.02s/it, loss=0.128, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000727, train/loss_step=0.201, global_step=2315.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 62/5971 [03:10<4:57:25,  3.02s/it, loss=0.124, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=4.99e-5, train/loss_step=0.0109, global_step=2315.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 63/5971 [03:11<4:54:41,  2.99s/it, loss=0.124, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=4.99e-5, train/loss_step=0.0109, global_step=2315.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 63/5971 [03:11<4:54:41,  2.99s/it, loss=0.128, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000335, train/loss_step=0.102, global_step=2315.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:   1%|          | 64/5971 [03:14<4:54:08,  2.99s/it, loss=0.128, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000335, train/loss_step=0.102, global_step=2315.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 64/5971 [03:14<4:54:08,  2.99s/it, loss=0.132, v_num=0, train/loss_simple_step=0.0858, train/loss_vlb_step=0.000295, train/loss_step=0.0858, global_step=2315.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 65/5971 [03:15<4:51:25,  2.96s/it, loss=0.132, v_num=0, train/loss_simple_step=0.0858, train/loss_vlb_step=0.000295, train/loss_step=0.0858, global_step=2315.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 65/5971 [03:15<4:51:26,  2.96s/it, loss=0.163, v_num=0, train/loss_simple_step=0.745, train/loss_vlb_step=0.0199, train/loss_step=0.745, global_step=2316.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:   1%|          | 66/5971 [03:16<4:48:50,  2.93s/it, loss=0.163, v_num=0, train/loss_simple_step=0.745, train/loss_vlb_step=0.0199, train/loss_step=0.745, global_step=2316.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 66/5971 [03:16<4:48:50,  2.93s/it, loss=0.162, v_num=0, train/loss_simple_step=0.00991, train/loss_vlb_step=4.5e-5, train/loss_step=0.00991, global_step=2316.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 67/5971 [03:17<4:46:13,  2.91s/it, loss=0.162, v_num=0, train/loss_simple_step=0.00991, train/loss_vlb_step=4.5e-5, train/loss_step=0.00991, global_step=2316.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 67/5971 [03:17<4:46:13,  2.91s/it, loss=0.169, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000447, train/loss_step=0.134, global_step=2316.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:   1%|          | 68/5971 [03:20<4:46:21,  2.91s/it, loss=0.169, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000447, train/loss_step=0.134, global_step=2316.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 68/5971 [03:20<4:46:22,  2.91s/it, loss=0.166, v_num=0, train/loss_simple_step=0.0792, train/loss_vlb_step=0.000262, train/loss_step=0.0792, global_step=2316.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 69/5971 [03:21<4:43:50,  2.89s/it, loss=0.166, v_num=0, train/loss_simple_step=0.0792, train/loss_vlb_step=0.000262, train/loss_step=0.0792, global_step=2316.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 69/5971 [03:21<4:43:50,  2.89s/it, loss=0.175, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00156, train/loss_step=0.348, global_step=2317.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:   1%|          | 70/5971 [03:23<4:41:25,  2.86s/it, loss=0.175, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00156, train/loss_step=0.348, global_step=2317.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 70/5971 [03:23<4:41:25,  2.86s/it, loss=0.17, v_num=0, train/loss_simple_step=0.00366, train/loss_vlb_step=2.06e-5, train/loss_step=0.00366, global_step=2317.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 71/5971 [03:24<4:39:14,  2.84s/it, loss=0.17, v_num=0, train/loss_simple_step=0.00366, train/loss_vlb_step=2.06e-5, train/loss_step=0.00366, global_step=2317.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 71/5971 [03:24<4:39:14,  2.84s/it, loss=0.17, v_num=0, train/loss_simple_step=0.0245, train/loss_vlb_step=9.48e-5, train/loss_step=0.0245, global_step=2317.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:   1%|          | 72/5971 [03:27<4:39:21,  2.84s/it, loss=0.17, v_num=0, train/loss_simple_step=0.0245, train/loss_vlb_step=9.48e-5, train/loss_step=0.0245, global_step=2317.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 72/5971 [03:27<4:39:21,  2.84s/it, loss=0.177, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000719, train/loss_step=0.204, global_step=2317.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 73/5971 [03:28<4:37:21,  2.82s/it, loss=0.177, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000719, train/loss_step=0.204, global_step=2317.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 73/5971 [03:28<4:37:24,  2.82s/it, loss=0.19, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.00106, train/loss_step=0.262, global_step=2318.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:   1%|          | 74/5971 [03:29<4:35:08,  2.80s/it, loss=0.19, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.00106, train/loss_step=0.262, global_step=2318.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|          | 74/5971 [03:29<4:35:08,  2.80s/it, loss=0.185, v_num=0, train/loss_simple_step=0.0227, train/loss_vlb_step=9.35e-5, train/loss_step=0.0227, global_step=2318.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|▏         | 75/5971 [03:31<4:33:01,  2.78s/it, loss=0.185, v_num=0, train/loss_simple_step=0.0227, train/loss_vlb_step=9.35e-5, train/loss_step=0.0227, global_step=2318.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|▏         | 75/5971 [03:31<4:33:01,  2.78s/it, loss=0.189, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000403, train/loss_step=0.122, global_step=2318.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:   1%|▏         | 76/5971 [03:34<4:33:05,  2.78s/it, loss=0.189, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000403, train/loss_step=0.122, global_step=2318.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|▏         | 76/5971 [03:34<4:33:05,  2.78s/it, loss=0.177, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000608, train/loss_step=0.179, global_step=2318.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|▏         | 77/5971 [03:35<4:31:17,  2.76s/it, loss=0.177, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000608, train/loss_step=0.179, global_step=2318.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|▏         | 77/5971 [03:35<4:31:17,  2.76s/it, loss=0.188, v_num=0, train/loss_simple_step=0.495, train/loss_vlb_step=0.00347, train/loss_step=0.495, global_step=2319.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:   1%|▏         | 78/5971 [03:36<4:29:22,  2.74s/it, loss=0.188, v_num=0, train/loss_simple_step=0.495, train/loss_vlb_step=0.00347, train/loss_step=0.495, global_step=2319.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|▏         | 78/5971 [03:36<4:29:22,  2.74s/it, loss=0.187, v_num=0, train/loss_simple_step=0.244, train/loss_vlb_step=0.00095, train/loss_step=0.244, global_step=2319.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|▏         | 79/5971 [03:37<4:27:25,  2.72s/it, loss=0.187, v_num=0, train/loss_simple_step=0.244, train/loss_vlb_step=0.00095, train/loss_step=0.244, global_step=2319.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|▏         | 79/5971 [03:37<4:27:25,  2.72s/it, loss=0.175, v_num=0, train/loss_simple_step=0.0197, train/loss_vlb_step=7.82e-5, train/loss_step=0.0197, global_step=2319.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|▏         | 80/5971 [03:40<4:27:18,  2.72s/it, loss=0.175, v_num=0, train/loss_simple_step=0.0197, train/loss_vlb_step=7.82e-5, train/loss_step=0.0197, global_step=2319.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|▏         | 80/5971 [03:40<4:27:18,  2.72s/it, loss=0.198, v_num=0, train/loss_simple_step=0.660, train/loss_vlb_step=0.00841, train/loss_step=0.660, global_step=2319.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:   1%|▏         | 81/5971 [03:42<4:25:49,  2.71s/it, loss=0.198, v_num=0, train/loss_simple_step=0.660, train/loss_vlb_step=0.00841, train/loss_step=0.660, global_step=2319.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|▏         | 81/5971 [03:42<4:25:49,  2.71s/it, loss=0.204, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00135, train/loss_step=0.327, global_step=2320.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|▏         | 82/5971 [03:43<4:24:08,  2.69s/it, loss=0.204, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00135, train/loss_step=0.327, global_step=2320.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|▏         | 82/5971 [03:43<4:24:08,  2.69s/it, loss=0.207, v_num=0, train/loss_simple_step=0.0641, train/loss_vlb_step=0.000211, train/loss_step=0.0641, global_step=2320.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|▏         | 83/5971 [03:44<4:22:32,  2.68s/it, loss=0.207, v_num=0, train/loss_simple_step=0.0641, train/loss_vlb_step=0.000211, train/loss_step=0.0641, global_step=2320.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|▏         | 83/5971 [03:44<4:22:32,  2.68s/it, loss=0.202, v_num=0, train/loss_simple_step=0.0201, train/loss_vlb_step=7.86e-5, train/loss_step=0.0201, global_step=2320.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:   1%|▏         | 84/5971 [03:48<4:23:25,  2.68s/it, loss=0.202, v_num=0, train/loss_simple_step=0.0201, train/loss_vlb_step=7.86e-5, train/loss_step=0.0201, global_step=2320.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|▏         | 84/5971 [03:48<4:23:26,  2.68s/it, loss=0.221, v_num=0, train/loss_simple_step=0.465, train/loss_vlb_step=0.0034, train/loss_step=0.465, global_step=2320.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:   1%|▏         | 85/5971 [03:49<4:22:09,  2.67s/it, loss=0.221, v_num=0, train/loss_simple_step=0.465, train/loss_vlb_step=0.0034, train/loss_step=0.465, global_step=2320.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|▏         | 85/5971 [03:49<4:22:09,  2.67s/it, loss=0.185, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=4.86e-5, train/loss_step=0.0109, global_step=2321.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|▏         | 86/5971 [03:51<4:20:37,  2.66s/it, loss=0.185, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=4.86e-5, train/loss_step=0.0109, global_step=2321.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|▏         | 86/5971 [03:51<4:20:37,  2.66s/it, loss=0.185, v_num=0, train/loss_simple_step=0.0201, train/loss_vlb_step=8.04e-5, train/loss_step=0.0201, global_step=2321.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|▏         | 87/5971 [03:52<4:19:11,  2.64s/it, loss=0.185, v_num=0, train/loss_simple_step=0.0201, train/loss_vlb_step=8.04e-5, train/loss_step=0.0201, global_step=2321.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|▏         | 87/5971 [03:52<4:19:11,  2.64s/it, loss=0.19, v_num=0, train/loss_simple_step=0.238, train/loss_vlb_step=0.00107, train/loss_step=0.238, global_step=2321.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:   1%|▏         | 88/5971 [03:55<4:19:46,  2.65s/it, loss=0.19, v_num=0, train/loss_simple_step=0.238, train/loss_vlb_step=0.00107, train/loss_step=0.238, global_step=2321.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|▏         | 88/5971 [03:55<4:19:46,  2.65s/it, loss=0.188, v_num=0, train/loss_simple_step=0.0367, train/loss_vlb_step=0.000134, train/loss_step=0.0367, global_step=2321.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|▏         | 89/5971 [03:57<4:18:18,  2.63s/it, loss=0.188, v_num=0, train/loss_simple_step=0.0367, train/loss_vlb_step=0.000134, train/loss_step=0.0367, global_step=2321.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   1%|▏         | 89/5971 [03:57<4:18:18,  2.63s/it, loss=0.183, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000865, train/loss_step=0.233, global_step=2322.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:   2%|▏         | 90/5971 [03:58<4:16:50,  2.62s/it, loss=0.183, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000865, train/loss_step=0.233, global_step=2322.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   2%|▏         | 90/5971 [03:58<4:16:50,  2.62s/it, loss=0.186, v_num=0, train/loss_simple_step=0.0653, train/loss_vlb_step=0.000225, train/loss_step=0.0653, global_step=2322.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   2%|▏         | 91/5971 [03:59<4:15:25,  2.61s/it, loss=0.186, v_num=0, train/loss_simple_step=0.0653, train/loss_vlb_step=0.000225, train/loss_step=0.0653, global_step=2322.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   2%|▏         | 91/5971 [03:59<4:15:25,  2.61s/it, loss=0.185, v_num=0, train/loss_simple_step=0.003, train/loss_vlb_step=1.57e-5, train/loss_step=0.003, global_step=2322.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:   2%|▏         | 92/5971 [04:02<4:15:54,  2.61s/it, loss=0.185, v_num=0, train/loss_simple_step=0.003, train/loss_vlb_step=1.57e-5, train/loss_step=0.003, global_step=2322.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   2%|▏         | 92/5971 [04:02<4:15:54,  2.61s/it, loss=0.175, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.44e-5, train/loss_step=0.0122, global_step=2322.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   2%|▏         | 93/5971 [04:04<4:14:28,  2.60s/it, loss=0.175, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.44e-5, train/loss_step=0.0122, global_step=2322.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   2%|▏         | 93/5971 [04:04<4:14:28,  2.60s/it, loss=0.164, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000129, train/loss_step=0.0359, global_step=2323.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   2%|▏         | 94/5971 [04:05<4:12:53,  2.58s/it, loss=0.164, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000129, train/loss_step=0.0359, global_step=2323.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   2%|▏         | 94/5971 [04:05<4:12:53,  2.58s/it, loss=0.168, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000388, train/loss_step=0.118, global_step=2323.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:   2%|▏         | 95/5971 [04:06<4:11:30,  2.57s/it, loss=0.168, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000388, train/loss_step=0.118, global_step=2323.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   2%|▏         | 95/5971 [04:06<4:11:30,  2.57s/it, loss=0.172, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.000789, train/loss_step=0.197, global_step=2323.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   2%|▏         | 96/5971 [04:09<4:11:40,  2.57s/it, loss=0.172, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.000789, train/loss_step=0.197, global_step=2323.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   2%|▏         | 96/5971 [04:09<4:11:40,  2.57s/it, loss=0.167, v_num=0, train/loss_simple_step=0.066, train/loss_vlb_step=0.000224, train/loss_step=0.066, global_step=2323.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   2%|▏         | 97/5971 [04:10<4:10:38,  2.56s/it, loss=0.167, v_num=0, train/loss_simple_step=0.066, train/loss_vlb_step=0.000224, train/loss_step=0.066, global_step=2323.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   2%|▏         | 97/5971 [04:10<4:10:38,  2.56s/it, loss=0.153, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.000965, train/loss_step=0.230, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   2%|▏         | 98/5971 [04:12<4:09:26,  2.55s/it, loss=0.153, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.000965, train/loss_step=0.230, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   2%|▏         | 98/5971 [04:12<4:09:26,  2.55s/it, loss=0.149, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000562, train/loss_step=0.161, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   2%|▏         | 99/5971 [04:13<4:07:57,  2.53s/it, loss=0.149, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000562, train/loss_step=0.161, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   2%|▏         | 99/5971 [04:13<4:07:58,  2.53s/it, loss=0.152, v_num=0, train/loss_simple_step=0.0661, train/loss_vlb_step=0.000223, train/loss_step=0.0661, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   2%|▏         | 100/5971 [04:15<4:07:58,  2.53s/it, loss=0.152, v_num=0, train/loss_simple_step=0.0661, train/loss_vlb_step=0.000223, train/loss_step=0.0661, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   2%|▏         | 100/5971 [04:15<4:07:58,  2.53s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:28,  1.89it/s]
Epoch 4:   2%|▏         | 102/5971 [04:16<4:03:37,  2.49s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   1%|          | 2/167 [00:00<00:58,  2.83it/s]
Epoch 4:   2%|▏         | 104/5971 [04:16<3:59:10,  2.45s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   2%|▏         | 4/167 [00:00<00:27,  5.99it/s]
Epoch 4:   2%|▏         | 106/5971 [04:16<3:54:43,  2.40s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   4%|▎         | 6/167 [00:00<00:18,  8.92it/s]
Epoch 4:   2%|▏         | 109/5971 [04:17<3:48:19,  2.34s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   5%|▌         | 9/167 [00:01<00:11, 13.21it/s]
Epoch 4:   2%|▏         | 112/5971 [04:17<3:42:14,  2.28s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   7%|▋         | 12/167 [00:01<00:09, 16.28it/s]
Epoch 4:   2%|▏         | 115/5971 [04:17<3:36:29,  2.22s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   9%|▉         | 15/167 [00:01<00:08, 17.93it/s]
Epoch 4:   2%|▏         | 118/5971 [04:17<3:31:02,  2.16s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  11%|█         | 18/167 [00:01<00:07, 19.71it/s]
Epoch 4:   2%|▏         | 121/5971 [04:17<3:25:50,  2.11s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  13%|█▎        | 21/167 [00:01<00:07, 20.63it/s]
Epoch 4:   2%|▏         | 124/5971 [04:17<3:20:54,  2.06s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  14%|█▍        | 24/167 [00:01<00:06, 20.93it/s]
Epoch 4:   2%|▏         | 127/5971 [04:17<3:16:12,  2.01s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  16%|█▌        | 27/167 [00:01<00:06, 21.49it/s]
Epoch 4:   2%|▏         | 130/5971 [04:17<3:11:42,  1.97s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  18%|█▊        | 30/167 [00:02<00:06, 22.41it/s]
Epoch 4:   2%|▏         | 133/5971 [04:18<3:07:24,  1.93s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  20%|█▉        | 33/167 [00:02<00:05, 22.67it/s]
Epoch 4:   2%|▏         | 136/5971 [04:18<3:03:18,  1.88s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  22%|██▏       | 36/167 [00:02<00:05, 22.05it/s]
Epoch 4:   2%|▏         | 139/5971 [04:18<2:59:23,  1.85s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  23%|██▎       | 39/167 [00:02<00:05, 21.47it/s]
Epoch 4:   2%|▏         | 142/5971 [04:18<2:55:38,  1.81s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  25%|██▌       | 42/167 [00:02<00:06, 20.48it/s]
Epoch 4:   2%|▏         | 145/5971 [04:18<2:52:02,  1.77s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  27%|██▋       | 45/167 [00:02<00:06, 19.80it/s]
Epoch 4:   2%|▏         | 148/5971 [04:18<2:48:37,  1.74s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  29%|██▊       | 48/167 [00:02<00:06, 17.99it/s]
Epoch 4:   3%|▎         | 151/5971 [04:19<2:45:18,  1.70s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  31%|███       | 51/167 [00:03<00:06, 18.96it/s]
Epoch 4:   3%|▎         | 154/5971 [04:19<2:42:07,  1.67s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  32%|███▏      | 54/167 [00:03<00:05, 19.53it/s]
Epoch 4:   3%|▎         | 157/5971 [04:19<2:39:02,  1.64s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  34%|███▍      | 57/167 [00:03<00:05, 19.97it/s]
Epoch 4:   3%|▎         | 160/5971 [04:19<2:36:05,  1.61s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  36%|███▌      | 60/167 [00:03<00:05, 20.48it/s]
Epoch 4:   3%|▎         | 163/5971 [04:19<2:33:14,  1.58s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  38%|███▊      | 63/167 [00:03<00:05, 20.21it/s]
Epoch 4:   3%|▎         | 166/5971 [04:19<2:30:30,  1.56s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  40%|███▉      | 66/167 [00:03<00:05, 19.33it/s]
Epoch 4:   3%|▎         | 169/5971 [04:19<2:27:50,  1.53s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  41%|████▏     | 69/167 [00:03<00:04, 20.85it/s]
Epoch 4:   3%|▎         | 172/5971 [04:20<2:25:17,  1.50s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  43%|████▎     | 72/167 [00:04<00:04, 20.57it/s]
Epoch 4:   3%|▎         | 175/5971 [04:20<2:22:49,  1.48s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  45%|████▍     | 75/167 [00:04<00:04, 20.46it/s]
Epoch 4:   3%|▎         | 178/5971 [04:20<2:20:25,  1.45s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  47%|████▋     | 78/167 [00:04<00:04, 20.96it/s]
Epoch 4:   3%|▎         | 181/5971 [04:20<2:18:07,  1.43s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  49%|████▊     | 81/167 [00:04<00:04, 19.22it/s]
Epoch 4:   3%|▎         | 184/5971 [04:20<2:15:54,  1.41s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  50%|█████     | 84/167 [00:04<00:04, 20.04it/s]
Epoch 4:   3%|▎         | 187/5971 [04:20<2:13:44,  1.39s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  52%|█████▏    | 87/167 [00:04<00:04, 19.85it/s]
Epoch 4:   3%|▎         | 190/5971 [04:20<2:11:38,  1.37s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  54%|█████▍    | 90/167 [00:05<00:03, 19.59it/s]
Epoch 4:   3%|▎         | 193/5971 [04:21<2:09:37,  1.35s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  56%|█████▌    | 93/167 [00:05<00:03, 18.87it/s]

Validating:  57%|█████▋    | 95/167 [00:05<00:03, 18.82it/s]
Epoch 4:   3%|▎         | 196/5971 [04:21<2:07:40,  1.33s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  58%|█████▊    | 97/167 [00:05<00:03, 17.56it/s]
Epoch 4:   3%|▎         | 199/5971 [04:21<2:05:46,  1.31s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  59%|█████▉    | 99/167 [00:05<00:03, 17.84it/s]
Epoch 4:   3%|▎         | 202/5971 [04:21<2:03:55,  1.29s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  61%|██████    | 102/167 [00:05<00:03, 19.51it/s]
Epoch 4:   3%|▎         | 205/5971 [04:21<2:02:06,  1.27s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  63%|██████▎   | 105/167 [00:05<00:03, 20.46it/s]
Epoch 4:   3%|▎         | 208/5971 [04:21<2:00:21,  1.25s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  65%|██████▍   | 108/167 [00:05<00:02, 20.77it/s]
Epoch 4:   4%|▎         | 211/5971 [04:22<1:58:39,  1.24s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  66%|██████▋   | 111/167 [00:06<00:02, 21.52it/s]
Epoch 4:   4%|▎         | 214/5971 [04:22<1:56:59,  1.22s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  68%|██████▊   | 114/167 [00:06<00:02, 21.61it/s]
Epoch 4:   4%|▎         | 217/5971 [04:22<1:55:23,  1.20s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  70%|███████   | 117/167 [00:06<00:02, 22.67it/s]
Epoch 4:   4%|▎         | 220/5971 [04:22<1:53:49,  1.19s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  72%|███████▏  | 120/167 [00:06<00:02, 22.00it/s]
Epoch 4:   4%|▎         | 223/5971 [04:22<1:52:17,  1.17s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  74%|███████▎  | 123/167 [00:06<00:01, 22.21it/s]
Epoch 4:   4%|▍         | 226/5971 [04:22<1:50:49,  1.16s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  75%|███████▌  | 126/167 [00:06<00:01, 20.87it/s]
Epoch 4:   4%|▍         | 229/5971 [04:22<1:49:22,  1.14s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  77%|███████▋  | 129/167 [00:06<00:01, 21.14it/s]
Epoch 4:   4%|▍         | 232/5971 [04:23<1:47:57,  1.13s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  79%|███████▉  | 132/167 [00:07<00:01, 21.35it/s]
Epoch 4:   4%|▍         | 235/5971 [04:23<1:46:35,  1.11s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  81%|████████  | 135/167 [00:07<00:01, 21.86it/s]
Epoch 4:   4%|▍         | 238/5971 [04:23<1:45:15,  1.10s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  83%|████████▎ | 138/167 [00:07<00:01, 21.22it/s]
Epoch 4:   4%|▍         | 241/5971 [04:23<1:43:57,  1.09s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  84%|████████▍ | 141/167 [00:07<00:01, 21.65it/s]
Epoch 4:   4%|▍         | 244/5971 [04:23<1:42:40,  1.08s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  86%|████████▌ | 144/167 [00:07<00:01, 22.52it/s][A
Epoch 4:   4%|▍         | 247/5971 [04:23<1:41:25,  1.06s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  88%|████████▊ | 147/167 [00:07<00:00, 23.48it/s][A
Epoch 4:   4%|▍         | 250/5971 [04:23<1:40:12,  1.05s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  90%|████████▉ | 150/167 [00:07<00:00, 22.24it/s][A
Epoch 4:   4%|▍         | 253/5971 [04:23<1:39:01,  1.04s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  92%|█████████▏| 153/167 [00:07<00:00, 22.88it/s][A
Epoch 4:   4%|▍         | 256/5971 [04:24<1:37:51,  1.03s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  93%|█████████▎| 156/167 [00:08<00:00, 23.23it/s][A
Epoch 4:   4%|▍         | 259/5971 [04:24<1:36:44,  1.02s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  95%|█████████▌| 159/167 [00:08<00:00, 22.46it/s][A
Epoch 4:   4%|▍         | 262/5971 [04:24<1:35:38,  1.01s/it, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  97%|█████████▋| 162/167 [00:08<00:00, 21.28it/s][A
Epoch 4:   4%|▍         | 265/5971 [04:24<1:34:33,  1.01it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  99%|█████████▉| 165/167 [00:08<00:00, 21.16it/s][A
Epoch 4:   4%|▍         | 268/5971 [04:24<1:33:29,  1.02it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   4%|▍         | 268/5971 [04:24<1:33:37,  1.02it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=2324.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Epoch 4:   5%|▍         | 269/5971 [04:26<1:33:40,  1.01it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0312, train/loss_vlb_step=0.000112, train/loss_step=0.0312, global_step=2325.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▍         | 270/5971 [04:27<1:33:42,  1.01it/s, loss=0.112, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000571, train/loss_step=0.172, global_step=2325.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:   5%|▍         | 271/5971 [04:28<1:33:43,  1.01it/s, loss=0.112, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000571, train/loss_step=0.172, global_step=2325.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▍         | 271/5971 [04:28<1:33:43,  1.01it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00247, train/loss_vlb_step=1.39e-5, train/loss_step=0.00247, global_step=2325.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▍         | 272/5971 [04:31<1:34:18,  1.01it/s, loss=0.101, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.000984, train/loss_step=0.267, global_step=2325.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:   5%|▍         | 273/5971 [04:32<1:34:21,  1.01it/s, loss=0.106, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.00037, train/loss_step=0.113, global_step=2326.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:   5%|▍         | 274/5971 [04:33<1:34:24,  1.01it/s, loss=0.106, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.00037, train/loss_step=0.113, global_step=2326.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▍         | 274/5971 [04:33<1:34:24,  1.01it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00758, train/loss_vlb_step=3.61e-5, train/loss_step=0.00758, global_step=2326.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▍         | 275/5971 [04:34<1:34:23,  1.01it/s, loss=0.0939, v_num=0, train/loss_simple_step=0.0046, train/loss_vlb_step=2.46e-5, train/loss_step=0.0046, global_step=2326.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:   5%|▍         | 276/5971 [04:38<1:35:16,  1.00s/it, loss=0.109, v_num=0, train/loss_simple_step=0.330, train/loss_vlb_step=0.00136, train/loss_step=0.330, global_step=2326.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:   5%|▍         | 277/5971 [04:39<1:35:17,  1.00s/it, loss=0.109, v_num=0, train/loss_simple_step=0.330, train/loss_vlb_step=0.00136, train/loss_step=0.330, global_step=2326.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▍         | 277/5971 [04:39<1:35:17,  1.00s/it, loss=0.106, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000647, train/loss_step=0.184, global_step=2327.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▍         | 278/5971 [04:40<1:35:16,  1.00s/it, loss=0.12, v_num=0, train/loss_simple_step=0.347, train/loss_vlb_step=0.00222, train/loss_step=0.347, global_step=2327.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:   5%|▍         | 279/5971 [04:41<1:35:16,  1.00s/it, loss=0.141, v_num=0, train/loss_simple_step=0.422, train/loss_vlb_step=0.00239, train/loss_step=0.422, global_step=2327.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▍         | 280/5971 [04:43<1:35:50,  1.01s/it, loss=0.141, v_num=0, train/loss_simple_step=0.422, train/loss_vlb_step=0.00239, train/loss_step=0.422, global_step=2327.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▍         | 280/5971 [04:43<1:35:50,  1.01s/it, loss=0.158, v_num=0, train/loss_simple_step=0.355, train/loss_vlb_step=0.00199, train/loss_step=0.355, global_step=2327.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▍         | 281/5971 [04:44<1:35:49,  1.01s/it, loss=0.165, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000608, train/loss_step=0.165, global_step=2328.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▍         | 282/5971 [04:45<1:35:47,  1.01s/it, loss=0.189, v_num=0, train/loss_simple_step=0.611, train/loss_vlb_step=0.0112, train/loss_step=0.611, global_step=2328.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:   5%|▍         | 283/5971 [04:46<1:35:47,  1.01s/it, loss=0.189, v_num=0, train/loss_simple_step=0.611, train/loss_vlb_step=0.0112, train/loss_step=0.611, global_step=2328.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▍         | 283/5971 [04:46<1:35:47,  1.01s/it, loss=0.18, v_num=0, train/loss_simple_step=0.00358, train/loss_vlb_step=1.92e-5, train/loss_step=0.00358, global_step=2328.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▍         | 284/5971 [04:49<1:36:24,  1.02s/it, loss=0.207, v_num=0, train/loss_simple_step=0.614, train/loss_vlb_step=0.00453, train/loss_step=0.614, global_step=2328.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:   5%|▍         | 285/5971 [04:51<1:36:25,  1.02s/it, loss=0.23, v_num=0, train/loss_simple_step=0.689, train/loss_vlb_step=0.0102, train/loss_step=0.689, global_step=2329.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:   5%|▍         | 286/5971 [04:52<1:36:29,  1.02s/it, loss=0.23, v_num=0, train/loss_simple_step=0.689, train/loss_vlb_step=0.0102, train/loss_step=0.689, global_step=2329.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▍         | 286/5971 [04:52<1:36:29,  1.02s/it, loss=0.265, v_num=0, train/loss_simple_step=0.863, train/loss_vlb_step=0.088, train/loss_step=0.863, global_step=2329.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▍         | 287/5971 [04:53<1:36:27,  1.02s/it, loss=0.262, v_num=0, train/loss_simple_step=0.00545, train/loss_vlb_step=2.58e-5, train/loss_step=0.00545, global_step=2329.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▍         | 288/5971 [04:56<1:37:17,  1.03s/it, loss=0.266, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000468, train/loss_step=0.139, global_step=2329.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:   5%|▍         | 289/5971 [04:57<1:37:17,  1.03s/it, loss=0.266, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000468, train/loss_step=0.139, global_step=2329.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▍         | 289/5971 [04:57<1:37:17,  1.03s/it, loss=0.265, v_num=0, train/loss_simple_step=0.0133, train/loss_vlb_step=5.8e-5, train/loss_step=0.0133, global_step=2330.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▍         | 290/5971 [04:58<1:37:14,  1.03s/it, loss=0.261, v_num=0, train/loss_simple_step=0.0813, train/loss_vlb_step=0.000275, train/loss_step=0.0813, global_step=2330.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▍         | 291/5971 [04:59<1:37:11,  1.03s/it, loss=0.264, v_num=0, train/loss_simple_step=0.056, train/loss_vlb_step=0.000194, train/loss_step=0.056, global_step=2330.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:   5%|▍         | 292/5971 [05:02<1:37:48,  1.03s/it, loss=0.264, v_num=0, train/loss_simple_step=0.056, train/loss_vlb_step=0.000194, train/loss_step=0.056, global_step=2330.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▍         | 292/5971 [05:02<1:37:48,  1.03s/it, loss=0.254, v_num=0, train/loss_simple_step=0.0815, train/loss_vlb_step=0.00027, train/loss_step=0.0815, global_step=2330.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▍         | 293/5971 [05:03<1:37:47,  1.03s/it, loss=0.249, v_num=0, train/loss_simple_step=0.0158, train/loss_vlb_step=6.53e-5, train/loss_step=0.0158, global_step=2331.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▍         | 294/5971 [05:04<1:37:47,  1.03s/it, loss=0.25, v_num=0, train/loss_simple_step=0.0222, train/loss_vlb_step=8.79e-5, train/loss_step=0.0222, global_step=2331.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:   5%|▍         | 295/5971 [05:06<1:37:48,  1.03s/it, loss=0.25, v_num=0, train/loss_simple_step=0.0222, train/loss_vlb_step=8.79e-5, train/loss_step=0.0222, global_step=2331.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▍         | 295/5971 [05:06<1:37:48,  1.03s/it, loss=0.253, v_num=0, train/loss_simple_step=0.0574, train/loss_vlb_step=0.000201, train/loss_step=0.0574, global_step=2331.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▍         | 296/5971 [05:08<1:38:11,  1.04s/it, loss=0.239, v_num=0, train/loss_simple_step=0.0578, train/loss_vlb_step=0.0002, train/loss_step=0.0578, global_step=2331.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:   5%|▍         | 297/5971 [05:09<1:38:08,  1.04s/it, loss=0.245, v_num=0, train/loss_simple_step=0.297, train/loss_vlb_step=0.00127, train/loss_step=0.297, global_step=2332.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:   5%|▍         | 298/5971 [05:10<1:38:05,  1.04s/it, loss=0.245, v_num=0, train/loss_simple_step=0.297, train/loss_vlb_step=0.00127, train/loss_step=0.297, global_step=2332.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▍         | 298/5971 [05:10<1:38:05,  1.04s/it, loss=0.228, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=4.68e-5, train/loss_step=0.011, global_step=2332.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▌         | 299/5971 [05:11<1:38:03,  1.04s/it, loss=0.224, v_num=0, train/loss_simple_step=0.351, train/loss_vlb_step=0.00143, train/loss_step=0.351, global_step=2332.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▌         | 300/5971 [05:14<1:38:43,  1.04s/it, loss=0.213, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000389, train/loss_step=0.119, global_step=2332.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▌         | 301/5971 [05:15<1:38:42,  1.04s/it, loss=0.213, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000389, train/loss_step=0.119, global_step=2332.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▌         | 301/5971 [05:15<1:38:42,  1.04s/it, loss=0.205, v_num=0, train/loss_simple_step=0.00584, train/loss_vlb_step=2.82e-5, train/loss_step=0.00584, global_step=2333.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▌         | 302/5971 [05:16<1:38:39,  1.04s/it, loss=0.177, v_num=0, train/loss_simple_step=0.0496, train/loss_vlb_step=0.000177, train/loss_step=0.0496, global_step=2333.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:   5%|▌         | 303/5971 [05:17<1:38:40,  1.04s/it, loss=0.181, v_num=0, train/loss_simple_step=0.0835, train/loss_vlb_step=0.000281, train/loss_step=0.0835, global_step=2333.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▌         | 304/5971 [05:20<1:39:17,  1.05s/it, loss=0.181, v_num=0, train/loss_simple_step=0.0835, train/loss_vlb_step=0.000281, train/loss_step=0.0835, global_step=2333.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▌         | 304/5971 [05:20<1:39:17,  1.05s/it, loss=0.153, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000195, train/loss_step=0.055, global_step=2333.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:   5%|▌         | 305/5971 [05:21<1:39:19,  1.05s/it, loss=0.12, v_num=0, train/loss_simple_step=0.0375, train/loss_vlb_step=0.00014, train/loss_step=0.0375, global_step=2334.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▌         | 306/5971 [05:22<1:39:19,  1.05s/it, loss=0.0771, v_num=0, train/loss_simple_step=0.00259, train/loss_vlb_step=1.42e-5, train/loss_step=0.00259, global_step=2334.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▌         | 307/5971 [05:24<1:39:20,  1.05s/it, loss=0.0771, v_num=0, train/loss_simple_step=0.00259, train/loss_vlb_step=1.42e-5, train/loss_step=0.00259, global_step=2334.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▌         | 307/5971 [05:24<1:39:20,  1.05s/it, loss=0.0776, v_num=0, train/loss_simple_step=0.0165, train/loss_vlb_step=7.38e-5, train/loss_step=0.0165, global_step=2334.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:   5%|▌         | 308/5971 [05:27<1:39:57,  1.06s/it, loss=0.0783, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000517, train/loss_step=0.152, global_step=2334.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:   5%|▌         | 309/5971 [05:28<1:39:55,  1.06s/it, loss=0.0955, v_num=0, train/loss_simple_step=0.358, train/loss_vlb_step=0.00211, train/loss_step=0.358, global_step=2335.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:   5%|▌         | 310/5971 [05:29<1:39:54,  1.06s/it, loss=0.0955, v_num=0, train/loss_simple_step=0.358, train/loss_vlb_step=0.00211, train/loss_step=0.358, global_step=2335.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▌         | 310/5971 [05:29<1:39:54,  1.06s/it, loss=0.0931, v_num=0, train/loss_simple_step=0.0333, train/loss_vlb_step=0.00012, train/loss_step=0.0333, global_step=2335.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▌         | 311/5971 [05:30<1:39:52,  1.06s/it, loss=0.101, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.000745, train/loss_step=0.220, global_step=2335.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:   5%|▌         | 312/5971 [05:32<1:40:20,  1.06s/it, loss=0.098, v_num=0, train/loss_simple_step=0.0152, train/loss_vlb_step=6.47e-5, train/loss_step=0.0152, global_step=2335.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▌         | 313/5971 [05:34<1:40:23,  1.06s/it, loss=0.098, v_num=0, train/loss_simple_step=0.0152, train/loss_vlb_step=6.47e-5, train/loss_step=0.0152, global_step=2335.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▌         | 313/5971 [05:34<1:40:23,  1.06s/it, loss=0.0997, v_num=0, train/loss_simple_step=0.0498, train/loss_vlb_step=0.000174, train/loss_step=0.0498, global_step=2336.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▌         | 314/5971 [05:35<1:40:28,  1.07s/it, loss=0.116, v_num=0, train/loss_simple_step=0.351, train/loss_vlb_step=0.00152, train/loss_step=0.351, global_step=2336.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:   5%|▌         | 315/5971 [05:36<1:40:27,  1.07s/it, loss=0.133, v_num=0, train/loss_simple_step=0.386, train/loss_vlb_step=0.00186, train/loss_step=0.386, global_step=2336.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▌         | 316/5971 [05:39<1:40:49,  1.07s/it, loss=0.133, v_num=0, train/loss_simple_step=0.386, train/loss_vlb_step=0.00186, train/loss_step=0.386, global_step=2336.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▌         | 316/5971 [05:39<1:40:49,  1.07s/it, loss=0.143, v_num=0, train/loss_simple_step=0.265, train/loss_vlb_step=0.00098, train/loss_step=0.265, global_step=2336.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▌         | 317/5971 [05:40<1:40:46,  1.07s/it, loss=0.135, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000473, train/loss_step=0.144, global_step=2337.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▌         | 318/5971 [05:41<1:40:44,  1.07s/it, loss=0.164, v_num=0, train/loss_simple_step=0.586, train/loss_vlb_step=0.00985, train/loss_step=0.586, global_step=2337.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:   5%|▌         | 319/5971 [05:42<1:40:41,  1.07s/it, loss=0.164, v_num=0, train/loss_simple_step=0.586, train/loss_vlb_step=0.00985, train/loss_step=0.586, global_step=2337.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▌         | 319/5971 [05:42<1:40:41,  1.07s/it, loss=0.147, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.54e-5, train/loss_step=0.00289, global_step=2337.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▌         | 320/5971 [05:44<1:41:02,  1.07s/it, loss=0.148, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.00051, train/loss_step=0.154, global_step=2337.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:   5%|▌         | 321/5971 [05:45<1:41:05,  1.07s/it, loss=0.15, v_num=0, train/loss_simple_step=0.0467, train/loss_vlb_step=0.000158, train/loss_step=0.0467, global_step=2338.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▌         | 322/5971 [05:46<1:41:08,  1.07s/it, loss=0.15, v_num=0, train/loss_simple_step=0.0467, train/loss_vlb_step=0.000158, train/loss_step=0.0467, global_step=2338.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▌         | 322/5971 [05:46<1:41:08,  1.07s/it, loss=0.15, v_num=0, train/loss_simple_step=0.047, train/loss_vlb_step=0.000167, train/loss_step=0.047, global_step=2338.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:   5%|▌         | 323/5971 [05:48<1:41:06,  1.07s/it, loss=0.153, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000474, train/loss_step=0.144, global_step=2338.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▌         | 324/5971 [05:50<1:41:33,  1.08s/it, loss=0.153, v_num=0, train/loss_simple_step=0.0573, train/loss_vlb_step=0.000199, train/loss_step=0.0573, global_step=2338.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▌         | 325/5971 [05:51<1:41:31,  1.08s/it, loss=0.153, v_num=0, train/loss_simple_step=0.0573, train/loss_vlb_step=0.000199, train/loss_step=0.0573, global_step=2338.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▌         | 325/5971 [05:51<1:41:31,  1.08s/it, loss=0.152, v_num=0, train/loss_simple_step=0.00363, train/loss_vlb_step=1.97e-5, train/loss_step=0.00363, global_step=2339.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▌         | 326/5971 [05:52<1:41:28,  1.08s/it, loss=0.156, v_num=0, train/loss_simple_step=0.0834, train/loss_vlb_step=0.000276, train/loss_step=0.0834, global_step=2339.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:   5%|▌         | 327/5971 [05:53<1:41:26,  1.08s/it, loss=0.155, v_num=0, train/loss_simple_step=0.00215, train/loss_vlb_step=1.23e-5, train/loss_step=0.00215, global_step=2339.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▌         | 328/5971 [05:57<1:42:06,  1.09s/it, loss=0.155, v_num=0, train/loss_simple_step=0.00215, train/loss_vlb_step=1.23e-5, train/loss_step=0.00215, global_step=2339.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   5%|▌         | 328/5971 [05:57<1:42:06,  1.09s/it, loss=0.153, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000368, train/loss_step=0.112, global_step=2339.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:   6%|▌         | 329/5971 [05:58<1:42:04,  1.09s/it, loss=0.139, v_num=0, train/loss_simple_step=0.0739, train/loss_vlb_step=0.00025, train/loss_step=0.0739, global_step=2340.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 330/5971 [05:59<1:42:02,  1.09s/it, loss=0.146, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000618, train/loss_step=0.181, global_step=2340.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:   6%|▌         | 331/5971 [06:00<1:42:00,  1.09s/it, loss=0.146, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000618, train/loss_step=0.181, global_step=2340.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 331/5971 [06:00<1:42:00,  1.09s/it, loss=0.135, v_num=0, train/loss_simple_step=0.00259, train/loss_vlb_step=1.48e-5, train/loss_step=0.00259, global_step=2340.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 332/5971 [06:03<1:42:37,  1.09s/it, loss=0.136, v_num=0, train/loss_simple_step=0.0354, train/loss_vlb_step=0.000134, train/loss_step=0.0354, global_step=2340.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:   6%|▌         | 333/5971 [06:04<1:42:40,  1.09s/it, loss=0.137, v_num=0, train/loss_simple_step=0.0553, train/loss_vlb_step=0.000191, train/loss_step=0.0553, global_step=2341.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 334/5971 [06:05<1:42:37,  1.09s/it, loss=0.137, v_num=0, train/loss_simple_step=0.0553, train/loss_vlb_step=0.000191, train/loss_step=0.0553, global_step=2341.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 334/5971 [06:05<1:42:37,  1.09s/it, loss=0.152, v_num=0, train/loss_simple_step=0.652, train/loss_vlb_step=0.0119, train/loss_step=0.652, global_step=2341.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:   6%|▌         | 335/5971 [06:06<1:42:35,  1.09s/it, loss=0.132, v_num=0, train/loss_simple_step=0.00171, train/loss_vlb_step=9.85e-6, train/loss_step=0.00171, global_step=2341.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 336/5971 [06:09<1:43:00,  1.10s/it, loss=0.12, v_num=0, train/loss_simple_step=0.0188, train/loss_vlb_step=7.62e-5, train/loss_step=0.0188, global_step=2341.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:   6%|▌         | 337/5971 [06:10<1:43:03,  1.10s/it, loss=0.12, v_num=0, train/loss_simple_step=0.0188, train/loss_vlb_step=7.62e-5, train/loss_step=0.0188, global_step=2341.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 337/5971 [06:10<1:43:03,  1.10s/it, loss=0.157, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.064, train/loss_step=0.873, global_step=2342.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:   6%|▌         | 338/5971 [06:12<1:43:06,  1.10s/it, loss=0.145, v_num=0, train/loss_simple_step=0.352, train/loss_vlb_step=0.00249, train/loss_step=0.352, global_step=2342.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 339/5971 [06:13<1:43:08,  1.10s/it, loss=0.145, v_num=0, train/loss_simple_step=0.00617, train/loss_vlb_step=3.03e-5, train/loss_step=0.00617, global_step=2342.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 340/5971 [06:16<1:43:33,  1.10s/it, loss=0.145, v_num=0, train/loss_simple_step=0.00617, train/loss_vlb_step=3.03e-5, train/loss_step=0.00617, global_step=2342.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 340/5971 [06:16<1:43:33,  1.10s/it, loss=0.143, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000358, train/loss_step=0.104, global_step=2342.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:   6%|▌         | 341/5971 [06:17<1:43:31,  1.10s/it, loss=0.141, v_num=0, train/loss_simple_step=0.00835, train/loss_vlb_step=3.76e-5, train/loss_step=0.00835, global_step=2343.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 342/5971 [06:18<1:43:32,  1.10s/it, loss=0.167, v_num=0, train/loss_simple_step=0.568, train/loss_vlb_step=0.00803, train/loss_step=0.568, global_step=2343.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:   6%|▌         | 343/5971 [06:19<1:43:31,  1.10s/it, loss=0.167, v_num=0, train/loss_simple_step=0.568, train/loss_vlb_step=0.00803, train/loss_step=0.568, global_step=2343.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 343/5971 [06:19<1:43:31,  1.10s/it, loss=0.172, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00105, train/loss_step=0.253, global_step=2343.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 344/5971 [06:22<1:43:50,  1.11s/it, loss=0.178, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000582, train/loss_step=0.172, global_step=2343.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 345/5971 [06:23<1:43:51,  1.11s/it, loss=0.182, v_num=0, train/loss_simple_step=0.0762, train/loss_vlb_step=0.000254, train/loss_step=0.0762, global_step=2344.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 346/5971 [06:24<1:43:50,  1.11s/it, loss=0.182, v_num=0, train/loss_simple_step=0.0762, train/loss_vlb_step=0.000254, train/loss_step=0.0762, global_step=2344.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 346/5971 [06:24<1:43:50,  1.11s/it, loss=0.216, v_num=0, train/loss_simple_step=0.777, train/loss_vlb_step=0.0337, train/loss_step=0.777, global_step=2344.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:   6%|▌         | 347/5971 [06:25<1:43:48,  1.11s/it, loss=0.217, v_num=0, train/loss_simple_step=0.0186, train/loss_vlb_step=7.85e-5, train/loss_step=0.0186, global_step=2344.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 348/5971 [06:27<1:44:08,  1.11s/it, loss=0.212, v_num=0, train/loss_simple_step=0.00715, train/loss_vlb_step=3.5e-5, train/loss_step=0.00715, global_step=2344.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 349/5971 [06:28<1:44:05,  1.11s/it, loss=0.212, v_num=0, train/loss_simple_step=0.00715, train/loss_vlb_step=3.5e-5, train/loss_step=0.00715, global_step=2344.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 349/5971 [06:28<1:44:05,  1.11s/it, loss=0.213, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000336, train/loss_step=0.102, global_step=2345.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:   6%|▌         | 350/5971 [06:29<1:44:01,  1.11s/it, loss=0.209, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000342, train/loss_step=0.104, global_step=2345.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 351/5971 [06:30<1:44:00,  1.11s/it, loss=0.227, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00161, train/loss_step=0.363, global_step=2345.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:   6%|▌         | 352/5971 [06:33<1:44:29,  1.12s/it, loss=0.227, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00161, train/loss_step=0.363, global_step=2345.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 352/5971 [06:33<1:44:29,  1.12s/it, loss=0.226, v_num=0, train/loss_simple_step=0.00854, train/loss_vlb_step=3.85e-5, train/loss_step=0.00854, global_step=2345.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 353/5971 [06:34<1:44:26,  1.12s/it, loss=0.225, v_num=0, train/loss_simple_step=0.0252, train/loss_vlb_step=0.000101, train/loss_step=0.0252, global_step=2346.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:   6%|▌         | 354/5971 [06:35<1:44:21,  1.11s/it, loss=0.198, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000397, train/loss_step=0.121, global_step=2346.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:   6%|▌         | 355/5971 [06:36<1:44:18,  1.11s/it, loss=0.198, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000397, train/loss_step=0.121, global_step=2346.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 355/5971 [06:36<1:44:18,  1.11s/it, loss=0.216, v_num=0, train/loss_simple_step=0.358, train/loss_vlb_step=0.00163, train/loss_step=0.358, global_step=2346.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:   6%|▌         | 356/5971 [06:40<1:44:55,  1.12s/it, loss=0.215, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=5.15e-5, train/loss_step=0.0118, global_step=2346.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 357/5971 [06:41<1:44:52,  1.12s/it, loss=0.186, v_num=0, train/loss_simple_step=0.280, train/loss_vlb_step=0.00148, train/loss_step=0.280, global_step=2347.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:   6%|▌         | 358/5971 [06:42<1:44:52,  1.12s/it, loss=0.186, v_num=0, train/loss_simple_step=0.280, train/loss_vlb_step=0.00148, train/loss_step=0.280, global_step=2347.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 358/5971 [06:42<1:44:52,  1.12s/it, loss=0.172, v_num=0, train/loss_simple_step=0.0681, train/loss_vlb_step=0.00023, train/loss_step=0.0681, global_step=2347.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 359/5971 [06:43<1:44:50,  1.12s/it, loss=0.172, v_num=0, train/loss_simple_step=0.00513, train/loss_vlb_step=2.75e-5, train/loss_step=0.00513, global_step=2347.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 360/5971 [06:46<1:45:11,  1.12s/it, loss=0.186, v_num=0, train/loss_simple_step=0.396, train/loss_vlb_step=0.0026, train/loss_step=0.396, global_step=2347.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]     
Epoch 4:   6%|▌         | 361/5971 [06:46<1:45:07,  1.12s/it, loss=0.186, v_num=0, train/loss_simple_step=0.396, train/loss_vlb_step=0.0026, train/loss_step=0.396, global_step=2347.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 361/5971 [06:46<1:45:07,  1.12s/it, loss=0.19, v_num=0, train/loss_simple_step=0.090, train/loss_vlb_step=0.000296, train/loss_step=0.090, global_step=2348.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 362/5971 [06:47<1:45:03,  1.12s/it, loss=0.162, v_num=0, train/loss_simple_step=0.00538, train/loss_vlb_step=2.81e-5, train/loss_step=0.00538, global_step=2348.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 363/5971 [06:48<1:44:59,  1.12s/it, loss=0.164, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00126, train/loss_step=0.287, global_step=2348.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:   6%|▌         | 364/5971 [06:51<1:45:25,  1.13s/it, loss=0.164, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00126, train/loss_step=0.287, global_step=2348.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 364/5971 [06:51<1:45:25,  1.13s/it, loss=0.171, v_num=0, train/loss_simple_step=0.324, train/loss_vlb_step=0.00173, train/loss_step=0.324, global_step=2348.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 365/5971 [06:52<1:45:21,  1.13s/it, loss=0.189, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00225, train/loss_step=0.424, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 366/5971 [06:53<1:45:18,  1.13s/it, loss=0.164, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.00115, train/loss_step=0.274, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 367/5971 [06:54<1:45:15,  1.13s/it, loss=0.164, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.00115, train/loss_step=0.274, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 367/5971 [06:54<1:45:15,  1.13s/it, loss=0.172, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.000634, train/loss_step=0.187, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   6%|▌         | 368/5971 [06:57<1:45:35,  1.13s/it, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:30,  1.82it/s][A
Epoch 4:   6%|▌         | 370/5971 [06:57<1:45:08,  1.13s/it, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   1%|          | 2/167 [00:00<00:46,  3.51it/s][A
Epoch 4:   6%|▌         | 373/5971 [06:58<1:44:16,  1.12s/it, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.15it/s][A
Epoch 4:   6%|▋         | 376/5971 [06:58<1:43:25,  1.11s/it, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.93it/s][A
Epoch 4:   6%|▋         | 379/5971 [06:58<1:42:34,  1.10s/it, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   7%|▋         | 11/167 [00:00<00:08, 17.93it/s][A
Epoch 4:   6%|▋         | 382/5971 [06:58<1:41:44,  1.09s/it, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   8%|▊         | 14/167 [00:01<00:08, 18.87it/s][A
Epoch 4:   6%|▋         | 385/5971 [06:58<1:40:56,  1.08s/it, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  10%|█         | 17/167 [00:01<00:07, 20.79it/s][A
Epoch 4:   6%|▋         | 388/5971 [06:58<1:40:08,  1.08s/it, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 21.18it/s][A
Epoch 4:   7%|▋         | 391/5971 [06:58<1:39:20,  1.07s/it, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 22.18it/s][A
Epoch 4:   7%|▋         | 394/5971 [06:58<1:38:33,  1.06s/it, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  16%|█▌        | 26/167 [00:01<00:06, 23.29it/s][A
Epoch 4:   7%|▋         | 397/5971 [06:58<1:37:47,  1.05s/it, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  17%|█▋        | 29/167 [00:01<00:06, 22.69it/s][A
Epoch 4:   7%|▋         | 400/5971 [06:59<1:37:02,  1.05s/it, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 23.33it/s][A
Epoch 4:   7%|▋         | 403/5971 [06:59<1:36:17,  1.04s/it, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  21%|██        | 35/167 [00:01<00:05, 24.42it/s][A

Validating:  23%|██▎       | 38/167 [00:02<00:05, 25.31it/s][A
Epoch 4:   7%|▋         | 407/5971 [06:59<1:35:19,  1.03s/it, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  25%|██▍       | 41/167 [00:02<00:05, 25.17it/s][A
Epoch 4:   7%|▋         | 411/5971 [06:59<1:34:21,  1.02s/it, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 24.66it/s][A
Epoch 4:   7%|▋         | 415/5971 [06:59<1:33:25,  1.01s/it, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  28%|██▊       | 47/167 [00:02<00:05, 23.42it/s][A

Validating:  30%|██▉       | 50/167 [00:02<00:05, 23.00it/s][A
Epoch 4:   7%|▋         | 419/5971 [06:59<1:32:30,  1.00it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 24.02it/s][A
Epoch 4:   7%|▋         | 423/5971 [07:00<1:31:36,  1.01it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 23.94it/s][A
Epoch 4:   7%|▋         | 427/5971 [07:00<1:30:43,  1.02it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  35%|███▌      | 59/167 [00:02<00:04, 23.70it/s][A

Validating:  37%|███▋      | 62/167 [00:03<00:04, 24.16it/s][A
Epoch 4:   7%|▋         | 431/5971 [07:00<1:29:51,  1.03it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  39%|███▉      | 65/167 [00:03<00:04, 24.48it/s][A
Epoch 4:   7%|▋         | 435/5971 [07:00<1:28:59,  1.04it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  41%|████      | 68/167 [00:03<00:03, 24.99it/s][A
Epoch 4:   7%|▋         | 439/5971 [07:00<1:28:09,  1.05it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 24.85it/s][A

Validating:  44%|████▍     | 74/167 [00:03<00:03, 24.16it/s][A
Epoch 4:   7%|▋         | 443/5971 [07:00<1:27:20,  1.05it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 23.25it/s][A
Epoch 4:   7%|▋         | 447/5971 [07:01<1:26:31,  1.06it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 24.08it/s][A
Epoch 4:   8%|▊         | 451/5971 [07:01<1:25:44,  1.07it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 24.21it/s][A

Validating:  51%|█████▏    | 86/167 [00:04<00:03, 25.17it/s][A
Epoch 4:   8%|▊         | 455/5971 [07:01<1:24:57,  1.08it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  53%|█████▎    | 89/167 [00:04<00:03, 24.90it/s][A
Epoch 4:   8%|▊         | 459/5971 [07:01<1:24:10,  1.09it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  55%|█████▌    | 92/167 [00:04<00:03, 23.84it/s][A
Epoch 4:   8%|▊         | 463/5971 [07:01<1:23:25,  1.10it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 24.03it/s][A

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 23.73it/s][A
Epoch 4:   8%|▊         | 467/5971 [07:01<1:22:41,  1.11it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  60%|██████    | 101/167 [00:04<00:02, 24.50it/s][A
Epoch 4:   8%|▊         | 471/5971 [07:02<1:21:57,  1.12it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 25.54it/s][A
Epoch 4:   8%|▊         | 475/5971 [07:02<1:21:14,  1.13it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 26.37it/s][A

Validating:  66%|██████▌   | 110/167 [00:05<00:02, 23.40it/s][A
Epoch 4:   8%|▊         | 479/5971 [07:02<1:20:32,  1.14it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  68%|██████▊   | 113/167 [00:05<00:02, 21.77it/s][A
Epoch 4:   8%|▊         | 483/5971 [07:02<1:19:51,  1.15it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  69%|██████▉   | 116/167 [00:05<00:02, 22.42it/s][A
Epoch 4:   8%|▊         | 487/5971 [07:02<1:19:10,  1.15it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  71%|███████▏  | 119/167 [00:05<00:02, 23.40it/s][A

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 24.17it/s][A
Epoch 4:   8%|▊         | 491/5971 [07:02<1:18:30,  1.16it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 24.68it/s][A
Epoch 4:   8%|▊         | 495/5971 [07:03<1:17:50,  1.17it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 25.28it/s][A
Epoch 4:   8%|▊         | 499/5971 [07:03<1:17:11,  1.18it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 25.28it/s][A

Validating:  80%|████████  | 134/167 [00:06<00:01, 26.14it/s][A
Epoch 4:   8%|▊         | 503/5971 [07:03<1:16:32,  1.19it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  82%|████████▏ | 137/167 [00:06<00:01, 26.58it/s][A
Epoch 4:   8%|▊         | 507/5971 [07:03<1:15:54,  1.20it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  84%|████████▍ | 140/167 [00:06<00:01, 25.93it/s][A
Epoch 4:   9%|▊         | 511/5971 [07:03<1:15:17,  1.21it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 26.59it/s][A

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 26.42it/s][A
Epoch 4:   9%|▊         | 515/5971 [07:03<1:14:40,  1.22it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 25.86it/s][A
Epoch 4:   9%|▊         | 519/5971 [07:03<1:14:04,  1.23it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 26.46it/s][A
Epoch 4:   9%|▉         | 523/5971 [07:04<1:13:29,  1.24it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 26.89it/s][A

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 26.26it/s][A
Epoch 4:   9%|▉         | 527/5971 [07:04<1:12:54,  1.24it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  96%|█████████▋| 161/167 [00:07<00:00, 24.12it/s][A
Epoch 4:   9%|▉         | 531/5971 [07:04<1:12:20,  1.25it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  98%|█████████▊| 164/167 [00:07<00:00, 25.12it/s][A
Epoch 4:   9%|▉         | 535/5971 [07:04<1:11:45,  1.26it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating: 100%|██████████| 167/167 [00:07<00:00, 26.41it/s][A
Epoch 4:   9%|▉         | 536/5971 [07:04<1:11:40,  1.26it/s, loss=0.172, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=2349.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Epoch 4:   9%|▉         | 537/5971 [07:06<1:11:43,  1.26it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0973, train/loss_vlb_step=0.000321, train/loss_step=0.0973, global_step=2350.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   9%|▉         | 538/5971 [07:06<1:11:43,  1.26it/s, loss=0.194, v_num=0, train/loss_simple_step=0.558, train/loss_vlb_step=0.00725, train/loss_step=0.558, global_step=2350.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:   9%|▉         | 539/5971 [07:07<1:11:44,  1.26it/s, loss=0.194, v_num=0, train/loss_simple_step=0.558, train/loss_vlb_step=0.00725, train/loss_step=0.558, global_step=2350.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   9%|▉         | 539/5971 [07:07<1:11:44,  1.26it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0572, train/loss_vlb_step=0.000198, train/loss_step=0.0572, global_step=2350.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   9%|▉         | 540/5971 [07:11<1:12:09,  1.25it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0668, train/loss_vlb_step=0.000232, train/loss_step=0.0668, global_step=2350.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   9%|▉         | 541/5971 [07:12<1:12:11,  1.25it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0903, train/loss_vlb_step=0.0003, train/loss_step=0.0903, global_step=2351.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:   9%|▉         | 542/5971 [07:13<1:12:11,  1.25it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0315, train/loss_vlb_step=0.000119, train/loss_step=0.0315, global_step=2351.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   9%|▉         | 543/5971 [07:14<1:12:12,  1.25it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0315, train/loss_vlb_step=0.000119, train/loss_step=0.0315, global_step=2351.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   9%|▉         | 543/5971 [07:14<1:12:12,  1.25it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00415, train/loss_vlb_step=2.15e-5, train/loss_step=0.00415, global_step=2351.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   9%|▉         | 544/5971 [07:16<1:12:28,  1.25it/s, loss=0.173, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.000853, train/loss_step=0.219, global_step=2351.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:   9%|▉         | 545/5971 [07:17<1:12:29,  1.25it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=5.19e-5, train/loss_step=0.0116, global_step=2352.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   9%|▉         | 546/5971 [07:18<1:12:29,  1.25it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0474, train/loss_vlb_step=0.000175, train/loss_step=0.0474, global_step=2352.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   9%|▉         | 547/5971 [07:19<1:12:29,  1.25it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0474, train/loss_vlb_step=0.000175, train/loss_step=0.0474, global_step=2352.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   9%|▉         | 547/5971 [07:19<1:12:29,  1.25it/s, loss=0.164, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000354, train/loss_step=0.107, global_step=2352.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:   9%|▉         | 548/5971 [07:23<1:12:57,  1.24it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0306, train/loss_vlb_step=0.000121, train/loss_step=0.0306, global_step=2352.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   9%|▉         | 549/5971 [07:24<1:12:57,  1.24it/s, loss=0.177, v_num=0, train/loss_simple_step=0.704, train/loss_vlb_step=0.0232, train/loss_step=0.704, global_step=2353.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:   9%|▉         | 550/5971 [07:25<1:12:58,  1.24it/s, loss=0.209, v_num=0, train/loss_simple_step=0.656, train/loss_vlb_step=0.0087, train/loss_step=0.656, global_step=2353.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   9%|▉         | 551/5971 [07:25<1:12:58,  1.24it/s, loss=0.209, v_num=0, train/loss_simple_step=0.656, train/loss_vlb_step=0.0087, train/loss_step=0.656, global_step=2353.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   9%|▉         | 551/5971 [07:25<1:12:58,  1.24it/s, loss=0.201, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000392, train/loss_step=0.119, global_step=2353.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   9%|▉         | 552/5971 [07:28<1:13:11,  1.23it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0289, train/loss_vlb_step=0.000114, train/loss_step=0.0289, global_step=2353.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   9%|▉         | 553/5971 [07:29<1:13:11,  1.23it/s, loss=0.185, v_num=0, train/loss_simple_step=0.399, train/loss_vlb_step=0.00363, train/loss_step=0.399, global_step=2354.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:   9%|▉         | 554/5971 [07:29<1:13:11,  1.23it/s, loss=0.196, v_num=0, train/loss_simple_step=0.495, train/loss_vlb_step=0.0048, train/loss_step=0.495, global_step=2354.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:   9%|▉         | 555/5971 [07:30<1:13:11,  1.23it/s, loss=0.196, v_num=0, train/loss_simple_step=0.495, train/loss_vlb_step=0.0048, train/loss_step=0.495, global_step=2354.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   9%|▉         | 555/5971 [07:30<1:13:11,  1.23it/s, loss=0.187, v_num=0, train/loss_simple_step=0.00325, train/loss_vlb_step=1.81e-5, train/loss_step=0.00325, global_step=2354.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   9%|▉         | 556/5971 [07:33<1:13:27,  1.23it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.78e-5, train/loss_step=0.0108, global_step=2354.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:   9%|▉         | 557/5971 [07:34<1:13:29,  1.23it/s, loss=0.189, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000515, train/loss_step=0.141, global_step=2355.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:   9%|▉         | 558/5971 [07:35<1:13:29,  1.23it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00237, train/loss_vlb_step=1.33e-5, train/loss_step=0.00237, global_step=2355.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   9%|▉         | 559/5971 [07:36<1:13:30,  1.23it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00237, train/loss_vlb_step=1.33e-5, train/loss_step=0.00237, global_step=2355.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   9%|▉         | 559/5971 [07:36<1:13:30,  1.23it/s, loss=0.208, v_num=0, train/loss_simple_step=0.983, train/loss_vlb_step=0.495, train/loss_step=0.983, global_step=2355.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]      
Epoch 4:   9%|▉         | 560/5971 [07:38<1:13:43,  1.22it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0123, train/loss_vlb_step=5.22e-5, train/loss_step=0.0123, global_step=2355.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   9%|▉         | 561/5971 [07:39<1:13:43,  1.22it/s, loss=0.203, v_num=0, train/loss_simple_step=0.0478, train/loss_vlb_step=0.000168, train/loss_step=0.0478, global_step=2356.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   9%|▉         | 562/5971 [07:40<1:13:43,  1.22it/s, loss=0.202, v_num=0, train/loss_simple_step=0.00634, train/loss_vlb_step=2.95e-5, train/loss_step=0.00634, global_step=2356.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   9%|▉         | 563/5971 [07:41<1:13:45,  1.22it/s, loss=0.202, v_num=0, train/loss_simple_step=0.00634, train/loss_vlb_step=2.95e-5, train/loss_step=0.00634, global_step=2356.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   9%|▉         | 563/5971 [07:41<1:13:45,  1.22it/s, loss=0.215, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00119, train/loss_step=0.267, global_step=2356.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:   9%|▉         | 564/5971 [07:43<1:13:57,  1.22it/s, loss=0.226, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00336, train/loss_step=0.452, global_step=2356.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   9%|▉         | 565/5971 [07:44<1:13:58,  1.22it/s, loss=0.244, v_num=0, train/loss_simple_step=0.368, train/loss_vlb_step=0.00189, train/loss_step=0.368, global_step=2357.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   9%|▉         | 566/5971 [07:45<1:13:59,  1.22it/s, loss=0.246, v_num=0, train/loss_simple_step=0.0863, train/loss_vlb_step=0.000284, train/loss_step=0.0863, global_step=2357.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   9%|▉         | 567/5971 [07:46<1:13:59,  1.22it/s, loss=0.246, v_num=0, train/loss_simple_step=0.0863, train/loss_vlb_step=0.000284, train/loss_step=0.0863, global_step=2357.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:   9%|▉         | 567/5971 [07:46<1:13:59,  1.22it/s, loss=0.269, v_num=0, train/loss_simple_step=0.574, train/loss_vlb_step=0.00564, train/loss_step=0.574, global_step=2357.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  10%|▉         | 568/5971 [07:49<1:14:15,  1.21it/s, loss=0.28, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.000842, train/loss_step=0.236, global_step=2357.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|▉         | 569/5971 [07:50<1:14:16,  1.21it/s, loss=0.247, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000161, train/loss_step=0.0453, global_step=2358.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|▉         | 570/5971 [07:51<1:14:15,  1.21it/s, loss=0.214, v_num=0, train/loss_simple_step=0.00234, train/loss_vlb_step=1.35e-5, train/loss_step=0.00234, global_step=2358.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|▉         | 571/5971 [07:52<1:14:16,  1.21it/s, loss=0.214, v_num=0, train/loss_simple_step=0.00234, train/loss_vlb_step=1.35e-5, train/loss_step=0.00234, global_step=2358.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|▉         | 571/5971 [07:52<1:14:16,  1.21it/s, loss=0.213, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.000329, train/loss_step=0.100, global_step=2358.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  10%|▉         | 572/5971 [07:54<1:14:28,  1.21it/s, loss=0.214, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000144, train/loss_step=0.0406, global_step=2358.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|▉         | 573/5971 [07:55<1:14:30,  1.21it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.54e-5, train/loss_step=0.0028, global_step=2359.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  10%|▉         | 574/5971 [07:56<1:14:30,  1.21it/s, loss=0.189, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.00239, train/loss_step=0.391, global_step=2359.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  10%|▉         | 575/5971 [07:57<1:14:31,  1.21it/s, loss=0.189, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.00239, train/loss_step=0.391, global_step=2359.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|▉         | 575/5971 [07:57<1:14:32,  1.21it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0465, train/loss_vlb_step=0.000168, train/loss_step=0.0465, global_step=2359.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|▉         | 576/5971 [07:59<1:14:44,  1.20it/s, loss=0.19, v_num=0, train/loss_simple_step=0.00388, train/loss_vlb_step=2.13e-5, train/loss_step=0.00388, global_step=2359.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|▉         | 577/5971 [08:00<1:14:44,  1.20it/s, loss=0.194, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000727, train/loss_step=0.207, global_step=2360.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  10%|▉         | 578/5971 [08:01<1:14:45,  1.20it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0194, train/loss_vlb_step=7.96e-5, train/loss_step=0.0194, global_step=2360.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|▉         | 579/5971 [08:02<1:14:45,  1.20it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0194, train/loss_vlb_step=7.96e-5, train/loss_step=0.0194, global_step=2360.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|▉         | 579/5971 [08:02<1:14:45,  1.20it/s, loss=0.159, v_num=0, train/loss_simple_step=0.273, train/loss_vlb_step=0.00119, train/loss_step=0.273, global_step=2360.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  10%|▉         | 580/5971 [08:05<1:15:02,  1.20it/s, loss=0.184, v_num=0, train/loss_simple_step=0.518, train/loss_vlb_step=0.00331, train/loss_step=0.518, global_step=2360.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|▉         | 581/5971 [08:06<1:15:02,  1.20it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000217, train/loss_step=0.0619, global_step=2361.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|▉         | 582/5971 [08:07<1:15:02,  1.20it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0282, train/loss_vlb_step=0.00011, train/loss_step=0.0282, global_step=2361.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  10%|▉         | 583/5971 [08:07<1:15:01,  1.20it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0282, train/loss_vlb_step=0.00011, train/loss_step=0.0282, global_step=2361.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|▉         | 583/5971 [08:07<1:15:01,  1.20it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00407, train/loss_vlb_step=2.15e-5, train/loss_step=0.00407, global_step=2361.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|▉         | 584/5971 [08:10<1:15:17,  1.19it/s, loss=0.168, v_num=0, train/loss_simple_step=0.351, train/loss_vlb_step=0.00182, train/loss_step=0.351, global_step=2361.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  10%|▉         | 585/5971 [08:11<1:15:18,  1.19it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.83e-5, train/loss_step=0.0108, global_step=2362.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|▉         | 586/5971 [08:12<1:15:17,  1.19it/s, loss=0.162, v_num=0, train/loss_simple_step=0.317, train/loss_vlb_step=0.00116, train/loss_step=0.317, global_step=2362.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  10%|▉         | 587/5971 [08:13<1:15:18,  1.19it/s, loss=0.162, v_num=0, train/loss_simple_step=0.317, train/loss_vlb_step=0.00116, train/loss_step=0.317, global_step=2362.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|▉         | 587/5971 [08:13<1:15:18,  1.19it/s, loss=0.172, v_num=0, train/loss_simple_step=0.781, train/loss_vlb_step=0.0448, train/loss_step=0.781, global_step=2362.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  10%|▉         | 588/5971 [08:15<1:15:32,  1.19it/s, loss=0.17, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000685, train/loss_step=0.192, global_step=2362.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|▉         | 589/5971 [08:17<1:15:34,  1.19it/s, loss=0.181, v_num=0, train/loss_simple_step=0.259, train/loss_vlb_step=0.00104, train/loss_step=0.259, global_step=2363.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|▉         | 590/5971 [08:18<1:15:34,  1.19it/s, loss=0.181, v_num=0, train/loss_simple_step=0.00938, train/loss_vlb_step=3.78e-5, train/loss_step=0.00938, global_step=2363.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|▉         | 591/5971 [08:19<1:15:35,  1.19it/s, loss=0.181, v_num=0, train/loss_simple_step=0.00938, train/loss_vlb_step=3.78e-5, train/loss_step=0.00938, global_step=2363.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|▉         | 591/5971 [08:19<1:15:35,  1.19it/s, loss=0.191, v_num=0, train/loss_simple_step=0.301, train/loss_vlb_step=0.00143, train/loss_step=0.301, global_step=2363.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  10%|▉         | 592/5971 [08:21<1:15:47,  1.18it/s, loss=0.189, v_num=0, train/loss_simple_step=0.00284, train/loss_vlb_step=1.57e-5, train/loss_step=0.00284, global_step=2363.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|▉         | 593/5971 [08:22<1:15:47,  1.18it/s, loss=0.199, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000728, train/loss_step=0.207, global_step=2364.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  10%|▉         | 594/5971 [08:23<1:15:47,  1.18it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0246, train/loss_vlb_step=9.55e-5, train/loss_step=0.0246, global_step=2364.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|▉         | 595/5971 [08:24<1:15:47,  1.18it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0246, train/loss_vlb_step=9.55e-5, train/loss_step=0.0246, global_step=2364.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|▉         | 595/5971 [08:24<1:15:47,  1.18it/s, loss=0.193, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.00115, train/loss_step=0.283, global_step=2364.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  10%|▉         | 596/5971 [08:26<1:16:04,  1.18it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0266, train/loss_vlb_step=0.000108, train/loss_step=0.0266, global_step=2364.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|▉         | 597/5971 [08:27<1:16:05,  1.18it/s, loss=0.189, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000392, train/loss_step=0.118, global_step=2365.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  10%|█         | 598/5971 [08:28<1:16:05,  1.18it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0477, train/loss_vlb_step=0.000167, train/loss_step=0.0477, global_step=2365.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|█         | 599/5971 [08:29<1:16:04,  1.18it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0477, train/loss_vlb_step=0.000167, train/loss_step=0.0477, global_step=2365.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|█         | 599/5971 [08:29<1:16:04,  1.18it/s, loss=0.218, v_num=0, train/loss_simple_step=0.818, train/loss_vlb_step=0.0599, train/loss_step=0.818, global_step=2365.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  10%|█         | 600/5971 [08:32<1:16:18,  1.17it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0056, train/loss_vlb_step=2.73e-5, train/loss_step=0.0056, global_step=2365.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|█         | 601/5971 [08:33<1:16:19,  1.17it/s, loss=0.19, v_num=0, train/loss_simple_step=0.00949, train/loss_vlb_step=4.46e-5, train/loss_step=0.00949, global_step=2366.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|█         | 602/5971 [08:34<1:16:20,  1.17it/s, loss=0.196, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000474, train/loss_step=0.142, global_step=2366.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  10%|█         | 603/5971 [08:35<1:16:20,  1.17it/s, loss=0.196, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000474, train/loss_step=0.142, global_step=2366.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|█         | 603/5971 [08:35<1:16:20,  1.17it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0199, train/loss_vlb_step=8.04e-5, train/loss_step=0.0199, global_step=2366.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|█         | 604/5971 [08:38<1:16:36,  1.17it/s, loss=0.184, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000333, train/loss_step=0.101, global_step=2366.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  10%|█         | 605/5971 [08:39<1:16:36,  1.17it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0281, train/loss_vlb_step=0.00011, train/loss_step=0.0281, global_step=2367.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|█         | 606/5971 [08:40<1:16:37,  1.17it/s, loss=0.187, v_num=0, train/loss_simple_step=0.360, train/loss_vlb_step=0.00189, train/loss_step=0.360, global_step=2367.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  10%|█         | 607/5971 [08:41<1:16:37,  1.17it/s, loss=0.187, v_num=0, train/loss_simple_step=0.360, train/loss_vlb_step=0.00189, train/loss_step=0.360, global_step=2367.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|█         | 607/5971 [08:41<1:16:37,  1.17it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00363, train/loss_vlb_step=1.95e-5, train/loss_step=0.00363, global_step=2367.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|█         | 608/5971 [08:43<1:16:51,  1.16it/s, loss=0.156, v_num=0, train/loss_simple_step=0.351, train/loss_vlb_step=0.00216, train/loss_step=0.351, global_step=2367.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  10%|█         | 609/5971 [08:44<1:16:51,  1.16it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.00012, train/loss_step=0.0318, global_step=2368.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|█         | 610/5971 [08:45<1:16:50,  1.16it/s, loss=0.161, v_num=0, train/loss_simple_step=0.339, train/loss_vlb_step=0.00168, train/loss_step=0.339, global_step=2368.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  10%|█         | 611/5971 [08:46<1:16:50,  1.16it/s, loss=0.161, v_num=0, train/loss_simple_step=0.339, train/loss_vlb_step=0.00168, train/loss_step=0.339, global_step=2368.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|█         | 611/5971 [08:46<1:16:50,  1.16it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00401, train/loss_vlb_step=2.04e-5, train/loss_step=0.00401, global_step=2368.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|█         | 612/5971 [08:48<1:17:04,  1.16it/s, loss=0.171, v_num=0, train/loss_simple_step=0.498, train/loss_vlb_step=0.00404, train/loss_step=0.498, global_step=2368.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  10%|█         | 613/5971 [08:49<1:17:03,  1.16it/s, loss=0.18, v_num=0, train/loss_simple_step=0.394, train/loss_vlb_step=0.0022, train/loss_step=0.394, global_step=2369.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  10%|█         | 614/5971 [08:50<1:17:03,  1.16it/s, loss=0.185, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=2369.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|█         | 615/5971 [08:51<1:17:02,  1.16it/s, loss=0.185, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=2369.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|█         | 615/5971 [08:51<1:17:02,  1.16it/s, loss=0.186, v_num=0, train/loss_simple_step=0.307, train/loss_vlb_step=0.0014, train/loss_step=0.307, global_step=2369.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  10%|█         | 616/5971 [08:54<1:17:17,  1.15it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0233, train/loss_vlb_step=8.59e-5, train/loss_step=0.0233, global_step=2369.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|█         | 617/5971 [08:55<1:17:18,  1.15it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.85e-5, train/loss_step=0.0173, global_step=2370.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|█         | 618/5971 [08:56<1:17:17,  1.15it/s, loss=0.194, v_num=0, train/loss_simple_step=0.298, train/loss_vlb_step=0.00142, train/loss_step=0.298, global_step=2370.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  10%|█         | 619/5971 [08:57<1:17:17,  1.15it/s, loss=0.194, v_num=0, train/loss_simple_step=0.298, train/loss_vlb_step=0.00142, train/loss_step=0.298, global_step=2370.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|█         | 619/5971 [08:57<1:17:17,  1.15it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0583, train/loss_vlb_step=0.0002, train/loss_step=0.0583, global_step=2370.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|█         | 620/5971 [08:59<1:17:29,  1.15it/s, loss=0.171, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.0017, train/loss_step=0.319, global_step=2370.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  10%|█         | 621/5971 [09:00<1:17:28,  1.15it/s, loss=0.182, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000876, train/loss_step=0.225, global_step=2371.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|█         | 622/5971 [09:01<1:17:28,  1.15it/s, loss=0.175, v_num=0, train/loss_simple_step=0.00767, train/loss_vlb_step=3.77e-5, train/loss_step=0.00767, global_step=2371.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|█         | 623/5971 [09:02<1:17:29,  1.15it/s, loss=0.175, v_num=0, train/loss_simple_step=0.00767, train/loss_vlb_step=3.77e-5, train/loss_step=0.00767, global_step=2371.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|█         | 623/5971 [09:02<1:17:29,  1.15it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0252, train/loss_vlb_step=9.53e-5, train/loss_step=0.0252, global_step=2371.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  10%|█         | 624/5971 [09:04<1:17:39,  1.15it/s, loss=0.176, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000367, train/loss_step=0.111, global_step=2371.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  10%|█         | 625/5971 [09:05<1:17:40,  1.15it/s, loss=0.175, v_num=0, train/loss_simple_step=0.00888, train/loss_vlb_step=4.02e-5, train/loss_step=0.00888, global_step=2372.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  10%|█         | 626/5971 [09:06<1:17:40,  1.15it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0563, train/loss_vlb_step=0.000193, train/loss_step=0.0563, global_step=2372.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  11%|█         | 627/5971 [09:07<1:17:40,  1.15it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0563, train/loss_vlb_step=0.000193, train/loss_step=0.0563, global_step=2372.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  11%|█         | 627/5971 [09:07<1:17:40,  1.15it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0755, train/loss_vlb_step=0.000254, train/loss_step=0.0755, global_step=2372.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  11%|█         | 628/5971 [09:10<1:17:52,  1.14it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0171, train/loss_vlb_step=7.49e-5, train/loss_step=0.0171, global_step=2372.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  11%|█         | 629/5971 [09:11<1:17:52,  1.14it/s, loss=0.155, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000736, train/loss_step=0.203, global_step=2373.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  11%|█         | 630/5971 [09:11<1:17:51,  1.14it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.73e-5, train/loss_step=0.0101, global_step=2373.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  11%|█         | 631/5971 [09:12<1:17:51,  1.14it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.73e-5, train/loss_step=0.0101, global_step=2373.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  11%|█         | 631/5971 [09:12<1:17:51,  1.14it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0054, train/loss_vlb_step=2.68e-5, train/loss_step=0.0054, global_step=2373.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  11%|█         | 632/5971 [09:15<1:18:05,  1.14it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.67e-5, train/loss_step=0.0126, global_step=2373.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  11%|█         | 633/5971 [09:16<1:18:04,  1.14it/s, loss=0.125, v_num=0, train/loss_simple_step=0.593, train/loss_vlb_step=0.012, train/loss_step=0.593, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  11%|█         | 634/5971 [09:17<1:18:04,  1.14it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.27e-5, train/loss_step=0.00216, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  11%|█         | 635/5971 [09:18<1:18:03,  1.14it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.27e-5, train/loss_step=0.00216, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  11%|█         | 635/5971 [09:18<1:18:03,  1.14it/s, loss=0.108, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000331, train/loss_step=0.101, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  11%|█         | 636/5971 [09:20<1:18:13,  1.14it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:16,  2.18it/s]

Validating:   1%|          | 2/167 [00:00<00:52,  3.16it/s]
Epoch 4:  11%|█         | 639/5971 [09:21<1:17:55,  1.14it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   3%|▎         | 5/167 [00:00<00:19,  8.34it/s]
Epoch 4:  11%|█         | 643/5971 [09:21<1:17:24,  1.15it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:00<00:12, 12.67it/s]
Epoch 4:  11%|█         | 647/5971 [09:21<1:16:53,  1.15it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   7%|▋         | 11/167 [00:01<00:09, 16.22it/s]
Epoch 4:  11%|█         | 651/5971 [09:21<1:16:22,  1.16it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   9%|▉         | 15/167 [00:01<00:07, 20.11it/s]

Validating:  11%|█         | 18/167 [00:01<00:06, 21.40it/s]
Epoch 4:  11%|█         | 655/5971 [09:21<1:15:52,  1.17it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 21.79it/s]
Epoch 4:  11%|█         | 659/5971 [09:21<1:15:22,  1.17it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  14%|█▍        | 24/167 [00:01<00:06, 22.16it/s]
Epoch 4:  11%|█         | 663/5971 [09:22<1:14:53,  1.18it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  16%|█▌        | 27/167 [00:01<00:06, 22.84it/s]

Validating:  18%|█▊        | 30/167 [00:01<00:06, 21.87it/s]
Epoch 4:  11%|█         | 667/5971 [09:22<1:14:24,  1.19it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 22.96it/s]
Epoch 4:  11%|█         | 671/5971 [09:22<1:13:56,  1.19it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  22%|██▏       | 36/167 [00:02<00:05, 23.14it/s]
Epoch 4:  11%|█▏        | 675/5971 [09:22<1:13:27,  1.20it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  23%|██▎       | 39/167 [00:02<00:05, 23.45it/s]

Validating:  25%|██▌       | 42/167 [00:02<00:05, 24.58it/s]
Epoch 4:  11%|█▏        | 679/5971 [09:22<1:12:59,  1.21it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 25.07it/s]
Epoch 4:  11%|█▏        | 683/5971 [09:22<1:12:32,  1.22it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 24.98it/s]
Epoch 4:  12%|█▏        | 687/5971 [09:23<1:12:04,  1.22it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  31%|███       | 51/167 [00:02<00:04, 24.80it/s]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 25.05it/s]
Epoch 4:  12%|█▏        | 691/5971 [09:23<1:11:37,  1.23it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 25.38it/s]
Epoch 4:  12%|█▏        | 695/5971 [09:23<1:11:10,  1.24it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  36%|███▌      | 60/167 [00:03<00:04, 25.90it/s]
Epoch 4:  12%|█▏        | 699/5971 [09:23<1:10:44,  1.24it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  38%|███▊      | 63/167 [00:03<00:03, 26.87it/s]

Validating:  40%|███▉      | 66/167 [00:03<00:03, 26.18it/s]
Epoch 4:  12%|█▏        | 703/5971 [09:23<1:10:18,  1.25it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 24.62it/s]
Epoch 4:  12%|█▏        | 707/5971 [09:23<1:09:52,  1.26it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 25.21it/s]
Epoch 4:  12%|█▏        | 711/5971 [09:24<1:09:26,  1.26it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 25.40it/s][A

Validating:  47%|████▋     | 78/167 [00:03<00:03, 26.01it/s][A
Epoch 4:  12%|█▏        | 715/5971 [09:24<1:09:01,  1.27it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 24.79it/s][A
Epoch 4:  12%|█▏        | 719/5971 [09:24<1:08:36,  1.28it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  50%|█████     | 84/167 [00:03<00:03, 24.27it/s][A
Epoch 4:  12%|█▏        | 723/5971 [09:24<1:08:12,  1.28it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  52%|█████▏    | 87/167 [00:04<00:03, 24.06it/s][A

Validating:  54%|█████▍    | 90/167 [00:04<00:03, 24.02it/s][A
Epoch 4:  12%|█▏        | 727/5971 [09:24<1:07:47,  1.29it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  56%|█████▌    | 93/167 [00:04<00:03, 23.22it/s][A
Epoch 4:  12%|█▏        | 731/5971 [09:24<1:07:23,  1.30it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 24.31it/s][A
Epoch 4:  12%|█▏        | 735/5971 [09:25<1:06:59,  1.30it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 24.05it/s][A

Validating:  61%|██████    | 102/167 [00:04<00:02, 23.72it/s][A
Epoch 4:  12%|█▏        | 739/5971 [09:25<1:06:36,  1.31it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 23.95it/s][A
Epoch 4:  12%|█▏        | 743/5971 [09:25<1:06:12,  1.32it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 25.03it/s][A
Epoch 4:  13%|█▎        | 747/5971 [09:25<1:05:49,  1.32it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  67%|██████▋   | 112/167 [00:05<00:02, 25.85it/s][A
Epoch 4:  13%|█▎        | 751/5971 [09:25<1:05:26,  1.33it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  69%|██████▉   | 115/167 [00:05<00:02, 25.79it/s][A

Validating:  71%|███████   | 118/167 [00:05<00:01, 25.20it/s][A
Epoch 4:  13%|█▎        | 755/5971 [09:25<1:05:03,  1.34it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 23.99it/s][A
Epoch 4:  13%|█▎        | 759/5971 [09:26<1:04:41,  1.34it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 23.77it/s][A
Epoch 4:  13%|█▎        | 763/5971 [09:26<1:04:19,  1.35it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 24.96it/s][A

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 26.26it/s][A
Epoch 4:  13%|█▎        | 767/5971 [09:26<1:03:57,  1.36it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 26.55it/s][A
Epoch 4:  13%|█▎        | 771/5971 [09:26<1:03:35,  1.36it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  81%|████████▏ | 136/167 [00:06<00:01, 25.85it/s][A
Epoch 4:  13%|█▎        | 775/5971 [09:26<1:03:14,  1.37it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  83%|████████▎ | 139/167 [00:06<00:01, 24.63it/s][A
Epoch 4:  13%|█▎        | 779/5971 [09:26<1:02:52,  1.38it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 25.95it/s][A

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 25.03it/s][A
Epoch 4:  13%|█▎        | 783/5971 [09:26<1:02:31,  1.38it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 25.17it/s][A
Epoch 4:  13%|█▎        | 787/5971 [09:27<1:02:10,  1.39it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 25.81it/s][A
Epoch 4:  13%|█▎        | 791/5971 [09:27<1:01:50,  1.40it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 25.16it/s][A

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 25.95it/s][A
Epoch 4:  13%|█▎        | 795/5971 [09:27<1:01:29,  1.40it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  96%|█████████▋| 161/167 [00:07<00:00, 24.73it/s][A
Epoch 4:  13%|█▎        | 799/5971 [09:27<1:01:09,  1.41it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  98%|█████████▊| 164/167 [00:07<00:00, 25.36it/s][A
Epoch 4:  13%|█▎        | 803/5971 [09:27<1:00:49,  1.42it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating: 100%|██████████| 167/167 [00:07<00:00, 25.60it/s][A
Epoch 4:  13%|█▎        | 804/5971 [09:28<1:00:45,  1.42it/s, loss=0.117, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000647, train/loss_step=0.190, global_step=2374.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Epoch 4:  13%|█▎        | 805/5971 [09:28<1:00:46,  1.42it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00203, train/loss_vlb_step=1.18e-5, train/loss_step=0.00203, global_step=2375.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  13%|█▎        | 806/5971 [09:29<1:00:47,  1.42it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.55e-5, train/loss_step=0.0122, global_step=2375.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  14%|█▎        | 807/5971 [09:30<1:00:48,  1.42it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.55e-5, train/loss_step=0.0122, global_step=2375.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▎        | 807/5971 [09:30<1:00:48,  1.42it/s, loss=0.106, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000545, train/loss_step=0.153, global_step=2375.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  14%|█▎        | 808/5971 [09:33<1:01:00,  1.41it/s, loss=0.091, v_num=0, train/loss_simple_step=0.00851, train/loss_vlb_step=3.99e-5, train/loss_step=0.00851, global_step=2375.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▎        | 809/5971 [09:34<1:01:01,  1.41it/s, loss=0.0816, v_num=0, train/loss_simple_step=0.0388, train/loss_vlb_step=0.000146, train/loss_step=0.0388, global_step=2376.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▎        | 810/5971 [09:35<1:01:02,  1.41it/s, loss=0.0819, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.38e-5, train/loss_step=0.0122, global_step=2376.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  14%|█▎        | 811/5971 [09:36<1:01:02,  1.41it/s, loss=0.0819, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.38e-5, train/loss_step=0.0122, global_step=2376.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▎        | 811/5971 [09:36<1:01:02,  1.41it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.00639, train/loss_vlb_step=3.12e-5, train/loss_step=0.00639, global_step=2376.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▎        | 812/5971 [09:38<1:01:13,  1.40it/s, loss=0.0756, v_num=0, train/loss_simple_step=0.00414, train/loss_vlb_step=2.22e-5, train/loss_step=0.00414, global_step=2376.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▎        | 813/5971 [09:39<1:01:13,  1.40it/s, loss=0.0788, v_num=0, train/loss_simple_step=0.0728, train/loss_vlb_step=0.00024, train/loss_step=0.0728, global_step=2377.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  14%|█▎        | 814/5971 [09:40<1:01:14,  1.40it/s, loss=0.098, v_num=0, train/loss_simple_step=0.440, train/loss_vlb_step=0.00354, train/loss_step=0.440, global_step=2377.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  14%|█▎        | 815/5971 [09:41<1:01:15,  1.40it/s, loss=0.098, v_num=0, train/loss_simple_step=0.440, train/loss_vlb_step=0.00354, train/loss_step=0.440, global_step=2377.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▎        | 815/5971 [09:41<1:01:15,  1.40it/s, loss=0.0974, v_num=0, train/loss_simple_step=0.0635, train/loss_vlb_step=0.000217, train/loss_step=0.0635, global_step=2377.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▎        | 816/5971 [09:44<1:01:26,  1.40it/s, loss=0.0989, v_num=0, train/loss_simple_step=0.0475, train/loss_vlb_step=0.000167, train/loss_step=0.0475, global_step=2377.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▎        | 817/5971 [09:45<1:01:27,  1.40it/s, loss=0.089, v_num=0, train/loss_simple_step=0.00418, train/loss_vlb_step=2.2e-5, train/loss_step=0.00418, global_step=2378.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  14%|█▎        | 818/5971 [09:46<1:01:28,  1.40it/s, loss=0.0893, v_num=0, train/loss_simple_step=0.0161, train/loss_vlb_step=6.73e-5, train/loss_step=0.0161, global_step=2378.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▎        | 819/5971 [09:47<1:01:28,  1.40it/s, loss=0.0893, v_num=0, train/loss_simple_step=0.0161, train/loss_vlb_step=6.73e-5, train/loss_step=0.0161, global_step=2378.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▎        | 819/5971 [09:47<1:01:28,  1.40it/s, loss=0.0891, v_num=0, train/loss_simple_step=0.00292, train/loss_vlb_step=1.65e-5, train/loss_step=0.00292, global_step=2378.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▎        | 820/5971 [09:49<1:01:38,  1.39it/s, loss=0.0943, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000381, train/loss_step=0.116, global_step=2378.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  14%|█▎        | 821/5971 [09:50<1:01:39,  1.39it/s, loss=0.0663, v_num=0, train/loss_simple_step=0.0326, train/loss_vlb_step=0.000118, train/loss_step=0.0326, global_step=2379.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 822/5971 [09:51<1:01:40,  1.39it/s, loss=0.0664, v_num=0, train/loss_simple_step=0.00433, train/loss_vlb_step=2.24e-5, train/loss_step=0.00433, global_step=2379.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 823/5971 [09:52<1:01:40,  1.39it/s, loss=0.0664, v_num=0, train/loss_simple_step=0.00433, train/loss_vlb_step=2.24e-5, train/loss_step=0.00433, global_step=2379.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 823/5971 [09:52<1:01:40,  1.39it/s, loss=0.0658, v_num=0, train/loss_simple_step=0.0885, train/loss_vlb_step=0.000292, train/loss_step=0.0885, global_step=2379.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  14%|█▍        | 824/5971 [09:54<1:01:49,  1.39it/s, loss=0.0909, v_num=0, train/loss_simple_step=0.693, train/loss_vlb_step=0.0177, train/loss_step=0.693, global_step=2379.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  14%|█▍        | 825/5971 [09:55<1:01:50,  1.39it/s, loss=0.091, v_num=0, train/loss_simple_step=0.00253, train/loss_vlb_step=1.49e-5, train/loss_step=0.00253, global_step=2380.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 826/5971 [09:56<1:01:51,  1.39it/s, loss=0.0988, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000577, train/loss_step=0.170, global_step=2380.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  14%|█▍        | 827/5971 [09:57<1:01:51,  1.39it/s, loss=0.0988, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000577, train/loss_step=0.170, global_step=2380.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 827/5971 [09:57<1:01:51,  1.39it/s, loss=0.106, v_num=0, train/loss_simple_step=0.304, train/loss_vlb_step=0.00149, train/loss_step=0.304, global_step=2380.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  14%|█▍        | 828/5971 [09:59<1:02:01,  1.38it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0357, train/loss_vlb_step=0.000133, train/loss_step=0.0357, global_step=2380.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 829/5971 [10:00<1:02:01,  1.38it/s, loss=0.113, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000485, train/loss_step=0.148, global_step=2381.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  14%|█▍        | 830/5971 [10:01<1:02:01,  1.38it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00241, train/loss_vlb_step=1.38e-5, train/loss_step=0.00241, global_step=2381.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 831/5971 [10:02<1:02:02,  1.38it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00241, train/loss_vlb_step=1.38e-5, train/loss_step=0.00241, global_step=2381.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 831/5971 [10:02<1:02:02,  1.38it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0043, train/loss_vlb_step=2.22e-5, train/loss_step=0.0043, global_step=2381.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  14%|█▍        | 832/5971 [10:05<1:02:14,  1.38it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0223, train/loss_vlb_step=9.29e-5, train/loss_step=0.0223, global_step=2381.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 833/5971 [10:06<1:02:15,  1.38it/s, loss=0.119, v_num=0, train/loss_simple_step=0.191, train/loss_vlb_step=0.000696, train/loss_step=0.191, global_step=2382.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  14%|█▍        | 834/5971 [10:07<1:02:15,  1.38it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0643, train/loss_vlb_step=0.000213, train/loss_step=0.0643, global_step=2382.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 835/5971 [10:08<1:02:16,  1.37it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0643, train/loss_vlb_step=0.000213, train/loss_step=0.0643, global_step=2382.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 835/5971 [10:08<1:02:16,  1.37it/s, loss=0.0976, v_num=0, train/loss_simple_step=0.00181, train/loss_vlb_step=1.04e-5, train/loss_step=0.00181, global_step=2382.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 836/5971 [10:10<1:02:25,  1.37it/s, loss=0.112, v_num=0, train/loss_simple_step=0.332, train/loss_vlb_step=0.0018, train/loss_step=0.332, global_step=2382.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]      
Epoch 4:  14%|█▍        | 837/5971 [10:11<1:02:26,  1.37it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0861, train/loss_vlb_step=0.000288, train/loss_step=0.0861, global_step=2383.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 838/5971 [10:12<1:02:27,  1.37it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00527, train/loss_vlb_step=2.7e-5, train/loss_step=0.00527, global_step=2383.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 839/5971 [10:13<1:02:28,  1.37it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00527, train/loss_vlb_step=2.7e-5, train/loss_step=0.00527, global_step=2383.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 839/5971 [10:13<1:02:28,  1.37it/s, loss=0.139, v_num=0, train/loss_simple_step=0.474, train/loss_vlb_step=0.00277, train/loss_step=0.474, global_step=2383.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  14%|█▍        | 840/5971 [10:15<1:02:36,  1.37it/s, loss=0.138, v_num=0, train/loss_simple_step=0.096, train/loss_vlb_step=0.00032, train/loss_step=0.096, global_step=2383.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 841/5971 [10:16<1:02:36,  1.37it/s, loss=0.147, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000846, train/loss_step=0.218, global_step=2384.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 842/5971 [10:17<1:02:37,  1.37it/s, loss=0.152, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000362, train/loss_step=0.107, global_step=2384.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 843/5971 [10:18<1:02:37,  1.36it/s, loss=0.152, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000362, train/loss_step=0.107, global_step=2384.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 843/5971 [10:18<1:02:37,  1.36it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00117, train/loss_vlb_step=7.06e-6, train/loss_step=0.00117, global_step=2384.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 844/5971 [10:21<1:02:48,  1.36it/s, loss=0.132, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00187, train/loss_step=0.382, global_step=2384.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  14%|█▍        | 845/5971 [10:22<1:02:49,  1.36it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.58e-5, train/loss_step=0.0102, global_step=2385.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 846/5971 [10:23<1:02:49,  1.36it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=2385.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 847/5971 [10:23<1:02:49,  1.36it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=2385.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 847/5971 [10:23<1:02:49,  1.36it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00291, train/loss_vlb_step=1.66e-5, train/loss_step=0.00291, global_step=2385.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 848/5971 [10:26<1:03:00,  1.36it/s, loss=0.131, v_num=0, train/loss_simple_step=0.446, train/loss_vlb_step=0.00358, train/loss_step=0.446, global_step=2385.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  14%|█▍        | 849/5971 [10:27<1:03:02,  1.35it/s, loss=0.146, v_num=0, train/loss_simple_step=0.447, train/loss_vlb_step=0.00318, train/loss_step=0.447, global_step=2386.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 850/5971 [10:28<1:03:02,  1.35it/s, loss=0.153, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000473, train/loss_step=0.142, global_step=2386.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 851/5971 [10:29<1:03:03,  1.35it/s, loss=0.153, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000473, train/loss_step=0.142, global_step=2386.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 851/5971 [10:29<1:03:03,  1.35it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00207, train/loss_vlb_step=1.2e-5, train/loss_step=0.00207, global_step=2386.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 852/5971 [10:31<1:03:11,  1.35it/s, loss=0.161, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.00059, train/loss_step=0.173, global_step=2386.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  14%|█▍        | 853/5971 [10:32<1:03:11,  1.35it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00493, train/loss_vlb_step=2.55e-5, train/loss_step=0.00493, global_step=2387.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 854/5971 [10:33<1:03:12,  1.35it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0874, train/loss_vlb_step=0.000292, train/loss_step=0.0874, global_step=2387.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  14%|█▍        | 855/5971 [10:34<1:03:12,  1.35it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0874, train/loss_vlb_step=0.000292, train/loss_step=0.0874, global_step=2387.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 855/5971 [10:34<1:03:12,  1.35it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.49e-5, train/loss_step=0.0127, global_step=2387.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  14%|█▍        | 856/5971 [10:36<1:03:21,  1.35it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0459, train/loss_vlb_step=0.000164, train/loss_step=0.0459, global_step=2387.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 857/5971 [10:37<1:03:22,  1.34it/s, loss=0.144, v_num=0, train/loss_simple_step=0.199, train/loss_vlb_step=0.000743, train/loss_step=0.199, global_step=2388.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  14%|█▍        | 858/5971 [10:38<1:03:22,  1.34it/s, loss=0.15, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000359, train/loss_step=0.108, global_step=2388.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  14%|█▍        | 859/5971 [10:39<1:03:23,  1.34it/s, loss=0.15, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000359, train/loss_step=0.108, global_step=2388.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 859/5971 [10:39<1:03:23,  1.34it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00268, train/loss_vlb_step=1.51e-5, train/loss_step=0.00268, global_step=2388.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 860/5971 [10:42<1:03:32,  1.34it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0477, train/loss_vlb_step=0.00017, train/loss_step=0.0477, global_step=2388.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  14%|█▍        | 861/5971 [10:43<1:03:33,  1.34it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0026, train/loss_vlb_step=1.45e-5, train/loss_step=0.0026, global_step=2389.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 862/5971 [10:44<1:03:33,  1.34it/s, loss=0.136, v_num=0, train/loss_simple_step=0.574, train/loss_vlb_step=0.00447, train/loss_step=0.574, global_step=2389.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  14%|█▍        | 863/5971 [10:45<1:03:33,  1.34it/s, loss=0.136, v_num=0, train/loss_simple_step=0.574, train/loss_vlb_step=0.00447, train/loss_step=0.574, global_step=2389.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 863/5971 [10:45<1:03:33,  1.34it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.32e-5, train/loss_step=0.00225, global_step=2389.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  14%|█▍        | 864/5971 [10:47<1:03:43,  1.34it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0131, train/loss_vlb_step=5.81e-5, train/loss_step=0.0131, global_step=2389.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  14%|█▍        | 865/5971 [10:48<1:03:44,  1.34it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00465, train/loss_vlb_step=2.46e-5, train/loss_step=0.00465, global_step=2390.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▍        | 866/5971 [10:49<1:03:44,  1.33it/s, loss=0.121, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=2390.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  15%|█▍        | 867/5971 [10:50<1:03:45,  1.33it/s, loss=0.121, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=2390.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▍        | 867/5971 [10:50<1:03:45,  1.33it/s, loss=0.143, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.0028, train/loss_step=0.454, global_step=2390.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  15%|█▍        | 868/5971 [10:52<1:03:52,  1.33it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0229, train/loss_vlb_step=9.17e-5, train/loss_step=0.0229, global_step=2390.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▍        | 869/5971 [10:53<1:03:53,  1.33it/s, loss=0.109, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000597, train/loss_step=0.172, global_step=2391.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  15%|█▍        | 870/5971 [10:54<1:03:53,  1.33it/s, loss=0.111, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000631, train/loss_step=0.184, global_step=2391.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▍        | 871/5971 [10:55<1:03:54,  1.33it/s, loss=0.111, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000631, train/loss_step=0.184, global_step=2391.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▍        | 871/5971 [10:55<1:03:54,  1.33it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0843, train/loss_vlb_step=0.000295, train/loss_step=0.0843, global_step=2391.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▍        | 872/5971 [10:57<1:04:02,  1.33it/s, loss=0.126, v_num=0, train/loss_simple_step=0.390, train/loss_vlb_step=0.00161, train/loss_step=0.390, global_step=2391.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  15%|█▍        | 873/5971 [10:58<1:04:03,  1.33it/s, loss=0.147, v_num=0, train/loss_simple_step=0.425, train/loss_vlb_step=0.00286, train/loss_step=0.425, global_step=2392.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▍        | 874/5971 [10:59<1:04:03,  1.33it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.05e-5, train/loss_step=0.0192, global_step=2392.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▍        | 875/5971 [11:00<1:04:03,  1.33it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=8.05e-5, train/loss_step=0.0192, global_step=2392.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▍        | 875/5971 [11:00<1:04:03,  1.33it/s, loss=0.143, v_num=0, train/loss_simple_step=0.015, train/loss_vlb_step=6.43e-5, train/loss_step=0.015, global_step=2392.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  15%|█▍        | 876/5971 [11:03<1:04:12,  1.32it/s, loss=0.147, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.00037, train/loss_step=0.112, global_step=2392.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▍        | 877/5971 [11:04<1:04:12,  1.32it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0371, train/loss_vlb_step=0.00013, train/loss_step=0.0371, global_step=2393.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▍        | 878/5971 [11:04<1:04:12,  1.32it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0322, train/loss_vlb_step=0.000111, train/loss_step=0.0322, global_step=2393.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▍        | 879/5971 [11:05<1:04:12,  1.32it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0322, train/loss_vlb_step=0.000111, train/loss_step=0.0322, global_step=2393.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▍        | 879/5971 [11:05<1:04:12,  1.32it/s, loss=0.181, v_num=0, train/loss_simple_step=0.932, train/loss_vlb_step=0.469, train/loss_step=0.932, global_step=2393.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]     
Epoch 4:  15%|█▍        | 880/5971 [11:08<1:04:20,  1.32it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0345, train/loss_vlb_step=0.000123, train/loss_step=0.0345, global_step=2393.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▍        | 881/5971 [11:09<1:04:21,  1.32it/s, loss=0.198, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.0017, train/loss_step=0.349, global_step=2394.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  15%|█▍        | 882/5971 [11:10<1:04:22,  1.32it/s, loss=0.191, v_num=0, train/loss_simple_step=0.431, train/loss_vlb_step=0.00318, train/loss_step=0.431, global_step=2394.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▍        | 883/5971 [11:11<1:04:22,  1.32it/s, loss=0.191, v_num=0, train/loss_simple_step=0.431, train/loss_vlb_step=0.00318, train/loss_step=0.431, global_step=2394.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▍        | 883/5971 [11:11<1:04:22,  1.32it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0704, train/loss_vlb_step=0.000231, train/loss_step=0.0704, global_step=2394.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▍        | 884/5971 [11:13<1:04:32,  1.31it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0218, train/loss_vlb_step=8.83e-5, train/loss_step=0.0218, global_step=2394.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  15%|█▍        | 885/5971 [11:14<1:04:32,  1.31it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0606, train/loss_vlb_step=0.000205, train/loss_step=0.0606, global_step=2395.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▍        | 886/5971 [11:15<1:04:33,  1.31it/s, loss=0.234, v_num=0, train/loss_simple_step=0.825, train/loss_vlb_step=0.208, train/loss_step=0.825, global_step=2395.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]     
Epoch 4:  15%|█▍        | 887/5971 [11:16<1:04:33,  1.31it/s, loss=0.234, v_num=0, train/loss_simple_step=0.825, train/loss_vlb_step=0.208, train/loss_step=0.825, global_step=2395.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▍        | 887/5971 [11:16<1:04:33,  1.31it/s, loss=0.212, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.38e-5, train/loss_step=0.018, global_step=2395.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▍        | 888/5971 [11:18<1:04:40,  1.31it/s, loss=0.213, v_num=0, train/loss_simple_step=0.051, train/loss_vlb_step=0.000177, train/loss_step=0.051, global_step=2395.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▍        | 889/5971 [11:19<1:04:40,  1.31it/s, loss=0.218, v_num=0, train/loss_simple_step=0.275, train/loss_vlb_step=0.00113, train/loss_step=0.275, global_step=2396.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  15%|█▍        | 890/5971 [11:20<1:04:40,  1.31it/s, loss=0.21, v_num=0, train/loss_simple_step=0.00824, train/loss_vlb_step=3.93e-5, train/loss_step=0.00824, global_step=2396.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▍        | 891/5971 [11:21<1:04:40,  1.31it/s, loss=0.21, v_num=0, train/loss_simple_step=0.00824, train/loss_vlb_step=3.93e-5, train/loss_step=0.00824, global_step=2396.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▍        | 891/5971 [11:21<1:04:40,  1.31it/s, loss=0.208, v_num=0, train/loss_simple_step=0.052, train/loss_vlb_step=0.000184, train/loss_step=0.052, global_step=2396.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  15%|█▍        | 892/5971 [11:23<1:04:49,  1.31it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.39e-5, train/loss_step=0.0124, global_step=2396.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▍        | 893/5971 [11:24<1:04:49,  1.31it/s, loss=0.174, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000414, train/loss_step=0.126, global_step=2397.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  15%|█▍        | 894/5971 [11:25<1:04:49,  1.31it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0263, train/loss_vlb_step=9.75e-5, train/loss_step=0.0263, global_step=2397.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▍        | 895/5971 [11:26<1:04:49,  1.30it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0263, train/loss_vlb_step=9.75e-5, train/loss_step=0.0263, global_step=2397.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▍        | 895/5971 [11:26<1:04:49,  1.30it/s, loss=0.182, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000609, train/loss_step=0.165, global_step=2397.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  15%|█▌        | 896/5971 [11:29<1:04:58,  1.30it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0336, train/loss_vlb_step=0.000128, train/loss_step=0.0336, global_step=2397.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▌        | 897/5971 [11:29<1:04:58,  1.30it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0766, train/loss_vlb_step=0.000252, train/loss_step=0.0766, global_step=2398.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  15%|█▌        | 898/5971 [11:30<1:04:58,  1.30it/s, loss=0.191, v_num=0, train/loss_simple_step=0.256, train/loss_vlb_step=0.00107, train/loss_step=0.256, global_step=2398.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  15%|█▌        | 899/5971 [11:31<1:04:58,  1.30it/s, loss=0.191, v_num=0, train/loss_simple_step=0.256, train/loss_vlb_step=0.00107, train/loss_step=0.256, global_step=2398.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▌        | 899/5971 [11:31<1:04:58,  1.30it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0214, train/loss_vlb_step=8.54e-5, train/loss_step=0.0214, global_step=2398.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▌        | 900/5971 [11:34<1:05:06,  1.30it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0025, train/loss_vlb_step=1.44e-5, train/loss_step=0.0025, global_step=2398.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▌        | 901/5971 [11:35<1:05:06,  1.30it/s, loss=0.147, v_num=0, train/loss_simple_step=0.400, train/loss_vlb_step=0.00194, train/loss_step=0.400, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  15%|█▌        | 902/5971 [11:35<1:05:06,  1.30it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0719, train/loss_vlb_step=0.000239, train/loss_step=0.0719, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▌        | 903/5971 [11:36<1:05:06,  1.30it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0719, train/loss_vlb_step=0.000239, train/loss_step=0.0719, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  15%|█▌        | 903/5971 [11:36<1:05:06,  1.30it/s, loss=0.148, v_num=0, train/loss_simple_step=0.466, train/loss_vlb_step=0.00391, train/loss_step=0.466, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  15%|█▌        | 904/5971 [11:39<1:05:15,  1.29it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:09,  2.38it/s][A

Validating:   1%|          | 2/167 [00:00<00:55,  2.99it/s][A
Epoch 4:  15%|█▌        | 907/5971 [11:40<1:05:04,  1.30it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   3%|▎         | 5/167 [00:00<00:19,  8.35it/s][A
Epoch 4:  15%|█▌        | 911/5971 [11:40<1:04:44,  1.30it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:00<00:12, 12.38it/s][A

Validating:   6%|▌         | 10/167 [00:01<00:11, 13.42it/s][A
Epoch 4:  15%|█▌        | 915/5971 [11:40<1:04:25,  1.31it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   8%|▊         | 13/167 [00:01<00:08, 17.20it/s][A
Epoch 4:  15%|█▌        | 919/5971 [11:40<1:04:06,  1.31it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  10%|▉         | 16/167 [00:01<00:07, 18.94it/s][A
Epoch 4:  15%|█▌        | 923/5971 [11:40<1:03:48,  1.32it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  11%|█▏        | 19/167 [00:01<00:07, 21.11it/s][A

Validating:  13%|█▎        | 22/167 [00:01<00:06, 21.96it/s][A
Epoch 4:  16%|█▌        | 927/5971 [11:40<1:03:29,  1.32it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  15%|█▍        | 25/167 [00:01<00:05, 23.71it/s][A
Epoch 4:  16%|█▌        | 931/5971 [11:41<1:03:10,  1.33it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  17%|█▋        | 28/167 [00:01<00:05, 23.27it/s][A
Epoch 4:  16%|█▌        | 935/5971 [11:41<1:02:52,  1.33it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  19%|█▊        | 31/167 [00:01<00:05, 23.44it/s][A

Validating:  20%|██        | 34/167 [00:01<00:05, 24.52it/s][A
Epoch 4:  16%|█▌        | 939/5971 [11:41<1:02:34,  1.34it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  22%|██▏       | 37/167 [00:02<00:05, 25.80it/s][A
Epoch 4:  16%|█▌        | 943/5971 [11:41<1:02:16,  1.35it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  24%|██▍       | 40/167 [00:02<00:04, 26.25it/s][A
Epoch 4:  16%|█▌        | 947/5971 [11:41<1:01:58,  1.35it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  26%|██▌       | 43/167 [00:02<00:04, 26.41it/s][A

Validating:  28%|██▊       | 46/167 [00:02<00:04, 26.44it/s][A
Epoch 4:  16%|█▌        | 951/5971 [11:41<1:01:40,  1.36it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 27.64it/s][A
Epoch 4:  16%|█▌        | 955/5971 [11:41<1:01:22,  1.36it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 27.35it/s][A
Epoch 4:  16%|█▌        | 959/5971 [11:42<1:01:05,  1.37it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 27.51it/s][A
Epoch 4:  16%|█▌        | 963/5971 [11:42<1:00:48,  1.37it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  35%|███▌      | 59/167 [00:02<00:04, 26.32it/s][A

Validating:  37%|███▋      | 62/167 [00:03<00:03, 26.44it/s][A
Epoch 4:  16%|█▌        | 967/5971 [11:42<1:00:30,  1.38it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  39%|███▉      | 65/167 [00:03<00:03, 26.69it/s][A
Epoch 4:  16%|█▋        | 971/5971 [11:42<1:00:13,  1.38it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  41%|████      | 68/167 [00:03<00:03, 26.15it/s][A
Epoch 4:  16%|█▋        | 975/5971 [11:42<59:56,  1.39it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  

Validating:  43%|████▎     | 71/167 [00:03<00:03, 26.23it/s][A

Validating:  44%|████▍     | 74/167 [00:03<00:03, 26.23it/s][A
Epoch 4:  16%|█▋        | 979/5971 [11:42<59:40,  1.39it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 26.15it/s][A
Epoch 4:  16%|█▋        | 983/5971 [11:43<59:23,  1.40it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 25.52it/s][A
Epoch 4:  17%|█▋        | 987/5971 [11:43<59:07,  1.41it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 25.81it/s][A

Validating:  51%|█████▏    | 86/167 [00:03<00:03, 25.35it/s][A
Epoch 4:  17%|█▋        | 991/5971 [11:43<58:50,  1.41it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  53%|█████▎    | 89/167 [00:04<00:03, 25.48it/s][A
Epoch 4:  17%|█▋        | 995/5971 [11:43<58:34,  1.42it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 25.72it/s][A
Epoch 4:  17%|█▋        | 999/5971 [11:43<58:18,  1.42it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 24.21it/s][A

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 24.59it/s][A
Epoch 4:  17%|█▋        | 1003/5971 [11:43<58:02,  1.43it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  60%|██████    | 101/167 [00:04<00:02, 25.15it/s][A
Epoch 4:  17%|█▋        | 1007/5971 [11:43<57:46,  1.43it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 25.15it/s][A
Epoch 4:  17%|█▋        | 1011/5971 [11:44<57:30,  1.44it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 25.49it/s][A

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 25.66it/s][A
Epoch 4:  17%|█▋        | 1015/5971 [11:44<57:15,  1.44it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  68%|██████▊   | 113/167 [00:05<00:02, 26.03it/s][A
Epoch 4:  17%|█▋        | 1019/5971 [11:44<56:59,  1.45it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  69%|██████▉   | 116/167 [00:05<00:02, 25.18it/s][A
Epoch 4:  17%|█▋        | 1023/5971 [11:44<56:44,  1.45it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 25.44it/s][A

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 25.47it/s][A
Epoch 4:  17%|█▋        | 1027/5971 [11:44<56:29,  1.46it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 26.04it/s][A
Epoch 4:  17%|█▋        | 1031/5971 [11:44<56:14,  1.46it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 26.30it/s][A
Epoch 4:  17%|█▋        | 1035/5971 [11:45<55:59,  1.47it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 24.04it/s][A

Validating:  80%|████████  | 134/167 [00:05<00:01, 24.77it/s][A
Epoch 4:  17%|█▋        | 1039/5971 [11:45<55:44,  1.47it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 24.25it/s][A
Epoch 4:  17%|█▋        | 1043/5971 [11:45<55:29,  1.48it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  84%|████████▍ | 140/167 [00:06<00:01, 23.09it/s][A
Epoch 4:  18%|█▊        | 1047/5971 [11:45<55:15,  1.49it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  86%|████████▌ | 143/167 [00:06<00:01, 23.46it/s][A

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 24.78it/s][A
Epoch 4:  18%|█▊        | 1051/5971 [11:45<55:00,  1.49it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 25.36it/s][A
Epoch 4:  18%|█▊        | 1055/5971 [11:45<54:46,  1.50it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 25.09it/s][A
Epoch 4:  18%|█▊        | 1059/5971 [11:46<54:31,  1.50it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 25.58it/s][A

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 24.71it/s][A
Epoch 4:  18%|█▊        | 1063/5971 [11:46<54:17,  1.51it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 25.87it/s][A
Epoch 4:  18%|█▊        | 1067/5971 [11:46<54:03,  1.51it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  98%|█████████▊| 164/167 [00:07<00:00, 24.12it/s][A
Epoch 4:  18%|█▊        | 1071/5971 [11:46<53:49,  1.52it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1072/5971 [11:46<53:47,  1.52it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.6e-5, train/loss_step=0.0135, global_step=2399.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.40it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.22it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.79it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.20it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.55it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.80it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.98it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.01it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:08,  4.99it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.09it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.18it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:07,  5.11it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.16it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.20it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.25it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.29it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:06,  5.24it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.21it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.15it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.13it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.18it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.22it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.21it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.21it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.22it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.22it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.33it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:06<00:03,  5.40it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.29it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.33it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.37it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.32it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:07<00:03,  5.18it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.16it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.23it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.21it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.26it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.23it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.16it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.06it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.04it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.04it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  4.95it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:01,  4.94it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.03it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.21it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.32it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.41it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  5.41it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  4.95it/s]

Epoch 4:  18%|█▊        | 1073/5971 [11:59<54:41,  1.49it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00548, train/loss_vlb_step=2.78e-5, train/loss_step=0.00548, global_step=2400.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A
Epoch 4:  18%|█▊        | 1073/5971 [12:00<54:47,  1.49it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00548, train/loss_vlb_step=2.78e-5, train/loss_step=0.00548, global_step=2400.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.11it/s]

Epoch 4:  18%|█▊        | 1074/5971 [12:11<55:33,  1.47it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00548, train/loss_vlb_step=2.78e-5, train/loss_step=0.00548, global_step=2400.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1074/5971 [12:11<55:33,  1.47it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=5.94e-5, train/loss_step=0.0139, global_step=2400.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.09it/s]

Epoch 4:  18%|█▊        | 1075/5971 [12:24<56:25,  1.45it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=5.94e-5, train/loss_step=0.0139, global_step=2400.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1075/5971 [12:24<56:25,  1.45it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00593, train/loss_vlb_step=3.01e-5, train/loss_step=0.00593, global_step=2400.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.20it/s]

Epoch 4:  18%|█▊        | 1076/5971 [12:37<57:21,  1.42it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00593, train/loss_vlb_step=3.01e-5, train/loss_step=0.00593, global_step=2400.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1076/5971 [12:37<57:21,  1.42it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0162, train/loss_vlb_step=7.1e-5, train/loss_step=0.0162, global_step=2400.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  18%|█▊        | 1077/5971 [12:38<57:21,  1.42it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0162, train/loss_vlb_step=7.1e-5, train/loss_step=0.0162, global_step=2400.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1077/5971 [12:38<57:21,  1.42it/s, loss=0.106, v_num=0, train/loss_simple_step=0.356, train/loss_vlb_step=0.00161, train/loss_step=0.356, global_step=2401.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  18%|█▊        | 1078/5971 [12:39<57:21,  1.42it/s, loss=0.106, v_num=0, train/loss_simple_step=0.356, train/loss_vlb_step=0.00161, train/loss_step=0.356, global_step=2401.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1078/5971 [12:39<57:21,  1.42it/s, loss=0.112, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000419, train/loss_step=0.126, global_step=2401.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1079/5971 [12:39<57:22,  1.42it/s, loss=0.112, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000419, train/loss_step=0.126, global_step=2401.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1079/5971 [12:39<57:22,  1.42it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0164, train/loss_vlb_step=7.13e-5, train/loss_step=0.0164, global_step=2401.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1080/5971 [12:42<57:29,  1.42it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0164, train/loss_vlb_step=7.13e-5, train/loss_step=0.0164, global_step=2401.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1080/5971 [12:42<57:29,  1.42it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0594, train/loss_vlb_step=0.000203, train/loss_step=0.0594, global_step=2401.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1081/5971 [12:43<57:29,  1.42it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0594, train/loss_vlb_step=0.000203, train/loss_step=0.0594, global_step=2401.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1081/5971 [12:43<57:29,  1.42it/s, loss=0.115, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000594, train/loss_step=0.174, global_step=2402.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  18%|█▊        | 1082/5971 [12:44<57:29,  1.42it/s, loss=0.115, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000594, train/loss_step=0.174, global_step=2402.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1082/5971 [12:44<57:29,  1.42it/s, loss=0.141, v_num=0, train/loss_simple_step=0.537, train/loss_vlb_step=0.00659, train/loss_step=0.537, global_step=2402.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  18%|█▊        | 1083/5971 [12:44<57:29,  1.42it/s, loss=0.141, v_num=0, train/loss_simple_step=0.537, train/loss_vlb_step=0.00659, train/loss_step=0.537, global_step=2402.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1083/5971 [12:44<57:29,  1.42it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00191, train/loss_vlb_step=1.09e-5, train/loss_step=0.00191, global_step=2402.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1084/5971 [12:47<57:35,  1.41it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00191, train/loss_vlb_step=1.09e-5, train/loss_step=0.00191, global_step=2402.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1084/5971 [12:47<57:35,  1.41it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0182, train/loss_vlb_step=7.21e-5, train/loss_step=0.0182, global_step=2402.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  18%|█▊        | 1085/5971 [12:48<57:35,  1.41it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0182, train/loss_vlb_step=7.21e-5, train/loss_step=0.0182, global_step=2402.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1085/5971 [12:48<57:35,  1.41it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00268, train/loss_vlb_step=1.52e-5, train/loss_step=0.00268, global_step=2403.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1086/5971 [12:48<57:35,  1.41it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00268, train/loss_vlb_step=1.52e-5, train/loss_step=0.00268, global_step=2403.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1086/5971 [12:48<57:35,  1.41it/s, loss=0.126, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000797, train/loss_step=0.209, global_step=2403.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  18%|█▊        | 1087/5971 [12:49<57:35,  1.41it/s, loss=0.126, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000797, train/loss_step=0.209, global_step=2403.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1087/5971 [12:49<57:35,  1.41it/s, loss=0.165, v_num=0, train/loss_simple_step=0.800, train/loss_vlb_step=0.0816, train/loss_step=0.800, global_step=2403.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  18%|█▊        | 1088/5971 [12:52<57:43,  1.41it/s, loss=0.165, v_num=0, train/loss_simple_step=0.800, train/loss_vlb_step=0.0816, train/loss_step=0.800, global_step=2403.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1088/5971 [12:52<57:43,  1.41it/s, loss=0.17, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000333, train/loss_step=0.101, global_step=2403.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1089/5971 [12:53<57:43,  1.41it/s, loss=0.17, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000333, train/loss_step=0.101, global_step=2403.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1089/5971 [12:53<57:43,  1.41it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00329, train/loss_vlb_step=1.77e-5, train/loss_step=0.00329, global_step=2404.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1090/5971 [12:54<57:43,  1.41it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00329, train/loss_vlb_step=1.77e-5, train/loss_step=0.00329, global_step=2404.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1090/5971 [12:54<57:43,  1.41it/s, loss=0.169, v_num=0, train/loss_simple_step=0.444, train/loss_vlb_step=0.0024, train/loss_step=0.444, global_step=2404.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  18%|█▊        | 1091/5971 [12:54<57:43,  1.41it/s, loss=0.169, v_num=0, train/loss_simple_step=0.444, train/loss_vlb_step=0.0024, train/loss_step=0.444, global_step=2404.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1091/5971 [12:54<57:43,  1.41it/s, loss=0.151, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000402, train/loss_step=0.122, global_step=2404.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1092/5971 [12:57<57:49,  1.41it/s, loss=0.151, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000402, train/loss_step=0.122, global_step=2404.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1092/5971 [12:57<57:49,  1.41it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00588, train/loss_vlb_step=2.89e-5, train/loss_step=0.00588, global_step=2404.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1093/5971 [12:58<57:49,  1.41it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00588, train/loss_vlb_step=2.89e-5, train/loss_step=0.00588, global_step=2404.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1093/5971 [12:58<57:49,  1.41it/s, loss=0.163, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.00117, train/loss_step=0.255, global_step=2405.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  18%|█▊        | 1094/5971 [12:59<57:49,  1.41it/s, loss=0.163, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.00117, train/loss_step=0.255, global_step=2405.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1094/5971 [12:59<57:49,  1.41it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0035, train/loss_vlb_step=1.9e-5, train/loss_step=0.0035, global_step=2405.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1095/5971 [12:59<57:49,  1.41it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0035, train/loss_vlb_step=1.9e-5, train/loss_step=0.0035, global_step=2405.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1095/5971 [12:59<57:49,  1.41it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00546, train/loss_vlb_step=2.8e-5, train/loss_step=0.00546, global_step=2405.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1096/5971 [13:02<57:56,  1.40it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00546, train/loss_vlb_step=2.8e-5, train/loss_step=0.00546, global_step=2405.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1096/5971 [13:02<57:56,  1.40it/s, loss=0.17, v_num=0, train/loss_simple_step=0.167, train/loss_vlb_step=0.000594, train/loss_step=0.167, global_step=2405.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  18%|█▊        | 1097/5971 [13:03<57:56,  1.40it/s, loss=0.17, v_num=0, train/loss_simple_step=0.167, train/loss_vlb_step=0.000594, train/loss_step=0.167, global_step=2405.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1097/5971 [13:03<57:56,  1.40it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00239, train/loss_vlb_step=1.35e-5, train/loss_step=0.00239, global_step=2406.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1098/5971 [13:04<57:56,  1.40it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00239, train/loss_vlb_step=1.35e-5, train/loss_step=0.00239, global_step=2406.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1098/5971 [13:04<57:56,  1.40it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00557, train/loss_vlb_step=2.81e-5, train/loss_step=0.00557, global_step=2406.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1099/5971 [13:05<57:57,  1.40it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00557, train/loss_vlb_step=2.81e-5, train/loss_step=0.00557, global_step=2406.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1099/5971 [13:05<57:57,  1.40it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0433, train/loss_vlb_step=0.000163, train/loss_step=0.0433, global_step=2406.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  18%|█▊        | 1100/5971 [13:07<58:02,  1.40it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0433, train/loss_vlb_step=0.000163, train/loss_step=0.0433, global_step=2406.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1100/5971 [13:07<58:02,  1.40it/s, loss=0.156, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000855, train/loss_step=0.227, global_step=2406.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  18%|█▊        | 1101/5971 [13:08<58:03,  1.40it/s, loss=0.156, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000855, train/loss_step=0.227, global_step=2406.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1101/5971 [13:08<58:03,  1.40it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00212, train/loss_vlb_step=1.19e-5, train/loss_step=0.00212, global_step=2407.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1102/5971 [13:09<58:03,  1.40it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00212, train/loss_vlb_step=1.19e-5, train/loss_step=0.00212, global_step=2407.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1102/5971 [13:09<58:03,  1.40it/s, loss=0.15, v_num=0, train/loss_simple_step=0.571, train/loss_vlb_step=0.00688, train/loss_step=0.571, global_step=2407.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]     
Epoch 4:  18%|█▊        | 1103/5971 [13:10<58:03,  1.40it/s, loss=0.15, v_num=0, train/loss_simple_step=0.571, train/loss_vlb_step=0.00688, train/loss_step=0.571, global_step=2407.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1103/5971 [13:10<58:03,  1.40it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00589, train/loss_vlb_step=2.94e-5, train/loss_step=0.00589, global_step=2407.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1104/5971 [13:12<58:10,  1.39it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00589, train/loss_vlb_step=2.94e-5, train/loss_step=0.00589, global_step=2407.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  18%|█▊        | 1104/5971 [13:12<58:10,  1.39it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0106, train/loss_vlb_step=5.07e-5, train/loss_step=0.0106, global_step=2407.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  19%|█▊        | 1105/5971 [13:13<58:10,  1.39it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0106, train/loss_vlb_step=5.07e-5, train/loss_step=0.0106, global_step=2407.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▊        | 1105/5971 [13:13<58:10,  1.39it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00251, train/loss_vlb_step=1.43e-5, train/loss_step=0.00251, global_step=2408.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▊        | 1106/5971 [13:14<58:10,  1.39it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00251, train/loss_vlb_step=1.43e-5, train/loss_step=0.00251, global_step=2408.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▊        | 1106/5971 [13:14<58:10,  1.39it/s, loss=0.147, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000558, train/loss_step=0.165, global_step=2408.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  19%|█▊        | 1107/5971 [13:15<58:10,  1.39it/s, loss=0.147, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000558, train/loss_step=0.165, global_step=2408.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▊        | 1107/5971 [13:15<58:10,  1.39it/s, loss=0.116, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000619, train/loss_step=0.184, global_step=2408.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▊        | 1108/5971 [13:17<58:17,  1.39it/s, loss=0.116, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000619, train/loss_step=0.184, global_step=2408.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▊        | 1108/5971 [13:17<58:17,  1.39it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00434, train/loss_vlb_step=2.3e-5, train/loss_step=0.00434, global_step=2408.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▊        | 1109/5971 [13:18<58:17,  1.39it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00434, train/loss_vlb_step=2.3e-5, train/loss_step=0.00434, global_step=2408.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▊        | 1109/5971 [13:18<58:17,  1.39it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0505, train/loss_vlb_step=0.000179, train/loss_step=0.0505, global_step=2409.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▊        | 1110/5971 [13:19<58:17,  1.39it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0505, train/loss_vlb_step=0.000179, train/loss_step=0.0505, global_step=2409.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▊        | 1110/5971 [13:19<58:17,  1.39it/s, loss=0.109, v_num=0, train/loss_simple_step=0.336, train/loss_vlb_step=0.00169, train/loss_step=0.336, global_step=2409.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  19%|█▊        | 1111/5971 [13:20<58:17,  1.39it/s, loss=0.109, v_num=0, train/loss_simple_step=0.336, train/loss_vlb_step=0.00169, train/loss_step=0.336, global_step=2409.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▊        | 1111/5971 [13:20<58:17,  1.39it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0333, train/loss_vlb_step=0.000123, train/loss_step=0.0333, global_step=2409.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▊        | 1112/5971 [13:22<58:23,  1.39it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0333, train/loss_vlb_step=0.000123, train/loss_step=0.0333, global_step=2409.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▊        | 1112/5971 [13:22<58:23,  1.39it/s, loss=0.116, v_num=0, train/loss_simple_step=0.238, train/loss_vlb_step=0.000824, train/loss_step=0.238, global_step=2409.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  19%|█▊        | 1113/5971 [13:23<58:24,  1.39it/s, loss=0.116, v_num=0, train/loss_simple_step=0.238, train/loss_vlb_step=0.000824, train/loss_step=0.238, global_step=2409.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▊        | 1113/5971 [13:23<58:24,  1.39it/s, loss=0.124, v_num=0, train/loss_simple_step=0.425, train/loss_vlb_step=0.00254, train/loss_step=0.425, global_step=2410.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  19%|█▊        | 1114/5971 [13:24<58:24,  1.39it/s, loss=0.124, v_num=0, train/loss_simple_step=0.425, train/loss_vlb_step=0.00254, train/loss_step=0.425, global_step=2410.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▊        | 1114/5971 [13:24<58:24,  1.39it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00208, train/loss_vlb_step=1.2e-5, train/loss_step=0.00208, global_step=2410.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▊        | 1115/5971 [13:25<58:24,  1.39it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00208, train/loss_vlb_step=1.2e-5, train/loss_step=0.00208, global_step=2410.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▊        | 1115/5971 [13:25<58:24,  1.39it/s, loss=0.141, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00174, train/loss_step=0.345, global_step=2410.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  19%|█▊        | 1116/5971 [13:27<58:30,  1.38it/s, loss=0.141, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00174, train/loss_step=0.345, global_step=2410.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▊        | 1116/5971 [13:27<58:30,  1.38it/s, loss=0.143, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000723, train/loss_step=0.202, global_step=2410.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▊        | 1117/5971 [13:28<58:30,  1.38it/s, loss=0.143, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000723, train/loss_step=0.202, global_step=2410.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▊        | 1117/5971 [13:28<58:30,  1.38it/s, loss=0.152, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000626, train/loss_step=0.180, global_step=2411.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▊        | 1118/5971 [13:29<58:31,  1.38it/s, loss=0.152, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000626, train/loss_step=0.180, global_step=2411.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▊        | 1118/5971 [13:29<58:31,  1.38it/s, loss=0.157, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.00034, train/loss_step=0.103, global_step=2411.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  19%|█▊        | 1119/5971 [13:31<58:34,  1.38it/s, loss=0.157, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.00034, train/loss_step=0.103, global_step=2411.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▊        | 1119/5971 [13:31<58:34,  1.38it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0333, train/loss_vlb_step=0.00012, train/loss_step=0.0333, global_step=2411.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1120/5971 [13:34<58:44,  1.38it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0333, train/loss_vlb_step=0.00012, train/loss_step=0.0333, global_step=2411.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1120/5971 [13:34<58:44,  1.38it/s, loss=0.153, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000523, train/loss_step=0.156, global_step=2411.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  19%|█▉        | 1121/5971 [13:36<58:47,  1.37it/s, loss=0.153, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000523, train/loss_step=0.156, global_step=2411.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1121/5971 [13:36<58:47,  1.37it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0953, train/loss_vlb_step=0.000314, train/loss_step=0.0953, global_step=2412.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1122/5971 [13:37<58:51,  1.37it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0953, train/loss_vlb_step=0.000314, train/loss_step=0.0953, global_step=2412.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1122/5971 [13:37<58:51,  1.37it/s, loss=0.141, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.00107, train/loss_step=0.251, global_step=2412.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  19%|█▉        | 1123/5971 [13:39<58:54,  1.37it/s, loss=0.141, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.00107, train/loss_step=0.251, global_step=2412.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1123/5971 [13:39<58:54,  1.37it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0622, train/loss_vlb_step=0.000209, train/loss_step=0.0622, global_step=2412.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1124/5971 [13:42<59:04,  1.37it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0622, train/loss_vlb_step=0.000209, train/loss_step=0.0622, global_step=2412.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1124/5971 [13:42<59:04,  1.37it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00603, train/loss_vlb_step=3.11e-5, train/loss_step=0.00603, global_step=2412.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1125/5971 [13:43<59:04,  1.37it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00603, train/loss_vlb_step=3.11e-5, train/loss_step=0.00603, global_step=2412.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1125/5971 [13:43<59:04,  1.37it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0303, train/loss_vlb_step=0.000114, train/loss_step=0.0303, global_step=2413.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  19%|█▉        | 1126/5971 [13:44<59:06,  1.37it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0303, train/loss_vlb_step=0.000114, train/loss_step=0.0303, global_step=2413.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1126/5971 [13:44<59:06,  1.37it/s, loss=0.138, v_num=0, train/loss_simple_step=0.023, train/loss_vlb_step=9.37e-5, train/loss_step=0.023, global_step=2413.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  19%|█▉        | 1127/5971 [13:45<59:06,  1.37it/s, loss=0.138, v_num=0, train/loss_simple_step=0.023, train/loss_vlb_step=9.37e-5, train/loss_step=0.023, global_step=2413.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1127/5971 [13:45<59:06,  1.37it/s, loss=0.155, v_num=0, train/loss_simple_step=0.526, train/loss_vlb_step=0.00411, train/loss_step=0.526, global_step=2413.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1128/5971 [13:49<59:17,  1.36it/s, loss=0.155, v_num=0, train/loss_simple_step=0.526, train/loss_vlb_step=0.00411, train/loss_step=0.526, global_step=2413.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1128/5971 [13:49<59:17,  1.36it/s, loss=0.161, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000386, train/loss_step=0.117, global_step=2413.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1129/5971 [13:50<59:18,  1.36it/s, loss=0.161, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000386, train/loss_step=0.117, global_step=2413.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1129/5971 [13:50<59:18,  1.36it/s, loss=0.166, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000631, train/loss_step=0.161, global_step=2414.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1130/5971 [13:51<59:18,  1.36it/s, loss=0.166, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000631, train/loss_step=0.161, global_step=2414.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1130/5971 [13:51<59:18,  1.36it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00661, train/loss_vlb_step=3.16e-5, train/loss_step=0.00661, global_step=2414.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1131/5971 [13:52<59:19,  1.36it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00661, train/loss_vlb_step=3.16e-5, train/loss_step=0.00661, global_step=2414.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1131/5971 [13:52<59:19,  1.36it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0522, train/loss_vlb_step=0.000186, train/loss_step=0.0522, global_step=2414.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1132/5971 [13:55<59:26,  1.36it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0522, train/loss_vlb_step=0.000186, train/loss_step=0.0522, global_step=2414.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1132/5971 [13:55<59:26,  1.36it/s, loss=0.163, v_num=0, train/loss_simple_step=0.485, train/loss_vlb_step=0.0278, train/loss_step=0.485, global_step=2414.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  19%|█▉        | 1133/5971 [13:56<59:27,  1.36it/s, loss=0.163, v_num=0, train/loss_simple_step=0.485, train/loss_vlb_step=0.0278, train/loss_step=0.485, global_step=2414.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1133/5971 [13:56<59:27,  1.36it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0594, train/loss_vlb_step=0.00021, train/loss_step=0.0594, global_step=2415.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1134/5971 [13:57<59:28,  1.36it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0594, train/loss_vlb_step=0.00021, train/loss_step=0.0594, global_step=2415.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1134/5971 [13:57<59:28,  1.36it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0683, train/loss_vlb_step=0.000229, train/loss_step=0.0683, global_step=2415.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1135/5971 [13:58<59:29,  1.35it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0683, train/loss_vlb_step=0.000229, train/loss_step=0.0683, global_step=2415.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1135/5971 [13:58<59:29,  1.35it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0245, train/loss_vlb_step=9.92e-5, train/loss_step=0.0245, global_step=2415.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  19%|█▉        | 1136/5971 [14:00<59:35,  1.35it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0245, train/loss_vlb_step=9.92e-5, train/loss_step=0.0245, global_step=2415.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1136/5971 [14:00<59:35,  1.35it/s, loss=0.147, v_num=0, train/loss_simple_step=0.502, train/loss_vlb_step=0.00403, train/loss_step=0.502, global_step=2415.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  19%|█▉        | 1137/5971 [14:02<59:37,  1.35it/s, loss=0.147, v_num=0, train/loss_simple_step=0.502, train/loss_vlb_step=0.00403, train/loss_step=0.502, global_step=2415.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1137/5971 [14:02<59:37,  1.35it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0937, train/loss_vlb_step=0.000311, train/loss_step=0.0937, global_step=2416.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1138/5971 [14:03<59:38,  1.35it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0937, train/loss_vlb_step=0.000311, train/loss_step=0.0937, global_step=2416.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1138/5971 [14:03<59:38,  1.35it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00712, train/loss_vlb_step=3.44e-5, train/loss_step=0.00712, global_step=2416.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1139/5971 [14:04<59:38,  1.35it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00712, train/loss_vlb_step=3.44e-5, train/loss_step=0.00712, global_step=2416.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1139/5971 [14:04<59:38,  1.35it/s, loss=0.137, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.07e-5, train/loss_step=0.004, global_step=2416.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  19%|█▉        | 1140/5971 [14:07<59:47,  1.35it/s, loss=0.137, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.07e-5, train/loss_step=0.004, global_step=2416.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1140/5971 [14:07<59:47,  1.35it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0884, train/loss_vlb_step=0.000297, train/loss_step=0.0884, global_step=2416.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1141/5971 [14:08<59:47,  1.35it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0884, train/loss_vlb_step=0.000297, train/loss_step=0.0884, global_step=2416.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1141/5971 [14:08<59:47,  1.35it/s, loss=0.135, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000433, train/loss_step=0.132, global_step=2417.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  19%|█▉        | 1142/5971 [14:09<59:47,  1.35it/s, loss=0.135, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000433, train/loss_step=0.132, global_step=2417.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1142/5971 [14:09<59:47,  1.35it/s, loss=0.128, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=2417.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1143/5971 [14:09<59:47,  1.35it/s, loss=0.128, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=2417.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1143/5971 [14:09<59:47,  1.35it/s, loss=0.127, v_num=0, train/loss_simple_step=0.040, train/loss_vlb_step=0.000149, train/loss_step=0.040, global_step=2417.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1144/5971 [14:12<59:54,  1.34it/s, loss=0.127, v_num=0, train/loss_simple_step=0.040, train/loss_vlb_step=0.000149, train/loss_step=0.040, global_step=2417.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1144/5971 [14:12<59:54,  1.34it/s, loss=0.13, v_num=0, train/loss_simple_step=0.073, train/loss_vlb_step=0.000243, train/loss_step=0.073, global_step=2417.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  19%|█▉        | 1145/5971 [14:13<59:55,  1.34it/s, loss=0.13, v_num=0, train/loss_simple_step=0.073, train/loss_vlb_step=0.000243, train/loss_step=0.073, global_step=2417.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1145/5971 [14:13<59:55,  1.34it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00234, train/loss_vlb_step=1.26e-5, train/loss_step=0.00234, global_step=2418.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1146/5971 [14:14<59:55,  1.34it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00234, train/loss_vlb_step=1.26e-5, train/loss_step=0.00234, global_step=2418.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1146/5971 [14:14<59:55,  1.34it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00436, train/loss_vlb_step=2.26e-5, train/loss_step=0.00436, global_step=2418.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1147/5971 [14:15<59:55,  1.34it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00436, train/loss_vlb_step=2.26e-5, train/loss_step=0.00436, global_step=2418.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1147/5971 [14:15<59:55,  1.34it/s, loss=0.111, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000715, train/loss_step=0.202, global_step=2418.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  19%|█▉        | 1148/5971 [14:17<1:00:01,  1.34it/s, loss=0.111, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000715, train/loss_step=0.202, global_step=2418.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1148/5971 [14:17<1:00:01,  1.34it/s, loss=0.112, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000437, train/loss_step=0.131, global_step=2418.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1149/5971 [14:18<1:00:01,  1.34it/s, loss=0.112, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000437, train/loss_step=0.131, global_step=2418.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1149/5971 [14:18<1:00:01,  1.34it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0281, train/loss_vlb_step=0.00011, train/loss_step=0.0281, global_step=2419.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1150/5971 [14:19<1:00:01,  1.34it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0281, train/loss_vlb_step=0.00011, train/loss_step=0.0281, global_step=2419.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1150/5971 [14:19<1:00:01,  1.34it/s, loss=0.115, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000703, train/loss_step=0.192, global_step=2419.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  19%|█▉        | 1151/5971 [14:20<1:00:01,  1.34it/s, loss=0.115, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000703, train/loss_step=0.192, global_step=2419.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1151/5971 [14:20<1:00:01,  1.34it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0409, train/loss_vlb_step=0.000145, train/loss_step=0.0409, global_step=2419.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1152/5971 [14:22<1:00:06,  1.34it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0409, train/loss_vlb_step=0.000145, train/loss_step=0.0409, global_step=2419.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1152/5971 [14:22<1:00:06,  1.34it/s, loss=0.0918, v_num=0, train/loss_simple_step=0.0371, train/loss_vlb_step=0.000135, train/loss_step=0.0371, global_step=2419.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1153/5971 [14:23<1:00:06,  1.34it/s, loss=0.0918, v_num=0, train/loss_simple_step=0.0371, train/loss_vlb_step=0.000135, train/loss_step=0.0371, global_step=2419.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1153/5971 [14:23<1:00:06,  1.34it/s, loss=0.107, v_num=0, train/loss_simple_step=0.371, train/loss_vlb_step=0.00218, train/loss_step=0.371, global_step=2420.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  19%|█▉        | 1154/5971 [14:24<1:00:06,  1.34it/s, loss=0.107, v_num=0, train/loss_simple_step=0.371, train/loss_vlb_step=0.00218, train/loss_step=0.371, global_step=2420.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1154/5971 [14:24<1:00:06,  1.34it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0571, train/loss_vlb_step=0.000191, train/loss_step=0.0571, global_step=2420.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1155/5971 [14:25<1:00:06,  1.34it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0571, train/loss_vlb_step=0.000191, train/loss_step=0.0571, global_step=2420.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1155/5971 [14:25<1:00:06,  1.34it/s, loss=0.106, v_num=0, train/loss_simple_step=0.012, train/loss_vlb_step=5.6e-5, train/loss_step=0.012, global_step=2420.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  19%|█▉        | 1156/5971 [14:28<1:00:15,  1.33it/s, loss=0.106, v_num=0, train/loss_simple_step=0.012, train/loss_vlb_step=5.6e-5, train/loss_step=0.012, global_step=2420.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1156/5971 [14:28<1:00:15,  1.33it/s, loss=0.088, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000459, train/loss_step=0.138, global_step=2420.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1157/5971 [14:29<1:00:15,  1.33it/s, loss=0.088, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000459, train/loss_step=0.138, global_step=2420.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1157/5971 [14:29<1:00:15,  1.33it/s, loss=0.0853, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000133, train/loss_step=0.0392, global_step=2421.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1158/5971 [14:30<1:00:15,  1.33it/s, loss=0.0853, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000133, train/loss_step=0.0392, global_step=2421.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1158/5971 [14:30<1:00:15,  1.33it/s, loss=0.0895, v_num=0, train/loss_simple_step=0.0912, train/loss_vlb_step=0.000304, train/loss_step=0.0912, global_step=2421.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1159/5971 [14:31<1:00:15,  1.33it/s, loss=0.0895, v_num=0, train/loss_simple_step=0.0912, train/loss_vlb_step=0.000304, train/loss_step=0.0912, global_step=2421.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1159/5971 [14:31<1:00:15,  1.33it/s, loss=0.0951, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000387, train/loss_step=0.117, global_step=2421.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  19%|█▉        | 1160/5971 [14:33<1:00:20,  1.33it/s, loss=0.0951, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000387, train/loss_step=0.117, global_step=2421.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1160/5971 [14:33<1:00:20,  1.33it/s, loss=0.0996, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000636, train/loss_step=0.177, global_step=2421.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1161/5971 [14:34<1:00:21,  1.33it/s, loss=0.0996, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000636, train/loss_step=0.177, global_step=2421.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1161/5971 [14:34<1:00:21,  1.33it/s, loss=0.0935, v_num=0, train/loss_simple_step=0.00999, train/loss_vlb_step=4.44e-5, train/loss_step=0.00999, global_step=2422.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1162/5971 [14:35<1:00:21,  1.33it/s, loss=0.0935, v_num=0, train/loss_simple_step=0.00999, train/loss_vlb_step=4.44e-5, train/loss_step=0.00999, global_step=2422.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1162/5971 [14:35<1:00:21,  1.33it/s, loss=0.0998, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.00084, train/loss_step=0.231, global_step=2422.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  19%|█▉        | 1163/5971 [14:36<1:00:21,  1.33it/s, loss=0.0998, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.00084, train/loss_step=0.231, global_step=2422.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1163/5971 [14:36<1:00:21,  1.33it/s, loss=0.115, v_num=0, train/loss_simple_step=0.352, train/loss_vlb_step=0.0021, train/loss_step=0.352, global_step=2422.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  19%|█▉        | 1164/5971 [14:38<1:00:26,  1.33it/s, loss=0.115, v_num=0, train/loss_simple_step=0.352, train/loss_vlb_step=0.0021, train/loss_step=0.352, global_step=2422.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  19%|█▉        | 1164/5971 [14:38<1:00:26,  1.33it/s, loss=0.117, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000368, train/loss_step=0.110, global_step=2422.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  20%|█▉        | 1165/5971 [14:39<1:00:26,  1.33it/s, loss=0.117, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000368, train/loss_step=0.110, global_step=2422.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  20%|█▉        | 1165/5971 [14:39<1:00:26,  1.33it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00405, train/loss_vlb_step=2.12e-5, train/loss_step=0.00405, global_step=2423.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  20%|█▉        | 1166/5971 [14:40<1:00:26,  1.33it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00405, train/loss_vlb_step=2.12e-5, train/loss_step=0.00405, global_step=2423.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  20%|█▉        | 1166/5971 [14:40<1:00:26,  1.33it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0764, train/loss_vlb_step=0.000258, train/loss_step=0.0764, global_step=2423.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  20%|█▉        | 1167/5971 [14:41<1:00:26,  1.32it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0764, train/loss_vlb_step=0.000258, train/loss_step=0.0764, global_step=2423.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  20%|█▉        | 1167/5971 [14:41<1:00:26,  1.32it/s, loss=0.117, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000385, train/loss_step=0.117, global_step=2423.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  20%|█▉        | 1168/5971 [14:43<1:00:31,  1.32it/s, loss=0.117, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000385, train/loss_step=0.117, global_step=2423.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  20%|█▉        | 1168/5971 [14:43<1:00:31,  1.32it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00638, train/loss_vlb_step=3.08e-5, train/loss_step=0.00638, global_step=2423.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  20%|█▉        | 1169/5971 [14:44<1:00:31,  1.32it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00638, train/loss_vlb_step=3.08e-5, train/loss_step=0.00638, global_step=2423.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  20%|█▉        | 1169/5971 [14:44<1:00:31,  1.32it/s, loss=0.133, v_num=0, train/loss_simple_step=0.483, train/loss_vlb_step=0.0024, train/loss_step=0.483, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  20%|█▉        | 1170/5971 [14:45<1:00:31,  1.32it/s, loss=0.133, v_num=0, train/loss_simple_step=0.483, train/loss_vlb_step=0.0024, train/loss_step=0.483, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  20%|█▉        | 1170/5971 [14:45<1:00:31,  1.32it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=5.12e-5, train/loss_step=0.0113, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  20%|█▉        | 1171/5971 [14:46<1:00:31,  1.32it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=5.12e-5, train/loss_step=0.0113, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  20%|█▉        | 1171/5971 [14:46<1:00:31,  1.32it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0548, train/loss_vlb_step=0.000183, train/loss_step=0.0548, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  20%|█▉        | 1172/5971 [14:49<1:00:37,  1.32it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0548, train/loss_vlb_step=0.000183, train/loss_step=0.0548, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  20%|█▉        | 1172/5971 [14:49<1:00:37,  1.32it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  20%|█▉        | 1173/5971 [14:49<1:00:33,  1.32it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:06,  2.51it/s]
Epoch 4:  20%|█▉        | 1174/5971 [14:49<1:00:31,  1.32it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   1%|          | 2/167 [00:00<00:41,  3.95it/s][A
Epoch 4:  20%|█▉        | 1175/5971 [14:49<1:00:28,  1.32it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   3%|▎         | 5/167 [00:00<00:16, 10.10it/s][A
Epoch 4:  20%|█▉        | 1178/5971 [14:49<1:00:17,  1.33it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:00<00:11, 14.01it/s][A
Epoch 4:  20%|█▉        | 1181/5971 [14:49<1:00:06,  1.33it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.71it/s][A
Epoch 4:  20%|█▉        | 1184/5971 [14:50<59:55,  1.33it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.23it/s][A
Epoch 4:  20%|█▉        | 1187/5971 [14:50<59:44,  1.33it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  10%|█         | 17/167 [00:01<00:06, 21.46it/s][A
Epoch 4:  20%|█▉        | 1190/5971 [14:50<59:33,  1.34it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.45it/s][A
Epoch 4:  20%|█▉        | 1193/5971 [14:50<59:23,  1.34it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 22.04it/s][A
Epoch 4:  20%|██        | 1196/5971 [14:50<59:12,  1.34it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  16%|█▌        | 26/167 [00:01<00:06, 23.37it/s][A
Epoch 4:  20%|██        | 1199/5971 [14:50<59:01,  1.35it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 24.36it/s][A
Epoch 4:  20%|██        | 1202/5971 [14:50<58:51,  1.35it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 24.41it/s][A
Epoch 4:  20%|██        | 1205/5971 [14:50<58:40,  1.35it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  21%|██        | 35/167 [00:01<00:05, 23.56it/s][A
Epoch 4:  20%|██        | 1208/5971 [14:51<58:30,  1.36it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  23%|██▎       | 38/167 [00:02<00:05, 23.28it/s][A
Epoch 4:  20%|██        | 1211/5971 [14:51<58:20,  1.36it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  25%|██▍       | 41/167 [00:02<00:05, 23.65it/s][A
Epoch 4:  20%|██        | 1214/5971 [14:51<58:09,  1.36it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  26%|██▋       | 44/167 [00:02<00:05, 24.26it/s][A
Epoch 4:  20%|██        | 1217/5971 [14:51<57:59,  1.37it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 25.06it/s][A
Epoch 4:  20%|██        | 1220/5971 [14:51<57:48,  1.37it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 25.14it/s][A
Epoch 4:  20%|██        | 1223/5971 [14:51<57:38,  1.37it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 25.44it/s][A
Epoch 4:  21%|██        | 1226/5971 [14:51<57:28,  1.38it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 25.30it/s][A
Epoch 4:  21%|██        | 1229/5971 [14:51<57:18,  1.38it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  35%|███▌      | 59/167 [00:02<00:04, 25.22it/s][A
Epoch 4:  21%|██        | 1232/5971 [14:51<57:08,  1.38it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  37%|███▋      | 62/167 [00:03<00:04, 22.97it/s][A
Epoch 4:  21%|██        | 1235/5971 [14:52<56:58,  1.39it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  39%|███▉      | 65/167 [00:03<00:04, 22.43it/s][A
Epoch 4:  21%|██        | 1238/5971 [14:52<56:48,  1.39it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  41%|████      | 68/167 [00:03<00:04, 23.94it/s][A
Epoch 4:  21%|██        | 1241/5971 [14:52<56:38,  1.39it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 25.21it/s][A
Epoch 4:  21%|██        | 1244/5971 [14:52<56:28,  1.39it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 25.24it/s][A
Epoch 4:  21%|██        | 1247/5971 [14:52<56:18,  1.40it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 25.79it/s][A
Epoch 4:  21%|██        | 1250/5971 [14:52<56:08,  1.40it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 26.92it/s][A
Epoch 4:  21%|██        | 1254/5971 [14:52<55:55,  1.41it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  50%|█████     | 84/167 [00:03<00:03, 25.31it/s][A
Epoch 4:  21%|██        | 1258/5971 [14:53<55:43,  1.41it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  52%|█████▏    | 87/167 [00:04<00:03, 24.38it/s][A
Epoch 4:  21%|██        | 1262/5971 [14:53<55:30,  1.41it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  54%|█████▍    | 90/167 [00:04<00:03, 25.46it/s][A

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 24.68it/s][A
Epoch 4:  21%|██        | 1266/5971 [14:53<55:17,  1.42it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 25.40it/s][A
Epoch 4:  21%|██▏       | 1270/5971 [14:53<55:04,  1.42it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 25.85it/s][A
Epoch 4:  21%|██▏       | 1274/5971 [14:53<54:52,  1.43it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  61%|██████    | 102/167 [00:04<00:02, 25.35it/s][A

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 25.95it/s][A
Epoch 4:  21%|██▏       | 1278/5971 [14:53<54:39,  1.43it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 26.78it/s][A
Epoch 4:  21%|██▏       | 1282/5971 [14:53<54:27,  1.44it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 26.40it/s][A
Epoch 4:  22%|██▏       | 1286/5971 [14:54<54:14,  1.44it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  68%|██████▊   | 114/167 [00:05<00:02, 25.02it/s][A

Validating:  70%|███████   | 117/167 [00:05<00:02, 24.98it/s][A
Epoch 4:  22%|██▏       | 1290/5971 [14:54<54:02,  1.44it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 25.10it/s][A
Epoch 4:  22%|██▏       | 1294/5971 [14:54<53:50,  1.45it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 26.29it/s][A
Epoch 4:  22%|██▏       | 1298/5971 [14:54<53:38,  1.45it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 25.78it/s][A

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 26.60it/s][A
Epoch 4:  22%|██▏       | 1302/5971 [14:54<53:26,  1.46it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 26.65it/s][A
Epoch 4:  22%|██▏       | 1306/5971 [14:54<53:14,  1.46it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  81%|████████  | 135/167 [00:05<00:01, 27.40it/s][A
Epoch 4:  22%|██▏       | 1310/5971 [14:55<53:02,  1.46it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 26.19it/s][A

Validating:  84%|████████▍ | 141/167 [00:06<00:01, 24.79it/s][A
Epoch 4:  22%|██▏       | 1314/5971 [14:55<52:50,  1.47it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 23.56it/s][A
Epoch 4:  22%|██▏       | 1318/5971 [14:55<52:38,  1.47it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 24.26it/s][A
Epoch 4:  22%|██▏       | 1322/5971 [14:55<52:26,  1.48it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 25.56it/s][A

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 26.11it/s][A
Epoch 4:  22%|██▏       | 1326/5971 [14:55<52:15,  1.48it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 26.67it/s][A
Epoch 4:  22%|██▏       | 1330/5971 [14:55<52:03,  1.49it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 26.79it/s][A
Epoch 4:  22%|██▏       | 1334/5971 [14:56<51:52,  1.49it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 25.24it/s][A

Validating:  99%|█████████▉| 165/167 [00:07<00:00, 25.83it/s][A
Epoch 4:  22%|██▏       | 1338/5971 [14:56<51:40,  1.49it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  22%|██▏       | 1340/5971 [14:56<51:36,  1.50it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.41e-5, train/loss_step=0.00458, global_step=2424.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Epoch 4:  22%|██▏       | 1341/5971 [14:57<51:36,  1.50it/s, loss=0.117, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.000864, train/loss_step=0.243, global_step=2425.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  22%|██▏       | 1342/5971 [14:58<51:36,  1.49it/s, loss=0.117, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.000864, train/loss_step=0.243, global_step=2425.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  22%|██▏       | 1342/5971 [14:58<51:36,  1.49it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00552, train/loss_vlb_step=2.76e-5, train/loss_step=0.00552, global_step=2425.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  22%|██▏       | 1343/5971 [14:59<51:36,  1.49it/s, loss=0.131, v_num=0, train/loss_simple_step=0.355, train/loss_vlb_step=0.00189, train/loss_step=0.355, global_step=2425.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  23%|██▎       | 1344/5971 [15:02<51:43,  1.49it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0332, train/loss_vlb_step=0.000123, train/loss_step=0.0332, global_step=2425.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1345/5971 [15:03<51:43,  1.49it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0723, train/loss_vlb_step=0.000246, train/loss_step=0.0723, global_step=2426.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1346/5971 [15:04<51:43,  1.49it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0723, train/loss_vlb_step=0.000246, train/loss_step=0.0723, global_step=2426.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1346/5971 [15:04<51:43,  1.49it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0201, train/loss_vlb_step=8.07e-5, train/loss_step=0.0201, global_step=2426.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  23%|██▎       | 1347/5971 [15:04<51:44,  1.49it/s, loss=0.145, v_num=0, train/loss_simple_step=0.541, train/loss_vlb_step=0.00619, train/loss_step=0.541, global_step=2426.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  23%|██▎       | 1348/5971 [15:07<51:49,  1.49it/s, loss=0.143, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.00046, train/loss_step=0.138, global_step=2426.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1349/5971 [15:08<51:49,  1.49it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00217, train/loss_vlb_step=1.25e-5, train/loss_step=0.00217, global_step=2427.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1350/5971 [15:09<51:49,  1.49it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00217, train/loss_vlb_step=1.25e-5, train/loss_step=0.00217, global_step=2427.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1350/5971 [15:09<51:49,  1.49it/s, loss=0.142, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000678, train/loss_step=0.203, global_step=2427.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  23%|██▎       | 1351/5971 [15:10<51:49,  1.49it/s, loss=0.131, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000506, train/loss_step=0.148, global_step=2427.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1352/5971 [15:12<51:54,  1.48it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00819, train/loss_vlb_step=3.77e-5, train/loss_step=0.00819, global_step=2427.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1353/5971 [15:13<51:54,  1.48it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0901, train/loss_vlb_step=0.000297, train/loss_step=0.0901, global_step=2428.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  23%|██▎       | 1354/5971 [15:14<51:55,  1.48it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0901, train/loss_vlb_step=0.000297, train/loss_step=0.0901, global_step=2428.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1354/5971 [15:14<51:55,  1.48it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00266, train/loss_vlb_step=1.5e-5, train/loss_step=0.00266, global_step=2428.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1355/5971 [15:15<51:55,  1.48it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00193, train/loss_vlb_step=1.14e-5, train/loss_step=0.00193, global_step=2428.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1356/5971 [15:17<52:00,  1.48it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0103, train/loss_vlb_step=4.68e-5, train/loss_step=0.0103, global_step=2428.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  23%|██▎       | 1357/5971 [15:18<52:01,  1.48it/s, loss=0.112, v_num=0, train/loss_simple_step=0.286, train/loss_vlb_step=0.00138, train/loss_step=0.286, global_step=2429.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  23%|██▎       | 1358/5971 [15:19<52:01,  1.48it/s, loss=0.112, v_num=0, train/loss_simple_step=0.286, train/loss_vlb_step=0.00138, train/loss_step=0.286, global_step=2429.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1358/5971 [15:19<52:01,  1.48it/s, loss=0.124, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.0011, train/loss_step=0.262, global_step=2429.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  23%|██▎       | 1359/5971 [15:20<52:01,  1.48it/s, loss=0.137, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00153, train/loss_step=0.319, global_step=2429.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1360/5971 [15:22<52:05,  1.48it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00323, train/loss_vlb_step=1.79e-5, train/loss_step=0.00323, global_step=2429.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1361/5971 [15:23<52:06,  1.47it/s, loss=0.133, v_num=0, train/loss_simple_step=0.159, train/loss_vlb_step=0.000549, train/loss_step=0.159, global_step=2430.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  23%|██▎       | 1362/5971 [15:24<52:06,  1.47it/s, loss=0.133, v_num=0, train/loss_simple_step=0.159, train/loss_vlb_step=0.000549, train/loss_step=0.159, global_step=2430.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1362/5971 [15:24<52:06,  1.47it/s, loss=0.143, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000697, train/loss_step=0.201, global_step=2430.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1363/5971 [15:25<52:06,  1.47it/s, loss=0.132, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000488, train/loss_step=0.144, global_step=2430.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1364/5971 [15:27<52:11,  1.47it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00896, train/loss_vlb_step=3.94e-5, train/loss_step=0.00896, global_step=2430.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1365/5971 [15:28<52:12,  1.47it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00425, train/loss_vlb_step=2.23e-5, train/loss_step=0.00425, global_step=2431.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1366/5971 [15:29<52:12,  1.47it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00425, train/loss_vlb_step=2.23e-5, train/loss_step=0.00425, global_step=2431.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1366/5971 [15:29<52:12,  1.47it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0034, train/loss_vlb_step=1.83e-5, train/loss_step=0.0034, global_step=2431.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  23%|██▎       | 1367/5971 [15:30<52:12,  1.47it/s, loss=0.0999, v_num=0, train/loss_simple_step=0.00413, train/loss_vlb_step=2.1e-5, train/loss_step=0.00413, global_step=2431.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1368/5971 [15:32<52:16,  1.47it/s, loss=0.0984, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.00035, train/loss_step=0.107, global_step=2431.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  23%|██▎       | 1369/5971 [15:33<52:17,  1.47it/s, loss=0.114, v_num=0, train/loss_simple_step=0.310, train/loss_vlb_step=0.00135, train/loss_step=0.310, global_step=2432.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  23%|██▎       | 1370/5971 [15:34<52:17,  1.47it/s, loss=0.114, v_num=0, train/loss_simple_step=0.310, train/loss_vlb_step=0.00135, train/loss_step=0.310, global_step=2432.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1370/5971 [15:34<52:17,  1.47it/s, loss=0.117, v_num=0, train/loss_simple_step=0.277, train/loss_vlb_step=0.00146, train/loss_step=0.277, global_step=2432.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1371/5971 [15:35<52:17,  1.47it/s, loss=0.118, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000569, train/loss_step=0.164, global_step=2432.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1372/5971 [15:37<52:21,  1.46it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0622, train/loss_vlb_step=0.000206, train/loss_step=0.0622, global_step=2432.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1373/5971 [15:38<52:21,  1.46it/s, loss=0.12, v_num=0, train/loss_simple_step=0.063, train/loss_vlb_step=0.000224, train/loss_step=0.063, global_step=2433.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  23%|██▎       | 1374/5971 [15:39<52:21,  1.46it/s, loss=0.12, v_num=0, train/loss_simple_step=0.063, train/loss_vlb_step=0.000224, train/loss_step=0.063, global_step=2433.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1374/5971 [15:39<52:21,  1.46it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=6.32e-5, train/loss_step=0.0142, global_step=2433.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1375/5971 [15:40<52:21,  1.46it/s, loss=0.138, v_num=0, train/loss_simple_step=0.366, train/loss_vlb_step=0.00183, train/loss_step=0.366, global_step=2433.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  23%|██▎       | 1376/5971 [15:43<52:26,  1.46it/s, loss=0.168, v_num=0, train/loss_simple_step=0.598, train/loss_vlb_step=0.0055, train/loss_step=0.598, global_step=2433.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  23%|██▎       | 1377/5971 [15:44<52:27,  1.46it/s, loss=0.171, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00168, train/loss_step=0.349, global_step=2434.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1378/5971 [15:45<52:27,  1.46it/s, loss=0.171, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00168, train/loss_step=0.349, global_step=2434.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1378/5971 [15:45<52:27,  1.46it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.03e-5, train/loss_step=0.00174, global_step=2434.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1379/5971 [15:46<52:27,  1.46it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0167, train/loss_vlb_step=7.02e-5, train/loss_step=0.0167, global_step=2434.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  23%|██▎       | 1380/5971 [15:48<52:31,  1.46it/s, loss=0.148, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000339, train/loss_step=0.103, global_step=2434.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  23%|██▎       | 1381/5971 [15:49<52:32,  1.46it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0724, train/loss_vlb_step=0.000239, train/loss_step=0.0724, global_step=2435.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1382/5971 [15:49<52:32,  1.46it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0724, train/loss_vlb_step=0.000239, train/loss_step=0.0724, global_step=2435.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1382/5971 [15:49<52:32,  1.46it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0582, train/loss_vlb_step=0.000208, train/loss_step=0.0582, global_step=2435.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1383/5971 [15:50<52:32,  1.46it/s, loss=0.142, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.000945, train/loss_step=0.249, global_step=2435.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  23%|██▎       | 1384/5971 [15:53<52:37,  1.45it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0614, train/loss_vlb_step=0.000211, train/loss_step=0.0614, global_step=2435.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1385/5971 [15:54<52:37,  1.45it/s, loss=0.159, v_num=0, train/loss_simple_step=0.300, train/loss_vlb_step=0.00203, train/loss_step=0.300, global_step=2436.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  23%|██▎       | 1386/5971 [15:55<52:37,  1.45it/s, loss=0.159, v_num=0, train/loss_simple_step=0.300, train/loss_vlb_step=0.00203, train/loss_step=0.300, global_step=2436.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1386/5971 [15:55<52:37,  1.45it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00254, train/loss_vlb_step=1.47e-5, train/loss_step=0.00254, global_step=2436.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1387/5971 [15:56<52:37,  1.45it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0537, train/loss_vlb_step=0.000187, train/loss_step=0.0537, global_step=2436.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  23%|██▎       | 1388/5971 [15:58<52:44,  1.45it/s, loss=0.173, v_num=0, train/loss_simple_step=0.346, train/loss_vlb_step=0.00158, train/loss_step=0.346, global_step=2436.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  23%|██▎       | 1389/5971 [15:59<52:44,  1.45it/s, loss=0.165, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000443, train/loss_step=0.135, global_step=2437.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1390/5971 [16:00<52:44,  1.45it/s, loss=0.165, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000443, train/loss_step=0.135, global_step=2437.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1390/5971 [16:00<52:44,  1.45it/s, loss=0.151, v_num=0, train/loss_simple_step=0.013, train/loss_vlb_step=5.43e-5, train/loss_step=0.013, global_step=2437.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  23%|██▎       | 1391/5971 [16:01<52:44,  1.45it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00568, train/loss_vlb_step=2.86e-5, train/loss_step=0.00568, global_step=2437.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1392/5971 [16:03<52:48,  1.45it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00176, train/loss_vlb_step=1.04e-5, train/loss_step=0.00176, global_step=2437.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  23%|██▎       | 1393/5971 [16:04<52:48,  1.44it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0153, train/loss_vlb_step=6.71e-5, train/loss_step=0.0153, global_step=2438.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  23%|██▎       | 1394/5971 [16:05<52:48,  1.44it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0153, train/loss_vlb_step=6.71e-5, train/loss_step=0.0153, global_step=2438.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1394/5971 [16:05<52:48,  1.44it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0347, train/loss_vlb_step=0.000126, train/loss_step=0.0347, global_step=2438.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1395/5971 [16:06<52:48,  1.44it/s, loss=0.139, v_num=0, train/loss_simple_step=0.372, train/loss_vlb_step=0.00165, train/loss_step=0.372, global_step=2438.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  23%|██▎       | 1396/5971 [16:09<52:56,  1.44it/s, loss=0.125, v_num=0, train/loss_simple_step=0.310, train/loss_vlb_step=0.00177, train/loss_step=0.310, global_step=2438.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1397/5971 [16:10<52:56,  1.44it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=6.22e-5, train/loss_step=0.0143, global_step=2439.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1398/5971 [16:11<52:56,  1.44it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=6.22e-5, train/loss_step=0.0143, global_step=2439.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1398/5971 [16:11<52:56,  1.44it/s, loss=0.116, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000565, train/loss_step=0.166, global_step=2439.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  23%|██▎       | 1399/5971 [16:12<52:56,  1.44it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00869, train/loss_vlb_step=4.13e-5, train/loss_step=0.00869, global_step=2439.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1400/5971 [16:14<53:00,  1.44it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00276, train/loss_vlb_step=1.56e-5, train/loss_step=0.00276, global_step=2439.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1401/5971 [16:15<53:00,  1.44it/s, loss=0.113, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.00035, train/loss_step=0.106, global_step=2440.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  23%|██▎       | 1402/5971 [16:16<53:00,  1.44it/s, loss=0.113, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.00035, train/loss_step=0.106, global_step=2440.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1402/5971 [16:16<53:00,  1.44it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0308, train/loss_vlb_step=0.000115, train/loss_step=0.0308, global_step=2440.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  23%|██▎       | 1403/5971 [16:17<53:00,  1.44it/s, loss=0.099, v_num=0, train/loss_simple_step=0.00166, train/loss_vlb_step=9.68e-6, train/loss_step=0.00166, global_step=2440.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▎       | 1404/5971 [16:19<53:04,  1.43it/s, loss=0.0961, v_num=0, train/loss_simple_step=0.00322, train/loss_vlb_step=1.7e-5, train/loss_step=0.00322, global_step=2440.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▎       | 1405/5971 [16:20<53:04,  1.43it/s, loss=0.0814, v_num=0, train/loss_simple_step=0.00598, train/loss_vlb_step=2.92e-5, train/loss_step=0.00598, global_step=2441.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▎       | 1406/5971 [16:21<53:04,  1.43it/s, loss=0.0814, v_num=0, train/loss_simple_step=0.00598, train/loss_vlb_step=2.92e-5, train/loss_step=0.00598, global_step=2441.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▎       | 1406/5971 [16:21<53:04,  1.43it/s, loss=0.0974, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.00176, train/loss_step=0.322, global_step=2441.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  24%|██▎       | 1407/5971 [16:22<53:04,  1.43it/s, loss=0.113, v_num=0, train/loss_simple_step=0.368, train/loss_vlb_step=0.0022, train/loss_step=0.368, global_step=2441.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  24%|██▎       | 1408/5971 [16:25<53:11,  1.43it/s, loss=0.109, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.000968, train/loss_step=0.268, global_step=2441.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▎       | 1409/5971 [16:26<53:11,  1.43it/s, loss=0.112, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.000708, train/loss_step=0.186, global_step=2442.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▎       | 1410/5971 [16:27<53:11,  1.43it/s, loss=0.112, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.000708, train/loss_step=0.186, global_step=2442.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▎       | 1410/5971 [16:27<53:11,  1.43it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00466, train/loss_vlb_step=2.42e-5, train/loss_step=0.00466, global_step=2442.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▎       | 1411/5971 [16:28<53:11,  1.43it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00126, train/loss_vlb_step=7.4e-6, train/loss_step=0.00126, global_step=2442.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  24%|██▎       | 1412/5971 [16:30<53:15,  1.43it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0422, train/loss_vlb_step=0.000149, train/loss_step=0.0422, global_step=2442.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▎       | 1413/5971 [16:31<53:15,  1.43it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0541, train/loss_vlb_step=0.000189, train/loss_step=0.0541, global_step=2443.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▎       | 1414/5971 [16:32<53:15,  1.43it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0541, train/loss_vlb_step=0.000189, train/loss_step=0.0541, global_step=2443.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▎       | 1414/5971 [16:32<53:15,  1.43it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00772, train/loss_vlb_step=3.54e-5, train/loss_step=0.00772, global_step=2443.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▎       | 1415/5971 [16:33<53:15,  1.43it/s, loss=0.106, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000923, train/loss_step=0.216, global_step=2443.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  24%|██▎       | 1416/5971 [16:35<53:19,  1.42it/s, loss=0.0911, v_num=0, train/loss_simple_step=0.0123, train/loss_vlb_step=5.51e-5, train/loss_step=0.0123, global_step=2443.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▎       | 1417/5971 [16:36<53:19,  1.42it/s, loss=0.0906, v_num=0, train/loss_simple_step=0.00435, train/loss_vlb_step=2.25e-5, train/loss_step=0.00435, global_step=2444.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▎       | 1418/5971 [16:37<53:19,  1.42it/s, loss=0.0906, v_num=0, train/loss_simple_step=0.00435, train/loss_vlb_step=2.25e-5, train/loss_step=0.00435, global_step=2444.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▎       | 1418/5971 [16:37<53:19,  1.42it/s, loss=0.0848, v_num=0, train/loss_simple_step=0.0491, train/loss_vlb_step=0.000174, train/loss_step=0.0491, global_step=2444.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  24%|██▍       | 1419/5971 [16:38<53:19,  1.42it/s, loss=0.0844, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=2444.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▍       | 1420/5971 [16:40<53:23,  1.42it/s, loss=0.0852, v_num=0, train/loss_simple_step=0.0177, train/loss_vlb_step=7.58e-5, train/loss_step=0.0177, global_step=2444.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  24%|██▍       | 1421/5971 [16:41<53:23,  1.42it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0176, train/loss_vlb_step=7.21e-5, train/loss_step=0.0176, global_step=2445.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▍       | 1422/5971 [16:42<53:23,  1.42it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0176, train/loss_vlb_step=7.21e-5, train/loss_step=0.0176, global_step=2445.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▍       | 1422/5971 [16:42<53:23,  1.42it/s, loss=0.0802, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.15e-5, train/loss_step=0.0198, global_step=2445.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▍       | 1423/5971 [16:42<53:23,  1.42it/s, loss=0.098, v_num=0, train/loss_simple_step=0.356, train/loss_vlb_step=0.00216, train/loss_step=0.356, global_step=2445.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  24%|██▍       | 1424/5971 [16:45<53:27,  1.42it/s, loss=0.0979, v_num=0, train/loss_simple_step=0.0014, train/loss_vlb_step=8.52e-6, train/loss_step=0.0014, global_step=2445.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▍       | 1425/5971 [16:46<53:27,  1.42it/s, loss=0.0983, v_num=0, train/loss_simple_step=0.0148, train/loss_vlb_step=6.17e-5, train/loss_step=0.0148, global_step=2446.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▍       | 1426/5971 [16:46<53:27,  1.42it/s, loss=0.0983, v_num=0, train/loss_simple_step=0.0148, train/loss_vlb_step=6.17e-5, train/loss_step=0.0148, global_step=2446.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▍       | 1426/5971 [16:46<53:27,  1.42it/s, loss=0.0897, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000559, train/loss_step=0.151, global_step=2446.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  24%|██▍       | 1427/5971 [16:47<53:26,  1.42it/s, loss=0.106, v_num=0, train/loss_simple_step=0.696, train/loss_vlb_step=0.0163, train/loss_step=0.696, global_step=2446.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  24%|██▍       | 1428/5971 [16:50<53:32,  1.41it/s, loss=0.0998, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000465, train/loss_step=0.141, global_step=2446.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▍       | 1429/5971 [16:51<53:31,  1.41it/s, loss=0.112, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.0031, train/loss_step=0.424, global_step=2447.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  24%|██▍       | 1430/5971 [16:52<53:31,  1.41it/s, loss=0.112, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.0031, train/loss_step=0.424, global_step=2447.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▍       | 1430/5971 [16:52<53:31,  1.41it/s, loss=0.122, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000724, train/loss_step=0.203, global_step=2447.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▍       | 1431/5971 [16:53<53:31,  1.41it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000214, train/loss_step=0.0625, global_step=2447.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▍       | 1432/5971 [16:55<53:36,  1.41it/s, loss=0.131, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000593, train/loss_step=0.177, global_step=2447.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  24%|██▍       | 1433/5971 [16:56<53:36,  1.41it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00708, train/loss_vlb_step=3.27e-5, train/loss_step=0.00708, global_step=2448.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▍       | 1434/5971 [16:57<53:36,  1.41it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00708, train/loss_vlb_step=3.27e-5, train/loss_step=0.00708, global_step=2448.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▍       | 1434/5971 [16:57<53:36,  1.41it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00723, train/loss_vlb_step=3.32e-5, train/loss_step=0.00723, global_step=2448.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▍       | 1435/5971 [16:58<53:36,  1.41it/s, loss=0.126, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000565, train/loss_step=0.157, global_step=2448.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  24%|██▍       | 1436/5971 [17:01<53:43,  1.41it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00192, train/loss_vlb_step=1.13e-5, train/loss_step=0.00192, global_step=2448.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▍       | 1437/5971 [17:02<53:43,  1.41it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0017, train/loss_vlb_step=9.73e-6, train/loss_step=0.0017, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  24%|██▍       | 1438/5971 [17:03<53:43,  1.41it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0017, train/loss_vlb_step=9.73e-6, train/loss_step=0.0017, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▍       | 1438/5971 [17:03<53:43,  1.41it/s, loss=0.124, v_num=0, train/loss_simple_step=0.014, train/loss_vlb_step=6.5e-5, train/loss_step=0.014, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  24%|██▍       | 1439/5971 [17:04<53:43,  1.41it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0029, train/loss_vlb_step=1.64e-5, train/loss_step=0.0029, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  24%|██▍       | 1440/5971 [17:07<53:50,  1.40it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:04,  2.59it/s][A
Epoch 4:  24%|██▍       | 1442/5971 [17:07<53:45,  1.40it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   1%|          | 2/167 [00:00<00:48,  3.43it/s][A

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.20it/s][A
Epoch 4:  24%|██▍       | 1446/5971 [17:08<53:35,  1.41it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.46it/s][A
Epoch 4:  24%|██▍       | 1450/5971 [17:08<53:23,  1.41it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.47it/s][A
Epoch 4:  24%|██▍       | 1454/5971 [17:08<53:12,  1.41it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   8%|▊         | 14/167 [00:01<00:08, 19.05it/s][A

Validating:  10%|█         | 17/167 [00:01<00:07, 20.61it/s][A
Epoch 4:  24%|██▍       | 1458/5971 [17:08<53:01,  1.42it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 21.12it/s][A
Epoch 4:  24%|██▍       | 1462/5971 [17:08<52:50,  1.42it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 22.63it/s][A
Epoch 4:  25%|██▍       | 1466/5971 [17:08<52:39,  1.43it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 23.77it/s][A

Validating:  17%|█▋        | 29/167 [00:01<00:05, 24.64it/s][A
Epoch 4:  25%|██▍       | 1470/5971 [17:09<52:28,  1.43it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 25.38it/s][A
Epoch 4:  25%|██▍       | 1474/5971 [17:09<52:18,  1.43it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  21%|██        | 35/167 [00:01<00:05, 25.65it/s][A
Epoch 4:  25%|██▍       | 1478/5971 [17:09<52:07,  1.44it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  23%|██▎       | 39/167 [00:02<00:04, 26.76it/s][A
Epoch 4:  25%|██▍       | 1482/5971 [17:09<51:56,  1.44it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 26.46it/s][A

Validating:  27%|██▋       | 45/167 [00:02<00:04, 26.66it/s][A
Epoch 4:  25%|██▍       | 1486/5971 [17:09<51:45,  1.44it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  29%|██▉       | 49/167 [00:02<00:04, 27.83it/s][A
Epoch 4:  25%|██▍       | 1490/5971 [17:09<51:35,  1.45it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  31%|███       | 52/167 [00:02<00:04, 27.91it/s][A
Epoch 4:  25%|██▌       | 1494/5971 [17:09<51:24,  1.45it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  33%|███▎      | 55/167 [00:02<00:04, 27.93it/s][A
Epoch 4:  25%|██▌       | 1498/5971 [17:10<51:13,  1.46it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  35%|███▍      | 58/167 [00:02<00:04, 26.31it/s][A

Validating:  37%|███▋      | 61/167 [00:02<00:04, 26.05it/s][A
Epoch 4:  25%|██▌       | 1502/5971 [17:10<51:03,  1.46it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  38%|███▊      | 64/167 [00:02<00:04, 25.35it/s][A
Epoch 4:  25%|██▌       | 1506/5971 [17:10<50:53,  1.46it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  40%|████      | 67/167 [00:03<00:04, 24.86it/s][A
Epoch 4:  25%|██▌       | 1510/5971 [17:10<50:42,  1.47it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 26.07it/s][A
Epoch 4:  25%|██▌       | 1514/5971 [17:10<50:32,  1.47it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 27.63it/s][A

Validating:  46%|████▌     | 77/167 [00:03<00:03, 27.08it/s][A
Epoch 4:  25%|██▌       | 1518/5971 [17:10<50:22,  1.47it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 26.20it/s][A
Epoch 4:  25%|██▌       | 1522/5971 [17:11<50:11,  1.48it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 25.64it/s][A
Epoch 4:  26%|██▌       | 1526/5971 [17:11<50:01,  1.48it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  51%|█████▏    | 86/167 [00:03<00:03, 26.35it/s][A

Validating:  53%|█████▎    | 89/167 [00:03<00:03, 25.19it/s][A
Epoch 4:  26%|██▌       | 1530/5971 [17:11<49:51,  1.48it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  55%|█████▌    | 92/167 [00:04<00:03, 24.04it/s][A
Epoch 4:  26%|██▌       | 1534/5971 [17:11<49:41,  1.49it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 25.24it/s][A
Epoch 4:  26%|██▌       | 1538/5971 [17:11<49:31,  1.49it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 24.09it/s][A

Validating:  60%|██████    | 101/167 [00:04<00:02, 23.71it/s][A
Epoch 4:  26%|██▌       | 1542/5971 [17:11<49:21,  1.50it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 25.17it/s][A
Epoch 4:  26%|██▌       | 1546/5971 [17:12<49:11,  1.50it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 24.59it/s][A
Epoch 4:  26%|██▌       | 1550/5971 [17:12<49:02,  1.50it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 24.31it/s][A

Validating:  68%|██████▊   | 113/167 [00:04<00:02, 24.34it/s][A
Epoch 4:  26%|██▌       | 1554/5971 [17:12<48:52,  1.51it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  69%|██████▉   | 116/167 [00:05<00:02, 25.02it/s][A
Epoch 4:  26%|██▌       | 1558/5971 [17:12<48:42,  1.51it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 26.36it/s][A
Epoch 4:  26%|██▌       | 1562/5971 [17:12<48:32,  1.51it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 26.17it/s][A
Epoch 4:  26%|██▌       | 1566/5971 [17:12<48:23,  1.52it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 26.65it/s][A

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 26.02it/s][A
Epoch 4:  26%|██▋       | 1570/5971 [17:12<48:13,  1.52it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 26.20it/s][A
Epoch 4:  26%|██▋       | 1574/5971 [17:13<48:04,  1.52it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  81%|████████  | 135/167 [00:05<00:01, 26.67it/s][A
Epoch 4:  26%|██▋       | 1578/5971 [17:13<47:54,  1.53it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 26.46it/s][A

Validating:  84%|████████▍ | 141/167 [00:05<00:00, 27.23it/s][A
Epoch 4:  26%|██▋       | 1582/5971 [17:13<47:45,  1.53it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 26.20it/s][A
Epoch 4:  27%|██▋       | 1586/5971 [17:13<47:35,  1.54it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 26.09it/s]
Epoch 4:  27%|██▋       | 1590/5971 [17:13<47:26,  1.54it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 25.29it/s]
Validating:  92%|█████████▏| 153/167 [00:06<00:00, 25.63it/s]
Epoch 4:  27%|██▋       | 1594/5971 [17:13<47:17,  1.54it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 25.79it/s]
Epoch 4:  27%|██▋       | 1598/5971 [17:14<47:07,  1.55it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 25.98it/s]
Epoch 4:  27%|██▋       | 1602/5971 [17:14<46:58,  1.55it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 26.21it/s]
Validating:  99%|█████████▉| 165/167 [00:06<00:00, 25.62it/s]
Epoch 4:  27%|██▋       | 1606/5971 [17:14<46:49,  1.55it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  27%|██▋       | 1608/5971 [17:14<46:45,  1.56it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000221, train/loss_step=0.0619, global_step=2449.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Epoch 4:  27%|██▋       | 1609/5971 [17:15<46:45,  1.55it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.65e-5, train/loss_step=0.0127, global_step=2450.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  27%|██▋       | 1610/5971 [17:16<46:45,  1.55it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.65e-5, train/loss_step=0.0127, global_step=2450.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  27%|██▋       | 1610/5971 [17:16<46:45,  1.55it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00585, train/loss_vlb_step=2.73e-5, train/loss_step=0.00585, global_step=2450.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  27%|██▋       | 1611/5971 [17:17<46:45,  1.55it/s, loss=0.131, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00292, train/loss_step=0.476, global_step=2450.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  27%|██▋       | 1612/5971 [17:20<46:51,  1.55it/s, loss=0.149, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00177, train/loss_step=0.353, global_step=2450.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  27%|██▋       | 1613/5971 [17:21<46:51,  1.55it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0211, train/loss_vlb_step=8.36e-5, train/loss_step=0.0211, global_step=2451.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  27%|██▋       | 1614/5971 [17:22<46:51,  1.55it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0211, train/loss_vlb_step=8.36e-5, train/loss_step=0.0211, global_step=2451.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  27%|██▋       | 1614/5971 [17:22<46:51,  1.55it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0188, train/loss_vlb_step=7.36e-5, train/loss_step=0.0188, global_step=2451.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  27%|██▋       | 1615/5971 [17:23<46:51,  1.55it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00979, train/loss_vlb_step=4.67e-5, train/loss_step=0.00979, global_step=2451.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  27%|██▋       | 1616/5971 [17:25<46:56,  1.55it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0502, train/loss_vlb_step=0.000175, train/loss_step=0.0502, global_step=2451.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  27%|██▋       | 1617/5971 [17:26<46:56,  1.55it/s, loss=0.0861, v_num=0, train/loss_simple_step=0.080, train/loss_vlb_step=0.000264, train/loss_step=0.080, global_step=2452.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  27%|██▋       | 1618/5971 [17:27<46:56,  1.55it/s, loss=0.0861, v_num=0, train/loss_simple_step=0.080, train/loss_vlb_step=0.000264, train/loss_step=0.080, global_step=2452.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  27%|██▋       | 1618/5971 [17:27<46:56,  1.55it/s, loss=0.0762, v_num=0, train/loss_simple_step=0.00436, train/loss_vlb_step=2.35e-5, train/loss_step=0.00436, global_step=2452.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  27%|██▋       | 1619/5971 [17:28<46:56,  1.54it/s, loss=0.0884, v_num=0, train/loss_simple_step=0.307, train/loss_vlb_step=0.00137, train/loss_step=0.307, global_step=2452.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  27%|██▋       | 1620/5971 [17:30<47:00,  1.54it/s, loss=0.0801, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.06e-5, train/loss_step=0.0112, global_step=2452.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  27%|██▋       | 1621/5971 [17:31<47:00,  1.54it/s, loss=0.0804, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.41e-5, train/loss_step=0.0122, global_step=2453.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  27%|██▋       | 1622/5971 [17:32<47:00,  1.54it/s, loss=0.0804, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.41e-5, train/loss_step=0.0122, global_step=2453.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  27%|██▋       | 1622/5971 [17:32<47:00,  1.54it/s, loss=0.0858, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000384, train/loss_step=0.116, global_step=2453.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  27%|██▋       | 1623/5971 [17:33<47:00,  1.54it/s, loss=0.0942, v_num=0, train/loss_simple_step=0.325, train/loss_vlb_step=0.00193, train/loss_step=0.325, global_step=2453.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  27%|██▋       | 1624/5971 [17:35<47:04,  1.54it/s, loss=0.117, v_num=0, train/loss_simple_step=0.458, train/loss_vlb_step=0.00269, train/loss_step=0.458, global_step=2453.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  27%|██▋       | 1625/5971 [17:36<47:04,  1.54it/s, loss=0.127, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.000693, train/loss_step=0.206, global_step=2454.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  27%|██▋       | 1626/5971 [17:37<47:04,  1.54it/s, loss=0.127, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.000693, train/loss_step=0.206, global_step=2454.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  27%|██▋       | 1626/5971 [17:37<47:04,  1.54it/s, loss=0.132, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000352, train/loss_step=0.105, global_step=2454.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  27%|██▋       | 1627/5971 [17:38<47:04,  1.54it/s, loss=0.138, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.00043, train/loss_step=0.131, global_step=2454.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  27%|██▋       | 1628/5971 [17:40<47:08,  1.54it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0257, train/loss_vlb_step=9.88e-5, train/loss_step=0.0257, global_step=2454.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  27%|██▋       | 1629/5971 [17:41<47:08,  1.54it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0439, train/loss_vlb_step=0.000161, train/loss_step=0.0439, global_step=2455.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  27%|██▋       | 1630/5971 [17:42<47:08,  1.53it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0439, train/loss_vlb_step=0.000161, train/loss_step=0.0439, global_step=2455.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  27%|██▋       | 1630/5971 [17:42<47:08,  1.53it/s, loss=0.143, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000353, train/loss_step=0.107, global_step=2455.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  27%|██▋       | 1631/5971 [17:43<47:08,  1.53it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.31e-5, train/loss_step=0.00664, global_step=2455.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  27%|██▋       | 1632/5971 [17:45<47:11,  1.53it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0388, train/loss_vlb_step=0.000139, train/loss_step=0.0388, global_step=2455.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  27%|██▋       | 1633/5971 [17:46<47:11,  1.53it/s, loss=0.111, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000612, train/loss_step=0.173, global_step=2456.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  27%|██▋       | 1634/5971 [17:47<47:11,  1.53it/s, loss=0.111, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000612, train/loss_step=0.173, global_step=2456.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  27%|██▋       | 1634/5971 [17:47<47:11,  1.53it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0026, train/loss_vlb_step=1.49e-5, train/loss_step=0.0026, global_step=2456.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  27%|██▋       | 1635/5971 [17:48<47:11,  1.53it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0531, train/loss_vlb_step=0.000187, train/loss_step=0.0531, global_step=2456.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  27%|██▋       | 1636/5971 [17:50<47:15,  1.53it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00249, train/loss_vlb_step=1.36e-5, train/loss_step=0.00249, global_step=2456.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  27%|██▋       | 1637/5971 [17:51<47:15,  1.53it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0117, train/loss_vlb_step=5.15e-5, train/loss_step=0.0117, global_step=2457.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  27%|██▋       | 1638/5971 [17:52<47:15,  1.53it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0117, train/loss_vlb_step=5.15e-5, train/loss_step=0.0117, global_step=2457.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  27%|██▋       | 1638/5971 [17:52<47:15,  1.53it/s, loss=0.109, v_num=0, train/loss_simple_step=0.041, train/loss_vlb_step=0.000153, train/loss_step=0.041, global_step=2457.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  27%|██▋       | 1639/5971 [17:53<47:15,  1.53it/s, loss=0.117, v_num=0, train/loss_simple_step=0.464, train/loss_vlb_step=0.00355, train/loss_step=0.464, global_step=2457.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  27%|██▋       | 1640/5971 [17:55<47:19,  1.53it/s, loss=0.134, v_num=0, train/loss_simple_step=0.359, train/loss_vlb_step=0.00208, train/loss_step=0.359, global_step=2457.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  27%|██▋       | 1641/5971 [17:56<47:19,  1.52it/s, loss=0.153, v_num=0, train/loss_simple_step=0.392, train/loss_vlb_step=0.00193, train/loss_step=0.392, global_step=2458.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  27%|██▋       | 1642/5971 [17:57<47:19,  1.52it/s, loss=0.153, v_num=0, train/loss_simple_step=0.392, train/loss_vlb_step=0.00193, train/loss_step=0.392, global_step=2458.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  27%|██▋       | 1642/5971 [17:57<47:19,  1.52it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.28e-5, train/loss_step=0.0149, global_step=2458.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1643/5971 [17:58<47:19,  1.52it/s, loss=0.134, v_num=0, train/loss_simple_step=0.038, train/loss_vlb_step=0.000134, train/loss_step=0.038, global_step=2458.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  28%|██▊       | 1644/5971 [18:00<47:22,  1.52it/s, loss=0.138, v_num=0, train/loss_simple_step=0.538, train/loss_vlb_step=0.00586, train/loss_step=0.538, global_step=2458.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  28%|██▊       | 1645/5971 [18:01<47:22,  1.52it/s, loss=0.134, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000426, train/loss_step=0.130, global_step=2459.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1646/5971 [18:02<47:22,  1.52it/s, loss=0.134, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000426, train/loss_step=0.130, global_step=2459.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1646/5971 [18:02<47:22,  1.52it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.56e-5, train/loss_step=0.0102, global_step=2459.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1647/5971 [18:03<47:22,  1.52it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=6.06e-5, train/loss_step=0.0143, global_step=2459.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1648/5971 [18:05<47:25,  1.52it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0838, train/loss_vlb_step=0.000281, train/loss_step=0.0838, global_step=2459.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1649/5971 [18:06<47:25,  1.52it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0507, train/loss_vlb_step=0.000173, train/loss_step=0.0507, global_step=2460.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1650/5971 [18:07<47:25,  1.52it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0507, train/loss_vlb_step=0.000173, train/loss_step=0.0507, global_step=2460.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1650/5971 [18:07<47:25,  1.52it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0698, train/loss_vlb_step=0.000236, train/loss_step=0.0698, global_step=2460.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1651/5971 [18:08<47:25,  1.52it/s, loss=0.133, v_num=0, train/loss_simple_step=0.182, train/loss_vlb_step=0.000646, train/loss_step=0.182, global_step=2460.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  28%|██▊       | 1652/5971 [18:10<47:28,  1.52it/s, loss=0.138, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000463, train/loss_step=0.136, global_step=2460.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1653/5971 [18:11<47:28,  1.52it/s, loss=0.139, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000653, train/loss_step=0.194, global_step=2461.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1654/5971 [18:12<47:28,  1.52it/s, loss=0.139, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000653, train/loss_step=0.194, global_step=2461.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1654/5971 [18:12<47:28,  1.52it/s, loss=0.146, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000435, train/loss_step=0.128, global_step=2461.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1655/5971 [18:13<47:28,  1.52it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0477, train/loss_vlb_step=0.000162, train/loss_step=0.0477, global_step=2461.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1656/5971 [18:15<47:33,  1.51it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0484, train/loss_vlb_step=0.000175, train/loss_step=0.0484, global_step=2461.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1657/5971 [18:16<47:32,  1.51it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=5.94e-5, train/loss_step=0.0138, global_step=2462.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  28%|██▊       | 1658/5971 [18:17<47:32,  1.51it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=5.94e-5, train/loss_step=0.0138, global_step=2462.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1658/5971 [18:17<47:32,  1.51it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00158, train/loss_vlb_step=9.21e-6, train/loss_step=0.00158, global_step=2462.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1659/5971 [18:18<47:32,  1.51it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0208, train/loss_vlb_step=8.86e-5, train/loss_step=0.0208, global_step=2462.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  28%|██▊       | 1660/5971 [18:20<47:35,  1.51it/s, loss=0.13, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00352, train/loss_step=0.477, global_step=2462.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  28%|██▊       | 1661/5971 [18:21<47:35,  1.51it/s, loss=0.118, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.000587, train/loss_step=0.171, global_step=2463.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1662/5971 [18:22<47:35,  1.51it/s, loss=0.118, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.000587, train/loss_step=0.171, global_step=2463.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1662/5971 [18:22<47:35,  1.51it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0291, train/loss_vlb_step=0.000112, train/loss_step=0.0291, global_step=2463.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1663/5971 [18:22<47:35,  1.51it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.35e-5, train/loss_step=0.0149, global_step=2463.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  28%|██▊       | 1664/5971 [18:25<47:38,  1.51it/s, loss=0.0946, v_num=0, train/loss_simple_step=0.0689, train/loss_vlb_step=0.000233, train/loss_step=0.0689, global_step=2463.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1665/5971 [18:26<47:38,  1.51it/s, loss=0.104, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.00172, train/loss_step=0.313, global_step=2464.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  28%|██▊       | 1666/5971 [18:27<47:38,  1.51it/s, loss=0.104, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.00172, train/loss_step=0.313, global_step=2464.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1666/5971 [18:27<47:38,  1.51it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0649, train/loss_vlb_step=0.000214, train/loss_step=0.0649, global_step=2464.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1667/5971 [18:27<47:38,  1.51it/s, loss=0.107, v_num=0, train/loss_simple_step=0.020, train/loss_vlb_step=8.01e-5, train/loss_step=0.020, global_step=2464.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  28%|██▊       | 1668/5971 [18:30<47:41,  1.50it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0884, train/loss_vlb_step=0.000294, train/loss_step=0.0884, global_step=2464.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1669/5971 [18:30<47:41,  1.50it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0264, train/loss_vlb_step=0.000105, train/loss_step=0.0264, global_step=2465.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1670/5971 [18:31<47:41,  1.50it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0264, train/loss_vlb_step=0.000105, train/loss_step=0.0264, global_step=2465.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1670/5971 [18:31<47:41,  1.50it/s, loss=0.125, v_num=0, train/loss_simple_step=0.450, train/loss_vlb_step=0.00298, train/loss_step=0.450, global_step=2465.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  28%|██▊       | 1671/5971 [18:32<47:41,  1.50it/s, loss=0.134, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00215, train/loss_step=0.365, global_step=2465.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1672/5971 [18:35<47:46,  1.50it/s, loss=0.128, v_num=0, train/loss_simple_step=0.020, train/loss_vlb_step=7.91e-5, train/loss_step=0.020, global_step=2465.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1673/5971 [18:36<47:46,  1.50it/s, loss=0.157, v_num=0, train/loss_simple_step=0.761, train/loss_vlb_step=0.0394, train/loss_step=0.761, global_step=2466.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  28%|██▊       | 1674/5971 [18:37<47:46,  1.50it/s, loss=0.157, v_num=0, train/loss_simple_step=0.761, train/loss_vlb_step=0.0394, train/loss_step=0.761, global_step=2466.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1674/5971 [18:37<47:46,  1.50it/s, loss=0.157, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000473, train/loss_step=0.139, global_step=2466.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1675/5971 [18:38<47:45,  1.50it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0776, train/loss_vlb_step=0.00026, train/loss_step=0.0776, global_step=2466.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1676/5971 [18:40<47:49,  1.50it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0217, train/loss_vlb_step=8.16e-5, train/loss_step=0.0217, global_step=2466.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1677/5971 [18:41<47:49,  1.50it/s, loss=0.159, v_num=0, train/loss_simple_step=0.047, train/loss_vlb_step=0.000165, train/loss_step=0.047, global_step=2467.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  28%|██▊       | 1678/5971 [18:42<47:49,  1.50it/s, loss=0.159, v_num=0, train/loss_simple_step=0.047, train/loss_vlb_step=0.000165, train/loss_step=0.047, global_step=2467.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1678/5971 [18:42<47:49,  1.50it/s, loss=0.189, v_num=0, train/loss_simple_step=0.595, train/loss_vlb_step=0.0103, train/loss_step=0.595, global_step=2467.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  28%|██▊       | 1679/5971 [18:43<47:49,  1.50it/s, loss=0.208, v_num=0, train/loss_simple_step=0.409, train/loss_vlb_step=0.00236, train/loss_step=0.409, global_step=2467.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1680/5971 [18:45<47:53,  1.49it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0954, train/loss_vlb_step=0.000314, train/loss_step=0.0954, global_step=2467.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1681/5971 [18:46<47:53,  1.49it/s, loss=0.193, v_num=0, train/loss_simple_step=0.257, train/loss_vlb_step=0.000993, train/loss_step=0.257, global_step=2468.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  28%|██▊       | 1682/5971 [18:47<47:53,  1.49it/s, loss=0.193, v_num=0, train/loss_simple_step=0.257, train/loss_vlb_step=0.000993, train/loss_step=0.257, global_step=2468.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1682/5971 [18:47<47:53,  1.49it/s, loss=0.197, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=2468.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1683/5971 [18:48<47:53,  1.49it/s, loss=0.224, v_num=0, train/loss_simple_step=0.555, train/loss_vlb_step=0.00681, train/loss_step=0.555, global_step=2468.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  28%|██▊       | 1684/5971 [18:50<47:57,  1.49it/s, loss=0.221, v_num=0, train/loss_simple_step=0.00672, train/loss_vlb_step=3.33e-5, train/loss_step=0.00672, global_step=2468.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1685/5971 [18:51<47:57,  1.49it/s, loss=0.208, v_num=0, train/loss_simple_step=0.0603, train/loss_vlb_step=0.000202, train/loss_step=0.0603, global_step=2469.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  28%|██▊       | 1686/5971 [18:52<47:57,  1.49it/s, loss=0.208, v_num=0, train/loss_simple_step=0.0603, train/loss_vlb_step=0.000202, train/loss_step=0.0603, global_step=2469.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1686/5971 [18:52<47:57,  1.49it/s, loss=0.216, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.000985, train/loss_step=0.230, global_step=2469.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  28%|██▊       | 1687/5971 [18:53<47:57,  1.49it/s, loss=0.224, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000601, train/loss_step=0.174, global_step=2469.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1688/5971 [18:55<48:00,  1.49it/s, loss=0.224, v_num=0, train/loss_simple_step=0.0984, train/loss_vlb_step=0.000323, train/loss_step=0.0984, global_step=2469.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1689/5971 [18:56<48:00,  1.49it/s, loss=0.23, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000469, train/loss_step=0.143, global_step=2470.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  28%|██▊       | 1690/5971 [18:57<47:59,  1.49it/s, loss=0.23, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000469, train/loss_step=0.143, global_step=2470.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1690/5971 [18:57<47:59,  1.49it/s, loss=0.221, v_num=0, train/loss_simple_step=0.256, train/loss_vlb_step=0.00105, train/loss_step=0.256, global_step=2470.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1691/5971 [18:58<47:59,  1.49it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0693, train/loss_vlb_step=0.000231, train/loss_step=0.0693, global_step=2470.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1692/5971 [19:00<48:02,  1.48it/s, loss=0.219, v_num=0, train/loss_simple_step=0.293, train/loss_vlb_step=0.00121, train/loss_step=0.293, global_step=2470.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  28%|██▊       | 1693/5971 [19:01<48:02,  1.48it/s, loss=0.182, v_num=0, train/loss_simple_step=0.00255, train/loss_vlb_step=1.4e-5, train/loss_step=0.00255, global_step=2471.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1694/5971 [19:02<48:02,  1.48it/s, loss=0.182, v_num=0, train/loss_simple_step=0.00255, train/loss_vlb_step=1.4e-5, train/loss_step=0.00255, global_step=2471.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1694/5971 [19:02<48:02,  1.48it/s, loss=0.191, v_num=0, train/loss_simple_step=0.338, train/loss_vlb_step=0.0014, train/loss_step=0.338, global_step=2471.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  28%|██▊       | 1695/5971 [19:03<48:02,  1.48it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=7.89e-5, train/loss_step=0.0198, global_step=2471.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1696/5971 [19:05<48:05,  1.48it/s, loss=0.207, v_num=0, train/loss_simple_step=0.384, train/loss_vlb_step=0.00188, train/loss_step=0.384, global_step=2471.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  28%|██▊       | 1697/5971 [19:06<48:05,  1.48it/s, loss=0.205, v_num=0, train/loss_simple_step=0.00531, train/loss_vlb_step=2.63e-5, train/loss_step=0.00531, global_step=2472.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1698/5971 [19:07<48:05,  1.48it/s, loss=0.205, v_num=0, train/loss_simple_step=0.00531, train/loss_vlb_step=2.63e-5, train/loss_step=0.00531, global_step=2472.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1698/5971 [19:07<48:05,  1.48it/s, loss=0.181, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000401, train/loss_step=0.119, global_step=2472.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  28%|██▊       | 1699/5971 [19:08<48:05,  1.48it/s, loss=0.194, v_num=0, train/loss_simple_step=0.680, train/loss_vlb_step=0.0437, train/loss_step=0.680, global_step=2472.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  28%|██▊       | 1700/5971 [19:10<48:08,  1.48it/s, loss=0.19, v_num=0, train/loss_simple_step=0.00189, train/loss_vlb_step=1.09e-5, train/loss_step=0.00189, global_step=2472.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  28%|██▊       | 1701/5971 [19:11<48:08,  1.48it/s, loss=0.184, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000448, train/loss_step=0.136, global_step=2473.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  29%|██▊       | 1702/5971 [19:12<48:08,  1.48it/s, loss=0.184, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000448, train/loss_step=0.136, global_step=2473.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  29%|██▊       | 1702/5971 [19:12<48:08,  1.48it/s, loss=0.184, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.00035, train/loss_step=0.106, global_step=2473.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  29%|██▊       | 1703/5971 [19:13<48:08,  1.48it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0338, train/loss_vlb_step=0.00013, train/loss_step=0.0338, global_step=2473.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  29%|██▊       | 1704/5971 [19:15<48:11,  1.48it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0354, train/loss_vlb_step=0.000134, train/loss_step=0.0354, global_step=2473.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  29%|██▊       | 1705/5971 [19:16<48:11,  1.48it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00246, train/loss_vlb_step=1.42e-5, train/loss_step=0.00246, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  29%|██▊       | 1706/5971 [19:17<48:11,  1.48it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00246, train/loss_vlb_step=1.42e-5, train/loss_step=0.00246, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  29%|██▊       | 1706/5971 [19:17<48:11,  1.48it/s, loss=0.152, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.00047, train/loss_step=0.140, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  29%|██▊       | 1707/5971 [19:18<48:11,  1.47it/s, loss=0.145, v_num=0, train/loss_simple_step=0.034, train/loss_vlb_step=0.000128, train/loss_step=0.034, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  29%|██▊       | 1708/5971 [19:20<48:14,  1.47it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:10,  2.36it/s][A
Epoch 4:  29%|██▊       | 1710/5971 [19:20<48:10,  1.47it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   1%|          | 2/167 [00:00<00:45,  3.62it/s][A

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.49it/s][A
Epoch 4:  29%|██▊       | 1714/5971 [19:20<48:01,  1.48it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:00<00:11, 14.13it/s][A
Epoch 4:  29%|██▉       | 1718/5971 [19:21<47:52,  1.48it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   7%|▋         | 11/167 [00:00<00:08, 17.96it/s][A
Epoch 4:  29%|██▉       | 1722/5971 [19:21<47:43,  1.48it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   9%|▉         | 15/167 [00:01<00:06, 21.84it/s][A
Epoch 4:  29%|██▉       | 1726/5971 [19:21<47:34,  1.49it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  11%|█         | 18/167 [00:01<00:06, 23.10it/s][A

Validating:  13%|█▎        | 21/167 [00:01<00:06, 24.12it/s][A
Epoch 4:  29%|██▉       | 1730/5971 [19:21<47:25,  1.49it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  14%|█▍        | 24/167 [00:01<00:05, 24.69it/s][A
Epoch 4:  29%|██▉       | 1734/5971 [19:21<47:16,  1.49it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 25.32it/s][A
Epoch 4:  29%|██▉       | 1738/5971 [19:21<47:08,  1.50it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 25.67it/s][A

Validating:  20%|█▉        | 33/167 [00:01<00:05, 24.06it/s][A
Epoch 4:  29%|██▉       | 1742/5971 [19:22<46:59,  1.50it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  22%|██▏       | 36/167 [00:01<00:05, 24.84it/s][A
Epoch 4:  29%|██▉       | 1746/5971 [19:22<46:50,  1.50it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  23%|██▎       | 39/167 [00:01<00:04, 25.64it/s][A
Epoch 4:  29%|██▉       | 1750/5971 [19:22<46:41,  1.51it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 26.36it/s][A

Validating:  27%|██▋       | 45/167 [00:02<00:04, 26.93it/s][A
Epoch 4:  29%|██▉       | 1754/5971 [19:22<46:33,  1.51it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 25.89it/s][A
Epoch 4:  29%|██▉       | 1758/5971 [19:22<46:24,  1.51it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  31%|███       | 51/167 [00:02<00:04, 24.50it/s][A
Epoch 4:  30%|██▉       | 1762/5971 [19:22<46:16,  1.52it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 24.87it/s][A

Validating:  34%|███▍      | 57/167 [00:02<00:04, 25.54it/s][A
Epoch 4:  30%|██▉       | 1766/5971 [19:22<46:07,  1.52it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  36%|███▌      | 60/167 [00:02<00:04, 25.73it/s][A
Epoch 4:  30%|██▉       | 1770/5971 [19:23<45:59,  1.52it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  38%|███▊      | 63/167 [00:02<00:04, 25.93it/s][A
Epoch 4:  30%|██▉       | 1774/5971 [19:23<45:50,  1.53it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  40%|████      | 67/167 [00:03<00:03, 26.61it/s][A
Epoch 4:  30%|██▉       | 1778/5971 [19:23<45:42,  1.53it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 26.13it/s][A

Validating:  44%|████▎     | 73/167 [00:03<00:03, 25.78it/s][A
Epoch 4:  30%|██▉       | 1782/5971 [19:23<45:33,  1.53it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 24.84it/s][A
Epoch 4:  30%|██▉       | 1786/5971 [19:23<45:25,  1.54it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 25.76it/s][A
Epoch 4:  30%|██▉       | 1790/5971 [19:23<45:17,  1.54it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  49%|████▉     | 82/167 [00:03<00:03, 26.08it/s][A

Validating:  51%|█████     | 85/167 [00:03<00:03, 26.97it/s][A
Epoch 4:  30%|███       | 1794/5971 [19:24<45:08,  1.54it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  53%|█████▎    | 88/167 [00:03<00:02, 26.91it/s][A
Epoch 4:  30%|███       | 1798/5971 [19:24<45:00,  1.55it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  54%|█████▍    | 91/167 [00:03<00:02, 27.47it/s][A
Epoch 4:  30%|███       | 1802/5971 [19:24<44:52,  1.55it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 28.04it/s][A

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 27.19it/s][A
Epoch 4:  30%|███       | 1806/5971 [19:24<44:44,  1.55it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  60%|██████    | 101/167 [00:04<00:02, 28.47it/s][A
Epoch 4:  30%|███       | 1810/5971 [19:24<44:35,  1.56it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 27.81it/s][A
Epoch 4:  30%|███       | 1814/5971 [19:24<44:27,  1.56it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 26.19it/s][A
Epoch 4:  30%|███       | 1818/5971 [19:24<44:19,  1.56it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 26.79it/s][A

Validating:  68%|██████▊   | 113/167 [00:04<00:02, 25.85it/s][A
Epoch 4:  31%|███       | 1822/5971 [19:25<44:11,  1.56it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  69%|██████▉   | 116/167 [00:04<00:01, 25.90it/s][A
Epoch 4:  31%|███       | 1826/5971 [19:25<44:03,  1.57it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 24.42it/s][A
Epoch 4:  31%|███       | 1830/5971 [19:25<43:55,  1.57it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 25.04it/s][A

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 25.21it/s][A
Epoch 4:  31%|███       | 1834/5971 [19:25<43:47,  1.57it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 25.82it/s][A
Epoch 4:  31%|███       | 1838/5971 [19:25<43:39,  1.58it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 26.13it/s][A
Epoch 4:  31%|███       | 1842/5971 [19:25<43:31,  1.58it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  80%|████████  | 134/167 [00:05<00:01, 25.72it/s][A

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 24.93it/s][A
Epoch 4:  31%|███       | 1846/5971 [19:26<43:24,  1.58it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  84%|████████▍ | 140/167 [00:05<00:01, 25.29it/s][A
Epoch 4:  31%|███       | 1850/5971 [19:26<43:16,  1.59it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  86%|████████▌ | 143/167 [00:05<00:00, 26.27it/s][A
Epoch 4:  31%|███       | 1854/5971 [19:26<43:08,  1.59it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 27.16it/s][A

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 26.91it/s][A
Epoch 4:  31%|███       | 1858/5971 [19:26<43:00,  1.59it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 25.82it/s][A
Epoch 4:  31%|███       | 1862/5971 [19:26<42:53,  1.60it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 25.81it/s][A
Epoch 4:  31%|███▏      | 1866/5971 [19:26<42:45,  1.60it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 25.37it/s][A

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 26.31it/s][A
Epoch 4:  31%|███▏      | 1870/5971 [19:26<42:37,  1.60it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  99%|█████████▉| 165/167 [00:06<00:00, 27.68it/s][A
Epoch 4:  31%|███▏      | 1874/5971 [19:27<42:30,  1.61it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  31%|███▏      | 1876/5971 [19:27<42:27,  1.61it/s, loss=0.158, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00156, train/loss_step=0.364, global_step=2474.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Epoch 4:  31%|███▏      | 1877/5971 [19:28<42:27,  1.61it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0636, train/loss_vlb_step=0.000212, train/loss_step=0.0636, global_step=2475.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  31%|███▏      | 1878/5971 [19:29<42:27,  1.61it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0636, train/loss_vlb_step=0.000212, train/loss_step=0.0636, global_step=2475.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  31%|███▏      | 1878/5971 [19:29<42:27,  1.61it/s, loss=0.153, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.000867, train/loss_step=0.234, global_step=2475.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  31%|███▏      | 1879/5971 [19:30<42:27,  1.61it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000326, train/loss_step=0.0991, global_step=2475.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  31%|███▏      | 1880/5971 [19:32<42:30,  1.60it/s, loss=0.152, v_num=0, train/loss_simple_step=0.241, train/loss_vlb_step=0.0011, train/loss_step=0.241, global_step=2475.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  32%|███▏      | 1881/5971 [19:33<42:31,  1.60it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0379, train/loss_vlb_step=0.000139, train/loss_step=0.0379, global_step=2476.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1882/5971 [19:34<42:31,  1.60it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0379, train/loss_vlb_step=0.000139, train/loss_step=0.0379, global_step=2476.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1882/5971 [19:34<42:31,  1.60it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0576, train/loss_vlb_step=0.000201, train/loss_step=0.0576, global_step=2476.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  32%|███▏      | 1883/5971 [19:35<42:31,  1.60it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00986, train/loss_vlb_step=4.34e-5, train/loss_step=0.00986, global_step=2476.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1884/5971 [19:37<42:33,  1.60it/s, loss=0.128, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000555, train/loss_step=0.165, global_step=2476.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  32%|███▏      | 1885/5971 [19:38<42:33,  1.60it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000219, train/loss_step=0.0655, global_step=2477.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1886/5971 [19:39<42:33,  1.60it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000219, train/loss_step=0.0655, global_step=2477.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1886/5971 [19:39<42:33,  1.60it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0193, train/loss_vlb_step=7.71e-5, train/loss_step=0.0193, global_step=2477.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  32%|███▏      | 1887/5971 [19:40<42:33,  1.60it/s, loss=0.1, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000581, train/loss_step=0.161, global_step=2477.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  32%|███▏      | 1888/5971 [19:42<42:36,  1.60it/s, loss=0.111, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000737, train/loss_step=0.214, global_step=2477.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1889/5971 [19:43<42:36,  1.60it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0526, train/loss_vlb_step=0.000182, train/loss_step=0.0526, global_step=2478.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1890/5971 [19:44<42:36,  1.60it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0526, train/loss_vlb_step=0.000182, train/loss_step=0.0526, global_step=2478.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1890/5971 [19:44<42:36,  1.60it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0346, train/loss_vlb_step=0.000127, train/loss_step=0.0346, global_step=2478.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1891/5971 [19:45<42:35,  1.60it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00981, train/loss_vlb_step=4.29e-5, train/loss_step=0.00981, global_step=2478.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1892/5971 [19:47<42:38,  1.59it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0833, train/loss_vlb_step=0.000274, train/loss_step=0.0833, global_step=2478.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  32%|███▏      | 1893/5971 [19:48<42:38,  1.59it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00204, train/loss_vlb_step=1.16e-5, train/loss_step=0.00204, global_step=2479.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1894/5971 [19:49<42:38,  1.59it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00204, train/loss_vlb_step=1.16e-5, train/loss_step=0.00204, global_step=2479.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1894/5971 [19:49<42:38,  1.59it/s, loss=0.124, v_num=0, train/loss_simple_step=0.523, train/loss_vlb_step=0.00366, train/loss_step=0.523, global_step=2479.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  32%|███▏      | 1895/5971 [19:50<42:38,  1.59it/s, loss=0.125, v_num=0, train/loss_simple_step=0.061, train/loss_vlb_step=0.000204, train/loss_step=0.061, global_step=2479.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1896/5971 [19:52<42:41,  1.59it/s, loss=0.118, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000757, train/loss_step=0.218, global_step=2479.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1897/5971 [19:53<42:41,  1.59it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0084, train/loss_vlb_step=4.02e-5, train/loss_step=0.0084, global_step=2480.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1898/5971 [19:54<42:41,  1.59it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0084, train/loss_vlb_step=4.02e-5, train/loss_step=0.0084, global_step=2480.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1898/5971 [19:54<42:41,  1.59it/s, loss=0.118, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.00145, train/loss_step=0.289, global_step=2480.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  32%|███▏      | 1899/5971 [19:55<42:41,  1.59it/s, loss=0.125, v_num=0, train/loss_simple_step=0.241, train/loss_vlb_step=0.000891, train/loss_step=0.241, global_step=2480.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1900/5971 [19:57<42:44,  1.59it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0468, train/loss_vlb_step=0.000166, train/loss_step=0.0468, global_step=2480.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1901/5971 [19:58<42:44,  1.59it/s, loss=0.115, v_num=0, train/loss_simple_step=0.037, train/loss_vlb_step=0.000139, train/loss_step=0.037, global_step=2481.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  32%|███▏      | 1902/5971 [19:59<42:44,  1.59it/s, loss=0.115, v_num=0, train/loss_simple_step=0.037, train/loss_vlb_step=0.000139, train/loss_step=0.037, global_step=2481.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1902/5971 [19:59<42:44,  1.59it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=5.92e-5, train/loss_step=0.0139, global_step=2481.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1903/5971 [20:00<42:44,  1.59it/s, loss=0.119, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000493, train/loss_step=0.145, global_step=2481.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  32%|███▏      | 1904/5971 [20:02<42:47,  1.58it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0422, train/loss_vlb_step=0.000157, train/loss_step=0.0422, global_step=2481.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1905/5971 [20:03<42:47,  1.58it/s, loss=0.152, v_num=0, train/loss_simple_step=0.836, train/loss_vlb_step=0.0613, train/loss_step=0.836, global_step=2482.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  32%|███▏      | 1906/5971 [20:04<42:47,  1.58it/s, loss=0.152, v_num=0, train/loss_simple_step=0.836, train/loss_vlb_step=0.0613, train/loss_step=0.836, global_step=2482.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1906/5971 [20:04<42:47,  1.58it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.11e-5, train/loss_step=0.00183, global_step=2482.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1907/5971 [20:05<42:47,  1.58it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0164, train/loss_vlb_step=7.04e-5, train/loss_step=0.0164, global_step=2482.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  32%|███▏      | 1908/5971 [20:07<42:50,  1.58it/s, loss=0.142, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000632, train/loss_step=0.180, global_step=2482.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  32%|███▏      | 1909/5971 [20:08<42:50,  1.58it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00335, train/loss_vlb_step=1.88e-5, train/loss_step=0.00335, global_step=2483.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1910/5971 [20:09<42:49,  1.58it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00335, train/loss_vlb_step=1.88e-5, train/loss_step=0.00335, global_step=2483.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1910/5971 [20:09<42:49,  1.58it/s, loss=0.141, v_num=0, train/loss_simple_step=0.053, train/loss_vlb_step=0.000181, train/loss_step=0.053, global_step=2483.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  32%|███▏      | 1911/5971 [20:10<42:49,  1.58it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00593, train/loss_vlb_step=2.99e-5, train/loss_step=0.00593, global_step=2483.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1912/5971 [20:12<42:52,  1.58it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0346, train/loss_vlb_step=0.000125, train/loss_step=0.0346, global_step=2483.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1913/5971 [20:13<42:52,  1.58it/s, loss=0.147, v_num=0, train/loss_simple_step=0.183, train/loss_vlb_step=0.000641, train/loss_step=0.183, global_step=2484.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  32%|███▏      | 1914/5971 [20:14<42:52,  1.58it/s, loss=0.147, v_num=0, train/loss_simple_step=0.183, train/loss_vlb_step=0.000641, train/loss_step=0.183, global_step=2484.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1914/5971 [20:14<42:52,  1.58it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0786, train/loss_vlb_step=0.000261, train/loss_step=0.0786, global_step=2484.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1915/5971 [20:15<42:52,  1.58it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00261, train/loss_vlb_step=1.49e-5, train/loss_step=0.00261, global_step=2484.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1916/5971 [20:17<42:54,  1.57it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.62e-5, train/loss_step=0.0101, global_step=2484.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  32%|███▏      | 1917/5971 [20:18<42:54,  1.57it/s, loss=0.119, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000515, train/loss_step=0.157, global_step=2485.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  32%|███▏      | 1918/5971 [20:18<42:54,  1.57it/s, loss=0.119, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000515, train/loss_step=0.157, global_step=2485.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1918/5971 [20:18<42:54,  1.57it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0767, train/loss_vlb_step=0.000265, train/loss_step=0.0767, global_step=2485.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1919/5971 [20:19<42:54,  1.57it/s, loss=0.0963, v_num=0, train/loss_simple_step=0.0021, train/loss_vlb_step=1.2e-5, train/loss_step=0.0021, global_step=2485.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  32%|███▏      | 1920/5971 [20:22<42:57,  1.57it/s, loss=0.0979, v_num=0, train/loss_simple_step=0.0795, train/loss_vlb_step=0.000263, train/loss_step=0.0795, global_step=2485.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1921/5971 [20:23<42:57,  1.57it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.00175, train/loss_vlb_step=1.07e-5, train/loss_step=0.00175, global_step=2486.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1922/5971 [20:24<42:57,  1.57it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.00175, train/loss_vlb_step=1.07e-5, train/loss_step=0.00175, global_step=2486.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1922/5971 [20:24<42:57,  1.57it/s, loss=0.108, v_num=0, train/loss_simple_step=0.248, train/loss_vlb_step=0.00105, train/loss_step=0.248, global_step=2486.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]     
Epoch 4:  32%|███▏      | 1923/5971 [20:24<42:57,  1.57it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=5.16e-5, train/loss_step=0.0116, global_step=2486.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1924/5971 [20:27<42:59,  1.57it/s, loss=0.136, v_num=0, train/loss_simple_step=0.729, train/loss_vlb_step=0.0273, train/loss_step=0.729, global_step=2486.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  32%|███▏      | 1925/5971 [20:27<42:59,  1.57it/s, loss=0.123, v_num=0, train/loss_simple_step=0.590, train/loss_vlb_step=0.0128, train/loss_step=0.590, global_step=2487.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1926/5971 [20:28<42:59,  1.57it/s, loss=0.123, v_num=0, train/loss_simple_step=0.590, train/loss_vlb_step=0.0128, train/loss_step=0.590, global_step=2487.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1926/5971 [20:28<42:59,  1.57it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0189, train/loss_vlb_step=7.32e-5, train/loss_step=0.0189, global_step=2487.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1927/5971 [20:29<42:59,  1.57it/s, loss=0.158, v_num=0, train/loss_simple_step=0.691, train/loss_vlb_step=0.0156, train/loss_step=0.691, global_step=2487.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  32%|███▏      | 1928/5971 [20:31<43:02,  1.57it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.3e-5, train/loss_step=0.00225, global_step=2487.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1929/5971 [20:32<43:01,  1.57it/s, loss=0.163, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.00116, train/loss_step=0.288, global_step=2488.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  32%|███▏      | 1930/5971 [20:33<43:01,  1.57it/s, loss=0.163, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.00116, train/loss_step=0.288, global_step=2488.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1930/5971 [20:33<43:01,  1.57it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00251, train/loss_vlb_step=1.42e-5, train/loss_step=0.00251, global_step=2488.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1931/5971 [20:34<43:01,  1.56it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00377, train/loss_vlb_step=2.03e-5, train/loss_step=0.00377, global_step=2488.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1932/5971 [20:36<43:04,  1.56it/s, loss=0.172, v_num=0, train/loss_simple_step=0.265, train/loss_vlb_step=0.000953, train/loss_step=0.265, global_step=2488.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  32%|███▏      | 1933/5971 [20:37<43:04,  1.56it/s, loss=0.166, v_num=0, train/loss_simple_step=0.067, train/loss_vlb_step=0.000221, train/loss_step=0.067, global_step=2489.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1934/5971 [20:38<43:04,  1.56it/s, loss=0.166, v_num=0, train/loss_simple_step=0.067, train/loss_vlb_step=0.000221, train/loss_step=0.067, global_step=2489.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1934/5971 [20:38<43:04,  1.56it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0106, train/loss_vlb_step=4.71e-5, train/loss_step=0.0106, global_step=2489.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1935/5971 [20:39<43:03,  1.56it/s, loss=0.178, v_num=0, train/loss_simple_step=0.303, train/loss_vlb_step=0.00137, train/loss_step=0.303, global_step=2489.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  32%|███▏      | 1936/5971 [20:42<43:07,  1.56it/s, loss=0.192, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.0011, train/loss_step=0.283, global_step=2489.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  32%|███▏      | 1937/5971 [20:43<43:07,  1.56it/s, loss=0.187, v_num=0, train/loss_simple_step=0.057, train/loss_vlb_step=0.000196, train/loss_step=0.057, global_step=2490.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1938/5971 [20:43<43:07,  1.56it/s, loss=0.187, v_num=0, train/loss_simple_step=0.057, train/loss_vlb_step=0.000196, train/loss_step=0.057, global_step=2490.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1938/5971 [20:43<43:07,  1.56it/s, loss=0.188, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000351, train/loss_step=0.104, global_step=2490.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1939/5971 [20:44<43:07,  1.56it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0088, train/loss_vlb_step=3.89e-5, train/loss_step=0.0088, global_step=2490.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  32%|███▏      | 1940/5971 [20:47<43:10,  1.56it/s, loss=0.184, v_num=0, train/loss_simple_step=0.00443, train/loss_vlb_step=2.22e-5, train/loss_step=0.00443, global_step=2490.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1941/5971 [20:48<43:10,  1.56it/s, loss=0.205, v_num=0, train/loss_simple_step=0.420, train/loss_vlb_step=0.00304, train/loss_step=0.420, global_step=2491.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  33%|███▎      | 1942/5971 [20:49<43:10,  1.56it/s, loss=0.205, v_num=0, train/loss_simple_step=0.420, train/loss_vlb_step=0.00304, train/loss_step=0.420, global_step=2491.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1942/5971 [20:49<43:10,  1.56it/s, loss=0.195, v_num=0, train/loss_simple_step=0.045, train/loss_vlb_step=0.000159, train/loss_step=0.045, global_step=2491.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1943/5971 [20:50<43:10,  1.56it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0237, train/loss_vlb_step=9.4e-5, train/loss_step=0.0237, global_step=2491.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1944/5971 [20:52<43:12,  1.55it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=8.92e-6, train/loss_step=0.00151, global_step=2491.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1945/5971 [20:53<43:12,  1.55it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0604, train/loss_vlb_step=0.000212, train/loss_step=0.0604, global_step=2492.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  33%|███▎      | 1946/5971 [20:53<43:12,  1.55it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0604, train/loss_vlb_step=0.000212, train/loss_step=0.0604, global_step=2492.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1946/5971 [20:53<43:12,  1.55it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0714, train/loss_vlb_step=0.000242, train/loss_step=0.0714, global_step=2492.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1947/5971 [20:54<43:12,  1.55it/s, loss=0.121, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.002, train/loss_step=0.406, global_step=2492.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]     
Epoch 4:  33%|███▎      | 1948/5971 [20:57<43:15,  1.55it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0512, train/loss_vlb_step=0.000181, train/loss_step=0.0512, global_step=2492.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1949/5971 [20:58<43:15,  1.55it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=4.99e-5, train/loss_step=0.0116, global_step=2493.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  33%|███▎      | 1950/5971 [20:59<43:15,  1.55it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=4.99e-5, train/loss_step=0.0116, global_step=2493.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1950/5971 [20:59<43:15,  1.55it/s, loss=0.121, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.000872, train/loss_step=0.229, global_step=2493.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1951/5971 [21:00<43:14,  1.55it/s, loss=0.168, v_num=0, train/loss_simple_step=0.945, train/loss_vlb_step=0.476, train/loss_step=0.945, global_step=2493.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  33%|███▎      | 1952/5971 [21:02<43:17,  1.55it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0654, train/loss_vlb_step=0.00022, train/loss_step=0.0654, global_step=2493.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1953/5971 [21:03<43:17,  1.55it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.8e-5, train/loss_step=0.0173, global_step=2494.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  33%|███▎      | 1954/5971 [21:03<43:17,  1.55it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.8e-5, train/loss_step=0.0173, global_step=2494.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1954/5971 [21:03<43:17,  1.55it/s, loss=0.194, v_num=0, train/loss_simple_step=0.770, train/loss_vlb_step=0.0364, train/loss_step=0.770, global_step=2494.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  33%|███▎      | 1955/5971 [21:04<43:16,  1.55it/s, loss=0.179, v_num=0, train/loss_simple_step=0.00317, train/loss_vlb_step=1.71e-5, train/loss_step=0.00317, global_step=2494.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1956/5971 [21:07<43:20,  1.54it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00445, train/loss_vlb_step=2.21e-5, train/loss_step=0.00445, global_step=2494.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1957/5971 [21:08<43:20,  1.54it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0317, train/loss_vlb_step=0.00012, train/loss_step=0.0317, global_step=2495.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  33%|███▎      | 1958/5971 [21:09<43:20,  1.54it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0317, train/loss_vlb_step=0.00012, train/loss_step=0.0317, global_step=2495.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1958/5971 [21:09<43:20,  1.54it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0177, train/loss_vlb_step=7.52e-5, train/loss_step=0.0177, global_step=2495.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1959/5971 [21:10<43:20,  1.54it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0876, train/loss_vlb_step=0.000291, train/loss_step=0.0876, global_step=2495.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1960/5971 [21:12<43:23,  1.54it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.02e-5, train/loss_step=0.0112, global_step=2495.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  33%|███▎      | 1961/5971 [21:13<43:23,  1.54it/s, loss=0.163, v_num=0, train/loss_simple_step=0.416, train/loss_vlb_step=0.00208, train/loss_step=0.416, global_step=2496.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  33%|███▎      | 1962/5971 [21:14<43:23,  1.54it/s, loss=0.163, v_num=0, train/loss_simple_step=0.416, train/loss_vlb_step=0.00208, train/loss_step=0.416, global_step=2496.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1962/5971 [21:14<43:23,  1.54it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.24e-5, train/loss_step=0.0122, global_step=2496.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1963/5971 [21:15<43:23,  1.54it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0364, train/loss_vlb_step=0.000134, train/loss_step=0.0364, global_step=2496.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1964/5971 [21:17<43:25,  1.54it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0872, train/loss_vlb_step=0.000288, train/loss_step=0.0872, global_step=2496.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1965/5971 [21:18<43:25,  1.54it/s, loss=0.171, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.00055, train/loss_step=0.154, global_step=2497.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  33%|███▎      | 1966/5971 [21:19<43:25,  1.54it/s, loss=0.171, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.00055, train/loss_step=0.154, global_step=2497.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1966/5971 [21:19<43:25,  1.54it/s, loss=0.174, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000406, train/loss_step=0.119, global_step=2497.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1967/5971 [21:20<43:25,  1.54it/s, loss=0.159, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000337, train/loss_step=0.103, global_step=2497.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1968/5971 [21:22<43:27,  1.54it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=5.93e-5, train/loss_step=0.0138, global_step=2497.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1969/5971 [21:23<43:27,  1.53it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=5.2e-5, train/loss_step=0.0118, global_step=2498.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  33%|███▎      | 1970/5971 [21:24<43:27,  1.53it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=5.2e-5, train/loss_step=0.0118, global_step=2498.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1970/5971 [21:24<43:27,  1.53it/s, loss=0.154, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000588, train/loss_step=0.174, global_step=2498.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1971/5971 [21:25<43:27,  1.53it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0296, train/loss_vlb_step=0.000111, train/loss_step=0.0296, global_step=2498.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1972/5971 [21:27<43:30,  1.53it/s, loss=0.146, v_num=0, train/loss_simple_step=0.822, train/loss_vlb_step=0.0529, train/loss_step=0.822, global_step=2498.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  33%|███▎      | 1973/5971 [21:28<43:30,  1.53it/s, loss=0.151, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000346, train/loss_step=0.105, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1973/5971 [21:40<43:54,  1.52it/s, loss=0.151, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000346, train/loss_step=0.105, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1974/5971 [22:02<44:37,  1.49it/s, loss=0.151, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000346, train/loss_step=0.105, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1974/5971 [22:02<44:37,  1.49it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00173, train/loss_vlb_step=1.04e-5, train/loss_step=0.00173, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1975/5971 [22:03<44:37,  1.49it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00173, train/loss_vlb_step=1.04e-5, train/loss_step=0.00173, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1975/5971 [22:03<44:37,  1.49it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00872, train/loss_vlb_step=4.3e-5, train/loss_step=0.00872, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  33%|███▎      | 1976/5971 [22:05<44:39,  1.49it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00872, train/loss_vlb_step=4.3e-5, train/loss_step=0.00872, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  33%|███▎      | 1976/5971 [22:05<44:39,  1.49it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:14,  2.23it/s][A
Epoch 4:  33%|███▎      | 1978/5971 [22:06<44:36,  1.49it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   1%|          | 2/167 [00:00<00:48,  3.43it/s][A
Epoch 4:  33%|███▎      | 1980/5971 [22:06<44:32,  1.49it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.03it/s][A
Epoch 4:  33%|███▎      | 1983/5971 [22:06<44:26,  1.50it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.82it/s][A
Epoch 4:  33%|███▎      | 1986/5971 [22:06<44:21,  1.50it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   7%|▋         | 11/167 [00:00<00:08, 17.58it/s][A
Epoch 4:  33%|███▎      | 1989/5971 [22:06<44:15,  1.50it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   8%|▊         | 14/167 [00:01<00:07, 20.05it/s][A
Epoch 4:  33%|███▎      | 1992/5971 [22:07<44:09,  1.50it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  10%|█         | 17/167 [00:01<00:06, 21.94it/s][A
Epoch 4:  33%|███▎      | 1995/5971 [22:07<44:03,  1.50it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.71it/s][A
Epoch 4:  33%|███▎      | 1998/5971 [22:07<43:58,  1.51it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 23.70it/s][A
Epoch 4:  34%|███▎      | 2001/5971 [22:07<43:52,  1.51it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.65it/s][A
Epoch 4:  34%|███▎      | 2004/5971 [22:07<43:46,  1.51it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 26.07it/s][A
Epoch 4:  34%|███▎      | 2007/5971 [22:07<43:40,  1.51it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 25.62it/s][A
Epoch 4:  34%|███▎      | 2010/5971 [22:07<43:35,  1.51it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  21%|██        | 35/167 [00:01<00:05, 23.48it/s][A
Epoch 4:  34%|███▎      | 2013/5971 [22:07<43:29,  1.52it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  23%|██▎       | 38/167 [00:02<00:05, 22.89it/s][A
Epoch 4:  34%|███▍      | 2016/5971 [22:08<43:24,  1.52it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  25%|██▍       | 41/167 [00:02<00:05, 23.57it/s][A
Epoch 4:  34%|███▍      | 2019/5971 [22:08<43:18,  1.52it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 25.42it/s][A
Epoch 4:  34%|███▍      | 2022/5971 [22:08<43:12,  1.52it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  29%|██▉       | 49/167 [00:02<00:04, 27.09it/s][A
Epoch 4:  34%|███▍      | 2026/5971 [22:08<43:05,  1.53it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 28.00it/s][A
Epoch 4:  34%|███▍      | 2030/5971 [22:08<42:57,  1.53it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 27.70it/s][A
Epoch 4:  34%|███▍      | 2034/5971 [22:08<42:50,  1.53it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  35%|███▌      | 59/167 [00:02<00:04, 26.80it/s][A
Epoch 4:  34%|███▍      | 2038/5971 [22:08<42:43,  1.53it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  37%|███▋      | 62/167 [00:02<00:03, 27.60it/s][A

Validating:  39%|███▉      | 65/167 [00:03<00:03, 28.00it/s][A
Epoch 4:  34%|███▍      | 2042/5971 [22:08<42:35,  1.54it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  41%|████      | 68/167 [00:03<00:03, 27.20it/s][A
Epoch 4:  34%|███▍      | 2046/5971 [22:09<42:28,  1.54it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 28.05it/s][A
Epoch 4:  34%|███▍      | 2050/5971 [22:09<42:21,  1.54it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 27.42it/s][A
Epoch 4:  34%|███▍      | 2054/5971 [22:09<42:13,  1.55it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 27.05it/s][A
Epoch 4:  34%|███▍      | 2058/5971 [22:09<42:06,  1.55it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  49%|████▉     | 82/167 [00:03<00:03, 27.65it/s][A

Validating:  51%|█████     | 85/167 [00:03<00:03, 26.58it/s][A
Epoch 4:  35%|███▍      | 2062/5971 [22:09<41:59,  1.55it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  53%|█████▎    | 88/167 [00:03<00:02, 26.93it/s][A
Epoch 4:  35%|███▍      | 2066/5971 [22:09<41:52,  1.55it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  54%|█████▍    | 91/167 [00:03<00:03, 24.99it/s][A
Epoch 4:  35%|███▍      | 2070/5971 [22:10<41:45,  1.56it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 24.97it/s][A

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 25.96it/s][A
Epoch 4:  35%|███▍      | 2074/5971 [22:10<41:38,  1.56it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 26.67it/s][A
Epoch 4:  35%|███▍      | 2078/5971 [22:10<41:31,  1.56it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 27.31it/s][A
Epoch 4:  35%|███▍      | 2082/5971 [22:10<41:24,  1.57it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 27.28it/s][A

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 27.24it/s][A
Epoch 4:  35%|███▍      | 2086/5971 [22:10<41:16,  1.57it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  67%|██████▋   | 112/167 [00:04<00:02, 24.60it/s][A
Epoch 4:  35%|███▌      | 2090/5971 [22:10<41:10,  1.57it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  69%|██████▉   | 115/167 [00:04<00:02, 25.81it/s][A
Epoch 4:  35%|███▌      | 2094/5971 [22:10<41:03,  1.57it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  71%|███████   | 118/167 [00:05<00:01, 26.00it/s][A

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 25.58it/s][A
Epoch 4:  35%|███▌      | 2098/5971 [22:11<40:56,  1.58it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 25.31it/s][A
Epoch 4:  35%|███▌      | 2102/5971 [22:11<40:49,  1.58it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 26.17it/s][A
Epoch 4:  35%|███▌      | 2106/5971 [22:11<40:42,  1.58it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 26.88it/s][A
Epoch 4:  35%|███▌      | 2110/5971 [22:11<40:35,  1.59it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  80%|████████  | 134/167 [00:05<00:01, 26.20it/s][A

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 26.70it/s][A
Epoch 4:  35%|███▌      | 2114/5971 [22:11<40:28,  1.59it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  84%|████████▍ | 140/167 [00:05<00:00, 27.36it/s][A
Epoch 4:  35%|███▌      | 2118/5971 [22:11<40:21,  1.59it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  86%|████████▌ | 143/167 [00:05<00:00, 27.94it/s][A
Epoch 4:  36%|███▌      | 2122/5971 [22:11<40:14,  1.59it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 27.54it/s][A

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 25.52it/s][A
Epoch 4:  36%|███▌      | 2126/5971 [22:12<40:08,  1.60it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 27.35it/s][A
Epoch 4:  36%|███▌      | 2130/5971 [22:12<40:01,  1.60it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 27.46it/s][A
Epoch 4:  36%|███▌      | 2134/5971 [22:12<39:54,  1.60it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 28.03it/s][A
Epoch 4:  36%|███▌      | 2138/5971 [22:12<39:47,  1.61it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 27.63it/s][A
Epoch 4:  36%|███▌      | 2142/5971 [22:12<39:41,  1.61it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 28.17it/s][A
Epoch 4:  36%|███▌      | 2144/5971 [22:13<39:38,  1.61it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.81e-6, train/loss_step=0.00145, global_step=2499.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.35it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.45it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.31it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.96it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.45it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.81it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.05it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.23it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.37it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.47it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.54it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.59it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.61it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.58it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.61it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.64it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.64it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.65it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.65it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.66it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.67it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.69it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.69it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.69it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:04<00:04,  5.68it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.66it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.63it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.61it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.57it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.59it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.62it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.64it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.66it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.68it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.69it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.69it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.69it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.69it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.70it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.70it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.71it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:07<00:01,  5.69it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.68it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.65it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.67it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.68it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:08<00:00,  5.68it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.67it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.68it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.70it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.33it/s]

Epoch 4:  36%|███▌      | 2145/5971 [22:24<39:57,  1.60it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0256, train/loss_vlb_step=0.000106, train/loss_step=0.0256, global_step=2500.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.33it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.39it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.25it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.92it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.43it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.80it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.07it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.25it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.37it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.45it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.51it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.57it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.61it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.64it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.66it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:05,  5.67it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.69it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.70it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.66it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.63it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.58it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.58it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.59it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.63it/s][A
Epoch 4:  36%|███▌      | 2145/5971 [22:30<40:08,  1.59it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0256, train/loss_vlb_step=0.000106, train/loss_step=0.0256, global_step=2500.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Spaced Sampler:  50%|█████     | 25/50 [00:04<00:04,  5.64it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.66it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.66it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.66it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.65it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.66it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.66it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.67it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.67it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.67it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.68it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.69it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.69it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.69it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.66it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.67it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.67it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:07<00:01,  5.67it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.68it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.68it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.65it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.63it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:08<00:00,  5.60it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.57it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.55it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.58it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.31it/s]

Epoch 4:  36%|███▌      | 2146/5971 [22:36<40:16,  1.58it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0256, train/loss_vlb_step=0.000106, train/loss_step=0.0256, global_step=2500.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▌      | 2146/5971 [22:36<40:16,  1.58it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0782, train/loss_vlb_step=0.000257, train/loss_step=0.0782, global_step=2500.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.35it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.44it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.30it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.94it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.44it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.81it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.07it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.11it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.24it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.35it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.43it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.26it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:07,  5.10it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:07,  5.08it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.11it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.11it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.07it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:06,  4.97it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:06,  5.06it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.12it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.17it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.11it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  4.95it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:05,  4.90it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:05,  4.67it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:05,  4.67it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  4.67it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  4.80it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:06<00:04,  4.93it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.14it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.29it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.40it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.46it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:07<00:02,  5.52it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.57it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.38it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.34it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.35it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:08<00:02,  5.30it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.27it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.27it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.30it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.36it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.41it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:00,  5.49it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.55it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.59it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.62it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.63it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  5.56it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  5.00it/s]

Epoch 4:  36%|███▌      | 2147/5971 [22:48<40:36,  1.57it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0782, train/loss_vlb_step=0.000257, train/loss_step=0.0782, global_step=2500.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▌      | 2147/5971 [22:48<40:36,  1.57it/s, loss=0.138, v_num=0, train/loss_simple_step=0.542, train/loss_vlb_step=0.00493, train/loss_step=0.542, global_step=2500.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.31it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.34it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:15,  3.11it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.70it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.11it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.48it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.78it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.03it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.22it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.25it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.23it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.26it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:07,  5.19it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.21it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.20it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.17it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.18it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:06,  5.33it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.43it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.30it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.24it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.21it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.19it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.23it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.35it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.45it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.43it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.45it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:06<00:03,  5.49it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.52it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.55it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.57it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.61it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.63it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.57it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.58it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.60it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.60it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.49it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.54it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.59it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.61it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.53it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.57it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.60it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.63it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.64it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.67it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.68it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.68it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.12it/s]

Epoch 4:  36%|███▌      | 2148/5971 [23:02<40:58,  1.55it/s, loss=0.138, v_num=0, train/loss_simple_step=0.542, train/loss_vlb_step=0.00493, train/loss_step=0.542, global_step=2500.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▌      | 2148/5971 [23:02<40:58,  1.55it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0328, train/loss_vlb_step=0.000118, train/loss_step=0.0328, global_step=2500.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▌      | 2149/5971 [23:03<40:58,  1.55it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0328, train/loss_vlb_step=0.000118, train/loss_step=0.0328, global_step=2500.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▌      | 2149/5971 [23:03<40:58,  1.55it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=4.8e-5, train/loss_step=0.0114, global_step=2501.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  36%|███▌      | 2150/5971 [23:03<40:58,  1.55it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=4.8e-5, train/loss_step=0.0114, global_step=2501.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▌      | 2150/5971 [23:03<40:58,  1.55it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0487, train/loss_vlb_step=0.000169, train/loss_step=0.0487, global_step=2501.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▌      | 2151/5971 [23:04<40:58,  1.55it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0487, train/loss_vlb_step=0.000169, train/loss_step=0.0487, global_step=2501.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▌      | 2151/5971 [23:04<40:58,  1.55it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0627, train/loss_vlb_step=0.00022, train/loss_step=0.0627, global_step=2501.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▌      | 2152/5971 [23:06<41:00,  1.55it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0627, train/loss_vlb_step=0.00022, train/loss_step=0.0627, global_step=2501.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▌      | 2152/5971 [23:06<41:00,  1.55it/s, loss=0.139, v_num=0, train/loss_simple_step=0.431, train/loss_vlb_step=0.00275, train/loss_step=0.431, global_step=2501.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  36%|███▌      | 2153/5971 [23:07<41:00,  1.55it/s, loss=0.139, v_num=0, train/loss_simple_step=0.431, train/loss_vlb_step=0.00275, train/loss_step=0.431, global_step=2501.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▌      | 2153/5971 [23:07<41:00,  1.55it/s, loss=0.142, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000729, train/loss_step=0.211, global_step=2502.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▌      | 2154/5971 [23:08<40:59,  1.55it/s, loss=0.142, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000729, train/loss_step=0.211, global_step=2502.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▌      | 2154/5971 [23:08<40:59,  1.55it/s, loss=0.138, v_num=0, train/loss_simple_step=0.043, train/loss_vlb_step=0.000154, train/loss_step=0.043, global_step=2502.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▌      | 2155/5971 [23:09<40:59,  1.55it/s, loss=0.138, v_num=0, train/loss_simple_step=0.043, train/loss_vlb_step=0.000154, train/loss_step=0.043, global_step=2502.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▌      | 2155/5971 [23:09<40:59,  1.55it/s, loss=0.153, v_num=0, train/loss_simple_step=0.413, train/loss_vlb_step=0.00236, train/loss_step=0.413, global_step=2502.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  36%|███▌      | 2156/5971 [23:11<41:01,  1.55it/s, loss=0.153, v_num=0, train/loss_simple_step=0.413, train/loss_vlb_step=0.00236, train/loss_step=0.413, global_step=2502.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▌      | 2156/5971 [23:11<41:01,  1.55it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0762, train/loss_vlb_step=0.000253, train/loss_step=0.0762, global_step=2502.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▌      | 2157/5971 [23:12<41:01,  1.55it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0762, train/loss_vlb_step=0.000253, train/loss_step=0.0762, global_step=2502.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▌      | 2157/5971 [23:12<41:01,  1.55it/s, loss=0.162, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000414, train/loss_step=0.125, global_step=2503.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  36%|███▌      | 2158/5971 [23:13<41:01,  1.55it/s, loss=0.162, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000414, train/loss_step=0.125, global_step=2503.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▌      | 2158/5971 [23:13<41:01,  1.55it/s, loss=0.186, v_num=0, train/loss_simple_step=0.654, train/loss_vlb_step=0.0107, train/loss_step=0.654, global_step=2503.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  36%|███▌      | 2159/5971 [23:14<41:01,  1.55it/s, loss=0.186, v_num=0, train/loss_simple_step=0.654, train/loss_vlb_step=0.0107, train/loss_step=0.654, global_step=2503.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▌      | 2159/5971 [23:14<41:01,  1.55it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0575, train/loss_vlb_step=0.000201, train/loss_step=0.0575, global_step=2503.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▌      | 2160/5971 [23:16<41:03,  1.55it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0575, train/loss_vlb_step=0.000201, train/loss_step=0.0575, global_step=2503.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▌      | 2160/5971 [23:16<41:03,  1.55it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0296, train/loss_vlb_step=0.000117, train/loss_step=0.0296, global_step=2503.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▌      | 2161/5971 [23:17<41:03,  1.55it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0296, train/loss_vlb_step=0.000117, train/loss_step=0.0296, global_step=2503.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▌      | 2161/5971 [23:17<41:03,  1.55it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00437, train/loss_vlb_step=2.27e-5, train/loss_step=0.00437, global_step=2504.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▌      | 2162/5971 [23:18<41:03,  1.55it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00437, train/loss_vlb_step=2.27e-5, train/loss_step=0.00437, global_step=2504.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▌      | 2162/5971 [23:18<41:03,  1.55it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000162, train/loss_step=0.0453, global_step=2504.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  36%|███▌      | 2163/5971 [23:19<41:02,  1.55it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000162, train/loss_step=0.0453, global_step=2504.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▌      | 2163/5971 [23:19<41:02,  1.55it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00407, train/loss_vlb_step=2.09e-5, train/loss_step=0.00407, global_step=2504.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▌      | 2164/5971 [23:21<41:04,  1.54it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00407, train/loss_vlb_step=2.09e-5, train/loss_step=0.00407, global_step=2504.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▌      | 2164/5971 [23:21<41:04,  1.54it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00393, train/loss_vlb_step=2.14e-5, train/loss_step=0.00393, global_step=2504.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▋      | 2165/5971 [23:22<41:04,  1.54it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00393, train/loss_vlb_step=2.14e-5, train/loss_step=0.00393, global_step=2504.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▋      | 2165/5971 [23:22<41:04,  1.54it/s, loss=0.154, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.00074, train/loss_step=0.205, global_step=2505.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  36%|███▋      | 2166/5971 [23:23<41:04,  1.54it/s, loss=0.154, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.00074, train/loss_step=0.205, global_step=2505.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▋      | 2166/5971 [23:23<41:04,  1.54it/s, loss=0.153, v_num=0, train/loss_simple_step=0.064, train/loss_vlb_step=0.000213, train/loss_step=0.064, global_step=2505.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▋      | 2167/5971 [23:24<41:04,  1.54it/s, loss=0.153, v_num=0, train/loss_simple_step=0.064, train/loss_vlb_step=0.000213, train/loss_step=0.064, global_step=2505.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▋      | 2167/5971 [23:24<41:04,  1.54it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0356, train/loss_vlb_step=0.000135, train/loss_step=0.0356, global_step=2505.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▋      | 2168/5971 [23:27<41:07,  1.54it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0356, train/loss_vlb_step=0.000135, train/loss_step=0.0356, global_step=2505.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▋      | 2168/5971 [23:27<41:07,  1.54it/s, loss=0.127, v_num=0, train/loss_simple_step=0.014, train/loss_vlb_step=5.96e-5, train/loss_step=0.014, global_step=2505.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  36%|███▋      | 2169/5971 [23:28<41:06,  1.54it/s, loss=0.127, v_num=0, train/loss_simple_step=0.014, train/loss_vlb_step=5.96e-5, train/loss_step=0.014, global_step=2505.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▋      | 2169/5971 [23:28<41:06,  1.54it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0302, train/loss_vlb_step=0.000108, train/loss_step=0.0302, global_step=2506.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▋      | 2170/5971 [23:28<41:06,  1.54it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0302, train/loss_vlb_step=0.000108, train/loss_step=0.0302, global_step=2506.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▋      | 2170/5971 [23:28<41:06,  1.54it/s, loss=0.152, v_num=0, train/loss_simple_step=0.525, train/loss_vlb_step=0.00502, train/loss_step=0.525, global_step=2506.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  36%|███▋      | 2171/5971 [23:29<41:06,  1.54it/s, loss=0.152, v_num=0, train/loss_simple_step=0.525, train/loss_vlb_step=0.00502, train/loss_step=0.525, global_step=2506.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▋      | 2171/5971 [23:29<41:06,  1.54it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.66e-5, train/loss_step=0.0173, global_step=2506.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▋      | 2172/5971 [23:31<41:08,  1.54it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.66e-5, train/loss_step=0.0173, global_step=2506.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▋      | 2172/5971 [23:31<41:08,  1.54it/s, loss=0.137, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000604, train/loss_step=0.175, global_step=2506.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  36%|███▋      | 2173/5971 [23:32<41:08,  1.54it/s, loss=0.137, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000604, train/loss_step=0.175, global_step=2506.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▋      | 2173/5971 [23:32<41:08,  1.54it/s, loss=0.145, v_num=0, train/loss_simple_step=0.385, train/loss_vlb_step=0.00284, train/loss_step=0.385, global_step=2507.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  36%|███▋      | 2174/5971 [23:33<41:07,  1.54it/s, loss=0.145, v_num=0, train/loss_simple_step=0.385, train/loss_vlb_step=0.00284, train/loss_step=0.385, global_step=2507.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▋      | 2174/5971 [23:33<41:07,  1.54it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0458, train/loss_vlb_step=0.000157, train/loss_step=0.0458, global_step=2507.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▋      | 2175/5971 [23:34<41:07,  1.54it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0458, train/loss_vlb_step=0.000157, train/loss_step=0.0458, global_step=2507.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▋      | 2175/5971 [23:34<41:07,  1.54it/s, loss=0.135, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.00075, train/loss_step=0.206, global_step=2507.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  36%|███▋      | 2176/5971 [23:36<41:09,  1.54it/s, loss=0.135, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.00075, train/loss_step=0.206, global_step=2507.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▋      | 2176/5971 [23:36<41:09,  1.54it/s, loss=0.16, v_num=0, train/loss_simple_step=0.579, train/loss_vlb_step=0.0047, train/loss_step=0.579, global_step=2507.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  36%|███▋      | 2177/5971 [23:37<41:09,  1.54it/s, loss=0.16, v_num=0, train/loss_simple_step=0.579, train/loss_vlb_step=0.0047, train/loss_step=0.579, global_step=2507.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▋      | 2177/5971 [23:37<41:09,  1.54it/s, loss=0.168, v_num=0, train/loss_simple_step=0.280, train/loss_vlb_step=0.0011, train/loss_step=0.280, global_step=2508.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▋      | 2178/5971 [23:38<41:09,  1.54it/s, loss=0.168, v_num=0, train/loss_simple_step=0.280, train/loss_vlb_step=0.0011, train/loss_step=0.280, global_step=2508.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▋      | 2178/5971 [23:38<41:09,  1.54it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0048, train/loss_vlb_step=2.37e-5, train/loss_step=0.0048, global_step=2508.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▋      | 2179/5971 [23:39<41:08,  1.54it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0048, train/loss_vlb_step=2.37e-5, train/loss_step=0.0048, global_step=2508.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  36%|███▋      | 2179/5971 [23:39<41:08,  1.54it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00708, train/loss_vlb_step=3.45e-5, train/loss_step=0.00708, global_step=2508.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2180/5971 [23:41<41:11,  1.53it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00708, train/loss_vlb_step=3.45e-5, train/loss_step=0.00708, global_step=2508.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2180/5971 [23:41<41:11,  1.53it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0226, train/loss_vlb_step=8.91e-5, train/loss_step=0.0226, global_step=2508.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  37%|███▋      | 2181/5971 [23:42<41:11,  1.53it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0226, train/loss_vlb_step=8.91e-5, train/loss_step=0.0226, global_step=2508.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2181/5971 [23:42<41:11,  1.53it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00551, train/loss_vlb_step=2.76e-5, train/loss_step=0.00551, global_step=2509.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2182/5971 [23:43<41:11,  1.53it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00551, train/loss_vlb_step=2.76e-5, train/loss_step=0.00551, global_step=2509.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2182/5971 [23:43<41:11,  1.53it/s, loss=0.155, v_num=0, train/loss_simple_step=0.492, train/loss_vlb_step=0.00527, train/loss_step=0.492, global_step=2509.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  37%|███▋      | 2183/5971 [23:44<41:10,  1.53it/s, loss=0.155, v_num=0, train/loss_simple_step=0.492, train/loss_vlb_step=0.00527, train/loss_step=0.492, global_step=2509.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2183/5971 [23:44<41:10,  1.53it/s, loss=0.182, v_num=0, train/loss_simple_step=0.535, train/loss_vlb_step=0.00458, train/loss_step=0.535, global_step=2509.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2184/5971 [23:46<41:13,  1.53it/s, loss=0.182, v_num=0, train/loss_simple_step=0.535, train/loss_vlb_step=0.00458, train/loss_step=0.535, global_step=2509.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2184/5971 [23:46<41:13,  1.53it/s, loss=0.196, v_num=0, train/loss_simple_step=0.290, train/loss_vlb_step=0.00158, train/loss_step=0.290, global_step=2509.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2185/5971 [23:47<41:12,  1.53it/s, loss=0.196, v_num=0, train/loss_simple_step=0.290, train/loss_vlb_step=0.00158, train/loss_step=0.290, global_step=2509.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2185/5971 [23:47<41:12,  1.53it/s, loss=0.197, v_num=0, train/loss_simple_step=0.238, train/loss_vlb_step=0.000945, train/loss_step=0.238, global_step=2510.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2186/5971 [23:48<41:12,  1.53it/s, loss=0.197, v_num=0, train/loss_simple_step=0.238, train/loss_vlb_step=0.000945, train/loss_step=0.238, global_step=2510.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2186/5971 [23:48<41:12,  1.53it/s, loss=0.194, v_num=0, train/loss_simple_step=0.00209, train/loss_vlb_step=1.24e-5, train/loss_step=0.00209, global_step=2510.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2187/5971 [23:49<41:12,  1.53it/s, loss=0.194, v_num=0, train/loss_simple_step=0.00209, train/loss_vlb_step=1.24e-5, train/loss_step=0.00209, global_step=2510.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2187/5971 [23:49<41:12,  1.53it/s, loss=0.193, v_num=0, train/loss_simple_step=0.00389, train/loss_vlb_step=2.04e-5, train/loss_step=0.00389, global_step=2510.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2188/5971 [23:51<41:14,  1.53it/s, loss=0.193, v_num=0, train/loss_simple_step=0.00389, train/loss_vlb_step=2.04e-5, train/loss_step=0.00389, global_step=2510.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2188/5971 [23:51<41:14,  1.53it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0714, train/loss_vlb_step=0.000242, train/loss_step=0.0714, global_step=2510.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  37%|███▋      | 2189/5971 [23:52<41:14,  1.53it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0714, train/loss_vlb_step=0.000242, train/loss_step=0.0714, global_step=2510.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2189/5971 [23:52<41:14,  1.53it/s, loss=0.195, v_num=0, train/loss_simple_step=0.024, train/loss_vlb_step=9.75e-5, train/loss_step=0.024, global_step=2511.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  37%|███▋      | 2190/5971 [23:53<41:14,  1.53it/s, loss=0.195, v_num=0, train/loss_simple_step=0.024, train/loss_vlb_step=9.75e-5, train/loss_step=0.024, global_step=2511.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2190/5971 [23:53<41:14,  1.53it/s, loss=0.172, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.00021, train/loss_step=0.060, global_step=2511.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2191/5971 [23:54<41:13,  1.53it/s, loss=0.172, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.00021, train/loss_step=0.060, global_step=2511.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2191/5971 [23:54<41:13,  1.53it/s, loss=0.177, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000406, train/loss_step=0.124, global_step=2511.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2192/5971 [23:57<41:16,  1.53it/s, loss=0.177, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000406, train/loss_step=0.124, global_step=2511.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2192/5971 [23:57<41:16,  1.53it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0191, train/loss_vlb_step=8.17e-5, train/loss_step=0.0191, global_step=2511.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2193/5971 [23:57<41:16,  1.53it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0191, train/loss_vlb_step=8.17e-5, train/loss_step=0.0191, global_step=2511.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2193/5971 [23:57<41:16,  1.53it/s, loss=0.162, v_num=0, train/loss_simple_step=0.239, train/loss_vlb_step=0.000832, train/loss_step=0.239, global_step=2512.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2194/5971 [23:58<41:15,  1.53it/s, loss=0.162, v_num=0, train/loss_simple_step=0.239, train/loss_vlb_step=0.000832, train/loss_step=0.239, global_step=2512.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2194/5971 [23:58<41:15,  1.53it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00129, train/loss_vlb_step=7.66e-6, train/loss_step=0.00129, global_step=2512.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2195/5971 [23:59<41:15,  1.53it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00129, train/loss_vlb_step=7.66e-6, train/loss_step=0.00129, global_step=2512.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2195/5971 [23:59<41:15,  1.53it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00553, train/loss_vlb_step=2.71e-5, train/loss_step=0.00553, global_step=2512.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2196/5971 [24:01<41:17,  1.52it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00553, train/loss_vlb_step=2.71e-5, train/loss_step=0.00553, global_step=2512.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2196/5971 [24:01<41:17,  1.52it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0946, train/loss_vlb_step=0.000311, train/loss_step=0.0946, global_step=2512.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2197/5971 [24:02<41:17,  1.52it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0946, train/loss_vlb_step=0.000311, train/loss_step=0.0946, global_step=2512.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2197/5971 [24:02<41:17,  1.52it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=6.14e-5, train/loss_step=0.0135, global_step=2513.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  37%|███▋      | 2198/5971 [24:03<41:16,  1.52it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=6.14e-5, train/loss_step=0.0135, global_step=2513.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2198/5971 [24:03<41:16,  1.52it/s, loss=0.152, v_num=0, train/loss_simple_step=0.792, train/loss_vlb_step=0.0185, train/loss_step=0.792, global_step=2513.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  37%|███▋      | 2199/5971 [24:04<41:16,  1.52it/s, loss=0.152, v_num=0, train/loss_simple_step=0.792, train/loss_vlb_step=0.0185, train/loss_step=0.792, global_step=2513.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2199/5971 [24:04<41:16,  1.52it/s, loss=0.171, v_num=0, train/loss_simple_step=0.384, train/loss_vlb_step=0.00247, train/loss_step=0.384, global_step=2513.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2200/5971 [24:06<41:18,  1.52it/s, loss=0.171, v_num=0, train/loss_simple_step=0.384, train/loss_vlb_step=0.00247, train/loss_step=0.384, global_step=2513.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2200/5971 [24:06<41:18,  1.52it/s, loss=0.179, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000602, train/loss_step=0.180, global_step=2513.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2201/5971 [24:07<41:18,  1.52it/s, loss=0.179, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000602, train/loss_step=0.180, global_step=2513.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2201/5971 [24:07<41:18,  1.52it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0902, train/loss_vlb_step=0.000297, train/loss_step=0.0902, global_step=2514.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2202/5971 [24:08<41:17,  1.52it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0902, train/loss_vlb_step=0.000297, train/loss_step=0.0902, global_step=2514.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2202/5971 [24:08<41:17,  1.52it/s, loss=0.164, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000359, train/loss_step=0.109, global_step=2514.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  37%|███▋      | 2203/5971 [24:09<41:17,  1.52it/s, loss=0.164, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000359, train/loss_step=0.109, global_step=2514.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2203/5971 [24:09<41:17,  1.52it/s, loss=0.137, v_num=0, train/loss_simple_step=0.007, train/loss_vlb_step=3.45e-5, train/loss_step=0.007, global_step=2514.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  37%|███▋      | 2204/5971 [24:11<41:19,  1.52it/s, loss=0.137, v_num=0, train/loss_simple_step=0.007, train/loss_vlb_step=3.45e-5, train/loss_step=0.007, global_step=2514.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2204/5971 [24:11<41:19,  1.52it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0577, train/loss_vlb_step=0.000205, train/loss_step=0.0577, global_step=2514.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2205/5971 [24:12<41:19,  1.52it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0577, train/loss_vlb_step=0.000205, train/loss_step=0.0577, global_step=2514.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2205/5971 [24:12<41:19,  1.52it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0963, train/loss_vlb_step=0.000317, train/loss_step=0.0963, global_step=2515.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2206/5971 [24:13<41:19,  1.52it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0963, train/loss_vlb_step=0.000317, train/loss_step=0.0963, global_step=2515.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2206/5971 [24:13<41:19,  1.52it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.08e-5, train/loss_step=0.0141, global_step=2515.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  37%|███▋      | 2207/5971 [24:14<41:18,  1.52it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.08e-5, train/loss_step=0.0141, global_step=2515.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2207/5971 [24:14<41:18,  1.52it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00757, train/loss_vlb_step=3.73e-5, train/loss_step=0.00757, global_step=2515.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2208/5971 [24:16<41:21,  1.52it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00757, train/loss_vlb_step=3.73e-5, train/loss_step=0.00757, global_step=2515.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2208/5971 [24:16<41:21,  1.52it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.33e-5, train/loss_step=0.0141, global_step=2515.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  37%|███▋      | 2209/5971 [24:17<41:21,  1.52it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.33e-5, train/loss_step=0.0141, global_step=2515.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2209/5971 [24:17<41:21,  1.52it/s, loss=0.122, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000433, train/loss_step=0.132, global_step=2516.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  37%|███▋      | 2210/5971 [24:18<41:20,  1.52it/s, loss=0.122, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000433, train/loss_step=0.132, global_step=2516.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2210/5971 [24:18<41:20,  1.52it/s, loss=0.131, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.00112, train/loss_step=0.242, global_step=2516.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  37%|███▋      | 2211/5971 [24:19<41:20,  1.52it/s, loss=0.131, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.00112, train/loss_step=0.242, global_step=2516.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2211/5971 [24:19<41:20,  1.52it/s, loss=0.147, v_num=0, train/loss_simple_step=0.436, train/loss_vlb_step=0.00271, train/loss_step=0.436, global_step=2516.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2212/5971 [24:21<41:22,  1.51it/s, loss=0.147, v_num=0, train/loss_simple_step=0.436, train/loss_vlb_step=0.00271, train/loss_step=0.436, global_step=2516.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2212/5971 [24:21<41:22,  1.51it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00478, train/loss_vlb_step=2.52e-5, train/loss_step=0.00478, global_step=2516.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2213/5971 [24:22<41:22,  1.51it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00478, train/loss_vlb_step=2.52e-5, train/loss_step=0.00478, global_step=2516.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2213/5971 [24:22<41:22,  1.51it/s, loss=0.154, v_num=0, train/loss_simple_step=0.407, train/loss_vlb_step=0.0027, train/loss_step=0.407, global_step=2517.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]     
Epoch 4:  37%|███▋      | 2214/5971 [24:23<41:21,  1.51it/s, loss=0.154, v_num=0, train/loss_simple_step=0.407, train/loss_vlb_step=0.0027, train/loss_step=0.407, global_step=2517.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2214/5971 [24:23<41:21,  1.51it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0134, train/loss_vlb_step=6.19e-5, train/loss_step=0.0134, global_step=2517.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2215/5971 [24:24<41:21,  1.51it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0134, train/loss_vlb_step=6.19e-5, train/loss_step=0.0134, global_step=2517.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2215/5971 [24:24<41:21,  1.51it/s, loss=0.172, v_num=0, train/loss_simple_step=0.352, train/loss_vlb_step=0.0019, train/loss_step=0.352, global_step=2517.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  37%|███▋      | 2216/5971 [24:26<41:23,  1.51it/s, loss=0.172, v_num=0, train/loss_simple_step=0.352, train/loss_vlb_step=0.0019, train/loss_step=0.352, global_step=2517.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2216/5971 [24:26<41:23,  1.51it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00918, train/loss_vlb_step=4.34e-5, train/loss_step=0.00918, global_step=2517.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2217/5971 [24:27<41:23,  1.51it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00918, train/loss_vlb_step=4.34e-5, train/loss_step=0.00918, global_step=2517.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2217/5971 [24:27<41:23,  1.51it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0274, train/loss_vlb_step=0.00011, train/loss_step=0.0274, global_step=2518.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  37%|███▋      | 2218/5971 [24:27<41:22,  1.51it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0274, train/loss_vlb_step=0.00011, train/loss_step=0.0274, global_step=2518.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2218/5971 [24:27<41:22,  1.51it/s, loss=0.135, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=2518.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  37%|███▋      | 2219/5971 [24:28<41:22,  1.51it/s, loss=0.135, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=2518.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2219/5971 [24:28<41:22,  1.51it/s, loss=0.128, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000952, train/loss_step=0.240, global_step=2518.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2220/5971 [24:31<41:24,  1.51it/s, loss=0.128, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000952, train/loss_step=0.240, global_step=2518.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2220/5971 [24:31<41:24,  1.51it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0213, train/loss_vlb_step=8.53e-5, train/loss_step=0.0213, global_step=2518.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2221/5971 [24:32<41:24,  1.51it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0213, train/loss_vlb_step=8.53e-5, train/loss_step=0.0213, global_step=2518.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2221/5971 [24:32<41:24,  1.51it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00765, train/loss_vlb_step=3.74e-5, train/loss_step=0.00765, global_step=2519.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2222/5971 [24:33<41:24,  1.51it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00765, train/loss_vlb_step=3.74e-5, train/loss_step=0.00765, global_step=2519.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2222/5971 [24:33<41:24,  1.51it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000212, train/loss_step=0.0626, global_step=2519.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  37%|███▋      | 2223/5971 [24:34<41:24,  1.51it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000212, train/loss_step=0.0626, global_step=2519.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2223/5971 [24:34<41:24,  1.51it/s, loss=0.121, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000506, train/loss_step=0.153, global_step=2519.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  37%|███▋      | 2224/5971 [24:36<41:27,  1.51it/s, loss=0.121, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000506, train/loss_step=0.153, global_step=2519.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2224/5971 [24:36<41:27,  1.51it/s, loss=0.133, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00115, train/loss_step=0.296, global_step=2519.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  37%|███▋      | 2225/5971 [24:38<41:27,  1.51it/s, loss=0.133, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00115, train/loss_step=0.296, global_step=2519.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2225/5971 [24:38<41:27,  1.51it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00204, train/loss_vlb_step=1.18e-5, train/loss_step=0.00204, global_step=2520.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2226/5971 [24:39<41:28,  1.51it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00204, train/loss_vlb_step=1.18e-5, train/loss_step=0.00204, global_step=2520.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2226/5971 [24:39<41:28,  1.51it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.51e-5, train/loss_step=0.0124, global_step=2520.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  37%|███▋      | 2227/5971 [24:40<41:27,  1.50it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.51e-5, train/loss_step=0.0124, global_step=2520.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2227/5971 [24:40<41:27,  1.50it/s, loss=0.151, v_num=0, train/loss_simple_step=0.459, train/loss_vlb_step=0.00358, train/loss_step=0.459, global_step=2520.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  37%|███▋      | 2228/5971 [24:42<41:29,  1.50it/s, loss=0.151, v_num=0, train/loss_simple_step=0.459, train/loss_vlb_step=0.00358, train/loss_step=0.459, global_step=2520.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2228/5971 [24:42<41:29,  1.50it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0306, train/loss_vlb_step=0.000114, train/loss_step=0.0306, global_step=2520.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2229/5971 [24:43<41:29,  1.50it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0306, train/loss_vlb_step=0.000114, train/loss_step=0.0306, global_step=2520.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2229/5971 [24:43<41:29,  1.50it/s, loss=0.153, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.00053, train/loss_step=0.161, global_step=2521.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  37%|███▋      | 2230/5971 [24:44<41:29,  1.50it/s, loss=0.153, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.00053, train/loss_step=0.161, global_step=2521.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2230/5971 [24:44<41:29,  1.50it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0188, train/loss_vlb_step=7.74e-5, train/loss_step=0.0188, global_step=2521.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2231/5971 [24:45<41:28,  1.50it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0188, train/loss_vlb_step=7.74e-5, train/loss_step=0.0188, global_step=2521.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2231/5971 [24:45<41:28,  1.50it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0215, train/loss_vlb_step=8.53e-5, train/loss_step=0.0215, global_step=2521.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2232/5971 [24:47<41:30,  1.50it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0215, train/loss_vlb_step=8.53e-5, train/loss_step=0.0215, global_step=2521.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2232/5971 [24:47<41:30,  1.50it/s, loss=0.155, v_num=0, train/loss_simple_step=0.688, train/loss_vlb_step=0.0119, train/loss_step=0.688, global_step=2521.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  37%|███▋      | 2233/5971 [24:48<41:30,  1.50it/s, loss=0.155, v_num=0, train/loss_simple_step=0.688, train/loss_vlb_step=0.0119, train/loss_step=0.688, global_step=2521.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2233/5971 [24:48<41:30,  1.50it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00308, train/loss_vlb_step=1.59e-5, train/loss_step=0.00308, global_step=2522.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2234/5971 [24:49<41:30,  1.50it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00308, train/loss_vlb_step=1.59e-5, train/loss_step=0.00308, global_step=2522.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2234/5971 [24:49<41:30,  1.50it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00327, train/loss_vlb_step=1.8e-5, train/loss_step=0.00327, global_step=2522.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  37%|███▋      | 2235/5971 [24:50<41:29,  1.50it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00327, train/loss_vlb_step=1.8e-5, train/loss_step=0.00327, global_step=2522.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2235/5971 [24:50<41:29,  1.50it/s, loss=0.119, v_num=0, train/loss_simple_step=0.040, train/loss_vlb_step=0.000143, train/loss_step=0.040, global_step=2522.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  37%|███▋      | 2236/5971 [24:52<41:31,  1.50it/s, loss=0.119, v_num=0, train/loss_simple_step=0.040, train/loss_vlb_step=0.000143, train/loss_step=0.040, global_step=2522.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2236/5971 [24:52<41:31,  1.50it/s, loss=0.124, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000378, train/loss_step=0.115, global_step=2522.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2237/5971 [24:53<41:31,  1.50it/s, loss=0.124, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000378, train/loss_step=0.115, global_step=2522.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2237/5971 [24:53<41:31,  1.50it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0341, train/loss_vlb_step=0.000118, train/loss_step=0.0341, global_step=2523.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2238/5971 [24:54<41:31,  1.50it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0341, train/loss_vlb_step=0.000118, train/loss_step=0.0341, global_step=2523.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2238/5971 [24:54<41:31,  1.50it/s, loss=0.14, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00312, train/loss_step=0.427, global_step=2523.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  37%|███▋      | 2239/5971 [24:55<41:31,  1.50it/s, loss=0.14, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00312, train/loss_step=0.427, global_step=2523.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  37%|███▋      | 2239/5971 [24:55<41:31,  1.50it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0389, train/loss_vlb_step=0.000144, train/loss_step=0.0389, global_step=2523.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  38%|███▊      | 2240/5971 [24:57<41:33,  1.50it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0389, train/loss_vlb_step=0.000144, train/loss_step=0.0389, global_step=2523.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  38%|███▊      | 2240/5971 [24:57<41:33,  1.50it/s, loss=0.145, v_num=0, train/loss_simple_step=0.331, train/loss_vlb_step=0.00169, train/loss_step=0.331, global_step=2523.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  38%|███▊      | 2241/5971 [24:58<41:33,  1.50it/s, loss=0.145, v_num=0, train/loss_simple_step=0.331, train/loss_vlb_step=0.00169, train/loss_step=0.331, global_step=2523.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  38%|███▊      | 2241/5971 [24:58<41:33,  1.50it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0106, train/loss_vlb_step=4.75e-5, train/loss_step=0.0106, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  38%|███▊      | 2242/5971 [24:59<41:33,  1.50it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0106, train/loss_vlb_step=4.75e-5, train/loss_step=0.0106, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  38%|███▊      | 2242/5971 [24:59<41:33,  1.50it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00941, train/loss_vlb_step=4.18e-5, train/loss_step=0.00941, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  38%|███▊      | 2243/5971 [25:00<41:32,  1.50it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00941, train/loss_vlb_step=4.18e-5, train/loss_step=0.00941, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  38%|███▊      | 2243/5971 [25:00<41:32,  1.50it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0337, train/loss_vlb_step=0.000122, train/loss_step=0.0337, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  38%|███▊      | 2244/5971 [25:02<41:35,  1.49it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0337, train/loss_vlb_step=0.000122, train/loss_step=0.0337, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  38%|███▊      | 2244/5971 [25:02<41:35,  1.49it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:14,  2.22it/s][A
Epoch 4:  38%|███▊      | 2246/5971 [25:03<41:32,  1.49it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   1%|          | 2/167 [00:00<00:48,  3.38it/s][A
Epoch 4:  38%|███▊      | 2248/5971 [25:03<41:29,  1.50it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   3%|▎         | 5/167 [00:00<00:18,  8.93it/s][A
Epoch 4:  38%|███▊      | 2251/5971 [25:03<41:23,  1.50it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.42it/s][A
Epoch 4:  38%|███▊      | 2254/5971 [25:03<41:18,  1.50it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.78it/s][A
Epoch 4:  38%|███▊      | 2257/5971 [25:03<41:13,  1.50it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.17it/s][A
Epoch 4:  38%|███▊      | 2260/5971 [25:04<41:08,  1.50it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  10%|█         | 17/167 [00:01<00:06, 21.89it/s][A
Epoch 4:  38%|███▊      | 2264/5971 [25:04<41:01,  1.51it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 23.62it/s][A

Validating:  14%|█▍        | 23/167 [00:01<00:06, 23.72it/s][A
Epoch 4:  38%|███▊      | 2268/5971 [25:04<40:55,  1.51it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.56it/s][A
Epoch 4:  38%|███▊      | 2272/5971 [25:04<40:48,  1.51it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 25.41it/s][A
Epoch 4:  38%|███▊      | 2276/5971 [25:04<40:41,  1.51it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 25.82it/s][A

Validating:  21%|██        | 35/167 [00:01<00:05, 25.86it/s][A
Epoch 4:  38%|███▊      | 2280/5971 [25:04<40:35,  1.52it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  23%|██▎       | 38/167 [00:01<00:04, 26.25it/s][A
Epoch 4:  38%|███▊      | 2284/5971 [25:05<40:28,  1.52it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  25%|██▍       | 41/167 [00:02<00:05, 23.86it/s][A
Epoch 4:  38%|███▊      | 2288/5971 [25:05<40:21,  1.52it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  26%|██▋       | 44/167 [00:02<00:05, 23.93it/s][A

Validating:  28%|██▊       | 47/167 [00:02<00:05, 23.80it/s][A
Epoch 4:  38%|███▊      | 2292/5971 [25:05<40:15,  1.52it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 23.79it/s][A
Epoch 4:  38%|███▊      | 2296/5971 [25:05<40:08,  1.53it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 24.42it/s][A
Epoch 4:  39%|███▊      | 2300/5971 [25:05<40:02,  1.53it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 25.28it/s][A

Validating:  35%|███▌      | 59/167 [00:02<00:04, 25.42it/s][A
Epoch 4:  39%|███▊      | 2304/5971 [25:05<39:55,  1.53it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  38%|███▊      | 63/167 [00:03<00:04, 25.83it/s][A
Epoch 4:  39%|███▊      | 2308/5971 [25:05<39:49,  1.53it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  40%|███▉      | 66/167 [00:03<00:04, 24.77it/s][A
Epoch 4:  39%|███▊      | 2312/5971 [25:06<39:42,  1.54it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  41%|████▏     | 69/167 [00:03<00:04, 23.24it/s][A
Epoch 4:  39%|███▉      | 2316/5971 [25:06<39:36,  1.54it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 24.34it/s][A

Validating:  45%|████▍     | 75/167 [00:03<00:03, 25.04it/s][A
Epoch 4:  39%|███▉      | 2320/5971 [25:06<39:29,  1.54it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 25.20it/s][A
Epoch 4:  39%|███▉      | 2324/5971 [25:06<39:23,  1.54it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 25.06it/s][A
Epoch 4:  39%|███▉      | 2328/5971 [25:06<39:16,  1.55it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  50%|█████     | 84/167 [00:03<00:03, 26.03it/s][A

Validating:  52%|█████▏    | 87/167 [00:03<00:03, 25.63it/s][A
Epoch 4:  39%|███▉      | 2332/5971 [25:06<39:10,  1.55it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  54%|█████▍    | 90/167 [00:04<00:02, 26.14it/s][A
Epoch 4:  39%|███▉      | 2336/5971 [25:07<39:04,  1.55it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 27.17it/s][A
Epoch 4:  39%|███▉      | 2340/5971 [25:07<38:57,  1.55it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 26.47it/s][A

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 26.72it/s][A
Epoch 4:  39%|███▉      | 2344/5971 [25:07<38:51,  1.56it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 28.28it/s][A
Epoch 4:  39%|███▉      | 2348/5971 [25:07<38:45,  1.56it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 25.95it/s][A
Epoch 4:  39%|███▉      | 2352/5971 [25:07<38:38,  1.56it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 25.97it/s][A
Epoch 4:  39%|███▉      | 2356/5971 [25:07<38:32,  1.56it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  67%|██████▋   | 112/167 [00:04<00:02, 26.08it/s][A

Validating:  69%|██████▉   | 115/167 [00:05<00:02, 25.97it/s][A
Epoch 4:  40%|███▉      | 2360/5971 [25:08<38:26,  1.57it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  71%|███████   | 118/167 [00:05<00:01, 25.83it/s][A
Epoch 4:  40%|███▉      | 2364/5971 [25:08<38:20,  1.57it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 26.60it/s][A
Epoch 4:  40%|███▉      | 2368/5971 [25:08<38:13,  1.57it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 26.82it/s][A

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 26.57it/s][A
Epoch 4:  40%|███▉      | 2372/5971 [25:08<38:07,  1.57it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 26.73it/s][A
Epoch 4:  40%|███▉      | 2376/5971 [25:08<38:01,  1.58it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 26.17it/s][A
Epoch 4:  40%|███▉      | 2380/5971 [25:08<37:55,  1.58it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 26.91it/s][A
Epoch 4:  40%|███▉      | 2384/5971 [25:08<37:49,  1.58it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  84%|████████▍ | 140/167 [00:05<00:01, 26.66it/s][A
Epoch 4:  40%|███▉      | 2388/5971 [25:09<37:43,  1.58it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 27.76it/s][A

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 27.81it/s][A
Epoch 4:  40%|████      | 2392/5971 [25:09<37:37,  1.59it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 26.73it/s][A
Epoch 4:  40%|████      | 2396/5971 [25:09<37:31,  1.59it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 26.21it/s][A
Epoch 4:  40%|████      | 2400/5971 [25:09<37:25,  1.59it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 26.16it/s][A

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 26.76it/s][A
Epoch 4:  40%|████      | 2404/5971 [25:09<37:19,  1.59it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 25.69it/s][A
Epoch 4:  40%|████      | 2408/5971 [25:09<37:13,  1.60it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  99%|█████████▉| 165/167 [00:06<00:00, 25.70it/s][A
Epoch 4:  40%|████      | 2412/5971 [25:09<37:07,  1.60it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  40%|████      | 2412/5971 [25:10<37:07,  1.60it/s, loss=0.138, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00165, train/loss_step=0.327, global_step=2524.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  40%|████      | 2413/5971 [25:11<37:07,  1.60it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0158, train/loss_vlb_step=6.65e-5, train/loss_step=0.0158, global_step=2525.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  40%|████      | 2414/5971 [25:12<37:07,  1.60it/s, loss=0.146, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000484, train/loss_step=0.142, global_step=2525.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  40%|████      | 2415/5971 [25:13<37:07,  1.60it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00529, train/loss_vlb_step=2.79e-5, train/loss_step=0.00529, global_step=2525.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  40%|████      | 2416/5971 [25:15<37:09,  1.59it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00529, train/loss_vlb_step=2.79e-5, train/loss_step=0.00529, global_step=2525.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  40%|████      | 2416/5971 [25:15<37:09,  1.59it/s, loss=0.135, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.00139, train/loss_step=0.283, global_step=2525.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  40%|████      | 2417/5971 [25:16<37:09,  1.59it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00157, train/loss_vlb_step=9.49e-6, train/loss_step=0.00157, global_step=2526.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  40%|████      | 2418/5971 [25:17<37:08,  1.59it/s, loss=0.158, v_num=0, train/loss_simple_step=0.621, train/loss_vlb_step=0.00722, train/loss_step=0.621, global_step=2526.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  41%|████      | 2419/5971 [25:18<37:08,  1.59it/s, loss=0.16, v_num=0, train/loss_simple_step=0.064, train/loss_vlb_step=0.000215, train/loss_step=0.064, global_step=2526.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2420/5971 [25:20<37:10,  1.59it/s, loss=0.16, v_num=0, train/loss_simple_step=0.064, train/loss_vlb_step=0.000215, train/loss_step=0.064, global_step=2526.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2420/5971 [25:20<37:10,  1.59it/s, loss=0.131, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000368, train/loss_step=0.111, global_step=2526.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2421/5971 [25:21<37:09,  1.59it/s, loss=0.139, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000563, train/loss_step=0.158, global_step=2527.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2422/5971 [25:22<37:09,  1.59it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0644, train/loss_vlb_step=0.000214, train/loss_step=0.0644, global_step=2527.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2423/5971 [25:23<37:09,  1.59it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0294, train/loss_vlb_step=0.000112, train/loss_step=0.0294, global_step=2527.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2424/5971 [25:25<37:11,  1.59it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0294, train/loss_vlb_step=0.000112, train/loss_step=0.0294, global_step=2527.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2424/5971 [25:25<37:11,  1.59it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0074, train/loss_vlb_step=3.52e-5, train/loss_step=0.0074, global_step=2527.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  41%|████      | 2425/5971 [25:26<37:10,  1.59it/s, loss=0.141, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.00048, train/loss_step=0.145, global_step=2528.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  41%|████      | 2426/5971 [25:27<37:10,  1.59it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00207, train/loss_vlb_step=1.22e-5, train/loss_step=0.00207, global_step=2528.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2427/5971 [25:27<37:10,  1.59it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0197, train/loss_vlb_step=7.95e-5, train/loss_step=0.0197, global_step=2528.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  41%|████      | 2428/5971 [25:30<37:11,  1.59it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0197, train/loss_vlb_step=7.95e-5, train/loss_step=0.0197, global_step=2528.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2428/5971 [25:30<37:11,  1.59it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00283, train/loss_vlb_step=1.57e-5, train/loss_step=0.00283, global_step=2528.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2429/5971 [25:31<37:11,  1.59it/s, loss=0.116, v_num=0, train/loss_simple_step=0.285, train/loss_vlb_step=0.00117, train/loss_step=0.285, global_step=2529.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  41%|████      | 2430/5971 [25:31<37:11,  1.59it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.34e-5, train/loss_step=0.0141, global_step=2529.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2431/5971 [25:32<37:11,  1.59it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0274, train/loss_vlb_step=0.000108, train/loss_step=0.0274, global_step=2529.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2432/5971 [25:34<37:12,  1.59it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0274, train/loss_vlb_step=0.000108, train/loss_step=0.0274, global_step=2529.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2432/5971 [25:34<37:12,  1.59it/s, loss=0.114, v_num=0, train/loss_simple_step=0.280, train/loss_vlb_step=0.00121, train/loss_step=0.280, global_step=2529.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  41%|████      | 2433/5971 [25:35<37:12,  1.58it/s, loss=0.135, v_num=0, train/loss_simple_step=0.435, train/loss_vlb_step=0.00248, train/loss_step=0.435, global_step=2530.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2434/5971 [25:36<37:12,  1.58it/s, loss=0.138, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000644, train/loss_step=0.193, global_step=2530.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2435/5971 [25:37<37:11,  1.58it/s, loss=0.154, v_num=0, train/loss_simple_step=0.330, train/loss_vlb_step=0.00227, train/loss_step=0.330, global_step=2530.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  41%|████      | 2436/5971 [25:40<37:14,  1.58it/s, loss=0.154, v_num=0, train/loss_simple_step=0.330, train/loss_vlb_step=0.00227, train/loss_step=0.330, global_step=2530.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2436/5971 [25:40<37:14,  1.58it/s, loss=0.163, v_num=0, train/loss_simple_step=0.468, train/loss_vlb_step=0.00283, train/loss_step=0.468, global_step=2530.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2437/5971 [25:41<37:14,  1.58it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0187, train/loss_vlb_step=7.87e-5, train/loss_step=0.0187, global_step=2531.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2438/5971 [25:42<37:14,  1.58it/s, loss=0.146, v_num=0, train/loss_simple_step=0.271, train/loss_vlb_step=0.00126, train/loss_step=0.271, global_step=2531.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  41%|████      | 2439/5971 [25:43<37:14,  1.58it/s, loss=0.15, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000453, train/loss_step=0.136, global_step=2531.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2440/5971 [25:45<37:15,  1.58it/s, loss=0.15, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000453, train/loss_step=0.136, global_step=2531.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2440/5971 [25:45<37:15,  1.58it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0795, train/loss_vlb_step=0.000262, train/loss_step=0.0795, global_step=2531.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2441/5971 [25:46<37:15,  1.58it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00412, train/loss_vlb_step=2.02e-5, train/loss_step=0.00412, global_step=2532.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2442/5971 [25:47<37:15,  1.58it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00232, train/loss_vlb_step=1.33e-5, train/loss_step=0.00232, global_step=2532.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2443/5971 [25:48<37:15,  1.58it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0364, train/loss_vlb_step=0.000136, train/loss_step=0.0364, global_step=2532.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  41%|████      | 2444/5971 [25:50<37:16,  1.58it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0364, train/loss_vlb_step=0.000136, train/loss_step=0.0364, global_step=2532.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2444/5971 [25:50<37:16,  1.58it/s, loss=0.149, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000789, train/loss_step=0.225, global_step=2532.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  41%|████      | 2445/5971 [25:51<37:16,  1.58it/s, loss=0.157, v_num=0, train/loss_simple_step=0.302, train/loss_vlb_step=0.00115, train/loss_step=0.302, global_step=2533.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  41%|████      | 2446/5971 [25:52<37:16,  1.58it/s, loss=0.17, v_num=0, train/loss_simple_step=0.270, train/loss_vlb_step=0.000981, train/loss_step=0.270, global_step=2533.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2447/5971 [25:53<37:15,  1.58it/s, loss=0.189, v_num=0, train/loss_simple_step=0.408, train/loss_vlb_step=0.00262, train/loss_step=0.408, global_step=2533.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2448/5971 [25:55<37:17,  1.57it/s, loss=0.189, v_num=0, train/loss_simple_step=0.408, train/loss_vlb_step=0.00262, train/loss_step=0.408, global_step=2533.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2448/5971 [25:55<37:17,  1.57it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.53e-5, train/loss_step=0.0101, global_step=2533.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2449/5971 [25:56<37:17,  1.57it/s, loss=0.178, v_num=0, train/loss_simple_step=0.051, train/loss_vlb_step=0.000186, train/loss_step=0.051, global_step=2534.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2450/5971 [25:57<37:16,  1.57it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0208, train/loss_vlb_step=8.42e-5, train/loss_step=0.0208, global_step=2534.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2451/5971 [25:57<37:16,  1.57it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0341, train/loss_vlb_step=0.00013, train/loss_step=0.0341, global_step=2534.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2452/5971 [26:00<37:18,  1.57it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0341, train/loss_vlb_step=0.00013, train/loss_step=0.0341, global_step=2534.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2452/5971 [26:00<37:18,  1.57it/s, loss=0.173, v_num=0, train/loss_simple_step=0.159, train/loss_vlb_step=0.00053, train/loss_step=0.159, global_step=2534.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  41%|████      | 2453/5971 [26:00<37:17,  1.57it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00221, train/loss_vlb_step=1.26e-5, train/loss_step=0.00221, global_step=2535.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2454/5971 [26:01<37:17,  1.57it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0312, train/loss_vlb_step=0.000118, train/loss_step=0.0312, global_step=2535.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  41%|████      | 2455/5971 [26:02<37:17,  1.57it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0278, train/loss_vlb_step=0.000105, train/loss_step=0.0278, global_step=2535.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2456/5971 [26:05<37:19,  1.57it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0278, train/loss_vlb_step=0.000105, train/loss_step=0.0278, global_step=2535.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2456/5971 [26:05<37:19,  1.57it/s, loss=0.112, v_num=0, train/loss_simple_step=0.159, train/loss_vlb_step=0.000575, train/loss_step=0.159, global_step=2535.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  41%|████      | 2457/5971 [26:06<37:19,  1.57it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0506, train/loss_vlb_step=0.000171, train/loss_step=0.0506, global_step=2536.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2458/5971 [26:07<37:19,  1.57it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0011, train/loss_vlb_step=6.41e-6, train/loss_step=0.0011, global_step=2536.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  41%|████      | 2459/5971 [26:08<37:18,  1.57it/s, loss=0.0942, v_num=0, train/loss_simple_step=0.00899, train/loss_vlb_step=3.95e-5, train/loss_step=0.00899, global_step=2536.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2460/5971 [26:10<37:20,  1.57it/s, loss=0.0942, v_num=0, train/loss_simple_step=0.00899, train/loss_vlb_step=3.95e-5, train/loss_step=0.00899, global_step=2536.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2460/5971 [26:10<37:20,  1.57it/s, loss=0.103, v_num=0, train/loss_simple_step=0.252, train/loss_vlb_step=0.000921, train/loss_step=0.252, global_step=2536.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  41%|████      | 2461/5971 [26:11<37:19,  1.57it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0225, train/loss_vlb_step=8.95e-5, train/loss_step=0.0225, global_step=2537.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2462/5971 [26:12<37:19,  1.57it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0277, train/loss_vlb_step=0.000104, train/loss_step=0.0277, global_step=2537.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████      | 2463/5971 [26:12<37:19,  1.57it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0657, train/loss_vlb_step=0.000219, train/loss_step=0.0657, global_step=2537.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████▏     | 2464/5971 [26:15<37:21,  1.56it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0657, train/loss_vlb_step=0.000219, train/loss_step=0.0657, global_step=2537.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████▏     | 2464/5971 [26:15<37:21,  1.56it/s, loss=0.0992, v_num=0, train/loss_simple_step=0.0811, train/loss_vlb_step=0.000268, train/loss_step=0.0811, global_step=2537.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████▏     | 2465/5971 [26:16<37:21,  1.56it/s, loss=0.0846, v_num=0, train/loss_simple_step=0.00883, train/loss_vlb_step=3.93e-5, train/loss_step=0.00883, global_step=2538.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████▏     | 2466/5971 [26:17<37:21,  1.56it/s, loss=0.0732, v_num=0, train/loss_simple_step=0.0429, train/loss_vlb_step=0.000149, train/loss_step=0.0429, global_step=2538.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  41%|████▏     | 2467/5971 [26:18<37:20,  1.56it/s, loss=0.0571, v_num=0, train/loss_simple_step=0.0857, train/loss_vlb_step=0.000287, train/loss_step=0.0857, global_step=2538.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████▏     | 2468/5971 [26:20<37:22,  1.56it/s, loss=0.0571, v_num=0, train/loss_simple_step=0.0857, train/loss_vlb_step=0.000287, train/loss_step=0.0857, global_step=2538.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████▏     | 2468/5971 [26:20<37:22,  1.56it/s, loss=0.062, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000359, train/loss_step=0.107, global_step=2538.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  41%|████▏     | 2469/5971 [26:21<37:22,  1.56it/s, loss=0.0596, v_num=0, train/loss_simple_step=0.00372, train/loss_vlb_step=1.98e-5, train/loss_step=0.00372, global_step=2539.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████▏     | 2470/5971 [26:22<37:22,  1.56it/s, loss=0.0942, v_num=0, train/loss_simple_step=0.713, train/loss_vlb_step=0.0287, train/loss_step=0.713, global_step=2539.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]     
Epoch 4:  41%|████▏     | 2471/5971 [26:23<37:21,  1.56it/s, loss=0.112, v_num=0, train/loss_simple_step=0.385, train/loss_vlb_step=0.00236, train/loss_step=0.385, global_step=2539.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████▏     | 2472/5971 [26:25<37:23,  1.56it/s, loss=0.112, v_num=0, train/loss_simple_step=0.385, train/loss_vlb_step=0.00236, train/loss_step=0.385, global_step=2539.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████▏     | 2472/5971 [26:25<37:23,  1.56it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0693, train/loss_vlb_step=0.00023, train/loss_step=0.0693, global_step=2539.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████▏     | 2473/5971 [26:26<37:22,  1.56it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0875, train/loss_vlb_step=0.000289, train/loss_step=0.0875, global_step=2540.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████▏     | 2474/5971 [26:27<37:22,  1.56it/s, loss=0.141, v_num=0, train/loss_simple_step=0.628, train/loss_vlb_step=0.0112, train/loss_step=0.628, global_step=2540.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  41%|████▏     | 2475/5971 [26:28<37:22,  1.56it/s, loss=0.177, v_num=0, train/loss_simple_step=0.732, train/loss_vlb_step=0.042, train/loss_step=0.732, global_step=2540.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  41%|████▏     | 2476/5971 [26:30<37:24,  1.56it/s, loss=0.177, v_num=0, train/loss_simple_step=0.732, train/loss_vlb_step=0.042, train/loss_step=0.732, global_step=2540.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████▏     | 2476/5971 [26:30<37:24,  1.56it/s, loss=0.183, v_num=0, train/loss_simple_step=0.286, train/loss_vlb_step=0.00156, train/loss_step=0.286, global_step=2540.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  41%|████▏     | 2477/5971 [26:31<37:24,  1.56it/s, loss=0.181, v_num=0, train/loss_simple_step=0.00324, train/loss_vlb_step=1.7e-5, train/loss_step=0.00324, global_step=2541.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  42%|████▏     | 2478/5971 [26:32<37:23,  1.56it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0165, train/loss_vlb_step=7.01e-5, train/loss_step=0.0165, global_step=2541.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  42%|████▏     | 2479/5971 [26:33<37:23,  1.56it/s, loss=0.214, v_num=0, train/loss_simple_step=0.661, train/loss_vlb_step=0.0125, train/loss_step=0.661, global_step=2541.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  42%|████▏     | 2480/5971 [26:35<37:25,  1.55it/s, loss=0.214, v_num=0, train/loss_simple_step=0.661, train/loss_vlb_step=0.0125, train/loss_step=0.661, global_step=2541.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  42%|████▏     | 2480/5971 [26:35<37:25,  1.55it/s, loss=0.202, v_num=0, train/loss_simple_step=0.00262, train/loss_vlb_step=1.49e-5, train/loss_step=0.00262, global_step=2541.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  42%|████▏     | 2481/5971 [26:36<37:24,  1.55it/s, loss=0.209, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000567, train/loss_step=0.166, global_step=2542.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  42%|████▏     | 2482/5971 [26:37<37:24,  1.55it/s, loss=0.21, v_num=0, train/loss_simple_step=0.0487, train/loss_vlb_step=0.000175, train/loss_step=0.0487, global_step=2542.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  42%|████▏     | 2483/5971 [26:38<37:24,  1.55it/s, loss=0.217, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.000818, train/loss_step=0.219, global_step=2542.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  42%|████▏     | 2484/5971 [26:40<37:25,  1.55it/s, loss=0.217, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.000818, train/loss_step=0.219, global_step=2542.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  42%|████▏     | 2484/5971 [26:40<37:25,  1.55it/s, loss=0.213, v_num=0, train/loss_simple_step=0.00195, train/loss_vlb_step=1.13e-5, train/loss_step=0.00195, global_step=2542.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  42%|████▏     | 2485/5971 [26:41<37:25,  1.55it/s, loss=0.218, v_num=0, train/loss_simple_step=0.0941, train/loss_vlb_step=0.00031, train/loss_step=0.0941, global_step=2543.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  42%|████▏     | 2486/5971 [26:42<37:25,  1.55it/s, loss=0.216, v_num=0, train/loss_simple_step=0.00175, train/loss_vlb_step=1.05e-5, train/loss_step=0.00175, global_step=2543.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  42%|████▏     | 2487/5971 [26:43<37:24,  1.55it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0393, train/loss_vlb_step=0.00014, train/loss_step=0.0393, global_step=2543.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  42%|████▏     | 2488/5971 [26:45<37:27,  1.55it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0393, train/loss_vlb_step=0.00014, train/loss_step=0.0393, global_step=2543.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  42%|████▏     | 2488/5971 [26:45<37:27,  1.55it/s, loss=0.22, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.000842, train/loss_step=0.231, global_step=2543.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  42%|████▏     | 2489/5971 [26:46<37:27,  1.55it/s, loss=0.22, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.84e-5, train/loss_step=0.0108, global_step=2544.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  42%|████▏     | 2490/5971 [26:47<37:26,  1.55it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0223, train/loss_vlb_step=9.23e-5, train/loss_step=0.0223, global_step=2544.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  42%|████▏     | 2491/5971 [26:48<37:26,  1.55it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00896, train/loss_vlb_step=4.47e-5, train/loss_step=0.00896, global_step=2544.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  42%|████▏     | 2492/5971 [26:51<37:28,  1.55it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00896, train/loss_vlb_step=4.47e-5, train/loss_step=0.00896, global_step=2544.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  42%|████▏     | 2492/5971 [26:51<37:28,  1.55it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.000109, train/loss_step=0.0325, global_step=2544.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  42%|████▏     | 2493/5971 [26:52<37:28,  1.55it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00161, train/loss_vlb_step=9.72e-6, train/loss_step=0.00161, global_step=2545.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  42%|████▏     | 2494/5971 [26:53<37:27,  1.55it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0963, train/loss_vlb_step=0.000317, train/loss_step=0.0963, global_step=2545.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  42%|████▏     | 2495/5971 [26:53<37:27,  1.55it/s, loss=0.111, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00104, train/loss_step=0.269, global_step=2545.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  42%|████▏     | 2496/5971 [26:56<37:29,  1.54it/s, loss=0.111, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00104, train/loss_step=0.269, global_step=2545.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  42%|████▏     | 2496/5971 [26:56<37:29,  1.54it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0998, train/loss_vlb_step=0.000328, train/loss_step=0.0998, global_step=2545.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  42%|████▏     | 2497/5971 [26:57<37:29,  1.54it/s, loss=0.114, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.000946, train/loss_step=0.264, global_step=2546.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  42%|████▏     | 2498/5971 [26:58<37:29,  1.54it/s, loss=0.124, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.00077, train/loss_step=0.218, global_step=2546.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  42%|████▏     | 2499/5971 [26:59<37:28,  1.54it/s, loss=0.0952, v_num=0, train/loss_simple_step=0.0772, train/loss_vlb_step=0.000255, train/loss_step=0.0772, global_step=2546.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  42%|████▏     | 2500/5971 [27:01<37:30,  1.54it/s, loss=0.0952, v_num=0, train/loss_simple_step=0.0772, train/loss_vlb_step=0.000255, train/loss_step=0.0772, global_step=2546.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  42%|████▏     | 2500/5971 [27:01<37:30,  1.54it/s, loss=0.122, v_num=0, train/loss_simple_step=0.539, train/loss_vlb_step=0.00405, train/loss_step=0.539, global_step=2546.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  42%|████▏     | 2501/5971 [27:02<37:29,  1.54it/s, loss=0.115, v_num=0, train/loss_simple_step=0.023, train/loss_vlb_step=9.13e-5, train/loss_step=0.023, global_step=2547.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  42%|████▏     | 2502/5971 [27:03<37:29,  1.54it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00262, train/loss_vlb_step=1.52e-5, train/loss_step=0.00262, global_step=2547.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  42%|████▏     | 2503/5971 [27:03<37:29,  1.54it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0146, train/loss_vlb_step=6.53e-5, train/loss_step=0.0146, global_step=2547.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  42%|████▏     | 2504/5971 [27:06<37:30,  1.54it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0146, train/loss_vlb_step=6.53e-5, train/loss_step=0.0146, global_step=2547.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  42%|████▏     | 2504/5971 [27:06<37:30,  1.54it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00272, train/loss_vlb_step=1.52e-5, train/loss_step=0.00272, global_step=2547.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  42%|████▏     | 2505/5971 [27:07<37:30,  1.54it/s, loss=0.0979, v_num=0, train/loss_simple_step=0.00286, train/loss_vlb_step=1.63e-5, train/loss_step=0.00286, global_step=2548.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  42%|████▏     | 2506/5971 [27:07<37:29,  1.54it/s, loss=0.0979, v_num=0, train/loss_simple_step=0.00298, train/loss_vlb_step=1.72e-5, train/loss_step=0.00298, global_step=2548.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  42%|████▏     | 2507/5971 [27:08<37:29,  1.54it/s, loss=0.0972, v_num=0, train/loss_simple_step=0.0257, train/loss_vlb_step=9.88e-5, train/loss_step=0.0257, global_step=2548.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  42%|████▏     | 2508/5971 [27:11<37:31,  1.54it/s, loss=0.0972, v_num=0, train/loss_simple_step=0.0257, train/loss_vlb_step=9.88e-5, train/loss_step=0.0257, global_step=2548.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  42%|████▏     | 2508/5971 [27:11<37:31,  1.54it/s, loss=0.116, v_num=0, train/loss_simple_step=0.611, train/loss_vlb_step=0.00764, train/loss_step=0.611, global_step=2548.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  42%|████▏     | 2509/5971 [27:12<37:31,  1.54it/s, loss=0.141, v_num=0, train/loss_simple_step=0.504, train/loss_vlb_step=0.00447, train/loss_step=0.504, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  42%|████▏     | 2510/5971 [27:13<37:31,  1.54it/s, loss=0.148, v_num=0, train/loss_simple_step=0.169, train/loss_vlb_step=0.000568, train/loss_step=0.169, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  42%|████▏     | 2511/5971 [27:14<37:30,  1.54it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00656, train/loss_vlb_step=3.3e-5, train/loss_step=0.00656, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  42%|████▏     | 2512/5971 [27:16<37:32,  1.54it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00656, train/loss_vlb_step=3.3e-5, train/loss_step=0.00656, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  42%|████▏     | 2512/5971 [27:16<37:32,  1.54it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:08,  2.43it/s]

Validating:   1%|          | 2/167 [00:00<00:47,  3.44it/s]
Epoch 4:  42%|████▏     | 2516/5971 [27:16<37:27,  1.54it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.09it/s]
Epoch 4:  42%|████▏     | 2520/5971 [27:17<37:21,  1.54it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.53it/s]

Validating:   7%|▋         | 11/167 [00:00<00:09, 17.07it/s]
Epoch 4:  42%|████▏     | 2524/5971 [27:17<37:15,  1.54it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.86it/s]
Epoch 4:  42%|████▏     | 2528/5971 [27:17<37:09,  1.54it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  10%|█         | 17/167 [00:01<00:06, 22.25it/s]
Epoch 4:  42%|████▏     | 2532/5971 [27:17<37:03,  1.55it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 23.90it/s]

Validating:  14%|█▍        | 23/167 [00:01<00:05, 25.02it/s]
Epoch 4:  42%|████▏     | 2536/5971 [27:17<36:57,  1.55it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 26.00it/s]
Epoch 4:  43%|████▎     | 2540/5971 [27:17<36:51,  1.55it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 25.68it/s]
Epoch 4:  43%|████▎     | 2544/5971 [27:18<36:45,  1.55it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 25.55it/s]

Validating:  21%|██        | 35/167 [00:01<00:05, 24.60it/s]
Epoch 4:  43%|████▎     | 2548/5971 [27:18<36:39,  1.56it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  23%|██▎       | 38/167 [00:01<00:05, 25.72it/s][A
Epoch 4:  43%|████▎     | 2552/5971 [27:18<36:34,  1.56it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 25.69it/s][A
Epoch 4:  43%|████▎     | 2556/5971 [27:18<36:28,  1.56it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 26.83it/s][A

Validating:  28%|██▊       | 47/167 [00:02<00:04, 27.48it/s][A
Epoch 4:  43%|████▎     | 2560/5971 [27:18<36:22,  1.56it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 26.45it/s][A
Epoch 4:  43%|████▎     | 2564/5971 [27:18<36:16,  1.57it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 26.51it/s][A
Epoch 4:  43%|████▎     | 2568/5971 [27:18<36:10,  1.57it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 26.76it/s][A

Validating:  35%|███▌      | 59/167 [00:02<00:04, 26.74it/s][A
Epoch 4:  43%|████▎     | 2572/5971 [27:19<36:05,  1.57it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  37%|███▋      | 62/167 [00:02<00:03, 27.06it/s][A
Epoch 4:  43%|████▎     | 2576/5971 [27:19<35:59,  1.57it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  40%|███▉      | 66/167 [00:03<00:03, 27.35it/s][A
Epoch 4:  43%|████▎     | 2580/5971 [27:19<35:53,  1.57it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 27.30it/s][A
Epoch 4:  43%|████▎     | 2584/5971 [27:19<35:48,  1.58it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 27.57it/s][A

Validating:  45%|████▍     | 75/167 [00:03<00:03, 27.96it/s][A
Epoch 4:  43%|████▎     | 2588/5971 [27:19<35:42,  1.58it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 28.01it/s][A
Epoch 4:  43%|████▎     | 2592/5971 [27:19<35:36,  1.58it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 26.66it/s][A
Epoch 4:  43%|████▎     | 2596/5971 [27:19<35:31,  1.58it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  50%|█████     | 84/167 [00:03<00:03, 26.43it/s][A

Validating:  52%|█████▏    | 87/167 [00:03<00:02, 27.15it/s][A
Epoch 4:  44%|████▎     | 2600/5971 [27:20<35:25,  1.59it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  54%|█████▍    | 90/167 [00:03<00:02, 27.53it/s][A
Epoch 4:  44%|████▎     | 2604/5971 [27:20<35:20,  1.59it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  56%|█████▌    | 93/167 [00:03<00:02, 27.68it/s][A
Epoch 4:  44%|████▎     | 2608/5971 [27:20<35:14,  1.59it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 27.07it/s][A

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 26.54it/s][A
Epoch 4:  44%|████▎     | 2612/5971 [27:20<35:08,  1.59it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  61%|██████    | 102/167 [00:04<00:02, 26.54it/s][A
Epoch 4:  44%|████▍     | 2616/5971 [27:20<35:03,  1.60it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 26.48it/s][A
Epoch 4:  44%|████▍     | 2620/5971 [27:20<34:57,  1.60it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 26.68it/s][A
Epoch 4:  44%|████▍     | 2624/5971 [27:20<34:52,  1.60it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  67%|██████▋   | 112/167 [00:04<00:01, 28.15it/s][A

Validating:  69%|██████▉   | 115/167 [00:04<00:01, 28.07it/s][A
Epoch 4:  44%|████▍     | 2628/5971 [27:21<34:46,  1.60it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  71%|███████   | 118/167 [00:04<00:01, 25.30it/s][A
Epoch 4:  44%|████▍     | 2632/5971 [27:21<34:41,  1.60it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 26.04it/s][A
Epoch 4:  44%|████▍     | 2636/5971 [27:21<34:35,  1.61it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 25.87it/s][A

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 26.84it/s][A
Epoch 4:  44%|████▍     | 2640/5971 [27:21<34:30,  1.61it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 25.46it/s][A
Epoch 4:  44%|████▍     | 2644/5971 [27:21<34:25,  1.61it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 25.70it/s][A
Epoch 4:  44%|████▍     | 2648/5971 [27:21<34:19,  1.61it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 25.35it/s][A
Epoch 4:  44%|████▍     | 2652/5971 [27:22<34:14,  1.62it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  84%|████████▍ | 140/167 [00:05<00:00, 27.12it/s][A
Epoch 4:  44%|████▍     | 2656/5971 [27:22<34:08,  1.62it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  86%|████████▌ | 144/167 [00:05<00:00, 28.15it/s][A

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 27.28it/s][A
Epoch 4:  45%|████▍     | 2660/5971 [27:22<34:03,  1.62it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 26.53it/s][A
Epoch 4:  45%|████▍     | 2664/5971 [27:22<33:58,  1.62it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 26.44it/s][A
Epoch 4:  45%|████▍     | 2668/5971 [27:22<33:52,  1.62it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 26.78it/s][A

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 26.71it/s][A
Epoch 4:  45%|████▍     | 2672/5971 [27:22<33:47,  1.63it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 26.85it/s][A
Epoch 4:  45%|████▍     | 2676/5971 [27:22<33:42,  1.63it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 27.73it/s][A
Epoch 4:  45%|████▍     | 2680/5971 [27:23<33:36,  1.63it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▍     | 2680/5971 [27:23<33:37,  1.63it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=2549.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▍     | 2681/5971 [27:24<33:37,  1.63it/s, loss=0.161, v_num=0, train/loss_simple_step=0.293, train/loss_vlb_step=0.00126, train/loss_step=0.293, global_step=2550.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  45%|████▍     | 2682/5971 [27:25<33:36,  1.63it/s, loss=0.168, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.0011, train/loss_step=0.236, global_step=2550.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  45%|████▍     | 2683/5971 [27:26<33:36,  1.63it/s, loss=0.17, v_num=0, train/loss_simple_step=0.312, train/loss_vlb_step=0.00131, train/loss_step=0.312, global_step=2550.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▍     | 2684/5971 [27:28<33:38,  1.63it/s, loss=0.17, v_num=0, train/loss_simple_step=0.312, train/loss_vlb_step=0.00131, train/loss_step=0.312, global_step=2550.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▍     | 2684/5971 [27:28<33:38,  1.63it/s, loss=0.172, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.00046, train/loss_step=0.138, global_step=2550.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▍     | 2685/5971 [27:29<33:37,  1.63it/s, loss=0.164, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000362, train/loss_step=0.109, global_step=2551.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▍     | 2686/5971 [27:30<33:37,  1.63it/s, loss=0.159, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000351, train/loss_step=0.106, global_step=2551.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▌     | 2687/5971 [27:31<33:37,  1.63it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=7.1e-5, train/loss_step=0.0166, global_step=2551.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▌     | 2688/5971 [27:34<33:39,  1.63it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=7.1e-5, train/loss_step=0.0166, global_step=2551.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▌     | 2688/5971 [27:34<33:39,  1.63it/s, loss=0.135, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000385, train/loss_step=0.117, global_step=2551.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▌     | 2689/5971 [27:34<33:39,  1.63it/s, loss=0.141, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000472, train/loss_step=0.142, global_step=2552.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▌     | 2690/5971 [27:35<33:38,  1.63it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0172, train/loss_vlb_step=7.49e-5, train/loss_step=0.0172, global_step=2552.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▌     | 2691/5971 [27:36<33:38,  1.62it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0205, train/loss_vlb_step=7.8e-5, train/loss_step=0.0205, global_step=2552.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  45%|████▌     | 2692/5971 [27:39<33:40,  1.62it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0205, train/loss_vlb_step=7.8e-5, train/loss_step=0.0205, global_step=2552.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▌     | 2692/5971 [27:39<33:40,  1.62it/s, loss=0.144, v_num=0, train/loss_simple_step=0.046, train/loss_vlb_step=0.000167, train/loss_step=0.046, global_step=2552.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▌     | 2693/5971 [27:39<33:39,  1.62it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.71e-5, train/loss_step=0.0102, global_step=2553.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▌     | 2694/5971 [27:40<33:39,  1.62it/s, loss=0.157, v_num=0, train/loss_simple_step=0.252, train/loss_vlb_step=0.00094, train/loss_step=0.252, global_step=2553.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  45%|████▌     | 2695/5971 [27:41<33:39,  1.62it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00485, train/loss_vlb_step=2.43e-5, train/loss_step=0.00485, global_step=2553.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▌     | 2696/5971 [27:43<33:40,  1.62it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00485, train/loss_vlb_step=2.43e-5, train/loss_step=0.00485, global_step=2553.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▌     | 2696/5971 [27:43<33:40,  1.62it/s, loss=0.138, v_num=0, train/loss_simple_step=0.257, train/loss_vlb_step=0.00104, train/loss_step=0.257, global_step=2553.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  45%|████▌     | 2697/5971 [27:44<33:40,  1.62it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0566, train/loss_vlb_step=0.000194, train/loss_step=0.0566, global_step=2554.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▌     | 2698/5971 [27:45<33:39,  1.62it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0234, train/loss_vlb_step=9.11e-5, train/loss_step=0.0234, global_step=2554.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  45%|████▌     | 2699/5971 [27:46<33:39,  1.62it/s, loss=0.115, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.000485, train/loss_step=0.146, global_step=2554.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  45%|████▌     | 2700/5971 [27:49<33:41,  1.62it/s, loss=0.115, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.000485, train/loss_step=0.146, global_step=2554.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▌     | 2700/5971 [27:49<33:41,  1.62it/s, loss=0.117, v_num=0, train/loss_simple_step=0.030, train/loss_vlb_step=0.00011, train/loss_step=0.030, global_step=2554.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  45%|████▌     | 2701/5971 [27:50<33:41,  1.62it/s, loss=0.11, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.00055, train/loss_step=0.162, global_step=2555.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  45%|████▌     | 2702/5971 [27:50<33:40,  1.62it/s, loss=0.105, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000445, train/loss_step=0.133, global_step=2555.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▌     | 2703/5971 [27:51<33:40,  1.62it/s, loss=0.112, v_num=0, train/loss_simple_step=0.444, train/loss_vlb_step=0.00339, train/loss_step=0.444, global_step=2555.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  45%|████▌     | 2704/5971 [27:55<33:43,  1.61it/s, loss=0.112, v_num=0, train/loss_simple_step=0.444, train/loss_vlb_step=0.00339, train/loss_step=0.444, global_step=2555.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▌     | 2704/5971 [27:55<33:43,  1.61it/s, loss=0.105, v_num=0, train/loss_simple_step=0.00473, train/loss_vlb_step=2.35e-5, train/loss_step=0.00473, global_step=2555.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▌     | 2705/5971 [27:55<33:42,  1.61it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0172, train/loss_vlb_step=7.16e-5, train/loss_step=0.0172, global_step=2556.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  45%|████▌     | 2706/5971 [27:56<33:42,  1.61it/s, loss=0.0986, v_num=0, train/loss_simple_step=0.0707, train/loss_vlb_step=0.00024, train/loss_step=0.0707, global_step=2556.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▌     | 2707/5971 [27:57<33:42,  1.61it/s, loss=0.0979, v_num=0, train/loss_simple_step=0.00408, train/loss_vlb_step=2.18e-5, train/loss_step=0.00408, global_step=2556.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▌     | 2708/5971 [28:00<33:43,  1.61it/s, loss=0.0979, v_num=0, train/loss_simple_step=0.00408, train/loss_vlb_step=2.18e-5, train/loss_step=0.00408, global_step=2556.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▌     | 2708/5971 [28:00<33:43,  1.61it/s, loss=0.0976, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000365, train/loss_step=0.111, global_step=2556.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  45%|████▌     | 2709/5971 [28:00<33:43,  1.61it/s, loss=0.0917, v_num=0, train/loss_simple_step=0.0241, train/loss_vlb_step=9.75e-5, train/loss_step=0.0241, global_step=2557.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▌     | 2710/5971 [28:01<33:43,  1.61it/s, loss=0.0909, v_num=0, train/loss_simple_step=0.00153, train/loss_vlb_step=9.11e-6, train/loss_step=0.00153, global_step=2557.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▌     | 2711/5971 [28:02<33:42,  1.61it/s, loss=0.09, v_num=0, train/loss_simple_step=0.00194, train/loss_vlb_step=1.12e-5, train/loss_step=0.00194, global_step=2557.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  45%|████▌     | 2712/5971 [28:04<33:44,  1.61it/s, loss=0.09, v_num=0, train/loss_simple_step=0.00194, train/loss_vlb_step=1.12e-5, train/loss_step=0.00194, global_step=2557.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▌     | 2712/5971 [28:04<33:44,  1.61it/s, loss=0.0878, v_num=0, train/loss_simple_step=0.00122, train/loss_vlb_step=7.31e-6, train/loss_step=0.00122, global_step=2557.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▌     | 2713/5971 [28:05<33:43,  1.61it/s, loss=0.108, v_num=0, train/loss_simple_step=0.421, train/loss_vlb_step=0.0026, train/loss_step=0.421, global_step=2558.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]      
Epoch 4:  45%|████▌     | 2714/5971 [28:06<33:43,  1.61it/s, loss=0.0958, v_num=0, train/loss_simple_step=0.00232, train/loss_vlb_step=1.29e-5, train/loss_step=0.00232, global_step=2558.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▌     | 2715/5971 [28:07<33:43,  1.61it/s, loss=0.0956, v_num=0, train/loss_simple_step=0.00132, train/loss_vlb_step=7.98e-6, train/loss_step=0.00132, global_step=2558.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▌     | 2716/5971 [28:10<33:44,  1.61it/s, loss=0.0956, v_num=0, train/loss_simple_step=0.00132, train/loss_vlb_step=7.98e-6, train/loss_step=0.00132, global_step=2558.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  45%|████▌     | 2716/5971 [28:10<33:44,  1.61it/s, loss=0.0862, v_num=0, train/loss_simple_step=0.0678, train/loss_vlb_step=0.000229, train/loss_step=0.0678, global_step=2558.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  46%|████▌     | 2717/5971 [28:11<33:44,  1.61it/s, loss=0.0844, v_num=0, train/loss_simple_step=0.0213, train/loss_vlb_step=8.32e-5, train/loss_step=0.0213, global_step=2559.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  46%|████▌     | 2718/5971 [28:12<33:44,  1.61it/s, loss=0.0921, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.000628, train/loss_step=0.176, global_step=2559.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  46%|████▌     | 2719/5971 [28:12<33:44,  1.61it/s, loss=0.0898, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000334, train/loss_step=0.101, global_step=2559.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2720/5971 [28:15<33:45,  1.61it/s, loss=0.0898, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000334, train/loss_step=0.101, global_step=2559.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2720/5971 [28:15<33:45,  1.61it/s, loss=0.102, v_num=0, train/loss_simple_step=0.279, train/loss_vlb_step=0.00111, train/loss_step=0.279, global_step=2559.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  46%|████▌     | 2721/5971 [28:15<33:44,  1.60it/s, loss=0.0998, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000374, train/loss_step=0.114, global_step=2560.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2722/5971 [28:16<33:44,  1.60it/s, loss=0.0987, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000365, train/loss_step=0.110, global_step=2560.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2723/5971 [28:17<33:44,  1.60it/s, loss=0.0766, v_num=0, train/loss_simple_step=0.00168, train/loss_vlb_step=9.79e-6, train/loss_step=0.00168, global_step=2560.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2724/5971 [28:20<33:46,  1.60it/s, loss=0.0766, v_num=0, train/loss_simple_step=0.00168, train/loss_vlb_step=9.79e-6, train/loss_step=0.00168, global_step=2560.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2724/5971 [28:20<33:46,  1.60it/s, loss=0.0841, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000594, train/loss_step=0.155, global_step=2560.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  46%|████▌     | 2725/5971 [28:21<33:46,  1.60it/s, loss=0.0982, v_num=0, train/loss_simple_step=0.299, train/loss_vlb_step=0.0013, train/loss_step=0.299, global_step=2561.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  46%|████▌     | 2726/5971 [28:22<33:46,  1.60it/s, loss=0.117, v_num=0, train/loss_simple_step=0.444, train/loss_vlb_step=0.00264, train/loss_step=0.444, global_step=2561.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2727/5971 [28:23<33:45,  1.60it/s, loss=0.123, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000443, train/loss_step=0.135, global_step=2561.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2728/5971 [28:26<33:47,  1.60it/s, loss=0.123, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000443, train/loss_step=0.135, global_step=2561.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2728/5971 [28:26<33:47,  1.60it/s, loss=0.133, v_num=0, train/loss_simple_step=0.312, train/loss_vlb_step=0.00148, train/loss_step=0.312, global_step=2561.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  46%|████▌     | 2729/5971 [28:27<33:47,  1.60it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0226, train/loss_vlb_step=8.94e-5, train/loss_step=0.0226, global_step=2562.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2730/5971 [28:28<33:47,  1.60it/s, loss=0.151, v_num=0, train/loss_simple_step=0.351, train/loss_vlb_step=0.00198, train/loss_step=0.351, global_step=2562.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  46%|████▌     | 2731/5971 [28:29<33:47,  1.60it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0935, train/loss_vlb_step=0.000307, train/loss_step=0.0935, global_step=2562.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2732/5971 [28:32<33:49,  1.60it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0935, train/loss_vlb_step=0.000307, train/loss_step=0.0935, global_step=2562.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2732/5971 [28:32<33:49,  1.60it/s, loss=0.161, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000399, train/loss_step=0.120, global_step=2562.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  46%|████▌     | 2733/5971 [28:33<33:49,  1.60it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0269, train/loss_vlb_step=9.9e-5, train/loss_step=0.0269, global_step=2563.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2734/5971 [28:34<33:49,  1.59it/s, loss=0.147, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000336, train/loss_step=0.101, global_step=2563.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2735/5971 [28:36<33:49,  1.59it/s, loss=0.173, v_num=0, train/loss_simple_step=0.529, train/loss_vlb_step=0.00439, train/loss_step=0.529, global_step=2563.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  46%|████▌     | 2736/5971 [28:39<33:51,  1.59it/s, loss=0.173, v_num=0, train/loss_simple_step=0.529, train/loss_vlb_step=0.00439, train/loss_step=0.529, global_step=2563.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2736/5971 [28:39<33:51,  1.59it/s, loss=0.182, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.00101, train/loss_step=0.240, global_step=2563.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2737/5971 [28:40<33:51,  1.59it/s, loss=0.184, v_num=0, train/loss_simple_step=0.064, train/loss_vlb_step=0.000219, train/loss_step=0.064, global_step=2564.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2738/5971 [28:41<33:51,  1.59it/s, loss=0.188, v_num=0, train/loss_simple_step=0.271, train/loss_vlb_step=0.00106, train/loss_step=0.271, global_step=2564.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  46%|████▌     | 2739/5971 [28:42<33:51,  1.59it/s, loss=0.214, v_num=0, train/loss_simple_step=0.608, train/loss_vlb_step=0.0155, train/loss_step=0.608, global_step=2564.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  46%|████▌     | 2740/5971 [28:44<33:53,  1.59it/s, loss=0.214, v_num=0, train/loss_simple_step=0.608, train/loss_vlb_step=0.0155, train/loss_step=0.608, global_step=2564.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2740/5971 [28:44<33:53,  1.59it/s, loss=0.202, v_num=0, train/loss_simple_step=0.033, train/loss_vlb_step=0.00012, train/loss_step=0.033, global_step=2564.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2741/5971 [28:46<33:53,  1.59it/s, loss=0.208, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000986, train/loss_step=0.242, global_step=2565.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2742/5971 [28:47<33:53,  1.59it/s, loss=0.223, v_num=0, train/loss_simple_step=0.413, train/loss_vlb_step=0.00185, train/loss_step=0.413, global_step=2565.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  46%|████▌     | 2743/5971 [28:48<33:52,  1.59it/s, loss=0.223, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.32e-5, train/loss_step=0.00468, global_step=2565.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2744/5971 [28:50<33:54,  1.59it/s, loss=0.223, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.32e-5, train/loss_step=0.00468, global_step=2565.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2744/5971 [28:50<33:54,  1.59it/s, loss=0.216, v_num=0, train/loss_simple_step=0.00876, train/loss_vlb_step=3.82e-5, train/loss_step=0.00876, global_step=2565.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2745/5971 [28:51<33:54,  1.59it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.26e-5, train/loss_step=0.0122, global_step=2566.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  46%|████▌     | 2746/5971 [28:52<33:54,  1.59it/s, loss=0.191, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.000976, train/loss_step=0.235, global_step=2566.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  46%|████▌     | 2747/5971 [28:53<33:54,  1.58it/s, loss=0.196, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.000784, train/loss_step=0.235, global_step=2566.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2748/5971 [28:56<33:55,  1.58it/s, loss=0.196, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.000784, train/loss_step=0.235, global_step=2566.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2748/5971 [28:56<33:55,  1.58it/s, loss=0.207, v_num=0, train/loss_simple_step=0.534, train/loss_vlb_step=0.00581, train/loss_step=0.534, global_step=2566.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  46%|████▌     | 2749/5971 [28:57<33:55,  1.58it/s, loss=0.207, v_num=0, train/loss_simple_step=0.0236, train/loss_vlb_step=9.18e-5, train/loss_step=0.0236, global_step=2567.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2750/5971 [28:58<33:55,  1.58it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0194, train/loss_vlb_step=8.1e-5, train/loss_step=0.0194, global_step=2567.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  46%|████▌     | 2751/5971 [28:59<33:55,  1.58it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00831, train/loss_vlb_step=3.87e-5, train/loss_step=0.00831, global_step=2567.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2752/5971 [29:01<33:56,  1.58it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00831, train/loss_vlb_step=3.87e-5, train/loss_step=0.00831, global_step=2567.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2752/5971 [29:01<33:56,  1.58it/s, loss=0.2, v_num=0, train/loss_simple_step=0.386, train/loss_vlb_step=0.0021, train/loss_step=0.386, global_step=2567.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]       
Epoch 4:  46%|████▌     | 2753/5971 [29:02<33:56,  1.58it/s, loss=0.199, v_num=0, train/loss_simple_step=0.0185, train/loss_vlb_step=7.2e-5, train/loss_step=0.0185, global_step=2568.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2754/5971 [29:03<33:56,  1.58it/s, loss=0.194, v_num=0, train/loss_simple_step=0.00218, train/loss_vlb_step=1.28e-5, train/loss_step=0.00218, global_step=2568.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2755/5971 [29:04<33:55,  1.58it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00757, train/loss_vlb_step=3.77e-5, train/loss_step=0.00757, global_step=2568.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2756/5971 [29:07<33:57,  1.58it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00757, train/loss_vlb_step=3.77e-5, train/loss_step=0.00757, global_step=2568.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2756/5971 [29:07<33:57,  1.58it/s, loss=0.172, v_num=0, train/loss_simple_step=0.317, train/loss_vlb_step=0.00138, train/loss_step=0.317, global_step=2568.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  46%|████▌     | 2757/5971 [29:08<33:57,  1.58it/s, loss=0.179, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000665, train/loss_step=0.196, global_step=2569.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2758/5971 [29:09<33:57,  1.58it/s, loss=0.213, v_num=0, train/loss_simple_step=0.967, train/loss_vlb_step=0.487, train/loss_step=0.967, global_step=2569.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  46%|████▌     | 2759/5971 [29:10<33:56,  1.58it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00251, train/loss_vlb_step=1.41e-5, train/loss_step=0.00251, global_step=2569.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2760/5971 [29:12<33:58,  1.58it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00251, train/loss_vlb_step=1.41e-5, train/loss_step=0.00251, global_step=2569.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2760/5971 [29:12<33:58,  1.58it/s, loss=0.182, v_num=0, train/loss_simple_step=0.00334, train/loss_vlb_step=1.79e-5, train/loss_step=0.00334, global_step=2569.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▌     | 2761/5971 [29:13<33:57,  1.58it/s, loss=0.175, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000349, train/loss_step=0.106, global_step=2570.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  46%|████▋     | 2762/5971 [29:14<33:57,  1.57it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0189, train/loss_vlb_step=7.96e-5, train/loss_step=0.0189, global_step=2570.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▋     | 2763/5971 [29:15<33:57,  1.57it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00557, train/loss_vlb_step=2.87e-5, train/loss_step=0.00557, global_step=2570.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▋     | 2764/5971 [29:17<33:58,  1.57it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00557, train/loss_vlb_step=2.87e-5, train/loss_step=0.00557, global_step=2570.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▋     | 2764/5971 [29:17<33:58,  1.57it/s, loss=0.16, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000357, train/loss_step=0.108, global_step=2570.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  46%|████▋     | 2765/5971 [29:18<33:58,  1.57it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00228, train/loss_vlb_step=1.3e-5, train/loss_step=0.00228, global_step=2571.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▋     | 2766/5971 [29:19<33:57,  1.57it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00301, train/loss_vlb_step=1.7e-5, train/loss_step=0.00301, global_step=2571.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▋     | 2767/5971 [29:20<33:57,  1.57it/s, loss=0.151, v_num=0, train/loss_simple_step=0.299, train/loss_vlb_step=0.00146, train/loss_step=0.299, global_step=2571.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  46%|████▋     | 2768/5971 [29:22<33:58,  1.57it/s, loss=0.151, v_num=0, train/loss_simple_step=0.299, train/loss_vlb_step=0.00146, train/loss_step=0.299, global_step=2571.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▋     | 2768/5971 [29:22<33:58,  1.57it/s, loss=0.14, v_num=0, train/loss_simple_step=0.306, train/loss_vlb_step=0.00142, train/loss_step=0.306, global_step=2571.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  46%|████▋     | 2769/5971 [29:23<33:58,  1.57it/s, loss=0.169, v_num=0, train/loss_simple_step=0.602, train/loss_vlb_step=0.00798, train/loss_step=0.602, global_step=2572.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▋     | 2770/5971 [29:24<33:58,  1.57it/s, loss=0.179, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000778, train/loss_step=0.213, global_step=2572.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▋     | 2771/5971 [29:25<33:57,  1.57it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0411, train/loss_vlb_step=0.000148, train/loss_step=0.0411, global_step=2572.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▋     | 2772/5971 [29:28<34:00,  1.57it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0411, train/loss_vlb_step=0.000148, train/loss_step=0.0411, global_step=2572.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▋     | 2772/5971 [29:28<34:00,  1.57it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00561, train/loss_vlb_step=2.88e-5, train/loss_step=0.00561, global_step=2572.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▋     | 2773/5971 [29:29<33:59,  1.57it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.31e-5, train/loss_step=0.0126, global_step=2573.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  46%|████▋     | 2774/5971 [29:30<33:59,  1.57it/s, loss=0.174, v_num=0, train/loss_simple_step=0.261, train/loss_vlb_step=0.000964, train/loss_step=0.261, global_step=2573.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  46%|████▋     | 2775/5971 [29:31<33:59,  1.57it/s, loss=0.212, v_num=0, train/loss_simple_step=0.779, train/loss_vlb_step=0.0273, train/loss_step=0.779, global_step=2573.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  46%|████▋     | 2776/5971 [29:33<34:00,  1.57it/s, loss=0.212, v_num=0, train/loss_simple_step=0.779, train/loss_vlb_step=0.0273, train/loss_step=0.779, global_step=2573.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  46%|████▋     | 2776/5971 [29:33<34:00,  1.57it/s, loss=0.209, v_num=0, train/loss_simple_step=0.248, train/loss_vlb_step=0.000906, train/loss_step=0.248, global_step=2573.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  47%|████▋     | 2777/5971 [29:34<34:00,  1.57it/s, loss=0.206, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000445, train/loss_step=0.135, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  47%|████▋     | 2778/5971 [29:35<34:00,  1.56it/s, loss=0.186, v_num=0, train/loss_simple_step=0.571, train/loss_vlb_step=0.00688, train/loss_step=0.571, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  47%|████▋     | 2779/5971 [29:36<33:59,  1.56it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00536, train/loss_vlb_step=2.76e-5, train/loss_step=0.00536, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  47%|████▋     | 2780/5971 [29:39<34:01,  1.56it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00536, train/loss_vlb_step=2.76e-5, train/loss_step=0.00536, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  47%|████▋     | 2780/5971 [29:39<34:01,  1.56it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:27,  1.91it/s][A

Validating:   1%|          | 2/167 [00:00<00:52,  3.14it/s][A
Epoch 4:  47%|████▋     | 2784/5971 [29:39<33:56,  1.56it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   3%|▎         | 5/167 [00:00<00:19,  8.24it/s][A
Epoch 4:  47%|████▋     | 2788/5971 [29:40<33:51,  1.57it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:00<00:12, 12.78it/s][A

Validating:   7%|▋         | 11/167 [00:01<00:09, 15.87it/s][A
Epoch 4:  47%|████▋     | 2792/5971 [29:40<33:46,  1.57it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   8%|▊         | 14/167 [00:01<00:08, 18.33it/s][A
Epoch 4:  47%|████▋     | 2796/5971 [29:40<33:40,  1.57it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  10%|█         | 17/167 [00:01<00:07, 20.08it/s][A
Epoch 4:  47%|████▋     | 2800/5971 [29:40<33:35,  1.57it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 23.27it/s][A
Epoch 4:  47%|████▋     | 2804/5971 [29:40<33:30,  1.58it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  14%|█▍        | 24/167 [00:01<00:06, 23.80it/s][A

Validating:  16%|█▌        | 27/167 [00:01<00:05, 24.54it/s][A
Epoch 4:  47%|████▋     | 2808/5971 [29:40<33:25,  1.58it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 25.28it/s][A
Epoch 4:  47%|████▋     | 2812/5971 [29:40<33:20,  1.58it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 26.03it/s][A
Epoch 4:  47%|████▋     | 2816/5971 [29:41<33:14,  1.58it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  22%|██▏       | 36/167 [00:02<00:05, 25.69it/s][A

Validating:  23%|██▎       | 39/167 [00:02<00:05, 25.04it/s][A
Epoch 4:  47%|████▋     | 2820/5971 [29:41<33:09,  1.58it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 25.17it/s][A
Epoch 4:  47%|████▋     | 2824/5971 [29:41<33:04,  1.59it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 26.21it/s][A
Epoch 4:  47%|████▋     | 2828/5971 [29:41<32:59,  1.59it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 25.29it/s][A

Validating:  31%|███       | 51/167 [00:02<00:04, 24.49it/s][A
Epoch 4:  47%|████▋     | 2832/5971 [29:41<32:54,  1.59it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 24.47it/s][A
Epoch 4:  47%|████▋     | 2836/5971 [29:41<32:49,  1.59it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 23.93it/s][A
Epoch 4:  48%|████▊     | 2840/5971 [29:42<32:43,  1.59it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  36%|███▌      | 60/167 [00:02<00:04, 24.19it/s][A

Validating:  38%|███▊      | 63/167 [00:03<00:04, 24.42it/s][A
Epoch 4:  48%|████▊     | 2844/5971 [29:42<32:38,  1.60it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  40%|███▉      | 66/167 [00:03<00:04, 23.32it/s][A
Epoch 4:  48%|████▊     | 2848/5971 [29:42<32:33,  1.60it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  41%|████▏     | 69/167 [00:03<00:04, 23.39it/s][A
Epoch 4:  48%|████▊     | 2852/5971 [29:42<32:28,  1.60it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  43%|████▎     | 72/167 [00:03<00:04, 23.16it/s][A

Validating:  45%|████▍     | 75/167 [00:03<00:03, 23.15it/s][A
Epoch 4:  48%|████▊     | 2856/5971 [29:42<32:23,  1.60it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  47%|████▋     | 78/167 [00:03<00:04, 21.95it/s][A
Epoch 4:  48%|████▊     | 2860/5971 [29:42<32:18,  1.60it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  49%|████▊     | 81/167 [00:03<00:04, 21.42it/s][A
Epoch 4:  48%|████▊     | 2864/5971 [29:43<32:13,  1.61it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  50%|█████     | 84/167 [00:04<00:03, 22.54it/s][A

Validating:  52%|█████▏    | 87/167 [00:04<00:03, 22.82it/s][A
Epoch 4:  48%|████▊     | 2868/5971 [29:43<32:08,  1.61it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  54%|█████▍    | 90/167 [00:04<00:03, 24.48it/s][A
Epoch 4:  48%|████▊     | 2872/5971 [29:43<32:03,  1.61it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  56%|█████▌    | 93/167 [00:04<00:03, 24.19it/s][A
Epoch 4:  48%|████▊     | 2876/5971 [29:43<31:58,  1.61it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 24.60it/s][A

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 24.97it/s][A
Epoch 4:  48%|████▊     | 2880/5971 [29:43<31:53,  1.62it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  61%|██████    | 102/167 [00:04<00:02, 21.86it/s][A
Epoch 4:  48%|████▊     | 2884/5971 [29:44<31:48,  1.62it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  63%|██████▎   | 105/167 [00:04<00:03, 20.40it/s][A
Epoch 4:  48%|████▊     | 2888/5971 [29:44<31:44,  1.62it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  65%|██████▍   | 108/167 [00:05<00:02, 21.50it/s][A

Validating:  66%|██████▋   | 111/167 [00:05<00:02, 22.40it/s][A
Epoch 4:  48%|████▊     | 2892/5971 [29:44<31:39,  1.62it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  68%|██████▊   | 114/167 [00:05<00:02, 22.25it/s][A
Epoch 4:  49%|████▊     | 2896/5971 [29:44<31:34,  1.62it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  70%|███████   | 117/167 [00:05<00:02, 23.27it/s][A
Epoch 4:  49%|████▊     | 2900/5971 [29:44<31:29,  1.63it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 23.60it/s][A

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 24.09it/s][A
Epoch 4:  49%|████▊     | 2904/5971 [29:44<31:24,  1.63it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 24.08it/s][A
Epoch 4:  49%|████▊     | 2908/5971 [29:45<31:19,  1.63it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 24.84it/s][A
Epoch 4:  49%|████▉     | 2912/5971 [29:45<31:14,  1.63it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  79%|███████▉  | 132/167 [00:06<00:01, 25.48it/s][A

Validating:  81%|████████  | 135/167 [00:06<00:01, 26.11it/s][A
Epoch 4:  49%|████▉     | 2916/5971 [29:45<31:09,  1.63it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  83%|████████▎ | 138/167 [00:06<00:01, 25.25it/s][A
Epoch 4:  49%|████▉     | 2920/5971 [29:45<31:04,  1.64it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  84%|████████▍ | 141/167 [00:06<00:01, 24.44it/s][A
Epoch 4:  49%|████▉     | 2924/5971 [29:45<31:00,  1.64it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 24.90it/s][A

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 25.76it/s][A
Epoch 4:  49%|████▉     | 2928/5971 [29:45<30:55,  1.64it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 25.56it/s][A
Epoch 4:  49%|████▉     | 2932/5971 [29:46<30:50,  1.64it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 21.51it/s][A
Epoch 4:  49%|████▉     | 2936/5971 [29:46<30:45,  1.64it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  93%|█████████▎| 156/167 [00:07<00:00, 22.77it/s][A

Validating:  95%|█████████▌| 159/167 [00:07<00:00, 24.10it/s][A
Epoch 4:  49%|████▉     | 2940/5971 [29:46<30:41,  1.65it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  97%|█████████▋| 162/167 [00:07<00:00, 24.16it/s][A
Epoch 4:  49%|████▉     | 2944/5971 [29:46<30:36,  1.65it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  99%|█████████▉| 165/167 [00:07<00:00, 25.55it/s][A
Epoch 4:  49%|████▉     | 2948/5971 [29:46<30:31,  1.65it/s, loss=0.205, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00189, train/loss_step=0.376, global_step=2574.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  49%|████▉     | 2949/5971 [29:48<30:31,  1.65it/s, loss=0.219, v_num=0, train/loss_simple_step=0.384, train/loss_vlb_step=0.00288, train/loss_step=0.384, global_step=2575.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  49%|████▉     | 2950/5971 [29:48<30:31,  1.65it/s, loss=0.225, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000454, train/loss_step=0.135, global_step=2575.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  49%|████▉     | 2951/5971 [29:49<30:31,  1.65it/s, loss=0.241, v_num=0, train/loss_simple_step=0.338, train/loss_vlb_step=0.00219, train/loss_step=0.338, global_step=2575.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  49%|████▉     | 2952/5971 [29:52<30:32,  1.65it/s, loss=0.241, v_num=0, train/loss_simple_step=0.338, train/loss_vlb_step=0.00219, train/loss_step=0.338, global_step=2575.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  49%|████▉     | 2952/5971 [29:52<30:32,  1.65it/s, loss=0.241, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000363, train/loss_step=0.110, global_step=2575.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  49%|████▉     | 2953/5971 [29:53<30:31,  1.65it/s, loss=0.242, v_num=0, train/loss_simple_step=0.00931, train/loss_vlb_step=4.48e-5, train/loss_step=0.00931, global_step=2576.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  49%|████▉     | 2954/5971 [29:54<30:31,  1.65it/s, loss=0.245, v_num=0, train/loss_simple_step=0.0636, train/loss_vlb_step=0.000219, train/loss_step=0.0636, global_step=2576.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  49%|████▉     | 2955/5971 [29:55<30:31,  1.65it/s, loss=0.246, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00137, train/loss_step=0.318, global_step=2576.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  50%|████▉     | 2956/5971 [29:57<30:32,  1.65it/s, loss=0.246, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00137, train/loss_step=0.318, global_step=2576.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|████▉     | 2956/5971 [29:57<30:32,  1.65it/s, loss=0.232, v_num=0, train/loss_simple_step=0.0348, train/loss_vlb_step=0.000126, train/loss_step=0.0348, global_step=2576.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|████▉     | 2957/5971 [29:58<30:32,  1.65it/s, loss=0.202, v_num=0, train/loss_simple_step=0.00196, train/loss_vlb_step=1.14e-5, train/loss_step=0.00196, global_step=2577.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|████▉     | 2958/5971 [29:58<30:31,  1.64it/s, loss=0.207, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00176, train/loss_step=0.318, global_step=2577.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  50%|████▉     | 2959/5971 [29:59<30:31,  1.64it/s, loss=0.206, v_num=0, train/loss_simple_step=0.00331, train/loss_vlb_step=1.81e-5, train/loss_step=0.00331, global_step=2577.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|████▉     | 2960/5971 [30:02<30:33,  1.64it/s, loss=0.206, v_num=0, train/loss_simple_step=0.00331, train/loss_vlb_step=1.81e-5, train/loss_step=0.00331, global_step=2577.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|████▉     | 2960/5971 [30:02<30:33,  1.64it/s, loss=0.217, v_num=0, train/loss_simple_step=0.226, train/loss_vlb_step=0.000887, train/loss_step=0.226, global_step=2577.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  50%|████▉     | 2961/5971 [30:03<30:32,  1.64it/s, loss=0.224, v_num=0, train/loss_simple_step=0.159, train/loss_vlb_step=0.000557, train/loss_step=0.159, global_step=2578.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|████▉     | 2962/5971 [30:04<30:32,  1.64it/s, loss=0.219, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.00058, train/loss_step=0.171, global_step=2578.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  50%|████▉     | 2963/5971 [30:05<30:32,  1.64it/s, loss=0.18, v_num=0, train/loss_simple_step=0.00149, train/loss_vlb_step=8.93e-6, train/loss_step=0.00149, global_step=2578.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|████▉     | 2964/5971 [30:07<30:33,  1.64it/s, loss=0.18, v_num=0, train/loss_simple_step=0.00149, train/loss_vlb_step=8.93e-6, train/loss_step=0.00149, global_step=2578.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|████▉     | 2964/5971 [30:07<30:33,  1.64it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00701, train/loss_vlb_step=3.5e-5, train/loss_step=0.00701, global_step=2578.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|████▉     | 2965/5971 [30:08<30:32,  1.64it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00545, train/loss_vlb_step=2.57e-5, train/loss_step=0.00545, global_step=2579.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|████▉     | 2966/5971 [30:09<30:32,  1.64it/s, loss=0.139, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000395, train/loss_step=0.120, global_step=2579.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  50%|████▉     | 2967/5971 [30:10<30:32,  1.64it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0158, train/loss_vlb_step=6.86e-5, train/loss_step=0.0158, global_step=2579.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|████▉     | 2968/5971 [30:12<30:33,  1.64it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0158, train/loss_vlb_step=6.86e-5, train/loss_step=0.0158, global_step=2579.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|████▉     | 2968/5971 [30:12<30:33,  1.64it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0657, train/loss_vlb_step=0.000217, train/loss_step=0.0657, global_step=2579.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|████▉     | 2969/5971 [30:13<30:33,  1.64it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0207, train/loss_vlb_step=8.67e-5, train/loss_step=0.0207, global_step=2580.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  50%|████▉     | 2970/5971 [30:14<30:32,  1.64it/s, loss=0.127, v_num=0, train/loss_simple_step=0.561, train/loss_vlb_step=0.00701, train/loss_step=0.561, global_step=2580.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  50%|████▉     | 2971/5971 [30:15<30:32,  1.64it/s, loss=0.119, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000643, train/loss_step=0.170, global_step=2580.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|████▉     | 2972/5971 [30:17<30:33,  1.64it/s, loss=0.119, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000643, train/loss_step=0.170, global_step=2580.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|████▉     | 2972/5971 [30:17<30:33,  1.64it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00195, train/loss_vlb_step=1.15e-5, train/loss_step=0.00195, global_step=2580.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|████▉     | 2973/5971 [30:18<30:33,  1.64it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00296, train/loss_vlb_step=1.61e-5, train/loss_step=0.00296, global_step=2581.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|████▉     | 2974/5971 [30:19<30:32,  1.64it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0881, train/loss_vlb_step=0.000294, train/loss_step=0.0881, global_step=2581.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  50%|████▉     | 2975/5971 [30:20<30:32,  1.64it/s, loss=0.105, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000427, train/loss_step=0.130, global_step=2581.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  50%|████▉     | 2976/5971 [30:22<30:33,  1.63it/s, loss=0.105, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000427, train/loss_step=0.130, global_step=2581.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|████▉     | 2976/5971 [30:22<30:33,  1.63it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=5.07e-5, train/loss_step=0.0113, global_step=2581.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|████▉     | 2977/5971 [30:23<30:33,  1.63it/s, loss=0.121, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00176, train/loss_step=0.348, global_step=2582.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  50%|████▉     | 2978/5971 [30:24<30:32,  1.63it/s, loss=0.105, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.46e-6, train/loss_step=0.00145, global_step=2582.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|████▉     | 2979/5971 [30:25<30:32,  1.63it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0201, train/loss_vlb_step=8.13e-5, train/loss_step=0.0201, global_step=2582.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  50%|████▉     | 2980/5971 [30:27<30:33,  1.63it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0201, train/loss_vlb_step=8.13e-5, train/loss_step=0.0201, global_step=2582.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|████▉     | 2980/5971 [30:27<30:33,  1.63it/s, loss=0.123, v_num=0, train/loss_simple_step=0.551, train/loss_vlb_step=0.00979, train/loss_step=0.551, global_step=2582.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  50%|████▉     | 2981/5971 [30:28<30:33,  1.63it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0404, train/loss_vlb_step=0.000143, train/loss_step=0.0404, global_step=2583.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|████▉     | 2982/5971 [30:29<30:32,  1.63it/s, loss=0.117, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000647, train/loss_step=0.184, global_step=2583.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  50%|████▉     | 2983/5971 [30:30<30:32,  1.63it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0226, train/loss_vlb_step=8.98e-5, train/loss_step=0.0226, global_step=2583.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|████▉     | 2984/5971 [30:32<30:33,  1.63it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0226, train/loss_vlb_step=8.98e-5, train/loss_step=0.0226, global_step=2583.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|████▉     | 2984/5971 [30:32<30:33,  1.63it/s, loss=0.129, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000866, train/loss_step=0.227, global_step=2583.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  50%|████▉     | 2985/5971 [30:33<30:33,  1.63it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00222, train/loss_vlb_step=1.35e-5, train/loss_step=0.00222, global_step=2584.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|█████     | 2986/5971 [30:34<30:33,  1.63it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0383, train/loss_vlb_step=0.000138, train/loss_step=0.0383, global_step=2584.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  50%|█████     | 2987/5971 [30:35<30:32,  1.63it/s, loss=0.133, v_num=0, train/loss_simple_step=0.167, train/loss_vlb_step=0.000587, train/loss_step=0.167, global_step=2584.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  50%|█████     | 2988/5971 [30:37<30:33,  1.63it/s, loss=0.133, v_num=0, train/loss_simple_step=0.167, train/loss_vlb_step=0.000587, train/loss_step=0.167, global_step=2584.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|█████     | 2988/5971 [30:37<30:33,  1.63it/s, loss=0.16, v_num=0, train/loss_simple_step=0.611, train/loss_vlb_step=0.0123, train/loss_step=0.611, global_step=2584.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  50%|█████     | 2989/5971 [30:38<30:33,  1.63it/s, loss=0.174, v_num=0, train/loss_simple_step=0.310, train/loss_vlb_step=0.00145, train/loss_step=0.310, global_step=2585.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|█████     | 2990/5971 [30:39<30:33,  1.63it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00306, train/loss_vlb_step=1.71e-5, train/loss_step=0.00306, global_step=2585.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|█████     | 2991/5971 [30:40<30:32,  1.63it/s, loss=0.148, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.00082, train/loss_step=0.201, global_step=2585.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  50%|█████     | 2992/5971 [30:42<30:33,  1.62it/s, loss=0.148, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.00082, train/loss_step=0.201, global_step=2585.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|█████     | 2992/5971 [30:42<30:33,  1.62it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00617, train/loss_vlb_step=3.11e-5, train/loss_step=0.00617, global_step=2585.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|█████     | 2993/5971 [30:43<30:33,  1.62it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00383, train/loss_vlb_step=2.07e-5, train/loss_step=0.00383, global_step=2586.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|█████     | 2994/5971 [30:44<30:33,  1.62it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0909, train/loss_vlb_step=0.0003, train/loss_step=0.0909, global_step=2586.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  50%|█████     | 2995/5971 [30:44<30:32,  1.62it/s, loss=0.154, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000886, train/loss_step=0.242, global_step=2586.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|█████     | 2996/5971 [30:47<30:33,  1.62it/s, loss=0.154, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000886, train/loss_step=0.242, global_step=2586.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|█████     | 2996/5971 [30:47<30:33,  1.62it/s, loss=0.177, v_num=0, train/loss_simple_step=0.473, train/loss_vlb_step=0.00435, train/loss_step=0.473, global_step=2586.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  50%|█████     | 2997/5971 [30:48<30:33,  1.62it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00285, train/loss_vlb_step=1.56e-5, train/loss_step=0.00285, global_step=2587.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|█████     | 2998/5971 [30:49<30:33,  1.62it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0176, train/loss_vlb_step=7.2e-5, train/loss_step=0.0176, global_step=2587.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  50%|█████     | 2999/5971 [30:50<30:32,  1.62it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0669, train/loss_vlb_step=0.000229, train/loss_step=0.0669, global_step=2587.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|█████     | 3000/5971 [30:53<30:34,  1.62it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0669, train/loss_vlb_step=0.000229, train/loss_step=0.0669, global_step=2587.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|█████     | 3000/5971 [30:53<30:34,  1.62it/s, loss=0.144, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.000595, train/loss_step=0.176, global_step=2587.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  50%|█████     | 3001/5971 [30:54<30:34,  1.62it/s, loss=0.153, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000866, train/loss_step=0.224, global_step=2588.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|█████     | 3002/5971 [30:55<30:34,  1.62it/s, loss=0.153, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.000625, train/loss_step=0.176, global_step=2588.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|█████     | 3003/5971 [30:56<30:33,  1.62it/s, loss=0.166, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00117, train/loss_step=0.287, global_step=2588.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  50%|█████     | 3004/5971 [30:58<30:34,  1.62it/s, loss=0.166, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00117, train/loss_step=0.287, global_step=2588.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|█████     | 3004/5971 [30:58<30:34,  1.62it/s, loss=0.17, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00181, train/loss_step=0.305, global_step=2588.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  50%|█████     | 3005/5971 [30:59<30:34,  1.62it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00704, train/loss_vlb_step=3.42e-5, train/loss_step=0.00704, global_step=2589.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|█████     | 3006/5971 [31:00<30:34,  1.62it/s, loss=0.182, v_num=0, train/loss_simple_step=0.273, train/loss_vlb_step=0.00106, train/loss_step=0.273, global_step=2589.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  50%|█████     | 3007/5971 [31:01<30:33,  1.62it/s, loss=0.202, v_num=0, train/loss_simple_step=0.557, train/loss_vlb_step=0.00462, train/loss_step=0.557, global_step=2589.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|█████     | 3008/5971 [31:03<30:34,  1.61it/s, loss=0.202, v_num=0, train/loss_simple_step=0.557, train/loss_vlb_step=0.00462, train/loss_step=0.557, global_step=2589.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|█████     | 3008/5971 [31:03<30:34,  1.61it/s, loss=0.188, v_num=0, train/loss_simple_step=0.343, train/loss_vlb_step=0.00142, train/loss_step=0.343, global_step=2589.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|█████     | 3009/5971 [31:04<30:34,  1.61it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.05e-5, train/loss_step=0.0112, global_step=2590.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|█████     | 3010/5971 [31:05<30:34,  1.61it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00307, train/loss_vlb_step=1.69e-5, train/loss_step=0.00307, global_step=2590.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|█████     | 3011/5971 [31:06<30:34,  1.61it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0374, train/loss_vlb_step=0.000137, train/loss_step=0.0374, global_step=2590.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  50%|█████     | 3012/5971 [31:08<30:35,  1.61it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0374, train/loss_vlb_step=0.000137, train/loss_step=0.0374, global_step=2590.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|█████     | 3012/5971 [31:08<30:35,  1.61it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00235, train/loss_vlb_step=1.29e-5, train/loss_step=0.00235, global_step=2590.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|█████     | 3013/5971 [31:09<30:34,  1.61it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0433, train/loss_vlb_step=0.000155, train/loss_step=0.0433, global_step=2591.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  50%|█████     | 3014/5971 [31:10<30:34,  1.61it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00179, train/loss_vlb_step=1.07e-5, train/loss_step=0.00179, global_step=2591.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  50%|█████     | 3015/5971 [31:11<30:34,  1.61it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00367, train/loss_vlb_step=1.95e-5, train/loss_step=0.00367, global_step=2591.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  51%|█████     | 3016/5971 [31:13<30:35,  1.61it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00367, train/loss_vlb_step=1.95e-5, train/loss_step=0.00367, global_step=2591.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  51%|█████     | 3016/5971 [31:13<30:35,  1.61it/s, loss=0.141, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.00124, train/loss_step=0.278, global_step=2591.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  51%|█████     | 3017/5971 [31:14<30:34,  1.61it/s, loss=0.157, v_num=0, train/loss_simple_step=0.323, train/loss_vlb_step=0.00161, train/loss_step=0.323, global_step=2592.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  51%|█████     | 3018/5971 [31:15<30:34,  1.61it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0753, train/loss_vlb_step=0.000247, train/loss_step=0.0753, global_step=2592.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  51%|█████     | 3019/5971 [31:16<30:34,  1.61it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00191, train/loss_vlb_step=1.09e-5, train/loss_step=0.00191, global_step=2592.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  51%|█████     | 3020/5971 [31:18<30:35,  1.61it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00191, train/loss_vlb_step=1.09e-5, train/loss_step=0.00191, global_step=2592.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  51%|█████     | 3020/5971 [31:18<30:35,  1.61it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0895, train/loss_vlb_step=0.000298, train/loss_step=0.0895, global_step=2592.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  51%|█████     | 3021/5971 [31:19<30:34,  1.61it/s, loss=0.149, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000528, train/loss_step=0.158, global_step=2593.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  51%|█████     | 3022/5971 [31:20<30:34,  1.61it/s, loss=0.17, v_num=0, train/loss_simple_step=0.598, train/loss_vlb_step=0.0117, train/loss_step=0.598, global_step=2593.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  51%|█████     | 3023/5971 [31:21<30:34,  1.61it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00203, train/loss_vlb_step=1.17e-5, train/loss_step=0.00203, global_step=2593.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  51%|█████     | 3024/5971 [31:23<30:35,  1.61it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00203, train/loss_vlb_step=1.17e-5, train/loss_step=0.00203, global_step=2593.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  51%|█████     | 3024/5971 [31:23<30:35,  1.61it/s, loss=0.145, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2593.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  51%|█████     | 3025/5971 [31:24<30:34,  1.61it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0339, train/loss_vlb_step=0.000124, train/loss_step=0.0339, global_step=2594.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  51%|█████     | 3026/5971 [31:25<30:34,  1.61it/s, loss=0.147, v_num=0, train/loss_simple_step=0.280, train/loss_vlb_step=0.00106, train/loss_step=0.280, global_step=2594.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  51%|█████     | 3027/5971 [31:26<30:34,  1.61it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00941, train/loss_vlb_step=4.51e-5, train/loss_step=0.00941, global_step=2594.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  51%|█████     | 3028/5971 [31:29<30:35,  1.60it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00941, train/loss_vlb_step=4.51e-5, train/loss_step=0.00941, global_step=2594.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  51%|█████     | 3028/5971 [31:29<30:35,  1.60it/s, loss=0.106, v_num=0, train/loss_simple_step=0.065, train/loss_vlb_step=0.000227, train/loss_step=0.065, global_step=2594.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  51%|█████     | 3029/5971 [31:30<30:35,  1.60it/s, loss=0.114, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000596, train/loss_step=0.174, global_step=2595.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  51%|█████     | 3030/5971 [31:31<30:35,  1.60it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00565, train/loss_vlb_step=2.73e-5, train/loss_step=0.00565, global_step=2595.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  51%|█████     | 3031/5971 [31:32<30:34,  1.60it/s, loss=0.12, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000498, train/loss_step=0.151, global_step=2595.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  51%|█████     | 3032/5971 [31:34<30:35,  1.60it/s, loss=0.12, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000498, train/loss_step=0.151, global_step=2595.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  51%|█████     | 3032/5971 [31:34<30:35,  1.60it/s, loss=0.135, v_num=0, train/loss_simple_step=0.307, train/loss_vlb_step=0.00159, train/loss_step=0.307, global_step=2595.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  51%|█████     | 3033/5971 [31:35<30:35,  1.60it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00255, train/loss_vlb_step=1.49e-5, train/loss_step=0.00255, global_step=2596.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  51%|█████     | 3034/5971 [31:36<30:35,  1.60it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00336, train/loss_vlb_step=1.81e-5, train/loss_step=0.00336, global_step=2596.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  51%|█████     | 3035/5971 [31:37<30:34,  1.60it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00144, train/loss_vlb_step=8.61e-6, train/loss_step=0.00144, global_step=2596.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  51%|█████     | 3036/5971 [31:39<30:35,  1.60it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00144, train/loss_vlb_step=8.61e-6, train/loss_step=0.00144, global_step=2596.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  51%|█████     | 3036/5971 [31:39<30:35,  1.60it/s, loss=0.129, v_num=0, train/loss_simple_step=0.195, train/loss_vlb_step=0.000712, train/loss_step=0.195, global_step=2596.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  51%|█████     | 3037/5971 [31:40<30:35,  1.60it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00155, train/loss_vlb_step=9.21e-6, train/loss_step=0.00155, global_step=2597.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  51%|█████     | 3038/5971 [31:41<30:35,  1.60it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0241, train/loss_vlb_step=9.82e-5, train/loss_step=0.0241, global_step=2597.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  51%|█████     | 3039/5971 [31:42<30:34,  1.60it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0366, train/loss_vlb_step=0.000137, train/loss_step=0.0366, global_step=2597.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  51%|█████     | 3040/5971 [31:44<30:35,  1.60it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0366, train/loss_vlb_step=0.000137, train/loss_step=0.0366, global_step=2597.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  51%|█████     | 3040/5971 [31:44<30:35,  1.60it/s, loss=0.107, v_num=0, train/loss_simple_step=0.00291, train/loss_vlb_step=1.57e-5, train/loss_step=0.00291, global_step=2597.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  51%|█████     | 3041/5971 [31:45<30:35,  1.60it/s, loss=0.11, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000902, train/loss_step=0.213, global_step=2598.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  51%|█████     | 3042/5971 [31:46<30:35,  1.60it/s, loss=0.0984, v_num=0, train/loss_simple_step=0.366, train/loss_vlb_step=0.00186, train/loss_step=0.366, global_step=2598.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  51%|█████     | 3043/5971 [31:47<30:34,  1.60it/s, loss=0.104, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000416, train/loss_step=0.114, global_step=2598.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  51%|█████     | 3044/5971 [31:49<30:35,  1.59it/s, loss=0.104, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000416, train/loss_step=0.114, global_step=2598.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  51%|█████     | 3044/5971 [31:49<30:35,  1.59it/s, loss=0.113, v_num=0, train/loss_simple_step=0.276, train/loss_vlb_step=0.000972, train/loss_step=0.276, global_step=2598.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  51%|█████     | 3045/5971 [31:50<30:35,  1.59it/s, loss=0.12, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000638, train/loss_step=0.177, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  51%|█████     | 3046/5971 [31:51<30:34,  1.59it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.2e-5, train/loss_step=0.0119, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  51%|█████     | 3047/5971 [31:52<30:34,  1.59it/s, loss=0.13, v_num=0, train/loss_simple_step=0.467, train/loss_vlb_step=0.0033, train/loss_step=0.467, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  51%|█████     | 3048/5971 [31:54<30:35,  1.59it/s, loss=0.13, v_num=0, train/loss_simple_step=0.467, train/loss_vlb_step=0.0033, train/loss_step=0.467, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  51%|█████     | 3048/5971 [31:54<30:35,  1.59it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating: 0it [00:00, ?it/s]
Validating:   0%|          | 0/167 [00:00<?, ?it/s]
Validating:   1%|          | 1/167 [00:00<01:08,  2.41it/s]
Validating:   1%|          | 2/167 [00:00<00:49,  3.34it/s]
Epoch 4:  51%|█████     | 3052/5971 [31:55<30:31,  1.59it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   3%|▎         | 5/167 [00:00<00:18,  8.55it/s]
Epoch 4:  51%|█████     | 3056/5971 [31:55<30:26,  1.60it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.35it/s]
Validating:   7%|▋         | 11/167 [00:00<00:08, 17.34it/s]
Epoch 4:  51%|█████     | 3060/5971 [31:55<30:21,  1.60it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.70it/s][A
Epoch 4:  51%|█████▏    | 3064/5971 [31:55<30:16,  1.60it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  10%|█         | 17/167 [00:01<00:06, 21.72it/s][A
Epoch 4:  51%|█████▏    | 3068/5971 [31:55<30:12,  1.60it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.75it/s][A

Validating:  14%|█▍        | 23/167 [00:01<00:05, 24.29it/s][A
Epoch 4:  51%|█████▏    | 3072/5971 [31:55<30:07,  1.60it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.09it/s][A
Epoch 4:  52%|█████▏    | 3076/5971 [31:56<30:02,  1.61it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 23.61it/s][A
Epoch 4:  52%|█████▏    | 3080/5971 [31:56<29:58,  1.61it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 24.40it/s][A

Validating:  21%|██        | 35/167 [00:01<00:05, 24.16it/s][A
Epoch 4:  52%|█████▏    | 3084/5971 [31:56<29:53,  1.61it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  23%|██▎       | 38/167 [00:02<00:05, 24.33it/s][A
Epoch 4:  52%|█████▏    | 3088/5971 [31:56<29:48,  1.61it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  25%|██▍       | 41/167 [00:02<00:05, 24.88it/s][A
Epoch 4:  52%|█████▏    | 3092/5971 [31:56<29:44,  1.61it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  26%|██▋       | 44/167 [00:02<00:05, 23.14it/s][A

Validating:  28%|██▊       | 47/167 [00:02<00:05, 21.00it/s][A
Epoch 4:  52%|█████▏    | 3096/5971 [31:56<29:39,  1.62it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  30%|██▉       | 50/167 [00:02<00:05, 21.30it/s][A
Epoch 4:  52%|█████▏    | 3100/5971 [31:57<29:34,  1.62it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 23.17it/s][A
Epoch 4:  52%|█████▏    | 3104/5971 [31:57<29:30,  1.62it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 23.44it/s][A

Validating:  35%|███▌      | 59/167 [00:02<00:04, 23.90it/s][A
Epoch 4:  52%|█████▏    | 3108/5971 [31:57<29:25,  1.62it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  37%|███▋      | 62/167 [00:03<00:05, 17.98it/s][A
Epoch 4:  52%|█████▏    | 3112/5971 [31:57<29:21,  1.62it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  39%|███▉      | 65/167 [00:03<00:05, 19.93it/s][A
Epoch 4:  52%|█████▏    | 3116/5971 [31:57<29:16,  1.63it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  41%|████      | 68/167 [00:03<00:04, 21.84it/s][A

Validating:  43%|████▎     | 71/167 [00:03<00:04, 22.77it/s][A
Epoch 4:  52%|█████▏    | 3120/5971 [31:58<29:12,  1.63it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  44%|████▍     | 74/167 [00:03<00:04, 23.04it/s][A
Epoch 4:  52%|█████▏    | 3124/5971 [31:58<29:07,  1.63it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 23.05it/s][A
Epoch 4:  52%|█████▏    | 3128/5971 [31:58<29:03,  1.63it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 23.94it/s][A

Validating:  50%|████▉     | 83/167 [00:04<00:03, 25.45it/s][A
Epoch 4:  52%|█████▏    | 3132/5971 [31:58<28:58,  1.63it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  51%|█████▏    | 86/167 [00:04<00:03, 25.52it/s][A
Epoch 4:  53%|█████▎    | 3136/5971 [31:58<28:53,  1.63it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  53%|█████▎    | 89/167 [00:04<00:03, 25.30it/s][A
Epoch 4:  53%|█████▎    | 3140/5971 [31:58<28:49,  1.64it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 25.27it/s][A

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 24.22it/s][A
Epoch 4:  53%|█████▎    | 3144/5971 [31:59<28:44,  1.64it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 25.31it/s][A
Epoch 4:  53%|█████▎    | 3148/5971 [31:59<28:40,  1.64it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  60%|██████    | 101/167 [00:04<00:02, 25.93it/s][A
Epoch 4:  53%|█████▎    | 3152/5971 [31:59<28:36,  1.64it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 26.01it/s][A

Validating:  64%|██████▍   | 107/167 [00:05<00:02, 24.51it/s][A
Epoch 4:  53%|█████▎    | 3156/5971 [31:59<28:31,  1.64it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  66%|██████▌   | 110/167 [00:05<00:02, 24.98it/s][A
Epoch 4:  53%|█████▎    | 3160/5971 [31:59<28:27,  1.65it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  68%|██████▊   | 113/167 [00:05<00:02, 25.28it/s][A
Epoch 4:  53%|█████▎    | 3164/5971 [31:59<28:22,  1.65it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  69%|██████▉   | 116/167 [00:05<00:02, 25.41it/s][A

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 26.62it/s][A
Epoch 4:  53%|█████▎    | 3168/5971 [31:59<28:18,  1.65it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 26.95it/s][A
Epoch 4:  53%|█████▎    | 3172/5971 [32:00<28:13,  1.65it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 26.73it/s][A
Epoch 4:  53%|█████▎    | 3176/5971 [32:00<28:09,  1.65it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 26.52it/s][A

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 27.27it/s][A
Epoch 4:  53%|█████▎    | 3180/5971 [32:00<28:04,  1.66it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  80%|████████  | 134/167 [00:06<00:01, 26.93it/s][A
Epoch 4:  53%|█████▎    | 3184/5971 [32:00<28:00,  1.66it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  82%|████████▏ | 137/167 [00:06<00:01, 26.73it/s][A
Epoch 4:  53%|█████▎    | 3188/5971 [32:00<27:56,  1.66it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  84%|████████▍ | 140/167 [00:06<00:00, 27.04it/s][A

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 27.32it/s][A
Epoch 4:  53%|█████▎    | 3192/5971 [32:00<27:51,  1.66it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 27.57it/s][A
Epoch 4:  54%|█████▎    | 3196/5971 [32:00<27:47,  1.66it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 27.40it/s][A
Epoch 4:  54%|█████▎    | 3200/5971 [32:01<27:43,  1.67it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 25.71it/s][A

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 26.23it/s][A
Epoch 4:  54%|█████▎    | 3204/5971 [32:01<27:38,  1.67it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 25.68it/s][A
Epoch 4:  54%|█████▎    | 3208/5971 [32:01<27:34,  1.67it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  96%|█████████▋| 161/167 [00:07<00:00, 25.42it/s][A
Epoch 4:  54%|█████▍    | 3212/5971 [32:01<27:30,  1.67it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  98%|█████████▊| 164/167 [00:07<00:00, 24.50it/s][A
Epoch 4:  54%|█████▍    | 3216/5971 [32:01<27:25,  1.67it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3216/5971 [32:02<27:26,  1.67it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:23,  2.05it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:14,  3.22it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:00<00:11,  3.96it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:10,  4.40it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:09,  4.57it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.74it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.86it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  4.98it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  4.97it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.09it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.22it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.26it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:07,  5.24it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:02<00:06,  5.30it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.25it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.27it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.26it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:06,  5.25it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.30it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.37it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.44it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.44it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.51it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.45it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.45it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.43it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.41it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.40it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.47it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.48it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.53it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.58it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.52it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.38it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.29it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.38it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.46it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.52it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.53it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.56it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.58it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.60it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.62it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.62it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.40it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.31it/s][A
Epoch 4:  54%|█████▍    | 3216/5971 [32:12<27:34,  1.66it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.25it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.22it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.17it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.03it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.18it/s]

Epoch 4:  54%|█████▍    | 3217/5971 [32:14<27:35,  1.66it/s, loss=0.134, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=2599.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3217/5971 [32:14<27:35,  1.66it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0434, train/loss_vlb_step=0.000143, train/loss_step=0.0434, global_step=2600.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.35it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.42it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.24it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.89it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.38it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.68it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.85it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.00it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.06it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.16it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.22it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.27it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:07,  5.26it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.22it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.18it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.14it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.13it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:06,  5.15it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.17it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.25it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.23it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.12it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.03it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:05,  4.98it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.01it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.01it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.01it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.02it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:06<00:04,  5.03it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.01it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  4.93it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  4.92it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  4.96it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:07<00:03,  4.97it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.04it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.06it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.06it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.07it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:08<00:02,  4.83it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:02,  4.78it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  4.66it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  4.56it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:09<00:01,  4.61it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:09<00:01,  4.63it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:01,  4.63it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  4.60it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  4.68it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:10<00:00,  4.71it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:10<00:00,  4.66it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  4.57it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  4.74it/s]

Epoch 4:  54%|█████▍    | 3218/5971 [32:27<27:45,  1.65it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0434, train/loss_vlb_step=0.000143, train/loss_step=0.0434, global_step=2600.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3218/5971 [32:27<27:45,  1.65it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00672, train/loss_vlb_step=3.32e-5, train/loss_step=0.00672, global_step=2600.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:28,  1.73it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:16,  2.86it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:00<00:12,  3.64it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  4.10it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.44it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.67it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.79it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  4.96it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.09it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.19it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.19it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.16it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:07,  5.25it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.31it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.36it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.28it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.02it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:06,  4.91it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:06,  4.83it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:06,  4.78it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:06,  4.81it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  4.76it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  4.82it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:05,  4.87it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:05,  4.92it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  4.87it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  4.78it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  4.81it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:06<00:04,  4.84it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:04,  4.83it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:04,  4.72it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  4.77it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  4.77it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:07<00:03,  4.83it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:03,  4.80it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  4.79it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  4.62it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:08<00:02,  4.45it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:08<00:02,  4.33it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:02,  4.32it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:02,  4.41it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  4.50it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:09<00:01,  4.53it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:09<00:01,  4.54it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:01,  4.53it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  4.57it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:10<00:00,  4.65it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:10<00:00,  4.72it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:10<00:00,  4.76it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  4.68it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  4.67it/s]

Epoch 4:  54%|█████▍    | 3219/5971 [32:40<27:55,  1.64it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00672, train/loss_vlb_step=3.32e-5, train/loss_step=0.00672, global_step=2600.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3219/5971 [32:40<27:55,  1.64it/s, loss=0.134, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00111, train/loss_step=0.269, global_step=2600.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:38,  1.28it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:01<00:21,  2.21it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:15,  2.95it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.57it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:11,  4.05it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:10,  4.39it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.58it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.79it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  4.84it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:08,  4.98it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.14it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.28it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:07,  5.17it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.16it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.14it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.12it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.16it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:04<00:06,  5.18it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.17it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.20it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.22it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.23it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:05<00:05,  5.32it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.31it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.39it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.29it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.33it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.30it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:06<00:04,  5.23it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.20it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.16it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.16it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.16it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:07<00:03,  5.11it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.19it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.25it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.31it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.22it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:08<00:02,  5.24it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.24it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.29it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.41it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.47it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.51it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:00,  5.53it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.40it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.42it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.43it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.43it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  5.32it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  4.94it/s]

Epoch 4:  54%|█████▍    | 3220/5971 [32:54<28:06,  1.63it/s, loss=0.134, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00111, train/loss_step=0.269, global_step=2600.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3220/5971 [32:54<28:06,  1.63it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.88e-5, train/loss_step=0.0108, global_step=2600.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3221/5971 [32:55<28:06,  1.63it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.88e-5, train/loss_step=0.0108, global_step=2600.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3221/5971 [32:55<28:06,  1.63it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0557, train/loss_vlb_step=0.0002, train/loss_step=0.0557, global_step=2601.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  54%|█████▍    | 3222/5971 [32:56<28:05,  1.63it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0557, train/loss_vlb_step=0.0002, train/loss_step=0.0557, global_step=2601.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3222/5971 [32:56<28:05,  1.63it/s, loss=0.131, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000695, train/loss_step=0.200, global_step=2601.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3223/5971 [32:57<28:05,  1.63it/s, loss=0.131, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000695, train/loss_step=0.200, global_step=2601.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3223/5971 [32:57<28:05,  1.63it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0723, train/loss_vlb_step=0.000243, train/loss_step=0.0723, global_step=2601.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3224/5971 [33:00<28:06,  1.63it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0723, train/loss_vlb_step=0.000243, train/loss_step=0.0723, global_step=2601.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3224/5971 [33:00<28:06,  1.63it/s, loss=0.148, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.00246, train/loss_step=0.454, global_step=2601.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  54%|█████▍    | 3225/5971 [33:00<28:06,  1.63it/s, loss=0.148, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.00246, train/loss_step=0.454, global_step=2601.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3225/5971 [33:00<28:06,  1.63it/s, loss=0.153, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000338, train/loss_step=0.102, global_step=2602.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3226/5971 [33:01<28:05,  1.63it/s, loss=0.153, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000338, train/loss_step=0.102, global_step=2602.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3226/5971 [33:01<28:05,  1.63it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00915, train/loss_vlb_step=4.31e-5, train/loss_step=0.00915, global_step=2602.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3227/5971 [33:02<28:05,  1.63it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00915, train/loss_vlb_step=4.31e-5, train/loss_step=0.00915, global_step=2602.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3227/5971 [33:02<28:05,  1.63it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0547, train/loss_vlb_step=0.000191, train/loss_step=0.0547, global_step=2602.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  54%|█████▍    | 3228/5971 [33:04<28:06,  1.63it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0547, train/loss_vlb_step=0.000191, train/loss_step=0.0547, global_step=2602.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3228/5971 [33:04<28:06,  1.63it/s, loss=0.154, v_num=0, train/loss_simple_step=0.034, train/loss_vlb_step=0.000118, train/loss_step=0.034, global_step=2602.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  54%|█████▍    | 3229/5971 [33:05<28:05,  1.63it/s, loss=0.154, v_num=0, train/loss_simple_step=0.034, train/loss_vlb_step=0.000118, train/loss_step=0.034, global_step=2602.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3229/5971 [33:05<28:05,  1.63it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.6e-5, train/loss_step=0.0124, global_step=2603.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3230/5971 [33:06<28:05,  1.63it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.6e-5, train/loss_step=0.0124, global_step=2603.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3230/5971 [33:06<28:05,  1.63it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0807, train/loss_vlb_step=0.000268, train/loss_step=0.0807, global_step=2603.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3231/5971 [33:07<28:05,  1.63it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0807, train/loss_vlb_step=0.000268, train/loss_step=0.0807, global_step=2603.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3231/5971 [33:07<28:05,  1.63it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00598, train/loss_vlb_step=2.85e-5, train/loss_step=0.00598, global_step=2603.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3232/5971 [33:09<28:05,  1.62it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00598, train/loss_vlb_step=2.85e-5, train/loss_step=0.00598, global_step=2603.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3232/5971 [33:09<28:05,  1.62it/s, loss=0.113, v_num=0, train/loss_simple_step=0.040, train/loss_vlb_step=0.000139, train/loss_step=0.040, global_step=2603.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  54%|█████▍    | 3233/5971 [33:10<28:05,  1.62it/s, loss=0.113, v_num=0, train/loss_simple_step=0.040, train/loss_vlb_step=0.000139, train/loss_step=0.040, global_step=2603.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3233/5971 [33:10<28:05,  1.62it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00505, train/loss_vlb_step=2.48e-5, train/loss_step=0.00505, global_step=2604.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3234/5971 [33:11<28:05,  1.62it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00505, train/loss_vlb_step=2.48e-5, train/loss_step=0.00505, global_step=2604.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3234/5971 [33:11<28:05,  1.62it/s, loss=0.107, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000207, train/loss_step=0.060, global_step=2604.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  54%|█████▍    | 3235/5971 [33:12<28:04,  1.62it/s, loss=0.107, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000207, train/loss_step=0.060, global_step=2604.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3235/5971 [33:12<28:04,  1.62it/s, loss=0.113, v_num=0, train/loss_simple_step=0.595, train/loss_vlb_step=0.00618, train/loss_step=0.595, global_step=2604.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  54%|█████▍    | 3236/5971 [33:15<28:05,  1.62it/s, loss=0.113, v_num=0, train/loss_simple_step=0.595, train/loss_vlb_step=0.00618, train/loss_step=0.595, global_step=2604.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3236/5971 [33:15<28:05,  1.62it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0956, train/loss_vlb_step=0.000315, train/loss_step=0.0956, global_step=2604.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3237/5971 [33:16<28:05,  1.62it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0956, train/loss_vlb_step=0.000315, train/loss_step=0.0956, global_step=2604.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3237/5971 [33:16<28:05,  1.62it/s, loss=0.122, v_num=0, train/loss_simple_step=0.275, train/loss_vlb_step=0.00122, train/loss_step=0.275, global_step=2605.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  54%|█████▍    | 3238/5971 [33:17<28:05,  1.62it/s, loss=0.122, v_num=0, train/loss_simple_step=0.275, train/loss_vlb_step=0.00122, train/loss_step=0.275, global_step=2605.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3238/5971 [33:17<28:05,  1.62it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.76e-5, train/loss_step=0.0135, global_step=2605.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3239/5971 [33:18<28:04,  1.62it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.76e-5, train/loss_step=0.0135, global_step=2605.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3239/5971 [33:18<28:04,  1.62it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=6.95e-5, train/loss_step=0.0166, global_step=2605.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  54%|█████▍    | 3240/5971 [33:20<28:05,  1.62it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=6.95e-5, train/loss_step=0.0166, global_step=2605.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3240/5971 [33:20<28:05,  1.62it/s, loss=0.123, v_num=0, train/loss_simple_step=0.279, train/loss_vlb_step=0.00136, train/loss_step=0.279, global_step=2605.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  54%|█████▍    | 3241/5971 [33:21<28:05,  1.62it/s, loss=0.123, v_num=0, train/loss_simple_step=0.279, train/loss_vlb_step=0.00136, train/loss_step=0.279, global_step=2605.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3241/5971 [33:21<28:05,  1.62it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0438, train/loss_vlb_step=0.00015, train/loss_step=0.0438, global_step=2606.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3242/5971 [33:22<28:04,  1.62it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0438, train/loss_vlb_step=0.00015, train/loss_step=0.0438, global_step=2606.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3242/5971 [33:22<28:04,  1.62it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0496, train/loss_vlb_step=0.000176, train/loss_step=0.0496, global_step=2606.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3243/5971 [33:23<28:04,  1.62it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0496, train/loss_vlb_step=0.000176, train/loss_step=0.0496, global_step=2606.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3243/5971 [33:23<28:04,  1.62it/s, loss=0.124, v_num=0, train/loss_simple_step=0.248, train/loss_vlb_step=0.000982, train/loss_step=0.248, global_step=2606.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  54%|█████▍    | 3244/5971 [33:25<28:05,  1.62it/s, loss=0.124, v_num=0, train/loss_simple_step=0.248, train/loss_vlb_step=0.000982, train/loss_step=0.248, global_step=2606.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3244/5971 [33:25<28:05,  1.62it/s, loss=0.118, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00187, train/loss_step=0.348, global_step=2606.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  54%|█████▍    | 3245/5971 [33:26<28:05,  1.62it/s, loss=0.118, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00187, train/loss_step=0.348, global_step=2606.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3245/5971 [33:26<28:05,  1.62it/s, loss=0.12, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000484, train/loss_step=0.141, global_step=2607.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3246/5971 [33:27<28:04,  1.62it/s, loss=0.12, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000484, train/loss_step=0.141, global_step=2607.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3246/5971 [33:27<28:04,  1.62it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.69e-5, train/loss_step=0.0132, global_step=2607.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3247/5971 [33:28<28:04,  1.62it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.69e-5, train/loss_step=0.0132, global_step=2607.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3247/5971 [33:28<28:04,  1.62it/s, loss=0.127, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000645, train/loss_step=0.177, global_step=2607.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  54%|█████▍    | 3248/5971 [33:30<28:05,  1.62it/s, loss=0.127, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000645, train/loss_step=0.177, global_step=2607.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3248/5971 [33:30<28:05,  1.62it/s, loss=0.138, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.000978, train/loss_step=0.267, global_step=2607.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3249/5971 [33:31<28:04,  1.62it/s, loss=0.138, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.000978, train/loss_step=0.267, global_step=2607.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3249/5971 [33:31<28:04,  1.62it/s, loss=0.144, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000385, train/loss_step=0.117, global_step=2608.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3250/5971 [33:32<28:04,  1.62it/s, loss=0.144, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000385, train/loss_step=0.117, global_step=2608.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3250/5971 [33:32<28:04,  1.62it/s, loss=0.166, v_num=0, train/loss_simple_step=0.528, train/loss_vlb_step=0.00492, train/loss_step=0.528, global_step=2608.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  54%|█████▍    | 3251/5971 [33:33<28:04,  1.62it/s, loss=0.166, v_num=0, train/loss_simple_step=0.528, train/loss_vlb_step=0.00492, train/loss_step=0.528, global_step=2608.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3251/5971 [33:33<28:04,  1.62it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.07e-5, train/loss_step=0.0202, global_step=2608.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3252/5971 [33:35<28:04,  1.61it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=8.07e-5, train/loss_step=0.0202, global_step=2608.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3252/5971 [33:35<28:04,  1.61it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.61e-5, train/loss_step=0.00281, global_step=2608.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3253/5971 [33:36<28:04,  1.61it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.61e-5, train/loss_step=0.00281, global_step=2608.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3253/5971 [33:36<28:04,  1.61it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0933, train/loss_vlb_step=0.000306, train/loss_step=0.0933, global_step=2609.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  54%|█████▍    | 3254/5971 [33:37<28:03,  1.61it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0933, train/loss_vlb_step=0.000306, train/loss_step=0.0933, global_step=2609.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  54%|█████▍    | 3254/5971 [33:37<28:03,  1.61it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0291, train/loss_vlb_step=0.000109, train/loss_step=0.0291, global_step=2609.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3255/5971 [33:38<28:03,  1.61it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0291, train/loss_vlb_step=0.000109, train/loss_step=0.0291, global_step=2609.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3255/5971 [33:38<28:03,  1.61it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.31e-5, train/loss_step=0.0119, global_step=2609.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  55%|█████▍    | 3256/5971 [33:40<28:04,  1.61it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.31e-5, train/loss_step=0.0119, global_step=2609.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3256/5971 [33:40<28:04,  1.61it/s, loss=0.147, v_num=0, train/loss_simple_step=0.258, train/loss_vlb_step=0.000962, train/loss_step=0.258, global_step=2609.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  55%|█████▍    | 3257/5971 [33:41<28:03,  1.61it/s, loss=0.147, v_num=0, train/loss_simple_step=0.258, train/loss_vlb_step=0.000962, train/loss_step=0.258, global_step=2609.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3257/5971 [33:41<28:03,  1.61it/s, loss=0.155, v_num=0, train/loss_simple_step=0.449, train/loss_vlb_step=0.0034, train/loss_step=0.449, global_step=2610.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  55%|█████▍    | 3258/5971 [33:42<28:03,  1.61it/s, loss=0.155, v_num=0, train/loss_simple_step=0.449, train/loss_vlb_step=0.0034, train/loss_step=0.449, global_step=2610.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3258/5971 [33:42<28:03,  1.61it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00188, train/loss_vlb_step=1.14e-5, train/loss_step=0.00188, global_step=2610.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3259/5971 [33:43<28:03,  1.61it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00188, train/loss_vlb_step=1.14e-5, train/loss_step=0.00188, global_step=2610.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3259/5971 [33:43<28:03,  1.61it/s, loss=0.167, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00112, train/loss_step=0.269, global_step=2610.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  55%|█████▍    | 3260/5971 [33:45<28:03,  1.61it/s, loss=0.167, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00112, train/loss_step=0.269, global_step=2610.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3260/5971 [33:45<28:03,  1.61it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0645, train/loss_vlb_step=0.000215, train/loss_step=0.0645, global_step=2610.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3261/5971 [33:46<28:03,  1.61it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0645, train/loss_vlb_step=0.000215, train/loss_step=0.0645, global_step=2610.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3261/5971 [33:46<28:03,  1.61it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00378, train/loss_vlb_step=2.05e-5, train/loss_step=0.00378, global_step=2611.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3262/5971 [33:47<28:03,  1.61it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00378, train/loss_vlb_step=2.05e-5, train/loss_step=0.00378, global_step=2611.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3262/5971 [33:47<28:03,  1.61it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000134, train/loss_step=0.0359, global_step=2611.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  55%|█████▍    | 3263/5971 [33:48<28:02,  1.61it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000134, train/loss_step=0.0359, global_step=2611.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3263/5971 [33:48<28:02,  1.61it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0578, train/loss_vlb_step=0.000191, train/loss_step=0.0578, global_step=2611.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3264/5971 [33:50<28:03,  1.61it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0578, train/loss_vlb_step=0.000191, train/loss_step=0.0578, global_step=2611.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3264/5971 [33:50<28:03,  1.61it/s, loss=0.145, v_num=0, train/loss_simple_step=0.366, train/loss_vlb_step=0.0017, train/loss_step=0.366, global_step=2611.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  55%|█████▍    | 3265/5971 [33:51<28:03,  1.61it/s, loss=0.145, v_num=0, train/loss_simple_step=0.366, train/loss_vlb_step=0.0017, train/loss_step=0.366, global_step=2611.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3265/5971 [33:51<28:03,  1.61it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00948, train/loss_vlb_step=4.63e-5, train/loss_step=0.00948, global_step=2612.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3266/5971 [33:52<28:02,  1.61it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00948, train/loss_vlb_step=4.63e-5, train/loss_step=0.00948, global_step=2612.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3266/5971 [33:52<28:02,  1.61it/s, loss=0.144, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000405, train/loss_step=0.118, global_step=2612.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  55%|█████▍    | 3267/5971 [33:53<28:02,  1.61it/s, loss=0.144, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000405, train/loss_step=0.118, global_step=2612.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3267/5971 [33:53<28:02,  1.61it/s, loss=0.166, v_num=0, train/loss_simple_step=0.610, train/loss_vlb_step=0.00924, train/loss_step=0.610, global_step=2612.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  55%|█████▍    | 3268/5971 [33:55<28:03,  1.61it/s, loss=0.166, v_num=0, train/loss_simple_step=0.610, train/loss_vlb_step=0.00924, train/loss_step=0.610, global_step=2612.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3268/5971 [33:55<28:03,  1.61it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00524, train/loss_vlb_step=2.59e-5, train/loss_step=0.00524, global_step=2612.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3269/5971 [33:56<28:02,  1.61it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00524, train/loss_vlb_step=2.59e-5, train/loss_step=0.00524, global_step=2612.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3269/5971 [33:56<28:02,  1.61it/s, loss=0.16, v_num=0, train/loss_simple_step=0.276, train/loss_vlb_step=0.00179, train/loss_step=0.276, global_step=2613.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]     
Epoch 4:  55%|█████▍    | 3270/5971 [33:57<28:02,  1.61it/s, loss=0.16, v_num=0, train/loss_simple_step=0.276, train/loss_vlb_step=0.00179, train/loss_step=0.276, global_step=2613.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3270/5971 [33:57<28:02,  1.61it/s, loss=0.145, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000816, train/loss_step=0.211, global_step=2613.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3271/5971 [33:58<28:01,  1.61it/s, loss=0.145, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000816, train/loss_step=0.211, global_step=2613.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3271/5971 [33:58<28:01,  1.61it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0857, train/loss_vlb_step=0.000283, train/loss_step=0.0857, global_step=2613.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3272/5971 [34:00<28:02,  1.60it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0857, train/loss_vlb_step=0.000283, train/loss_step=0.0857, global_step=2613.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3272/5971 [34:00<28:02,  1.60it/s, loss=0.158, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000723, train/loss_step=0.201, global_step=2613.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  55%|█████▍    | 3273/5971 [34:01<28:02,  1.60it/s, loss=0.158, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000723, train/loss_step=0.201, global_step=2613.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3273/5971 [34:01<28:02,  1.60it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0254, train/loss_vlb_step=0.000103, train/loss_step=0.0254, global_step=2614.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3274/5971 [34:02<28:01,  1.60it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0254, train/loss_vlb_step=0.000103, train/loss_step=0.0254, global_step=2614.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3274/5971 [34:02<28:01,  1.60it/s, loss=0.184, v_num=0, train/loss_simple_step=0.627, train/loss_vlb_step=0.00974, train/loss_step=0.627, global_step=2614.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  55%|█████▍    | 3275/5971 [34:03<28:01,  1.60it/s, loss=0.184, v_num=0, train/loss_simple_step=0.627, train/loss_vlb_step=0.00974, train/loss_step=0.627, global_step=2614.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3275/5971 [34:03<28:01,  1.60it/s, loss=0.206, v_num=0, train/loss_simple_step=0.437, train/loss_vlb_step=0.00297, train/loss_step=0.437, global_step=2614.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3276/5971 [34:05<28:02,  1.60it/s, loss=0.206, v_num=0, train/loss_simple_step=0.437, train/loss_vlb_step=0.00297, train/loss_step=0.437, global_step=2614.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3276/5971 [34:05<28:02,  1.60it/s, loss=0.201, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000553, train/loss_step=0.168, global_step=2614.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3277/5971 [34:06<28:01,  1.60it/s, loss=0.201, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000553, train/loss_step=0.168, global_step=2614.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3277/5971 [34:06<28:01,  1.60it/s, loss=0.19, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000781, train/loss_step=0.222, global_step=2615.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  55%|█████▍    | 3278/5971 [34:07<28:01,  1.60it/s, loss=0.19, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000781, train/loss_step=0.222, global_step=2615.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3278/5971 [34:07<28:01,  1.60it/s, loss=0.2, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000845, train/loss_step=0.217, global_step=2615.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  55%|█████▍    | 3279/5971 [34:08<28:00,  1.60it/s, loss=0.2, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000845, train/loss_step=0.217, global_step=2615.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3279/5971 [34:08<28:00,  1.60it/s, loss=0.187, v_num=0, train/loss_simple_step=0.00167, train/loss_vlb_step=1e-5, train/loss_step=0.00167, global_step=2615.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3280/5971 [34:10<28:01,  1.60it/s, loss=0.187, v_num=0, train/loss_simple_step=0.00167, train/loss_vlb_step=1e-5, train/loss_step=0.00167, global_step=2615.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3280/5971 [34:10<28:01,  1.60it/s, loss=0.184, v_num=0, train/loss_simple_step=0.00861, train/loss_vlb_step=4.07e-5, train/loss_step=0.00861, global_step=2615.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3281/5971 [34:11<28:01,  1.60it/s, loss=0.184, v_num=0, train/loss_simple_step=0.00861, train/loss_vlb_step=4.07e-5, train/loss_step=0.00861, global_step=2615.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3281/5971 [34:11<28:01,  1.60it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0464, train/loss_vlb_step=0.000165, train/loss_step=0.0464, global_step=2616.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  55%|█████▍    | 3282/5971 [34:12<28:00,  1.60it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0464, train/loss_vlb_step=0.000165, train/loss_step=0.0464, global_step=2616.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3282/5971 [34:12<28:00,  1.60it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0212, train/loss_vlb_step=8.5e-5, train/loss_step=0.0212, global_step=2616.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  55%|█████▍    | 3283/5971 [34:13<28:00,  1.60it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0212, train/loss_vlb_step=8.5e-5, train/loss_step=0.0212, global_step=2616.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3283/5971 [34:13<28:00,  1.60it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00324, train/loss_vlb_step=1.77e-5, train/loss_step=0.00324, global_step=2616.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3284/5971 [34:15<28:01,  1.60it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00324, train/loss_vlb_step=1.77e-5, train/loss_step=0.00324, global_step=2616.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▍    | 3284/5971 [34:15<28:01,  1.60it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00374, train/loss_vlb_step=2e-5, train/loss_step=0.00374, global_step=2616.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  55%|█████▌    | 3285/5971 [34:16<28:01,  1.60it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00374, train/loss_vlb_step=2e-5, train/loss_step=0.00374, global_step=2616.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3285/5971 [34:16<28:01,  1.60it/s, loss=0.165, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.55e-5, train/loss_step=0.018, global_step=2617.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  55%|█████▌    | 3286/5971 [34:17<28:00,  1.60it/s, loss=0.165, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.55e-5, train/loss_step=0.018, global_step=2617.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3286/5971 [34:17<28:00,  1.60it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00167, train/loss_vlb_step=1.01e-5, train/loss_step=0.00167, global_step=2617.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3287/5971 [34:18<28:00,  1.60it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00167, train/loss_vlb_step=1.01e-5, train/loss_step=0.00167, global_step=2617.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3287/5971 [34:18<28:00,  1.60it/s, loss=0.144, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00157, train/loss_step=0.305, global_step=2617.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  55%|█████▌    | 3288/5971 [34:20<28:00,  1.60it/s, loss=0.144, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00157, train/loss_step=0.305, global_step=2617.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3288/5971 [34:20<28:00,  1.60it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0178, train/loss_vlb_step=6.91e-5, train/loss_step=0.0178, global_step=2617.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3289/5971 [34:21<28:00,  1.60it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0178, train/loss_vlb_step=6.91e-5, train/loss_step=0.0178, global_step=2617.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3289/5971 [34:21<28:00,  1.60it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=4.93e-5, train/loss_step=0.0112, global_step=2618.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3290/5971 [34:22<28:00,  1.60it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=4.93e-5, train/loss_step=0.0112, global_step=2618.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3290/5971 [34:22<28:00,  1.60it/s, loss=0.129, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000509, train/loss_step=0.153, global_step=2618.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  55%|█████▌    | 3291/5971 [34:23<27:59,  1.60it/s, loss=0.129, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000509, train/loss_step=0.153, global_step=2618.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3291/5971 [34:23<27:59,  1.60it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0488, train/loss_vlb_step=0.000177, train/loss_step=0.0488, global_step=2618.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3292/5971 [34:25<28:00,  1.59it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0488, train/loss_vlb_step=0.000177, train/loss_step=0.0488, global_step=2618.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3292/5971 [34:25<28:00,  1.59it/s, loss=0.122, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000351, train/loss_step=0.105, global_step=2618.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  55%|█████▌    | 3293/5971 [34:26<28:00,  1.59it/s, loss=0.122, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000351, train/loss_step=0.105, global_step=2618.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3293/5971 [34:26<28:00,  1.59it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00344, train/loss_vlb_step=1.85e-5, train/loss_step=0.00344, global_step=2619.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3294/5971 [34:27<27:59,  1.59it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00344, train/loss_vlb_step=1.85e-5, train/loss_step=0.00344, global_step=2619.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3294/5971 [34:27<27:59,  1.59it/s, loss=0.0898, v_num=0, train/loss_simple_step=0.00122, train/loss_vlb_step=7.38e-6, train/loss_step=0.00122, global_step=2619.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3295/5971 [34:28<27:59,  1.59it/s, loss=0.0898, v_num=0, train/loss_simple_step=0.00122, train/loss_vlb_step=7.38e-6, train/loss_step=0.00122, global_step=2619.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3295/5971 [34:28<27:59,  1.59it/s, loss=0.0704, v_num=0, train/loss_simple_step=0.0502, train/loss_vlb_step=0.000181, train/loss_step=0.0502, global_step=2619.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  55%|█████▌    | 3296/5971 [34:31<28:00,  1.59it/s, loss=0.0704, v_num=0, train/loss_simple_step=0.0502, train/loss_vlb_step=0.000181, train/loss_step=0.0502, global_step=2619.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3296/5971 [34:31<28:00,  1.59it/s, loss=0.0906, v_num=0, train/loss_simple_step=0.571, train/loss_vlb_step=0.00807, train/loss_step=0.571, global_step=2619.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  55%|█████▌    | 3297/5971 [34:32<28:00,  1.59it/s, loss=0.0906, v_num=0, train/loss_simple_step=0.571, train/loss_vlb_step=0.00807, train/loss_step=0.571, global_step=2619.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3297/5971 [34:32<28:00,  1.59it/s, loss=0.0836, v_num=0, train/loss_simple_step=0.0826, train/loss_vlb_step=0.000273, train/loss_step=0.0826, global_step=2620.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3298/5971 [34:33<27:59,  1.59it/s, loss=0.0836, v_num=0, train/loss_simple_step=0.0826, train/loss_vlb_step=0.000273, train/loss_step=0.0826, global_step=2620.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3298/5971 [34:33<27:59,  1.59it/s, loss=0.0777, v_num=0, train/loss_simple_step=0.0987, train/loss_vlb_step=0.000324, train/loss_step=0.0987, global_step=2620.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3299/5971 [34:34<27:59,  1.59it/s, loss=0.0777, v_num=0, train/loss_simple_step=0.0987, train/loss_vlb_step=0.000324, train/loss_step=0.0987, global_step=2620.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3299/5971 [34:34<27:59,  1.59it/s, loss=0.078, v_num=0, train/loss_simple_step=0.00864, train/loss_vlb_step=4.01e-5, train/loss_step=0.00864, global_step=2620.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3300/5971 [34:36<28:00,  1.59it/s, loss=0.078, v_num=0, train/loss_simple_step=0.00864, train/loss_vlb_step=4.01e-5, train/loss_step=0.00864, global_step=2620.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3300/5971 [34:36<28:00,  1.59it/s, loss=0.0781, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.85e-5, train/loss_step=0.0105, global_step=2620.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  55%|█████▌    | 3301/5971 [34:37<28:00,  1.59it/s, loss=0.0781, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.85e-5, train/loss_step=0.0105, global_step=2620.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3301/5971 [34:37<28:00,  1.59it/s, loss=0.0828, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.00047, train/loss_step=0.141, global_step=2621.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  55%|█████▌    | 3302/5971 [34:38<27:59,  1.59it/s, loss=0.0828, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.00047, train/loss_step=0.141, global_step=2621.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3302/5971 [34:38<27:59,  1.59it/s, loss=0.0856, v_num=0, train/loss_simple_step=0.0761, train/loss_vlb_step=0.000259, train/loss_step=0.0761, global_step=2621.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3303/5971 [34:39<27:59,  1.59it/s, loss=0.0856, v_num=0, train/loss_simple_step=0.0761, train/loss_vlb_step=0.000259, train/loss_step=0.0761, global_step=2621.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3303/5971 [34:39<27:59,  1.59it/s, loss=0.0943, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000606, train/loss_step=0.177, global_step=2621.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  55%|█████▌    | 3304/5971 [34:42<28:00,  1.59it/s, loss=0.0943, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000606, train/loss_step=0.177, global_step=2621.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3304/5971 [34:42<28:00,  1.59it/s, loss=0.0955, v_num=0, train/loss_simple_step=0.0282, train/loss_vlb_step=0.00011, train/loss_step=0.0282, global_step=2621.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3305/5971 [34:43<27:59,  1.59it/s, loss=0.0955, v_num=0, train/loss_simple_step=0.0282, train/loss_vlb_step=0.00011, train/loss_step=0.0282, global_step=2621.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3305/5971 [34:43<27:59,  1.59it/s, loss=0.105, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000805, train/loss_step=0.204, global_step=2622.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  55%|█████▌    | 3306/5971 [34:43<27:59,  1.59it/s, loss=0.105, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000805, train/loss_step=0.204, global_step=2622.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3306/5971 [34:43<27:59,  1.59it/s, loss=0.105, v_num=0, train/loss_simple_step=0.00331, train/loss_vlb_step=1.75e-5, train/loss_step=0.00331, global_step=2622.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3307/5971 [34:44<27:59,  1.59it/s, loss=0.105, v_num=0, train/loss_simple_step=0.00331, train/loss_vlb_step=1.75e-5, train/loss_step=0.00331, global_step=2622.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3307/5971 [34:44<27:59,  1.59it/s, loss=0.0897, v_num=0, train/loss_simple_step=0.00232, train/loss_vlb_step=1.36e-5, train/loss_step=0.00232, global_step=2622.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3308/5971 [34:47<27:59,  1.59it/s, loss=0.0897, v_num=0, train/loss_simple_step=0.00232, train/loss_vlb_step=1.36e-5, train/loss_step=0.00232, global_step=2622.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3308/5971 [34:47<27:59,  1.59it/s, loss=0.0891, v_num=0, train/loss_simple_step=0.00497, train/loss_vlb_step=2.54e-5, train/loss_step=0.00497, global_step=2622.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3309/5971 [34:47<27:59,  1.59it/s, loss=0.0891, v_num=0, train/loss_simple_step=0.00497, train/loss_vlb_step=2.54e-5, train/loss_step=0.00497, global_step=2622.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3309/5971 [34:47<27:59,  1.59it/s, loss=0.0959, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000486, train/loss_step=0.148, global_step=2623.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  55%|█████▌    | 3310/5971 [34:48<27:58,  1.59it/s, loss=0.0959, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000486, train/loss_step=0.148, global_step=2623.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3310/5971 [34:48<27:58,  1.59it/s, loss=0.0886, v_num=0, train/loss_simple_step=0.00754, train/loss_vlb_step=3.47e-5, train/loss_step=0.00754, global_step=2623.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3311/5971 [34:49<27:58,  1.58it/s, loss=0.0886, v_num=0, train/loss_simple_step=0.00754, train/loss_vlb_step=3.47e-5, train/loss_step=0.00754, global_step=2623.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3311/5971 [34:49<27:58,  1.58it/s, loss=0.0907, v_num=0, train/loss_simple_step=0.0901, train/loss_vlb_step=0.000298, train/loss_step=0.0901, global_step=2623.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  55%|█████▌    | 3312/5971 [34:51<27:58,  1.58it/s, loss=0.0907, v_num=0, train/loss_simple_step=0.0901, train/loss_vlb_step=0.000298, train/loss_step=0.0901, global_step=2623.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3312/5971 [34:51<27:58,  1.58it/s, loss=0.0855, v_num=0, train/loss_simple_step=0.00213, train/loss_vlb_step=1.21e-5, train/loss_step=0.00213, global_step=2623.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3313/5971 [34:52<27:58,  1.58it/s, loss=0.0855, v_num=0, train/loss_simple_step=0.00213, train/loss_vlb_step=1.21e-5, train/loss_step=0.00213, global_step=2623.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  55%|█████▌    | 3313/5971 [34:52<27:58,  1.58it/s, loss=0.087, v_num=0, train/loss_simple_step=0.0324, train/loss_vlb_step=0.000126, train/loss_step=0.0324, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  56%|█████▌    | 3314/5971 [34:53<27:58,  1.58it/s, loss=0.087, v_num=0, train/loss_simple_step=0.0324, train/loss_vlb_step=0.000126, train/loss_step=0.0324, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  56%|█████▌    | 3314/5971 [34:53<27:58,  1.58it/s, loss=0.0924, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000365, train/loss_step=0.109, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  56%|█████▌    | 3315/5971 [34:54<27:57,  1.58it/s, loss=0.0924, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000365, train/loss_step=0.109, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  56%|█████▌    | 3315/5971 [34:54<27:57,  1.58it/s, loss=0.0905, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.34e-5, train/loss_step=0.0126, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  56%|█████▌    | 3316/5971 [34:57<27:58,  1.58it/s, loss=0.0905, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.34e-5, train/loss_step=0.0126, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  56%|█████▌    | 3316/5971 [34:57<27:58,  1.58it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:26,  1.91it/s]
Epoch 4:  56%|█████▌    | 3318/5971 [34:57<27:56,  1.58it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   1%|          | 2/167 [00:00<01:11,  2.31it/s]
Epoch 4:  56%|█████▌    | 3320/5971 [34:58<27:54,  1.58it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   3%|▎         | 5/167 [00:01<00:24,  6.60it/s]
Epoch 4:  56%|█████▌    | 3323/5971 [34:58<27:51,  1.58it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:01<00:15, 10.21it/s]
Epoch 4:  56%|█████▌    | 3326/5971 [34:58<27:48,  1.59it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   7%|▋         | 11/167 [00:01<00:11, 13.51it/s]
Epoch 4:  56%|█████▌    | 3329/5971 [34:58<27:44,  1.59it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   8%|▊         | 14/167 [00:01<00:09, 15.69it/s]
Epoch 4:  56%|█████▌    | 3332/5971 [34:58<27:41,  1.59it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  10%|█         | 17/167 [00:01<00:08, 17.55it/s]
Epoch 4:  56%|█████▌    | 3335/5971 [34:58<27:38,  1.59it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  12%|█▏        | 20/167 [00:01<00:07, 19.52it/s]
Epoch 4:  56%|█████▌    | 3338/5971 [34:58<27:35,  1.59it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  14%|█▍        | 23/167 [00:01<00:07, 20.19it/s]
Epoch 4:  56%|█████▌    | 3341/5971 [34:59<27:31,  1.59it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  16%|█▌        | 26/167 [00:01<00:06, 20.75it/s]
Epoch 4:  56%|█████▌    | 3344/5971 [34:59<27:28,  1.59it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  17%|█▋        | 29/167 [00:02<00:06, 20.69it/s]
Epoch 4:  56%|█████▌    | 3347/5971 [34:59<27:25,  1.59it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  19%|█▉        | 32/167 [00:02<00:06, 21.97it/s]
Epoch 4:  56%|█████▌    | 3350/5971 [34:59<27:22,  1.60it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  21%|██        | 35/167 [00:02<00:05, 22.06it/s]
Epoch 4:  56%|█████▌    | 3353/5971 [34:59<27:18,  1.60it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  23%|██▎       | 39/167 [00:02<00:05, 23.57it/s]
Epoch 4:  56%|█████▌    | 3356/5971 [34:59<27:15,  1.60it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  25%|██▌       | 42/167 [00:02<00:05, 24.38it/s]
Epoch 4:  56%|█████▋    | 3359/5971 [34:59<27:12,  1.60it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 25.29it/s]
Epoch 4:  56%|█████▋    | 3362/5971 [34:59<27:09,  1.60it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 24.62it/s]
Epoch 4:  56%|█████▋    | 3365/5971 [35:00<27:05,  1.60it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  31%|███       | 51/167 [00:02<00:04, 23.53it/s]
Epoch 4:  56%|█████▋    | 3368/5971 [35:00<27:02,  1.60it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  32%|███▏      | 54/167 [00:03<00:04, 25.00it/s]
Epoch 4:  56%|█████▋    | 3371/5971 [35:00<26:59,  1.61it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  35%|███▍      | 58/167 [00:03<00:04, 26.47it/s]
Epoch 4:  57%|█████▋    | 3375/5971 [35:00<26:55,  1.61it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  37%|███▋      | 61/167 [00:03<00:04, 25.73it/s]
Epoch 4:  57%|█████▋    | 3379/5971 [35:00<26:50,  1.61it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  38%|███▊      | 64/167 [00:03<00:03, 26.40it/s]
Epoch 4:  57%|█████▋    | 3383/5971 [35:00<26:46,  1.61it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  40%|████      | 67/167 [00:03<00:03, 26.49it/s]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 26.17it/s]
Epoch 4:  57%|█████▋    | 3387/5971 [35:00<26:42,  1.61it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 25.84it/s]
Epoch 4:  57%|█████▋    | 3391/5971 [35:01<26:38,  1.61it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 26.14it/s]
Epoch 4:  57%|█████▋    | 3395/5971 [35:01<26:33,  1.62it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  47%|████▋     | 79/167 [00:04<00:03, 26.47it/s]

Validating:  49%|████▉     | 82/167 [00:04<00:03, 26.32it/s]
Epoch 4:  57%|█████▋    | 3399/5971 [35:01<26:29,  1.62it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  51%|█████     | 85/167 [00:04<00:03, 26.46it/s]
Epoch 4:  57%|█████▋    | 3403/5971 [35:01<26:25,  1.62it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  53%|█████▎    | 88/167 [00:04<00:02, 26.75it/s]
Epoch 4:  57%|█████▋    | 3407/5971 [35:01<26:21,  1.62it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  54%|█████▍    | 91/167 [00:04<00:02, 26.20it/s]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 26.27it/s]
Epoch 4:  57%|█████▋    | 3411/5971 [35:01<26:16,  1.62it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 26.73it/s]
Epoch 4:  57%|█████▋    | 3415/5971 [35:01<26:12,  1.63it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 25.99it/s]
Epoch 4:  57%|█████▋    | 3419/5971 [35:02<26:08,  1.63it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 25.96it/s]

Validating:  63%|██████▎   | 106/167 [00:05<00:02, 25.65it/s]
Epoch 4:  57%|█████▋    | 3423/5971 [35:02<26:04,  1.63it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  65%|██████▌   | 109/167 [00:05<00:02, 26.59it/s]
Epoch 4:  57%|█████▋    | 3427/5971 [35:02<26:00,  1.63it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  67%|██████▋   | 112/167 [00:05<00:02, 26.07it/s]
Epoch 4:  57%|█████▋    | 3431/5971 [35:02<25:56,  1.63it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  69%|██████▉   | 115/167 [00:05<00:02, 25.87it/s]

Validating:  71%|███████   | 118/167 [00:05<00:01, 26.13it/s]
Epoch 4:  58%|█████▊    | 3435/5971 [35:02<25:51,  1.63it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 24.24it/s]
Epoch 4:  58%|█████▊    | 3439/5971 [35:02<25:47,  1.64it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 23.40it/s]
Epoch 4:  58%|█████▊    | 3443/5971 [35:03<25:43,  1.64it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 24.26it/s]

Validating:  78%|███████▊  | 130/167 [00:06<00:01, 24.40it/s]
Epoch 4:  58%|█████▊    | 3447/5971 [35:03<25:39,  1.64it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  80%|███████▉  | 133/167 [00:06<00:01, 24.38it/s]
Epoch 4:  58%|█████▊    | 3451/5971 [35:03<25:35,  1.64it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  81%|████████▏ | 136/167 [00:06<00:01, 24.02it/s]
Epoch 4:  58%|█████▊    | 3455/5971 [35:03<25:31,  1.64it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  83%|████████▎ | 139/167 [00:06<00:01, 24.78it/s]

Validating:  85%|████████▌ | 142/167 [00:06<00:00, 25.33it/s]
Epoch 4:  58%|█████▊    | 3459/5971 [35:03<25:27,  1.64it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 25.84it/s]
Epoch 4:  58%|█████▊    | 3463/5971 [35:03<25:23,  1.65it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 26.71it/s]
Epoch 4:  58%|█████▊    | 3467/5971 [35:04<25:19,  1.65it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 26.28it/s]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 26.15it/s]
Epoch 4:  58%|█████▊    | 3471/5971 [35:04<25:15,  1.65it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  94%|█████████▍| 157/167 [00:07<00:00, 25.71it/s]
Epoch 4:  58%|█████▊    | 3475/5971 [35:04<25:11,  1.65it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  96%|█████████▌| 160/167 [00:07<00:00, 25.01it/s]
Epoch 4:  58%|█████▊    | 3479/5971 [35:04<25:07,  1.65it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  98%|█████████▊| 163/167 [00:07<00:00, 25.13it/s]

Validating:  99%|█████████▉| 166/167 [00:07<00:00, 25.51it/s]
Epoch 4:  58%|█████▊    | 3483/5971 [35:04<25:02,  1.66it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  58%|█████▊    | 3484/5971 [35:05<25:02,  1.66it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000103, train/loss_step=0.026, global_step=2624.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Epoch 4:  58%|█████▊    | 3485/5971 [35:06<25:02,  1.66it/s, loss=0.0595, v_num=0, train/loss_simple_step=0.00812, train/loss_vlb_step=3.9e-5, train/loss_step=0.00812, global_step=2625.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  58%|█████▊    | 3486/5971 [35:07<25:01,  1.65it/s, loss=0.0699, v_num=0, train/loss_simple_step=0.307, train/loss_vlb_step=0.00121, train/loss_step=0.307, global_step=2625.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  58%|█████▊    | 3487/5971 [35:08<25:01,  1.65it/s, loss=0.0699, v_num=0, train/loss_simple_step=0.307, train/loss_vlb_step=0.00121, train/loss_step=0.307, global_step=2625.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  58%|█████▊    | 3487/5971 [35:08<25:01,  1.65it/s, loss=0.0792, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000709, train/loss_step=0.194, global_step=2625.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  58%|█████▊    | 3488/5971 [35:10<25:02,  1.65it/s, loss=0.0924, v_num=0, train/loss_simple_step=0.275, train/loss_vlb_step=0.00111, train/loss_step=0.275, global_step=2625.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  58%|█████▊    | 3489/5971 [35:11<25:01,  1.65it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.0287, train/loss_vlb_step=0.00011, train/loss_step=0.0287, global_step=2626.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  58%|█████▊    | 3490/5971 [35:12<25:01,  1.65it/s, loss=0.0834, v_num=0, train/loss_simple_step=0.00659, train/loss_vlb_step=3e-5, train/loss_step=0.00659, global_step=2626.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  58%|█████▊    | 3491/5971 [35:13<25:01,  1.65it/s, loss=0.0834, v_num=0, train/loss_simple_step=0.00659, train/loss_vlb_step=3e-5, train/loss_step=0.00659, global_step=2626.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  58%|█████▊    | 3491/5971 [35:13<25:01,  1.65it/s, loss=0.0789, v_num=0, train/loss_simple_step=0.087, train/loss_vlb_step=0.000286, train/loss_step=0.087, global_step=2626.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  58%|█████▊    | 3492/5971 [35:15<25:01,  1.65it/s, loss=0.0831, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000372, train/loss_step=0.112, global_step=2626.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  58%|█████▊    | 3493/5971 [35:16<25:01,  1.65it/s, loss=0.0761, v_num=0, train/loss_simple_step=0.0644, train/loss_vlb_step=0.000214, train/loss_step=0.0644, global_step=2627.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▊    | 3494/5971 [35:17<25:00,  1.65it/s, loss=0.0788, v_num=0, train/loss_simple_step=0.0568, train/loss_vlb_step=0.000193, train/loss_step=0.0568, global_step=2627.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▊    | 3495/5971 [35:18<25:00,  1.65it/s, loss=0.0788, v_num=0, train/loss_simple_step=0.0568, train/loss_vlb_step=0.000193, train/loss_step=0.0568, global_step=2627.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▊    | 3495/5971 [35:18<25:00,  1.65it/s, loss=0.0798, v_num=0, train/loss_simple_step=0.0224, train/loss_vlb_step=8.48e-5, train/loss_step=0.0224, global_step=2627.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  59%|█████▊    | 3496/5971 [35:20<25:00,  1.65it/s, loss=0.0798, v_num=0, train/loss_simple_step=0.00619, train/loss_vlb_step=2.99e-5, train/loss_step=0.00619, global_step=2627.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▊    | 3497/5971 [35:21<25:00,  1.65it/s, loss=0.0786, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000409, train/loss_step=0.124, global_step=2628.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  59%|█████▊    | 3498/5971 [35:22<25:00,  1.65it/s, loss=0.08, v_num=0, train/loss_simple_step=0.034, train/loss_vlb_step=0.000122, train/loss_step=0.034, global_step=2628.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  59%|█████▊    | 3499/5971 [35:23<24:59,  1.65it/s, loss=0.08, v_num=0, train/loss_simple_step=0.034, train/loss_vlb_step=0.000122, train/loss_step=0.034, global_step=2628.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▊    | 3499/5971 [35:23<24:59,  1.65it/s, loss=0.0775, v_num=0, train/loss_simple_step=0.0407, train/loss_vlb_step=0.000153, train/loss_step=0.0407, global_step=2628.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▊    | 3500/5971 [35:25<25:00,  1.65it/s, loss=0.0775, v_num=0, train/loss_simple_step=0.00246, train/loss_vlb_step=1.46e-5, train/loss_step=0.00246, global_step=2628.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▊    | 3501/5971 [35:26<24:59,  1.65it/s, loss=0.0787, v_num=0, train/loss_simple_step=0.0561, train/loss_vlb_step=0.00019, train/loss_step=0.0561, global_step=2629.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  59%|█████▊    | 3502/5971 [35:27<24:59,  1.65it/s, loss=0.0826, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000651, train/loss_step=0.188, global_step=2629.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  59%|█████▊    | 3503/5971 [35:28<24:59,  1.65it/s, loss=0.0826, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000651, train/loss_step=0.188, global_step=2629.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▊    | 3503/5971 [35:28<24:59,  1.65it/s, loss=0.104, v_num=0, train/loss_simple_step=0.444, train/loss_vlb_step=0.00304, train/loss_step=0.444, global_step=2629.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  59%|█████▊    | 3504/5971 [35:31<25:00,  1.64it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0972, train/loss_vlb_step=0.000322, train/loss_step=0.0972, global_step=2629.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▊    | 3505/5971 [35:32<24:59,  1.64it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0196, train/loss_vlb_step=8.22e-5, train/loss_step=0.0196, global_step=2630.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  59%|█████▊    | 3506/5971 [35:33<24:59,  1.64it/s, loss=0.0938, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.52e-5, train/loss_step=0.0157, global_step=2630.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▊    | 3507/5971 [35:34<24:58,  1.64it/s, loss=0.0938, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.52e-5, train/loss_step=0.0157, global_step=2630.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▊    | 3507/5971 [35:34<24:58,  1.64it/s, loss=0.0919, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000536, train/loss_step=0.158, global_step=2630.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  59%|█████▉    | 3508/5971 [35:36<24:59,  1.64it/s, loss=0.0839, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000374, train/loss_step=0.114, global_step=2630.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3509/5971 [35:37<24:59,  1.64it/s, loss=0.0852, v_num=0, train/loss_simple_step=0.0551, train/loss_vlb_step=0.000189, train/loss_step=0.0551, global_step=2631.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3510/5971 [35:38<24:58,  1.64it/s, loss=0.0852, v_num=0, train/loss_simple_step=0.00694, train/loss_vlb_step=3.45e-5, train/loss_step=0.00694, global_step=2631.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3511/5971 [35:38<24:58,  1.64it/s, loss=0.0852, v_num=0, train/loss_simple_step=0.00694, train/loss_vlb_step=3.45e-5, train/loss_step=0.00694, global_step=2631.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3511/5971 [35:38<24:58,  1.64it/s, loss=0.0901, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000665, train/loss_step=0.184, global_step=2631.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  59%|█████▉    | 3512/5971 [35:41<24:58,  1.64it/s, loss=0.0856, v_num=0, train/loss_simple_step=0.023, train/loss_vlb_step=9.24e-5, train/loss_step=0.023, global_step=2631.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  59%|█████▉    | 3513/5971 [35:42<24:58,  1.64it/s, loss=0.0835, v_num=0, train/loss_simple_step=0.0215, train/loss_vlb_step=8.74e-5, train/loss_step=0.0215, global_step=2632.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3514/5971 [35:43<24:58,  1.64it/s, loss=0.0815, v_num=0, train/loss_simple_step=0.0177, train/loss_vlb_step=7.32e-5, train/loss_step=0.0177, global_step=2632.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3515/5971 [35:44<24:57,  1.64it/s, loss=0.0815, v_num=0, train/loss_simple_step=0.0177, train/loss_vlb_step=7.32e-5, train/loss_step=0.0177, global_step=2632.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3515/5971 [35:44<24:57,  1.64it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000419, train/loss_step=0.127, global_step=2632.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  59%|█████▉    | 3516/5971 [35:46<24:58,  1.64it/s, loss=0.0905, v_num=0, train/loss_simple_step=0.080, train/loss_vlb_step=0.000276, train/loss_step=0.080, global_step=2632.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3517/5971 [35:47<24:57,  1.64it/s, loss=0.116, v_num=0, train/loss_simple_step=0.636, train/loss_vlb_step=0.00715, train/loss_step=0.636, global_step=2633.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  59%|█████▉    | 3518/5971 [35:48<24:57,  1.64it/s, loss=0.125, v_num=0, train/loss_simple_step=0.215, train/loss_vlb_step=0.000754, train/loss_step=0.215, global_step=2633.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3519/5971 [35:49<24:57,  1.64it/s, loss=0.125, v_num=0, train/loss_simple_step=0.215, train/loss_vlb_step=0.000754, train/loss_step=0.215, global_step=2633.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3519/5971 [35:49<24:57,  1.64it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00748, train/loss_vlb_step=3.43e-5, train/loss_step=0.00748, global_step=2633.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3520/5971 [35:51<24:57,  1.64it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00224, train/loss_vlb_step=1.23e-5, train/loss_step=0.00224, global_step=2633.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3521/5971 [35:52<24:57,  1.64it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0041, train/loss_vlb_step=2.2e-5, train/loss_step=0.0041, global_step=2634.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  59%|█████▉    | 3522/5971 [35:53<24:56,  1.64it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00189, train/loss_vlb_step=1.11e-5, train/loss_step=0.00189, global_step=2634.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3523/5971 [35:54<24:56,  1.64it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00189, train/loss_vlb_step=1.11e-5, train/loss_step=0.00189, global_step=2634.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3523/5971 [35:54<24:56,  1.64it/s, loss=0.0984, v_num=0, train/loss_simple_step=0.182, train/loss_vlb_step=0.000636, train/loss_step=0.182, global_step=2634.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  59%|█████▉    | 3524/5971 [35:56<24:57,  1.63it/s, loss=0.094, v_num=0, train/loss_simple_step=0.00888, train/loss_vlb_step=4.36e-5, train/loss_step=0.00888, global_step=2634.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3525/5971 [35:57<24:56,  1.63it/s, loss=0.0932, v_num=0, train/loss_simple_step=0.00262, train/loss_vlb_step=1.48e-5, train/loss_step=0.00262, global_step=2635.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3526/5971 [35:58<24:56,  1.63it/s, loss=0.104, v_num=0, train/loss_simple_step=0.241, train/loss_vlb_step=0.00084, train/loss_step=0.241, global_step=2635.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]     
Epoch 4:  59%|█████▉    | 3527/5971 [35:59<24:55,  1.63it/s, loss=0.104, v_num=0, train/loss_simple_step=0.241, train/loss_vlb_step=0.00084, train/loss_step=0.241, global_step=2635.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3527/5971 [35:59<24:55,  1.63it/s, loss=0.104, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000462, train/loss_step=0.140, global_step=2635.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3528/5971 [36:01<24:56,  1.63it/s, loss=0.0981, v_num=0, train/loss_simple_step=0.00369, train/loss_vlb_step=2.02e-5, train/loss_step=0.00369, global_step=2635.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3529/5971 [36:02<24:56,  1.63it/s, loss=0.114, v_num=0, train/loss_simple_step=0.373, train/loss_vlb_step=0.0024, train/loss_step=0.373, global_step=2636.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]      
Epoch 4:  59%|█████▉    | 3530/5971 [36:03<24:55,  1.63it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0614, train/loss_vlb_step=0.000214, train/loss_step=0.0614, global_step=2636.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3531/5971 [36:04<24:55,  1.63it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0614, train/loss_vlb_step=0.000214, train/loss_step=0.0614, global_step=2636.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3531/5971 [36:04<24:55,  1.63it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0163, train/loss_vlb_step=6.53e-5, train/loss_step=0.0163, global_step=2636.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  59%|█████▉    | 3532/5971 [36:06<24:55,  1.63it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0417, train/loss_vlb_step=0.00015, train/loss_step=0.0417, global_step=2636.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3533/5971 [36:07<24:55,  1.63it/s, loss=0.119, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000939, train/loss_step=0.214, global_step=2637.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  59%|█████▉    | 3534/5971 [36:08<24:54,  1.63it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00169, train/loss_vlb_step=1e-5, train/loss_step=0.00169, global_step=2637.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3535/5971 [36:09<24:54,  1.63it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00169, train/loss_vlb_step=1e-5, train/loss_step=0.00169, global_step=2637.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3535/5971 [36:09<24:54,  1.63it/s, loss=0.152, v_num=0, train/loss_simple_step=0.799, train/loss_vlb_step=0.101, train/loss_step=0.799, global_step=2637.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  59%|█████▉    | 3536/5971 [36:11<24:54,  1.63it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0424, train/loss_vlb_step=0.000155, train/loss_step=0.0424, global_step=2637.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3537/5971 [36:12<24:54,  1.63it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0106, train/loss_vlb_step=4.67e-5, train/loss_step=0.0106, global_step=2638.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3538/5971 [36:13<24:54,  1.63it/s, loss=0.113, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.00034, train/loss_step=0.103, global_step=2638.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  59%|█████▉    | 3539/5971 [36:14<24:53,  1.63it/s, loss=0.113, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.00034, train/loss_step=0.103, global_step=2638.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3539/5971 [36:14<24:53,  1.63it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00455, train/loss_vlb_step=2.39e-5, train/loss_step=0.00455, global_step=2638.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3540/5971 [36:16<24:54,  1.63it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00948, train/loss_vlb_step=4.33e-5, train/loss_step=0.00948, global_step=2638.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3541/5971 [36:17<24:53,  1.63it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00722, train/loss_vlb_step=3.62e-5, train/loss_step=0.00722, global_step=2639.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3542/5971 [36:18<24:53,  1.63it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0901, train/loss_vlb_step=0.000296, train/loss_step=0.0901, global_step=2639.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  59%|█████▉    | 3543/5971 [36:19<24:52,  1.63it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0901, train/loss_vlb_step=0.000296, train/loss_step=0.0901, global_step=2639.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3543/5971 [36:19<24:52,  1.63it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00774, train/loss_vlb_step=3.67e-5, train/loss_step=0.00774, global_step=2639.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3544/5971 [36:21<24:53,  1.63it/s, loss=0.121, v_num=0, train/loss_simple_step=0.258, train/loss_vlb_step=0.000991, train/loss_step=0.258, global_step=2639.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  59%|█████▉    | 3545/5971 [36:22<24:53,  1.62it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00298, train/loss_vlb_step=1.65e-5, train/loss_step=0.00298, global_step=2640.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3546/5971 [36:23<24:52,  1.62it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00232, train/loss_vlb_step=1.33e-5, train/loss_step=0.00232, global_step=2640.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  59%|█████▉    | 3547/5971 [36:24<24:52,  1.62it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00232, train/loss_vlb_step=1.33e-5, train/loss_step=0.00232, global_step=2640.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3547/5971 [36:24<24:52,  1.62it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0158, train/loss_vlb_step=6.4e-5, train/loss_step=0.0158, global_step=2640.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  59%|█████▉    | 3548/5971 [36:26<24:52,  1.62it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00243, train/loss_vlb_step=1.38e-5, train/loss_step=0.00243, global_step=2640.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3549/5971 [36:27<24:52,  1.62it/s, loss=0.0976, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.00105, train/loss_step=0.262, global_step=2641.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  59%|█████▉    | 3550/5971 [36:28<24:51,  1.62it/s, loss=0.0966, v_num=0, train/loss_simple_step=0.0413, train/loss_vlb_step=0.00015, train/loss_step=0.0413, global_step=2641.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3551/5971 [36:29<24:51,  1.62it/s, loss=0.0966, v_num=0, train/loss_simple_step=0.0413, train/loss_vlb_step=0.00015, train/loss_step=0.0413, global_step=2641.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3551/5971 [36:29<24:51,  1.62it/s, loss=0.096, v_num=0, train/loss_simple_step=0.00421, train/loss_vlb_step=2.15e-5, train/loss_step=0.00421, global_step=2641.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  59%|█████▉    | 3552/5971 [36:31<24:51,  1.62it/s, loss=0.0944, v_num=0, train/loss_simple_step=0.00905, train/loss_vlb_step=4.3e-5, train/loss_step=0.00905, global_step=2641.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  60%|█████▉    | 3553/5971 [36:32<24:51,  1.62it/s, loss=0.0981, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.00151, train/loss_step=0.289, global_step=2642.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  60%|█████▉    | 3554/5971 [36:33<24:51,  1.62it/s, loss=0.103, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000359, train/loss_step=0.109, global_step=2642.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  60%|█████▉    | 3555/5971 [36:34<24:50,  1.62it/s, loss=0.103, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000359, train/loss_step=0.109, global_step=2642.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  60%|█████▉    | 3555/5971 [36:34<24:50,  1.62it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.244, train/loss_vlb_step=0.000967, train/loss_step=0.244, global_step=2642.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  60%|█████▉    | 3556/5971 [36:36<24:51,  1.62it/s, loss=0.103, v_num=0, train/loss_simple_step=0.582, train/loss_vlb_step=0.00842, train/loss_step=0.582, global_step=2642.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  60%|█████▉    | 3557/5971 [36:37<24:50,  1.62it/s, loss=0.112, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000891, train/loss_step=0.205, global_step=2643.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  60%|█████▉    | 3558/5971 [36:38<24:50,  1.62it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0615, train/loss_vlb_step=0.000207, train/loss_step=0.0615, global_step=2643.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  60%|█████▉    | 3559/5971 [36:38<24:49,  1.62it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0615, train/loss_vlb_step=0.000207, train/loss_step=0.0615, global_step=2643.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  60%|█████▉    | 3559/5971 [36:38<24:49,  1.62it/s, loss=0.121, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000845, train/loss_step=0.227, global_step=2643.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  60%|█████▉    | 3560/5971 [36:41<24:50,  1.62it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00638, train/loss_vlb_step=3.23e-5, train/loss_step=0.00638, global_step=2643.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  60%|█████▉    | 3561/5971 [36:41<24:49,  1.62it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0554, train/loss_vlb_step=0.000192, train/loss_step=0.0554, global_step=2644.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  60%|█████▉    | 3562/5971 [36:42<24:49,  1.62it/s, loss=0.13, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000776, train/loss_step=0.218, global_step=2644.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  60%|█████▉    | 3563/5971 [36:43<24:48,  1.62it/s, loss=0.13, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000776, train/loss_step=0.218, global_step=2644.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  60%|█████▉    | 3563/5971 [36:43<24:48,  1.62it/s, loss=0.144, v_num=0, train/loss_simple_step=0.284, train/loss_vlb_step=0.00113, train/loss_step=0.284, global_step=2644.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  60%|█████▉    | 3564/5971 [36:46<24:49,  1.62it/s, loss=0.136, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000343, train/loss_step=0.103, global_step=2644.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  60%|█████▉    | 3565/5971 [36:46<24:49,  1.62it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0239, train/loss_vlb_step=9.43e-5, train/loss_step=0.0239, global_step=2645.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  60%|█████▉    | 3566/5971 [36:47<24:48,  1.62it/s, loss=0.157, v_num=0, train/loss_simple_step=0.394, train/loss_vlb_step=0.00243, train/loss_step=0.394, global_step=2645.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  60%|█████▉    | 3567/5971 [36:48<24:48,  1.62it/s, loss=0.157, v_num=0, train/loss_simple_step=0.394, train/loss_vlb_step=0.00243, train/loss_step=0.394, global_step=2645.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  60%|█████▉    | 3567/5971 [36:48<24:48,  1.62it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00282, train/loss_vlb_step=1.57e-5, train/loss_step=0.00282, global_step=2645.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  60%|█████▉    | 3568/5971 [36:50<24:48,  1.61it/s, loss=0.175, v_num=0, train/loss_simple_step=0.369, train/loss_vlb_step=0.00214, train/loss_step=0.369, global_step=2645.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  60%|█████▉    | 3569/5971 [36:51<24:48,  1.61it/s, loss=0.167, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.00034, train/loss_step=0.103, global_step=2646.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  60%|█████▉    | 3570/5971 [36:52<24:47,  1.61it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0562, train/loss_vlb_step=0.000202, train/loss_step=0.0562, global_step=2646.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  60%|█████▉    | 3571/5971 [36:53<24:47,  1.61it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0562, train/loss_vlb_step=0.000202, train/loss_step=0.0562, global_step=2646.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  60%|█████▉    | 3571/5971 [36:53<24:47,  1.61it/s, loss=0.177, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000722, train/loss_step=0.204, global_step=2646.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  60%|█████▉    | 3572/5971 [36:56<24:47,  1.61it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0181, train/loss_vlb_step=7e-5, train/loss_step=0.0181, global_step=2646.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  60%|█████▉    | 3573/5971 [36:57<24:47,  1.61it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00704, train/loss_vlb_step=3.47e-5, train/loss_step=0.00704, global_step=2647.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  60%|█████▉    | 3574/5971 [36:57<24:47,  1.61it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0589, train/loss_vlb_step=0.000201, train/loss_step=0.0589, global_step=2647.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  60%|█████▉    | 3575/5971 [36:58<24:46,  1.61it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0589, train/loss_vlb_step=0.000201, train/loss_step=0.0589, global_step=2647.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  60%|█████▉    | 3575/5971 [36:58<24:46,  1.61it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0953, train/loss_vlb_step=0.000313, train/loss_step=0.0953, global_step=2647.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  60%|█████▉    | 3576/5971 [37:01<24:47,  1.61it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0541, train/loss_vlb_step=0.000188, train/loss_step=0.0541, global_step=2647.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  60%|█████▉    | 3577/5971 [37:02<24:47,  1.61it/s, loss=0.136, v_num=0, train/loss_simple_step=0.373, train/loss_vlb_step=0.0018, train/loss_step=0.373, global_step=2648.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  60%|█████▉    | 3578/5971 [37:03<24:46,  1.61it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0217, train/loss_vlb_step=8.63e-5, train/loss_step=0.0217, global_step=2648.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  60%|█████▉    | 3579/5971 [37:04<24:46,  1.61it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0217, train/loss_vlb_step=8.63e-5, train/loss_step=0.0217, global_step=2648.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  60%|█████▉    | 3579/5971 [37:04<24:46,  1.61it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0962, train/loss_vlb_step=0.00032, train/loss_step=0.0962, global_step=2648.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  60%|█████▉    | 3580/5971 [37:06<24:46,  1.61it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.21e-5, train/loss_step=0.0141, global_step=2648.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  60%|█████▉    | 3581/5971 [37:07<24:46,  1.61it/s, loss=0.15, v_num=0, train/loss_simple_step=0.506, train/loss_vlb_step=0.00328, train/loss_step=0.506, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  60%|█████▉    | 3582/5971 [37:08<24:45,  1.61it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0784, train/loss_vlb_step=0.000259, train/loss_step=0.0784, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  60%|██████    | 3583/5971 [37:09<24:45,  1.61it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0784, train/loss_vlb_step=0.000259, train/loss_step=0.0784, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  60%|██████    | 3583/5971 [37:09<24:45,  1.61it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0181, train/loss_vlb_step=7.37e-5, train/loss_step=0.0181, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  60%|██████    | 3584/5971 [37:11<24:46,  1.61it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:10,  2.34it/s][A

Validating:   1%|          | 2/167 [00:00<00:50,  3.24it/s][A
Epoch 4:  60%|██████    | 3587/5971 [37:12<24:43,  1.61it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   3%|▎         | 5/167 [00:00<00:18,  8.77it/s][A
Epoch 4:  60%|██████    | 3591/5971 [37:12<24:39,  1.61it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:00<00:12, 13.00it/s][A
Epoch 4:  60%|██████    | 3595/5971 [37:12<24:35,  1.61it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   7%|▋         | 11/167 [00:01<00:10, 15.24it/s][A

Validating:   8%|▊         | 14/167 [00:01<00:08, 17.57it/s][A
Epoch 4:  60%|██████    | 3599/5971 [37:13<24:31,  1.61it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  10%|█         | 17/167 [00:01<00:07, 19.27it/s][A
Epoch 4:  60%|██████    | 3603/5971 [37:13<24:27,  1.61it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 21.21it/s][A
Epoch 4:  60%|██████    | 3607/5971 [37:13<24:23,  1.62it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 22.80it/s][A

Validating:  16%|█▌        | 26/167 [00:01<00:05, 23.90it/s][A
Epoch 4:  60%|██████    | 3611/5971 [37:13<24:19,  1.62it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 24.26it/s][A
Epoch 4:  61%|██████    | 3615/5971 [37:13<24:15,  1.62it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 24.76it/s][A
Epoch 4:  61%|██████    | 3619/5971 [37:13<24:11,  1.62it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  21%|██        | 35/167 [00:01<00:05, 24.70it/s][A

Validating:  23%|██▎       | 38/167 [00:02<00:05, 25.44it/s][A
Epoch 4:  61%|██████    | 3623/5971 [37:14<24:07,  1.62it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 26.42it/s][A
Epoch 4:  61%|██████    | 3627/5971 [37:14<24:03,  1.62it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 26.56it/s][A
Epoch 4:  61%|██████    | 3631/5971 [37:14<23:59,  1.63it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 26.89it/s][A

Validating:  30%|██▉       | 50/167 [00:02<00:04, 25.37it/s][A
Epoch 4:  61%|██████    | 3635/5971 [37:14<23:55,  1.63it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 25.09it/s][A
Epoch 4:  61%|██████    | 3639/5971 [37:14<23:51,  1.63it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 25.32it/s][A
Epoch 4:  61%|██████    | 3643/5971 [37:14<23:47,  1.63it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  35%|███▌      | 59/167 [00:02<00:04, 25.57it/s][A

Validating:  37%|███▋      | 62/167 [00:03<00:04, 25.56it/s][A
Epoch 4:  61%|██████    | 3647/5971 [37:15<23:43,  1.63it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  39%|███▉      | 65/167 [00:03<00:03, 26.50it/s][A
Epoch 4:  61%|██████    | 3651/5971 [37:15<23:39,  1.63it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  41%|████      | 68/167 [00:03<00:03, 26.56it/s][A
Epoch 4:  61%|██████    | 3655/5971 [37:15<23:36,  1.64it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 26.03it/s][A

Validating:  44%|████▍     | 74/167 [00:03<00:03, 26.68it/s][A
Epoch 4:  61%|██████▏   | 3659/5971 [37:15<23:32,  1.64it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 25.73it/s][A
Epoch 4:  61%|██████▏   | 3663/5971 [37:15<23:28,  1.64it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 23.92it/s][A
Epoch 4:  61%|██████▏   | 3667/5971 [37:15<23:24,  1.64it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 24.29it/s][A

Validating:  51%|█████▏    | 86/167 [00:03<00:03, 23.62it/s][A
Epoch 4:  61%|██████▏   | 3671/5971 [37:15<23:20,  1.64it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  53%|█████▎    | 89/167 [00:04<00:03, 24.31it/s][A
Epoch 4:  62%|██████▏   | 3675/5971 [37:16<23:16,  1.64it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  55%|█████▌    | 92/167 [00:04<00:03, 24.16it/s][A
Epoch 4:  62%|██████▏   | 3679/5971 [37:16<23:12,  1.65it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 25.11it/s][A

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 25.97it/s][A
Epoch 4:  62%|██████▏   | 3683/5971 [37:16<23:08,  1.65it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  61%|██████    | 102/167 [00:04<00:02, 27.15it/s][A
Epoch 4:  62%|██████▏   | 3687/5971 [37:16<23:05,  1.65it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 26.21it/s][A
Epoch 4:  62%|██████▏   | 3691/5971 [37:16<23:01,  1.65it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 27.16it/s][A
Epoch 4:  62%|██████▏   | 3695/5971 [37:16<22:57,  1.65it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  67%|██████▋   | 112/167 [00:04<00:01, 28.27it/s][A
Epoch 4:  62%|██████▏   | 3699/5971 [37:17<22:53,  1.65it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  69%|██████▉   | 115/167 [00:05<00:01, 27.31it/s][A

Validating:  71%|███████   | 118/167 [00:05<00:01, 27.10it/s][A
Epoch 4:  62%|██████▏   | 3703/5971 [37:17<22:49,  1.66it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 26.36it/s][A
Epoch 4:  62%|██████▏   | 3707/5971 [37:17<22:46,  1.66it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 27.08it/s][A
Epoch 4:  62%|██████▏   | 3711/5971 [37:17<22:42,  1.66it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 26.53it/s][A

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 25.56it/s][A
Epoch 4:  62%|██████▏   | 3715/5971 [37:17<22:38,  1.66it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 24.36it/s][A
Epoch 4:  62%|██████▏   | 3719/5971 [37:17<22:34,  1.66it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 24.74it/s][A
Epoch 4:  62%|██████▏   | 3723/5971 [37:17<22:30,  1.66it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  83%|████████▎ | 139/167 [00:06<00:01, 25.23it/s][A

Validating:  85%|████████▌ | 142/167 [00:06<00:00, 25.52it/s][A
Epoch 4:  62%|██████▏   | 3727/5971 [37:18<22:27,  1.67it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 24.63it/s][A
Epoch 4:  62%|██████▏   | 3731/5971 [37:18<22:23,  1.67it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 23.73it/s][A
Epoch 4:  63%|██████▎   | 3735/5971 [37:18<22:19,  1.67it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 23.97it/s][A

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 24.60it/s][A
Epoch 4:  63%|██████▎   | 3739/5971 [37:18<22:15,  1.67it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 21.65it/s][A
Epoch 4:  63%|██████▎   | 3743/5971 [37:18<22:12,  1.67it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 23.01it/s][A
Epoch 4:  63%|██████▎   | 3747/5971 [37:18<22:08,  1.67it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  98%|█████████▊| 163/167 [00:07<00:00, 23.63it/s][A

Validating:  99%|█████████▉| 166/167 [00:07<00:00, 24.17it/s][A
Epoch 4:  63%|██████▎   | 3751/5971 [37:19<22:04,  1.68it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3752/5971 [37:19<22:04,  1.68it/s, loss=0.131, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=2649.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Epoch 4:  63%|██████▎   | 3753/5971 [37:20<22:03,  1.68it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00328, train/loss_vlb_step=1.72e-5, train/loss_step=0.00328, global_step=2650.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3754/5971 [37:21<22:03,  1.68it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0454, train/loss_vlb_step=0.000157, train/loss_step=0.0454, global_step=2650.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3755/5971 [37:22<22:02,  1.68it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0454, train/loss_vlb_step=0.000157, train/loss_step=0.0454, global_step=2650.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3755/5971 [37:22<22:02,  1.68it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0093, train/loss_vlb_step=4.42e-5, train/loss_step=0.0093, global_step=2650.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  63%|██████▎   | 3756/5971 [37:24<22:03,  1.67it/s, loss=0.0944, v_num=0, train/loss_simple_step=0.00201, train/loss_vlb_step=1.2e-5, train/loss_step=0.00201, global_step=2650.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3757/5971 [37:25<22:02,  1.67it/s, loss=0.114, v_num=0, train/loss_simple_step=0.490, train/loss_vlb_step=0.00468, train/loss_step=0.490, global_step=2651.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  63%|██████▎   | 3758/5971 [37:26<22:02,  1.67it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=1e-5, train/loss_step=0.00164, global_step=2651.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3759/5971 [37:27<22:01,  1.67it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=1e-5, train/loss_step=0.00164, global_step=2651.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3759/5971 [37:27<22:01,  1.67it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00627, train/loss_vlb_step=3.01e-5, train/loss_step=0.00627, global_step=2651.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3760/5971 [37:29<22:02,  1.67it/s, loss=0.106, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=2651.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  63%|██████▎   | 3761/5971 [37:30<22:02,  1.67it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0441, train/loss_vlb_step=0.000162, train/loss_step=0.0441, global_step=2652.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3762/5971 [37:31<22:01,  1.67it/s, loss=0.114, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000672, train/loss_step=0.185, global_step=2652.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  63%|██████▎   | 3763/5971 [37:32<22:01,  1.67it/s, loss=0.114, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000672, train/loss_step=0.185, global_step=2652.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3763/5971 [37:32<22:01,  1.67it/s, loss=0.12, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000729, train/loss_step=0.207, global_step=2652.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  63%|██████▎   | 3764/5971 [37:34<22:01,  1.67it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0196, train/loss_vlb_step=8.16e-5, train/loss_step=0.0196, global_step=2652.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3765/5971 [37:35<22:01,  1.67it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0545, train/loss_vlb_step=0.000189, train/loss_step=0.0545, global_step=2653.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3766/5971 [37:36<22:00,  1.67it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0511, train/loss_vlb_step=0.000176, train/loss_step=0.0511, global_step=2653.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3767/5971 [37:37<22:00,  1.67it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0511, train/loss_vlb_step=0.000176, train/loss_step=0.0511, global_step=2653.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3767/5971 [37:37<22:00,  1.67it/s, loss=0.0994, v_num=0, train/loss_simple_step=0.00925, train/loss_vlb_step=4.45e-5, train/loss_step=0.00925, global_step=2653.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3768/5971 [37:39<22:00,  1.67it/s, loss=0.105, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000463, train/loss_step=0.135, global_step=2653.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  63%|██████▎   | 3769/5971 [37:40<22:00,  1.67it/s, loss=0.107, v_num=0, train/loss_simple_step=0.542, train/loss_vlb_step=0.00734, train/loss_step=0.542, global_step=2654.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  63%|██████▎   | 3770/5971 [37:41<22:00,  1.67it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00773, train/loss_vlb_step=3.85e-5, train/loss_step=0.00773, global_step=2654.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3771/5971 [37:42<21:59,  1.67it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00773, train/loss_vlb_step=3.85e-5, train/loss_step=0.00773, global_step=2654.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3771/5971 [37:42<21:59,  1.67it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=2654.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  63%|██████▎   | 3772/5971 [37:44<21:59,  1.67it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0235, train/loss_vlb_step=9.78e-5, train/loss_step=0.0235, global_step=2654.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  63%|██████▎   | 3773/5971 [37:45<21:59,  1.67it/s, loss=0.112, v_num=0, train/loss_simple_step=0.191, train/loss_vlb_step=0.000661, train/loss_step=0.191, global_step=2655.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  63%|██████▎   | 3774/5971 [37:46<21:59,  1.67it/s, loss=0.118, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000745, train/loss_step=0.184, global_step=2655.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3775/5971 [37:47<21:58,  1.67it/s, loss=0.118, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000745, train/loss_step=0.184, global_step=2655.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3775/5971 [37:47<21:58,  1.67it/s, loss=0.153, v_num=0, train/loss_simple_step=0.708, train/loss_vlb_step=0.013, train/loss_step=0.708, global_step=2655.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  63%|██████▎   | 3776/5971 [37:49<21:58,  1.66it/s, loss=0.167, v_num=0, train/loss_simple_step=0.272, train/loss_vlb_step=0.00133, train/loss_step=0.272, global_step=2655.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3777/5971 [37:50<21:58,  1.66it/s, loss=0.191, v_num=0, train/loss_simple_step=0.977, train/loss_vlb_step=0.492, train/loss_step=0.977, global_step=2656.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  63%|██████▎   | 3778/5971 [37:51<21:58,  1.66it/s, loss=0.211, v_num=0, train/loss_simple_step=0.403, train/loss_vlb_step=0.002, train/loss_step=0.403, global_step=2656.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3779/5971 [37:52<21:57,  1.66it/s, loss=0.211, v_num=0, train/loss_simple_step=0.403, train/loss_vlb_step=0.002, train/loss_step=0.403, global_step=2656.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3779/5971 [37:52<21:57,  1.66it/s, loss=0.248, v_num=0, train/loss_simple_step=0.732, train/loss_vlb_step=0.0294, train/loss_step=0.732, global_step=2656.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3780/5971 [37:54<21:58,  1.66it/s, loss=0.242, v_num=0, train/loss_simple_step=0.00243, train/loss_vlb_step=1.36e-5, train/loss_step=0.00243, global_step=2656.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3781/5971 [37:55<21:57,  1.66it/s, loss=0.24, v_num=0, train/loss_simple_step=0.00351, train/loss_vlb_step=1.99e-5, train/loss_step=0.00351, global_step=2657.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  63%|██████▎   | 3782/5971 [37:56<21:57,  1.66it/s, loss=0.237, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000455, train/loss_step=0.136, global_step=2657.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  63%|██████▎   | 3783/5971 [37:57<21:56,  1.66it/s, loss=0.237, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000455, train/loss_step=0.136, global_step=2657.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3783/5971 [37:57<21:56,  1.66it/s, loss=0.228, v_num=0, train/loss_simple_step=0.0259, train/loss_vlb_step=9.75e-5, train/loss_step=0.0259, global_step=2657.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3784/5971 [37:59<21:56,  1.66it/s, loss=0.227, v_num=0, train/loss_simple_step=0.00191, train/loss_vlb_step=1.15e-5, train/loss_step=0.00191, global_step=2657.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3785/5971 [38:00<21:56,  1.66it/s, loss=0.225, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.53e-5, train/loss_step=0.0122, global_step=2658.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  63%|██████▎   | 3786/5971 [38:01<21:56,  1.66it/s, loss=0.223, v_num=0, train/loss_simple_step=0.00366, train/loss_vlb_step=1.95e-5, train/loss_step=0.00366, global_step=2658.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3787/5971 [38:01<21:55,  1.66it/s, loss=0.223, v_num=0, train/loss_simple_step=0.00366, train/loss_vlb_step=1.95e-5, train/loss_step=0.00366, global_step=2658.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3787/5971 [38:01<21:55,  1.66it/s, loss=0.244, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00216, train/loss_step=0.426, global_step=2658.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  63%|██████▎   | 3788/5971 [38:04<21:56,  1.66it/s, loss=0.239, v_num=0, train/loss_simple_step=0.035, train/loss_vlb_step=0.000136, train/loss_step=0.035, global_step=2658.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3789/5971 [38:05<21:55,  1.66it/s, loss=0.212, v_num=0, train/loss_simple_step=0.0158, train/loss_vlb_step=7.06e-5, train/loss_step=0.0158, global_step=2659.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3790/5971 [38:06<21:55,  1.66it/s, loss=0.212, v_num=0, train/loss_simple_step=0.010, train/loss_vlb_step=4.53e-5, train/loss_step=0.010, global_step=2659.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  63%|██████▎   | 3791/5971 [38:07<21:54,  1.66it/s, loss=0.212, v_num=0, train/loss_simple_step=0.010, train/loss_vlb_step=4.53e-5, train/loss_step=0.010, global_step=2659.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  63%|██████▎   | 3791/5971 [38:07<21:54,  1.66it/s, loss=0.241, v_num=0, train/loss_simple_step=0.662, train/loss_vlb_step=0.0195, train/loss_step=0.662, global_step=2659.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  64%|██████▎   | 3792/5971 [38:09<21:55,  1.66it/s, loss=0.251, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.000878, train/loss_step=0.219, global_step=2659.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▎   | 3793/5971 [38:10<21:54,  1.66it/s, loss=0.245, v_num=0, train/loss_simple_step=0.0635, train/loss_vlb_step=0.000223, train/loss_step=0.0635, global_step=2660.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▎   | 3794/5971 [38:11<21:54,  1.66it/s, loss=0.237, v_num=0, train/loss_simple_step=0.0368, train/loss_vlb_step=0.000141, train/loss_step=0.0368, global_step=2660.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▎   | 3795/5971 [38:12<21:53,  1.66it/s, loss=0.237, v_num=0, train/loss_simple_step=0.0368, train/loss_vlb_step=0.000141, train/loss_step=0.0368, global_step=2660.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▎   | 3795/5971 [38:12<21:53,  1.66it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0448, train/loss_vlb_step=0.000158, train/loss_step=0.0448, global_step=2660.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▎   | 3796/5971 [38:14<21:54,  1.65it/s, loss=0.199, v_num=0, train/loss_simple_step=0.169, train/loss_vlb_step=0.000611, train/loss_step=0.169, global_step=2660.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  64%|██████▎   | 3797/5971 [38:15<21:53,  1.65it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0107, train/loss_vlb_step=5.09e-5, train/loss_step=0.0107, global_step=2661.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▎   | 3798/5971 [38:16<21:53,  1.65it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00416, train/loss_vlb_step=2.29e-5, train/loss_step=0.00416, global_step=2661.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▎   | 3799/5971 [38:17<21:53,  1.65it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00416, train/loss_vlb_step=2.29e-5, train/loss_step=0.00416, global_step=2661.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▎   | 3799/5971 [38:17<21:53,  1.65it/s, loss=0.111, v_num=0, train/loss_simple_step=0.334, train/loss_vlb_step=0.00154, train/loss_step=0.334, global_step=2661.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  64%|██████▎   | 3800/5971 [38:19<21:53,  1.65it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00621, train/loss_vlb_step=3.06e-5, train/loss_step=0.00621, global_step=2661.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▎   | 3801/5971 [38:20<21:52,  1.65it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0717, train/loss_vlb_step=0.000244, train/loss_step=0.0717, global_step=2662.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  64%|██████▎   | 3802/5971 [38:21<21:52,  1.65it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0519, train/loss_vlb_step=0.000178, train/loss_step=0.0519, global_step=2662.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  64%|██████▎   | 3803/5971 [38:22<21:52,  1.65it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0519, train/loss_vlb_step=0.000178, train/loss_step=0.0519, global_step=2662.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▎   | 3803/5971 [38:22<21:52,  1.65it/s, loss=0.11, v_num=0, train/loss_simple_step=0.015, train/loss_vlb_step=6.32e-5, train/loss_step=0.015, global_step=2662.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  64%|██████▎   | 3804/5971 [38:24<21:52,  1.65it/s, loss=0.116, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=2662.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▎   | 3805/5971 [38:25<21:51,  1.65it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00279, train/loss_vlb_step=1.51e-5, train/loss_step=0.00279, global_step=2663.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▎   | 3806/5971 [38:26<21:51,  1.65it/s, loss=0.144, v_num=0, train/loss_simple_step=0.570, train/loss_vlb_step=0.00864, train/loss_step=0.570, global_step=2663.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  64%|██████▍   | 3807/5971 [38:27<21:51,  1.65it/s, loss=0.144, v_num=0, train/loss_simple_step=0.570, train/loss_vlb_step=0.00864, train/loss_step=0.570, global_step=2663.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3807/5971 [38:27<21:51,  1.65it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0921, train/loss_vlb_step=0.000306, train/loss_step=0.0921, global_step=2663.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3808/5971 [38:29<21:51,  1.65it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00358, train/loss_vlb_step=1.96e-5, train/loss_step=0.00358, global_step=2663.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3809/5971 [38:30<21:51,  1.65it/s, loss=0.132, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000553, train/loss_step=0.152, global_step=2664.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  64%|██████▍   | 3810/5971 [38:31<21:50,  1.65it/s, loss=0.178, v_num=0, train/loss_simple_step=0.929, train/loss_vlb_step=0.235, train/loss_step=0.929, global_step=2664.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  64%|██████▍   | 3811/5971 [38:32<21:50,  1.65it/s, loss=0.178, v_num=0, train/loss_simple_step=0.929, train/loss_vlb_step=0.235, train/loss_step=0.929, global_step=2664.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3811/5971 [38:32<21:50,  1.65it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=6.17e-5, train/loss_step=0.0142, global_step=2664.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3812/5971 [38:34<21:50,  1.65it/s, loss=0.141, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000444, train/loss_step=0.133, global_step=2664.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  64%|██████▍   | 3813/5971 [38:35<21:50,  1.65it/s, loss=0.144, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000374, train/loss_step=0.113, global_step=2665.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3814/5971 [38:36<21:49,  1.65it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0185, train/loss_vlb_step=7.88e-5, train/loss_step=0.0185, global_step=2665.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3815/5971 [38:37<21:49,  1.65it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0185, train/loss_vlb_step=7.88e-5, train/loss_step=0.0185, global_step=2665.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3815/5971 [38:37<21:49,  1.65it/s, loss=0.147, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000439, train/loss_step=0.129, global_step=2665.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  64%|██████▍   | 3816/5971 [38:39<21:49,  1.65it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0752, train/loss_vlb_step=0.000251, train/loss_step=0.0752, global_step=2665.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3817/5971 [38:40<21:49,  1.65it/s, loss=0.146, v_num=0, train/loss_simple_step=0.078, train/loss_vlb_step=0.000257, train/loss_step=0.078, global_step=2666.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  64%|██████▍   | 3818/5971 [38:41<21:48,  1.65it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00429, train/loss_vlb_step=2.22e-5, train/loss_step=0.00429, global_step=2666.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3819/5971 [38:42<21:48,  1.65it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00429, train/loss_vlb_step=2.22e-5, train/loss_step=0.00429, global_step=2666.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3819/5971 [38:42<21:48,  1.65it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.12e-5, train/loss_step=0.0115, global_step=2666.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  64%|██████▍   | 3820/5971 [38:44<21:48,  1.64it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.53e-5, train/loss_step=0.0108, global_step=2666.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3821/5971 [38:45<21:48,  1.64it/s, loss=0.134, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000533, train/loss_step=0.153, global_step=2667.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3822/5971 [38:46<21:47,  1.64it/s, loss=0.149, v_num=0, train/loss_simple_step=0.358, train/loss_vlb_step=0.0021, train/loss_step=0.358, global_step=2667.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  64%|██████▍   | 3823/5971 [38:47<21:47,  1.64it/s, loss=0.149, v_num=0, train/loss_simple_step=0.358, train/loss_vlb_step=0.0021, train/loss_step=0.358, global_step=2667.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3823/5971 [38:47<21:47,  1.64it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0345, train/loss_vlb_step=0.000128, train/loss_step=0.0345, global_step=2667.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3824/5971 [38:49<21:47,  1.64it/s, loss=0.157, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.00122, train/loss_step=0.262, global_step=2667.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  64%|██████▍   | 3825/5971 [38:50<21:46,  1.64it/s, loss=0.171, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.0011, train/loss_step=0.274, global_step=2668.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  64%|██████▍   | 3826/5971 [38:51<21:46,  1.64it/s, loss=0.148, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000357, train/loss_step=0.108, global_step=2668.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3827/5971 [38:51<21:46,  1.64it/s, loss=0.148, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000357, train/loss_step=0.108, global_step=2668.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3827/5971 [38:51<21:46,  1.64it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0117, train/loss_vlb_step=5.6e-5, train/loss_step=0.0117, global_step=2668.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3828/5971 [38:54<21:46,  1.64it/s, loss=0.161, v_num=0, train/loss_simple_step=0.355, train/loss_vlb_step=0.00262, train/loss_step=0.355, global_step=2668.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  64%|██████▍   | 3829/5971 [38:55<21:46,  1.64it/s, loss=0.166, v_num=0, train/loss_simple_step=0.241, train/loss_vlb_step=0.000831, train/loss_step=0.241, global_step=2669.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3830/5971 [38:56<21:45,  1.64it/s, loss=0.151, v_num=0, train/loss_simple_step=0.634, train/loss_vlb_step=0.0209, train/loss_step=0.634, global_step=2669.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  64%|██████▍   | 3831/5971 [38:57<21:45,  1.64it/s, loss=0.151, v_num=0, train/loss_simple_step=0.634, train/loss_vlb_step=0.0209, train/loss_step=0.634, global_step=2669.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3831/5971 [38:57<21:45,  1.64it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00711, train/loss_vlb_step=3.37e-5, train/loss_step=0.00711, global_step=2669.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3832/5971 [38:59<21:45,  1.64it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0439, train/loss_vlb_step=0.000154, train/loss_step=0.0439, global_step=2669.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  64%|██████▍   | 3833/5971 [39:00<21:44,  1.64it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00282, train/loss_vlb_step=1.53e-5, train/loss_step=0.00282, global_step=2670.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3834/5971 [39:00<21:44,  1.64it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00982, train/loss_vlb_step=4.4e-5, train/loss_step=0.00982, global_step=2670.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  64%|██████▍   | 3835/5971 [39:01<21:44,  1.64it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00982, train/loss_vlb_step=4.4e-5, train/loss_step=0.00982, global_step=2670.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3835/5971 [39:01<21:44,  1.64it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.83e-5, train/loss_step=0.0108, global_step=2670.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3836/5971 [39:03<21:44,  1.64it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00231, train/loss_vlb_step=1.34e-5, train/loss_step=0.00231, global_step=2670.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3837/5971 [39:04<21:43,  1.64it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00433, train/loss_vlb_step=2.27e-5, train/loss_step=0.00433, global_step=2671.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3838/5971 [39:05<21:43,  1.64it/s, loss=0.151, v_num=0, train/loss_simple_step=0.488, train/loss_vlb_step=0.00493, train/loss_step=0.488, global_step=2671.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  64%|██████▍   | 3839/5971 [39:06<21:42,  1.64it/s, loss=0.151, v_num=0, train/loss_simple_step=0.488, train/loss_vlb_step=0.00493, train/loss_step=0.488, global_step=2671.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3839/5971 [39:06<21:42,  1.64it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0134, train/loss_vlb_step=5.82e-5, train/loss_step=0.0134, global_step=2671.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3840/5971 [39:08<21:43,  1.64it/s, loss=0.172, v_num=0, train/loss_simple_step=0.436, train/loss_vlb_step=0.0024, train/loss_step=0.436, global_step=2671.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  64%|██████▍   | 3841/5971 [39:09<21:42,  1.64it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0658, train/loss_vlb_step=0.000225, train/loss_step=0.0658, global_step=2672.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3842/5971 [39:10<21:42,  1.63it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0911, train/loss_vlb_step=0.000304, train/loss_step=0.0911, global_step=2672.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3843/5971 [39:11<21:41,  1.63it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0911, train/loss_vlb_step=0.000304, train/loss_step=0.0911, global_step=2672.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3843/5971 [39:11<21:41,  1.63it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00705, train/loss_vlb_step=3.21e-5, train/loss_step=0.00705, global_step=2672.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3844/5971 [39:13<21:41,  1.63it/s, loss=0.17, v_num=0, train/loss_simple_step=0.587, train/loss_vlb_step=0.0083, train/loss_step=0.587, global_step=2672.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]      
Epoch 4:  64%|██████▍   | 3845/5971 [39:14<21:41,  1.63it/s, loss=0.189, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.00966, train/loss_step=0.668, global_step=2673.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3846/5971 [39:15<21:41,  1.63it/s, loss=0.184, v_num=0, train/loss_simple_step=0.00491, train/loss_vlb_step=2.42e-5, train/loss_step=0.00491, global_step=2673.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3847/5971 [39:16<21:40,  1.63it/s, loss=0.184, v_num=0, train/loss_simple_step=0.00491, train/loss_vlb_step=2.42e-5, train/loss_step=0.00491, global_step=2673.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3847/5971 [39:16<21:40,  1.63it/s, loss=0.189, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000362, train/loss_step=0.110, global_step=2673.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  64%|██████▍   | 3848/5971 [39:18<21:41,  1.63it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0436, train/loss_vlb_step=0.000163, train/loss_step=0.0436, global_step=2673.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3849/5971 [39:19<21:40,  1.63it/s, loss=0.179, v_num=0, train/loss_simple_step=0.354, train/loss_vlb_step=0.00136, train/loss_step=0.354, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  64%|██████▍   | 3850/5971 [39:20<21:40,  1.63it/s, loss=0.153, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000388, train/loss_step=0.118, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3851/5971 [39:21<21:39,  1.63it/s, loss=0.153, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000388, train/loss_step=0.118, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  64%|██████▍   | 3851/5971 [39:21<21:39,  1.63it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0513, train/loss_vlb_step=0.000184, train/loss_step=0.0513, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  65%|██████▍   | 3852/5971 [39:23<21:39,  1.63it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:09,  2.39it/s]
Epoch 4:  65%|██████▍   | 3855/5971 [39:24<21:37,  1.63it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   2%|▏         | 3/167 [00:00<00:30,  5.42it/s]

Validating:   4%|▎         | 6/167 [00:00<00:16,  9.96it/s]
Epoch 4:  65%|██████▍   | 3859/5971 [39:24<21:33,  1.63it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   5%|▌         | 9/167 [00:00<00:11, 13.91it/s]
Epoch 4:  65%|██████▍   | 3863/5971 [39:24<21:29,  1.63it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   7%|▋         | 12/167 [00:01<00:09, 16.70it/s]
Epoch 4:  65%|██████▍   | 3867/5971 [39:24<21:26,  1.64it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   9%|▉         | 15/167 [00:01<00:07, 19.24it/s]

Validating:  11%|█         | 18/167 [00:01<00:07, 20.11it/s]
Epoch 4:  65%|██████▍   | 3871/5971 [39:24<21:22,  1.64it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 22.08it/s]
Epoch 4:  65%|██████▍   | 3875/5971 [39:24<21:18,  1.64it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  14%|█▍        | 24/167 [00:01<00:06, 23.23it/s]
Epoch 4:  65%|██████▍   | 3879/5971 [39:25<21:15,  1.64it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  16%|█▌        | 27/167 [00:01<00:06, 22.52it/s]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 24.05it/s]
Epoch 4:  65%|██████▌   | 3883/5971 [39:25<21:11,  1.64it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 24.77it/s]
Epoch 4:  65%|██████▌   | 3887/5971 [39:25<21:07,  1.64it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  22%|██▏       | 36/167 [00:01<00:05, 25.82it/s]
Epoch 4:  65%|██████▌   | 3891/5971 [39:25<21:04,  1.65it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  23%|██▎       | 39/167 [00:02<00:05, 25.44it/s]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 25.63it/s]
Epoch 4:  65%|██████▌   | 3895/5971 [39:25<21:00,  1.65it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 26.03it/s]
Epoch 4:  65%|██████▌   | 3899/5971 [39:25<20:56,  1.65it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 26.58it/s]
Epoch 4:  65%|██████▌   | 3903/5971 [39:26<20:53,  1.65it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  31%|███       | 51/167 [00:02<00:04, 26.15it/s]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 26.42it/s]
Epoch 4:  65%|██████▌   | 3907/5971 [39:26<20:49,  1.65it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 26.60it/s]
Epoch 4:  65%|██████▌   | 3911/5971 [39:26<20:46,  1.65it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  36%|███▌      | 60/167 [00:02<00:04, 26.66it/s]
Epoch 4:  66%|██████▌   | 3915/5971 [39:26<20:42,  1.65it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  38%|███▊      | 63/167 [00:02<00:03, 27.45it/s]

Validating:  40%|███▉      | 66/167 [00:03<00:03, 26.76it/s]
Epoch 4:  66%|██████▌   | 3919/5971 [39:26<20:38,  1.66it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 25.55it/s]
Epoch 4:  66%|██████▌   | 3923/5971 [39:26<20:35,  1.66it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 27.37it/s]
Epoch 4:  66%|██████▌   | 3927/5971 [39:26<20:31,  1.66it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 26.42it/s]
Epoch 4:  66%|██████▌   | 3931/5971 [39:27<20:28,  1.66it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 24.75it/s]

Validating:  49%|████▉     | 82/167 [00:03<00:03, 25.84it/s]
Epoch 4:  66%|██████▌   | 3935/5971 [39:27<20:24,  1.66it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  51%|█████     | 85/167 [00:03<00:03, 26.54it/s]
Epoch 4:  66%|██████▌   | 3939/5971 [39:27<20:20,  1.66it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  53%|█████▎    | 88/167 [00:03<00:03, 25.57it/s]
Epoch 4:  66%|██████▌   | 3943/5971 [39:27<20:17,  1.67it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  54%|█████▍    | 91/167 [00:04<00:02, 25.88it/s]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 25.98it/s]
Epoch 4:  66%|██████▌   | 3947/5971 [39:27<20:13,  1.67it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 26.35it/s][A
Epoch 4:  66%|██████▌   | 3951/5971 [39:27<20:10,  1.67it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 26.68it/s][A
Epoch 4:  66%|██████▌   | 3955/5971 [39:28<20:06,  1.67it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 27.80it/s][A
Epoch 4:  66%|██████▋   | 3959/5971 [39:28<20:03,  1.67it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 27.25it/s][A

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 25.75it/s][A
Epoch 4:  66%|██████▋   | 3963/5971 [39:28<19:59,  1.67it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  68%|██████▊   | 113/167 [00:04<00:02, 26.52it/s][A
Epoch 4:  66%|██████▋   | 3967/5971 [39:28<19:56,  1.68it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  69%|██████▉   | 116/167 [00:04<00:01, 26.94it/s][A
Epoch 4:  67%|██████▋   | 3971/5971 [39:28<19:52,  1.68it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 27.43it/s][A

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 27.48it/s][A
Epoch 4:  67%|██████▋   | 3975/5971 [39:28<19:49,  1.68it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 27.41it/s][A
Epoch 4:  67%|██████▋   | 3979/5971 [39:28<19:45,  1.68it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 27.85it/s][A
Epoch 4:  67%|██████▋   | 3983/5971 [39:29<19:42,  1.68it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 26.57it/s][A

Validating:  80%|████████  | 134/167 [00:05<00:01, 26.53it/s][A
Epoch 4:  67%|██████▋   | 3987/5971 [39:29<19:38,  1.68it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 26.48it/s][A
Epoch 4:  67%|██████▋   | 3991/5971 [39:29<19:35,  1.68it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  84%|████████▍ | 140/167 [00:05<00:01, 26.27it/s][A
Epoch 4:  67%|██████▋   | 3995/5971 [39:29<19:31,  1.69it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  86%|████████▌ | 143/167 [00:05<00:00, 26.06it/s][A

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 25.84it/s][A
Epoch 4:  67%|██████▋   | 3999/5971 [39:29<19:28,  1.69it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 25.61it/s][A
Epoch 4:  67%|██████▋   | 4003/5971 [39:29<19:24,  1.69it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 25.42it/s][A
Epoch 4:  67%|██████▋   | 4007/5971 [39:30<19:21,  1.69it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 18.44it/s][A

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 19.91it/s][A
Epoch 4:  67%|██████▋   | 4011/5971 [39:30<19:17,  1.69it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 21.15it/s][A
Epoch 4:  67%|██████▋   | 4015/5971 [39:30<19:14,  1.69it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 22.64it/s][A
Epoch 4:  67%|██████▋   | 4019/5971 [39:30<19:11,  1.70it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating: 100%|██████████| 167/167 [00:07<00:00, 22.71it/s][A
Epoch 4:  67%|██████▋   | 4020/5971 [39:31<19:10,  1.70it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000287, train/loss_step=0.0854, global_step=2674.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Epoch 4:  67%|██████▋   | 4021/5971 [39:32<19:10,  1.70it/s, loss=0.165, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000532, train/loss_step=0.153, global_step=2675.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  67%|██████▋   | 4022/5971 [39:32<19:09,  1.70it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0209, train/loss_vlb_step=8.8e-5, train/loss_step=0.0209, global_step=2675.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  67%|██████▋   | 4023/5971 [39:33<19:09,  1.70it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0209, train/loss_vlb_step=8.8e-5, train/loss_step=0.0209, global_step=2675.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  67%|██████▋   | 4023/5971 [39:33<19:09,  1.70it/s, loss=0.172, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000453, train/loss_step=0.136, global_step=2675.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  67%|██████▋   | 4024/5971 [39:36<19:09,  1.69it/s, loss=0.192, v_num=0, train/loss_simple_step=0.402, train/loss_vlb_step=0.00268, train/loss_step=0.402, global_step=2675.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  67%|██████▋   | 4025/5971 [39:37<19:08,  1.69it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0914, train/loss_vlb_step=0.000305, train/loss_step=0.0914, global_step=2676.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  67%|██████▋   | 4026/5971 [39:37<19:08,  1.69it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.000274, train/loss_step=0.0817, global_step=2676.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  67%|██████▋   | 4027/5971 [39:38<19:08,  1.69it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0817, train/loss_vlb_step=0.000274, train/loss_step=0.0817, global_step=2676.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  67%|██████▋   | 4027/5971 [39:38<19:08,  1.69it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00878, train/loss_vlb_step=3.91e-5, train/loss_step=0.00878, global_step=2676.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  67%|██████▋   | 4028/5971 [39:41<19:08,  1.69it/s, loss=0.166, v_num=0, train/loss_simple_step=0.232, train/loss_vlb_step=0.000865, train/loss_step=0.232, global_step=2676.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  67%|██████▋   | 4029/5971 [39:42<19:07,  1.69it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=4.97e-5, train/loss_step=0.0109, global_step=2677.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  67%|██████▋   | 4030/5971 [39:43<19:07,  1.69it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.98e-5, train/loss_step=0.0157, global_step=2677.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4031/5971 [39:43<19:07,  1.69it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.98e-5, train/loss_step=0.0157, global_step=2677.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4031/5971 [39:43<19:07,  1.69it/s, loss=0.174, v_num=0, train/loss_simple_step=0.303, train/loss_vlb_step=0.00133, train/loss_step=0.303, global_step=2677.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  68%|██████▊   | 4032/5971 [39:46<19:07,  1.69it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00186, train/loss_vlb_step=1.09e-5, train/loss_step=0.00186, global_step=2677.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4033/5971 [39:46<19:06,  1.69it/s, loss=0.118, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000456, train/loss_step=0.139, global_step=2678.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  68%|██████▊   | 4034/5971 [39:47<19:06,  1.69it/s, loss=0.126, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=2678.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4035/5971 [39:48<19:05,  1.69it/s, loss=0.126, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=2678.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4035/5971 [39:48<19:05,  1.69it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00384, train/loss_vlb_step=2.09e-5, train/loss_step=0.00384, global_step=2678.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4036/5971 [39:50<19:06,  1.69it/s, loss=0.141, v_num=0, train/loss_simple_step=0.464, train/loss_vlb_step=0.00377, train/loss_step=0.464, global_step=2678.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  68%|██████▊   | 4037/5971 [39:51<19:05,  1.69it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0211, train/loss_vlb_step=7.72e-5, train/loss_step=0.0211, global_step=2679.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4038/5971 [39:52<19:05,  1.69it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00715, train/loss_vlb_step=3.48e-5, train/loss_step=0.00715, global_step=2679.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4039/5971 [39:53<19:04,  1.69it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00715, train/loss_vlb_step=3.48e-5, train/loss_step=0.00715, global_step=2679.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4039/5971 [39:53<19:04,  1.69it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00298, train/loss_vlb_step=1.67e-5, train/loss_step=0.00298, global_step=2679.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4040/5971 [39:55<19:04,  1.69it/s, loss=0.125, v_num=0, train/loss_simple_step=0.261, train/loss_vlb_step=0.00103, train/loss_step=0.261, global_step=2679.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  68%|██████▊   | 4041/5971 [39:56<19:04,  1.69it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.55e-5, train/loss_step=0.00966, global_step=2680.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4042/5971 [39:57<19:03,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.244, train/loss_vlb_step=0.000879, train/loss_step=0.244, global_step=2680.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  68%|██████▊   | 4043/5971 [39:58<19:03,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.244, train/loss_vlb_step=0.000879, train/loss_step=0.244, global_step=2680.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4043/5971 [39:58<19:03,  1.69it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00177, train/loss_vlb_step=1.06e-5, train/loss_step=0.00177, global_step=2680.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4044/5971 [40:01<19:03,  1.68it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00135, train/loss_vlb_step=8.18e-6, train/loss_step=0.00135, global_step=2680.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4045/5971 [40:02<19:03,  1.68it/s, loss=0.0987, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.87e-5, train/loss_step=0.0105, global_step=2681.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  68%|██████▊   | 4046/5971 [40:03<19:03,  1.68it/s, loss=0.0958, v_num=0, train/loss_simple_step=0.0252, train/loss_vlb_step=9.53e-5, train/loss_step=0.0252, global_step=2681.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4047/5971 [40:04<19:02,  1.68it/s, loss=0.0958, v_num=0, train/loss_simple_step=0.0252, train/loss_vlb_step=9.53e-5, train/loss_step=0.0252, global_step=2681.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4047/5971 [40:04<19:02,  1.68it/s, loss=0.114, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00201, train/loss_step=0.376, global_step=2681.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  68%|██████▊   | 4048/5971 [40:06<19:02,  1.68it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00227, train/loss_vlb_step=1.3e-5, train/loss_step=0.00227, global_step=2681.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4049/5971 [40:07<19:02,  1.68it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00376, train/loss_vlb_step=1.94e-5, train/loss_step=0.00376, global_step=2682.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4050/5971 [40:08<19:01,  1.68it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0194, train/loss_vlb_step=7.73e-5, train/loss_step=0.0194, global_step=2682.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  68%|██████▊   | 4051/5971 [40:08<19:01,  1.68it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0194, train/loss_vlb_step=7.73e-5, train/loss_step=0.0194, global_step=2682.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4051/5971 [40:08<19:01,  1.68it/s, loss=0.0997, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.000959, train/loss_step=0.247, global_step=2682.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4052/5971 [40:11<19:01,  1.68it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0942, train/loss_vlb_step=0.00032, train/loss_step=0.0942, global_step=2682.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4053/5971 [40:11<19:01,  1.68it/s, loss=0.117, v_num=0, train/loss_simple_step=0.385, train/loss_vlb_step=0.00228, train/loss_step=0.385, global_step=2683.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  68%|██████▊   | 4054/5971 [40:12<19:00,  1.68it/s, loss=0.118, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000672, train/loss_step=0.189, global_step=2683.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4055/5971 [40:13<19:00,  1.68it/s, loss=0.118, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000672, train/loss_step=0.189, global_step=2683.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4055/5971 [40:13<19:00,  1.68it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0918, train/loss_vlb_step=0.000312, train/loss_step=0.0918, global_step=2683.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4056/5971 [40:15<19:00,  1.68it/s, loss=0.106, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000479, train/loss_step=0.134, global_step=2683.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  68%|██████▊   | 4057/5971 [40:16<18:59,  1.68it/s, loss=0.105, v_num=0, train/loss_simple_step=0.00201, train/loss_vlb_step=1.2e-5, train/loss_step=0.00201, global_step=2684.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4058/5971 [40:17<18:59,  1.68it/s, loss=0.125, v_num=0, train/loss_simple_step=0.393, train/loss_vlb_step=0.00295, train/loss_step=0.393, global_step=2684.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  68%|██████▊   | 4059/5971 [40:18<18:58,  1.68it/s, loss=0.125, v_num=0, train/loss_simple_step=0.393, train/loss_vlb_step=0.00295, train/loss_step=0.393, global_step=2684.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4059/5971 [40:18<18:58,  1.68it/s, loss=0.137, v_num=0, train/loss_simple_step=0.244, train/loss_vlb_step=0.000892, train/loss_step=0.244, global_step=2684.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4060/5971 [40:20<18:59,  1.68it/s, loss=0.13, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000448, train/loss_step=0.134, global_step=2684.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  68%|██████▊   | 4061/5971 [40:21<18:58,  1.68it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00742, train/loss_vlb_step=3.73e-5, train/loss_step=0.00742, global_step=2685.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4062/5971 [40:22<18:58,  1.68it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0648, train/loss_vlb_step=0.000226, train/loss_step=0.0648, global_step=2685.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4063/5971 [40:23<18:57,  1.68it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0648, train/loss_vlb_step=0.000226, train/loss_step=0.0648, global_step=2685.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4063/5971 [40:23<18:57,  1.68it/s, loss=0.135, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00106, train/loss_step=0.282, global_step=2685.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  68%|██████▊   | 4064/5971 [40:26<18:58,  1.68it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0472, train/loss_vlb_step=0.00017, train/loss_step=0.0472, global_step=2685.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4065/5971 [40:26<18:57,  1.68it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0609, train/loss_vlb_step=0.00021, train/loss_step=0.0609, global_step=2686.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  68%|██████▊   | 4066/5971 [40:27<18:57,  1.68it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00676, train/loss_vlb_step=3.1e-5, train/loss_step=0.00676, global_step=2686.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4067/5971 [40:28<18:56,  1.67it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00676, train/loss_vlb_step=3.1e-5, train/loss_step=0.00676, global_step=2686.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4067/5971 [40:28<18:56,  1.67it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0348, train/loss_vlb_step=0.000128, train/loss_step=0.0348, global_step=2686.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4068/5971 [40:30<18:56,  1.67it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00413, train/loss_vlb_step=2.23e-5, train/loss_step=0.00413, global_step=2686.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4069/5971 [40:31<18:56,  1.67it/s, loss=0.144, v_num=0, train/loss_simple_step=0.442, train/loss_vlb_step=0.00375, train/loss_step=0.442, global_step=2687.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  68%|██████▊   | 4070/5971 [40:32<18:55,  1.67it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0721, train/loss_vlb_step=0.000241, train/loss_step=0.0721, global_step=2687.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4071/5971 [40:33<18:55,  1.67it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0721, train/loss_vlb_step=0.000241, train/loss_step=0.0721, global_step=2687.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4071/5971 [40:33<18:55,  1.67it/s, loss=0.145, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000723, train/loss_step=0.205, global_step=2687.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  68%|██████▊   | 4072/5971 [40:35<18:55,  1.67it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0918, train/loss_vlb_step=0.000305, train/loss_step=0.0918, global_step=2687.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4073/5971 [40:36<18:55,  1.67it/s, loss=0.168, v_num=0, train/loss_simple_step=0.853, train/loss_vlb_step=0.0869, train/loss_step=0.853, global_step=2688.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  68%|██████▊   | 4074/5971 [40:37<18:54,  1.67it/s, loss=0.195, v_num=0, train/loss_simple_step=0.731, train/loss_vlb_step=0.0138, train/loss_step=0.731, global_step=2688.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4075/5971 [40:38<18:54,  1.67it/s, loss=0.195, v_num=0, train/loss_simple_step=0.731, train/loss_vlb_step=0.0138, train/loss_step=0.731, global_step=2688.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4075/5971 [40:38<18:54,  1.67it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0187, train/loss_vlb_step=7.84e-5, train/loss_step=0.0187, global_step=2688.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4076/5971 [40:40<18:54,  1.67it/s, loss=0.197, v_num=0, train/loss_simple_step=0.252, train/loss_vlb_step=0.00101, train/loss_step=0.252, global_step=2688.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  68%|██████▊   | 4077/5971 [40:41<18:53,  1.67it/s, loss=0.214, v_num=0, train/loss_simple_step=0.343, train/loss_vlb_step=0.00192, train/loss_step=0.343, global_step=2689.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4078/5971 [40:42<18:53,  1.67it/s, loss=0.211, v_num=0, train/loss_simple_step=0.323, train/loss_vlb_step=0.00188, train/loss_step=0.323, global_step=2689.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4079/5971 [40:43<18:52,  1.67it/s, loss=0.211, v_num=0, train/loss_simple_step=0.323, train/loss_vlb_step=0.00188, train/loss_step=0.323, global_step=2689.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4079/5971 [40:43<18:52,  1.67it/s, loss=0.199, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.68e-5, train/loss_step=0.0126, global_step=2689.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4080/5971 [40:45<18:53,  1.67it/s, loss=0.206, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.00116, train/loss_step=0.266, global_step=2689.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  68%|██████▊   | 4081/5971 [40:46<18:52,  1.67it/s, loss=0.207, v_num=0, train/loss_simple_step=0.0271, train/loss_vlb_step=0.000108, train/loss_step=0.0271, global_step=2690.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4082/5971 [40:47<18:52,  1.67it/s, loss=0.211, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.000478, train/loss_step=0.146, global_step=2690.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  68%|██████▊   | 4083/5971 [40:48<18:51,  1.67it/s, loss=0.211, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.000478, train/loss_step=0.146, global_step=2690.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4083/5971 [40:48<18:51,  1.67it/s, loss=0.212, v_num=0, train/loss_simple_step=0.295, train/loss_vlb_step=0.00114, train/loss_step=0.295, global_step=2690.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  68%|██████▊   | 4084/5971 [40:50<18:51,  1.67it/s, loss=0.212, v_num=0, train/loss_simple_step=0.0562, train/loss_vlb_step=0.000203, train/loss_step=0.0562, global_step=2690.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4085/5971 [40:51<18:51,  1.67it/s, loss=0.223, v_num=0, train/loss_simple_step=0.285, train/loss_vlb_step=0.0012, train/loss_step=0.285, global_step=2691.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  68%|██████▊   | 4086/5971 [40:52<18:51,  1.67it/s, loss=0.225, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000158, train/loss_step=0.0453, global_step=2691.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4087/5971 [40:53<18:50,  1.67it/s, loss=0.225, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000158, train/loss_step=0.0453, global_step=2691.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4087/5971 [40:53<18:50,  1.67it/s, loss=0.228, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000315, train/loss_step=0.0958, global_step=2691.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4088/5971 [40:55<18:50,  1.67it/s, loss=0.27, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.071, train/loss_step=0.834, global_step=2691.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]      
Epoch 4:  68%|██████▊   | 4089/5971 [40:56<18:50,  1.67it/s, loss=0.249, v_num=0, train/loss_simple_step=0.0201, train/loss_vlb_step=8.22e-5, train/loss_step=0.0201, global_step=2692.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  68%|██████▊   | 4090/5971 [40:57<18:49,  1.66it/s, loss=0.245, v_num=0, train/loss_simple_step=0.00186, train/loss_vlb_step=1.13e-5, train/loss_step=0.00186, global_step=2692.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  69%|██████▊   | 4091/5971 [40:57<18:49,  1.66it/s, loss=0.245, v_num=0, train/loss_simple_step=0.00186, train/loss_vlb_step=1.13e-5, train/loss_step=0.00186, global_step=2692.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  69%|██████▊   | 4091/5971 [40:57<18:49,  1.66it/s, loss=0.243, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000559, train/loss_step=0.170, global_step=2692.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  69%|██████▊   | 4092/5971 [41:00<18:49,  1.66it/s, loss=0.244, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000373, train/loss_step=0.112, global_step=2692.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  69%|██████▊   | 4093/5971 [41:01<18:49,  1.66it/s, loss=0.203, v_num=0, train/loss_simple_step=0.0227, train/loss_vlb_step=9.03e-5, train/loss_step=0.0227, global_step=2693.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  69%|██████▊   | 4094/5971 [41:02<18:48,  1.66it/s, loss=0.183, v_num=0, train/loss_simple_step=0.326, train/loss_vlb_step=0.00145, train/loss_step=0.326, global_step=2693.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  69%|██████▊   | 4095/5971 [41:02<18:48,  1.66it/s, loss=0.183, v_num=0, train/loss_simple_step=0.326, train/loss_vlb_step=0.00145, train/loss_step=0.326, global_step=2693.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  69%|██████▊   | 4095/5971 [41:02<18:48,  1.66it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0234, train/loss_vlb_step=9.07e-5, train/loss_step=0.0234, global_step=2693.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  69%|██████▊   | 4096/5971 [41:05<18:48,  1.66it/s, loss=0.171, v_num=0, train/loss_simple_step=0.00558, train/loss_vlb_step=2.88e-5, train/loss_step=0.00558, global_step=2693.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  69%|██████▊   | 4097/5971 [41:06<18:47,  1.66it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0414, train/loss_vlb_step=0.000145, train/loss_step=0.0414, global_step=2694.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  69%|██████▊   | 4098/5971 [41:06<18:47,  1.66it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00692, train/loss_vlb_step=3.28e-5, train/loss_step=0.00692, global_step=2694.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  69%|██████▊   | 4099/5971 [41:07<18:46,  1.66it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00692, train/loss_vlb_step=3.28e-5, train/loss_step=0.00692, global_step=2694.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  69%|██████▊   | 4099/5971 [41:07<18:46,  1.66it/s, loss=0.146, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000495, train/loss_step=0.147, global_step=2694.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  69%|██████▊   | 4100/5971 [41:09<18:46,  1.66it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.83e-5, train/loss_step=0.0135, global_step=2694.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  69%|██████▊   | 4101/5971 [41:10<18:46,  1.66it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00161, train/loss_vlb_step=9.71e-6, train/loss_step=0.00161, global_step=2695.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  69%|██████▊   | 4102/5971 [41:11<18:45,  1.66it/s, loss=0.147, v_num=0, train/loss_simple_step=0.443, train/loss_vlb_step=0.00305, train/loss_step=0.443, global_step=2695.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  69%|██████▊   | 4103/5971 [41:12<18:45,  1.66it/s, loss=0.147, v_num=0, train/loss_simple_step=0.443, train/loss_vlb_step=0.00305, train/loss_step=0.443, global_step=2695.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  69%|██████▊   | 4103/5971 [41:12<18:45,  1.66it/s, loss=0.141, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000598, train/loss_step=0.175, global_step=2695.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  69%|██████▊   | 4104/5971 [41:15<18:45,  1.66it/s, loss=0.145, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000432, train/loss_step=0.131, global_step=2695.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  69%|██████▊   | 4105/5971 [41:16<18:45,  1.66it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0645, train/loss_vlb_step=0.000213, train/loss_step=0.0645, global_step=2696.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  69%|██████▉   | 4106/5971 [41:17<18:45,  1.66it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0508, train/loss_vlb_step=0.00017, train/loss_step=0.0508, global_step=2696.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  69%|██████▉   | 4107/5971 [41:18<18:44,  1.66it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0508, train/loss_vlb_step=0.00017, train/loss_step=0.0508, global_step=2696.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  69%|██████▉   | 4107/5971 [41:18<18:44,  1.66it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00508, train/loss_vlb_step=2.58e-5, train/loss_step=0.00508, global_step=2696.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  69%|██████▉   | 4108/5971 [41:20<18:44,  1.66it/s, loss=0.101, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.000963, train/loss_step=0.249, global_step=2696.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  69%|██████▉   | 4109/5971 [41:21<18:44,  1.66it/s, loss=0.11, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000748, train/loss_step=0.204, global_step=2697.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  69%|██████▉   | 4110/5971 [41:22<18:43,  1.66it/s, loss=0.126, v_num=0, train/loss_simple_step=0.324, train/loss_vlb_step=0.00146, train/loss_step=0.324, global_step=2697.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  69%|██████▉   | 4111/5971 [41:23<18:43,  1.66it/s, loss=0.126, v_num=0, train/loss_simple_step=0.324, train/loss_vlb_step=0.00146, train/loss_step=0.324, global_step=2697.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  69%|██████▉   | 4111/5971 [41:23<18:43,  1.66it/s, loss=0.125, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000554, train/loss_step=0.155, global_step=2697.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  69%|██████▉   | 4112/5971 [41:25<18:43,  1.65it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00713, train/loss_vlb_step=3.53e-5, train/loss_step=0.00713, global_step=2697.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  69%|██████▉   | 4113/5971 [41:26<18:43,  1.65it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0414, train/loss_vlb_step=0.000148, train/loss_step=0.0414, global_step=2698.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  69%|██████▉   | 4114/5971 [41:27<18:42,  1.65it/s, loss=0.149, v_num=0, train/loss_simple_step=0.883, train/loss_vlb_step=0.149, train/loss_step=0.883, global_step=2698.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]     
Epoch 4:  69%|██████▉   | 4115/5971 [41:28<18:42,  1.65it/s, loss=0.149, v_num=0, train/loss_simple_step=0.883, train/loss_vlb_step=0.149, train/loss_step=0.883, global_step=2698.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  69%|██████▉   | 4115/5971 [41:28<18:42,  1.65it/s, loss=0.171, v_num=0, train/loss_simple_step=0.470, train/loss_vlb_step=0.00588, train/loss_step=0.470, global_step=2698.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  69%|██████▉   | 4116/5971 [41:31<18:42,  1.65it/s, loss=0.178, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000468, train/loss_step=0.141, global_step=2698.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  69%|██████▉   | 4117/5971 [41:31<18:41,  1.65it/s, loss=0.181, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000371, train/loss_step=0.113, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  69%|██████▉   | 4118/5971 [41:32<18:41,  1.65it/s, loss=0.192, v_num=0, train/loss_simple_step=0.223, train/loss_vlb_step=0.000806, train/loss_step=0.223, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  69%|██████▉   | 4119/5971 [41:33<18:40,  1.65it/s, loss=0.192, v_num=0, train/loss_simple_step=0.223, train/loss_vlb_step=0.000806, train/loss_step=0.223, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  69%|██████▉   | 4119/5971 [41:33<18:40,  1.65it/s, loss=0.19, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000347, train/loss_step=0.105, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  69%|██████▉   | 4120/5971 [41:35<18:41,  1.65it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:12,  2.30it/s][A

Validating:   1%|          | 2/167 [00:00<00:52,  3.12it/s][A
Epoch 4:  69%|██████▉   | 4123/5971 [41:36<18:38,  1.65it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   3%|▎         | 5/167 [00:00<00:18,  8.60it/s][A
Epoch 4:  69%|██████▉   | 4127/5971 [41:36<18:35,  1.65it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:00<00:12, 12.98it/s][A
Epoch 4:  69%|██████▉   | 4131/5971 [41:36<18:31,  1.65it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   7%|▋         | 11/167 [00:01<00:09, 16.68it/s][A

Validating:   8%|▊         | 14/167 [00:01<00:08, 18.25it/s][A
Epoch 4:  69%|██████▉   | 4135/5971 [41:37<18:28,  1.66it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  10%|█         | 17/167 [00:01<00:07, 19.96it/s][A
Epoch 4:  69%|██████▉   | 4139/5971 [41:37<18:25,  1.66it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  12%|█▏        | 20/167 [00:01<00:07, 20.80it/s][A
Epoch 4:  69%|██████▉   | 4143/5971 [41:37<18:21,  1.66it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 22.02it/s][A

Validating:  16%|█▌        | 26/167 [00:01<00:06, 22.66it/s][A
Epoch 4:  69%|██████▉   | 4147/5971 [41:37<18:18,  1.66it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  17%|█▋        | 29/167 [00:01<00:06, 22.88it/s][A
Epoch 4:  70%|██████▉   | 4151/5971 [41:37<18:14,  1.66it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 23.03it/s][A
Epoch 4:  70%|██████▉   | 4155/5971 [41:37<18:11,  1.66it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  21%|██        | 35/167 [00:02<00:05, 23.59it/s][A

Validating:  23%|██▎       | 38/167 [00:02<00:05, 25.13it/s][A
Epoch 4:  70%|██████▉   | 4159/5971 [41:37<18:08,  1.67it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 25.29it/s][A
Epoch 4:  70%|██████▉   | 4163/5971 [41:38<18:04,  1.67it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 25.74it/s][A
Epoch 4:  70%|██████▉   | 4167/5971 [41:38<18:01,  1.67it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 25.21it/s][A

Validating:  30%|██▉       | 50/167 [00:02<00:04, 26.06it/s][A
Epoch 4:  70%|██████▉   | 4171/5971 [41:38<17:57,  1.67it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 26.60it/s][A
Epoch 4:  70%|██████▉   | 4175/5971 [41:38<17:54,  1.67it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 25.64it/s][A
Epoch 4:  70%|██████▉   | 4179/5971 [41:38<17:51,  1.67it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  35%|███▌      | 59/167 [00:02<00:04, 25.30it/s][A

Validating:  37%|███▋      | 62/167 [00:03<00:04, 25.39it/s][A
Epoch 4:  70%|███████   | 4183/5971 [41:38<17:47,  1.67it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  39%|███▉      | 65/167 [00:03<00:03, 26.12it/s][A
Epoch 4:  70%|███████   | 4187/5971 [41:39<17:44,  1.68it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  41%|████      | 68/167 [00:03<00:03, 26.01it/s][A
Epoch 4:  70%|███████   | 4191/5971 [41:39<17:41,  1.68it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 25.47it/s][A

Validating:  44%|████▍     | 74/167 [00:03<00:03, 26.38it/s][A
Epoch 4:  70%|███████   | 4195/5971 [41:39<17:37,  1.68it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 26.27it/s][A
Epoch 4:  70%|███████   | 4199/5971 [41:39<17:34,  1.68it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 27.27it/s][A
Epoch 4:  70%|███████   | 4203/5971 [41:39<17:31,  1.68it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 27.31it/s][A

Validating:  51%|█████▏    | 86/167 [00:03<00:02, 27.49it/s][A
Epoch 4:  70%|███████   | 4207/5971 [41:39<17:27,  1.68it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  53%|█████▎    | 89/167 [00:04<00:02, 26.95it/s][A
Epoch 4:  71%|███████   | 4211/5971 [41:39<17:24,  1.68it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 26.61it/s][A
Epoch 4:  71%|███████   | 4215/5971 [41:40<17:21,  1.69it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 26.36it/s][A

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 25.91it/s][A
Epoch 4:  71%|███████   | 4219/5971 [41:40<17:18,  1.69it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  60%|██████    | 101/167 [00:04<00:02, 25.13it/s][A
Epoch 4:  71%|███████   | 4223/5971 [41:40<17:14,  1.69it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 24.36it/s][A
Epoch 4:  71%|███████   | 4227/5971 [41:40<17:11,  1.69it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 24.62it/s][A

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 23.42it/s][A
Epoch 4:  71%|███████   | 4231/5971 [41:40<17:08,  1.69it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  68%|██████▊   | 113/167 [00:05<00:02, 24.53it/s][A
Epoch 4:  71%|███████   | 4235/5971 [41:40<17:04,  1.69it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  69%|██████▉   | 116/167 [00:05<00:02, 24.07it/s][A
Epoch 4:  71%|███████   | 4239/5971 [41:41<17:01,  1.70it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 24.26it/s][A

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 23.25it/s][A
Epoch 4:  71%|███████   | 4243/5971 [41:41<16:58,  1.70it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 23.37it/s][A
Epoch 4:  71%|███████   | 4247/5971 [41:41<16:55,  1.70it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 24.47it/s][A
Epoch 4:  71%|███████   | 4251/5971 [41:41<16:51,  1.70it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 25.24it/s][A

Validating:  80%|████████  | 134/167 [00:05<00:01, 24.46it/s][A
Epoch 4:  71%|███████▏  | 4255/5971 [41:41<16:48,  1.70it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  82%|████████▏ | 137/167 [00:06<00:01, 23.88it/s][A
Epoch 4:  71%|███████▏  | 4259/5971 [41:41<16:45,  1.70it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  84%|████████▍ | 140/167 [00:06<00:01, 24.02it/s][A
Epoch 4:  71%|███████▏  | 4263/5971 [41:42<16:42,  1.70it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 24.47it/s][A

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 24.60it/s][A
Epoch 4:  71%|███████▏  | 4267/5971 [41:42<16:39,  1.71it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 24.91it/s][A
Epoch 4:  72%|███████▏  | 4271/5971 [41:42<16:35,  1.71it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 24.57it/s][A
Epoch 4:  72%|███████▏  | 4275/5971 [41:42<16:32,  1.71it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 24.32it/s][A

Validating:  95%|█████████▍| 158/167 [00:07<00:00, 19.22it/s][A
Epoch 4:  72%|███████▏  | 4279/5971 [41:42<16:29,  1.71it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  96%|█████████▋| 161/167 [00:07<00:00, 20.81it/s][A
Epoch 4:  72%|███████▏  | 4283/5971 [41:43<16:26,  1.71it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  98%|█████████▊| 164/167 [00:07<00:00, 21.98it/s][A
Epoch 4:  72%|███████▏  | 4287/5971 [41:43<16:23,  1.71it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating: 100%|██████████| 167/167 [00:07<00:00, 23.66it/s][A
Epoch 4:  72%|███████▏  | 4288/5971 [41:43<16:22,  1.71it/s, loss=0.197, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000559, train/loss_step=0.162, global_step=2699.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.36it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.43it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.27it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.90it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.36it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.69it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.89it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.09it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.22it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.32it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.39it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.31it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.36it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.37it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.30it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.32it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.33it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.38it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.43it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.33it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.30it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.35it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.38it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.41it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.45it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.47it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.47it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.36it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.38it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.42it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.44it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.46it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.27it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:03,  5.21it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.19it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.17it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.17it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.24it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.16it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.17it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.24it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.21it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.22it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.22it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.24it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.32it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.43it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.45it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.34it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.22it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.05it/s]

Epoch 4:  72%|███████▏  | 4289/5971 [41:55<16:26,  1.71it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0748, train/loss_vlb_step=0.000246, train/loss_step=0.0748, global_step=2700.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.30it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.33it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.15it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.66it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.16it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.50it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.77it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.95it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.07it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.11it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.13it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.12it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:07,  5.22it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.33it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.39it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.41it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.45it/s][A
Epoch 4:  72%|███████▏  | 4289/5971 [42:00<16:28,  1.70it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0748, train/loss_vlb_step=0.000246, train/loss_step=0.0748, global_step=2700.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.45it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.39it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.27it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.22it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.18it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.11it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:05,  5.11it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.16it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.22it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.25it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.27it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:06<00:03,  5.31it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.32it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.35it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.39it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.34it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.34it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.36it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.37it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.38it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.41it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.44it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.46it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.47it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.48it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.41it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.40it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:00,  5.40it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.40it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.31it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.27it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.20it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.15it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.00it/s]

Epoch 4:  72%|███████▏  | 4290/5971 [42:08<16:30,  1.70it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0748, train/loss_vlb_step=0.000246, train/loss_step=0.0748, global_step=2700.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4290/5971 [42:08<16:30,  1.70it/s, loss=0.205, v_num=0, train/loss_simple_step=0.514, train/loss_vlb_step=0.00409, train/loss_step=0.514, global_step=2700.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.43it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.30it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.95it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.44it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.81it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.06it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:07,  5.25it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.38it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.43it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.44it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.50it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.53it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.56it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.57it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.51it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.54it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.56it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.59it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.60it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.61it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.63it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.64it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.63it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.58it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.59it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.62it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.64it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.66it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.67it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.69it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.59it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.53it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.50it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.49it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.48it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.45it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.45it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.46it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.43it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.41it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.41it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.40it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.41it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.41it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.41it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.43it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.42it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.43it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.44it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.22it/s]

Epoch 4:  72%|███████▏  | 4291/5971 [42:20<16:34,  1.69it/s, loss=0.205, v_num=0, train/loss_simple_step=0.514, train/loss_vlb_step=0.00409, train/loss_step=0.514, global_step=2700.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4291/5971 [42:20<16:34,  1.69it/s, loss=0.212, v_num=0, train/loss_simple_step=0.323, train/loss_vlb_step=0.00151, train/loss_step=0.323, global_step=2700.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.32it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.38it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.23it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.88it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.35it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.71it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.98it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.17it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.24it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.34it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.40it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.43it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.38it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.41it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.48it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.53it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.57it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.59it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.60it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.61it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.62it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.63it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.64it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.59it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.57it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.57it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.45it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.36it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.28it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.39it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.47it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.53it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.57it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.61it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.52it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.44it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.38it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.34it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.27it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.24it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.22it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.21it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.20it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.17it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.18it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.19it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.20it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.21it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.22it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.28it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.11it/s]

Epoch 4:  72%|███████▏  | 4292/5971 [42:33<16:38,  1.68it/s, loss=0.212, v_num=0, train/loss_simple_step=0.323, train/loss_vlb_step=0.00151, train/loss_step=0.323, global_step=2700.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4292/5971 [42:33<16:38,  1.68it/s, loss=0.222, v_num=0, train/loss_simple_step=0.331, train/loss_vlb_step=0.00141, train/loss_step=0.331, global_step=2700.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4293/5971 [42:34<16:38,  1.68it/s, loss=0.222, v_num=0, train/loss_simple_step=0.331, train/loss_vlb_step=0.00141, train/loss_step=0.331, global_step=2700.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4293/5971 [42:34<16:38,  1.68it/s, loss=0.225, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000423, train/loss_step=0.128, global_step=2701.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4294/5971 [42:35<16:37,  1.68it/s, loss=0.225, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000423, train/loss_step=0.128, global_step=2701.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4294/5971 [42:35<16:37,  1.68it/s, loss=0.223, v_num=0, train/loss_simple_step=0.0106, train/loss_vlb_step=4.45e-5, train/loss_step=0.0106, global_step=2701.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4295/5971 [42:36<16:37,  1.68it/s, loss=0.223, v_num=0, train/loss_simple_step=0.0106, train/loss_vlb_step=4.45e-5, train/loss_step=0.0106, global_step=2701.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4295/5971 [42:36<16:37,  1.68it/s, loss=0.238, v_num=0, train/loss_simple_step=0.298, train/loss_vlb_step=0.00146, train/loss_step=0.298, global_step=2701.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  72%|███████▏  | 4296/5971 [42:39<16:37,  1.68it/s, loss=0.238, v_num=0, train/loss_simple_step=0.298, train/loss_vlb_step=0.00146, train/loss_step=0.298, global_step=2701.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4296/5971 [42:39<16:37,  1.68it/s, loss=0.226, v_num=0, train/loss_simple_step=0.0148, train/loss_vlb_step=6.38e-5, train/loss_step=0.0148, global_step=2701.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4297/5971 [42:40<16:37,  1.68it/s, loss=0.226, v_num=0, train/loss_simple_step=0.0148, train/loss_vlb_step=6.38e-5, train/loss_step=0.0148, global_step=2701.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4297/5971 [42:40<16:37,  1.68it/s, loss=0.219, v_num=0, train/loss_simple_step=0.063, train/loss_vlb_step=0.000223, train/loss_step=0.063, global_step=2702.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  72%|███████▏  | 4298/5971 [42:41<16:36,  1.68it/s, loss=0.219, v_num=0, train/loss_simple_step=0.063, train/loss_vlb_step=0.000223, train/loss_step=0.063, global_step=2702.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4298/5971 [42:41<16:36,  1.68it/s, loss=0.207, v_num=0, train/loss_simple_step=0.080, train/loss_vlb_step=0.000263, train/loss_step=0.080, global_step=2702.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4299/5971 [42:42<16:36,  1.68it/s, loss=0.207, v_num=0, train/loss_simple_step=0.080, train/loss_vlb_step=0.000263, train/loss_step=0.080, global_step=2702.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4299/5971 [42:42<16:36,  1.68it/s, loss=0.215, v_num=0, train/loss_simple_step=0.316, train/loss_vlb_step=0.00131, train/loss_step=0.316, global_step=2702.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  72%|███████▏  | 4300/5971 [42:44<16:36,  1.68it/s, loss=0.215, v_num=0, train/loss_simple_step=0.316, train/loss_vlb_step=0.00131, train/loss_step=0.316, global_step=2702.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4300/5971 [42:44<16:36,  1.68it/s, loss=0.228, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00113, train/loss_step=0.269, global_step=2702.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4301/5971 [42:45<16:35,  1.68it/s, loss=0.228, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00113, train/loss_step=0.269, global_step=2702.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4301/5971 [42:45<16:35,  1.68it/s, loss=0.235, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000647, train/loss_step=0.179, global_step=2703.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4302/5971 [42:46<16:35,  1.68it/s, loss=0.235, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000647, train/loss_step=0.179, global_step=2703.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4302/5971 [42:46<16:35,  1.68it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0848, train/loss_vlb_step=0.000284, train/loss_step=0.0848, global_step=2703.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4303/5971 [42:47<16:34,  1.68it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0848, train/loss_vlb_step=0.000284, train/loss_step=0.0848, global_step=2703.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4303/5971 [42:47<16:34,  1.68it/s, loss=0.185, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.000946, train/loss_step=0.266, global_step=2703.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  72%|███████▏  | 4304/5971 [42:49<16:34,  1.68it/s, loss=0.185, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.000946, train/loss_step=0.266, global_step=2703.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4304/5971 [42:49<16:34,  1.68it/s, loss=0.19, v_num=0, train/loss_simple_step=0.245, train/loss_vlb_step=0.000948, train/loss_step=0.245, global_step=2703.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  72%|███████▏  | 4305/5971 [42:50<16:34,  1.68it/s, loss=0.19, v_num=0, train/loss_simple_step=0.245, train/loss_vlb_step=0.000948, train/loss_step=0.245, global_step=2703.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4305/5971 [42:50<16:34,  1.68it/s, loss=0.2, v_num=0, train/loss_simple_step=0.311, train/loss_vlb_step=0.00147, train/loss_step=0.311, global_step=2704.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  72%|███████▏  | 4306/5971 [42:51<16:33,  1.68it/s, loss=0.2, v_num=0, train/loss_simple_step=0.311, train/loss_vlb_step=0.00147, train/loss_step=0.311, global_step=2704.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4306/5971 [42:51<16:33,  1.68it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.5e-5, train/loss_step=0.0127, global_step=2704.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4307/5971 [42:52<16:33,  1.67it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.5e-5, train/loss_step=0.0127, global_step=2704.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4307/5971 [42:52<16:33,  1.67it/s, loss=0.228, v_num=0, train/loss_simple_step=0.884, train/loss_vlb_step=0.112, train/loss_step=0.884, global_step=2704.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  72%|███████▏  | 4308/5971 [42:54<16:33,  1.67it/s, loss=0.228, v_num=0, train/loss_simple_step=0.884, train/loss_vlb_step=0.112, train/loss_step=0.884, global_step=2704.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4308/5971 [42:54<16:33,  1.67it/s, loss=0.228, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000543, train/loss_step=0.156, global_step=2704.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4309/5971 [42:55<16:33,  1.67it/s, loss=0.228, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000543, train/loss_step=0.156, global_step=2704.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4309/5971 [42:55<16:33,  1.67it/s, loss=0.224, v_num=0, train/loss_simple_step=0.00159, train/loss_vlb_step=9.61e-6, train/loss_step=0.00159, global_step=2705.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4310/5971 [42:56<16:32,  1.67it/s, loss=0.224, v_num=0, train/loss_simple_step=0.00159, train/loss_vlb_step=9.61e-6, train/loss_step=0.00159, global_step=2705.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4310/5971 [42:56<16:32,  1.67it/s, loss=0.213, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00146, train/loss_step=0.282, global_step=2705.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  72%|███████▏  | 4311/5971 [42:57<16:32,  1.67it/s, loss=0.213, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00146, train/loss_step=0.282, global_step=2705.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4311/5971 [42:57<16:32,  1.67it/s, loss=0.197, v_num=0, train/loss_simple_step=0.014, train/loss_vlb_step=6.23e-5, train/loss_step=0.014, global_step=2705.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4312/5971 [42:59<16:32,  1.67it/s, loss=0.197, v_num=0, train/loss_simple_step=0.014, train/loss_vlb_step=6.23e-5, train/loss_step=0.014, global_step=2705.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4312/5971 [42:59<16:32,  1.67it/s, loss=0.188, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000492, train/loss_step=0.143, global_step=2705.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4313/5971 [43:00<16:31,  1.67it/s, loss=0.188, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000492, train/loss_step=0.143, global_step=2705.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4313/5971 [43:00<16:31,  1.67it/s, loss=0.182, v_num=0, train/loss_simple_step=0.00939, train/loss_vlb_step=4.25e-5, train/loss_step=0.00939, global_step=2706.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4314/5971 [43:01<16:31,  1.67it/s, loss=0.182, v_num=0, train/loss_simple_step=0.00939, train/loss_vlb_step=4.25e-5, train/loss_step=0.00939, global_step=2706.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4314/5971 [43:01<16:31,  1.67it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0237, train/loss_vlb_step=9.06e-5, train/loss_step=0.0237, global_step=2706.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  72%|███████▏  | 4315/5971 [43:01<16:30,  1.67it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0237, train/loss_vlb_step=9.06e-5, train/loss_step=0.0237, global_step=2706.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4315/5971 [43:01<16:30,  1.67it/s, loss=0.173, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000341, train/loss_step=0.103, global_step=2706.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  72%|███████▏  | 4316/5971 [43:04<16:30,  1.67it/s, loss=0.173, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000341, train/loss_step=0.103, global_step=2706.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4316/5971 [43:04<16:30,  1.67it/s, loss=0.214, v_num=0, train/loss_simple_step=0.832, train/loss_vlb_step=0.140, train/loss_step=0.832, global_step=2706.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  72%|███████▏  | 4317/5971 [43:05<16:30,  1.67it/s, loss=0.214, v_num=0, train/loss_simple_step=0.832, train/loss_vlb_step=0.140, train/loss_step=0.832, global_step=2706.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4317/5971 [43:05<16:30,  1.67it/s, loss=0.216, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.00037, train/loss_step=0.110, global_step=2707.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4318/5971 [43:06<16:29,  1.67it/s, loss=0.216, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.00037, train/loss_step=0.110, global_step=2707.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4318/5971 [43:06<16:29,  1.67it/s, loss=0.212, v_num=0, train/loss_simple_step=0.00555, train/loss_vlb_step=2.73e-5, train/loss_step=0.00555, global_step=2707.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4319/5971 [43:06<16:29,  1.67it/s, loss=0.212, v_num=0, train/loss_simple_step=0.00555, train/loss_vlb_step=2.73e-5, train/loss_step=0.00555, global_step=2707.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4319/5971 [43:06<16:29,  1.67it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0307, train/loss_vlb_step=0.000112, train/loss_step=0.0307, global_step=2707.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  72%|███████▏  | 4320/5971 [43:09<16:29,  1.67it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0307, train/loss_vlb_step=0.000112, train/loss_step=0.0307, global_step=2707.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4320/5971 [43:09<16:29,  1.67it/s, loss=0.185, v_num=0, train/loss_simple_step=0.009, train/loss_vlb_step=4.4e-5, train/loss_step=0.009, global_step=2707.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  72%|███████▏  | 4321/5971 [43:10<16:28,  1.67it/s, loss=0.185, v_num=0, train/loss_simple_step=0.009, train/loss_vlb_step=4.4e-5, train/loss_step=0.009, global_step=2707.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4321/5971 [43:10<16:28,  1.67it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0295, train/loss_vlb_step=0.000113, train/loss_step=0.0295, global_step=2708.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4322/5971 [43:10<16:28,  1.67it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0295, train/loss_vlb_step=0.000113, train/loss_step=0.0295, global_step=2708.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4322/5971 [43:10<16:28,  1.67it/s, loss=0.174, v_num=0, train/loss_simple_step=0.00653, train/loss_vlb_step=3.24e-5, train/loss_step=0.00653, global_step=2708.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4323/5971 [43:11<16:27,  1.67it/s, loss=0.174, v_num=0, train/loss_simple_step=0.00653, train/loss_vlb_step=3.24e-5, train/loss_step=0.00653, global_step=2708.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4323/5971 [43:11<16:27,  1.67it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00343, train/loss_vlb_step=1.9e-5, train/loss_step=0.00343, global_step=2708.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  72%|███████▏  | 4324/5971 [43:14<16:27,  1.67it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00343, train/loss_vlb_step=1.9e-5, train/loss_step=0.00343, global_step=2708.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4324/5971 [43:14<16:27,  1.67it/s, loss=0.16, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000854, train/loss_step=0.227, global_step=2708.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  72%|███████▏  | 4325/5971 [43:14<16:27,  1.67it/s, loss=0.16, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000854, train/loss_step=0.227, global_step=2708.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4325/5971 [43:14<16:27,  1.67it/s, loss=0.184, v_num=0, train/loss_simple_step=0.790, train/loss_vlb_step=0.0409, train/loss_step=0.790, global_step=2709.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  72%|███████▏  | 4326/5971 [43:15<16:26,  1.67it/s, loss=0.184, v_num=0, train/loss_simple_step=0.790, train/loss_vlb_step=0.0409, train/loss_step=0.790, global_step=2709.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4326/5971 [43:15<16:26,  1.67it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00202, train/loss_vlb_step=1.18e-5, train/loss_step=0.00202, global_step=2709.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4327/5971 [43:16<16:26,  1.67it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00202, train/loss_vlb_step=1.18e-5, train/loss_step=0.00202, global_step=2709.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4327/5971 [43:16<16:26,  1.67it/s, loss=0.157, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00218, train/loss_step=0.363, global_step=2709.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  72%|███████▏  | 4328/5971 [43:19<16:26,  1.67it/s, loss=0.157, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00218, train/loss_step=0.363, global_step=2709.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  72%|███████▏  | 4328/5971 [43:19<16:26,  1.67it/s, loss=0.158, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000565, train/loss_step=0.172, global_step=2709.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4329/5971 [43:20<16:26,  1.67it/s, loss=0.158, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000565, train/loss_step=0.172, global_step=2709.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4329/5971 [43:20<16:26,  1.67it/s, loss=0.164, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2710.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4330/5971 [43:21<16:25,  1.66it/s, loss=0.164, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2710.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4330/5971 [43:21<16:25,  1.66it/s, loss=0.157, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000447, train/loss_step=0.135, global_step=2710.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4331/5971 [43:22<16:25,  1.66it/s, loss=0.157, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000447, train/loss_step=0.135, global_step=2710.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4331/5971 [43:22<16:25,  1.66it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00573, train/loss_vlb_step=2.83e-5, train/loss_step=0.00573, global_step=2710.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4332/5971 [43:24<16:25,  1.66it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00573, train/loss_vlb_step=2.83e-5, train/loss_step=0.00573, global_step=2710.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4332/5971 [43:24<16:25,  1.66it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0809, train/loss_vlb_step=0.000269, train/loss_step=0.0809, global_step=2710.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  73%|███████▎  | 4333/5971 [43:25<16:24,  1.66it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0809, train/loss_vlb_step=0.000269, train/loss_step=0.0809, global_step=2710.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4333/5971 [43:25<16:24,  1.66it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00139, train/loss_vlb_step=8.41e-6, train/loss_step=0.00139, global_step=2711.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4334/5971 [43:26<16:24,  1.66it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00139, train/loss_vlb_step=8.41e-6, train/loss_step=0.00139, global_step=2711.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4334/5971 [43:26<16:24,  1.66it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0456, train/loss_vlb_step=0.000164, train/loss_step=0.0456, global_step=2711.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  73%|███████▎  | 4335/5971 [43:27<16:23,  1.66it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0456, train/loss_vlb_step=0.000164, train/loss_step=0.0456, global_step=2711.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4335/5971 [43:27<16:23,  1.66it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0289, train/loss_vlb_step=0.00011, train/loss_step=0.0289, global_step=2711.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  73%|███████▎  | 4336/5971 [43:29<16:23,  1.66it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0289, train/loss_vlb_step=0.00011, train/loss_step=0.0289, global_step=2711.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4336/5971 [43:29<16:23,  1.66it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0666, train/loss_vlb_step=0.00022, train/loss_step=0.0666, global_step=2711.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4337/5971 [43:30<16:23,  1.66it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0666, train/loss_vlb_step=0.00022, train/loss_step=0.0666, global_step=2711.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4337/5971 [43:30<16:23,  1.66it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0152, train/loss_vlb_step=6.68e-5, train/loss_step=0.0152, global_step=2712.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4338/5971 [43:31<16:22,  1.66it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0152, train/loss_vlb_step=6.68e-5, train/loss_step=0.0152, global_step=2712.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4338/5971 [43:31<16:22,  1.66it/s, loss=0.109, v_num=0, train/loss_simple_step=0.051, train/loss_vlb_step=0.000173, train/loss_step=0.051, global_step=2712.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  73%|███████▎  | 4339/5971 [43:32<16:22,  1.66it/s, loss=0.109, v_num=0, train/loss_simple_step=0.051, train/loss_vlb_step=0.000173, train/loss_step=0.051, global_step=2712.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4339/5971 [43:32<16:22,  1.66it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0898, train/loss_vlb_step=0.000296, train/loss_step=0.0898, global_step=2712.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4340/5971 [43:34<16:22,  1.66it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0898, train/loss_vlb_step=0.000296, train/loss_step=0.0898, global_step=2712.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4340/5971 [43:34<16:22,  1.66it/s, loss=0.125, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.00115, train/loss_step=0.262, global_step=2712.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  73%|███████▎  | 4341/5971 [43:35<16:21,  1.66it/s, loss=0.125, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.00115, train/loss_step=0.262, global_step=2712.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4341/5971 [43:35<16:21,  1.66it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.47e-5, train/loss_step=0.0173, global_step=2713.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4342/5971 [43:36<16:21,  1.66it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=6.47e-5, train/loss_step=0.0173, global_step=2713.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4342/5971 [43:36<16:21,  1.66it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0188, train/loss_vlb_step=8.15e-5, train/loss_step=0.0188, global_step=2713.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4343/5971 [43:37<16:20,  1.66it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0188, train/loss_vlb_step=8.15e-5, train/loss_step=0.0188, global_step=2713.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4343/5971 [43:37<16:20,  1.66it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00676, train/loss_vlb_step=3.34e-5, train/loss_step=0.00676, global_step=2713.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4344/5971 [43:39<16:21,  1.66it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00676, train/loss_vlb_step=3.34e-5, train/loss_step=0.00676, global_step=2713.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4344/5971 [43:39<16:21,  1.66it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00678, train/loss_vlb_step=3.35e-5, train/loss_step=0.00678, global_step=2713.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4345/5971 [43:40<16:20,  1.66it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00678, train/loss_vlb_step=3.35e-5, train/loss_step=0.00678, global_step=2713.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4345/5971 [43:40<16:20,  1.66it/s, loss=0.0747, v_num=0, train/loss_simple_step=0.00391, train/loss_vlb_step=2.01e-5, train/loss_step=0.00391, global_step=2714.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4346/5971 [43:41<16:20,  1.66it/s, loss=0.0747, v_num=0, train/loss_simple_step=0.00391, train/loss_vlb_step=2.01e-5, train/loss_step=0.00391, global_step=2714.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4346/5971 [43:41<16:20,  1.66it/s, loss=0.0804, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000384, train/loss_step=0.115, global_step=2714.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  73%|███████▎  | 4347/5971 [43:42<16:19,  1.66it/s, loss=0.0804, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000384, train/loss_step=0.115, global_step=2714.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4347/5971 [43:42<16:19,  1.66it/s, loss=0.0624, v_num=0, train/loss_simple_step=0.00258, train/loss_vlb_step=1.36e-5, train/loss_step=0.00258, global_step=2714.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4348/5971 [43:44<16:19,  1.66it/s, loss=0.0624, v_num=0, train/loss_simple_step=0.00258, train/loss_vlb_step=1.36e-5, train/loss_step=0.00258, global_step=2714.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4348/5971 [43:44<16:19,  1.66it/s, loss=0.066, v_num=0, train/loss_simple_step=0.244, train/loss_vlb_step=0.00091, train/loss_step=0.244, global_step=2714.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]     
Epoch 4:  73%|███████▎  | 4349/5971 [43:45<16:19,  1.66it/s, loss=0.066, v_num=0, train/loss_simple_step=0.244, train/loss_vlb_step=0.00091, train/loss_step=0.244, global_step=2714.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4349/5971 [43:45<16:19,  1.66it/s, loss=0.0717, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.000975, train/loss_step=0.237, global_step=2715.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4350/5971 [43:46<16:18,  1.66it/s, loss=0.0717, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.000975, train/loss_step=0.237, global_step=2715.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4350/5971 [43:46<16:18,  1.66it/s, loss=0.0653, v_num=0, train/loss_simple_step=0.00689, train/loss_vlb_step=3.29e-5, train/loss_step=0.00689, global_step=2715.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4351/5971 [43:47<16:18,  1.66it/s, loss=0.0653, v_num=0, train/loss_simple_step=0.00689, train/loss_vlb_step=3.29e-5, train/loss_step=0.00689, global_step=2715.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4351/5971 [43:47<16:18,  1.66it/s, loss=0.0822, v_num=0, train/loss_simple_step=0.344, train/loss_vlb_step=0.00154, train/loss_step=0.344, global_step=2715.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  73%|███████▎  | 4352/5971 [43:49<16:18,  1.66it/s, loss=0.0822, v_num=0, train/loss_simple_step=0.344, train/loss_vlb_step=0.00154, train/loss_step=0.344, global_step=2715.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4352/5971 [43:49<16:18,  1.66it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.317, train/loss_vlb_step=0.00145, train/loss_step=0.317, global_step=2715.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4353/5971 [43:50<16:17,  1.66it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.317, train/loss_vlb_step=0.00145, train/loss_step=0.317, global_step=2715.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4353/5971 [43:50<16:17,  1.66it/s, loss=0.111, v_num=0, train/loss_simple_step=0.347, train/loss_vlb_step=0.00156, train/loss_step=0.347, global_step=2716.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  73%|███████▎  | 4354/5971 [43:51<16:17,  1.65it/s, loss=0.111, v_num=0, train/loss_simple_step=0.347, train/loss_vlb_step=0.00156, train/loss_step=0.347, global_step=2716.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4354/5971 [43:51<16:17,  1.65it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0443, train/loss_vlb_step=0.000162, train/loss_step=0.0443, global_step=2716.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4355/5971 [43:52<16:16,  1.65it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0443, train/loss_vlb_step=0.000162, train/loss_step=0.0443, global_step=2716.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4355/5971 [43:52<16:16,  1.65it/s, loss=0.122, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.000964, train/loss_step=0.236, global_step=2716.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  73%|███████▎  | 4356/5971 [43:55<16:16,  1.65it/s, loss=0.122, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.000964, train/loss_step=0.236, global_step=2716.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4356/5971 [43:55<16:16,  1.65it/s, loss=0.162, v_num=0, train/loss_simple_step=0.870, train/loss_vlb_step=0.220, train/loss_step=0.870, global_step=2716.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  73%|███████▎  | 4357/5971 [43:56<16:16,  1.65it/s, loss=0.162, v_num=0, train/loss_simple_step=0.870, train/loss_vlb_step=0.220, train/loss_step=0.870, global_step=2716.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4357/5971 [43:56<16:16,  1.65it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0577, train/loss_vlb_step=0.000204, train/loss_step=0.0577, global_step=2717.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4358/5971 [43:56<16:15,  1.65it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0577, train/loss_vlb_step=0.000204, train/loss_step=0.0577, global_step=2717.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4358/5971 [43:56<16:15,  1.65it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0014, train/loss_vlb_step=8.08e-6, train/loss_step=0.0014, global_step=2717.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  73%|███████▎  | 4359/5971 [43:57<16:15,  1.65it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0014, train/loss_vlb_step=8.08e-6, train/loss_step=0.0014, global_step=2717.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4359/5971 [43:57<16:15,  1.65it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00461, train/loss_vlb_step=2.33e-5, train/loss_step=0.00461, global_step=2717.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4360/5971 [43:59<16:15,  1.65it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00461, train/loss_vlb_step=2.33e-5, train/loss_step=0.00461, global_step=2717.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4360/5971 [43:59<16:15,  1.65it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0515, train/loss_vlb_step=0.000175, train/loss_step=0.0515, global_step=2717.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  73%|███████▎  | 4361/5971 [44:00<16:14,  1.65it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0515, train/loss_vlb_step=0.000175, train/loss_step=0.0515, global_step=2717.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4361/5971 [44:00<16:14,  1.65it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00288, train/loss_vlb_step=1.54e-5, train/loss_step=0.00288, global_step=2718.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4362/5971 [44:01<16:14,  1.65it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00288, train/loss_vlb_step=1.54e-5, train/loss_step=0.00288, global_step=2718.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4362/5971 [44:01<16:14,  1.65it/s, loss=0.152, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000483, train/loss_step=0.147, global_step=2718.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  73%|███████▎  | 4363/5971 [44:02<16:13,  1.65it/s, loss=0.152, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000483, train/loss_step=0.147, global_step=2718.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4363/5971 [44:02<16:13,  1.65it/s, loss=0.18, v_num=0, train/loss_simple_step=0.554, train/loss_vlb_step=0.0082, train/loss_step=0.554, global_step=2718.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  73%|███████▎  | 4364/5971 [44:04<16:13,  1.65it/s, loss=0.18, v_num=0, train/loss_simple_step=0.554, train/loss_vlb_step=0.0082, train/loss_step=0.554, global_step=2718.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4364/5971 [44:04<16:13,  1.65it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0673, train/loss_vlb_step=0.000221, train/loss_step=0.0673, global_step=2718.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4365/5971 [44:05<16:13,  1.65it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0673, train/loss_vlb_step=0.000221, train/loss_step=0.0673, global_step=2718.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4365/5971 [44:05<16:13,  1.65it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0565, train/loss_vlb_step=0.000193, train/loss_step=0.0565, global_step=2719.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4366/5971 [44:06<16:12,  1.65it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0565, train/loss_vlb_step=0.000193, train/loss_step=0.0565, global_step=2719.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4366/5971 [44:06<16:12,  1.65it/s, loss=0.214, v_num=0, train/loss_simple_step=0.693, train/loss_vlb_step=0.0278, train/loss_step=0.693, global_step=2719.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  73%|███████▎  | 4367/5971 [44:07<16:12,  1.65it/s, loss=0.214, v_num=0, train/loss_simple_step=0.693, train/loss_vlb_step=0.0278, train/loss_step=0.693, global_step=2719.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4367/5971 [44:07<16:12,  1.65it/s, loss=0.215, v_num=0, train/loss_simple_step=0.0186, train/loss_vlb_step=7.47e-5, train/loss_step=0.0186, global_step=2719.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4368/5971 [44:09<16:12,  1.65it/s, loss=0.215, v_num=0, train/loss_simple_step=0.0186, train/loss_vlb_step=7.47e-5, train/loss_step=0.0186, global_step=2719.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4368/5971 [44:09<16:12,  1.65it/s, loss=0.203, v_num=0, train/loss_simple_step=0.00669, train/loss_vlb_step=3.18e-5, train/loss_step=0.00669, global_step=2719.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4369/5971 [44:10<16:11,  1.65it/s, loss=0.203, v_num=0, train/loss_simple_step=0.00669, train/loss_vlb_step=3.18e-5, train/loss_step=0.00669, global_step=2719.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4369/5971 [44:10<16:11,  1.65it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.85e-5, train/loss_step=0.0125, global_step=2720.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  73%|███████▎  | 4370/5971 [44:11<16:11,  1.65it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.85e-5, train/loss_step=0.0125, global_step=2720.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4370/5971 [44:11<16:11,  1.65it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00282, train/loss_vlb_step=1.59e-5, train/loss_step=0.00282, global_step=2720.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4371/5971 [44:12<16:10,  1.65it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00282, train/loss_vlb_step=1.59e-5, train/loss_step=0.00282, global_step=2720.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4371/5971 [44:12<16:10,  1.65it/s, loss=0.175, v_num=0, train/loss_simple_step=0.00192, train/loss_vlb_step=1.1e-5, train/loss_step=0.00192, global_step=2720.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  73%|███████▎  | 4372/5971 [44:14<16:10,  1.65it/s, loss=0.175, v_num=0, train/loss_simple_step=0.00192, train/loss_vlb_step=1.1e-5, train/loss_step=0.00192, global_step=2720.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4372/5971 [44:14<16:10,  1.65it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00412, train/loss_vlb_step=2.17e-5, train/loss_step=0.00412, global_step=2720.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4373/5971 [44:15<16:10,  1.65it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00412, train/loss_vlb_step=2.17e-5, train/loss_step=0.00412, global_step=2720.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4373/5971 [44:15<16:10,  1.65it/s, loss=0.156, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00171, train/loss_step=0.282, global_step=2721.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  73%|███████▎  | 4374/5971 [44:16<16:09,  1.65it/s, loss=0.156, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00171, train/loss_step=0.282, global_step=2721.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4374/5971 [44:16<16:09,  1.65it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0374, train/loss_vlb_step=0.000134, train/loss_step=0.0374, global_step=2721.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4375/5971 [44:17<16:09,  1.65it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0374, train/loss_vlb_step=0.000134, train/loss_step=0.0374, global_step=2721.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4375/5971 [44:17<16:09,  1.65it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00485, train/loss_vlb_step=2.61e-5, train/loss_step=0.00485, global_step=2721.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4376/5971 [44:19<16:09,  1.65it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00485, train/loss_vlb_step=2.61e-5, train/loss_step=0.00485, global_step=2721.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4376/5971 [44:19<16:09,  1.65it/s, loss=0.111, v_num=0, train/loss_simple_step=0.221, train/loss_vlb_step=0.000756, train/loss_step=0.221, global_step=2721.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  73%|███████▎  | 4377/5971 [44:20<16:08,  1.65it/s, loss=0.111, v_num=0, train/loss_simple_step=0.221, train/loss_vlb_step=0.000756, train/loss_step=0.221, global_step=2721.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4377/5971 [44:20<16:08,  1.65it/s, loss=0.139, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0067, train/loss_step=0.606, global_step=2722.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  73%|███████▎  | 4378/5971 [44:21<16:08,  1.65it/s, loss=0.139, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0067, train/loss_step=0.606, global_step=2722.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4378/5971 [44:21<16:08,  1.65it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0301, train/loss_vlb_step=0.000114, train/loss_step=0.0301, global_step=2722.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4379/5971 [44:22<16:07,  1.65it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0301, train/loss_vlb_step=0.000114, train/loss_step=0.0301, global_step=2722.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4379/5971 [44:22<16:07,  1.65it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00176, train/loss_vlb_step=9.96e-6, train/loss_step=0.00176, global_step=2722.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4380/5971 [44:25<16:08,  1.64it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00176, train/loss_vlb_step=9.96e-6, train/loss_step=0.00176, global_step=2722.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4380/5971 [44:25<16:08,  1.64it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0356, train/loss_vlb_step=0.000123, train/loss_step=0.0356, global_step=2722.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4381/5971 [44:26<16:07,  1.64it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0356, train/loss_vlb_step=0.000123, train/loss_step=0.0356, global_step=2722.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4381/5971 [44:26<16:07,  1.64it/s, loss=0.151, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.000871, train/loss_step=0.234, global_step=2723.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  73%|███████▎  | 4382/5971 [44:27<16:07,  1.64it/s, loss=0.151, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.000871, train/loss_step=0.234, global_step=2723.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4382/5971 [44:27<16:07,  1.64it/s, loss=0.161, v_num=0, train/loss_simple_step=0.341, train/loss_vlb_step=0.00158, train/loss_step=0.341, global_step=2723.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  73%|███████▎  | 4383/5971 [44:28<16:06,  1.64it/s, loss=0.161, v_num=0, train/loss_simple_step=0.341, train/loss_vlb_step=0.00158, train/loss_step=0.341, global_step=2723.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4383/5971 [44:28<16:06,  1.64it/s, loss=0.138, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000364, train/loss_step=0.111, global_step=2723.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4384/5971 [44:30<16:06,  1.64it/s, loss=0.138, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000364, train/loss_step=0.111, global_step=2723.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4384/5971 [44:30<16:06,  1.64it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00668, train/loss_vlb_step=3.23e-5, train/loss_step=0.00668, global_step=2723.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4385/5971 [44:31<16:05,  1.64it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00668, train/loss_vlb_step=3.23e-5, train/loss_step=0.00668, global_step=2723.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4385/5971 [44:31<16:05,  1.64it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0352, train/loss_vlb_step=0.000136, train/loss_step=0.0352, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  73%|███████▎  | 4386/5971 [44:32<16:05,  1.64it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0352, train/loss_vlb_step=0.000136, train/loss_step=0.0352, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4386/5971 [44:32<16:05,  1.64it/s, loss=0.1, v_num=0, train/loss_simple_step=0.00854, train/loss_vlb_step=4.08e-5, train/loss_step=0.00854, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  73%|███████▎  | 4387/5971 [44:33<16:04,  1.64it/s, loss=0.1, v_num=0, train/loss_simple_step=0.00854, train/loss_vlb_step=4.08e-5, train/loss_step=0.00854, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4387/5971 [44:33<16:04,  1.64it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0637, train/loss_vlb_step=0.000217, train/loss_step=0.0637, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4388/5971 [44:35<16:04,  1.64it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0637, train/loss_vlb_step=0.000217, train/loss_step=0.0637, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  73%|███████▎  | 4388/5971 [44:35<16:04,  1.64it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:21,  2.05it/s]
Epoch 4:  74%|███████▎  | 4390/5971 [44:36<16:03,  1.64it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   2%|▏         | 3/167 [00:00<00:27,  5.93it/s]
Epoch 4:  74%|███████▎  | 4392/5971 [44:36<16:01,  1.64it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   4%|▎         | 6/167 [00:00<00:14, 11.19it/s]
Epoch 4:  74%|███████▎  | 4395/5971 [44:36<15:59,  1.64it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   5%|▌         | 9/167 [00:00<00:10, 14.56it/s]
Epoch 4:  74%|███████▎  | 4398/5971 [44:36<15:57,  1.64it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   7%|▋         | 12/167 [00:00<00:08, 17.80it/s]
Epoch 4:  74%|███████▎  | 4401/5971 [44:36<15:54,  1.64it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   9%|▉         | 15/167 [00:01<00:08, 18.33it/s]
Epoch 4:  74%|███████▍  | 4404/5971 [44:36<15:52,  1.65it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  11%|█         | 18/167 [00:01<00:07, 20.47it/s]
Epoch 4:  74%|███████▍  | 4407/5971 [44:36<15:49,  1.65it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 22.05it/s]
Epoch 4:  74%|███████▍  | 4410/5971 [44:36<15:47,  1.65it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  14%|█▍        | 24/167 [00:01<00:06, 22.74it/s][A
Epoch 4:  74%|███████▍  | 4413/5971 [44:36<15:44,  1.65it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 23.85it/s][A
Epoch 4:  74%|███████▍  | 4416/5971 [44:37<15:42,  1.65it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 23.71it/s][A
Epoch 4:  74%|███████▍  | 4419/5971 [44:37<15:40,  1.65it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 25.08it/s][A
Epoch 4:  74%|███████▍  | 4422/5971 [44:37<15:37,  1.65it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  22%|██▏       | 36/167 [00:01<00:05, 25.11it/s][A
Epoch 4:  74%|███████▍  | 4425/5971 [44:37<15:35,  1.65it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  23%|██▎       | 39/167 [00:02<00:05, 24.12it/s][A
Epoch 4:  74%|███████▍  | 4428/5971 [44:37<15:32,  1.65it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 25.13it/s][A
Epoch 4:  74%|███████▍  | 4431/5971 [44:37<15:30,  1.66it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 24.55it/s][A
Epoch 4:  74%|███████▍  | 4434/5971 [44:37<15:28,  1.66it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 24.52it/s][A
Epoch 4:  74%|███████▍  | 4437/5971 [44:37<15:25,  1.66it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  31%|███       | 51/167 [00:02<00:04, 25.11it/s][A
Epoch 4:  74%|███████▍  | 4440/5971 [44:38<15:23,  1.66it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 26.02it/s][A
Epoch 4:  74%|███████▍  | 4443/5971 [44:38<15:20,  1.66it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 25.34it/s][A
Epoch 4:  74%|███████▍  | 4446/5971 [44:38<15:18,  1.66it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  37%|███▋      | 61/167 [00:02<00:03, 27.24it/s][A
Epoch 4:  75%|███████▍  | 4450/5971 [44:38<15:15,  1.66it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  38%|███▊      | 64/167 [00:03<00:03, 26.60it/s][A
Epoch 4:  75%|███████▍  | 4454/5971 [44:38<15:12,  1.66it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  40%|████      | 67/167 [00:03<00:04, 24.99it/s][A
Epoch 4:  75%|███████▍  | 4458/5971 [44:38<15:08,  1.66it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 25.45it/s][A

Validating:  44%|████▎     | 73/167 [00:03<00:03, 26.20it/s][A
Epoch 4:  75%|███████▍  | 4462/5971 [44:38<15:05,  1.67it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 26.22it/s][A
Epoch 4:  75%|███████▍  | 4466/5971 [44:39<15:02,  1.67it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 25.83it/s][A
Epoch 4:  75%|███████▍  | 4470/5971 [44:39<14:59,  1.67it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  49%|████▉     | 82/167 [00:03<00:03, 26.68it/s][A

Validating:  51%|█████     | 85/167 [00:03<00:03, 26.17it/s][A
Epoch 4:  75%|███████▍  | 4474/5971 [44:39<14:56,  1.67it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  53%|█████▎    | 88/167 [00:03<00:03, 25.17it/s][A
Epoch 4:  75%|███████▍  | 4478/5971 [44:39<14:53,  1.67it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  54%|█████▍    | 91/167 [00:04<00:03, 25.33it/s][A
Epoch 4:  75%|███████▌  | 4482/5971 [44:39<14:50,  1.67it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 26.16it/s][A

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 25.14it/s][A
Epoch 4:  75%|███████▌  | 4486/5971 [44:39<14:46,  1.67it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 26.03it/s][A
Epoch 4:  75%|███████▌  | 4490/5971 [44:39<14:43,  1.68it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 26.33it/s][A
Epoch 4:  75%|███████▌  | 4494/5971 [44:40<14:40,  1.68it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 25.73it/s][A
Epoch 4:  75%|███████▌  | 4498/5971 [44:40<14:37,  1.68it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 27.69it/s][A

Validating:  68%|██████▊   | 113/167 [00:04<00:01, 27.70it/s][A
Epoch 4:  75%|███████▌  | 4502/5971 [44:40<14:34,  1.68it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  69%|██████▉   | 116/167 [00:04<00:01, 27.97it/s][A
Epoch 4:  75%|███████▌  | 4506/5971 [44:40<14:31,  1.68it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 26.46it/s][A
Epoch 4:  76%|███████▌  | 4510/5971 [44:40<14:28,  1.68it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 26.07it/s][A

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 27.02it/s][A
Epoch 4:  76%|███████▌  | 4514/5971 [44:40<14:25,  1.68it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 28.45it/s][A
Epoch 4:  76%|███████▌  | 4518/5971 [44:40<14:22,  1.69it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 27.87it/s][A
Epoch 4:  76%|███████▌  | 4522/5971 [44:41<14:18,  1.69it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  81%|████████  | 135/167 [00:05<00:01, 26.37it/s][A
Epoch 4:  76%|███████▌  | 4526/5971 [44:41<14:15,  1.69it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 26.34it/s][A

Validating:  84%|████████▍ | 141/167 [00:05<00:00, 27.05it/s][A
Epoch 4:  76%|███████▌  | 4530/5971 [44:41<14:12,  1.69it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 27.58it/s][A
Epoch 4:  76%|███████▌  | 4534/5971 [44:41<14:09,  1.69it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 27.77it/s][A
Epoch 4:  76%|███████▌  | 4538/5971 [44:41<14:06,  1.69it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 27.59it/s][A

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 27.63it/s][A
Epoch 4:  76%|███████▌  | 4542/5971 [44:41<14:03,  1.69it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 28.16it/s][A
Epoch 4:  76%|███████▌  | 4546/5971 [44:42<14:00,  1.70it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 27.80it/s][A
Epoch 4:  76%|███████▌  | 4550/5971 [44:42<13:57,  1.70it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 27.03it/s][A

Validating:  99%|█████████▉| 165/167 [00:06<00:00, 27.02it/s][A
Epoch 4:  76%|███████▋  | 4554/5971 [44:42<13:54,  1.70it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  76%|███████▋  | 4556/5971 [44:42<13:52,  1.70it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000327, train/loss_step=0.0967, global_step=2724.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Epoch 4:  76%|███████▋  | 4557/5971 [44:43<13:52,  1.70it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0026, train/loss_vlb_step=1.37e-5, train/loss_step=0.0026, global_step=2725.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  76%|███████▋  | 4558/5971 [44:44<13:52,  1.70it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0026, train/loss_vlb_step=1.37e-5, train/loss_step=0.0026, global_step=2725.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  76%|███████▋  | 4558/5971 [44:44<13:52,  1.70it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00345, train/loss_vlb_step=1.81e-5, train/loss_step=0.00345, global_step=2725.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  76%|███████▋  | 4559/5971 [44:45<13:51,  1.70it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0368, train/loss_vlb_step=0.000133, train/loss_step=0.0368, global_step=2725.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  76%|███████▋  | 4560/5971 [44:48<13:51,  1.70it/s, loss=0.122, v_num=0, train/loss_simple_step=0.277, train/loss_vlb_step=0.0011, train/loss_step=0.277, global_step=2725.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  76%|███████▋  | 4561/5971 [44:49<13:51,  1.70it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=5.49e-5, train/loss_step=0.0149, global_step=2726.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  76%|███████▋  | 4562/5971 [44:50<13:50,  1.70it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=5.49e-5, train/loss_step=0.0149, global_step=2726.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  76%|███████▋  | 4562/5971 [44:50<13:50,  1.70it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0022, train/loss_vlb_step=1.21e-5, train/loss_step=0.0022, global_step=2726.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  76%|███████▋  | 4563/5971 [44:50<13:50,  1.70it/s, loss=0.107, v_num=0, train/loss_simple_step=0.00291, train/loss_vlb_step=1.54e-5, train/loss_step=0.00291, global_step=2726.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  76%|███████▋  | 4564/5971 [44:53<13:50,  1.69it/s, loss=0.102, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000432, train/loss_step=0.131, global_step=2726.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  76%|███████▋  | 4565/5971 [44:54<13:49,  1.69it/s, loss=0.0769, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000339, train/loss_step=0.103, global_step=2727.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  76%|███████▋  | 4566/5971 [44:55<13:49,  1.69it/s, loss=0.0769, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000339, train/loss_step=0.103, global_step=2727.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  76%|███████▋  | 4566/5971 [44:55<13:49,  1.69it/s, loss=0.0895, v_num=0, train/loss_simple_step=0.281, train/loss_vlb_step=0.00126, train/loss_step=0.281, global_step=2727.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  76%|███████▋  | 4567/5971 [44:56<13:48,  1.69it/s, loss=0.0909, v_num=0, train/loss_simple_step=0.0298, train/loss_vlb_step=0.000105, train/loss_step=0.0298, global_step=2727.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4568/5971 [44:58<13:48,  1.69it/s, loss=0.102, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00102, train/loss_step=0.267, global_step=2727.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  77%|███████▋  | 4569/5971 [44:59<13:48,  1.69it/s, loss=0.0927, v_num=0, train/loss_simple_step=0.0374, train/loss_vlb_step=0.000135, train/loss_step=0.0374, global_step=2728.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4570/5971 [44:59<13:47,  1.69it/s, loss=0.0927, v_num=0, train/loss_simple_step=0.0374, train/loss_vlb_step=0.000135, train/loss_step=0.0374, global_step=2728.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4570/5971 [44:59<13:47,  1.69it/s, loss=0.115, v_num=0, train/loss_simple_step=0.793, train/loss_vlb_step=0.0319, train/loss_step=0.793, global_step=2728.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]     
Epoch 4:  77%|███████▋  | 4571/5971 [45:00<13:47,  1.69it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=4.82e-5, train/loss_step=0.0112, global_step=2728.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4572/5971 [45:03<13:46,  1.69it/s, loss=0.122, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000875, train/loss_step=0.240, global_step=2728.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4573/5971 [45:04<13:46,  1.69it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00619, train/loss_vlb_step=3.02e-5, train/loss_step=0.00619, global_step=2729.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4574/5971 [45:05<13:46,  1.69it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00619, train/loss_vlb_step=3.02e-5, train/loss_step=0.00619, global_step=2729.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4574/5971 [45:05<13:46,  1.69it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0181, train/loss_vlb_step=7.49e-5, train/loss_step=0.0181, global_step=2729.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  77%|███████▋  | 4575/5971 [45:06<13:45,  1.69it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00421, train/loss_vlb_step=2.14e-5, train/loss_step=0.00421, global_step=2729.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4576/5971 [45:08<13:45,  1.69it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=5.84e-5, train/loss_step=0.0141, global_step=2729.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  77%|███████▋  | 4577/5971 [45:09<13:44,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00123, train/loss_step=0.305, global_step=2730.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  77%|███████▋  | 4578/5971 [45:10<13:44,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00123, train/loss_step=0.305, global_step=2730.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4578/5971 [45:10<13:44,  1.69it/s, loss=0.142, v_num=0, train/loss_simple_step=0.260, train/loss_vlb_step=0.000958, train/loss_step=0.260, global_step=2730.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4579/5971 [45:11<13:43,  1.69it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0656, train/loss_vlb_step=0.000224, train/loss_step=0.0656, global_step=2730.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4580/5971 [45:13<13:43,  1.69it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0182, train/loss_vlb_step=7.01e-5, train/loss_step=0.0182, global_step=2730.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  77%|███████▋  | 4581/5971 [45:14<13:43,  1.69it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00358, train/loss_vlb_step=1.9e-5, train/loss_step=0.00358, global_step=2731.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4582/5971 [45:14<13:42,  1.69it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00358, train/loss_vlb_step=1.9e-5, train/loss_step=0.00358, global_step=2731.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4582/5971 [45:14<13:42,  1.69it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0847, train/loss_vlb_step=0.000281, train/loss_step=0.0847, global_step=2731.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4583/5971 [45:15<13:42,  1.69it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00158, train/loss_vlb_step=9.3e-6, train/loss_step=0.00158, global_step=2731.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4584/5971 [45:18<13:42,  1.69it/s, loss=0.151, v_num=0, train/loss_simple_step=0.470, train/loss_vlb_step=0.00577, train/loss_step=0.470, global_step=2731.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  77%|███████▋  | 4585/5971 [45:19<13:41,  1.69it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0311, train/loss_vlb_step=0.000117, train/loss_step=0.0311, global_step=2732.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4586/5971 [45:20<13:41,  1.69it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0311, train/loss_vlb_step=0.000117, train/loss_step=0.0311, global_step=2732.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4586/5971 [45:20<13:41,  1.69it/s, loss=0.145, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.000801, train/loss_step=0.234, global_step=2732.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  77%|███████▋  | 4587/5971 [45:21<13:40,  1.69it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.53e-5, train/loss_step=0.0102, global_step=2732.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4588/5971 [45:23<13:40,  1.68it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0339, train/loss_vlb_step=0.000117, train/loss_step=0.0339, global_step=2732.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4589/5971 [45:24<13:40,  1.68it/s, loss=0.14, v_num=0, train/loss_simple_step=0.191, train/loss_vlb_step=0.000657, train/loss_step=0.191, global_step=2733.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  77%|███████▋  | 4590/5971 [45:25<13:39,  1.68it/s, loss=0.14, v_num=0, train/loss_simple_step=0.191, train/loss_vlb_step=0.000657, train/loss_step=0.191, global_step=2733.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4590/5971 [45:25<13:39,  1.68it/s, loss=0.1, v_num=0, train/loss_simple_step=0.00256, train/loss_vlb_step=1.37e-5, train/loss_step=0.00256, global_step=2733.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4591/5971 [45:26<13:39,  1.68it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=5.59e-5, train/loss_step=0.0149, global_step=2733.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  77%|███████▋  | 4592/5971 [45:29<13:39,  1.68it/s, loss=0.0903, v_num=0, train/loss_simple_step=0.0374, train/loss_vlb_step=0.000139, train/loss_step=0.0374, global_step=2733.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4593/5971 [45:30<13:39,  1.68it/s, loss=0.0942, v_num=0, train/loss_simple_step=0.0845, train/loss_vlb_step=0.000279, train/loss_step=0.0845, global_step=2734.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4594/5971 [45:31<13:38,  1.68it/s, loss=0.0942, v_num=0, train/loss_simple_step=0.0845, train/loss_vlb_step=0.000279, train/loss_step=0.0845, global_step=2734.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4594/5971 [45:31<13:38,  1.68it/s, loss=0.106, v_num=0, train/loss_simple_step=0.246, train/loss_vlb_step=0.00107, train/loss_step=0.246, global_step=2734.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  77%|███████▋  | 4595/5971 [45:32<13:38,  1.68it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00344, train/loss_vlb_step=1.86e-5, train/loss_step=0.00344, global_step=2734.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4596/5971 [45:35<13:38,  1.68it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0727, train/loss_vlb_step=0.000247, train/loss_step=0.0727, global_step=2734.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  77%|███████▋  | 4597/5971 [45:36<13:37,  1.68it/s, loss=0.125, v_num=0, train/loss_simple_step=0.640, train/loss_vlb_step=0.0134, train/loss_step=0.640, global_step=2735.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  77%|███████▋  | 4598/5971 [45:37<13:37,  1.68it/s, loss=0.125, v_num=0, train/loss_simple_step=0.640, train/loss_vlb_step=0.0134, train/loss_step=0.640, global_step=2735.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4598/5971 [45:37<13:37,  1.68it/s, loss=0.123, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.000798, train/loss_step=0.220, global_step=2735.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4599/5971 [45:38<13:36,  1.68it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0161, train/loss_vlb_step=6.9e-5, train/loss_step=0.0161, global_step=2735.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4600/5971 [45:40<13:36,  1.68it/s, loss=0.131, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000883, train/loss_step=0.225, global_step=2735.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4601/5971 [45:41<13:36,  1.68it/s, loss=0.137, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000394, train/loss_step=0.119, global_step=2736.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4602/5971 [45:42<13:35,  1.68it/s, loss=0.137, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000394, train/loss_step=0.119, global_step=2736.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4602/5971 [45:42<13:35,  1.68it/s, loss=0.156, v_num=0, train/loss_simple_step=0.459, train/loss_vlb_step=0.00348, train/loss_step=0.459, global_step=2736.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  77%|███████▋  | 4603/5971 [45:42<13:35,  1.68it/s, loss=0.161, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000386, train/loss_step=0.118, global_step=2736.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4604/5971 [45:45<13:34,  1.68it/s, loss=0.144, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000417, train/loss_step=0.126, global_step=2736.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4605/5971 [45:46<13:34,  1.68it/s, loss=0.162, v_num=0, train/loss_simple_step=0.379, train/loss_vlb_step=0.0017, train/loss_step=0.379, global_step=2737.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  77%|███████▋  | 4606/5971 [45:47<13:33,  1.68it/s, loss=0.162, v_num=0, train/loss_simple_step=0.379, train/loss_vlb_step=0.0017, train/loss_step=0.379, global_step=2737.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4606/5971 [45:47<13:33,  1.68it/s, loss=0.155, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=2737.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4607/5971 [45:47<13:33,  1.68it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0323, train/loss_vlb_step=0.000116, train/loss_step=0.0323, global_step=2737.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4608/5971 [45:50<13:33,  1.68it/s, loss=0.164, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000669, train/loss_step=0.184, global_step=2737.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  77%|███████▋  | 4609/5971 [45:50<13:32,  1.68it/s, loss=0.16, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000366, train/loss_step=0.111, global_step=2738.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  77%|███████▋  | 4610/5971 [45:51<13:32,  1.68it/s, loss=0.16, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000366, train/loss_step=0.111, global_step=2738.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4610/5971 [45:51<13:32,  1.68it/s, loss=0.16, v_num=0, train/loss_simple_step=0.012, train/loss_vlb_step=5.36e-5, train/loss_step=0.012, global_step=2738.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  77%|███████▋  | 4611/5971 [45:52<13:31,  1.68it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0269, train/loss_vlb_step=9.53e-5, train/loss_step=0.0269, global_step=2738.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4612/5971 [45:55<13:31,  1.67it/s, loss=0.172, v_num=0, train/loss_simple_step=0.263, train/loss_vlb_step=0.00118, train/loss_step=0.263, global_step=2738.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  77%|███████▋  | 4613/5971 [45:56<13:31,  1.67it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.59e-5, train/loss_step=0.0102, global_step=2739.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4614/5971 [45:57<13:30,  1.67it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.59e-5, train/loss_step=0.0102, global_step=2739.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4614/5971 [45:57<13:30,  1.67it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0913, train/loss_vlb_step=0.0003, train/loss_step=0.0913, global_step=2739.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  77%|███████▋  | 4615/5971 [45:57<13:30,  1.67it/s, loss=0.195, v_num=0, train/loss_simple_step=0.700, train/loss_vlb_step=0.0164, train/loss_step=0.700, global_step=2739.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  77%|███████▋  | 4616/5971 [46:00<13:30,  1.67it/s, loss=0.219, v_num=0, train/loss_simple_step=0.549, train/loss_vlb_step=0.00607, train/loss_step=0.549, global_step=2739.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4617/5971 [46:01<13:29,  1.67it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0608, train/loss_vlb_step=0.000208, train/loss_step=0.0608, global_step=2740.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4618/5971 [46:02<13:29,  1.67it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0608, train/loss_vlb_step=0.000208, train/loss_step=0.0608, global_step=2740.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4618/5971 [46:02<13:29,  1.67it/s, loss=0.179, v_num=0, train/loss_simple_step=0.00422, train/loss_vlb_step=2.06e-5, train/loss_step=0.00422, global_step=2740.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4619/5971 [46:03<13:28,  1.67it/s, loss=0.179, v_num=0, train/loss_simple_step=0.012, train/loss_vlb_step=5.46e-5, train/loss_step=0.012, global_step=2740.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  77%|███████▋  | 4620/5971 [46:05<13:28,  1.67it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0201, train/loss_vlb_step=8.01e-5, train/loss_step=0.0201, global_step=2740.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4621/5971 [46:06<13:27,  1.67it/s, loss=0.176, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00094, train/loss_step=0.253, global_step=2741.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  77%|███████▋  | 4622/5971 [46:06<13:27,  1.67it/s, loss=0.176, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00094, train/loss_step=0.253, global_step=2741.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4622/5971 [46:06<13:27,  1.67it/s, loss=0.184, v_num=0, train/loss_simple_step=0.637, train/loss_vlb_step=0.00965, train/loss_step=0.637, global_step=2741.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4623/5971 [46:07<13:26,  1.67it/s, loss=0.179, v_num=0, train/loss_simple_step=0.00841, train/loss_vlb_step=3.87e-5, train/loss_step=0.00841, global_step=2741.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4624/5971 [46:10<13:26,  1.67it/s, loss=0.194, v_num=0, train/loss_simple_step=0.422, train/loss_vlb_step=0.00293, train/loss_step=0.422, global_step=2741.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  77%|███████▋  | 4625/5971 [46:11<13:26,  1.67it/s, loss=0.187, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.000882, train/loss_step=0.236, global_step=2742.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4626/5971 [46:12<13:25,  1.67it/s, loss=0.187, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.000882, train/loss_step=0.236, global_step=2742.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4626/5971 [46:12<13:25,  1.67it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0569, train/loss_vlb_step=0.000191, train/loss_step=0.0569, global_step=2742.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  77%|███████▋  | 4627/5971 [46:12<13:25,  1.67it/s, loss=0.201, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00206, train/loss_step=0.364, global_step=2742.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  78%|███████▊  | 4628/5971 [46:15<13:25,  1.67it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0236, train/loss_vlb_step=9.13e-5, train/loss_step=0.0236, global_step=2742.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  78%|███████▊  | 4629/5971 [46:15<13:24,  1.67it/s, loss=0.189, v_num=0, train/loss_simple_step=0.038, train/loss_vlb_step=0.000141, train/loss_step=0.038, global_step=2743.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  78%|███████▊  | 4630/5971 [46:16<13:24,  1.67it/s, loss=0.189, v_num=0, train/loss_simple_step=0.038, train/loss_vlb_step=0.000141, train/loss_step=0.038, global_step=2743.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  78%|███████▊  | 4630/5971 [46:16<13:24,  1.67it/s, loss=0.2, v_num=0, train/loss_simple_step=0.232, train/loss_vlb_step=0.000822, train/loss_step=0.232, global_step=2743.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  78%|███████▊  | 4631/5971 [46:17<13:23,  1.67it/s, loss=0.199, v_num=0, train/loss_simple_step=0.00533, train/loss_vlb_step=2.76e-5, train/loss_step=0.00533, global_step=2743.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  78%|███████▊  | 4632/5971 [46:19<13:23,  1.67it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00377, train/loss_vlb_step=2.06e-5, train/loss_step=0.00377, global_step=2743.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  78%|███████▊  | 4633/5971 [46:20<13:22,  1.67it/s, loss=0.193, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000473, train/loss_step=0.141, global_step=2744.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  78%|███████▊  | 4634/5971 [46:21<13:22,  1.67it/s, loss=0.193, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000473, train/loss_step=0.141, global_step=2744.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  78%|███████▊  | 4634/5971 [46:21<13:22,  1.67it/s, loss=0.193, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000341, train/loss_step=0.103, global_step=2744.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  78%|███████▊  | 4635/5971 [46:22<13:21,  1.67it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0179, train/loss_vlb_step=7.5e-5, train/loss_step=0.0179, global_step=2744.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  78%|███████▊  | 4636/5971 [46:24<13:21,  1.67it/s, loss=0.141, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.00102, train/loss_step=0.188, global_step=2744.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  78%|███████▊  | 4637/5971 [46:25<13:21,  1.66it/s, loss=0.145, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000469, train/loss_step=0.139, global_step=2745.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  78%|███████▊  | 4638/5971 [46:26<13:20,  1.66it/s, loss=0.145, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000469, train/loss_step=0.139, global_step=2745.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  78%|███████▊  | 4638/5971 [46:26<13:20,  1.66it/s, loss=0.158, v_num=0, train/loss_simple_step=0.257, train/loss_vlb_step=0.00106, train/loss_step=0.257, global_step=2745.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  78%|███████▊  | 4639/5971 [46:27<13:20,  1.66it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0536, train/loss_vlb_step=0.00019, train/loss_step=0.0536, global_step=2745.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  78%|███████▊  | 4640/5971 [46:29<13:20,  1.66it/s, loss=0.168, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000644, train/loss_step=0.190, global_step=2745.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  78%|███████▊  | 4641/5971 [46:30<13:19,  1.66it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0897, train/loss_vlb_step=0.000305, train/loss_step=0.0897, global_step=2746.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  78%|███████▊  | 4642/5971 [46:31<13:19,  1.66it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0897, train/loss_vlb_step=0.000305, train/loss_step=0.0897, global_step=2746.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  78%|███████▊  | 4642/5971 [46:31<13:19,  1.66it/s, loss=0.135, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00044, train/loss_step=0.130, global_step=2746.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  78%|███████▊  | 4643/5971 [46:32<13:18,  1.66it/s, loss=0.165, v_num=0, train/loss_simple_step=0.601, train/loss_vlb_step=0.00711, train/loss_step=0.601, global_step=2746.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  78%|███████▊  | 4644/5971 [46:34<13:18,  1.66it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0193, train/loss_vlb_step=8.11e-5, train/loss_step=0.0193, global_step=2746.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  78%|███████▊  | 4645/5971 [46:35<13:17,  1.66it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0461, train/loss_vlb_step=0.000159, train/loss_step=0.0461, global_step=2747.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  78%|███████▊  | 4646/5971 [46:36<13:17,  1.66it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0461, train/loss_vlb_step=0.000159, train/loss_step=0.0461, global_step=2747.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  78%|███████▊  | 4646/5971 [46:36<13:17,  1.66it/s, loss=0.142, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000665, train/loss_step=0.196, global_step=2747.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  78%|███████▊  | 4647/5971 [46:37<13:16,  1.66it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.44e-5, train/loss_step=0.0149, global_step=2747.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  78%|███████▊  | 4648/5971 [46:39<13:16,  1.66it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0015, train/loss_vlb_step=9e-6, train/loss_step=0.0015, global_step=2747.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  78%|███████▊  | 4649/5971 [46:40<13:16,  1.66it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00623, train/loss_vlb_step=3.09e-5, train/loss_step=0.00623, global_step=2748.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  78%|███████▊  | 4650/5971 [46:41<13:15,  1.66it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00623, train/loss_vlb_step=3.09e-5, train/loss_step=0.00623, global_step=2748.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  78%|███████▊  | 4650/5971 [46:41<13:15,  1.66it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0432, train/loss_vlb_step=0.000156, train/loss_step=0.0432, global_step=2748.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  78%|███████▊  | 4651/5971 [46:42<13:15,  1.66it/s, loss=0.156, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.0561, train/loss_step=0.873, global_step=2748.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  78%|███████▊  | 4652/5971 [46:44<13:14,  1.66it/s, loss=0.16, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2748.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  78%|███████▊  | 4653/5971 [46:45<13:14,  1.66it/s, loss=0.163, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000638, train/loss_step=0.188, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  78%|███████▊  | 4654/5971 [46:46<13:13,  1.66it/s, loss=0.163, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000638, train/loss_step=0.188, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  78%|███████▊  | 4654/5971 [46:46<13:13,  1.66it/s, loss=0.163, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000343, train/loss_step=0.103, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  78%|███████▊  | 4655/5971 [46:47<13:13,  1.66it/s, loss=0.171, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000638, train/loss_step=0.177, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  78%|███████▊  | 4656/5971 [46:49<13:13,  1.66it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:12,  2.30it/s][A
Epoch 4:  78%|███████▊  | 4658/5971 [46:49<13:11,  1.66it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   1%|          | 2/167 [00:00<00:46,  3.56it/s][A

Validating:   3%|▎         | 5/167 [00:00<00:18,  8.87it/s][A
Epoch 4:  78%|███████▊  | 4662/5971 [46:49<13:08,  1.66it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:00<00:12, 12.50it/s][A
Epoch 4:  78%|███████▊  | 4666/5971 [46:50<13:05,  1.66it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   7%|▋         | 11/167 [00:01<00:10, 15.45it/s][A
Epoch 4:  78%|███████▊  | 4670/5971 [46:50<13:02,  1.66it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   8%|▊         | 14/167 [00:01<00:08, 18.56it/s][A

Validating:  10%|█         | 17/167 [00:01<00:07, 20.95it/s][A
Epoch 4:  78%|███████▊  | 4674/5971 [46:50<12:59,  1.66it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 23.73it/s][A
Epoch 4:  78%|███████▊  | 4678/5971 [46:50<12:56,  1.66it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  14%|█▍        | 24/167 [00:01<00:05, 24.56it/s][A
Epoch 4:  78%|███████▊  | 4682/5971 [46:50<12:53,  1.67it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 24.82it/s][A
Epoch 4:  78%|███████▊  | 4686/5971 [46:50<12:50,  1.67it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 24.08it/s][A

Validating:  20%|█▉        | 33/167 [00:01<00:05, 25.19it/s][A
Epoch 4:  79%|███████▊  | 4690/5971 [46:51<12:47,  1.67it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  22%|██▏       | 36/167 [00:01<00:05, 25.25it/s][A
Epoch 4:  79%|███████▊  | 4694/5971 [46:51<12:44,  1.67it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  23%|██▎       | 39/167 [00:02<00:05, 25.17it/s][A
Epoch 4:  79%|███████▊  | 4698/5971 [46:51<12:41,  1.67it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  26%|██▌       | 43/167 [00:02<00:04, 26.31it/s][A
Epoch 4:  79%|███████▊  | 4702/5971 [46:51<12:38,  1.67it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  28%|██▊       | 46/167 [00:02<00:04, 24.62it/s][A

Validating:  29%|██▉       | 49/167 [00:02<00:04, 24.71it/s][A
Epoch 4:  79%|███████▉  | 4706/5971 [46:51<12:35,  1.67it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  31%|███       | 52/167 [00:02<00:04, 24.60it/s][A
Epoch 4:  79%|███████▉  | 4710/5971 [46:51<12:32,  1.68it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  33%|███▎      | 55/167 [00:02<00:04, 25.16it/s][A
Epoch 4:  79%|███████▉  | 4714/5971 [46:52<12:29,  1.68it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  35%|███▍      | 58/167 [00:02<00:04, 26.07it/s][A
Epoch 4:  79%|███████▉  | 4718/5971 [46:52<12:26,  1.68it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  37%|███▋      | 62/167 [00:02<00:03, 26.79it/s][A
Epoch 4:  79%|███████▉  | 4722/5971 [46:52<12:23,  1.68it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  40%|███▉      | 66/167 [00:03<00:03, 27.93it/s][A
Epoch 4:  79%|███████▉  | 4726/5971 [46:52<12:20,  1.68it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 28.94it/s][A

Validating:  44%|████▎     | 73/167 [00:03<00:03, 26.69it/s][A
Epoch 4:  79%|███████▉  | 4730/5971 [46:52<12:17,  1.68it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 26.80it/s][A
Epoch 4:  79%|███████▉  | 4734/5971 [46:52<12:14,  1.68it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 26.84it/s][A
Epoch 4:  79%|███████▉  | 4738/5971 [46:52<12:11,  1.68it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  49%|████▉     | 82/167 [00:03<00:03, 25.06it/s][A

Validating:  51%|█████     | 85/167 [00:03<00:03, 25.62it/s][A
Epoch 4:  79%|███████▉  | 4742/5971 [46:53<12:08,  1.69it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  53%|█████▎    | 88/167 [00:03<00:03, 24.90it/s][A
Epoch 4:  79%|███████▉  | 4746/5971 [46:53<12:05,  1.69it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  54%|█████▍    | 91/167 [00:04<00:03, 24.44it/s][A
Epoch 4:  80%|███████▉  | 4750/5971 [46:53<12:03,  1.69it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 25.32it/s][A

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 26.51it/s][A
Epoch 4:  80%|███████▉  | 4754/5971 [46:53<12:00,  1.69it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 26.31it/s][A
Epoch 4:  80%|███████▉  | 4758/5971 [46:53<11:57,  1.69it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 26.53it/s][A
Epoch 4:  80%|███████▉  | 4762/5971 [46:53<11:54,  1.69it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 26.82it/s][A

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 26.07it/s][A
Epoch 4:  80%|███████▉  | 4766/5971 [46:54<11:51,  1.69it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  67%|██████▋   | 112/167 [00:04<00:02, 26.66it/s][A
Epoch 4:  80%|███████▉  | 4770/5971 [46:54<11:48,  1.70it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  69%|██████▉   | 115/167 [00:04<00:01, 27.22it/s][A
Epoch 4:  80%|███████▉  | 4774/5971 [46:54<11:45,  1.70it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  71%|███████   | 118/167 [00:05<00:01, 27.22it/s][A

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 26.80it/s][A
Epoch 4:  80%|████████  | 4778/5971 [46:54<11:42,  1.70it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 26.02it/s][A
Epoch 4:  80%|████████  | 4782/5971 [46:54<11:39,  1.70it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 25.92it/s][A
Epoch 4:  80%|████████  | 4786/5971 [46:54<11:36,  1.70it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 25.24it/s][A

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 25.40it/s][A
Epoch 4:  80%|████████  | 4790/5971 [46:54<11:33,  1.70it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 25.96it/s][A
Epoch 4:  80%|████████  | 4794/5971 [46:55<11:31,  1.70it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 25.98it/s][A
Epoch 4:  80%|████████  | 4798/5971 [46:55<11:28,  1.70it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  85%|████████▌ | 142/167 [00:06<00:00, 25.88it/s][A

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 24.71it/s][A
Epoch 4:  80%|████████  | 4802/5971 [46:55<11:25,  1.71it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 24.88it/s][A
Epoch 4:  80%|████████  | 4806/5971 [46:55<11:22,  1.71it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 25.23it/s][A
Epoch 4:  81%|████████  | 4810/5971 [46:55<11:19,  1.71it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 25.53it/s][A

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 26.34it/s][A
Epoch 4:  81%|████████  | 4814/5971 [46:55<11:16,  1.71it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 26.23it/s][A
Epoch 4:  81%|████████  | 4818/5971 [46:56<11:13,  1.71it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 26.90it/s][A
Epoch 4:  81%|████████  | 4822/5971 [46:56<11:10,  1.71it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 27.14it/s][A
Epoch 4:  81%|████████  | 4824/5971 [46:56<11:09,  1.71it/s, loss=0.166, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=2749.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Epoch 4:  81%|████████  | 4825/5971 [46:57<11:09,  1.71it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000213, train/loss_step=0.0625, global_step=2750.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████  | 4826/5971 [46:58<11:08,  1.71it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000213, train/loss_step=0.0625, global_step=2750.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████  | 4826/5971 [46:58<11:08,  1.71it/s, loss=0.156, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000472, train/loss_step=0.139, global_step=2750.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  81%|████████  | 4827/5971 [46:59<11:08,  1.71it/s, loss=0.165, v_num=0, train/loss_simple_step=0.226, train/loss_vlb_step=0.00098, train/loss_step=0.226, global_step=2750.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  81%|████████  | 4828/5971 [47:01<11:07,  1.71it/s, loss=0.165, v_num=0, train/loss_simple_step=0.198, train/loss_vlb_step=0.000733, train/loss_step=0.198, global_step=2750.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████  | 4829/5971 [47:02<11:07,  1.71it/s, loss=0.171, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000713, train/loss_step=0.209, global_step=2751.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████  | 4830/5971 [47:03<11:06,  1.71it/s, loss=0.171, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000713, train/loss_step=0.209, global_step=2751.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████  | 4830/5971 [47:03<11:06,  1.71it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00717, train/loss_vlb_step=3.37e-5, train/loss_step=0.00717, global_step=2751.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████  | 4831/5971 [47:04<11:06,  1.71it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0133, train/loss_vlb_step=5.7e-5, train/loss_step=0.0133, global_step=2751.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  81%|████████  | 4832/5971 [47:06<11:06,  1.71it/s, loss=0.149, v_num=0, train/loss_simple_step=0.277, train/loss_vlb_step=0.00107, train/loss_step=0.277, global_step=2751.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  81%|████████  | 4833/5971 [47:07<11:05,  1.71it/s, loss=0.156, v_num=0, train/loss_simple_step=0.195, train/loss_vlb_step=0.000672, train/loss_step=0.195, global_step=2752.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████  | 4834/5971 [47:07<11:05,  1.71it/s, loss=0.156, v_num=0, train/loss_simple_step=0.195, train/loss_vlb_step=0.000672, train/loss_step=0.195, global_step=2752.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████  | 4834/5971 [47:07<11:05,  1.71it/s, loss=0.162, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.00153, train/loss_step=0.322, global_step=2752.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  81%|████████  | 4835/5971 [47:08<11:04,  1.71it/s, loss=0.163, v_num=0, train/loss_simple_step=0.033, train/loss_vlb_step=0.000126, train/loss_step=0.033, global_step=2752.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████  | 4836/5971 [47:10<11:04,  1.71it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0107, train/loss_vlb_step=5.04e-5, train/loss_step=0.0107, global_step=2752.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████  | 4837/5971 [47:11<11:03,  1.71it/s, loss=0.182, v_num=0, train/loss_simple_step=0.362, train/loss_vlb_step=0.00214, train/loss_step=0.362, global_step=2753.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  81%|████████  | 4838/5971 [47:12<11:03,  1.71it/s, loss=0.182, v_num=0, train/loss_simple_step=0.362, train/loss_vlb_step=0.00214, train/loss_step=0.362, global_step=2753.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████  | 4838/5971 [47:12<11:03,  1.71it/s, loss=0.182, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.00019, train/loss_step=0.055, global_step=2753.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████  | 4839/5971 [47:13<11:02,  1.71it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00606, train/loss_vlb_step=2.92e-5, train/loss_step=0.00606, global_step=2753.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████  | 4840/5971 [47:16<11:02,  1.71it/s, loss=0.145, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000878, train/loss_step=0.222, global_step=2753.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  81%|████████  | 4841/5971 [47:17<11:02,  1.71it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00671, train/loss_vlb_step=3.01e-5, train/loss_step=0.00671, global_step=2754.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████  | 4842/5971 [47:17<11:01,  1.71it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00671, train/loss_vlb_step=3.01e-5, train/loss_step=0.00671, global_step=2754.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████  | 4842/5971 [47:17<11:01,  1.71it/s, loss=0.14, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.000631, train/loss_step=0.176, global_step=2754.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  81%|████████  | 4843/5971 [47:18<11:01,  1.71it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0299, train/loss_vlb_step=0.000109, train/loss_step=0.0299, global_step=2754.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████  | 4844/5971 [47:20<11:00,  1.71it/s, loss=0.135, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.000505, train/loss_step=0.146, global_step=2754.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  81%|████████  | 4845/5971 [47:21<11:00,  1.71it/s, loss=0.133, v_num=0, train/loss_simple_step=0.034, train/loss_vlb_step=0.000125, train/loss_step=0.034, global_step=2755.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████  | 4846/5971 [47:22<10:59,  1.71it/s, loss=0.133, v_num=0, train/loss_simple_step=0.034, train/loss_vlb_step=0.000125, train/loss_step=0.034, global_step=2755.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████  | 4846/5971 [47:22<10:59,  1.71it/s, loss=0.142, v_num=0, train/loss_simple_step=0.309, train/loss_vlb_step=0.00191, train/loss_step=0.309, global_step=2755.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  81%|████████  | 4847/5971 [47:23<10:59,  1.70it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0216, train/loss_vlb_step=8.2e-5, train/loss_step=0.0216, global_step=2755.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████  | 4848/5971 [47:25<10:59,  1.70it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00193, train/loss_vlb_step=1.14e-5, train/loss_step=0.00193, global_step=2755.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████  | 4849/5971 [47:26<10:58,  1.70it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0258, train/loss_vlb_step=9.99e-5, train/loss_step=0.0258, global_step=2756.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  81%|████████  | 4850/5971 [47:27<10:58,  1.70it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0258, train/loss_vlb_step=9.99e-5, train/loss_step=0.0258, global_step=2756.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████  | 4850/5971 [47:27<10:58,  1.70it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00761, train/loss_vlb_step=3.6e-5, train/loss_step=0.00761, global_step=2756.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████  | 4851/5971 [47:28<10:57,  1.70it/s, loss=0.151, v_num=0, train/loss_simple_step=0.781, train/loss_vlb_step=0.0572, train/loss_step=0.781, global_step=2756.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  81%|████████▏ | 4852/5971 [47:30<10:57,  1.70it/s, loss=0.151, v_num=0, train/loss_simple_step=0.279, train/loss_vlb_step=0.00117, train/loss_step=0.279, global_step=2756.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████▏ | 4853/5971 [47:31<10:56,  1.70it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00515, train/loss_vlb_step=2.58e-5, train/loss_step=0.00515, global_step=2757.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████▏ | 4854/5971 [47:32<10:56,  1.70it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00515, train/loss_vlb_step=2.58e-5, train/loss_step=0.00515, global_step=2757.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████▏ | 4854/5971 [47:32<10:56,  1.70it/s, loss=0.141, v_num=0, train/loss_simple_step=0.300, train/loss_vlb_step=0.00118, train/loss_step=0.300, global_step=2757.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  81%|████████▏ | 4855/5971 [47:33<10:55,  1.70it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0872, train/loss_vlb_step=0.000288, train/loss_step=0.0872, global_step=2757.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████▏ | 4856/5971 [47:35<10:55,  1.70it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00561, train/loss_vlb_step=2.83e-5, train/loss_step=0.00561, global_step=2757.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████▏ | 4857/5971 [47:36<10:54,  1.70it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0309, train/loss_vlb_step=0.000118, train/loss_step=0.0309, global_step=2758.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  81%|████████▏ | 4858/5971 [47:37<10:54,  1.70it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0309, train/loss_vlb_step=0.000118, train/loss_step=0.0309, global_step=2758.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████▏ | 4858/5971 [47:37<10:54,  1.70it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00148, train/loss_vlb_step=8.95e-6, train/loss_step=0.00148, global_step=2758.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████▏ | 4859/5971 [47:38<10:53,  1.70it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0555, train/loss_vlb_step=0.000195, train/loss_step=0.0555, global_step=2758.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  81%|████████▏ | 4860/5971 [47:40<10:53,  1.70it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00509, train/loss_vlb_step=2.62e-5, train/loss_step=0.00509, global_step=2758.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████▏ | 4861/5971 [47:41<10:53,  1.70it/s, loss=0.149, v_num=0, train/loss_simple_step=0.677, train/loss_vlb_step=0.0223, train/loss_step=0.677, global_step=2759.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]     
Epoch 4:  81%|████████▏ | 4862/5971 [47:42<10:52,  1.70it/s, loss=0.149, v_num=0, train/loss_simple_step=0.677, train/loss_vlb_step=0.0223, train/loss_step=0.677, global_step=2759.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████▏ | 4862/5971 [47:42<10:52,  1.70it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0231, train/loss_vlb_step=9.73e-5, train/loss_step=0.0231, global_step=2759.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████▏ | 4863/5971 [47:43<10:52,  1.70it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0394, train/loss_vlb_step=0.000147, train/loss_step=0.0394, global_step=2759.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████▏ | 4864/5971 [47:45<10:52,  1.70it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=5.87e-5, train/loss_step=0.0138, global_step=2759.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  81%|████████▏ | 4865/5971 [47:46<10:51,  1.70it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0662, train/loss_vlb_step=0.000226, train/loss_step=0.0662, global_step=2760.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████▏ | 4866/5971 [47:47<10:50,  1.70it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0662, train/loss_vlb_step=0.000226, train/loss_step=0.0662, global_step=2760.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  81%|████████▏ | 4866/5971 [47:47<10:50,  1.70it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0175, train/loss_vlb_step=7.52e-5, train/loss_step=0.0175, global_step=2760.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  82%|████████▏ | 4867/5971 [47:48<10:50,  1.70it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0131, train/loss_vlb_step=5.71e-5, train/loss_step=0.0131, global_step=2760.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4868/5971 [47:50<10:50,  1.70it/s, loss=0.138, v_num=0, train/loss_simple_step=0.325, train/loss_vlb_step=0.00191, train/loss_step=0.325, global_step=2760.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  82%|████████▏ | 4869/5971 [47:51<10:49,  1.70it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00998, train/loss_vlb_step=4.43e-5, train/loss_step=0.00998, global_step=2761.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4870/5971 [47:52<10:49,  1.70it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00998, train/loss_vlb_step=4.43e-5, train/loss_step=0.00998, global_step=2761.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4870/5971 [47:52<10:49,  1.70it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00193, train/loss_vlb_step=1.15e-5, train/loss_step=0.00193, global_step=2761.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4871/5971 [47:53<10:48,  1.70it/s, loss=0.117, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00195, train/loss_step=0.375, global_step=2761.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  82%|████████▏ | 4872/5971 [47:55<10:48,  1.69it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0314, train/loss_vlb_step=0.000123, train/loss_step=0.0314, global_step=2761.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4873/5971 [47:56<10:47,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.496, train/loss_vlb_step=0.00359, train/loss_step=0.496, global_step=2762.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  82%|████████▏ | 4874/5971 [47:57<10:47,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.496, train/loss_vlb_step=0.00359, train/loss_step=0.496, global_step=2762.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4874/5971 [47:57<10:47,  1.69it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0679, train/loss_vlb_step=0.000233, train/loss_step=0.0679, global_step=2762.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4875/5971 [47:57<10:46,  1.69it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00393, train/loss_vlb_step=2.12e-5, train/loss_step=0.00393, global_step=2762.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4876/5971 [48:00<10:46,  1.69it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0024, train/loss_vlb_step=1.39e-5, train/loss_step=0.0024, global_step=2762.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  82%|████████▏ | 4877/5971 [48:01<10:46,  1.69it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0454, train/loss_vlb_step=0.000152, train/loss_step=0.0454, global_step=2763.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4878/5971 [48:01<10:45,  1.69it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0454, train/loss_vlb_step=0.000152, train/loss_step=0.0454, global_step=2763.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4878/5971 [48:01<10:45,  1.69it/s, loss=0.139, v_num=0, train/loss_simple_step=0.505, train/loss_vlb_step=0.0047, train/loss_step=0.505, global_step=2763.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  82%|████████▏ | 4879/5971 [48:02<10:45,  1.69it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0184, train/loss_vlb_step=7.75e-5, train/loss_step=0.0184, global_step=2763.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4880/5971 [48:05<10:44,  1.69it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0215, train/loss_vlb_step=8.4e-5, train/loss_step=0.0215, global_step=2763.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  82%|████████▏ | 4881/5971 [48:06<10:44,  1.69it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00719, train/loss_vlb_step=3.42e-5, train/loss_step=0.00719, global_step=2764.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4882/5971 [48:07<10:43,  1.69it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00719, train/loss_vlb_step=3.42e-5, train/loss_step=0.00719, global_step=2764.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4882/5971 [48:07<10:43,  1.69it/s, loss=0.122, v_num=0, train/loss_simple_step=0.373, train/loss_vlb_step=0.00207, train/loss_step=0.373, global_step=2764.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  82%|████████▏ | 4883/5971 [48:08<10:43,  1.69it/s, loss=0.143, v_num=0, train/loss_simple_step=0.470, train/loss_vlb_step=0.00417, train/loss_step=0.470, global_step=2764.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4884/5971 [48:10<10:43,  1.69it/s, loss=0.155, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.000885, train/loss_step=0.243, global_step=2764.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4885/5971 [48:11<10:42,  1.69it/s, loss=0.155, v_num=0, train/loss_simple_step=0.078, train/loss_vlb_step=0.00026, train/loss_step=0.078, global_step=2765.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  82%|████████▏ | 4886/5971 [48:12<10:42,  1.69it/s, loss=0.155, v_num=0, train/loss_simple_step=0.078, train/loss_vlb_step=0.00026, train/loss_step=0.078, global_step=2765.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4886/5971 [48:12<10:42,  1.69it/s, loss=0.166, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000981, train/loss_step=0.242, global_step=2765.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4887/5971 [48:13<10:41,  1.69it/s, loss=0.176, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000765, train/loss_step=0.203, global_step=2765.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4888/5971 [48:16<10:41,  1.69it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00258, train/loss_vlb_step=1.46e-5, train/loss_step=0.00258, global_step=2765.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4889/5971 [48:17<10:41,  1.69it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00388, train/loss_vlb_step=2.16e-5, train/loss_step=0.00388, global_step=2766.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4890/5971 [48:18<10:40,  1.69it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00388, train/loss_vlb_step=2.16e-5, train/loss_step=0.00388, global_step=2766.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4890/5971 [48:18<10:40,  1.69it/s, loss=0.172, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000992, train/loss_step=0.255, global_step=2766.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  82%|████████▏ | 4891/5971 [48:18<10:39,  1.69it/s, loss=0.167, v_num=0, train/loss_simple_step=0.261, train/loss_vlb_step=0.00116, train/loss_step=0.261, global_step=2766.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  82%|████████▏ | 4892/5971 [48:21<10:39,  1.69it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0787, train/loss_vlb_step=0.00026, train/loss_step=0.0787, global_step=2766.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4893/5971 [48:22<10:39,  1.69it/s, loss=0.151, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000477, train/loss_step=0.145, global_step=2767.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  82%|████████▏ | 4894/5971 [48:23<10:38,  1.69it/s, loss=0.151, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000477, train/loss_step=0.145, global_step=2767.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4894/5971 [48:23<10:38,  1.69it/s, loss=0.151, v_num=0, train/loss_simple_step=0.068, train/loss_vlb_step=0.000231, train/loss_step=0.068, global_step=2767.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4895/5971 [48:24<10:38,  1.69it/s, loss=0.179, v_num=0, train/loss_simple_step=0.547, train/loss_vlb_step=0.00569, train/loss_step=0.547, global_step=2767.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  82%|████████▏ | 4896/5971 [48:26<10:37,  1.68it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0404, train/loss_vlb_step=0.000156, train/loss_step=0.0404, global_step=2767.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4897/5971 [48:27<10:37,  1.68it/s, loss=0.201, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.00487, train/loss_step=0.454, global_step=2768.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  82%|████████▏ | 4898/5971 [48:28<10:36,  1.68it/s, loss=0.201, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.00487, train/loss_step=0.454, global_step=2768.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4898/5971 [48:28<10:36,  1.68it/s, loss=0.211, v_num=0, train/loss_simple_step=0.706, train/loss_vlb_step=0.018, train/loss_step=0.706, global_step=2768.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  82%|████████▏ | 4899/5971 [48:28<10:36,  1.68it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0614, train/loss_vlb_step=0.000208, train/loss_step=0.0614, global_step=2768.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4900/5971 [48:31<10:36,  1.68it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0972, train/loss_vlb_step=0.000321, train/loss_step=0.0972, global_step=2768.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4901/5971 [48:32<10:35,  1.68it/s, loss=0.22, v_num=0, train/loss_simple_step=0.0773, train/loss_vlb_step=0.000265, train/loss_step=0.0773, global_step=2769.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  82%|████████▏ | 4902/5971 [48:33<10:35,  1.68it/s, loss=0.22, v_num=0, train/loss_simple_step=0.0773, train/loss_vlb_step=0.000265, train/loss_step=0.0773, global_step=2769.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4902/5971 [48:33<10:35,  1.68it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0477, train/loss_vlb_step=0.000161, train/loss_step=0.0477, global_step=2769.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4903/5971 [48:34<10:34,  1.68it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0088, train/loss_vlb_step=4e-5, train/loss_step=0.0088, global_step=2769.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  82%|████████▏ | 4904/5971 [48:36<10:34,  1.68it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00364, train/loss_vlb_step=1.92e-5, train/loss_step=0.00364, global_step=2769.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4905/5971 [48:37<10:33,  1.68it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00219, train/loss_vlb_step=1.24e-5, train/loss_step=0.00219, global_step=2770.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4906/5971 [48:38<10:33,  1.68it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00219, train/loss_vlb_step=1.24e-5, train/loss_step=0.00219, global_step=2770.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4906/5971 [48:38<10:33,  1.68it/s, loss=0.158, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=2770.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  82%|████████▏ | 4907/5971 [48:38<10:32,  1.68it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0229, train/loss_vlb_step=8.85e-5, train/loss_step=0.0229, global_step=2770.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4908/5971 [48:41<10:32,  1.68it/s, loss=0.153, v_num=0, train/loss_simple_step=0.070, train/loss_vlb_step=0.000234, train/loss_step=0.070, global_step=2770.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  82%|████████▏ | 4909/5971 [48:42<10:32,  1.68it/s, loss=0.173, v_num=0, train/loss_simple_step=0.410, train/loss_vlb_step=0.00344, train/loss_step=0.410, global_step=2771.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  82%|████████▏ | 4910/5971 [48:42<10:31,  1.68it/s, loss=0.173, v_num=0, train/loss_simple_step=0.410, train/loss_vlb_step=0.00344, train/loss_step=0.410, global_step=2771.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4910/5971 [48:42<10:31,  1.68it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0207, train/loss_vlb_step=8.52e-5, train/loss_step=0.0207, global_step=2771.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4911/5971 [48:43<10:30,  1.68it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00769, train/loss_vlb_step=3.8e-5, train/loss_step=0.00769, global_step=2771.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4912/5971 [48:46<10:30,  1.68it/s, loss=0.163, v_num=0, train/loss_simple_step=0.358, train/loss_vlb_step=0.00182, train/loss_step=0.358, global_step=2771.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  82%|████████▏ | 4913/5971 [48:46<10:30,  1.68it/s, loss=0.177, v_num=0, train/loss_simple_step=0.437, train/loss_vlb_step=0.00297, train/loss_step=0.437, global_step=2772.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4914/5971 [48:47<10:29,  1.68it/s, loss=0.177, v_num=0, train/loss_simple_step=0.437, train/loss_vlb_step=0.00297, train/loss_step=0.437, global_step=2772.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4914/5971 [48:47<10:29,  1.68it/s, loss=0.175, v_num=0, train/loss_simple_step=0.013, train/loss_vlb_step=5.56e-5, train/loss_step=0.013, global_step=2772.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4915/5971 [48:48<10:29,  1.68it/s, loss=0.172, v_num=0, train/loss_simple_step=0.495, train/loss_vlb_step=0.00547, train/loss_step=0.495, global_step=2772.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4916/5971 [48:51<10:28,  1.68it/s, loss=0.175, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000366, train/loss_step=0.110, global_step=2772.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4917/5971 [48:51<10:28,  1.68it/s, loss=0.178, v_num=0, train/loss_simple_step=0.496, train/loss_vlb_step=0.00598, train/loss_step=0.496, global_step=2773.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  82%|████████▏ | 4918/5971 [48:52<10:27,  1.68it/s, loss=0.178, v_num=0, train/loss_simple_step=0.496, train/loss_vlb_step=0.00598, train/loss_step=0.496, global_step=2773.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4918/5971 [48:52<10:27,  1.68it/s, loss=0.15, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000582, train/loss_step=0.156, global_step=2773.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4919/5971 [48:53<10:27,  1.68it/s, loss=0.152, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000345, train/loss_step=0.105, global_step=2773.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4920/5971 [48:56<10:27,  1.68it/s, loss=0.151, v_num=0, train/loss_simple_step=0.077, train/loss_vlb_step=0.000268, train/loss_step=0.077, global_step=2773.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4921/5971 [48:56<10:26,  1.68it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000219, train/loss_step=0.0626, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4922/5971 [48:57<10:25,  1.68it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000219, train/loss_step=0.0626, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  82%|████████▏ | 4922/5971 [48:57<10:25,  1.68it/s, loss=0.154, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000402, train/loss_step=0.122, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  82%|████████▏ | 4923/5971 [48:58<10:25,  1.68it/s, loss=0.168, v_num=0, train/loss_simple_step=0.290, train/loss_vlb_step=0.00111, train/loss_step=0.290, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  82%|████████▏ | 4924/5971 [49:00<10:25,  1.67it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:21,  2.03it/s][A
Epoch 4:  82%|████████▏ | 4926/5971 [49:01<10:23,  1.68it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   1%|          | 2/167 [00:00<00:48,  3.43it/s][A
Epoch 4:  83%|████████▎ | 4930/5971 [49:01<10:21,  1.68it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   4%|▎         | 6/167 [00:00<00:14, 10.82it/s][A

Validating:   5%|▌         | 9/167 [00:00<00:10, 14.82it/s][A
Epoch 4:  83%|████████▎ | 4934/5971 [49:01<10:18,  1.68it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   7%|▋         | 12/167 [00:01<00:08, 17.91it/s][A
Epoch 4:  83%|████████▎ | 4938/5971 [49:01<10:15,  1.68it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   9%|▉         | 15/167 [00:01<00:08, 18.28it/s][A
Epoch 4:  83%|████████▎ | 4942/5971 [49:02<10:12,  1.68it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  11%|█         | 18/167 [00:01<00:07, 20.38it/s][A

Validating:  13%|█▎        | 21/167 [00:01<00:06, 20.89it/s][A
Epoch 4:  83%|████████▎ | 4946/5971 [49:02<10:09,  1.68it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  14%|█▍        | 24/167 [00:01<00:06, 22.64it/s][A
Epoch 4:  83%|████████▎ | 4950/5971 [49:02<10:06,  1.68it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 23.71it/s][A
Epoch 4:  83%|████████▎ | 4954/5971 [49:02<10:03,  1.68it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 24.52it/s][A

Validating:  20%|█▉        | 33/167 [00:01<00:05, 24.84it/s][A
Epoch 4:  83%|████████▎ | 4958/5971 [49:02<10:01,  1.69it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  22%|██▏       | 36/167 [00:01<00:05, 25.43it/s][A
Epoch 4:  83%|████████▎ | 4962/5971 [49:02<09:58,  1.69it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  23%|██▎       | 39/167 [00:02<00:04, 25.84it/s][A
Epoch 4:  83%|████████▎ | 4966/5971 [49:03<09:55,  1.69it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  26%|██▌       | 43/167 [00:02<00:04, 27.32it/s][A
Epoch 4:  83%|████████▎ | 4970/5971 [49:03<09:52,  1.69it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 28.34it/s][A
Epoch 4:  83%|████████▎ | 4974/5971 [49:03<09:49,  1.69it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 28.31it/s][A

Validating:  32%|███▏      | 53/167 [00:02<00:04, 27.67it/s][A
Epoch 4:  83%|████████▎ | 4978/5971 [49:03<09:47,  1.69it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 27.08it/s][A
Epoch 4:  83%|████████▎ | 4982/5971 [49:03<09:44,  1.69it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  35%|███▌      | 59/167 [00:02<00:04, 26.91it/s][A
Epoch 4:  84%|████████▎ | 4986/5971 [49:03<09:41,  1.69it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  37%|███▋      | 62/167 [00:02<00:04, 25.95it/s][A

Validating:  39%|███▉      | 65/167 [00:03<00:03, 25.86it/s][A
Epoch 4:  84%|████████▎ | 4990/5971 [49:03<09:38,  1.70it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  41%|████      | 68/167 [00:03<00:03, 26.28it/s][A
Epoch 4:  84%|████████▎ | 4994/5971 [49:04<09:35,  1.70it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 26.20it/s][A
Epoch 4:  84%|████████▎ | 4998/5971 [49:04<09:33,  1.70it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 26.61it/s][A

Validating:  46%|████▌     | 77/167 [00:03<00:03, 27.32it/s][A
Epoch 4:  84%|████████▍ | 5002/5971 [49:04<09:30,  1.70it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 26.31it/s][A
Epoch 4:  84%|████████▍ | 5006/5971 [49:04<09:27,  1.70it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 27.15it/s][A
Epoch 4:  84%|████████▍ | 5010/5971 [49:04<09:24,  1.70it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  51%|█████▏    | 86/167 [00:03<00:03, 25.46it/s][A

Validating:  53%|█████▎    | 89/167 [00:03<00:03, 25.38it/s][A
Epoch 4:  84%|████████▍ | 5014/5971 [49:04<09:21,  1.70it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 25.68it/s][A
Epoch 4:  84%|████████▍ | 5018/5971 [49:04<09:19,  1.70it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 25.89it/s][A
Epoch 4:  84%|████████▍ | 5022/5971 [49:05<09:16,  1.71it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 26.57it/s][A

Validating:  60%|██████    | 101/167 [00:04<00:02, 24.10it/s][A
Epoch 4:  84%|████████▍ | 5026/5971 [49:05<09:13,  1.71it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 25.80it/s][A
Epoch 4:  84%|████████▍ | 5030/5971 [49:05<09:10,  1.71it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 26.15it/s][A
Epoch 4:  84%|████████▍ | 5034/5971 [49:05<09:08,  1.71it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 26.25it/s][A
Epoch 4:  84%|████████▍ | 5038/5971 [49:05<09:05,  1.71it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  68%|██████▊   | 114/167 [00:04<00:02, 25.48it/s][A

Validating:  70%|███████   | 117/167 [00:05<00:01, 26.44it/s][A
Epoch 4:  84%|████████▍ | 5042/5971 [49:05<09:02,  1.71it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 27.49it/s][A
Epoch 4:  85%|████████▍ | 5046/5971 [49:06<08:59,  1.71it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 26.84it/s][A
Epoch 4:  85%|████████▍ | 5050/5971 [49:06<08:57,  1.71it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 26.70it/s][A
Epoch 4:  85%|████████▍ | 5054/5971 [49:06<08:54,  1.72it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 26.10it/s][A

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 25.87it/s][A
Epoch 4:  85%|████████▍ | 5058/5971 [49:06<08:51,  1.72it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 25.95it/s][A
Epoch 4:  85%|████████▍ | 5062/5971 [49:06<08:49,  1.72it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  84%|████████▍ | 140/167 [00:05<00:00, 27.76it/s][A
Epoch 4:  85%|████████▍ | 5066/5971 [49:06<08:46,  1.72it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  86%|████████▌ | 143/167 [00:05<00:00, 27.44it/s][A
Epoch 4:  85%|████████▍ | 5070/5971 [49:06<08:43,  1.72it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 26.72it/s][A

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 26.10it/s][A
Epoch 4:  85%|████████▍ | 5074/5971 [49:07<08:40,  1.72it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 24.27it/s][A
Epoch 4:  85%|████████▌ | 5078/5971 [49:07<08:38,  1.72it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 23.54it/s][A
Epoch 4:  85%|████████▌ | 5082/5971 [49:07<08:35,  1.72it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 24.14it/s][A

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 25.11it/s][A
Epoch 4:  85%|████████▌ | 5086/5971 [49:07<08:32,  1.73it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 26.13it/s][A
Epoch 4:  85%|████████▌ | 5090/5971 [49:07<08:30,  1.73it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating: 100%|██████████| 167/167 [00:06<00:00, 26.00it/s][A
Epoch 4:  85%|████████▌ | 5092/5971 [49:08<08:28,  1.73it/s, loss=0.176, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000567, train/loss_step=0.164, global_step=2774.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Epoch 4:  85%|████████▌ | 5093/5971 [49:09<08:28,  1.73it/s, loss=0.185, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000588, train/loss_step=0.174, global_step=2775.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  85%|████████▌ | 5094/5971 [49:10<08:27,  1.73it/s, loss=0.185, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000588, train/loss_step=0.174, global_step=2775.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  85%|████████▌ | 5094/5971 [49:10<08:27,  1.73it/s, loss=0.185, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000372, train/loss_step=0.113, global_step=2775.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  85%|████████▌ | 5095/5971 [49:10<08:27,  1.73it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0163, train/loss_vlb_step=7.13e-5, train/loss_step=0.0163, global_step=2775.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  85%|████████▌ | 5096/5971 [49:13<08:27,  1.73it/s, loss=0.201, v_num=0, train/loss_simple_step=0.395, train/loss_vlb_step=0.00244, train/loss_step=0.395, global_step=2775.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  85%|████████▌ | 5097/5971 [49:14<08:26,  1.73it/s, loss=0.194, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.00104, train/loss_step=0.266, global_step=2776.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  85%|████████▌ | 5098/5971 [49:15<08:25,  1.73it/s, loss=0.194, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.00104, train/loss_step=0.266, global_step=2776.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  85%|████████▌ | 5098/5971 [49:15<08:25,  1.73it/s, loss=0.193, v_num=0, train/loss_simple_step=0.00579, train/loss_vlb_step=2.83e-5, train/loss_step=0.00579, global_step=2776.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  85%|████████▌ | 5099/5971 [49:16<08:25,  1.73it/s, loss=0.193, v_num=0, train/loss_simple_step=0.00551, train/loss_vlb_step=2.79e-5, train/loss_step=0.00551, global_step=2776.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  85%|████████▌ | 5100/5971 [49:18<08:25,  1.72it/s, loss=0.185, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000691, train/loss_step=0.193, global_step=2776.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  85%|████████▌ | 5101/5971 [49:19<08:24,  1.72it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0424, train/loss_vlb_step=0.000153, train/loss_step=0.0424, global_step=2777.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  85%|████████▌ | 5102/5971 [49:20<08:24,  1.72it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0424, train/loss_vlb_step=0.000153, train/loss_step=0.0424, global_step=2777.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  85%|████████▌ | 5102/5971 [49:20<08:24,  1.72it/s, loss=0.181, v_num=0, train/loss_simple_step=0.335, train/loss_vlb_step=0.00149, train/loss_step=0.335, global_step=2777.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  85%|████████▌ | 5103/5971 [49:21<08:23,  1.72it/s, loss=0.162, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000397, train/loss_step=0.118, global_step=2777.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  85%|████████▌ | 5104/5971 [49:23<08:23,  1.72it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0757, train/loss_vlb_step=0.000253, train/loss_step=0.0757, global_step=2777.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  85%|████████▌ | 5105/5971 [49:24<08:22,  1.72it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00623, train/loss_vlb_step=3.03e-5, train/loss_step=0.00623, global_step=2778.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5106/5971 [49:24<08:22,  1.72it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00623, train/loss_vlb_step=3.03e-5, train/loss_step=0.00623, global_step=2778.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5106/5971 [49:24<08:22,  1.72it/s, loss=0.137, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000623, train/loss_step=0.174, global_step=2778.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  86%|████████▌ | 5107/5971 [49:25<08:21,  1.72it/s, loss=0.139, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000505, train/loss_step=0.153, global_step=2778.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5108/5971 [49:28<08:21,  1.72it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00165, train/loss_vlb_step=9.88e-6, train/loss_step=0.00165, global_step=2778.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5109/5971 [49:28<08:20,  1.72it/s, loss=0.14, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000566, train/loss_step=0.156, global_step=2779.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  86%|████████▌ | 5110/5971 [49:29<08:20,  1.72it/s, loss=0.14, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000566, train/loss_step=0.156, global_step=2779.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5110/5971 [49:29<08:20,  1.72it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0452, train/loss_vlb_step=0.000164, train/loss_step=0.0452, global_step=2779.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5111/5971 [49:30<08:19,  1.72it/s, loss=0.127, v_num=0, train/loss_simple_step=0.098, train/loss_vlb_step=0.000325, train/loss_step=0.098, global_step=2779.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  86%|████████▌ | 5112/5971 [49:32<08:19,  1.72it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0341, train/loss_vlb_step=0.000132, train/loss_step=0.0341, global_step=2779.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5113/5971 [49:33<08:18,  1.72it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0789, train/loss_vlb_step=0.000261, train/loss_step=0.0789, global_step=2780.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5114/5971 [49:34<08:18,  1.72it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0789, train/loss_vlb_step=0.000261, train/loss_step=0.0789, global_step=2780.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5114/5971 [49:34<08:18,  1.72it/s, loss=0.138, v_num=0, train/loss_simple_step=0.553, train/loss_vlb_step=0.00454, train/loss_step=0.553, global_step=2780.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  86%|████████▌ | 5115/5971 [49:35<08:17,  1.72it/s, loss=0.166, v_num=0, train/loss_simple_step=0.576, train/loss_vlb_step=0.00551, train/loss_step=0.576, global_step=2780.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5116/5971 [49:37<08:17,  1.72it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00429, train/loss_vlb_step=2.29e-5, train/loss_step=0.00429, global_step=2780.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5117/5971 [49:38<08:17,  1.72it/s, loss=0.148, v_num=0, train/loss_simple_step=0.302, train/loss_vlb_step=0.00128, train/loss_step=0.302, global_step=2781.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  86%|████████▌ | 5118/5971 [49:39<08:16,  1.72it/s, loss=0.148, v_num=0, train/loss_simple_step=0.302, train/loss_vlb_step=0.00128, train/loss_step=0.302, global_step=2781.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5118/5971 [49:39<08:16,  1.72it/s, loss=0.174, v_num=0, train/loss_simple_step=0.533, train/loss_vlb_step=0.00538, train/loss_step=0.533, global_step=2781.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5119/5971 [49:40<08:15,  1.72it/s, loss=0.18, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000363, train/loss_step=0.110, global_step=2781.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5120/5971 [49:42<08:15,  1.72it/s, loss=0.183, v_num=0, train/loss_simple_step=0.256, train/loss_vlb_step=0.000944, train/loss_step=0.256, global_step=2781.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5121/5971 [49:43<08:15,  1.72it/s, loss=0.191, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000671, train/loss_step=0.201, global_step=2782.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5122/5971 [49:44<08:14,  1.72it/s, loss=0.191, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000671, train/loss_step=0.201, global_step=2782.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5122/5971 [49:44<08:14,  1.72it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=0.000107, train/loss_step=0.0272, global_step=2782.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5123/5971 [49:45<08:14,  1.72it/s, loss=0.177, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000534, train/loss_step=0.155, global_step=2782.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  86%|████████▌ | 5124/5971 [49:47<08:13,  1.72it/s, loss=0.187, v_num=0, train/loss_simple_step=0.272, train/loss_vlb_step=0.0012, train/loss_step=0.272, global_step=2782.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  86%|████████▌ | 5125/5971 [49:48<08:13,  1.72it/s, loss=0.194, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000468, train/loss_step=0.140, global_step=2783.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5126/5971 [49:49<08:12,  1.72it/s, loss=0.194, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000468, train/loss_step=0.140, global_step=2783.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5126/5971 [49:49<08:12,  1.72it/s, loss=0.192, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000481, train/loss_step=0.145, global_step=2783.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5127/5971 [49:50<08:12,  1.71it/s, loss=0.195, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000747, train/loss_step=0.204, global_step=2783.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5128/5971 [49:52<08:11,  1.71it/s, loss=0.221, v_num=0, train/loss_simple_step=0.532, train/loss_vlb_step=0.00472, train/loss_step=0.532, global_step=2783.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  86%|████████▌ | 5129/5971 [49:53<08:11,  1.71it/s, loss=0.221, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.00056, train/loss_step=0.161, global_step=2784.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5130/5971 [49:54<08:10,  1.71it/s, loss=0.221, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.00056, train/loss_step=0.161, global_step=2784.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5130/5971 [49:54<08:10,  1.71it/s, loss=0.226, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.00052, train/loss_step=0.142, global_step=2784.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5131/5971 [49:55<08:10,  1.71it/s, loss=0.236, v_num=0, train/loss_simple_step=0.297, train/loss_vlb_step=0.00133, train/loss_step=0.297, global_step=2784.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5132/5971 [49:57<08:09,  1.71it/s, loss=0.255, v_num=0, train/loss_simple_step=0.408, train/loss_vlb_step=0.00202, train/loss_step=0.408, global_step=2784.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5133/5971 [49:58<08:09,  1.71it/s, loss=0.273, v_num=0, train/loss_simple_step=0.444, train/loss_vlb_step=0.00269, train/loss_step=0.444, global_step=2785.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5134/5971 [49:59<08:08,  1.71it/s, loss=0.273, v_num=0, train/loss_simple_step=0.444, train/loss_vlb_step=0.00269, train/loss_step=0.444, global_step=2785.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5134/5971 [49:59<08:08,  1.71it/s, loss=0.246, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.5e-5, train/loss_step=0.00731, global_step=2785.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5135/5971 [50:00<08:08,  1.71it/s, loss=0.218, v_num=0, train/loss_simple_step=0.0203, train/loss_vlb_step=8.34e-5, train/loss_step=0.0203, global_step=2785.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  86%|████████▌ | 5136/5971 [50:02<08:07,  1.71it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=5.94e-5, train/loss_step=0.0141, global_step=2785.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5137/5971 [50:03<08:07,  1.71it/s, loss=0.208, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.00033, train/loss_step=0.100, global_step=2786.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  86%|████████▌ | 5138/5971 [50:03<08:06,  1.71it/s, loss=0.208, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.00033, train/loss_step=0.100, global_step=2786.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5138/5971 [50:03<08:06,  1.71it/s, loss=0.183, v_num=0, train/loss_simple_step=0.019, train/loss_vlb_step=7.77e-5, train/loss_step=0.019, global_step=2786.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5139/5971 [50:04<08:06,  1.71it/s, loss=0.179, v_num=0, train/loss_simple_step=0.037, train/loss_vlb_step=0.000143, train/loss_step=0.037, global_step=2786.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5140/5971 [50:06<08:06,  1.71it/s, loss=0.205, v_num=0, train/loss_simple_step=0.775, train/loss_vlb_step=0.0228, train/loss_step=0.775, global_step=2786.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  86%|████████▌ | 5141/5971 [50:07<08:05,  1.71it/s, loss=0.195, v_num=0, train/loss_simple_step=0.00839, train/loss_vlb_step=4.1e-5, train/loss_step=0.00839, global_step=2787.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5142/5971 [50:08<08:04,  1.71it/s, loss=0.195, v_num=0, train/loss_simple_step=0.00839, train/loss_vlb_step=4.1e-5, train/loss_step=0.00839, global_step=2787.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5142/5971 [50:08<08:04,  1.71it/s, loss=0.194, v_num=0, train/loss_simple_step=0.00606, train/loss_vlb_step=3.1e-5, train/loss_step=0.00606, global_step=2787.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5143/5971 [50:09<08:04,  1.71it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0589, train/loss_vlb_step=0.000206, train/loss_step=0.0589, global_step=2787.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  86%|████████▌ | 5144/5971 [50:12<08:04,  1.71it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.43e-5, train/loss_step=0.0119, global_step=2787.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5145/5971 [50:13<08:03,  1.71it/s, loss=0.213, v_num=0, train/loss_simple_step=0.871, train/loss_vlb_step=0.056, train/loss_step=0.871, global_step=2788.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  86%|████████▌ | 5146/5971 [50:13<08:03,  1.71it/s, loss=0.213, v_num=0, train/loss_simple_step=0.871, train/loss_vlb_step=0.056, train/loss_step=0.871, global_step=2788.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5146/5971 [50:13<08:03,  1.71it/s, loss=0.206, v_num=0, train/loss_simple_step=0.00979, train/loss_vlb_step=4.29e-5, train/loss_step=0.00979, global_step=2788.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5147/5971 [50:14<08:02,  1.71it/s, loss=0.223, v_num=0, train/loss_simple_step=0.547, train/loss_vlb_step=0.00386, train/loss_step=0.547, global_step=2788.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  86%|████████▌ | 5148/5971 [50:17<08:02,  1.71it/s, loss=0.228, v_num=0, train/loss_simple_step=0.618, train/loss_vlb_step=0.00837, train/loss_step=0.618, global_step=2788.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▌ | 5149/5971 [50:17<08:01,  1.71it/s, loss=0.226, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000389, train/loss_step=0.118, global_step=2789.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▋ | 5150/5971 [50:18<08:01,  1.71it/s, loss=0.226, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000389, train/loss_step=0.118, global_step=2789.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▋ | 5150/5971 [50:18<08:01,  1.71it/s, loss=0.219, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.6e-6, train/loss_step=0.00164, global_step=2789.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▋ | 5151/5971 [50:19<08:00,  1.71it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0215, train/loss_vlb_step=8.51e-5, train/loss_step=0.0215, global_step=2789.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  86%|████████▋ | 5152/5971 [50:22<08:00,  1.71it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0107, train/loss_vlb_step=4.59e-5, train/loss_step=0.0107, global_step=2789.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▋ | 5153/5971 [50:23<07:59,  1.70it/s, loss=0.176, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.000984, train/loss_step=0.266, global_step=2790.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  86%|████████▋ | 5154/5971 [50:23<07:59,  1.70it/s, loss=0.176, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.000984, train/loss_step=0.266, global_step=2790.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▋ | 5154/5971 [50:23<07:59,  1.70it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0345, train/loss_vlb_step=0.000129, train/loss_step=0.0345, global_step=2790.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▋ | 5155/5971 [50:24<07:58,  1.70it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0671, train/loss_vlb_step=0.000226, train/loss_step=0.0671, global_step=2790.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  86%|████████▋ | 5156/5971 [50:26<07:58,  1.70it/s, loss=0.182, v_num=0, train/loss_simple_step=0.067, train/loss_vlb_step=0.000221, train/loss_step=0.067, global_step=2790.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  86%|████████▋ | 5157/5971 [50:27<07:57,  1.70it/s, loss=0.183, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000391, train/loss_step=0.118, global_step=2791.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▋ | 5158/5971 [50:28<07:57,  1.70it/s, loss=0.183, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000391, train/loss_step=0.118, global_step=2791.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▋ | 5158/5971 [50:28<07:57,  1.70it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00359, train/loss_vlb_step=1.88e-5, train/loss_step=0.00359, global_step=2791.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▋ | 5159/5971 [50:29<07:56,  1.70it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0774, train/loss_vlb_step=0.000255, train/loss_step=0.0774, global_step=2791.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  86%|████████▋ | 5160/5971 [50:31<07:56,  1.70it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000322, train/loss_step=0.0971, global_step=2791.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▋ | 5161/5971 [50:32<07:55,  1.70it/s, loss=0.171, v_num=0, train/loss_simple_step=0.405, train/loss_vlb_step=0.00207, train/loss_step=0.405, global_step=2792.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  86%|████████▋ | 5162/5971 [50:33<07:55,  1.70it/s, loss=0.171, v_num=0, train/loss_simple_step=0.405, train/loss_vlb_step=0.00207, train/loss_step=0.405, global_step=2792.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▋ | 5162/5971 [50:33<07:55,  1.70it/s, loss=0.203, v_num=0, train/loss_simple_step=0.661, train/loss_vlb_step=0.0155, train/loss_step=0.661, global_step=2792.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  86%|████████▋ | 5163/5971 [50:34<07:54,  1.70it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0411, train/loss_vlb_step=0.000148, train/loss_step=0.0411, global_step=2792.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  86%|████████▋ | 5164/5971 [50:36<07:54,  1.70it/s, loss=0.202, v_num=0, train/loss_simple_step=0.00154, train/loss_vlb_step=9.19e-6, train/loss_step=0.00154, global_step=2792.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  87%|████████▋ | 5165/5971 [50:37<07:53,  1.70it/s, loss=0.171, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00115, train/loss_step=0.253, global_step=2793.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  87%|████████▋ | 5166/5971 [50:38<07:53,  1.70it/s, loss=0.171, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00115, train/loss_step=0.253, global_step=2793.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  87%|████████▋ | 5166/5971 [50:38<07:53,  1.70it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0482, train/loss_vlb_step=0.000181, train/loss_step=0.0482, global_step=2793.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  87%|████████▋ | 5167/5971 [50:39<07:52,  1.70it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0039, train/loss_vlb_step=2.03e-5, train/loss_step=0.0039, global_step=2793.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  87%|████████▋ | 5168/5971 [50:41<07:52,  1.70it/s, loss=0.136, v_num=0, train/loss_simple_step=0.417, train/loss_vlb_step=0.00213, train/loss_step=0.417, global_step=2793.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  87%|████████▋ | 5169/5971 [50:42<07:51,  1.70it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0264, train/loss_vlb_step=0.000107, train/loss_step=0.0264, global_step=2794.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  87%|████████▋ | 5170/5971 [50:43<07:51,  1.70it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0264, train/loss_vlb_step=0.000107, train/loss_step=0.0264, global_step=2794.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  87%|████████▋ | 5170/5971 [50:43<07:51,  1.70it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0622, train/loss_vlb_step=0.000212, train/loss_step=0.0622, global_step=2794.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  87%|████████▋ | 5171/5971 [50:44<07:50,  1.70it/s, loss=0.152, v_num=0, train/loss_simple_step=0.372, train/loss_vlb_step=0.00201, train/loss_step=0.372, global_step=2794.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  87%|████████▋ | 5172/5971 [50:47<07:50,  1.70it/s, loss=0.157, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000374, train/loss_step=0.114, global_step=2794.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  87%|████████▋ | 5173/5971 [50:48<07:50,  1.70it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0613, train/loss_vlb_step=0.00021, train/loss_step=0.0613, global_step=2795.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  87%|████████▋ | 5174/5971 [50:49<07:49,  1.70it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0613, train/loss_vlb_step=0.00021, train/loss_step=0.0613, global_step=2795.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  87%|████████▋ | 5174/5971 [50:49<07:49,  1.70it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0297, train/loss_vlb_step=0.000111, train/loss_step=0.0297, global_step=2795.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  87%|████████▋ | 5175/5971 [50:50<07:49,  1.70it/s, loss=0.149, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2795.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  87%|████████▋ | 5176/5971 [50:52<07:48,  1.70it/s, loss=0.155, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.00107, train/loss_step=0.176, global_step=2795.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  87%|████████▋ | 5177/5971 [50:53<07:48,  1.70it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.01e-5, train/loss_step=0.0119, global_step=2796.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  87%|████████▋ | 5178/5971 [50:54<07:47,  1.70it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.01e-5, train/loss_step=0.0119, global_step=2796.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  87%|████████▋ | 5178/5971 [50:54<07:47,  1.70it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0647, train/loss_vlb_step=0.000221, train/loss_step=0.0647, global_step=2796.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  87%|████████▋ | 5179/5971 [50:55<07:47,  1.70it/s, loss=0.157, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000589, train/loss_step=0.168, global_step=2796.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  87%|████████▋ | 5180/5971 [50:57<07:46,  1.69it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0555, train/loss_vlb_step=0.000194, train/loss_step=0.0555, global_step=2796.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  87%|████████▋ | 5181/5971 [50:58<07:46,  1.69it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=2.02e-5, train/loss_step=0.0037, global_step=2797.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  87%|████████▋ | 5182/5971 [50:59<07:45,  1.69it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=2.02e-5, train/loss_step=0.0037, global_step=2797.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  87%|████████▋ | 5182/5971 [50:59<07:45,  1.69it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.00011, train/loss_step=0.0285, global_step=2797.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  87%|████████▋ | 5183/5971 [51:00<07:45,  1.69it/s, loss=0.13, v_num=0, train/loss_simple_step=0.582, train/loss_vlb_step=0.00702, train/loss_step=0.582, global_step=2797.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  87%|████████▋ | 5184/5971 [51:02<07:44,  1.69it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0187, train/loss_vlb_step=7.54e-5, train/loss_step=0.0187, global_step=2797.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  87%|████████▋ | 5185/5971 [51:03<07:44,  1.69it/s, loss=0.123, v_num=0, train/loss_simple_step=0.091, train/loss_vlb_step=0.000299, train/loss_step=0.091, global_step=2798.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  87%|████████▋ | 5186/5971 [51:04<07:43,  1.69it/s, loss=0.123, v_num=0, train/loss_simple_step=0.091, train/loss_vlb_step=0.000299, train/loss_step=0.091, global_step=2798.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  87%|████████▋ | 5186/5971 [51:04<07:43,  1.69it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0237, train/loss_vlb_step=9.94e-5, train/loss_step=0.0237, global_step=2798.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  87%|████████▋ | 5187/5971 [51:05<07:43,  1.69it/s, loss=0.156, v_num=0, train/loss_simple_step=0.682, train/loss_vlb_step=0.0103, train/loss_step=0.682, global_step=2798.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  87%|████████▋ | 5188/5971 [51:07<07:42,  1.69it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00561, train/loss_vlb_step=2.8e-5, train/loss_step=0.00561, global_step=2798.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  87%|████████▋ | 5189/5971 [51:08<07:42,  1.69it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0431, train/loss_vlb_step=0.000147, train/loss_step=0.0431, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  87%|████████▋ | 5190/5971 [51:09<07:41,  1.69it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0431, train/loss_vlb_step=0.000147, train/loss_step=0.0431, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  87%|████████▋ | 5190/5971 [51:09<07:41,  1.69it/s, loss=0.134, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000105, train/loss_step=0.026, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  87%|████████▋ | 5191/5971 [51:10<07:41,  1.69it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0545, train/loss_vlb_step=0.000189, train/loss_step=0.0545, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  87%|████████▋ | 5192/5971 [51:12<07:40,  1.69it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:12,  2.28it/s]
Epoch 4:  87%|████████▋ | 5194/5971 [51:13<07:39,  1.69it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   1%|          | 2/167 [00:00<00:41,  4.01it/s]

Validating:   3%|▎         | 5/167 [00:00<00:15, 10.25it/s]
Epoch 4:  87%|████████▋ | 5198/5971 [51:13<07:36,  1.69it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:00<00:10, 14.57it/s]
Epoch 4:  87%|████████▋ | 5202/5971 [51:13<07:34,  1.69it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   7%|▋         | 11/167 [00:00<00:08, 18.22it/s]
Epoch 4:  87%|████████▋ | 5206/5971 [51:13<07:31,  1.69it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   8%|▊         | 14/167 [00:01<00:07, 20.69it/s]

Validating:  10%|█         | 17/167 [00:01<00:06, 22.54it/s]
Epoch 4:  87%|████████▋ | 5210/5971 [51:14<07:28,  1.70it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 23.85it/s]
Epoch 4:  87%|████████▋ | 5214/5971 [51:14<07:26,  1.70it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 23.54it/s]
Epoch 4:  87%|████████▋ | 5218/5971 [51:14<07:23,  1.70it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 23.60it/s]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 25.13it/s]
Epoch 4:  87%|████████▋ | 5222/5971 [51:14<07:20,  1.70it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 26.10it/s]
Epoch 4:  88%|████████▊ | 5226/5971 [51:14<07:18,  1.70it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  22%|██▏       | 36/167 [00:01<00:04, 27.40it/s]
Epoch 4:  88%|████████▊ | 5230/5971 [51:14<07:15,  1.70it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  23%|██▎       | 39/167 [00:01<00:04, 27.09it/s]
Epoch 4:  88%|████████▊ | 5234/5971 [51:14<07:12,  1.70it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  26%|██▌       | 43/167 [00:02<00:04, 28.24it/s]
Epoch 4:  88%|████████▊ | 5238/5971 [51:15<07:10,  1.70it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  28%|██▊       | 46/167 [00:02<00:04, 28.28it/s]

Validating:  29%|██▉       | 49/167 [00:02<00:04, 26.56it/s]
Epoch 4:  88%|████████▊ | 5242/5971 [51:15<07:07,  1.70it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  31%|███       | 52/167 [00:02<00:04, 27.30it/s]
Epoch 4:  88%|████████▊ | 5246/5971 [51:15<07:04,  1.71it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  34%|███▎      | 56/167 [00:02<00:03, 28.69it/s]
Epoch 4:  88%|████████▊ | 5250/5971 [51:15<07:02,  1.71it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  36%|███▌      | 60/167 [00:02<00:03, 29.96it/s]
Epoch 4:  88%|████████▊ | 5254/5971 [51:15<06:59,  1.71it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  38%|███▊      | 63/167 [00:02<00:03, 29.02it/s]
Epoch 4:  88%|████████▊ | 5258/5971 [51:15<06:57,  1.71it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  40%|███▉      | 66/167 [00:02<00:03, 28.36it/s]

Validating:  41%|████▏     | 69/167 [00:02<00:03, 28.33it/s]
Epoch 4:  88%|████████▊ | 5262/5971 [51:15<06:54,  1.71it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 28.14it/s]
Epoch 4:  88%|████████▊ | 5266/5971 [51:16<06:51,  1.71it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 28.25it/s]
Epoch 4:  88%|████████▊ | 5270/5971 [51:16<06:49,  1.71it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 27.93it/s]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 26.90it/s]
Epoch 4:  88%|████████▊ | 5274/5971 [51:16<06:46,  1.71it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  50%|█████     | 84/167 [00:03<00:03, 26.45it/s]
Epoch 4:  88%|████████▊ | 5278/5971 [51:16<06:43,  1.72it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  52%|█████▏    | 87/167 [00:03<00:03, 25.28it/s]
Epoch 4:  88%|████████▊ | 5282/5971 [51:16<06:41,  1.72it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  54%|█████▍    | 90/167 [00:03<00:03, 24.50it/s]

Validating:  56%|█████▌    | 93/167 [00:03<00:03, 24.21it/s]
Epoch 4:  89%|████████▊ | 5286/5971 [51:16<06:38,  1.72it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 25.38it/s]
Epoch 4:  89%|████████▊ | 5290/5971 [51:16<06:36,  1.72it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 25.14it/s]
Epoch 4:  89%|████████▊ | 5294/5971 [51:17<06:33,  1.72it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  61%|██████    | 102/167 [00:04<00:02, 24.04it/s]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 24.63it/s]
Epoch 4:  89%|████████▊ | 5298/5971 [51:17<06:30,  1.72it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 24.69it/s]
Epoch 4:  89%|████████▉ | 5302/5971 [51:17<06:28,  1.72it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 23.69it/s]
Epoch 4:  89%|████████▉ | 5306/5971 [51:17<06:25,  1.72it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  68%|██████▊   | 114/167 [00:04<00:02, 23.37it/s]

Validating:  70%|███████   | 117/167 [00:04<00:01, 25.00it/s]
Epoch 4:  89%|████████▉ | 5310/5971 [51:17<06:23,  1.73it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 25.80it/s]
Epoch 4:  89%|████████▉ | 5314/5971 [51:17<06:20,  1.73it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 25.82it/s]
Epoch 4:  89%|████████▉ | 5318/5971 [51:18<06:17,  1.73it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 24.89it/s]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 25.72it/s]
Epoch 4:  89%|████████▉ | 5322/5971 [51:18<06:15,  1.73it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 24.94it/s]
Epoch 4:  89%|████████▉ | 5326/5971 [51:18<06:12,  1.73it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  81%|████████  | 135/167 [00:05<00:01, 23.52it/s]
Epoch 4:  89%|████████▉ | 5330/5971 [51:18<06:10,  1.73it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 22.78it/s]

Validating:  84%|████████▍ | 141/167 [00:05<00:01, 24.14it/s]
Epoch 4:  89%|████████▉ | 5334/5971 [51:18<06:07,  1.73it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  86%|████████▌ | 144/167 [00:06<00:01, 22.65it/s]
Epoch 4:  89%|████████▉ | 5338/5971 [51:19<06:05,  1.73it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 23.43it/s]
Epoch 4:  89%|████████▉ | 5342/5971 [51:19<06:02,  1.74it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 24.02it/s]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 24.11it/s]
Epoch 4:  90%|████████▉ | 5346/5971 [51:19<05:59,  1.74it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 23.69it/s]
Epoch 4:  90%|████████▉ | 5350/5971 [51:19<05:57,  1.74it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 24.67it/s][A
Epoch 4:  90%|████████▉ | 5354/5971 [51:19<05:54,  1.74it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 25.83it/s][A

Validating:  99%|█████████▉| 165/167 [00:06<00:00, 25.94it/s][A
Epoch 4:  90%|████████▉ | 5358/5971 [51:19<05:52,  1.74it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|████████▉ | 5360/5971 [51:20<05:51,  1.74it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.33it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.40it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.22it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.83it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.23it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.51it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.68it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.91it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.04it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.15it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.25it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.32it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.37it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.45it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.54it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.61it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.65it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.43it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.31it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.25it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.19it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.18it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.20it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.32it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.25it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.24it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.31it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.28it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.33it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.44it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.49it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.37it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.34it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:03,  5.31it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.31it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.29it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.25it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.35it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.38it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.46it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.52it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.54it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.46it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.37it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.30it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.26it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.27it/s][A
Epoch 4:  90%|████████▉ | 5360/5971 [51:31<05:52,  1.73it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.31it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.34it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.39it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.05it/s]

Epoch 4:  90%|████████▉ | 5361/5971 [51:32<05:51,  1.73it/s, loss=0.119, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=2799.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|████████▉ | 5361/5971 [51:32<05:51,  1.73it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0314, train/loss_vlb_step=0.000116, train/loss_step=0.0314, global_step=2800.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.31it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.32it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:15,  3.12it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.69it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.13it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.45it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.74it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.93it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.07it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.08it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.18it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.26it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:06,  5.30it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.35it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.38it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.40it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.36it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:06,  5.33it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.31it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.29it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.27it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.27it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.28it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.33it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.37it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.46it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.48it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.55it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.59it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.61it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.63it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.65it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.65it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.67it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.69it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.70it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.70it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.67it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.69it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.69it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.67it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.68it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.69it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.68it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.69it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.70it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.71it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.71it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.50it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.44it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.15it/s]

Epoch 4:  90%|████████▉ | 5362/5971 [51:44<05:52,  1.73it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0314, train/loss_vlb_step=0.000116, train/loss_step=0.0314, global_step=2800.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|████████▉ | 5362/5971 [51:44<05:52,  1.73it/s, loss=0.131, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00134, train/loss_step=0.308, global_step=2800.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.33it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.38it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.20it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.81it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.31it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.71it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.00it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.21it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.37it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.48it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.46it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.48it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.53it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.56it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.58it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.58it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.58it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.53it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.50it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.50it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.49it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.37it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.38it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.46it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.39it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.34it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.31it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.33it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.38it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.43it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.37it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.34it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.28it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:03,  5.21it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.23it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.30it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.36it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.38it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.43it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.49it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.52it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.41it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.35it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.26it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.13it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.09it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.09it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.13it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.15it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.15it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.08it/s]

Epoch 4:  90%|████████▉ | 5363/5971 [51:57<05:53,  1.72it/s, loss=0.131, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00134, train/loss_step=0.308, global_step=2800.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|████████▉ | 5363/5971 [51:57<05:53,  1.72it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0309, train/loss_vlb_step=0.000109, train/loss_step=0.0309, global_step=2800.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.33it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.41it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.24it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.87it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.34it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.61it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.82it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.90it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  4.97it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.11it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.22it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.28it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:07,  5.27it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.26it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.34it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.33it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.39it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.44it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.47it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.31it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.23it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.24it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.33it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.31it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.21it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.30it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.35it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.39it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:04,  5.23it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.19it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.26it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.35it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.41it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.46it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.49it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.50it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.45it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.42it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.42it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.33it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.27it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.26it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.32it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.25it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.28it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.35it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.41it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.46it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.52it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.56it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.06it/s]

Epoch 4:  90%|████████▉ | 5364/5971 [52:10<05:54,  1.71it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0309, train/loss_vlb_step=0.000109, train/loss_step=0.0309, global_step=2800.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|████████▉ | 5364/5971 [52:10<05:54,  1.71it/s, loss=0.123, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000372, train/loss_step=0.112, global_step=2800.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  90%|████████▉ | 5365/5971 [52:11<05:53,  1.71it/s, loss=0.123, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000372, train/loss_step=0.112, global_step=2800.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|████████▉ | 5365/5971 [52:11<05:53,  1.71it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0948, train/loss_vlb_step=0.000311, train/loss_step=0.0948, global_step=2801.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|████████▉ | 5366/5971 [52:12<05:53,  1.71it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0948, train/loss_vlb_step=0.000311, train/loss_step=0.0948, global_step=2801.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|████████▉ | 5366/5971 [52:12<05:53,  1.71it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000155, train/loss_step=0.0406, global_step=2801.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|████████▉ | 5367/5971 [52:13<05:52,  1.71it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000155, train/loss_step=0.0406, global_step=2801.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|████████▉ | 5367/5971 [52:13<05:52,  1.71it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0935, train/loss_vlb_step=0.000309, train/loss_step=0.0935, global_step=2801.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|████████▉ | 5368/5971 [52:15<05:52,  1.71it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0935, train/loss_vlb_step=0.000309, train/loss_step=0.0935, global_step=2801.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|████████▉ | 5368/5971 [52:15<05:52,  1.71it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.77e-5, train/loss_step=0.0105, global_step=2801.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  90%|████████▉ | 5369/5971 [52:16<05:51,  1.71it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.77e-5, train/loss_step=0.0105, global_step=2801.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|████████▉ | 5369/5971 [52:16<05:51,  1.71it/s, loss=0.137, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00185, train/loss_step=0.349, global_step=2802.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  90%|████████▉ | 5370/5971 [52:17<05:51,  1.71it/s, loss=0.137, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00185, train/loss_step=0.349, global_step=2802.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|████████▉ | 5370/5971 [52:17<05:51,  1.71it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00167, train/loss_vlb_step=9.98e-6, train/loss_step=0.00167, global_step=2802.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|████████▉ | 5371/5971 [52:18<05:50,  1.71it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00167, train/loss_vlb_step=9.98e-6, train/loss_step=0.00167, global_step=2802.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|████████▉ | 5371/5971 [52:18<05:50,  1.71it/s, loss=0.124, v_num=0, train/loss_simple_step=0.333, train/loss_vlb_step=0.00145, train/loss_step=0.333, global_step=2802.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  90%|████████▉ | 5372/5971 [52:20<05:50,  1.71it/s, loss=0.124, v_num=0, train/loss_simple_step=0.333, train/loss_vlb_step=0.00145, train/loss_step=0.333, global_step=2802.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|████████▉ | 5372/5971 [52:20<05:50,  1.71it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0146, train/loss_vlb_step=6.25e-5, train/loss_step=0.0146, global_step=2802.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|████████▉ | 5373/5971 [52:21<05:49,  1.71it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0146, train/loss_vlb_step=6.25e-5, train/loss_step=0.0146, global_step=2802.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|████████▉ | 5373/5971 [52:21<05:49,  1.71it/s, loss=0.121, v_num=0, train/loss_simple_step=0.049, train/loss_vlb_step=0.000171, train/loss_step=0.049, global_step=2803.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  90%|█████████ | 5374/5971 [52:22<05:49,  1.71it/s, loss=0.121, v_num=0, train/loss_simple_step=0.049, train/loss_vlb_step=0.000171, train/loss_step=0.049, global_step=2803.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5374/5971 [52:22<05:49,  1.71it/s, loss=0.127, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000436, train/loss_step=0.131, global_step=2803.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5375/5971 [52:23<05:48,  1.71it/s, loss=0.127, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000436, train/loss_step=0.131, global_step=2803.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5375/5971 [52:23<05:48,  1.71it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00781, train/loss_vlb_step=3.65e-5, train/loss_step=0.00781, global_step=2803.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5376/5971 [52:25<05:48,  1.71it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00781, train/loss_vlb_step=3.65e-5, train/loss_step=0.00781, global_step=2803.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5376/5971 [52:25<05:48,  1.71it/s, loss=0.0934, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=5.61e-5, train/loss_step=0.0136, global_step=2803.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  90%|█████████ | 5377/5971 [52:26<05:47,  1.71it/s, loss=0.0934, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=5.61e-5, train/loss_step=0.0136, global_step=2803.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5377/5971 [52:26<05:47,  1.71it/s, loss=0.0994, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000558, train/loss_step=0.162, global_step=2804.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  90%|█████████ | 5378/5971 [52:27<05:46,  1.71it/s, loss=0.0994, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000558, train/loss_step=0.162, global_step=2804.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5378/5971 [52:27<05:46,  1.71it/s, loss=0.0982, v_num=0, train/loss_simple_step=0.00176, train/loss_vlb_step=1.07e-5, train/loss_step=0.00176, global_step=2804.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5379/5971 [52:28<05:46,  1.71it/s, loss=0.0982, v_num=0, train/loss_simple_step=0.00176, train/loss_vlb_step=1.07e-5, train/loss_step=0.00176, global_step=2804.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5379/5971 [52:28<05:46,  1.71it/s, loss=0.0957, v_num=0, train/loss_simple_step=0.00608, train/loss_vlb_step=2.9e-5, train/loss_step=0.00608, global_step=2804.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  90%|█████████ | 5380/5971 [52:30<05:46,  1.71it/s, loss=0.0957, v_num=0, train/loss_simple_step=0.00608, train/loss_vlb_step=2.9e-5, train/loss_step=0.00608, global_step=2804.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5380/5971 [52:30<05:46,  1.71it/s, loss=0.0901, v_num=0, train/loss_simple_step=0.010, train/loss_vlb_step=4.64e-5, train/loss_step=0.010, global_step=2804.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  90%|█████████ | 5381/5971 [52:31<05:45,  1.71it/s, loss=0.0901, v_num=0, train/loss_simple_step=0.010, train/loss_vlb_step=4.64e-5, train/loss_step=0.010, global_step=2804.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5381/5971 [52:31<05:45,  1.71it/s, loss=0.089, v_num=0, train/loss_simple_step=0.00929, train/loss_vlb_step=4.32e-5, train/loss_step=0.00929, global_step=2805.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5382/5971 [52:32<05:44,  1.71it/s, loss=0.089, v_num=0, train/loss_simple_step=0.00929, train/loss_vlb_step=4.32e-5, train/loss_step=0.00929, global_step=2805.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5382/5971 [52:32<05:44,  1.71it/s, loss=0.0751, v_num=0, train/loss_simple_step=0.0294, train/loss_vlb_step=0.000115, train/loss_step=0.0294, global_step=2805.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5383/5971 [52:33<05:44,  1.71it/s, loss=0.0751, v_num=0, train/loss_simple_step=0.0294, train/loss_vlb_step=0.000115, train/loss_step=0.0294, global_step=2805.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5383/5971 [52:33<05:44,  1.71it/s, loss=0.0772, v_num=0, train/loss_simple_step=0.0728, train/loss_vlb_step=0.000245, train/loss_step=0.0728, global_step=2805.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5384/5971 [52:35<05:43,  1.71it/s, loss=0.0772, v_num=0, train/loss_simple_step=0.0728, train/loss_vlb_step=0.000245, train/loss_step=0.0728, global_step=2805.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5384/5971 [52:35<05:43,  1.71it/s, loss=0.0855, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.00107, train/loss_step=0.278, global_step=2805.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  90%|█████████ | 5385/5971 [52:36<05:43,  1.71it/s, loss=0.0855, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.00107, train/loss_step=0.278, global_step=2805.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5385/5971 [52:36<05:43,  1.71it/s, loss=0.0952, v_num=0, train/loss_simple_step=0.291, train/loss_vlb_step=0.00111, train/loss_step=0.291, global_step=2806.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5386/5971 [52:37<05:42,  1.71it/s, loss=0.0952, v_num=0, train/loss_simple_step=0.291, train/loss_vlb_step=0.00111, train/loss_step=0.291, global_step=2806.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5386/5971 [52:37<05:42,  1.71it/s, loss=0.121, v_num=0, train/loss_simple_step=0.562, train/loss_vlb_step=0.00498, train/loss_step=0.562, global_step=2806.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  90%|█████████ | 5387/5971 [52:38<05:42,  1.71it/s, loss=0.121, v_num=0, train/loss_simple_step=0.562, train/loss_vlb_step=0.00498, train/loss_step=0.562, global_step=2806.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5387/5971 [52:38<05:42,  1.71it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00248, train/loss_vlb_step=1.4e-5, train/loss_step=0.00248, global_step=2806.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5388/5971 [52:40<05:41,  1.71it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00248, train/loss_vlb_step=1.4e-5, train/loss_step=0.00248, global_step=2806.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5388/5971 [52:40<05:41,  1.71it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0622, train/loss_vlb_step=0.000205, train/loss_step=0.0622, global_step=2806.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5389/5971 [52:41<05:41,  1.70it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0622, train/loss_vlb_step=0.000205, train/loss_step=0.0622, global_step=2806.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5389/5971 [52:41<05:41,  1.70it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0916, train/loss_vlb_step=0.000305, train/loss_step=0.0916, global_step=2807.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5390/5971 [52:42<05:40,  1.70it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0916, train/loss_vlb_step=0.000305, train/loss_step=0.0916, global_step=2807.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5390/5971 [52:42<05:40,  1.70it/s, loss=0.117, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000714, train/loss_step=0.203, global_step=2807.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  90%|█████████ | 5391/5971 [52:43<05:40,  1.70it/s, loss=0.117, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000714, train/loss_step=0.203, global_step=2807.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5391/5971 [52:43<05:40,  1.70it/s, loss=0.12, v_num=0, train/loss_simple_step=0.410, train/loss_vlb_step=0.0033, train/loss_step=0.410, global_step=2807.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  90%|█████████ | 5392/5971 [52:45<05:39,  1.70it/s, loss=0.12, v_num=0, train/loss_simple_step=0.410, train/loss_vlb_step=0.0033, train/loss_step=0.410, global_step=2807.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5392/5971 [52:45<05:39,  1.70it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00177, train/loss_vlb_step=1.06e-5, train/loss_step=0.00177, global_step=2807.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5393/5971 [52:46<05:39,  1.70it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00177, train/loss_vlb_step=1.06e-5, train/loss_step=0.00177, global_step=2807.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5393/5971 [52:46<05:39,  1.70it/s, loss=0.124, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000462, train/loss_step=0.134, global_step=2808.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  90%|█████████ | 5394/5971 [52:47<05:38,  1.70it/s, loss=0.124, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000462, train/loss_step=0.134, global_step=2808.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5394/5971 [52:47<05:38,  1.70it/s, loss=0.131, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00121, train/loss_step=0.269, global_step=2808.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  90%|█████████ | 5395/5971 [52:48<05:38,  1.70it/s, loss=0.131, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00121, train/loss_step=0.269, global_step=2808.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5395/5971 [52:48<05:38,  1.70it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0524, train/loss_vlb_step=0.00018, train/loss_step=0.0524, global_step=2808.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5396/5971 [52:50<05:37,  1.70it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0524, train/loss_vlb_step=0.00018, train/loss_step=0.0524, global_step=2808.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5396/5971 [52:50<05:37,  1.70it/s, loss=0.149, v_num=0, train/loss_simple_step=0.332, train/loss_vlb_step=0.0015, train/loss_step=0.332, global_step=2808.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  90%|█████████ | 5397/5971 [52:51<05:37,  1.70it/s, loss=0.149, v_num=0, train/loss_simple_step=0.332, train/loss_vlb_step=0.0015, train/loss_step=0.332, global_step=2808.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5397/5971 [52:51<05:37,  1.70it/s, loss=0.148, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000471, train/loss_step=0.142, global_step=2809.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5398/5971 [52:52<05:36,  1.70it/s, loss=0.148, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000471, train/loss_step=0.142, global_step=2809.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5398/5971 [52:52<05:36,  1.70it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0808, train/loss_vlb_step=0.00027, train/loss_step=0.0808, global_step=2809.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5399/5971 [52:53<05:36,  1.70it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0808, train/loss_vlb_step=0.00027, train/loss_step=0.0808, global_step=2809.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5399/5971 [52:53<05:36,  1.70it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0196, train/loss_vlb_step=8.22e-5, train/loss_step=0.0196, global_step=2809.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5400/5971 [52:55<05:35,  1.70it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0196, train/loss_vlb_step=8.22e-5, train/loss_step=0.0196, global_step=2809.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5400/5971 [52:55<05:35,  1.70it/s, loss=0.186, v_num=0, train/loss_simple_step=0.672, train/loss_vlb_step=0.0164, train/loss_step=0.672, global_step=2809.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  90%|█████████ | 5401/5971 [52:56<05:35,  1.70it/s, loss=0.186, v_num=0, train/loss_simple_step=0.672, train/loss_vlb_step=0.0164, train/loss_step=0.672, global_step=2809.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5401/5971 [52:56<05:35,  1.70it/s, loss=0.191, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000395, train/loss_step=0.120, global_step=2810.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5402/5971 [52:57<05:34,  1.70it/s, loss=0.191, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000395, train/loss_step=0.120, global_step=2810.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5402/5971 [52:57<05:34,  1.70it/s, loss=0.212, v_num=0, train/loss_simple_step=0.449, train/loss_vlb_step=0.00247, train/loss_step=0.449, global_step=2810.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  90%|█████████ | 5403/5971 [52:58<05:34,  1.70it/s, loss=0.212, v_num=0, train/loss_simple_step=0.449, train/loss_vlb_step=0.00247, train/loss_step=0.449, global_step=2810.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  90%|█████████ | 5403/5971 [52:58<05:34,  1.70it/s, loss=0.22, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.00084, train/loss_step=0.219, global_step=2810.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  91%|█████████ | 5404/5971 [53:00<05:33,  1.70it/s, loss=0.22, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.00084, train/loss_step=0.219, global_step=2810.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5404/5971 [53:00<05:33,  1.70it/s, loss=0.206, v_num=0, train/loss_simple_step=0.00315, train/loss_vlb_step=1.74e-5, train/loss_step=0.00315, global_step=2810.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5405/5971 [53:01<05:33,  1.70it/s, loss=0.206, v_num=0, train/loss_simple_step=0.00315, train/loss_vlb_step=1.74e-5, train/loss_step=0.00315, global_step=2810.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5405/5971 [53:01<05:33,  1.70it/s, loss=0.203, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000869, train/loss_step=0.228, global_step=2811.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  91%|█████████ | 5406/5971 [53:02<05:32,  1.70it/s, loss=0.203, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000869, train/loss_step=0.228, global_step=2811.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5406/5971 [53:02<05:32,  1.70it/s, loss=0.188, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.00137, train/loss_step=0.274, global_step=2811.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  91%|█████████ | 5407/5971 [53:03<05:31,  1.70it/s, loss=0.188, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.00137, train/loss_step=0.274, global_step=2811.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5407/5971 [53:03<05:31,  1.70it/s, loss=0.189, v_num=0, train/loss_simple_step=0.014, train/loss_vlb_step=6.1e-5, train/loss_step=0.014, global_step=2811.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  91%|█████████ | 5408/5971 [53:05<05:31,  1.70it/s, loss=0.189, v_num=0, train/loss_simple_step=0.014, train/loss_vlb_step=6.1e-5, train/loss_step=0.014, global_step=2811.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5408/5971 [53:05<05:31,  1.70it/s, loss=0.198, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.000883, train/loss_step=0.243, global_step=2811.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5409/5971 [53:06<05:30,  1.70it/s, loss=0.198, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.000883, train/loss_step=0.243, global_step=2811.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5409/5971 [53:06<05:30,  1.70it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0466, train/loss_vlb_step=0.000165, train/loss_step=0.0466, global_step=2812.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5410/5971 [53:06<05:30,  1.70it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0466, train/loss_vlb_step=0.000165, train/loss_step=0.0466, global_step=2812.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5410/5971 [53:06<05:30,  1.70it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00448, train/loss_vlb_step=2.34e-5, train/loss_step=0.00448, global_step=2812.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5411/5971 [53:07<05:29,  1.70it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00448, train/loss_vlb_step=2.34e-5, train/loss_step=0.00448, global_step=2812.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5411/5971 [53:07<05:29,  1.70it/s, loss=0.169, v_num=0, train/loss_simple_step=0.072, train/loss_vlb_step=0.000243, train/loss_step=0.072, global_step=2812.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  91%|█████████ | 5412/5971 [53:10<05:29,  1.70it/s, loss=0.169, v_num=0, train/loss_simple_step=0.072, train/loss_vlb_step=0.000243, train/loss_step=0.072, global_step=2812.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5412/5971 [53:10<05:29,  1.70it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0022, train/loss_vlb_step=1.28e-5, train/loss_step=0.0022, global_step=2812.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5413/5971 [53:11<05:28,  1.70it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0022, train/loss_vlb_step=1.28e-5, train/loss_step=0.0022, global_step=2812.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5413/5971 [53:11<05:28,  1.70it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00155, train/loss_vlb_step=9.24e-6, train/loss_step=0.00155, global_step=2813.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5414/5971 [53:12<05:28,  1.70it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00155, train/loss_vlb_step=9.24e-6, train/loss_step=0.00155, global_step=2813.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5414/5971 [53:12<05:28,  1.70it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00132, train/loss_vlb_step=8.06e-6, train/loss_step=0.00132, global_step=2813.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5415/5971 [53:13<05:27,  1.70it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00132, train/loss_vlb_step=8.06e-6, train/loss_step=0.00132, global_step=2813.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5415/5971 [53:13<05:27,  1.70it/s, loss=0.158, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000815, train/loss_step=0.227, global_step=2813.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  91%|█████████ | 5416/5971 [53:15<05:27,  1.70it/s, loss=0.158, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000815, train/loss_step=0.227, global_step=2813.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5416/5971 [53:15<05:27,  1.70it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0068, train/loss_vlb_step=3.42e-5, train/loss_step=0.0068, global_step=2813.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5417/5971 [53:16<05:26,  1.70it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0068, train/loss_vlb_step=3.42e-5, train/loss_step=0.0068, global_step=2813.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5417/5971 [53:16<05:26,  1.70it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00965, train/loss_vlb_step=4.42e-5, train/loss_step=0.00965, global_step=2814.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5418/5971 [53:17<05:26,  1.69it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00965, train/loss_vlb_step=4.42e-5, train/loss_step=0.00965, global_step=2814.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5418/5971 [53:17<05:26,  1.69it/s, loss=0.137, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000426, train/loss_step=0.129, global_step=2814.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  91%|█████████ | 5419/5971 [53:18<05:25,  1.69it/s, loss=0.137, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000426, train/loss_step=0.129, global_step=2814.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5419/5971 [53:18<05:25,  1.69it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0493, train/loss_vlb_step=0.000173, train/loss_step=0.0493, global_step=2814.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5420/5971 [53:20<05:25,  1.69it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0493, train/loss_vlb_step=0.000173, train/loss_step=0.0493, global_step=2814.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5420/5971 [53:20<05:25,  1.69it/s, loss=0.105, v_num=0, train/loss_simple_step=0.00643, train/loss_vlb_step=3.24e-5, train/loss_step=0.00643, global_step=2814.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5421/5971 [53:21<05:24,  1.69it/s, loss=0.105, v_num=0, train/loss_simple_step=0.00643, train/loss_vlb_step=3.24e-5, train/loss_step=0.00643, global_step=2814.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5421/5971 [53:21<05:24,  1.69it/s, loss=0.104, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000347, train/loss_step=0.104, global_step=2815.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  91%|█████████ | 5422/5971 [53:22<05:24,  1.69it/s, loss=0.104, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000347, train/loss_step=0.104, global_step=2815.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5422/5971 [53:22<05:24,  1.69it/s, loss=0.0834, v_num=0, train/loss_simple_step=0.0268, train/loss_vlb_step=0.000107, train/loss_step=0.0268, global_step=2815.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5423/5971 [53:23<05:23,  1.69it/s, loss=0.0834, v_num=0, train/loss_simple_step=0.0268, train/loss_vlb_step=0.000107, train/loss_step=0.0268, global_step=2815.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5423/5971 [53:23<05:23,  1.69it/s, loss=0.0772, v_num=0, train/loss_simple_step=0.0964, train/loss_vlb_step=0.000317, train/loss_step=0.0964, global_step=2815.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5424/5971 [53:25<05:23,  1.69it/s, loss=0.0772, v_num=0, train/loss_simple_step=0.0964, train/loss_vlb_step=0.000317, train/loss_step=0.0964, global_step=2815.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5424/5971 [53:25<05:23,  1.69it/s, loss=0.0806, v_num=0, train/loss_simple_step=0.0693, train/loss_vlb_step=0.000242, train/loss_step=0.0693, global_step=2815.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5425/5971 [53:26<05:22,  1.69it/s, loss=0.0806, v_num=0, train/loss_simple_step=0.0693, train/loss_vlb_step=0.000242, train/loss_step=0.0693, global_step=2815.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5425/5971 [53:26<05:22,  1.69it/s, loss=0.0749, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000381, train/loss_step=0.116, global_step=2816.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  91%|█████████ | 5426/5971 [53:27<05:22,  1.69it/s, loss=0.0749, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000381, train/loss_step=0.116, global_step=2816.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5426/5971 [53:27<05:22,  1.69it/s, loss=0.0617, v_num=0, train/loss_simple_step=0.00835, train/loss_vlb_step=3.87e-5, train/loss_step=0.00835, global_step=2816.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5427/5971 [53:28<05:21,  1.69it/s, loss=0.0617, v_num=0, train/loss_simple_step=0.00835, train/loss_vlb_step=3.87e-5, train/loss_step=0.00835, global_step=2816.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5427/5971 [53:28<05:21,  1.69it/s, loss=0.105, v_num=0, train/loss_simple_step=0.890, train/loss_vlb_step=0.0652, train/loss_step=0.890, global_step=2816.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]      
Epoch 4:  91%|█████████ | 5428/5971 [53:30<05:21,  1.69it/s, loss=0.105, v_num=0, train/loss_simple_step=0.890, train/loss_vlb_step=0.0652, train/loss_step=0.890, global_step=2816.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5428/5971 [53:30<05:21,  1.69it/s, loss=0.0944, v_num=0, train/loss_simple_step=0.0212, train/loss_vlb_step=8.59e-5, train/loss_step=0.0212, global_step=2816.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5429/5971 [53:31<05:20,  1.69it/s, loss=0.0944, v_num=0, train/loss_simple_step=0.0212, train/loss_vlb_step=8.59e-5, train/loss_step=0.0212, global_step=2816.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5429/5971 [53:31<05:20,  1.69it/s, loss=0.108, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.00115, train/loss_step=0.313, global_step=2817.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  91%|█████████ | 5430/5971 [53:32<05:19,  1.69it/s, loss=0.108, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.00115, train/loss_step=0.313, global_step=2817.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5430/5971 [53:32<05:19,  1.69it/s, loss=0.121, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.00107, train/loss_step=0.268, global_step=2817.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5431/5971 [53:33<05:19,  1.69it/s, loss=0.121, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.00107, train/loss_step=0.268, global_step=2817.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5431/5971 [53:33<05:19,  1.69it/s, loss=0.127, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000623, train/loss_step=0.185, global_step=2817.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5432/5971 [53:35<05:19,  1.69it/s, loss=0.127, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000623, train/loss_step=0.185, global_step=2817.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5432/5971 [53:35<05:19,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0544, train/loss_vlb_step=0.000194, train/loss_step=0.0544, global_step=2817.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5433/5971 [53:36<05:18,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0544, train/loss_vlb_step=0.000194, train/loss_step=0.0544, global_step=2817.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5433/5971 [53:36<05:18,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00351, train/loss_vlb_step=1.93e-5, train/loss_step=0.00351, global_step=2818.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5434/5971 [53:37<05:17,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00351, train/loss_vlb_step=1.93e-5, train/loss_step=0.00351, global_step=2818.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5434/5971 [53:37<05:17,  1.69it/s, loss=0.169, v_num=0, train/loss_simple_step=0.790, train/loss_vlb_step=0.0296, train/loss_step=0.790, global_step=2818.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]     
Epoch 4:  91%|█████████ | 5435/5971 [53:38<05:17,  1.69it/s, loss=0.169, v_num=0, train/loss_simple_step=0.790, train/loss_vlb_step=0.0296, train/loss_step=0.790, global_step=2818.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5435/5971 [53:38<05:17,  1.69it/s, loss=0.168, v_num=0, train/loss_simple_step=0.215, train/loss_vlb_step=0.000811, train/loss_step=0.215, global_step=2818.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5436/5971 [53:41<05:16,  1.69it/s, loss=0.168, v_num=0, train/loss_simple_step=0.215, train/loss_vlb_step=0.000811, train/loss_step=0.215, global_step=2818.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5436/5971 [53:41<05:16,  1.69it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00251, train/loss_vlb_step=1.45e-5, train/loss_step=0.00251, global_step=2818.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5437/5971 [53:42<05:16,  1.69it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00251, train/loss_vlb_step=1.45e-5, train/loss_step=0.00251, global_step=2818.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5437/5971 [53:42<05:16,  1.69it/s, loss=0.175, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000544, train/loss_step=0.156, global_step=2819.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  91%|█████████ | 5438/5971 [53:43<05:15,  1.69it/s, loss=0.175, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000544, train/loss_step=0.156, global_step=2819.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5438/5971 [53:43<05:15,  1.69it/s, loss=0.174, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000374, train/loss_step=0.113, global_step=2819.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5439/5971 [53:43<05:15,  1.69it/s, loss=0.174, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000374, train/loss_step=0.113, global_step=2819.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5439/5971 [53:43<05:15,  1.69it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0794, train/loss_vlb_step=0.000266, train/loss_step=0.0794, global_step=2819.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5440/5971 [53:46<05:14,  1.69it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0794, train/loss_vlb_step=0.000266, train/loss_step=0.0794, global_step=2819.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5440/5971 [53:46<05:14,  1.69it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00544, train/loss_vlb_step=2.79e-5, train/loss_step=0.00544, global_step=2819.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5441/5971 [53:47<05:14,  1.69it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00544, train/loss_vlb_step=2.79e-5, train/loss_step=0.00544, global_step=2819.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5441/5971 [53:47<05:14,  1.69it/s, loss=0.195, v_num=0, train/loss_simple_step=0.495, train/loss_vlb_step=0.00359, train/loss_step=0.495, global_step=2820.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  91%|█████████ | 5442/5971 [53:48<05:13,  1.69it/s, loss=0.195, v_num=0, train/loss_simple_step=0.495, train/loss_vlb_step=0.00359, train/loss_step=0.495, global_step=2820.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5442/5971 [53:48<05:13,  1.69it/s, loss=0.208, v_num=0, train/loss_simple_step=0.271, train/loss_vlb_step=0.0012, train/loss_step=0.271, global_step=2820.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  91%|█████████ | 5443/5971 [53:49<05:13,  1.69it/s, loss=0.208, v_num=0, train/loss_simple_step=0.271, train/loss_vlb_step=0.0012, train/loss_step=0.271, global_step=2820.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5443/5971 [53:49<05:13,  1.69it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=5.7e-5, train/loss_step=0.0141, global_step=2820.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5444/5971 [53:51<05:12,  1.69it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=5.7e-5, train/loss_step=0.0141, global_step=2820.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5444/5971 [53:51<05:12,  1.69it/s, loss=0.203, v_num=0, train/loss_simple_step=0.0585, train/loss_vlb_step=0.000196, train/loss_step=0.0585, global_step=2820.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5445/5971 [53:52<05:12,  1.68it/s, loss=0.203, v_num=0, train/loss_simple_step=0.0585, train/loss_vlb_step=0.000196, train/loss_step=0.0585, global_step=2820.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5445/5971 [53:52<05:12,  1.68it/s, loss=0.199, v_num=0, train/loss_simple_step=0.0449, train/loss_vlb_step=0.000164, train/loss_step=0.0449, global_step=2821.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5446/5971 [53:53<05:11,  1.68it/s, loss=0.199, v_num=0, train/loss_simple_step=0.0449, train/loss_vlb_step=0.000164, train/loss_step=0.0449, global_step=2821.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5446/5971 [53:53<05:11,  1.68it/s, loss=0.211, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.000826, train/loss_step=0.234, global_step=2821.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  91%|█████████ | 5447/5971 [53:54<05:11,  1.68it/s, loss=0.211, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.000826, train/loss_step=0.234, global_step=2821.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5447/5971 [53:54<05:11,  1.68it/s, loss=0.178, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000793, train/loss_step=0.225, global_step=2821.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5448/5971 [53:56<05:10,  1.68it/s, loss=0.178, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000793, train/loss_step=0.225, global_step=2821.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████ | 5448/5971 [53:56<05:10,  1.68it/s, loss=0.191, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.00164, train/loss_step=0.283, global_step=2821.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  91%|█████████▏| 5449/5971 [53:57<05:10,  1.68it/s, loss=0.191, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.00164, train/loss_step=0.283, global_step=2821.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████▏| 5449/5971 [53:57<05:10,  1.68it/s, loss=0.175, v_num=0, train/loss_simple_step=0.010, train/loss_vlb_step=4.58e-5, train/loss_step=0.010, global_step=2822.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████▏| 5450/5971 [53:58<05:09,  1.68it/s, loss=0.175, v_num=0, train/loss_simple_step=0.010, train/loss_vlb_step=4.58e-5, train/loss_step=0.010, global_step=2822.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████▏| 5450/5971 [53:58<05:09,  1.68it/s, loss=0.175, v_num=0, train/loss_simple_step=0.260, train/loss_vlb_step=0.00109, train/loss_step=0.260, global_step=2822.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████▏| 5451/5971 [53:58<05:08,  1.68it/s, loss=0.175, v_num=0, train/loss_simple_step=0.260, train/loss_vlb_step=0.00109, train/loss_step=0.260, global_step=2822.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████▏| 5451/5971 [53:58<05:08,  1.68it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00364, train/loss_vlb_step=1.9e-5, train/loss_step=0.00364, global_step=2822.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████▏| 5452/5971 [54:01<05:08,  1.68it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00364, train/loss_vlb_step=1.9e-5, train/loss_step=0.00364, global_step=2822.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████▏| 5452/5971 [54:01<05:08,  1.68it/s, loss=0.181, v_num=0, train/loss_simple_step=0.347, train/loss_vlb_step=0.00184, train/loss_step=0.347, global_step=2822.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  91%|█████████▏| 5453/5971 [54:02<05:07,  1.68it/s, loss=0.181, v_num=0, train/loss_simple_step=0.347, train/loss_vlb_step=0.00184, train/loss_step=0.347, global_step=2822.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████▏| 5453/5971 [54:02<05:07,  1.68it/s, loss=0.195, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00127, train/loss_step=0.296, global_step=2823.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████▏| 5454/5971 [54:03<05:07,  1.68it/s, loss=0.195, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00127, train/loss_step=0.296, global_step=2823.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████▏| 5454/5971 [54:03<05:07,  1.68it/s, loss=0.169, v_num=0, train/loss_simple_step=0.260, train/loss_vlb_step=0.00111, train/loss_step=0.260, global_step=2823.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████▏| 5455/5971 [54:04<05:06,  1.68it/s, loss=0.169, v_num=0, train/loss_simple_step=0.260, train/loss_vlb_step=0.00111, train/loss_step=0.260, global_step=2823.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████▏| 5455/5971 [54:04<05:06,  1.68it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00381, train/loss_vlb_step=1.89e-5, train/loss_step=0.00381, global_step=2823.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████▏| 5456/5971 [54:06<05:06,  1.68it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00381, train/loss_vlb_step=1.89e-5, train/loss_step=0.00381, global_step=2823.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████▏| 5456/5971 [54:06<05:06,  1.68it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00859, train/loss_vlb_step=4.01e-5, train/loss_step=0.00859, global_step=2823.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████▏| 5457/5971 [54:07<05:05,  1.68it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00859, train/loss_vlb_step=4.01e-5, train/loss_step=0.00859, global_step=2823.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████▏| 5457/5971 [54:07<05:05,  1.68it/s, loss=0.182, v_num=0, train/loss_simple_step=0.637, train/loss_vlb_step=0.00634, train/loss_step=0.637, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  91%|█████████▏| 5458/5971 [54:08<05:05,  1.68it/s, loss=0.182, v_num=0, train/loss_simple_step=0.637, train/loss_vlb_step=0.00634, train/loss_step=0.637, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████▏| 5458/5971 [54:08<05:05,  1.68it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00687, train/loss_vlb_step=3.39e-5, train/loss_step=0.00687, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████▏| 5459/5971 [54:09<05:04,  1.68it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00687, train/loss_vlb_step=3.39e-5, train/loss_step=0.00687, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████▏| 5459/5971 [54:09<05:04,  1.68it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0571, train/loss_vlb_step=0.0002, train/loss_step=0.0571, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  91%|█████████▏| 5460/5971 [54:11<05:04,  1.68it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0571, train/loss_vlb_step=0.0002, train/loss_step=0.0571, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  91%|█████████▏| 5460/5971 [54:11<05:04,  1.68it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:04,  2.57it/s][A
Epoch 4:  91%|█████████▏| 5462/5971 [54:12<05:03,  1.68it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   1%|          | 2/167 [00:00<00:49,  3.35it/s][A
Epoch 4:  92%|█████████▏| 5464/5971 [54:12<05:01,  1.68it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   3%|▎         | 5/167 [00:00<00:18,  8.57it/s][A
Epoch 4:  92%|█████████▏| 5467/5971 [54:12<04:59,  1.68it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   5%|▌         | 9/167 [00:00<00:11, 13.89it/s][A
Epoch 4:  92%|█████████▏| 5470/5971 [54:12<04:57,  1.68it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   7%|▋         | 12/167 [00:01<00:09, 17.03it/s][A
Epoch 4:  92%|█████████▏| 5473/5971 [54:12<04:55,  1.68it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   9%|▉         | 15/167 [00:01<00:07, 19.77it/s][A
Epoch 4:  92%|█████████▏| 5476/5971 [54:12<04:53,  1.68it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  11%|█         | 18/167 [00:01<00:06, 21.45it/s][A
Epoch 4:  92%|█████████▏| 5479/5971 [54:12<04:52,  1.68it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 21.36it/s][A
Epoch 4:  92%|█████████▏| 5482/5971 [54:13<04:50,  1.69it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  15%|█▍        | 25/167 [00:01<00:05, 23.85it/s][A
Epoch 4:  92%|█████████▏| 5486/5971 [54:13<04:47,  1.69it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  17%|█▋        | 28/167 [00:01<00:05, 24.97it/s][A
Epoch 4:  92%|█████████▏| 5490/5971 [54:13<04:44,  1.69it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  19%|█▊        | 31/167 [00:01<00:05, 25.99it/s][A
Epoch 4:  92%|█████████▏| 5494/5971 [54:13<04:42,  1.69it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  20%|██        | 34/167 [00:01<00:04, 26.60it/s][A
Epoch 4:  92%|█████████▏| 5498/5971 [54:13<04:39,  1.69it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  23%|██▎       | 38/167 [00:01<00:04, 28.19it/s][A

Validating:  25%|██▍       | 41/167 [00:02<00:04, 28.06it/s][A
Epoch 4:  92%|█████████▏| 5502/5971 [54:13<04:37,  1.69it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 25.88it/s][A
Epoch 4:  92%|█████████▏| 5506/5971 [54:13<04:34,  1.69it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 26.02it/s][A
Epoch 4:  92%|█████████▏| 5510/5971 [54:14<04:32,  1.69it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 25.63it/s][A

Validating:  32%|███▏      | 53/167 [00:02<00:04, 26.40it/s][A
Epoch 4:  92%|█████████▏| 5514/5971 [54:14<04:29,  1.69it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 26.10it/s][A
Epoch 4:  92%|█████████▏| 5518/5971 [54:14<04:27,  1.70it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  36%|███▌      | 60/167 [00:02<00:03, 27.28it/s][A
Epoch 4:  92%|█████████▏| 5522/5971 [54:14<04:24,  1.70it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  38%|███▊      | 63/167 [00:02<00:03, 26.72it/s][A
Epoch 4:  93%|█████████▎| 5526/5971 [54:14<04:22,  1.70it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  40%|███▉      | 66/167 [00:03<00:03, 27.53it/s][A
Epoch 4:  93%|█████████▎| 5530/5971 [54:14<04:19,  1.70it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 28.45it/s][A

Validating:  44%|████▎     | 73/167 [00:03<00:03, 27.35it/s][A
Epoch 4:  93%|█████████▎| 5534/5971 [54:14<04:16,  1.70it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 26.80it/s][A
Epoch 4:  93%|█████████▎| 5538/5971 [54:15<04:14,  1.70it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 27.91it/s][A
Epoch 4:  93%|█████████▎| 5542/5971 [54:15<04:11,  1.70it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  50%|█████     | 84/167 [00:03<00:02, 28.22it/s][A
Epoch 4:  93%|█████████▎| 5546/5971 [54:15<04:09,  1.70it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  52%|█████▏    | 87/167 [00:03<00:02, 27.20it/s][A
Epoch 4:  93%|█████████▎| 5550/5971 [54:15<04:06,  1.71it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  54%|█████▍    | 90/167 [00:03<00:02, 26.57it/s][A

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 27.26it/s][A
Epoch 4:  93%|█████████▎| 5554/5971 [54:15<04:04,  1.71it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 27.70it/s][A
Epoch 4:  93%|█████████▎| 5558/5971 [54:15<04:01,  1.71it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 26.91it/s][A
Epoch 4:  93%|█████████▎| 5562/5971 [54:16<03:59,  1.71it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  61%|██████    | 102/167 [00:04<00:02, 26.77it/s][A

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 25.55it/s][A
Epoch 4:  93%|█████████▎| 5566/5971 [54:16<03:56,  1.71it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 26.10it/s][A
Epoch 4:  93%|█████████▎| 5570/5971 [54:16<03:54,  1.71it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 25.76it/s][A
Epoch 4:  93%|█████████▎| 5574/5971 [54:16<03:51,  1.71it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  68%|██████▊   | 114/167 [00:04<00:02, 26.14it/s][A

Validating:  70%|███████   | 117/167 [00:04<00:01, 27.18it/s][A
Epoch 4:  93%|█████████▎| 5578/5971 [54:16<03:49,  1.71it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 26.97it/s][A
Epoch 4:  93%|█████████▎| 5582/5971 [54:16<03:46,  1.71it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 28.19it/s][A
Epoch 4:  94%|█████████▎| 5586/5971 [54:16<03:44,  1.72it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 26.88it/s][A
Epoch 4:  94%|█████████▎| 5590/5971 [54:17<03:41,  1.72it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 26.32it/s][A

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 26.76it/s][A
Epoch 4:  94%|█████████▎| 5594/5971 [54:17<03:39,  1.72it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 26.71it/s][A
Epoch 4:  94%|█████████▍| 5598/5971 [54:17<03:37,  1.72it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 25.46it/s][A
Epoch 4:  94%|█████████▍| 5602/5971 [54:17<03:34,  1.72it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  85%|████████▌ | 142/167 [00:05<00:01, 24.50it/s][A

Validating:  87%|████████▋ | 145/167 [00:05<00:00, 25.70it/s][A
Epoch 4:  94%|█████████▍| 5606/5971 [54:17<03:32,  1.72it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 26.83it/s][A
Epoch 4:  94%|█████████▍| 5610/5971 [54:17<03:29,  1.72it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 26.32it/s][A
Epoch 4:  94%|█████████▍| 5614/5971 [54:17<03:27,  1.72it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 27.34it/s][A
Epoch 4:  94%|█████████▍| 5618/5971 [54:18<03:24,  1.72it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 26.63it/s][A

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 26.51it/s][A
Epoch 4:  94%|█████████▍| 5622/5971 [54:18<03:22,  1.73it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 25.45it/s][A
Epoch 4:  94%|█████████▍| 5626/5971 [54:18<03:19,  1.73it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating: 100%|██████████| 167/167 [00:06<00:00, 25.98it/s][A
Epoch 4:  94%|█████████▍| 5628/5971 [54:18<03:18,  1.73it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.2e-6, train/loss_step=0.00152, global_step=2824.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Epoch 4:  94%|█████████▍| 5629/5971 [54:19<03:18,  1.73it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0236, train/loss_vlb_step=9.6e-5, train/loss_step=0.0236, global_step=2825.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  94%|█████████▍| 5630/5971 [54:20<03:17,  1.73it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0236, train/loss_vlb_step=9.6e-5, train/loss_step=0.0236, global_step=2825.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  94%|█████████▍| 5630/5971 [54:20<03:17,  1.73it/s, loss=0.147, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.00059, train/loss_step=0.168, global_step=2825.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  94%|█████████▍| 5631/5971 [54:21<03:16,  1.73it/s, loss=0.148, v_num=0, train/loss_simple_step=0.028, train/loss_vlb_step=0.000106, train/loss_step=0.028, global_step=2825.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  94%|█████████▍| 5632/5971 [54:24<03:16,  1.73it/s, loss=0.153, v_num=0, train/loss_simple_step=0.169, train/loss_vlb_step=0.00059, train/loss_step=0.169, global_step=2825.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  94%|█████████▍| 5633/5971 [54:25<03:15,  1.73it/s, loss=0.162, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.000765, train/loss_step=0.220, global_step=2826.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  94%|█████████▍| 5634/5971 [54:26<03:15,  1.73it/s, loss=0.162, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.000765, train/loss_step=0.220, global_step=2826.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  94%|█████████▍| 5634/5971 [54:26<03:15,  1.73it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.51e-5, train/loss_step=0.0125, global_step=2826.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  94%|█████████▍| 5635/5971 [54:27<03:14,  1.73it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0232, train/loss_vlb_step=9.38e-5, train/loss_step=0.0232, global_step=2826.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  94%|█████████▍| 5636/5971 [54:29<03:14,  1.72it/s, loss=0.134, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.00052, train/loss_step=0.153, global_step=2826.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  94%|█████████▍| 5637/5971 [54:30<03:13,  1.72it/s, loss=0.157, v_num=0, train/loss_simple_step=0.461, train/loss_vlb_step=0.00409, train/loss_step=0.461, global_step=2827.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  94%|█████████▍| 5638/5971 [54:31<03:13,  1.72it/s, loss=0.157, v_num=0, train/loss_simple_step=0.461, train/loss_vlb_step=0.00409, train/loss_step=0.461, global_step=2827.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  94%|█████████▍| 5638/5971 [54:31<03:13,  1.72it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.65e-5, train/loss_step=0.0104, global_step=2827.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  94%|█████████▍| 5639/5971 [54:32<03:12,  1.72it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00443, train/loss_vlb_step=2.42e-5, train/loss_step=0.00443, global_step=2827.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  94%|█████████▍| 5640/5971 [54:34<03:12,  1.72it/s, loss=0.135, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000503, train/loss_step=0.151, global_step=2827.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  94%|█████████▍| 5641/5971 [54:35<03:11,  1.72it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0849, train/loss_vlb_step=0.000282, train/loss_step=0.0849, global_step=2828.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  94%|█████████▍| 5642/5971 [54:36<03:11,  1.72it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0849, train/loss_vlb_step=0.000282, train/loss_step=0.0849, global_step=2828.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  94%|█████████▍| 5642/5971 [54:36<03:11,  1.72it/s, loss=0.128, v_num=0, train/loss_simple_step=0.341, train/loss_vlb_step=0.00151, train/loss_step=0.341, global_step=2828.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  95%|█████████▍| 5643/5971 [54:37<03:10,  1.72it/s, loss=0.149, v_num=0, train/loss_simple_step=0.416, train/loss_vlb_step=0.00301, train/loss_step=0.416, global_step=2828.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▍| 5644/5971 [54:39<03:09,  1.72it/s, loss=0.149, v_num=0, train/loss_simple_step=0.015, train/loss_vlb_step=6.45e-5, train/loss_step=0.015, global_step=2828.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▍| 5645/5971 [54:40<03:09,  1.72it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0343, train/loss_vlb_step=0.000129, train/loss_step=0.0343, global_step=2829.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▍| 5646/5971 [54:40<03:08,  1.72it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0343, train/loss_vlb_step=0.000129, train/loss_step=0.0343, global_step=2829.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▍| 5646/5971 [54:40<03:08,  1.72it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0286, train/loss_vlb_step=0.000113, train/loss_step=0.0286, global_step=2829.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  95%|█████████▍| 5647/5971 [54:41<03:08,  1.72it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0962, train/loss_vlb_step=0.000317, train/loss_step=0.0962, global_step=2829.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▍| 5648/5971 [54:44<03:07,  1.72it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0811, train/loss_vlb_step=0.000269, train/loss_step=0.0811, global_step=2829.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▍| 5649/5971 [54:44<03:07,  1.72it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00295, train/loss_vlb_step=1.61e-5, train/loss_step=0.00295, global_step=2830.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▍| 5650/5971 [54:45<03:06,  1.72it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00295, train/loss_vlb_step=1.61e-5, train/loss_step=0.00295, global_step=2830.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▍| 5650/5971 [54:45<03:06,  1.72it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0304, train/loss_vlb_step=0.00012, train/loss_step=0.0304, global_step=2830.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  95%|█████████▍| 5651/5971 [54:46<03:06,  1.72it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0727, train/loss_vlb_step=0.00024, train/loss_step=0.0727, global_step=2830.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  95%|█████████▍| 5652/5971 [54:48<03:05,  1.72it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0304, train/loss_vlb_step=0.000117, train/loss_step=0.0304, global_step=2830.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▍| 5653/5971 [54:49<03:05,  1.72it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.99e-5, train/loss_step=0.0037, global_step=2831.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  95%|█████████▍| 5654/5971 [54:50<03:04,  1.72it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.99e-5, train/loss_step=0.0037, global_step=2831.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▍| 5654/5971 [54:50<03:04,  1.72it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00244, train/loss_vlb_step=1.33e-5, train/loss_step=0.00244, global_step=2831.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▍| 5655/5971 [54:51<03:03,  1.72it/s, loss=0.109, v_num=0, train/loss_simple_step=0.167, train/loss_vlb_step=0.000565, train/loss_step=0.167, global_step=2831.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  95%|█████████▍| 5656/5971 [54:53<03:03,  1.72it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00161, train/loss_vlb_step=9.69e-6, train/loss_step=0.00161, global_step=2831.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▍| 5657/5971 [54:54<03:02,  1.72it/s, loss=0.105, v_num=0, train/loss_simple_step=0.535, train/loss_vlb_step=0.00384, train/loss_step=0.535, global_step=2832.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  95%|█████████▍| 5658/5971 [54:55<03:02,  1.72it/s, loss=0.105, v_num=0, train/loss_simple_step=0.535, train/loss_vlb_step=0.00384, train/loss_step=0.535, global_step=2832.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▍| 5658/5971 [54:55<03:02,  1.72it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0123, train/loss_vlb_step=5.09e-5, train/loss_step=0.0123, global_step=2832.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▍| 5659/5971 [54:56<03:01,  1.72it/s, loss=0.112, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000432, train/loss_step=0.130, global_step=2832.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  95%|█████████▍| 5660/5971 [54:58<03:01,  1.72it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0023, train/loss_vlb_step=1.28e-5, train/loss_step=0.0023, global_step=2832.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▍| 5661/5971 [54:59<03:00,  1.72it/s, loss=0.122, v_num=0, train/loss_simple_step=0.429, train/loss_vlb_step=0.00257, train/loss_step=0.429, global_step=2833.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  95%|█████████▍| 5662/5971 [55:00<03:00,  1.72it/s, loss=0.122, v_num=0, train/loss_simple_step=0.429, train/loss_vlb_step=0.00257, train/loss_step=0.429, global_step=2833.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▍| 5662/5971 [55:00<03:00,  1.72it/s, loss=0.11, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000362, train/loss_step=0.108, global_step=2833.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▍| 5663/5971 [55:01<02:59,  1.72it/s, loss=0.0984, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000707, train/loss_step=0.184, global_step=2833.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▍| 5664/5971 [55:03<02:59,  1.71it/s, loss=0.122, v_num=0, train/loss_simple_step=0.495, train/loss_vlb_step=0.00317, train/loss_step=0.495, global_step=2833.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  95%|█████████▍| 5665/5971 [55:04<02:58,  1.71it/s, loss=0.126, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000354, train/loss_step=0.107, global_step=2834.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▍| 5666/5971 [55:05<02:57,  1.71it/s, loss=0.126, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000354, train/loss_step=0.107, global_step=2834.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▍| 5666/5971 [55:05<02:57,  1.71it/s, loss=0.142, v_num=0, train/loss_simple_step=0.354, train/loss_vlb_step=0.00231, train/loss_step=0.354, global_step=2834.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  95%|█████████▍| 5667/5971 [55:06<02:57,  1.71it/s, loss=0.175, v_num=0, train/loss_simple_step=0.742, train/loss_vlb_step=0.0189, train/loss_step=0.742, global_step=2834.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  95%|█████████▍| 5668/5971 [55:08<02:56,  1.71it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0162, train/loss_vlb_step=6.35e-5, train/loss_step=0.0162, global_step=2834.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▍| 5669/5971 [55:09<02:56,  1.71it/s, loss=0.171, v_num=0, train/loss_simple_step=0.00238, train/loss_vlb_step=1.39e-5, train/loss_step=0.00238, global_step=2835.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▍| 5670/5971 [55:10<02:55,  1.71it/s, loss=0.171, v_num=0, train/loss_simple_step=0.00238, train/loss_vlb_step=1.39e-5, train/loss_step=0.00238, global_step=2835.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▍| 5670/5971 [55:10<02:55,  1.71it/s, loss=0.198, v_num=0, train/loss_simple_step=0.571, train/loss_vlb_step=0.00424, train/loss_step=0.571, global_step=2835.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  95%|█████████▍| 5671/5971 [55:10<02:55,  1.71it/s, loss=0.203, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000532, train/loss_step=0.161, global_step=2835.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▍| 5672/5971 [55:13<02:54,  1.71it/s, loss=0.203, v_num=0, train/loss_simple_step=0.0426, train/loss_vlb_step=0.000153, train/loss_step=0.0426, global_step=2835.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▌| 5673/5971 [55:14<02:54,  1.71it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0343, train/loss_vlb_step=0.00013, train/loss_step=0.0343, global_step=2836.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  95%|█████████▌| 5674/5971 [55:15<02:53,  1.71it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0343, train/loss_vlb_step=0.00013, train/loss_step=0.0343, global_step=2836.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▌| 5674/5971 [55:15<02:53,  1.71it/s, loss=0.21, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000342, train/loss_step=0.104, global_step=2836.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  95%|█████████▌| 5675/5971 [55:16<02:52,  1.71it/s, loss=0.207, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000388, train/loss_step=0.118, global_step=2836.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▌| 5676/5971 [55:18<02:52,  1.71it/s, loss=0.208, v_num=0, train/loss_simple_step=0.00297, train/loss_vlb_step=1.6e-5, train/loss_step=0.00297, global_step=2836.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▌| 5677/5971 [55:19<02:51,  1.71it/s, loss=0.181, v_num=0, train/loss_simple_step=0.00967, train/loss_vlb_step=4.29e-5, train/loss_step=0.00967, global_step=2837.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▌| 5678/5971 [55:20<02:51,  1.71it/s, loss=0.181, v_num=0, train/loss_simple_step=0.00967, train/loss_vlb_step=4.29e-5, train/loss_step=0.00967, global_step=2837.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▌| 5678/5971 [55:20<02:51,  1.71it/s, loss=0.203, v_num=0, train/loss_simple_step=0.443, train/loss_vlb_step=0.00244, train/loss_step=0.443, global_step=2837.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  95%|█████████▌| 5679/5971 [55:21<02:50,  1.71it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0226, train/loss_vlb_step=9e-5, train/loss_step=0.0226, global_step=2837.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  95%|█████████▌| 5680/5971 [55:23<02:50,  1.71it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0503, train/loss_vlb_step=0.000179, train/loss_step=0.0503, global_step=2837.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▌| 5681/5971 [55:24<02:49,  1.71it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0232, train/loss_vlb_step=9.13e-5, train/loss_step=0.0232, global_step=2838.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▌| 5682/5971 [55:24<02:49,  1.71it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0232, train/loss_vlb_step=9.13e-5, train/loss_step=0.0232, global_step=2838.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▌| 5682/5971 [55:24<02:49,  1.71it/s, loss=0.174, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.15e-6, train/loss_step=0.00151, global_step=2838.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▌| 5683/5971 [55:25<02:48,  1.71it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00606, train/loss_vlb_step=3.08e-5, train/loss_step=0.00606, global_step=2838.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▌| 5684/5971 [55:27<02:48,  1.71it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000209, train/loss_step=0.0596, global_step=2838.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  95%|█████████▌| 5685/5971 [55:28<02:47,  1.71it/s, loss=0.147, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000652, train/loss_step=0.185, global_step=2839.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  95%|█████████▌| 5686/5971 [55:29<02:46,  1.71it/s, loss=0.147, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000652, train/loss_step=0.185, global_step=2839.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▌| 5686/5971 [55:29<02:46,  1.71it/s, loss=0.131, v_num=0, train/loss_simple_step=0.023, train/loss_vlb_step=9.25e-5, train/loss_step=0.023, global_step=2839.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  95%|█████████▌| 5687/5971 [55:30<02:46,  1.71it/s, loss=0.094, v_num=0, train/loss_simple_step=0.00379, train/loss_vlb_step=2.02e-5, train/loss_step=0.00379, global_step=2839.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▌| 5688/5971 [55:32<02:45,  1.71it/s, loss=0.0933, v_num=0, train/loss_simple_step=0.00398, train/loss_vlb_step=2.14e-5, train/loss_step=0.00398, global_step=2839.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▌| 5689/5971 [55:33<02:45,  1.71it/s, loss=0.0937, v_num=0, train/loss_simple_step=0.00849, train/loss_vlb_step=3.95e-5, train/loss_step=0.00849, global_step=2840.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▌| 5690/5971 [55:34<02:44,  1.71it/s, loss=0.0937, v_num=0, train/loss_simple_step=0.00849, train/loss_vlb_step=3.95e-5, train/loss_step=0.00849, global_step=2840.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▌| 5690/5971 [55:34<02:44,  1.71it/s, loss=0.0674, v_num=0, train/loss_simple_step=0.0449, train/loss_vlb_step=0.000162, train/loss_step=0.0449, global_step=2840.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  95%|█████████▌| 5691/5971 [55:35<02:44,  1.71it/s, loss=0.0596, v_num=0, train/loss_simple_step=0.00518, train/loss_vlb_step=2.64e-5, train/loss_step=0.00518, global_step=2840.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▌| 5692/5971 [55:37<02:43,  1.71it/s, loss=0.0833, v_num=0, train/loss_simple_step=0.518, train/loss_vlb_step=0.00297, train/loss_step=0.518, global_step=2840.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  95%|█████████▌| 5693/5971 [55:38<02:43,  1.71it/s, loss=0.0819, v_num=0, train/loss_simple_step=0.00533, train/loss_vlb_step=2.76e-5, train/loss_step=0.00533, global_step=2841.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▌| 5694/5971 [55:39<02:42,  1.71it/s, loss=0.0819, v_num=0, train/loss_simple_step=0.00533, train/loss_vlb_step=2.76e-5, train/loss_step=0.00533, global_step=2841.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▌| 5694/5971 [55:39<02:42,  1.71it/s, loss=0.0788, v_num=0, train/loss_simple_step=0.0418, train/loss_vlb_step=0.000148, train/loss_step=0.0418, global_step=2841.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  95%|█████████▌| 5695/5971 [55:40<02:41,  1.71it/s, loss=0.074, v_num=0, train/loss_simple_step=0.022, train/loss_vlb_step=8.75e-5, train/loss_step=0.022, global_step=2841.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  95%|█████████▌| 5696/5971 [55:42<02:41,  1.70it/s, loss=0.0804, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000433, train/loss_step=0.132, global_step=2841.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▌| 5697/5971 [55:43<02:40,  1.70it/s, loss=0.0841, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000291, train/loss_step=0.0837, global_step=2842.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▌| 5698/5971 [55:44<02:40,  1.70it/s, loss=0.0841, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000291, train/loss_step=0.0837, global_step=2842.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▌| 5698/5971 [55:44<02:40,  1.70it/s, loss=0.0639, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000145, train/loss_step=0.0386, global_step=2842.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▌| 5699/5971 [55:45<02:39,  1.70it/s, loss=0.063, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.84e-5, train/loss_step=0.00347, global_step=2842.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▌| 5700/5971 [55:47<02:39,  1.70it/s, loss=0.0734, v_num=0, train/loss_simple_step=0.259, train/loss_vlb_step=0.00146, train/loss_step=0.259, global_step=2842.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  95%|█████████▌| 5701/5971 [55:48<02:38,  1.70it/s, loss=0.0725, v_num=0, train/loss_simple_step=0.00432, train/loss_vlb_step=2.28e-5, train/loss_step=0.00432, global_step=2843.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▌| 5702/5971 [55:49<02:37,  1.70it/s, loss=0.0725, v_num=0, train/loss_simple_step=0.00432, train/loss_vlb_step=2.28e-5, train/loss_step=0.00432, global_step=2843.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  95%|█████████▌| 5702/5971 [55:49<02:37,  1.70it/s, loss=0.0933, v_num=0, train/loss_simple_step=0.418, train/loss_vlb_step=0.00205, train/loss_step=0.418, global_step=2843.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  96%|█████████▌| 5703/5971 [55:50<02:37,  1.70it/s, loss=0.0931, v_num=0, train/loss_simple_step=0.00331, train/loss_vlb_step=1.77e-5, train/loss_step=0.00331, global_step=2843.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  96%|█████████▌| 5704/5971 [55:52<02:36,  1.70it/s, loss=0.11, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.00223, train/loss_step=0.391, global_step=2843.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]      
Epoch 4:  96%|█████████▌| 5705/5971 [55:53<02:36,  1.70it/s, loss=0.114, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.00102, train/loss_step=0.268, global_step=2844.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  96%|█████████▌| 5706/5971 [55:54<02:35,  1.70it/s, loss=0.114, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.00102, train/loss_step=0.268, global_step=2844.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  96%|█████████▌| 5706/5971 [55:54<02:35,  1.70it/s, loss=0.118, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000337, train/loss_step=0.102, global_step=2844.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  96%|█████████▌| 5707/5971 [55:55<02:35,  1.70it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=4.91e-5, train/loss_step=0.0109, global_step=2844.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  96%|█████████▌| 5708/5971 [55:57<02:34,  1.70it/s, loss=0.156, v_num=0, train/loss_simple_step=0.767, train/loss_vlb_step=0.0205, train/loss_step=0.767, global_step=2844.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  96%|█████████▌| 5709/5971 [55:58<02:34,  1.70it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0741, train/loss_vlb_step=0.000246, train/loss_step=0.0741, global_step=2845.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  96%|█████████▌| 5710/5971 [55:59<02:33,  1.70it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0741, train/loss_vlb_step=0.000246, train/loss_step=0.0741, global_step=2845.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  96%|█████████▌| 5710/5971 [55:59<02:33,  1.70it/s, loss=0.171, v_num=0, train/loss_simple_step=0.273, train/loss_vlb_step=0.00118, train/loss_step=0.273, global_step=2845.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  96%|█████████▌| 5711/5971 [55:59<02:32,  1.70it/s, loss=0.175, v_num=0, train/loss_simple_step=0.094, train/loss_vlb_step=0.000311, train/loss_step=0.094, global_step=2845.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  96%|█████████▌| 5712/5971 [56:02<02:32,  1.70it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0188, train/loss_vlb_step=7.95e-5, train/loss_step=0.0188, global_step=2845.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  96%|█████████▌| 5713/5971 [56:03<02:31,  1.70it/s, loss=0.166, v_num=0, train/loss_simple_step=0.307, train/loss_vlb_step=0.00122, train/loss_step=0.307, global_step=2846.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  96%|█████████▌| 5714/5971 [56:03<02:31,  1.70it/s, loss=0.166, v_num=0, train/loss_simple_step=0.307, train/loss_vlb_step=0.00122, train/loss_step=0.307, global_step=2846.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  96%|█████████▌| 5714/5971 [56:03<02:31,  1.70it/s, loss=0.183, v_num=0, train/loss_simple_step=0.389, train/loss_vlb_step=0.00236, train/loss_step=0.389, global_step=2846.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  96%|█████████▌| 5715/5971 [56:04<02:30,  1.70it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0171, train/loss_vlb_step=7.13e-5, train/loss_step=0.0171, global_step=2846.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  96%|█████████▌| 5716/5971 [56:06<02:30,  1.70it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0962, train/loss_vlb_step=0.000318, train/loss_step=0.0962, global_step=2846.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  96%|█████████▌| 5717/5971 [56:07<02:29,  1.70it/s, loss=0.186, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.000679, train/loss_step=0.186, global_step=2847.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  96%|█████████▌| 5718/5971 [56:08<02:29,  1.70it/s, loss=0.186, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.000679, train/loss_step=0.186, global_step=2847.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  96%|█████████▌| 5718/5971 [56:08<02:29,  1.70it/s, loss=0.203, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00233, train/loss_step=0.382, global_step=2847.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  96%|█████████▌| 5719/5971 [56:09<02:28,  1.70it/s, loss=0.225, v_num=0, train/loss_simple_step=0.437, train/loss_vlb_step=0.00281, train/loss_step=0.437, global_step=2847.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  96%|█████████▌| 5720/5971 [56:11<02:27,  1.70it/s, loss=0.215, v_num=0, train/loss_simple_step=0.0498, train/loss_vlb_step=0.000174, train/loss_step=0.0498, global_step=2847.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  96%|█████████▌| 5721/5971 [56:12<02:27,  1.70it/s, loss=0.227, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.00109, train/loss_step=0.262, global_step=2848.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  96%|█████████▌| 5722/5971 [56:13<02:26,  1.70it/s, loss=0.227, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.00109, train/loss_step=0.262, global_step=2848.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  96%|█████████▌| 5722/5971 [56:13<02:26,  1.70it/s, loss=0.234, v_num=0, train/loss_simple_step=0.554, train/loss_vlb_step=0.00645, train/loss_step=0.554, global_step=2848.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  96%|█████████▌| 5723/5971 [56:14<02:26,  1.70it/s, loss=0.254, v_num=0, train/loss_simple_step=0.400, train/loss_vlb_step=0.0029, train/loss_step=0.400, global_step=2848.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  96%|█████████▌| 5724/5971 [56:16<02:25,  1.70it/s, loss=0.244, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000667, train/loss_step=0.192, global_step=2848.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  96%|█████████▌| 5725/5971 [56:17<02:25,  1.70it/s, loss=0.245, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.00127, train/loss_step=0.288, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  96%|█████████▌| 5726/5971 [56:18<02:24,  1.70it/s, loss=0.245, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.00127, train/loss_step=0.288, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  96%|█████████▌| 5726/5971 [56:18<02:24,  1.70it/s, loss=0.24, v_num=0, train/loss_simple_step=0.00506, train/loss_vlb_step=2.67e-5, train/loss_step=0.00506, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  96%|█████████▌| 5727/5971 [56:19<02:23,  1.69it/s, loss=0.241, v_num=0, train/loss_simple_step=0.0243, train/loss_vlb_step=9.44e-5, train/loss_step=0.0243, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  96%|█████████▌| 5728/5971 [56:21<02:23,  1.69it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:13,  2.24it/s]
Epoch 4:  96%|█████████▌| 5730/5971 [56:22<02:22,  1.69it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   1%|          | 2/167 [00:00<00:46,  3.58it/s]

Validating:   3%|▎         | 5/167 [00:00<00:18,  8.94it/s]
Epoch 4:  96%|█████████▌| 5734/5971 [56:22<02:19,  1.70it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   5%|▍         | 8/167 [00:00<00:12, 12.79it/s]
Epoch 4:  96%|█████████▌| 5738/5971 [56:22<02:17,  1.70it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.81it/s]
Epoch 4:  96%|█████████▌| 5742/5971 [56:22<02:14,  1.70it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.49it/s]

Validating:  10%|█         | 17/167 [00:01<00:06, 22.12it/s]
Epoch 4:  96%|█████████▌| 5746/5971 [56:22<02:12,  1.70it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 23.76it/s]
Epoch 4:  96%|█████████▋| 5750/5971 [56:23<02:10,  1.70it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  14%|█▍        | 23/167 [00:01<00:05, 24.59it/s]
Epoch 4:  96%|█████████▋| 5754/5971 [56:23<02:07,  1.70it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 25.58it/s]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 23.92it/s]
Epoch 4:  96%|█████████▋| 5758/5971 [56:23<02:05,  1.70it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 24.95it/s]
Epoch 4:  96%|█████████▋| 5762/5971 [56:23<02:02,  1.70it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  21%|██        | 35/167 [00:01<00:05, 26.17it/s]
Epoch 4:  97%|█████████▋| 5766/5971 [56:23<02:00,  1.70it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  23%|██▎       | 38/167 [00:01<00:04, 27.07it/s]

Validating:  25%|██▍       | 41/167 [00:02<00:05, 24.47it/s]
Epoch 4:  97%|█████████▋| 5770/5971 [56:23<01:57,  1.71it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 25.60it/s]
Epoch 4:  97%|█████████▋| 5774/5971 [56:23<01:55,  1.71it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 25.91it/s]
Epoch 4:  97%|█████████▋| 5778/5971 [56:24<01:53,  1.71it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 26.40it/s]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 27.18it/s]
Epoch 4:  97%|█████████▋| 5782/5971 [56:24<01:50,  1.71it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 27.52it/s]
Epoch 4:  97%|█████████▋| 5786/5971 [56:24<01:48,  1.71it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  35%|███▌      | 59/167 [00:02<00:03, 28.19it/s]
Epoch 4:  97%|█████████▋| 5790/5971 [56:24<01:45,  1.71it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  37%|███▋      | 62/167 [00:02<00:03, 27.45it/s]

Validating:  39%|███▉      | 65/167 [00:02<00:03, 26.84it/s]
Epoch 4:  97%|█████████▋| 5794/5971 [56:24<01:43,  1.71it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  41%|████      | 68/167 [00:03<00:03, 26.31it/s]
Epoch 4:  97%|█████████▋| 5798/5971 [56:24<01:40,  1.71it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 27.31it/s]
Epoch 4:  97%|█████████▋| 5802/5971 [56:24<01:38,  1.71it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 27.65it/s]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 27.53it/s]
Epoch 4:  97%|█████████▋| 5806/5971 [56:25<01:36,  1.72it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 28.22it/s]
Epoch 4:  97%|█████████▋| 5810/5971 [56:25<01:33,  1.72it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 27.81it/s]
Epoch 4:  97%|█████████▋| 5814/5971 [56:25<01:31,  1.72it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  51%|█████▏    | 86/167 [00:03<00:02, 27.28it/s][A

Validating:  53%|█████▎    | 89/167 [00:03<00:02, 26.07it/s][A
Epoch 4:  97%|█████████▋| 5818/5971 [56:25<01:29,  1.72it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 25.30it/s][A
Epoch 4:  98%|█████████▊| 5822/5971 [56:25<01:26,  1.72it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 26.07it/s][A
Epoch 4:  98%|█████████▊| 5826/5971 [56:25<01:24,  1.72it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 26.70it/s][A

Validating:  60%|██████    | 101/167 [00:04<00:02, 25.71it/s][A
Epoch 4:  98%|█████████▊| 5830/5971 [56:26<01:21,  1.72it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 26.33it/s][A
Epoch 4:  98%|█████████▊| 5834/5971 [56:26<01:19,  1.72it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 25.15it/s][A
Epoch 4:  98%|█████████▊| 5838/5971 [56:26<01:17,  1.72it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 25.02it/s][A

Validating:  68%|██████▊   | 113/167 [00:04<00:02, 25.83it/s][A
Epoch 4:  98%|█████████▊| 5842/5971 [56:26<01:14,  1.73it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  70%|███████   | 117/167 [00:04<00:01, 27.43it/s][A
Epoch 4:  98%|█████████▊| 5846/5971 [56:26<01:12,  1.73it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 26.15it/s][A
Epoch 4:  98%|█████████▊| 5850/5971 [56:26<01:10,  1.73it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 27.00it/s][A
Epoch 4:  98%|█████████▊| 5854/5971 [56:26<01:07,  1.73it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 27.67it/s][A

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 27.90it/s][A
Epoch 4:  98%|█████████▊| 5858/5971 [56:27<01:05,  1.73it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 27.83it/s][A
Epoch 4:  98%|█████████▊| 5862/5971 [56:27<01:02,  1.73it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  81%|████████  | 135/167 [00:05<00:01, 27.65it/s][A
Epoch 4:  98%|█████████▊| 5866/5971 [56:27<01:00,  1.73it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 27.14it/s][A

Validating:  84%|████████▍ | 141/167 [00:05<00:00, 26.37it/s][A
Epoch 4:  98%|█████████▊| 5870/5971 [56:27<00:58,  1.73it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  86%|████████▌ | 144/167 [00:05<00:00, 26.07it/s][A
Epoch 4:  98%|█████████▊| 5874/5971 [56:27<00:55,  1.73it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 27.74it/s][A
Epoch 4:  98%|█████████▊| 5878/5971 [56:27<00:53,  1.74it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 27.31it/s][A
Epoch 4:  99%|█████████▊| 5882/5971 [56:27<00:51,  1.74it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 28.32it/s][A
Epoch 4:  99%|█████████▊| 5886/5971 [56:28<00:48,  1.74it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 28.49it/s][A
Epoch 4:  99%|█████████▊| 5890/5971 [56:28<00:46,  1.74it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 27.70it/s][A

Validating:  99%|█████████▉| 165/167 [00:06<00:00, 26.41it/s][A
Epoch 4:  99%|█████████▊| 5894/5971 [56:28<00:44,  1.74it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▊| 5896/5971 [56:29<00:43,  1.74it/s, loss=0.224, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00215, train/loss_step=0.427, global_step=2849.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]

Epoch 4:  99%|█████████▉| 5897/5971 [56:30<00:42,  1.74it/s, loss=0.225, v_num=0, train/loss_simple_step=0.0902, train/loss_vlb_step=0.000296, train/loss_step=0.0902, global_step=2850.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5898/5971 [56:30<00:41,  1.74it/s, loss=0.225, v_num=0, train/loss_simple_step=0.0902, train/loss_vlb_step=0.000296, train/loss_step=0.0902, global_step=2850.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5898/5971 [56:30<00:41,  1.74it/s, loss=0.217, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000417, train/loss_step=0.127, global_step=2850.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  99%|█████████▉| 5899/5971 [56:31<00:41,  1.74it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.44e-5, train/loss_step=0.0149, global_step=2850.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5900/5971 [56:33<00:40,  1.74it/s, loss=0.22, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000514, train/loss_step=0.152, global_step=2850.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  99%|█████████▉| 5901/5971 [56:34<00:40,  1.74it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0271, train/loss_vlb_step=0.000104, train/loss_step=0.0271, global_step=2851.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5902/5971 [56:35<00:39,  1.74it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0271, train/loss_vlb_step=0.000104, train/loss_step=0.0271, global_step=2851.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5902/5971 [56:35<00:39,  1.74it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0499, train/loss_vlb_step=0.00017, train/loss_step=0.0499, global_step=2851.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  99%|█████████▉| 5903/5971 [56:36<00:39,  1.74it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00242, train/loss_vlb_step=1.36e-5, train/loss_step=0.00242, global_step=2851.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5904/5971 [56:38<00:38,  1.74it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0511, train/loss_vlb_step=0.00019, train/loss_step=0.0511, global_step=2851.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  99%|█████████▉| 5905/5971 [56:39<00:37,  1.74it/s, loss=0.184, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.00051, train/loss_step=0.143, global_step=2852.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  99%|█████████▉| 5906/5971 [56:40<00:37,  1.74it/s, loss=0.184, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.00051, train/loss_step=0.143, global_step=2852.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5906/5971 [56:40<00:37,  1.74it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0371, train/loss_vlb_step=0.00013, train/loss_step=0.0371, global_step=2852.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5907/5971 [56:41<00:36,  1.74it/s, loss=0.158, v_num=0, train/loss_simple_step=0.261, train/loss_vlb_step=0.00127, train/loss_step=0.261, global_step=2852.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  99%|█████████▉| 5908/5971 [56:43<00:36,  1.74it/s, loss=0.175, v_num=0, train/loss_simple_step=0.386, train/loss_vlb_step=0.00262, train/loss_step=0.386, global_step=2852.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5909/5971 [56:44<00:35,  1.74it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0373, train/loss_vlb_step=0.000145, train/loss_step=0.0373, global_step=2853.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5910/5971 [56:45<00:35,  1.74it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0373, train/loss_vlb_step=0.000145, train/loss_step=0.0373, global_step=2853.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5910/5971 [56:45<00:35,  1.74it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0301, train/loss_vlb_step=0.000117, train/loss_step=0.0301, global_step=2853.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5911/5971 [56:46<00:34,  1.74it/s, loss=0.16, v_num=0, train/loss_simple_step=0.844, train/loss_vlb_step=0.0484, train/loss_step=0.844, global_step=2853.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]     
Epoch 4:  99%|█████████▉| 5912/5971 [56:48<00:34,  1.73it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0462, train/loss_vlb_step=0.000163, train/loss_step=0.0462, global_step=2853.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5913/5971 [56:49<00:33,  1.73it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00958, train/loss_vlb_step=4.41e-5, train/loss_step=0.00958, global_step=2854.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5914/5971 [56:50<00:32,  1.73it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00958, train/loss_vlb_step=4.41e-5, train/loss_step=0.00958, global_step=2854.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5914/5971 [56:50<00:32,  1.73it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.12e-5, train/loss_step=0.00184, global_step=2854.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5915/5971 [56:51<00:32,  1.73it/s, loss=0.175, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0395, train/loss_step=0.764, global_step=2854.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]     
Epoch 4:  99%|█████████▉| 5916/5971 [56:53<00:31,  1.73it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00338, train/loss_vlb_step=1.82e-5, train/loss_step=0.00338, global_step=2854.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5917/5971 [56:54<00:31,  1.73it/s, loss=0.165, v_num=0, train/loss_simple_step=0.303, train/loss_vlb_step=0.00129, train/loss_step=0.303, global_step=2855.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4:  99%|█████████▉| 5918/5971 [56:55<00:30,  1.73it/s, loss=0.165, v_num=0, train/loss_simple_step=0.303, train/loss_vlb_step=0.00129, train/loss_step=0.303, global_step=2855.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5918/5971 [56:55<00:30,  1.73it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0315, train/loss_vlb_step=0.000122, train/loss_step=0.0315, global_step=2855.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5919/5971 [56:56<00:30,  1.73it/s, loss=0.175, v_num=0, train/loss_simple_step=0.311, train/loss_vlb_step=0.00121, train/loss_step=0.311, global_step=2855.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  99%|█████████▉| 5920/5971 [56:58<00:29,  1.73it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0595, train/loss_vlb_step=0.000203, train/loss_step=0.0595, global_step=2855.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5921/5971 [56:59<00:28,  1.73it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0774, train/loss_vlb_step=0.000265, train/loss_step=0.0774, global_step=2856.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5922/5971 [57:00<00:28,  1.73it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0774, train/loss_vlb_step=0.000265, train/loss_step=0.0774, global_step=2856.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5922/5971 [57:00<00:28,  1.73it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0129, train/loss_vlb_step=5.58e-5, train/loss_step=0.0129, global_step=2856.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  99%|█████████▉| 5923/5971 [57:01<00:27,  1.73it/s, loss=0.178, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000473, train/loss_step=0.144, global_step=2856.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4:  99%|█████████▉| 5924/5971 [57:03<00:27,  1.73it/s, loss=0.183, v_num=0, train/loss_simple_step=0.159, train/loss_vlb_step=0.000533, train/loss_step=0.159, global_step=2856.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5925/5971 [57:04<00:26,  1.73it/s, loss=0.185, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000594, train/loss_step=0.179, global_step=2857.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5926/5971 [57:05<00:26,  1.73it/s, loss=0.185, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000594, train/loss_step=0.179, global_step=2857.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5926/5971 [57:05<00:26,  1.73it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0271, train/loss_vlb_step=0.000105, train/loss_step=0.0271, global_step=2857.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5927/5971 [57:06<00:25,  1.73it/s, loss=0.183, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000981, train/loss_step=0.233, global_step=2857.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  99%|█████████▉| 5928/5971 [57:08<00:24,  1.73it/s, loss=0.169, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000375, train/loss_step=0.113, global_step=2857.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5929/5971 [57:09<00:24,  1.73it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00536, train/loss_vlb_step=2.61e-5, train/loss_step=0.00536, global_step=2858.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5930/5971 [57:10<00:23,  1.73it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00536, train/loss_vlb_step=2.61e-5, train/loss_step=0.00536, global_step=2858.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5930/5971 [57:10<00:23,  1.73it/s, loss=0.18, v_num=0, train/loss_simple_step=0.284, train/loss_vlb_step=0.00103, train/loss_step=0.284, global_step=2858.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]     
Epoch 4:  99%|█████████▉| 5931/5971 [57:11<00:23,  1.73it/s, loss=0.148, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000703, train/loss_step=0.190, global_step=2858.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5932/5971 [57:14<00:22,  1.73it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0029, train/loss_vlb_step=1.6e-5, train/loss_step=0.0029, global_step=2858.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5933/5971 [57:15<00:22,  1.73it/s, loss=0.16, v_num=0, train/loss_simple_step=0.290, train/loss_vlb_step=0.00126, train/loss_step=0.290, global_step=2859.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4:  99%|█████████▉| 5934/5971 [57:16<00:21,  1.73it/s, loss=0.16, v_num=0, train/loss_simple_step=0.290, train/loss_vlb_step=0.00126, train/loss_step=0.290, global_step=2859.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5934/5971 [57:16<00:21,  1.73it/s, loss=0.177, v_num=0, train/loss_simple_step=0.351, train/loss_vlb_step=0.0018, train/loss_step=0.351, global_step=2859.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5935/5971 [57:17<00:20,  1.73it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00147, train/loss_vlb_step=8.9e-6, train/loss_step=0.00147, global_step=2859.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5936/5971 [57:19<00:20,  1.73it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00539, train/loss_vlb_step=2.69e-5, train/loss_step=0.00539, global_step=2859.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5937/5971 [57:20<00:19,  1.73it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00755, train/loss_vlb_step=3.47e-5, train/loss_step=0.00755, global_step=2860.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5938/5971 [57:21<00:19,  1.73it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00755, train/loss_vlb_step=3.47e-5, train/loss_step=0.00755, global_step=2860.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5938/5971 [57:21<00:19,  1.73it/s, loss=0.135, v_num=0, train/loss_simple_step=0.246, train/loss_vlb_step=0.000975, train/loss_step=0.246, global_step=2860.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:  99%|█████████▉| 5939/5971 [57:22<00:18,  1.73it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00444, train/loss_vlb_step=2.31e-5, train/loss_step=0.00444, global_step=2860.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5940/5971 [57:24<00:17,  1.72it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0478, train/loss_vlb_step=0.000166, train/loss_step=0.0478, global_step=2860.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4:  99%|█████████▉| 5941/5971 [57:25<00:17,  1.72it/s, loss=0.121, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000393, train/loss_step=0.117, global_step=2861.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4: 100%|█████████▉| 5942/5971 [57:26<00:16,  1.72it/s, loss=0.121, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000393, train/loss_step=0.117, global_step=2861.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|█████████▉| 5942/5971 [57:26<00:16,  1.72it/s, loss=0.148, v_num=0, train/loss_simple_step=0.560, train/loss_vlb_step=0.00675, train/loss_step=0.560, global_step=2861.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4: 100%|█████████▉| 5943/5971 [57:27<00:16,  1.72it/s, loss=0.163, v_num=0, train/loss_simple_step=0.429, train/loss_vlb_step=0.00344, train/loss_step=0.429, global_step=2861.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|█████████▉| 5944/5971 [57:29<00:15,  1.72it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0243, train/loss_vlb_step=9.85e-5, train/loss_step=0.0243, global_step=2861.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|█████████▉| 5945/5971 [57:30<00:15,  1.72it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0554, train/loss_vlb_step=0.000188, train/loss_step=0.0554, global_step=2862.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|█████████▉| 5946/5971 [57:31<00:14,  1.72it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0554, train/loss_vlb_step=0.000188, train/loss_step=0.0554, global_step=2862.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|█████████▉| 5946/5971 [57:31<00:14,  1.72it/s, loss=0.196, v_num=0, train/loss_simple_step=0.942, train/loss_vlb_step=0.474, train/loss_step=0.942, global_step=2862.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4: 100%|█████████▉| 5947/5971 [57:31<00:13,  1.72it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0745, train/loss_vlb_step=0.000249, train/loss_step=0.0745, global_step=2862.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|█████████▉| 5948/5971 [57:34<00:13,  1.72it/s, loss=0.182, v_num=0, train/loss_simple_step=0.00529, train/loss_vlb_step=2.58e-5, train/loss_step=0.00529, global_step=2862.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|█████████▉| 5949/5971 [57:34<00:12,  1.72it/s, loss=0.193, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000771, train/loss_step=0.218, global_step=2863.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4: 100%|█████████▉| 5950/5971 [57:35<00:12,  1.72it/s, loss=0.193, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000771, train/loss_step=0.218, global_step=2863.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|█████████▉| 5950/5971 [57:35<00:12,  1.72it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0576, train/loss_vlb_step=0.000199, train/loss_step=0.0576, global_step=2863.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|█████████▉| 5951/5971 [57:36<00:11,  1.72it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0757, train/loss_vlb_step=0.000249, train/loss_step=0.0757, global_step=2863.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|█████████▉| 5952/5971 [57:38<00:11,  1.72it/s, loss=0.191, v_num=0, train/loss_simple_step=0.315, train/loss_vlb_step=0.0013, train/loss_step=0.315, global_step=2863.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]    
Epoch 4: 100%|█████████▉| 5953/5971 [57:39<00:10,  1.72it/s, loss=0.186, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.00061, train/loss_step=0.175, global_step=2864.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|█████████▉| 5954/5971 [57:40<00:09,  1.72it/s, loss=0.186, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.00061, train/loss_step=0.175, global_step=2864.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|█████████▉| 5954/5971 [57:40<00:09,  1.72it/s, loss=0.178, v_num=0, train/loss_simple_step=0.199, train/loss_vlb_step=0.000879, train/loss_step=0.199, global_step=2864.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|█████████▉| 5955/5971 [57:41<00:09,  1.72it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0182, train/loss_vlb_step=7.3e-5, train/loss_step=0.0182, global_step=2864.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|█████████▉| 5956/5971 [57:43<00:08,  1.72it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0924, train/loss_vlb_step=0.000308, train/loss_step=0.0924, global_step=2864.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|█████████▉| 5957/5971 [57:44<00:08,  1.72it/s, loss=0.198, v_num=0, train/loss_simple_step=0.300, train/loss_vlb_step=0.00183, train/loss_step=0.300, global_step=2865.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4: 100%|█████████▉| 5958/5971 [57:45<00:07,  1.72it/s, loss=0.198, v_num=0, train/loss_simple_step=0.300, train/loss_vlb_step=0.00183, train/loss_step=0.300, global_step=2865.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|█████████▉| 5958/5971 [57:45<00:07,  1.72it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00572, train/loss_vlb_step=2.85e-5, train/loss_step=0.00572, global_step=2865.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|█████████▉| 5959/5971 [57:46<00:06,  1.72it/s, loss=0.187, v_num=0, train/loss_simple_step=0.033, train/loss_vlb_step=0.000124, train/loss_step=0.033, global_step=2865.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4: 100%|█████████▉| 5960/5971 [57:48<00:06,  1.72it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0774, train/loss_vlb_step=0.000257, train/loss_step=0.0774, global_step=2865.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|█████████▉| 5961/5971 [57:49<00:05,  1.72it/s, loss=0.196, v_num=0, train/loss_simple_step=0.258, train/loss_vlb_step=0.00103, train/loss_step=0.258, global_step=2866.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4: 100%|█████████▉| 5962/5971 [57:50<00:05,  1.72it/s, loss=0.196, v_num=0, train/loss_simple_step=0.258, train/loss_vlb_step=0.00103, train/loss_step=0.258, global_step=2866.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|█████████▉| 5962/5971 [57:50<00:05,  1.72it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0474, train/loss_vlb_step=0.000171, train/loss_step=0.0474, global_step=2866.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|█████████▉| 5963/5971 [57:51<00:04,  1.72it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00322, train/loss_vlb_step=1.78e-5, train/loss_step=0.00322, global_step=2866.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|█████████▉| 5964/5971 [57:53<00:04,  1.72it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0291, train/loss_vlb_step=0.000104, train/loss_step=0.0291, global_step=2866.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4: 100%|█████████▉| 5965/5971 [57:54<00:03,  1.72it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.07e-5, train/loss_step=0.0115, global_step=2867.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4: 100%|█████████▉| 5966/5971 [57:55<00:02,  1.72it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.07e-5, train/loss_step=0.0115, global_step=2867.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|█████████▉| 5966/5971 [57:55<00:02,  1.72it/s, loss=0.12, v_num=0, train/loss_simple_step=0.407, train/loss_vlb_step=0.00248, train/loss_step=0.407, global_step=2867.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4: 100%|█████████▉| 5967/5971 [57:55<00:02,  1.72it/s, loss=0.13, v_num=0, train/loss_simple_step=0.276, train/loss_vlb_step=0.00123, train/loss_step=0.276, global_step=2867.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|█████████▉| 5968/5971 [57:58<00:01,  1.72it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0547, train/loss_vlb_step=0.000195, train/loss_step=0.0547, global_step=2867.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|█████████▉| 5969/5971 [57:58<00:01,  1.72it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0231, train/loss_vlb_step=9.35e-5, train/loss_step=0.0231, global_step=2868.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4: 100%|█████████▉| 5970/5971 [57:59<00:00,  1.72it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0231, train/loss_vlb_step=9.35e-5, train/loss_step=0.0231, global_step=2868.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|█████████▉| 5970/5971 [57:59<00:00,  1.72it/s, loss=0.145, v_num=0, train/loss_simple_step=0.497, train/loss_vlb_step=0.00301, train/loss_step=0.497, global_step=2868.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4: 100%|██████████| 5971/5971 [58:00<00:00,  1.72it/s, loss=0.151, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000705, train/loss_step=0.193, global_step=2868.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|██████████| 5971/5971 [58:02<00:00,  1.71it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0083, train/loss_vlb_step=3.62e-5, train/loss_step=0.0083, global_step=2868.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|██████████| 5971/5971 [58:03<00:00,  1.71it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.2e-5, train/loss_step=0.0119, global_step=2869.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4: 100%|██████████| 5971/5971 [58:04<00:00,  1.71it/s, loss=0.145, v_num=0, train/loss_simple_step=0.545, train/loss_vlb_step=0.00502, train/loss_step=0.545, global_step=2869.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4: 100%|██████████| 5971/5971 [58:05<00:00,  1.71it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0146, train/loss_vlb_step=6.21e-5, train/loss_step=0.0146, global_step=2869.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|██████████| 5971/5971 [58:07<00:00,  1.71it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0573, train/loss_vlb_step=0.000201, train/loss_step=0.0573, global_step=2869.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|██████████| 5971/5971 [58:08<00:00,  1.71it/s, loss=0.137, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000677, train/loss_step=0.190, global_step=2870.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4: 100%|██████████| 5971/5971 [58:09<00:00,  1.71it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0417, train/loss_vlb_step=0.000151, train/loss_step=0.0417, global_step=2870.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|██████████| 5971/5971 [58:10<00:00,  1.71it/s, loss=0.154, v_num=0, train/loss_simple_step=0.337, train/loss_vlb_step=0.00161, train/loss_step=0.337, global_step=2870.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4: 100%|██████████| 5971/5971 [58:11<00:00,  1.71it/s, loss=0.154, v_num=0, train/loss_simple_step=0.337, train/loss_vlb_step=0.00161, train/loss_step=0.337, global_step=2870.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|██████████| 5971/5971 [58:12<00:00,  1.71it/s, loss=0.158, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000488, train/loss_step=0.148, global_step=2870.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|██████████| 5971/5971 [58:13<00:00,  1.71it/s, loss=0.156, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000826, train/loss_step=0.224, global_step=2871.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|██████████| 5971/5971 [58:14<00:00,  1.71it/s, loss=0.168, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00158, train/loss_step=0.287, global_step=2871.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 4: 100%|██████████| 5971/5971 [58:15<00:00,  1.71it/s, loss=0.178, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000681, train/loss_step=0.203, global_step=2871.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|██████████| 5971/5971 [58:17<00:00,  1.71it/s, loss=0.215, v_num=0, train/loss_simple_step=0.776, train/loss_vlb_step=0.0189, train/loss_step=0.776, global_step=2871.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4: 100%|██████████| 5971/5971 [58:18<00:00,  1.71it/s, loss=0.231, v_num=0, train/loss_simple_step=0.328, train/loss_vlb_step=0.00155, train/loss_step=0.328, global_step=2872.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|██████████| 5971/5971 [58:19<00:00,  1.71it/s, loss=0.221, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000874, train/loss_step=0.207, global_step=2872.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|██████████| 5971/5971 [58:20<00:00,  1.71it/s, loss=0.226, v_num=0, train/loss_simple_step=0.366, train/loss_vlb_step=0.0017, train/loss_step=0.366, global_step=2872.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]  
Epoch 4: 100%|██████████| 5971/5971 [58:22<00:00,  1.71it/s, loss=0.224, v_num=0, train/loss_simple_step=0.0188, train/loss_vlb_step=7.48e-5, train/loss_step=0.0188, global_step=2872.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|██████████| 5971/5971 [58:23<00:00,  1.70it/s, loss=0.223, v_num=0, train/loss_simple_step=0.0164, train/loss_vlb_step=6.84e-5, train/loss_step=0.0164, global_step=2873.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|██████████| 5971/5971 [58:24<00:00,  1.70it/s, loss=0.199, v_num=0, train/loss_simple_step=0.00793, train/loss_vlb_step=3.96e-5, train/loss_step=0.00793, global_step=2873.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|██████████| 5971/5971 [58:24<00:00,  1.70it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0191, train/loss_vlb_step=7.55e-5, train/loss_step=0.0191, global_step=2873.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4: 100%|██████████| 5971/5971 [58:27<00:00,  1.70it/s, loss=0.19, v_num=0, train/loss_simple_step=0.00597, train/loss_vlb_step=3.02e-5, train/loss_step=0.00597, global_step=2873.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 4: 100%|██████████| 5971/5971 [58:29<00:00,  1.70it/s, loss=0.208, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00247, train/loss_step=0.375, global_step=2874.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]   
Epoch 4:   0%|          | 0/5971 [00:00<00:00, 7410.43it/s, loss=0.208, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00247, train/loss_step=0.375, global_step=2874.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153] 
Epoch 5:   0%|          | 0/5971 [00:00<00:03, 1789.38it/s, loss=0.208, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00247, train/loss_step=0.375, global_step=2874.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 5:   0%|          | 1/5971 [00:02<1:46:51,  1.07s/it, loss=0.208, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00247, train/loss_step=0.375, global_step=2874.0, train/loss_simple_epoch=0.153, train/loss_vlb_epoch=0.00292, train/loss_epoch=0.153]
Epoch 5:   0%|          | 1/5971 [00:02<1:46:55,  1.07s/it, loss=0.218, v_num=0, train/loss_simple_step=0.742, train/loss_vlb_step=0.0189, train/loss_step=0.742, global_step=2875.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   0%|          | 2/5971 [00:03<1:42:54,  1.03s/it, loss=0.218, v_num=0, train/loss_simple_step=0.742, train/loss_vlb_step=0.0189, train/loss_step=0.742, global_step=2875.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 2/5971 [00:03<1:42:56,  1.03s/it, loss=0.225, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000484, train/loss_step=0.147, global_step=2875.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 3/5971 [00:04<1:39:28,  1.00s/it, loss=0.225, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000484, train/loss_step=0.147, global_step=2875.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 3/5971 [00:04<1:39:30,  1.00s/it, loss=0.222, v_num=0, train/loss_simple_step=0.00767, train/loss_vlb_step=3.72e-5, train/loss_step=0.00767, global_step=2875.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 4/5971 [00:06<2:10:43,  1.31s/it, loss=0.222, v_num=0, train/loss_simple_step=0.00767, train/loss_vlb_step=3.72e-5, train/loss_step=0.00767, global_step=2875.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 4/5971 [00:06<2:10:44,  1.31s/it, loss=0.234, v_num=0, train/loss_simple_step=0.414, train/loss_vlb_step=0.00179, train/loss_step=0.414, global_step=2875.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:   0%|          | 5/5971 [00:07<2:04:29,  1.25s/it, loss=0.234, v_num=0, train/loss_simple_step=0.414, train/loss_vlb_step=0.00179, train/loss_step=0.414, global_step=2875.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 5/5971 [00:07<2:04:30,  1.25s/it, loss=0.243, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000806, train/loss_step=0.224, global_step=2876.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 6/5971 [00:08<1:59:20,  1.20s/it, loss=0.243, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000806, train/loss_step=0.224, global_step=2876.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 6/5971 [00:08<1:59:21,  1.20s/it, loss=0.248, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00274, train/loss_step=0.452, global_step=2876.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   0%|          | 7/5971 [00:09<1:55:27,  1.16s/it, loss=0.248, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00274, train/loss_step=0.452, global_step=2876.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 7/5971 [00:09<1:55:28,  1.16s/it, loss=0.272, v_num=0, train/loss_simple_step=0.628, train/loss_vlb_step=0.00908, train/loss_step=0.628, global_step=2876.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 8/5971 [00:11<2:09:09,  1.30s/it, loss=0.272, v_num=0, train/loss_simple_step=0.628, train/loss_vlb_step=0.00908, train/loss_step=0.628, global_step=2876.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 8/5971 [00:11<2:09:10,  1.30s/it, loss=0.283, v_num=0, train/loss_simple_step=0.439, train/loss_vlb_step=0.00283, train/loss_step=0.439, global_step=2876.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 9/5971 [00:12<2:05:15,  1.26s/it, loss=0.283, v_num=0, train/loss_simple_step=0.439, train/loss_vlb_step=0.00283, train/loss_step=0.439, global_step=2876.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 9/5971 [00:12<2:05:16,  1.26s/it, loss=0.269, v_num=0, train/loss_simple_step=0.00271, train/loss_vlb_step=1.57e-5, train/loss_step=0.00271, global_step=2877.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 10/5971 [00:13<2:01:56,  1.23s/it, loss=0.269, v_num=0, train/loss_simple_step=0.00271, train/loss_vlb_step=1.57e-5, train/loss_step=0.00271, global_step=2877.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 10/5971 [00:13<2:01:57,  1.23s/it, loss=0.259, v_num=0, train/loss_simple_step=0.00598, train/loss_vlb_step=2.9e-5, train/loss_step=0.00598, global_step=2877.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   0%|          | 11/5971 [00:14<1:59:06,  1.20s/it, loss=0.259, v_num=0, train/loss_simple_step=0.00598, train/loss_vlb_step=2.9e-5, train/loss_step=0.00598, global_step=2877.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 11/5971 [00:14<1:59:07,  1.20s/it, loss=0.226, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000358, train/loss_step=0.109, global_step=2877.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:   0%|          | 12/5971 [00:16<2:07:50,  1.29s/it, loss=0.226, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000358, train/loss_step=0.109, global_step=2877.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 12/5971 [00:16<2:07:51,  1.29s/it, loss=0.235, v_num=0, train/loss_simple_step=0.508, train/loss_vlb_step=0.004, train/loss_step=0.508, global_step=2877.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:   0%|          | 13/5971 [00:17<2:05:04,  1.26s/it, loss=0.235, v_num=0, train/loss_simple_step=0.508, train/loss_vlb_step=0.004, train/loss_step=0.508, global_step=2877.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 13/5971 [00:17<2:05:04,  1.26s/it, loss=0.225, v_num=0, train/loss_simple_step=0.00221, train/loss_vlb_step=1.24e-5, train/loss_step=0.00221, global_step=2878.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 14/5971 [00:18<2:02:40,  1.24s/it, loss=0.225, v_num=0, train/loss_simple_step=0.00221, train/loss_vlb_step=1.24e-5, train/loss_step=0.00221, global_step=2878.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 14/5971 [00:18<2:02:40,  1.24s/it, loss=0.207, v_num=0, train/loss_simple_step=0.00741, train/loss_vlb_step=3.54e-5, train/loss_step=0.00741, global_step=2878.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 15/5971 [00:19<2:00:30,  1.21s/it, loss=0.207, v_num=0, train/loss_simple_step=0.00741, train/loss_vlb_step=3.54e-5, train/loss_step=0.00741, global_step=2878.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 15/5971 [00:19<2:00:31,  1.21s/it, loss=0.229, v_num=0, train/loss_simple_step=0.462, train/loss_vlb_step=0.00653, train/loss_step=0.462, global_step=2878.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:   0%|          | 16/5971 [00:21<2:06:55,  1.28s/it, loss=0.229, v_num=0, train/loss_simple_step=0.462, train/loss_vlb_step=0.00653, train/loss_step=0.462, global_step=2878.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 16/5971 [00:21<2:06:56,  1.28s/it, loss=0.259, v_num=0, train/loss_simple_step=0.615, train/loss_vlb_step=0.0091, train/loss_step=0.615, global_step=2878.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   0%|          | 17/5971 [00:22<2:04:50,  1.26s/it, loss=0.259, v_num=0, train/loss_simple_step=0.615, train/loss_vlb_step=0.0091, train/loss_step=0.615, global_step=2878.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 17/5971 [00:22<2:04:50,  1.26s/it, loss=0.283, v_num=0, train/loss_simple_step=0.490, train/loss_vlb_step=0.00475, train/loss_step=0.490, global_step=2879.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 18/5971 [00:23<2:02:50,  1.24s/it, loss=0.283, v_num=0, train/loss_simple_step=0.490, train/loss_vlb_step=0.00475, train/loss_step=0.490, global_step=2879.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 18/5971 [00:23<2:02:50,  1.24s/it, loss=0.284, v_num=0, train/loss_simple_step=0.0436, train/loss_vlb_step=0.00016, train/loss_step=0.0436, global_step=2879.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 19/5971 [00:24<2:01:07,  1.22s/it, loss=0.284, v_num=0, train/loss_simple_step=0.0436, train/loss_vlb_step=0.00016, train/loss_step=0.0436, global_step=2879.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 19/5971 [00:24<2:01:07,  1.22s/it, loss=0.284, v_num=0, train/loss_simple_step=0.0042, train/loss_vlb_step=2.3e-5, train/loss_step=0.0042, global_step=2879.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   0%|          | 20/5971 [00:26<2:05:46,  1.27s/it, loss=0.284, v_num=0, train/loss_simple_step=0.0042, train/loss_vlb_step=2.3e-5, train/loss_step=0.0042, global_step=2879.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 20/5971 [00:26<2:05:47,  1.27s/it, loss=0.282, v_num=0, train/loss_simple_step=0.337, train/loss_vlb_step=0.00189, train/loss_step=0.337, global_step=2879.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   0%|          | 21/5971 [00:27<2:04:12,  1.25s/it, loss=0.282, v_num=0, train/loss_simple_step=0.337, train/loss_vlb_step=0.00189, train/loss_step=0.337, global_step=2879.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 21/5971 [00:27<2:04:12,  1.25s/it, loss=0.248, v_num=0, train/loss_simple_step=0.0657, train/loss_vlb_step=0.000222, train/loss_step=0.0657, global_step=2880.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 22/5971 [00:28<2:02:35,  1.24s/it, loss=0.248, v_num=0, train/loss_simple_step=0.0657, train/loss_vlb_step=0.000222, train/loss_step=0.0657, global_step=2880.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 22/5971 [00:28<2:02:35,  1.24s/it, loss=0.243, v_num=0, train/loss_simple_step=0.0348, train/loss_vlb_step=0.000129, train/loss_step=0.0348, global_step=2880.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 23/5971 [00:29<2:01:04,  1.22s/it, loss=0.243, v_num=0, train/loss_simple_step=0.0348, train/loss_vlb_step=0.000129, train/loss_step=0.0348, global_step=2880.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 23/5971 [00:29<2:01:05,  1.22s/it, loss=0.269, v_num=0, train/loss_simple_step=0.537, train/loss_vlb_step=0.0116, train/loss_step=0.537, global_step=2880.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:   0%|          | 24/5971 [00:32<2:07:04,  1.28s/it, loss=0.269, v_num=0, train/loss_simple_step=0.537, train/loss_vlb_step=0.0116, train/loss_step=0.537, global_step=2880.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 24/5971 [00:32<2:07:04,  1.28s/it, loss=0.263, v_num=0, train/loss_simple_step=0.299, train/loss_vlb_step=0.00131, train/loss_step=0.299, global_step=2880.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 25/5971 [00:32<2:05:35,  1.27s/it, loss=0.263, v_num=0, train/loss_simple_step=0.299, train/loss_vlb_step=0.00131, train/loss_step=0.299, global_step=2880.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 25/5971 [00:32<2:05:36,  1.27s/it, loss=0.253, v_num=0, train/loss_simple_step=0.0209, train/loss_vlb_step=8.44e-5, train/loss_step=0.0209, global_step=2881.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 26/5971 [00:33<2:04:06,  1.25s/it, loss=0.253, v_num=0, train/loss_simple_step=0.0209, train/loss_vlb_step=8.44e-5, train/loss_step=0.0209, global_step=2881.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 26/5971 [00:33<2:04:06,  1.25s/it, loss=0.231, v_num=0, train/loss_simple_step=0.0134, train/loss_vlb_step=5.75e-5, train/loss_step=0.0134, global_step=2881.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 27/5971 [00:34<2:02:44,  1.24s/it, loss=0.231, v_num=0, train/loss_simple_step=0.0134, train/loss_vlb_step=5.75e-5, train/loss_step=0.0134, global_step=2881.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 27/5971 [00:34<2:02:44,  1.24s/it, loss=0.2, v_num=0, train/loss_simple_step=0.00278, train/loss_vlb_step=1.59e-5, train/loss_step=0.00278, global_step=2881.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 28/5971 [00:36<2:05:59,  1.27s/it, loss=0.2, v_num=0, train/loss_simple_step=0.00278, train/loss_vlb_step=1.59e-5, train/loss_step=0.00278, global_step=2881.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 28/5971 [00:36<2:06:00,  1.27s/it, loss=0.194, v_num=0, train/loss_simple_step=0.326, train/loss_vlb_step=0.00135, train/loss_step=0.326, global_step=2881.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:   0%|          | 29/5971 [00:37<2:04:43,  1.26s/it, loss=0.194, v_num=0, train/loss_simple_step=0.326, train/loss_vlb_step=0.00135, train/loss_step=0.326, global_step=2881.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 29/5971 [00:37<2:04:43,  1.26s/it, loss=0.195, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=5.73e-5, train/loss_step=0.0136, global_step=2882.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 30/5971 [00:38<2:03:28,  1.25s/it, loss=0.195, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=5.73e-5, train/loss_step=0.0136, global_step=2882.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 30/5971 [00:38<2:03:28,  1.25s/it, loss=0.195, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.48e-5, train/loss_step=0.0125, global_step=2882.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 31/5971 [00:39<2:02:16,  1.24s/it, loss=0.195, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.48e-5, train/loss_step=0.0125, global_step=2882.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 31/5971 [00:39<2:02:16,  1.24s/it, loss=0.206, v_num=0, train/loss_simple_step=0.330, train/loss_vlb_step=0.00161, train/loss_step=0.330, global_step=2882.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:   1%|          | 32/5971 [00:41<2:05:25,  1.27s/it, loss=0.206, v_num=0, train/loss_simple_step=0.330, train/loss_vlb_step=0.00161, train/loss_step=0.330, global_step=2882.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 32/5971 [00:41<2:05:25,  1.27s/it, loss=0.182, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=6.63e-5, train/loss_step=0.0156, global_step=2882.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 33/5971 [00:42<2:04:21,  1.26s/it, loss=0.182, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=6.63e-5, train/loss_step=0.0156, global_step=2882.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 33/5971 [00:42<2:04:21,  1.26s/it, loss=0.202, v_num=0, train/loss_simple_step=0.402, train/loss_vlb_step=0.00243, train/loss_step=0.402, global_step=2883.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:   1%|          | 34/5971 [00:43<2:03:15,  1.25s/it, loss=0.202, v_num=0, train/loss_simple_step=0.402, train/loss_vlb_step=0.00243, train/loss_step=0.402, global_step=2883.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 34/5971 [00:43<2:03:15,  1.25s/it, loss=0.217, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00167, train/loss_step=0.320, global_step=2883.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 35/5971 [00:44<2:02:14,  1.24s/it, loss=0.217, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00167, train/loss_step=0.320, global_step=2883.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 35/5971 [00:44<2:02:14,  1.24s/it, loss=0.204, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000738, train/loss_step=0.194, global_step=2883.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 36/5971 [00:46<2:04:44,  1.26s/it, loss=0.204, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000738, train/loss_step=0.194, global_step=2883.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 36/5971 [00:46<2:04:44,  1.26s/it, loss=0.173, v_num=0, train/loss_simple_step=0.00718, train/loss_vlb_step=3.44e-5, train/loss_step=0.00718, global_step=2883.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 37/5971 [00:47<2:03:45,  1.25s/it, loss=0.173, v_num=0, train/loss_simple_step=0.00718, train/loss_vlb_step=3.44e-5, train/loss_step=0.00718, global_step=2883.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 37/5971 [00:47<2:03:45,  1.25s/it, loss=0.158, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000684, train/loss_step=0.188, global_step=2884.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:   1%|          | 38/5971 [00:48<2:02:47,  1.24s/it, loss=0.158, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000684, train/loss_step=0.188, global_step=2884.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 38/5971 [00:48<2:02:47,  1.24s/it, loss=0.157, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=2884.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 39/5971 [00:49<2:01:50,  1.23s/it, loss=0.157, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=2884.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 39/5971 [00:49<2:01:51,  1.23s/it, loss=0.158, v_num=0, train/loss_simple_step=0.0181, train/loss_vlb_step=7.14e-5, train/loss_step=0.0181, global_step=2884.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 40/5971 [00:51<2:05:07,  1.27s/it, loss=0.158, v_num=0, train/loss_simple_step=0.0181, train/loss_vlb_step=7.14e-5, train/loss_step=0.0181, global_step=2884.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 40/5971 [00:51<2:05:07,  1.27s/it, loss=0.142, v_num=0, train/loss_simple_step=0.0177, train/loss_vlb_step=7.37e-5, train/loss_step=0.0177, global_step=2884.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 41/5971 [00:52<2:04:16,  1.26s/it, loss=0.142, v_num=0, train/loss_simple_step=0.0177, train/loss_vlb_step=7.37e-5, train/loss_step=0.0177, global_step=2884.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 41/5971 [00:52<2:04:17,  1.26s/it, loss=0.139, v_num=0, train/loss_simple_step=0.0081, train/loss_vlb_step=3.58e-5, train/loss_step=0.0081, global_step=2885.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 42/5971 [00:53<2:03:25,  1.25s/it, loss=0.139, v_num=0, train/loss_simple_step=0.0081, train/loss_vlb_step=3.58e-5, train/loss_step=0.0081, global_step=2885.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 42/5971 [00:53<2:03:25,  1.25s/it, loss=0.139, v_num=0, train/loss_simple_step=0.0496, train/loss_vlb_step=0.000174, train/loss_step=0.0496, global_step=2885.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 43/5971 [00:54<2:02:33,  1.24s/it, loss=0.139, v_num=0, train/loss_simple_step=0.0496, train/loss_vlb_step=0.000174, train/loss_step=0.0496, global_step=2885.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 43/5971 [00:54<2:02:33,  1.24s/it, loss=0.113, v_num=0, train/loss_simple_step=0.00159, train/loss_vlb_step=9.52e-6, train/loss_step=0.00159, global_step=2885.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 44/5971 [00:56<2:04:40,  1.26s/it, loss=0.113, v_num=0, train/loss_simple_step=0.00159, train/loss_vlb_step=9.52e-6, train/loss_step=0.00159, global_step=2885.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 44/5971 [00:56<2:04:40,  1.26s/it, loss=0.125, v_num=0, train/loss_simple_step=0.545, train/loss_vlb_step=0.00891, train/loss_step=0.545, global_step=2885.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:   1%|          | 45/5971 [00:57<2:03:54,  1.25s/it, loss=0.125, v_num=0, train/loss_simple_step=0.545, train/loss_vlb_step=0.00891, train/loss_step=0.545, global_step=2885.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 45/5971 [00:57<2:03:54,  1.25s/it, loss=0.128, v_num=0, train/loss_simple_step=0.0912, train/loss_vlb_step=0.0003, train/loss_step=0.0912, global_step=2886.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 46/5971 [00:58<2:03:03,  1.25s/it, loss=0.128, v_num=0, train/loss_simple_step=0.0912, train/loss_vlb_step=0.0003, train/loss_step=0.0912, global_step=2886.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 46/5971 [00:58<2:03:03,  1.25s/it, loss=0.154, v_num=0, train/loss_simple_step=0.518, train/loss_vlb_step=0.0066, train/loss_step=0.518, global_step=2886.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:   1%|          | 47/5971 [00:59<2:02:15,  1.24s/it, loss=0.154, v_num=0, train/loss_simple_step=0.518, train/loss_vlb_step=0.0066, train/loss_step=0.518, global_step=2886.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 47/5971 [00:59<2:02:15,  1.24s/it, loss=0.157, v_num=0, train/loss_simple_step=0.0663, train/loss_vlb_step=0.000231, train/loss_step=0.0663, global_step=2886.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 48/5971 [01:02<2:06:19,  1.28s/it, loss=0.157, v_num=0, train/loss_simple_step=0.0663, train/loss_vlb_step=0.000231, train/loss_step=0.0663, global_step=2886.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 48/5971 [01:02<2:06:19,  1.28s/it, loss=0.142, v_num=0, train/loss_simple_step=0.029, train/loss_vlb_step=0.000113, train/loss_step=0.029, global_step=2886.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:   1%|          | 49/5971 [01:03<2:05:36,  1.27s/it, loss=0.142, v_num=0, train/loss_simple_step=0.029, train/loss_vlb_step=0.000113, train/loss_step=0.029, global_step=2886.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 49/5971 [01:03<2:05:36,  1.27s/it, loss=0.153, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.00085, train/loss_step=0.243, global_step=2887.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   1%|          | 50/5971 [01:04<2:04:48,  1.26s/it, loss=0.153, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.00085, train/loss_step=0.243, global_step=2887.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 50/5971 [01:04<2:04:49,  1.26s/it, loss=0.165, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000872, train/loss_step=0.233, global_step=2887.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 51/5971 [01:05<2:04:04,  1.26s/it, loss=0.165, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000872, train/loss_step=0.233, global_step=2887.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 51/5971 [01:05<2:04:05,  1.26s/it, loss=0.18, v_num=0, train/loss_simple_step=0.649, train/loss_vlb_step=0.0119, train/loss_step=0.649, global_step=2887.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:   1%|          | 52/5971 [01:07<2:05:37,  1.27s/it, loss=0.18, v_num=0, train/loss_simple_step=0.649, train/loss_vlb_step=0.0119, train/loss_step=0.649, global_step=2887.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 52/5971 [01:07<2:05:37,  1.27s/it, loss=0.194, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00134, train/loss_step=0.287, global_step=2887.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 53/5971 [01:08<2:04:53,  1.27s/it, loss=0.194, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00134, train/loss_step=0.287, global_step=2887.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 53/5971 [01:08<2:04:53,  1.27s/it, loss=0.207, v_num=0, train/loss_simple_step=0.669, train/loss_vlb_step=0.0197, train/loss_step=0.669, global_step=2888.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   1%|          | 54/5971 [01:09<2:04:08,  1.26s/it, loss=0.207, v_num=0, train/loss_simple_step=0.669, train/loss_vlb_step=0.0197, train/loss_step=0.669, global_step=2888.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 54/5971 [01:09<2:04:08,  1.26s/it, loss=0.192, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.82e-5, train/loss_step=0.00347, global_step=2888.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 55/5971 [01:10<2:03:27,  1.25s/it, loss=0.192, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.82e-5, train/loss_step=0.00347, global_step=2888.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 55/5971 [01:10<2:03:27,  1.25s/it, loss=0.185, v_num=0, train/loss_simple_step=0.0571, train/loss_vlb_step=0.0002, train/loss_step=0.0571, global_step=2888.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:   1%|          | 56/5971 [01:12<2:04:58,  1.27s/it, loss=0.185, v_num=0, train/loss_simple_step=0.0571, train/loss_vlb_step=0.0002, train/loss_step=0.0571, global_step=2888.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 56/5971 [01:12<2:04:58,  1.27s/it, loss=0.211, v_num=0, train/loss_simple_step=0.538, train/loss_vlb_step=0.00368, train/loss_step=0.538, global_step=2888.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   1%|          | 57/5971 [01:13<2:04:19,  1.26s/it, loss=0.211, v_num=0, train/loss_simple_step=0.538, train/loss_vlb_step=0.00368, train/loss_step=0.538, global_step=2888.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 57/5971 [01:13<2:04:19,  1.26s/it, loss=0.202, v_num=0, train/loss_simple_step=0.00291, train/loss_vlb_step=1.62e-5, train/loss_step=0.00291, global_step=2889.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 58/5971 [01:14<2:03:41,  1.26s/it, loss=0.202, v_num=0, train/loss_simple_step=0.00291, train/loss_vlb_step=1.62e-5, train/loss_step=0.00291, global_step=2889.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 58/5971 [01:14<2:03:41,  1.26s/it, loss=0.212, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000749, train/loss_step=0.205, global_step=2889.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:   1%|          | 59/5971 [01:14<2:03:03,  1.25s/it, loss=0.212, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000749, train/loss_step=0.205, global_step=2889.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 59/5971 [01:14<2:03:03,  1.25s/it, loss=0.218, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000501, train/loss_step=0.152, global_step=2889.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 60/5971 [01:17<2:05:14,  1.27s/it, loss=0.218, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000501, train/loss_step=0.152, global_step=2889.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 60/5971 [01:17<2:05:15,  1.27s/it, loss=0.217, v_num=0, train/loss_simple_step=0.00118, train/loss_vlb_step=7.07e-6, train/loss_step=0.00118, global_step=2889.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 61/5971 [01:18<2:04:39,  1.27s/it, loss=0.217, v_num=0, train/loss_simple_step=0.00118, train/loss_vlb_step=7.07e-6, train/loss_step=0.00118, global_step=2889.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 61/5971 [01:18<2:04:39,  1.27s/it, loss=0.224, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000441, train/loss_step=0.134, global_step=2890.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:   1%|          | 62/5971 [01:19<2:04:02,  1.26s/it, loss=0.224, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000441, train/loss_step=0.134, global_step=2890.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 62/5971 [01:19<2:04:02,  1.26s/it, loss=0.222, v_num=0, train/loss_simple_step=0.015, train/loss_vlb_step=6.83e-5, train/loss_step=0.015, global_step=2890.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   1%|          | 63/5971 [01:20<2:03:25,  1.25s/it, loss=0.222, v_num=0, train/loss_simple_step=0.015, train/loss_vlb_step=6.83e-5, train/loss_step=0.015, global_step=2890.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 63/5971 [01:20<2:03:25,  1.25s/it, loss=0.222, v_num=0, train/loss_simple_step=0.00384, train/loss_vlb_step=2.05e-5, train/loss_step=0.00384, global_step=2890.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 64/5971 [01:22<2:04:55,  1.27s/it, loss=0.222, v_num=0, train/loss_simple_step=0.00384, train/loss_vlb_step=2.05e-5, train/loss_step=0.00384, global_step=2890.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 64/5971 [01:22<2:04:55,  1.27s/it, loss=0.196, v_num=0, train/loss_simple_step=0.0164, train/loss_vlb_step=6.8e-5, train/loss_step=0.0164, global_step=2890.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:   1%|          | 65/5971 [01:23<2:04:24,  1.26s/it, loss=0.196, v_num=0, train/loss_simple_step=0.0164, train/loss_vlb_step=6.8e-5, train/loss_step=0.0164, global_step=2890.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 65/5971 [01:23<2:04:24,  1.26s/it, loss=0.207, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.00151, train/loss_step=0.313, global_step=2891.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   1%|          | 66/5971 [01:24<2:03:49,  1.26s/it, loss=0.207, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.00151, train/loss_step=0.313, global_step=2891.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 66/5971 [01:24<2:03:49,  1.26s/it, loss=0.181, v_num=0, train/loss_simple_step=0.0103, train/loss_vlb_step=4.72e-5, train/loss_step=0.0103, global_step=2891.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 67/5971 [01:25<2:03:16,  1.25s/it, loss=0.181, v_num=0, train/loss_simple_step=0.0103, train/loss_vlb_step=4.72e-5, train/loss_step=0.0103, global_step=2891.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 67/5971 [01:25<2:03:16,  1.25s/it, loss=0.186, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000558, train/loss_step=0.161, global_step=2891.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   1%|          | 68/5971 [01:27<2:04:33,  1.27s/it, loss=0.186, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000558, train/loss_step=0.161, global_step=2891.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 68/5971 [01:27<2:04:33,  1.27s/it, loss=0.188, v_num=0, train/loss_simple_step=0.0691, train/loss_vlb_step=0.00023, train/loss_step=0.0691, global_step=2891.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 69/5971 [01:28<2:04:00,  1.26s/it, loss=0.188, v_num=0, train/loss_simple_step=0.0691, train/loss_vlb_step=0.00023, train/loss_step=0.0691, global_step=2891.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 69/5971 [01:28<2:04:00,  1.26s/it, loss=0.177, v_num=0, train/loss_simple_step=0.010, train/loss_vlb_step=4.55e-5, train/loss_step=0.010, global_step=2892.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:   1%|          | 70/5971 [01:29<2:03:26,  1.26s/it, loss=0.177, v_num=0, train/loss_simple_step=0.010, train/loss_vlb_step=4.55e-5, train/loss_step=0.010, global_step=2892.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 70/5971 [01:29<2:03:26,  1.26s/it, loss=0.167, v_num=0, train/loss_simple_step=0.0483, train/loss_vlb_step=0.000168, train/loss_step=0.0483, global_step=2892.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 71/5971 [01:29<2:02:54,  1.25s/it, loss=0.167, v_num=0, train/loss_simple_step=0.0483, train/loss_vlb_step=0.000168, train/loss_step=0.0483, global_step=2892.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 71/5971 [01:29<2:02:54,  1.25s/it, loss=0.135, v_num=0, train/loss_simple_step=0.00277, train/loss_vlb_step=1.54e-5, train/loss_step=0.00277, global_step=2892.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 72/5971 [01:32<2:04:06,  1.26s/it, loss=0.135, v_num=0, train/loss_simple_step=0.00277, train/loss_vlb_step=1.54e-5, train/loss_step=0.00277, global_step=2892.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 72/5971 [01:32<2:04:06,  1.26s/it, loss=0.125, v_num=0, train/loss_simple_step=0.093, train/loss_vlb_step=0.000306, train/loss_step=0.093, global_step=2892.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:   1%|          | 73/5971 [01:33<2:03:39,  1.26s/it, loss=0.125, v_num=0, train/loss_simple_step=0.093, train/loss_vlb_step=0.000306, train/loss_step=0.093, global_step=2892.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 73/5971 [01:33<2:03:39,  1.26s/it, loss=0.107, v_num=0, train/loss_simple_step=0.307, train/loss_vlb_step=0.00147, train/loss_step=0.307, global_step=2893.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   1%|          | 74/5971 [01:33<2:03:08,  1.25s/it, loss=0.107, v_num=0, train/loss_simple_step=0.307, train/loss_vlb_step=0.00147, train/loss_step=0.307, global_step=2893.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|          | 74/5971 [01:33<2:03:08,  1.25s/it, loss=0.114, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000465, train/loss_step=0.140, global_step=2893.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|▏         | 75/5971 [01:34<2:02:39,  1.25s/it, loss=0.114, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000465, train/loss_step=0.140, global_step=2893.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|▏         | 75/5971 [01:34<2:02:39,  1.25s/it, loss=0.12, v_num=0, train/loss_simple_step=0.183, train/loss_vlb_step=0.000642, train/loss_step=0.183, global_step=2893.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   1%|▏         | 76/5971 [01:36<2:03:44,  1.26s/it, loss=0.12, v_num=0, train/loss_simple_step=0.183, train/loss_vlb_step=0.000642, train/loss_step=0.183, global_step=2893.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|▏         | 76/5971 [01:36<2:03:44,  1.26s/it, loss=0.0935, v_num=0, train/loss_simple_step=0.0018, train/loss_vlb_step=1.06e-5, train/loss_step=0.0018, global_step=2893.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|▏         | 77/5971 [01:37<2:03:15,  1.25s/it, loss=0.0935, v_num=0, train/loss_simple_step=0.0018, train/loss_vlb_step=1.06e-5, train/loss_step=0.0018, global_step=2893.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|▏         | 77/5971 [01:37<2:03:15,  1.25s/it, loss=0.0948, v_num=0, train/loss_simple_step=0.0292, train/loss_vlb_step=0.000108, train/loss_step=0.0292, global_step=2894.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|▏         | 78/5971 [01:38<2:02:45,  1.25s/it, loss=0.0948, v_num=0, train/loss_simple_step=0.0292, train/loss_vlb_step=0.000108, train/loss_step=0.0292, global_step=2894.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|▏         | 78/5971 [01:38<2:02:45,  1.25s/it, loss=0.101, v_num=0, train/loss_simple_step=0.324, train/loss_vlb_step=0.00168, train/loss_step=0.324, global_step=2894.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:   1%|▏         | 79/5971 [01:39<2:02:17,  1.25s/it, loss=0.101, v_num=0, train/loss_simple_step=0.324, train/loss_vlb_step=0.00168, train/loss_step=0.324, global_step=2894.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|▏         | 79/5971 [01:39<2:02:17,  1.25s/it, loss=0.0937, v_num=0, train/loss_simple_step=0.00927, train/loss_vlb_step=4.31e-5, train/loss_step=0.00927, global_step=2894.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|▏         | 80/5971 [01:41<2:03:20,  1.26s/it, loss=0.0937, v_num=0, train/loss_simple_step=0.00927, train/loss_vlb_step=4.31e-5, train/loss_step=0.00927, global_step=2894.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|▏         | 80/5971 [01:41<2:03:20,  1.26s/it, loss=0.117, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00462, train/loss_step=0.477, global_step=2894.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:   1%|▏         | 81/5971 [01:42<2:02:55,  1.25s/it, loss=0.117, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00462, train/loss_step=0.477, global_step=2894.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|▏         | 81/5971 [01:42<2:02:55,  1.25s/it, loss=0.111, v_num=0, train/loss_simple_step=0.00439, train/loss_vlb_step=2.27e-5, train/loss_step=0.00439, global_step=2895.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|▏         | 82/5971 [01:43<2:02:26,  1.25s/it, loss=0.111, v_num=0, train/loss_simple_step=0.00439, train/loss_vlb_step=2.27e-5, train/loss_step=0.00439, global_step=2895.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|▏         | 82/5971 [01:43<2:02:26,  1.25s/it, loss=0.11, v_num=0, train/loss_simple_step=0.00587, train/loss_vlb_step=3.03e-5, train/loss_step=0.00587, global_step=2895.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   1%|▏         | 83/5971 [01:44<2:01:59,  1.24s/it, loss=0.11, v_num=0, train/loss_simple_step=0.00587, train/loss_vlb_step=3.03e-5, train/loss_step=0.00587, global_step=2895.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|▏         | 83/5971 [01:44<2:01:59,  1.24s/it, loss=0.111, v_num=0, train/loss_simple_step=0.00498, train/loss_vlb_step=2.48e-5, train/loss_step=0.00498, global_step=2895.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|▏         | 84/5971 [01:46<2:03:00,  1.25s/it, loss=0.111, v_num=0, train/loss_simple_step=0.00498, train/loss_vlb_step=2.48e-5, train/loss_step=0.00498, global_step=2895.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|▏         | 84/5971 [01:46<2:03:00,  1.25s/it, loss=0.112, v_num=0, train/loss_simple_step=0.041, train/loss_vlb_step=0.00014, train/loss_step=0.041, global_step=2895.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:   1%|▏         | 85/5971 [01:47<2:02:34,  1.25s/it, loss=0.112, v_num=0, train/loss_simple_step=0.041, train/loss_vlb_step=0.00014, train/loss_step=0.041, global_step=2895.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|▏         | 85/5971 [01:47<2:02:34,  1.25s/it, loss=0.102, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000365, train/loss_step=0.111, global_step=2896.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|▏         | 86/5971 [01:48<2:02:08,  1.25s/it, loss=0.102, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000365, train/loss_step=0.111, global_step=2896.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|▏         | 86/5971 [01:48<2:02:08,  1.25s/it, loss=0.103, v_num=0, train/loss_simple_step=0.0306, train/loss_vlb_step=0.000119, train/loss_step=0.0306, global_step=2896.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|▏         | 87/5971 [01:49<2:01:42,  1.24s/it, loss=0.103, v_num=0, train/loss_simple_step=0.0306, train/loss_vlb_step=0.000119, train/loss_step=0.0306, global_step=2896.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|▏         | 87/5971 [01:49<2:01:43,  1.24s/it, loss=0.0968, v_num=0, train/loss_simple_step=0.0433, train/loss_vlb_step=0.000154, train/loss_step=0.0433, global_step=2896.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|▏         | 88/5971 [01:51<2:03:06,  1.26s/it, loss=0.0968, v_num=0, train/loss_simple_step=0.0433, train/loss_vlb_step=0.000154, train/loss_step=0.0433, global_step=2896.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|▏         | 88/5971 [01:51<2:03:06,  1.26s/it, loss=0.0977, v_num=0, train/loss_simple_step=0.0881, train/loss_vlb_step=0.000292, train/loss_step=0.0881, global_step=2896.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|▏         | 89/5971 [01:52<2:02:47,  1.25s/it, loss=0.0977, v_num=0, train/loss_simple_step=0.0881, train/loss_vlb_step=0.000292, train/loss_step=0.0881, global_step=2896.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   1%|▏         | 89/5971 [01:52<2:02:47,  1.25s/it, loss=0.0976, v_num=0, train/loss_simple_step=0.00757, train/loss_vlb_step=3.76e-5, train/loss_step=0.00757, global_step=2897.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   2%|▏         | 90/5971 [01:53<2:02:21,  1.25s/it, loss=0.0976, v_num=0, train/loss_simple_step=0.00757, train/loss_vlb_step=3.76e-5, train/loss_step=0.00757, global_step=2897.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   2%|▏         | 90/5971 [01:53<2:02:21,  1.25s/it, loss=0.0962, v_num=0, train/loss_simple_step=0.0206, train/loss_vlb_step=8.23e-5, train/loss_step=0.0206, global_step=2897.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:   2%|▏         | 91/5971 [01:54<2:01:56,  1.24s/it, loss=0.0962, v_num=0, train/loss_simple_step=0.0206, train/loss_vlb_step=8.23e-5, train/loss_step=0.0206, global_step=2897.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   2%|▏         | 91/5971 [01:54<2:01:56,  1.24s/it, loss=0.0969, v_num=0, train/loss_simple_step=0.0172, train/loss_vlb_step=7.15e-5, train/loss_step=0.0172, global_step=2897.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   2%|▏         | 92/5971 [01:56<2:02:49,  1.25s/it, loss=0.0969, v_num=0, train/loss_simple_step=0.0172, train/loss_vlb_step=7.15e-5, train/loss_step=0.0172, global_step=2897.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   2%|▏         | 92/5971 [01:56<2:02:49,  1.25s/it, loss=0.093, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.53e-5, train/loss_step=0.0135, global_step=2897.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   2%|▏         | 93/5971 [01:57<2:02:26,  1.25s/it, loss=0.093, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.53e-5, train/loss_step=0.0135, global_step=2897.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   2%|▏         | 93/5971 [01:57<2:02:26,  1.25s/it, loss=0.0794, v_num=0, train/loss_simple_step=0.0362, train/loss_vlb_step=0.00014, train/loss_step=0.0362, global_step=2898.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   2%|▏         | 94/5971 [01:58<2:02:03,  1.25s/it, loss=0.0794, v_num=0, train/loss_simple_step=0.0362, train/loss_vlb_step=0.00014, train/loss_step=0.0362, global_step=2898.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   2%|▏         | 94/5971 [01:58<2:02:03,  1.25s/it, loss=0.0785, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000397, train/loss_step=0.121, global_step=2898.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   2%|▏         | 95/5971 [01:59<2:01:40,  1.24s/it, loss=0.0785, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000397, train/loss_step=0.121, global_step=2898.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   2%|▏         | 95/5971 [01:59<2:01:40,  1.24s/it, loss=0.0721, v_num=0, train/loss_simple_step=0.0554, train/loss_vlb_step=0.000193, train/loss_step=0.0554, global_step=2898.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   2%|▏         | 96/5971 [02:01<2:02:31,  1.25s/it, loss=0.0721, v_num=0, train/loss_simple_step=0.0554, train/loss_vlb_step=0.000193, train/loss_step=0.0554, global_step=2898.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   2%|▏         | 96/5971 [02:01<2:02:31,  1.25s/it, loss=0.0894, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00232, train/loss_step=0.348, global_step=2898.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:   2%|▏         | 97/5971 [02:02<2:02:11,  1.25s/it, loss=0.0894, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00232, train/loss_step=0.348, global_step=2898.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   2%|▏         | 97/5971 [02:02<2:02:11,  1.25s/it, loss=0.0984, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.000784, train/loss_step=0.210, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   2%|▏         | 98/5971 [02:03<2:01:48,  1.24s/it, loss=0.0984, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.000784, train/loss_step=0.210, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   2%|▏         | 98/5971 [02:03<2:01:48,  1.24s/it, loss=0.0826, v_num=0, train/loss_simple_step=0.00688, train/loss_vlb_step=3.38e-5, train/loss_step=0.00688, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   2%|▏         | 99/5971 [02:04<2:01:25,  1.24s/it, loss=0.0826, v_num=0, train/loss_simple_step=0.00688, train/loss_vlb_step=3.38e-5, train/loss_step=0.00688, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   2%|▏         | 99/5971 [02:04<2:01:25,  1.24s/it, loss=0.096, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.0012, train/loss_step=0.278, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]      
Epoch 5:   2%|▏         | 100/5971 [02:06<2:02:17,  1.25s/it, loss=0.096, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.0012, train/loss_step=0.278, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   2%|▏         | 100/5971 [02:06<2:02:17,  1.25s/it, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:18,  2.10it/s]
Epoch 5:   2%|▏         | 102/5971 [02:06<2:00:22,  1.23s/it, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   1%|          | 2/167 [00:00<00:47,  3.48it/s]
Epoch 5:   2%|▏         | 104/5971 [02:06<1:58:13,  1.21s/it, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   2%|▏         | 4/167 [00:00<00:23,  7.01it/s]
Epoch 5:   2%|▏         | 107/5971 [02:07<1:55:00,  1.18s/it, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   4%|▍         | 7/167 [00:00<00:13, 11.98it/s]
Epoch 5:   2%|▏         | 110/5971 [02:07<1:51:57,  1.15s/it, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   6%|▌         | 10/167 [00:00<00:10, 15.26it/s]
Epoch 5:   2%|▏         | 113/5971 [02:07<1:49:04,  1.12s/it, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   8%|▊         | 13/167 [00:01<00:08, 17.45it/s]
Epoch 5:   2%|▏         | 116/5971 [02:07<1:46:19,  1.09s/it, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  10%|▉         | 16/167 [00:01<00:07, 19.51it/s]
Epoch 5:   2%|▏         | 119/5971 [02:07<1:43:44,  1.06s/it, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  11%|█▏        | 19/167 [00:01<00:07, 19.27it/s][A
Epoch 5:   2%|▏         | 122/5971 [02:07<1:41:16,  1.04s/it, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  13%|█▎        | 22/167 [00:01<00:07, 20.28it/s][A
Epoch 5:   2%|▏         | 125/5971 [02:07<1:38:53,  1.01s/it, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  15%|█▍        | 25/167 [00:01<00:06, 22.04it/s][A
Epoch 5:   2%|▏         | 128/5971 [02:07<1:36:37,  1.01it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  17%|█▋        | 28/167 [00:01<00:06, 22.03it/s][A
Epoch 5:   2%|▏         | 131/5971 [02:08<1:34:29,  1.03it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  19%|█▊        | 31/167 [00:01<00:06, 22.25it/s][A
Epoch 5:   2%|▏         | 134/5971 [02:08<1:32:25,  1.05it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  20%|██        | 34/167 [00:02<00:05, 24.09it/s][A

Validating:  22%|██▏       | 37/167 [00:02<00:05, 25.44it/s][A
Epoch 5:   2%|▏         | 138/5971 [02:08<1:29:47,  1.08it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  24%|██▍       | 40/167 [00:02<00:04, 25.81it/s][A
Epoch 5:   2%|▏         | 142/5971 [02:08<1:27:19,  1.11it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  26%|██▌       | 43/167 [00:02<00:04, 26.94it/s][A
Epoch 5:   2%|▏         | 146/5971 [02:08<1:24:59,  1.14it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  28%|██▊       | 46/167 [00:02<00:04, 26.80it/s][A

Validating:  29%|██▉       | 49/167 [00:02<00:04, 25.28it/s][A
Epoch 5:   3%|▎         | 150/5971 [02:08<1:22:47,  1.17it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  31%|███       | 52/167 [00:02<00:04, 25.44it/s][A
Epoch 5:   3%|▎         | 154/5971 [02:09<1:20:42,  1.20it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  33%|███▎      | 55/167 [00:02<00:04, 24.55it/s][A
Epoch 5:   3%|▎         | 158/5971 [02:09<1:18:42,  1.23it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  35%|███▍      | 58/167 [00:02<00:04, 24.76it/s][A

Validating:  37%|███▋      | 61/167 [00:03<00:04, 25.57it/s][A
Epoch 5:   3%|▎         | 162/5971 [02:09<1:16:48,  1.26it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  38%|███▊      | 64/167 [00:03<00:03, 25.76it/s][A
Epoch 5:   3%|▎         | 166/5971 [02:09<1:15:00,  1.29it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  41%|████      | 68/167 [00:03<00:03, 26.67it/s][A
Epoch 5:   3%|▎         | 170/5971 [02:09<1:13:17,  1.32it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 26.27it/s][A
Epoch 5:   3%|▎         | 174/5971 [02:09<1:11:39,  1.35it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 26.34it/s][A

Validating:  46%|████▌     | 77/167 [00:03<00:03, 27.05it/s][A
Epoch 5:   3%|▎         | 178/5971 [02:09<1:10:04,  1.38it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 28.49it/s][A
Epoch 5:   3%|▎         | 182/5971 [02:10<1:08:33,  1.41it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  50%|█████     | 84/167 [00:03<00:02, 28.04it/s][A
Epoch 5:   3%|▎         | 186/5971 [02:10<1:07:07,  1.44it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  52%|█████▏    | 87/167 [00:04<00:02, 27.29it/s][A
Epoch 5:   3%|▎         | 190/5971 [02:10<1:05:45,  1.47it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  54%|█████▍    | 90/167 [00:04<00:02, 26.13it/s][A

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 26.52it/s][A
Epoch 5:   3%|▎         | 194/5971 [02:10<1:04:26,  1.49it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 26.09it/s][A
Epoch 5:   3%|▎         | 198/5971 [02:10<1:03:10,  1.52it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 26.13it/s][A
Epoch 5:   3%|▎         | 202/5971 [02:10<1:01:57,  1.55it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  61%|██████    | 102/167 [00:04<00:02, 26.06it/s][A

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 24.49it/s][A
Epoch 5:   3%|▎         | 206/5971 [02:10<1:00:48,  1.58it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 25.35it/s][A
Epoch 5:   4%|▎         | 210/5971 [02:11<59:41,  1.61it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 24.11it/s][A
Epoch 5:   4%|▎         | 214/5971 [02:11<58:36,  1.64it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  68%|██████▊   | 114/167 [00:05<00:02, 25.45it/s][A

Validating:  70%|███████   | 117/167 [00:05<00:01, 25.27it/s][A
Epoch 5:   4%|▎         | 218/5971 [02:11<57:33,  1.67it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 27.21it/s][A
Epoch 5:   4%|▎         | 222/5971 [02:11<56:32,  1.69it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 25.90it/s][A
Epoch 5:   4%|▍         | 226/5971 [02:11<55:34,  1.72it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 26.03it/s][A
Epoch 5:   4%|▍         | 230/5971 [02:11<54:38,  1.75it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 26.84it/s][A
Epoch 5:   4%|▍         | 234/5971 [02:12<53:43,  1.78it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  80%|████████  | 134/167 [00:05<00:01, 27.32it/s][A

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 27.11it/s][A
Epoch 5:   4%|▍         | 238/5971 [02:12<52:51,  1.81it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  84%|████████▍ | 140/167 [00:06<00:00, 27.42it/s][A
Epoch 5:   4%|▍         | 242/5971 [02:12<52:00,  1.84it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 27.06it/s][A
Epoch 5:   4%|▍         | 246/5971 [02:12<51:11,  1.86it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 26.97it/s][A

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 27.62it/s][A
Epoch 5:   4%|▍         | 250/5971 [02:12<50:23,  1.89it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 27.08it/s][A
Epoch 5:   4%|▍         | 254/5971 [02:12<49:36,  1.92it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 26.63it/s][A
Epoch 5:   4%|▍         | 258/5971 [02:12<48:52,  1.95it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 25.83it/s][A

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 26.36it/s][A
Epoch 5:   4%|▍         | 262/5971 [02:13<48:09,  1.98it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 25.56it/s][A
Epoch 5:   4%|▍         | 266/5971 [02:13<47:27,  2.00it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating: 100%|██████████| 167/167 [00:07<00:00, 24.93it/s][A
Epoch 5:   4%|▍         | 268/5971 [02:13<47:15,  2.01it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000119, train/loss_step=0.0318, global_step=2899.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:35,  1.36it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.44it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.28it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.92it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.36it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.68it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.89it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.99it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.08it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.21it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.35it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.43it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.41it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.40it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.37it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.40it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.48it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.55it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.57it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.60it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.62it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.51it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.51it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.43it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.45it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.47it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.50it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.50it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.52it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.52it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.55it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.58it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.58it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.61it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.64it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.65it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.49it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.48it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.50it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.53it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.59it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.63it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.66it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.68it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.69it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.70it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.71it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.71it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.70it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.70it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.23it/s]

Epoch 5:   5%|▍         | 269/5971 [02:25<51:16,  1.85it/s, loss=0.074, v_num=0, train/loss_simple_step=0.00882, train/loss_vlb_step=4.13e-5, train/loss_step=0.00882, global_step=2900.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.43it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.26it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.82it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.23it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.51it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.74it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.87it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  4.99it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.06it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.11it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.15it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:07,  5.15it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.19it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.19it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.33it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.43it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.50it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.54it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.58it/s][A
Epoch 5:   5%|▍         | 269/5971 [02:31<53:15,  1.78it/s, loss=0.074, v_num=0, train/loss_simple_step=0.00882, train/loss_vlb_step=4.13e-5, train/loss_step=0.00882, global_step=2900.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.61it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.64it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.66it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.67it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.55it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.47it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.42it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.38it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.47it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.50it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.56it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.61it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.64it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.66it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.33it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.43it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.49it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.54it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.45it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.47it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.47it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.42it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.39it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.36it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.27it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.30it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.41it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.50it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.55it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.59it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.13it/s]

Epoch 5:   5%|▍         | 270/5971 [02:37<55:19,  1.72it/s, loss=0.074, v_num=0, train/loss_simple_step=0.00882, train/loss_vlb_step=4.13e-5, train/loss_step=0.00882, global_step=2900.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 270/5971 [02:37<55:19,  1.72it/s, loss=0.0792, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000364, train/loss_step=0.110, global_step=2900.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:33,  1.47it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:18,  2.59it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:13,  3.44it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  4.07it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.50it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.82it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.06it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.22it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.33it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.43it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.49it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.54it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.59it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.51it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.45it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.48it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.50it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.53it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.57it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.56it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.57it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.57it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.53it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.52it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:05,  4.51it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:05,  4.79it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  4.94it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.10it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:04,  5.23it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.27it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.26it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.34it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.40it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.46it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.51it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.57it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.52it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.49it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.49it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.42it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.40it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.42it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.43it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.44it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.39it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.43it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.48it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.54it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.59it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.51it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.16it/s]

Epoch 5:   5%|▍         | 271/5971 [02:49<59:21,  1.60it/s, loss=0.0792, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000364, train/loss_step=0.110, global_step=2900.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 271/5971 [02:49<59:21,  1.60it/s, loss=0.0963, v_num=0, train/loss_simple_step=0.347, train/loss_vlb_step=0.00178, train/loss_step=0.347, global_step=2900.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.33it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.40it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.23it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.84it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.27it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.57it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.84it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.94it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.16it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.31it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.43it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.50it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.56it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.60it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.64it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.66it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.68it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.68it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.68it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.67it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.68it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.67it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.54it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.58it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.61it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.64it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.65it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.68it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.67it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.68it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.70it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.71it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:02,  5.73it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.71it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.70it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.70it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.66it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.59it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.52it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.47it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.46it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.44it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.43it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.43it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.42it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.44it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.43it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.44it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.50it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.55it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.24it/s]

Epoch 5:   5%|▍         | 272/5971 [03:03<1:03:44,  1.49it/s, loss=0.0963, v_num=0, train/loss_simple_step=0.347, train/loss_vlb_step=0.00178, train/loss_step=0.347, global_step=2900.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 272/5971 [03:03<1:03:44,  1.49it/s, loss=0.0956, v_num=0, train/loss_simple_step=0.0262, train/loss_vlb_step=0.000101, train/loss_step=0.0262, global_step=2900.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 273/5971 [03:04<1:03:48,  1.49it/s, loss=0.0956, v_num=0, train/loss_simple_step=0.0262, train/loss_vlb_step=0.000101, train/loss_step=0.0262, global_step=2900.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 273/5971 [03:04<1:03:48,  1.49it/s, loss=0.109, v_num=0, train/loss_simple_step=0.377, train/loss_vlb_step=0.00175, train/loss_step=0.377, global_step=2901.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:   5%|▍         | 274/5971 [03:05<1:03:53,  1.49it/s, loss=0.109, v_num=0, train/loss_simple_step=0.377, train/loss_vlb_step=0.00175, train/loss_step=0.377, global_step=2901.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 274/5971 [03:05<1:03:53,  1.49it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00962, train/loss_vlb_step=4.31e-5, train/loss_step=0.00962, global_step=2901.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 275/5971 [03:05<1:03:57,  1.48it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00962, train/loss_vlb_step=4.31e-5, train/loss_step=0.00962, global_step=2901.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 275/5971 [03:05<1:03:57,  1.48it/s, loss=0.109, v_num=0, train/loss_simple_step=0.057, train/loss_vlb_step=0.000192, train/loss_step=0.057, global_step=2901.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:   5%|▍         | 276/5971 [03:08<1:04:34,  1.47it/s, loss=0.109, v_num=0, train/loss_simple_step=0.057, train/loss_vlb_step=0.000192, train/loss_step=0.057, global_step=2901.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 276/5971 [03:08<1:04:34,  1.47it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0531, train/loss_vlb_step=0.00019, train/loss_step=0.0531, global_step=2901.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 277/5971 [03:09<1:04:38,  1.47it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0531, train/loss_vlb_step=0.00019, train/loss_step=0.0531, global_step=2901.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 277/5971 [03:09<1:04:38,  1.47it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0432, train/loss_vlb_step=0.00015, train/loss_step=0.0432, global_step=2902.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 278/5971 [03:10<1:04:41,  1.47it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0432, train/loss_vlb_step=0.00015, train/loss_step=0.0432, global_step=2902.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 278/5971 [03:10<1:04:41,  1.47it/s, loss=0.125, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00151, train/loss_step=0.348, global_step=2902.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:   5%|▍         | 279/5971 [03:11<1:04:45,  1.46it/s, loss=0.125, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00151, train/loss_step=0.348, global_step=2902.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 279/5971 [03:11<1:04:45,  1.46it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00827, train/loss_vlb_step=3.78e-5, train/loss_step=0.00827, global_step=2902.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 280/5971 [03:13<1:05:14,  1.45it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00827, train/loss_vlb_step=3.78e-5, train/loss_step=0.00827, global_step=2902.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 280/5971 [03:13<1:05:14,  1.45it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0687, train/loss_vlb_step=0.00024, train/loss_step=0.0687, global_step=2902.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:   5%|▍         | 281/5971 [03:14<1:05:17,  1.45it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0687, train/loss_vlb_step=0.00024, train/loss_step=0.0687, global_step=2902.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 281/5971 [03:14<1:05:18,  1.45it/s, loss=0.156, v_num=0, train/loss_simple_step=0.621, train/loss_vlb_step=0.0114, train/loss_step=0.621, global_step=2903.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:   5%|▍         | 282/5971 [03:15<1:05:21,  1.45it/s, loss=0.156, v_num=0, train/loss_simple_step=0.621, train/loss_vlb_step=0.0114, train/loss_step=0.621, global_step=2903.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 282/5971 [03:15<1:05:21,  1.45it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0168, train/loss_vlb_step=7.34e-5, train/loss_step=0.0168, global_step=2903.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 283/5971 [03:15<1:05:23,  1.45it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0168, train/loss_vlb_step=7.34e-5, train/loss_step=0.0168, global_step=2903.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 283/5971 [03:15<1:05:23,  1.45it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0016, train/loss_vlb_step=9.24e-6, train/loss_step=0.0016, global_step=2903.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 284/5971 [03:18<1:05:59,  1.44it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0016, train/loss_vlb_step=9.24e-6, train/loss_step=0.0016, global_step=2903.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 284/5971 [03:18<1:05:59,  1.44it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00755, train/loss_vlb_step=3.22e-5, train/loss_step=0.00755, global_step=2903.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 285/5971 [03:19<1:06:03,  1.43it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00755, train/loss_vlb_step=3.22e-5, train/loss_step=0.00755, global_step=2903.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 285/5971 [03:19<1:06:03,  1.43it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00178, train/loss_vlb_step=1.08e-5, train/loss_step=0.00178, global_step=2904.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 286/5971 [03:20<1:06:05,  1.43it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00178, train/loss_vlb_step=1.08e-5, train/loss_step=0.00178, global_step=2904.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 286/5971 [03:20<1:06:05,  1.43it/s, loss=0.135, v_num=0, train/loss_simple_step=0.277, train/loss_vlb_step=0.00132, train/loss_step=0.277, global_step=2904.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:   5%|▍         | 287/5971 [03:21<1:06:08,  1.43it/s, loss=0.135, v_num=0, train/loss_simple_step=0.277, train/loss_vlb_step=0.00132, train/loss_step=0.277, global_step=2904.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 287/5971 [03:21<1:06:08,  1.43it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0194, train/loss_vlb_step=8.16e-5, train/loss_step=0.0194, global_step=2904.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 288/5971 [03:23<1:06:35,  1.42it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0194, train/loss_vlb_step=8.16e-5, train/loss_step=0.0194, global_step=2904.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 288/5971 [03:23<1:06:35,  1.42it/s, loss=0.129, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000594, train/loss_step=0.170, global_step=2904.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   5%|▍         | 289/5971 [03:24<1:06:39,  1.42it/s, loss=0.129, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000594, train/loss_step=0.170, global_step=2904.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 289/5971 [03:24<1:06:39,  1.42it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0175, train/loss_vlb_step=6.73e-5, train/loss_step=0.0175, global_step=2905.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 290/5971 [03:24<1:06:41,  1.42it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0175, train/loss_vlb_step=6.73e-5, train/loss_step=0.0175, global_step=2905.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 290/5971 [03:24<1:06:41,  1.42it/s, loss=0.135, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.00091, train/loss_step=0.224, global_step=2905.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:   5%|▍         | 291/5971 [03:25<1:06:44,  1.42it/s, loss=0.135, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.00091, train/loss_step=0.224, global_step=2905.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 291/5971 [03:25<1:06:44,  1.42it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0026, train/loss_vlb_step=1.46e-5, train/loss_step=0.0026, global_step=2905.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 292/5971 [03:28<1:07:14,  1.41it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0026, train/loss_vlb_step=1.46e-5, train/loss_step=0.0026, global_step=2905.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 292/5971 [03:28<1:07:14,  1.41it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0481, train/loss_vlb_step=0.000165, train/loss_step=0.0481, global_step=2905.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 293/5971 [03:29<1:07:17,  1.41it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0481, train/loss_vlb_step=0.000165, train/loss_step=0.0481, global_step=2905.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 293/5971 [03:29<1:07:17,  1.41it/s, loss=0.105, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000331, train/loss_step=0.101, global_step=2906.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:   5%|▍         | 294/5971 [03:29<1:07:19,  1.41it/s, loss=0.105, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000331, train/loss_step=0.101, global_step=2906.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 294/5971 [03:29<1:07:19,  1.41it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0974, train/loss_vlb_step=0.00032, train/loss_step=0.0974, global_step=2906.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 295/5971 [03:30<1:07:22,  1.40it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0974, train/loss_vlb_step=0.00032, train/loss_step=0.0974, global_step=2906.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 295/5971 [03:30<1:07:22,  1.40it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00195, train/loss_vlb_step=1.13e-5, train/loss_step=0.00195, global_step=2906.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 296/5971 [03:33<1:07:50,  1.39it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00195, train/loss_vlb_step=1.13e-5, train/loss_step=0.00195, global_step=2906.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 296/5971 [03:33<1:07:50,  1.39it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0382, train/loss_vlb_step=0.000132, train/loss_step=0.0382, global_step=2906.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   5%|▍         | 297/5971 [03:33<1:07:53,  1.39it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0382, train/loss_vlb_step=0.000132, train/loss_step=0.0382, global_step=2906.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 297/5971 [03:33<1:07:53,  1.39it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0171, train/loss_vlb_step=6.69e-5, train/loss_step=0.0171, global_step=2907.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   5%|▍         | 298/5971 [03:34<1:07:55,  1.39it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0171, train/loss_vlb_step=6.69e-5, train/loss_step=0.0171, global_step=2907.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▍         | 298/5971 [03:34<1:07:55,  1.39it/s, loss=0.0925, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000363, train/loss_step=0.110, global_step=2907.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 299/5971 [03:35<1:07:57,  1.39it/s, loss=0.0925, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000363, train/loss_step=0.110, global_step=2907.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 299/5971 [03:35<1:07:57,  1.39it/s, loss=0.11, v_num=0, train/loss_simple_step=0.358, train/loss_vlb_step=0.0022, train/loss_step=0.358, global_step=2907.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:   5%|▌         | 300/5971 [03:37<1:08:24,  1.38it/s, loss=0.11, v_num=0, train/loss_simple_step=0.358, train/loss_vlb_step=0.0022, train/loss_step=0.358, global_step=2907.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 300/5971 [03:37<1:08:24,  1.38it/s, loss=0.107, v_num=0, train/loss_simple_step=0.00802, train/loss_vlb_step=3.94e-5, train/loss_step=0.00802, global_step=2907.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 301/5971 [03:38<1:08:27,  1.38it/s, loss=0.107, v_num=0, train/loss_simple_step=0.00802, train/loss_vlb_step=3.94e-5, train/loss_step=0.00802, global_step=2907.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 301/5971 [03:38<1:08:27,  1.38it/s, loss=0.0764, v_num=0, train/loss_simple_step=0.00985, train/loss_vlb_step=4.55e-5, train/loss_step=0.00985, global_step=2908.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 302/5971 [03:39<1:08:29,  1.38it/s, loss=0.0764, v_num=0, train/loss_simple_step=0.00985, train/loss_vlb_step=4.55e-5, train/loss_step=0.00985, global_step=2908.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 302/5971 [03:39<1:08:29,  1.38it/s, loss=0.104, v_num=0, train/loss_simple_step=0.577, train/loss_vlb_step=0.00478, train/loss_step=0.577, global_step=2908.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:   5%|▌         | 303/5971 [03:40<1:08:31,  1.38it/s, loss=0.104, v_num=0, train/loss_simple_step=0.577, train/loss_vlb_step=0.00478, train/loss_step=0.577, global_step=2908.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 303/5971 [03:40<1:08:31,  1.38it/s, loss=0.116, v_num=0, train/loss_simple_step=0.232, train/loss_vlb_step=0.000964, train/loss_step=0.232, global_step=2908.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 304/5971 [03:42<1:08:58,  1.37it/s, loss=0.116, v_num=0, train/loss_simple_step=0.232, train/loss_vlb_step=0.000964, train/loss_step=0.232, global_step=2908.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 304/5971 [03:42<1:08:58,  1.37it/s, loss=0.125, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.000661, train/loss_step=0.186, global_step=2908.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 305/5971 [03:43<1:09:00,  1.37it/s, loss=0.125, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.000661, train/loss_step=0.186, global_step=2908.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 305/5971 [03:43<1:09:00,  1.37it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0276, train/loss_vlb_step=0.000107, train/loss_step=0.0276, global_step=2909.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 306/5971 [03:44<1:09:02,  1.37it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0276, train/loss_vlb_step=0.000107, train/loss_step=0.0276, global_step=2909.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 306/5971 [03:44<1:09:02,  1.37it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=4.89e-5, train/loss_step=0.0114, global_step=2909.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   5%|▌         | 307/5971 [03:45<1:09:04,  1.37it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=4.89e-5, train/loss_step=0.0114, global_step=2909.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 307/5971 [03:45<1:09:04,  1.37it/s, loss=0.129, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.0017, train/loss_step=0.348, global_step=2909.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:   5%|▌         | 308/5971 [03:47<1:09:29,  1.36it/s, loss=0.129, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.0017, train/loss_step=0.348, global_step=2909.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 308/5971 [03:47<1:09:29,  1.36it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0265, train/loss_vlb_step=0.000103, train/loss_step=0.0265, global_step=2909.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 309/5971 [03:48<1:09:31,  1.36it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0265, train/loss_vlb_step=0.000103, train/loss_step=0.0265, global_step=2909.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 309/5971 [03:48<1:09:31,  1.36it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00753, train/loss_vlb_step=3.55e-5, train/loss_step=0.00753, global_step=2910.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 310/5971 [03:49<1:09:33,  1.36it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00753, train/loss_vlb_step=3.55e-5, train/loss_step=0.00753, global_step=2910.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 310/5971 [03:49<1:09:33,  1.36it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00119, train/loss_vlb_step=7.26e-6, train/loss_step=0.00119, global_step=2910.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   5%|▌         | 311/5971 [03:50<1:09:35,  1.36it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00119, train/loss_vlb_step=7.26e-6, train/loss_step=0.00119, global_step=2910.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 311/5971 [03:50<1:09:35,  1.36it/s, loss=0.121, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000774, train/loss_step=0.209, global_step=2910.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:   5%|▌         | 312/5971 [03:52<1:10:05,  1.35it/s, loss=0.121, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000774, train/loss_step=0.209, global_step=2910.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 312/5971 [03:52<1:10:05,  1.35it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0152, train/loss_vlb_step=6.66e-5, train/loss_step=0.0152, global_step=2910.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 313/5971 [03:53<1:10:07,  1.34it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0152, train/loss_vlb_step=6.66e-5, train/loss_step=0.0152, global_step=2910.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 313/5971 [03:53<1:10:07,  1.34it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0206, train/loss_vlb_step=8.61e-5, train/loss_step=0.0206, global_step=2911.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 314/5971 [03:54<1:10:08,  1.34it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0206, train/loss_vlb_step=8.61e-5, train/loss_step=0.0206, global_step=2911.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 314/5971 [03:54<1:10:09,  1.34it/s, loss=0.115, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000339, train/loss_step=0.103, global_step=2911.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   5%|▌         | 315/5971 [03:55<1:10:10,  1.34it/s, loss=0.115, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000339, train/loss_step=0.103, global_step=2911.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 315/5971 [03:55<1:10:10,  1.34it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0194, train/loss_vlb_step=8.28e-5, train/loss_step=0.0194, global_step=2911.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 316/5971 [03:57<1:10:42,  1.33it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0194, train/loss_vlb_step=8.28e-5, train/loss_step=0.0194, global_step=2911.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 316/5971 [03:57<1:10:42,  1.33it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.07e-5, train/loss_step=0.0112, global_step=2911.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 317/5971 [03:58<1:10:44,  1.33it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.07e-5, train/loss_step=0.0112, global_step=2911.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 317/5971 [03:58<1:10:44,  1.33it/s, loss=0.126, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.000963, train/loss_step=0.237, global_step=2912.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   5%|▌         | 318/5971 [03:59<1:10:46,  1.33it/s, loss=0.126, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.000963, train/loss_step=0.237, global_step=2912.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 318/5971 [03:59<1:10:46,  1.33it/s, loss=0.133, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.00112, train/loss_step=0.247, global_step=2912.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   5%|▌         | 319/5971 [04:00<1:10:47,  1.33it/s, loss=0.133, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.00112, train/loss_step=0.247, global_step=2912.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 319/5971 [04:00<1:10:47,  1.33it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00367, train/loss_vlb_step=2.03e-5, train/loss_step=0.00367, global_step=2912.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 320/5971 [04:02<1:11:10,  1.32it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00367, train/loss_vlb_step=2.03e-5, train/loss_step=0.00367, global_step=2912.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 320/5971 [04:02<1:11:10,  1.32it/s, loss=0.116, v_num=0, train/loss_simple_step=0.022, train/loss_vlb_step=9.22e-5, train/loss_step=0.022, global_step=2912.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:   5%|▌         | 321/5971 [04:03<1:11:13,  1.32it/s, loss=0.116, v_num=0, train/loss_simple_step=0.022, train/loss_vlb_step=9.22e-5, train/loss_step=0.022, global_step=2912.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 321/5971 [04:03<1:11:13,  1.32it/s, loss=0.134, v_num=0, train/loss_simple_step=0.366, train/loss_vlb_step=0.00419, train/loss_step=0.366, global_step=2913.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 322/5971 [04:04<1:11:14,  1.32it/s, loss=0.134, v_num=0, train/loss_simple_step=0.366, train/loss_vlb_step=0.00419, train/loss_step=0.366, global_step=2913.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 322/5971 [04:04<1:11:14,  1.32it/s, loss=0.11, v_num=0, train/loss_simple_step=0.099, train/loss_vlb_step=0.000325, train/loss_step=0.099, global_step=2913.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 323/5971 [04:05<1:11:15,  1.32it/s, loss=0.11, v_num=0, train/loss_simple_step=0.099, train/loss_vlb_step=0.000325, train/loss_step=0.099, global_step=2913.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 323/5971 [04:05<1:11:15,  1.32it/s, loss=0.117, v_num=0, train/loss_simple_step=0.381, train/loss_vlb_step=0.00253, train/loss_step=0.381, global_step=2913.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 324/5971 [04:07<1:11:39,  1.31it/s, loss=0.117, v_num=0, train/loss_simple_step=0.381, train/loss_vlb_step=0.00253, train/loss_step=0.381, global_step=2913.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 324/5971 [04:07<1:11:39,  1.31it/s, loss=0.138, v_num=0, train/loss_simple_step=0.598, train/loss_vlb_step=0.00651, train/loss_step=0.598, global_step=2913.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 325/5971 [04:08<1:11:41,  1.31it/s, loss=0.138, v_num=0, train/loss_simple_step=0.598, train/loss_vlb_step=0.00651, train/loss_step=0.598, global_step=2913.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 325/5971 [04:08<1:11:41,  1.31it/s, loss=0.17, v_num=0, train/loss_simple_step=0.682, train/loss_vlb_step=0.0212, train/loss_step=0.682, global_step=2914.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:   5%|▌         | 326/5971 [04:09<1:11:42,  1.31it/s, loss=0.17, v_num=0, train/loss_simple_step=0.682, train/loss_vlb_step=0.0212, train/loss_step=0.682, global_step=2914.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 326/5971 [04:09<1:11:42,  1.31it/s, loss=0.194, v_num=0, train/loss_simple_step=0.478, train/loss_vlb_step=0.00483, train/loss_step=0.478, global_step=2914.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 327/5971 [04:10<1:11:43,  1.31it/s, loss=0.194, v_num=0, train/loss_simple_step=0.478, train/loss_vlb_step=0.00483, train/loss_step=0.478, global_step=2914.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 327/5971 [04:10<1:11:43,  1.31it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0668, train/loss_vlb_step=0.00023, train/loss_step=0.0668, global_step=2914.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 328/5971 [04:12<1:12:09,  1.30it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0668, train/loss_vlb_step=0.00023, train/loss_step=0.0668, global_step=2914.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   5%|▌         | 328/5971 [04:12<1:12:09,  1.30it/s, loss=0.196, v_num=0, train/loss_simple_step=0.346, train/loss_vlb_step=0.00158, train/loss_step=0.346, global_step=2914.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   6%|▌         | 329/5971 [04:13<1:12:10,  1.30it/s, loss=0.196, v_num=0, train/loss_simple_step=0.346, train/loss_vlb_step=0.00158, train/loss_step=0.346, global_step=2914.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 329/5971 [04:13<1:12:10,  1.30it/s, loss=0.199, v_num=0, train/loss_simple_step=0.0687, train/loss_vlb_step=0.000231, train/loss_step=0.0687, global_step=2915.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 330/5971 [04:14<1:12:12,  1.30it/s, loss=0.199, v_num=0, train/loss_simple_step=0.0687, train/loss_vlb_step=0.000231, train/loss_step=0.0687, global_step=2915.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 330/5971 [04:14<1:12:12,  1.30it/s, loss=0.199, v_num=0, train/loss_simple_step=0.00175, train/loss_vlb_step=1.02e-5, train/loss_step=0.00175, global_step=2915.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 331/5971 [04:15<1:12:13,  1.30it/s, loss=0.199, v_num=0, train/loss_simple_step=0.00175, train/loss_vlb_step=1.02e-5, train/loss_step=0.00175, global_step=2915.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 331/5971 [04:15<1:12:13,  1.30it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0413, train/loss_vlb_step=0.000142, train/loss_step=0.0413, global_step=2915.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:   6%|▌         | 332/5971 [04:17<1:12:37,  1.29it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0413, train/loss_vlb_step=0.000142, train/loss_step=0.0413, global_step=2915.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 332/5971 [04:17<1:12:37,  1.29it/s, loss=0.19, v_num=0, train/loss_simple_step=0.00935, train/loss_vlb_step=4.3e-5, train/loss_step=0.00935, global_step=2915.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 333/5971 [04:18<1:12:39,  1.29it/s, loss=0.19, v_num=0, train/loss_simple_step=0.00935, train/loss_vlb_step=4.3e-5, train/loss_step=0.00935, global_step=2915.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 333/5971 [04:18<1:12:39,  1.29it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0243, train/loss_vlb_step=9.52e-5, train/loss_step=0.0243, global_step=2916.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   6%|▌         | 334/5971 [04:19<1:12:40,  1.29it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0243, train/loss_vlb_step=9.52e-5, train/loss_step=0.0243, global_step=2916.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 334/5971 [04:19<1:12:40,  1.29it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.89e-5, train/loss_step=0.0104, global_step=2916.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 335/5971 [04:20<1:12:41,  1.29it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.89e-5, train/loss_step=0.0104, global_step=2916.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 335/5971 [04:20<1:12:41,  1.29it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0657, train/loss_vlb_step=0.000222, train/loss_step=0.0657, global_step=2916.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 336/5971 [04:22<1:13:02,  1.29it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0657, train/loss_vlb_step=0.000222, train/loss_step=0.0657, global_step=2916.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 336/5971 [04:22<1:13:02,  1.29it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0209, train/loss_vlb_step=8.24e-5, train/loss_step=0.0209, global_step=2916.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   6%|▌         | 337/5971 [04:23<1:13:04,  1.29it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0209, train/loss_vlb_step=8.24e-5, train/loss_step=0.0209, global_step=2916.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 337/5971 [04:23<1:13:04,  1.29it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0416, train/loss_vlb_step=0.000149, train/loss_step=0.0416, global_step=2917.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 338/5971 [04:23<1:13:05,  1.28it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0416, train/loss_vlb_step=0.000149, train/loss_step=0.0416, global_step=2917.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 338/5971 [04:23<1:13:05,  1.28it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00311, train/loss_vlb_step=1.77e-5, train/loss_step=0.00311, global_step=2917.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 339/5971 [04:24<1:13:06,  1.28it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00311, train/loss_vlb_step=1.77e-5, train/loss_step=0.00311, global_step=2917.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 339/5971 [04:24<1:13:06,  1.28it/s, loss=0.181, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00121, train/loss_step=0.296, global_step=2917.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:   6%|▌         | 340/5971 [04:27<1:13:31,  1.28it/s, loss=0.181, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00121, train/loss_step=0.296, global_step=2917.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 340/5971 [04:27<1:13:31,  1.28it/s, loss=0.186, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2917.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   6%|▌         | 341/5971 [04:28<1:13:32,  1.28it/s, loss=0.186, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=2917.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 341/5971 [04:28<1:13:32,  1.28it/s, loss=0.177, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000647, train/loss_step=0.188, global_step=2918.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 342/5971 [04:28<1:13:34,  1.28it/s, loss=0.177, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000647, train/loss_step=0.188, global_step=2918.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 342/5971 [04:28<1:13:34,  1.28it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0134, train/loss_vlb_step=5.63e-5, train/loss_step=0.0134, global_step=2918.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 343/5971 [04:29<1:13:34,  1.27it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0134, train/loss_vlb_step=5.63e-5, train/loss_step=0.0134, global_step=2918.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 343/5971 [04:29<1:13:34,  1.27it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00188, train/loss_vlb_step=1.06e-5, train/loss_step=0.00188, global_step=2918.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 344/5971 [04:32<1:13:56,  1.27it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00188, train/loss_vlb_step=1.06e-5, train/loss_step=0.00188, global_step=2918.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 344/5971 [04:32<1:13:56,  1.27it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00596, train/loss_vlb_step=3.02e-5, train/loss_step=0.00596, global_step=2918.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 345/5971 [04:32<1:13:57,  1.27it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00596, train/loss_vlb_step=3.02e-5, train/loss_step=0.00596, global_step=2918.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 345/5971 [04:32<1:13:57,  1.27it/s, loss=0.0906, v_num=0, train/loss_simple_step=0.00815, train/loss_vlb_step=3.9e-5, train/loss_step=0.00815, global_step=2919.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 346/5971 [04:33<1:13:58,  1.27it/s, loss=0.0906, v_num=0, train/loss_simple_step=0.00815, train/loss_vlb_step=3.9e-5, train/loss_step=0.00815, global_step=2919.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 346/5971 [04:33<1:13:58,  1.27it/s, loss=0.0783, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000944, train/loss_step=0.233, global_step=2919.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:   6%|▌         | 347/5971 [04:34<1:13:59,  1.27it/s, loss=0.0783, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000944, train/loss_step=0.233, global_step=2919.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 347/5971 [04:34<1:13:59,  1.27it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.0158, train/loss_vlb_step=6.9e-5, train/loss_step=0.0158, global_step=2919.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 348/5971 [04:36<1:14:19,  1.26it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.0158, train/loss_vlb_step=6.9e-5, train/loss_step=0.0158, global_step=2919.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 348/5971 [04:36<1:14:19,  1.26it/s, loss=0.0661, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000502, train/loss_step=0.152, global_step=2919.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 349/5971 [04:37<1:14:24,  1.26it/s, loss=0.0661, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000502, train/loss_step=0.152, global_step=2919.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 349/5971 [04:37<1:14:24,  1.26it/s, loss=0.0635, v_num=0, train/loss_simple_step=0.0164, train/loss_vlb_step=6.67e-5, train/loss_step=0.0164, global_step=2920.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 350/5971 [04:38<1:14:24,  1.26it/s, loss=0.0635, v_num=0, train/loss_simple_step=0.0164, train/loss_vlb_step=6.67e-5, train/loss_step=0.0164, global_step=2920.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 350/5971 [04:38<1:14:24,  1.26it/s, loss=0.0636, v_num=0, train/loss_simple_step=0.0029, train/loss_vlb_step=1.65e-5, train/loss_step=0.0029, global_step=2920.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 351/5971 [04:39<1:14:25,  1.26it/s, loss=0.0636, v_num=0, train/loss_simple_step=0.0029, train/loss_vlb_step=1.65e-5, train/loss_step=0.0029, global_step=2920.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 351/5971 [04:39<1:14:25,  1.26it/s, loss=0.068, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000424, train/loss_step=0.129, global_step=2920.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:   6%|▌         | 352/5971 [04:42<1:14:49,  1.25it/s, loss=0.068, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000424, train/loss_step=0.129, global_step=2920.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 352/5971 [04:42<1:14:49,  1.25it/s, loss=0.0816, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.00117, train/loss_step=0.283, global_step=2920.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 353/5971 [04:42<1:14:49,  1.25it/s, loss=0.0816, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.00117, train/loss_step=0.283, global_step=2920.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 353/5971 [04:42<1:14:49,  1.25it/s, loss=0.0825, v_num=0, train/loss_simple_step=0.0414, train/loss_vlb_step=0.000153, train/loss_step=0.0414, global_step=2921.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 354/5971 [04:43<1:14:50,  1.25it/s, loss=0.0825, v_num=0, train/loss_simple_step=0.0414, train/loss_vlb_step=0.000153, train/loss_step=0.0414, global_step=2921.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 354/5971 [04:43<1:14:50,  1.25it/s, loss=0.0829, v_num=0, train/loss_simple_step=0.0184, train/loss_vlb_step=7.68e-5, train/loss_step=0.0184, global_step=2921.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   6%|▌         | 355/5971 [04:44<1:14:50,  1.25it/s, loss=0.0829, v_num=0, train/loss_simple_step=0.0184, train/loss_vlb_step=7.68e-5, train/loss_step=0.0184, global_step=2921.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 355/5971 [04:44<1:14:50,  1.25it/s, loss=0.0829, v_num=0, train/loss_simple_step=0.065, train/loss_vlb_step=0.000229, train/loss_step=0.065, global_step=2921.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   6%|▌         | 356/5971 [04:46<1:15:10,  1.24it/s, loss=0.0829, v_num=0, train/loss_simple_step=0.065, train/loss_vlb_step=0.000229, train/loss_step=0.065, global_step=2921.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 356/5971 [04:46<1:15:10,  1.24it/s, loss=0.0839, v_num=0, train/loss_simple_step=0.0414, train/loss_vlb_step=0.000157, train/loss_step=0.0414, global_step=2921.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 357/5971 [04:47<1:15:10,  1.24it/s, loss=0.0839, v_num=0, train/loss_simple_step=0.0414, train/loss_vlb_step=0.000157, train/loss_step=0.0414, global_step=2921.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 357/5971 [04:47<1:15:10,  1.24it/s, loss=0.0826, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=5.85e-5, train/loss_step=0.0156, global_step=2922.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   6%|▌         | 358/5971 [04:48<1:15:10,  1.24it/s, loss=0.0826, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=5.85e-5, train/loss_step=0.0156, global_step=2922.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 358/5971 [04:48<1:15:10,  1.24it/s, loss=0.0888, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000419, train/loss_step=0.127, global_step=2922.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   6%|▌         | 359/5971 [04:49<1:15:11,  1.24it/s, loss=0.0888, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000419, train/loss_step=0.127, global_step=2922.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 359/5971 [04:49<1:15:11,  1.24it/s, loss=0.0742, v_num=0, train/loss_simple_step=0.00404, train/loss_vlb_step=2.11e-5, train/loss_step=0.00404, global_step=2922.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 360/5971 [04:51<1:15:35,  1.24it/s, loss=0.0742, v_num=0, train/loss_simple_step=0.00404, train/loss_vlb_step=2.11e-5, train/loss_step=0.00404, global_step=2922.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 360/5971 [04:51<1:15:35,  1.24it/s, loss=0.0688, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=6.12e-5, train/loss_step=0.0142, global_step=2922.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:   6%|▌         | 361/5971 [04:52<1:15:35,  1.24it/s, loss=0.0688, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=6.12e-5, train/loss_step=0.0142, global_step=2922.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 361/5971 [04:52<1:15:35,  1.24it/s, loss=0.0596, v_num=0, train/loss_simple_step=0.00305, train/loss_vlb_step=1.67e-5, train/loss_step=0.00305, global_step=2923.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 362/5971 [04:53<1:15:35,  1.24it/s, loss=0.0596, v_num=0, train/loss_simple_step=0.00305, train/loss_vlb_step=1.67e-5, train/loss_step=0.00305, global_step=2923.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 362/5971 [04:53<1:15:36,  1.24it/s, loss=0.059, v_num=0, train/loss_simple_step=0.00223, train/loss_vlb_step=1.34e-5, train/loss_step=0.00223, global_step=2923.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   6%|▌         | 363/5971 [04:54<1:15:36,  1.24it/s, loss=0.059, v_num=0, train/loss_simple_step=0.00223, train/loss_vlb_step=1.34e-5, train/loss_step=0.00223, global_step=2923.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 363/5971 [04:54<1:15:36,  1.24it/s, loss=0.0591, v_num=0, train/loss_simple_step=0.00259, train/loss_vlb_step=1.46e-5, train/loss_step=0.00259, global_step=2923.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 364/5971 [04:56<1:15:55,  1.23it/s, loss=0.0591, v_num=0, train/loss_simple_step=0.00259, train/loss_vlb_step=1.46e-5, train/loss_step=0.00259, global_step=2923.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 364/5971 [04:56<1:15:55,  1.23it/s, loss=0.0603, v_num=0, train/loss_simple_step=0.0311, train/loss_vlb_step=0.000118, train/loss_step=0.0311, global_step=2923.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   6%|▌         | 365/5971 [04:57<1:15:55,  1.23it/s, loss=0.0603, v_num=0, train/loss_simple_step=0.0311, train/loss_vlb_step=0.000118, train/loss_step=0.0311, global_step=2923.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 365/5971 [04:57<1:15:55,  1.23it/s, loss=0.0871, v_num=0, train/loss_simple_step=0.544, train/loss_vlb_step=0.00769, train/loss_step=0.544, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:   6%|▌         | 366/5971 [04:58<1:15:56,  1.23it/s, loss=0.0871, v_num=0, train/loss_simple_step=0.544, train/loss_vlb_step=0.00769, train/loss_step=0.544, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 366/5971 [04:58<1:15:56,  1.23it/s, loss=0.0767, v_num=0, train/loss_simple_step=0.0243, train/loss_vlb_step=9.51e-5, train/loss_step=0.0243, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 367/5971 [04:59<1:15:56,  1.23it/s, loss=0.0767, v_num=0, train/loss_simple_step=0.0243, train/loss_vlb_step=9.51e-5, train/loss_step=0.0243, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 367/5971 [04:59<1:15:56,  1.23it/s, loss=0.0762, v_num=0, train/loss_simple_step=0.00673, train/loss_vlb_step=3.26e-5, train/loss_step=0.00673, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 368/5971 [05:01<1:16:15,  1.22it/s, loss=0.0762, v_num=0, train/loss_simple_step=0.00673, train/loss_vlb_step=3.26e-5, train/loss_step=0.00673, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   6%|▌         | 368/5971 [05:01<1:16:15,  1.22it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:19,  2.08it/s][A
Epoch 5:   6%|▌         | 370/5971 [05:01<1:15:57,  1.23it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   2%|▏         | 4/167 [00:00<00:20,  7.95it/s][A
Epoch 5:   6%|▌         | 373/5971 [05:02<1:15:20,  1.24it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   5%|▍         | 8/167 [00:00<00:11, 14.22it/s][A
Epoch 5:   6%|▋         | 377/5971 [05:02<1:14:31,  1.25it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.77it/s][A
Epoch 5:   6%|▋         | 381/5971 [05:02<1:13:43,  1.26it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   8%|▊         | 14/167 [00:00<00:07, 19.32it/s][A
Epoch 5:   6%|▋         | 385/5971 [05:02<1:12:56,  1.28it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  10%|█         | 17/167 [00:01<00:06, 21.91it/s][A
Epoch 5:   7%|▋         | 389/5971 [05:02<1:12:10,  1.29it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  13%|█▎        | 21/167 [00:01<00:05, 24.46it/s][A

Validating:  14%|█▍        | 24/167 [00:01<00:05, 24.60it/s][A
Epoch 5:   7%|▋         | 393/5971 [05:02<1:11:25,  1.30it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  17%|█▋        | 28/167 [00:01<00:05, 26.48it/s][A
Epoch 5:   7%|▋         | 397/5971 [05:02<1:10:41,  1.31it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  19%|█▊        | 31/167 [00:01<00:05, 26.59it/s][A
Epoch 5:   7%|▋         | 401/5971 [05:03<1:09:58,  1.33it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  20%|██        | 34/167 [00:01<00:05, 25.72it/s][A
Epoch 5:   7%|▋         | 405/5971 [05:03<1:09:16,  1.34it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  22%|██▏       | 37/167 [00:01<00:05, 24.67it/s][A

Validating:  24%|██▍       | 40/167 [00:01<00:05, 24.93it/s][A
Epoch 5:   7%|▋         | 409/5971 [05:03<1:08:35,  1.35it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  26%|██▌       | 43/167 [00:02<00:04, 25.26it/s][A
Epoch 5:   7%|▋         | 413/5971 [05:03<1:07:54,  1.36it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  28%|██▊       | 46/167 [00:02<00:04, 26.47it/s][A
Epoch 5:   7%|▋         | 417/5971 [05:03<1:07:14,  1.38it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  29%|██▉       | 49/167 [00:02<00:04, 26.24it/s][A

Validating:  31%|███       | 52/167 [00:02<00:04, 26.03it/s][A
Epoch 5:   7%|▋         | 421/5971 [05:03<1:06:35,  1.39it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  33%|███▎      | 55/167 [00:02<00:04, 26.54it/s][A
Epoch 5:   7%|▋         | 425/5971 [05:03<1:05:57,  1.40it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  35%|███▍      | 58/167 [00:02<00:04, 26.91it/s][A
Epoch 5:   7%|▋         | 429/5971 [05:04<1:05:19,  1.41it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  37%|███▋      | 61/167 [00:02<00:04, 25.67it/s][A

Validating:  38%|███▊      | 64/167 [00:02<00:04, 24.41it/s][A
Epoch 5:   7%|▋         | 433/5971 [05:04<1:04:42,  1.43it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  41%|████      | 68/167 [00:03<00:03, 26.49it/s][A
Epoch 5:   7%|▋         | 437/5971 [05:04<1:04:06,  1.44it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 26.25it/s][A
Epoch 5:   7%|▋         | 441/5971 [05:04<1:03:30,  1.45it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 27.76it/s][A
Epoch 5:   7%|▋         | 445/5971 [05:04<1:02:55,  1.46it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 27.58it/s][A
Epoch 5:   8%|▊         | 449/5971 [05:04<1:02:20,  1.48it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 27.09it/s][A

Validating:  50%|█████     | 84/167 [00:03<00:03, 25.46it/s][A
Epoch 5:   8%|▊         | 453/5971 [05:05<1:01:47,  1.49it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  53%|█████▎    | 88/167 [00:03<00:02, 26.43it/s][A
Epoch 5:   8%|▊         | 457/5971 [05:05<1:01:13,  1.50it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  54%|█████▍    | 91/167 [00:03<00:02, 27.13it/s][A
Epoch 5:   8%|▊         | 461/5971 [05:05<1:00:41,  1.51it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  56%|█████▋    | 94/167 [00:03<00:02, 27.48it/s][A
Epoch 5:   8%|▊         | 465/5971 [05:05<1:00:08,  1.53it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 26.94it/s][A
Epoch 5:   8%|▊         | 469/5971 [05:05<59:37,  1.54it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  

Validating:  60%|██████    | 101/167 [00:04<00:02, 26.96it/s][A

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 26.15it/s][A
Epoch 5:   8%|▊         | 473/5971 [05:05<59:06,  1.55it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 25.48it/s][A
Epoch 5:   8%|▊         | 477/5971 [05:05<58:36,  1.56it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 25.57it/s][A
Epoch 5:   8%|▊         | 481/5971 [05:06<58:06,  1.57it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  68%|██████▊   | 113/167 [00:04<00:02, 25.87it/s][A

Validating:  69%|██████▉   | 116/167 [00:04<00:01, 26.67it/s][A
Epoch 5:   8%|▊         | 485/5971 [05:06<57:36,  1.59it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  71%|███████▏  | 119/167 [00:04<00:01, 25.70it/s][A
Epoch 5:   8%|▊         | 489/5971 [05:06<57:07,  1.60it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 25.04it/s][A
Epoch 5:   8%|▊         | 493/5971 [05:06<56:39,  1.61it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 25.99it/s][A
Epoch 5:   8%|▊         | 497/5971 [05:06<56:11,  1.62it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 26.81it/s][A

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 27.11it/s][A
Epoch 5:   8%|▊         | 501/5971 [05:06<55:43,  1.64it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  81%|████████  | 135/167 [00:05<00:01, 27.77it/s][A
Epoch 5:   8%|▊         | 505/5971 [05:06<55:16,  1.65it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 26.30it/s]
Epoch 5:   9%|▊         | 509/5971 [05:07<54:49,  1.66it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  84%|████████▍ | 141/167 [00:05<00:00, 26.35it/s]

Validating:  86%|████████▌ | 144/167 [00:05<00:00, 26.32it/s]
Epoch 5:   9%|▊         | 513/5971 [05:07<54:22,  1.67it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 26.62it/s]
Epoch 5:   9%|▊         | 517/5971 [05:07<53:56,  1.68it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 27.48it/s]
Epoch 5:   9%|▊         | 521/5971 [05:07<53:31,  1.70it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 27.58it/s]
Epoch 5:   9%|▉         | 525/5971 [05:07<53:05,  1.71it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 28.44it/s]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 28.35it/s]
Epoch 5:   9%|▉         | 529/5971 [05:07<52:40,  1.72it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 27.57it/s]
Epoch 5:   9%|▉         | 533/5971 [05:07<52:16,  1.73it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 27.18it/s]
Epoch 5:   9%|▉         | 536/5971 [05:08<52:02,  1.74it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Epoch 5:   9%|▉         | 537/5971 [05:09<52:05,  1.74it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.00974, train/loss_vlb_step=4.51e-5, train/loss_step=0.00974, global_step=2924.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   9%|▉         | 537/5971 [05:09<52:05,  1.74it/s, loss=0.0904, v_num=0, train/loss_simple_step=0.443, train/loss_vlb_step=0.00272, train/loss_step=0.443, global_step=2925.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:   9%|▉         | 538/5971 [05:10<52:07,  1.74it/s, loss=0.0938, v_num=0, train/loss_simple_step=0.0705, train/loss_vlb_step=0.00024, train/loss_step=0.0705, global_step=2925.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   9%|▉         | 539/5971 [05:11<52:10,  1.74it/s, loss=0.123, v_num=0, train/loss_simple_step=0.716, train/loss_vlb_step=0.0371, train/loss_step=0.716, global_step=2925.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:   9%|▉         | 540/5971 [05:13<52:29,  1.72it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00378, train/loss_vlb_step=2.02e-5, train/loss_step=0.00378, global_step=2925.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   9%|▉         | 541/5971 [05:14<52:31,  1.72it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00378, train/loss_vlb_step=2.02e-5, train/loss_step=0.00378, global_step=2925.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   9%|▉         | 541/5971 [05:14<52:31,  1.72it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0499, train/loss_vlb_step=0.000176, train/loss_step=0.0499, global_step=2926.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:   9%|▉         | 542/5971 [05:15<52:34,  1.72it/s, loss=0.117, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000522, train/loss_step=0.157, global_step=2926.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   9%|▉         | 543/5971 [05:16<52:36,  1.72it/s, loss=0.119, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000357, train/loss_step=0.109, global_step=2926.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   9%|▉         | 544/5971 [05:18<52:51,  1.71it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00167, train/loss_vlb_step=1.01e-5, train/loss_step=0.00167, global_step=2926.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   9%|▉         | 545/5971 [05:19<52:54,  1.71it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00167, train/loss_vlb_step=1.01e-5, train/loss_step=0.00167, global_step=2926.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   9%|▉         | 545/5971 [05:19<52:54,  1.71it/s, loss=0.133, v_num=0, train/loss_simple_step=0.342, train/loss_vlb_step=0.00168, train/loss_step=0.342, global_step=2927.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:   9%|▉         | 546/5971 [05:20<52:56,  1.71it/s, loss=0.138, v_num=0, train/loss_simple_step=0.221, train/loss_vlb_step=0.000746, train/loss_step=0.221, global_step=2927.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   9%|▉         | 547/5971 [05:21<52:58,  1.71it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0589, train/loss_vlb_step=0.000202, train/loss_step=0.0589, global_step=2927.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   9%|▉         | 548/5971 [05:23<53:13,  1.70it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0533, train/loss_vlb_step=0.000194, train/loss_step=0.0533, global_step=2927.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   9%|▉         | 549/5971 [05:24<53:15,  1.70it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0533, train/loss_vlb_step=0.000194, train/loss_step=0.0533, global_step=2927.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   9%|▉         | 549/5971 [05:24<53:15,  1.70it/s, loss=0.18, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0639, train/loss_step=0.750, global_step=2928.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:   9%|▉         | 550/5971 [05:25<53:17,  1.70it/s, loss=0.18, v_num=0, train/loss_simple_step=0.00206, train/loss_vlb_step=1.17e-5, train/loss_step=0.00206, global_step=2928.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   9%|▉         | 551/5971 [05:25<53:19,  1.69it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.68e-5, train/loss_step=0.0108, global_step=2928.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:   9%|▉         | 552/5971 [05:28<53:34,  1.69it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0183, train/loss_vlb_step=7.11e-5, train/loss_step=0.0183, global_step=2928.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   9%|▉         | 553/5971 [05:28<53:37,  1.68it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0183, train/loss_vlb_step=7.11e-5, train/loss_step=0.0183, global_step=2928.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   9%|▉         | 553/5971 [05:28<53:37,  1.68it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00344, train/loss_vlb_step=1.82e-5, train/loss_step=0.00344, global_step=2929.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   9%|▉         | 554/5971 [05:29<53:39,  1.68it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00371, train/loss_vlb_step=1.94e-5, train/loss_step=0.00371, global_step=2929.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   9%|▉         | 555/5971 [05:30<53:41,  1.68it/s, loss=0.157, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000358, train/loss_step=0.108, global_step=2929.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:   9%|▉         | 556/5971 [05:33<54:01,  1.67it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0236, train/loss_vlb_step=9.48e-5, train/loss_step=0.0236, global_step=2929.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   9%|▉         | 557/5971 [05:34<54:03,  1.67it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0236, train/loss_vlb_step=9.48e-5, train/loss_step=0.0236, global_step=2929.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   9%|▉         | 557/5971 [05:34<54:03,  1.67it/s, loss=0.143, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000579, train/loss_step=0.161, global_step=2930.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   9%|▉         | 558/5971 [05:35<54:05,  1.67it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.69e-5, train/loss_step=0.0142, global_step=2930.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   9%|▉         | 559/5971 [05:36<54:07,  1.67it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.86e-5, train/loss_step=0.0108, global_step=2930.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   9%|▉         | 560/5971 [05:38<54:23,  1.66it/s, loss=0.127, v_num=0, train/loss_simple_step=0.443, train/loss_vlb_step=0.00371, train/loss_step=0.443, global_step=2930.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:   9%|▉         | 561/5971 [05:39<54:26,  1.66it/s, loss=0.127, v_num=0, train/loss_simple_step=0.443, train/loss_vlb_step=0.00371, train/loss_step=0.443, global_step=2930.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   9%|▉         | 561/5971 [05:39<54:26,  1.66it/s, loss=0.148, v_num=0, train/loss_simple_step=0.469, train/loss_vlb_step=0.00334, train/loss_step=0.469, global_step=2931.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   9%|▉         | 562/5971 [05:40<54:28,  1.65it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0571, train/loss_vlb_step=0.000193, train/loss_step=0.0571, global_step=2931.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   9%|▉         | 563/5971 [05:41<54:30,  1.65it/s, loss=0.156, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00298, train/loss_step=0.375, global_step=2931.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:   9%|▉         | 564/5971 [05:43<54:44,  1.65it/s, loss=0.168, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000951, train/loss_step=0.240, global_step=2931.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   9%|▉         | 565/5971 [05:44<54:46,  1.64it/s, loss=0.168, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000951, train/loss_step=0.240, global_step=2931.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   9%|▉         | 565/5971 [05:44<54:46,  1.64it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0875, train/loss_vlb_step=0.000289, train/loss_step=0.0875, global_step=2932.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   9%|▉         | 566/5971 [05:44<54:48,  1.64it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0077, train/loss_vlb_step=3.72e-5, train/loss_step=0.0077, global_step=2932.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:   9%|▉         | 567/5971 [05:45<54:50,  1.64it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0385, train/loss_vlb_step=0.000138, train/loss_step=0.0385, global_step=2932.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|▉         | 568/5971 [05:47<55:04,  1.64it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0793, train/loss_vlb_step=0.000262, train/loss_step=0.0793, global_step=2932.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|▉         | 569/5971 [05:48<55:06,  1.63it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0793, train/loss_vlb_step=0.000262, train/loss_step=0.0793, global_step=2932.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|▉         | 569/5971 [05:48<55:06,  1.63it/s, loss=0.123, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00143, train/loss_step=0.308, global_step=2933.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  10%|▉         | 570/5971 [05:49<55:08,  1.63it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0103, train/loss_vlb_step=4.63e-5, train/loss_step=0.0103, global_step=2933.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|▉         | 571/5971 [05:50<55:10,  1.63it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00304, train/loss_vlb_step=1.67e-5, train/loss_step=0.00304, global_step=2933.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|▉         | 572/5971 [05:52<55:24,  1.62it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00457, train/loss_vlb_step=2.32e-5, train/loss_step=0.00457, global_step=2933.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|▉         | 573/5971 [05:53<55:26,  1.62it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00457, train/loss_vlb_step=2.32e-5, train/loss_step=0.00457, global_step=2933.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|▉         | 573/5971 [05:53<55:26,  1.62it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0148, train/loss_vlb_step=6.32e-5, train/loss_step=0.0148, global_step=2934.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  10%|▉         | 574/5971 [05:54<55:27,  1.62it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00221, train/loss_vlb_step=1.23e-5, train/loss_step=0.00221, global_step=2934.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|▉         | 575/5971 [05:55<55:29,  1.62it/s, loss=0.129, v_num=0, train/loss_simple_step=0.226, train/loss_vlb_step=0.00104, train/loss_step=0.226, global_step=2934.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  10%|▉         | 576/5971 [05:57<55:46,  1.61it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00792, train/loss_vlb_step=3.76e-5, train/loss_step=0.00792, global_step=2934.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|▉         | 577/5971 [05:58<55:49,  1.61it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00792, train/loss_vlb_step=3.76e-5, train/loss_step=0.00792, global_step=2934.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|▉         | 577/5971 [05:58<55:49,  1.61it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00316, train/loss_vlb_step=1.68e-5, train/loss_step=0.00316, global_step=2935.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  10%|▉         | 578/5971 [05:59<55:50,  1.61it/s, loss=0.126, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000467, train/loss_step=0.140, global_step=2935.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  10%|▉         | 579/5971 [06:00<55:52,  1.61it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00398, train/loss_vlb_step=2.09e-5, train/loss_step=0.00398, global_step=2935.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|▉         | 580/5971 [06:02<56:05,  1.60it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0735, train/loss_vlb_step=0.000246, train/loss_step=0.0735, global_step=2935.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  10%|▉         | 581/5971 [06:03<56:07,  1.60it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0735, train/loss_vlb_step=0.000246, train/loss_step=0.0735, global_step=2935.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|▉         | 581/5971 [06:03<56:07,  1.60it/s, loss=0.0855, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=9.98e-5, train/loss_step=0.0285, global_step=2936.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|▉         | 582/5971 [06:04<56:09,  1.60it/s, loss=0.0994, v_num=0, train/loss_simple_step=0.335, train/loss_vlb_step=0.00135, train/loss_step=0.335, global_step=2936.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  10%|▉         | 583/5971 [06:05<56:11,  1.60it/s, loss=0.0851, v_num=0, train/loss_simple_step=0.0896, train/loss_vlb_step=0.0003, train/loss_step=0.0896, global_step=2936.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|▉         | 584/5971 [06:07<56:24,  1.59it/s, loss=0.0977, v_num=0, train/loss_simple_step=0.492, train/loss_vlb_step=0.0047, train/loss_step=0.492, global_step=2936.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  10%|▉         | 585/5971 [06:08<56:26,  1.59it/s, loss=0.0977, v_num=0, train/loss_simple_step=0.492, train/loss_vlb_step=0.0047, train/loss_step=0.492, global_step=2936.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|▉         | 585/5971 [06:08<56:26,  1.59it/s, loss=0.0939, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.7e-5, train/loss_step=0.0108, global_step=2937.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|▉         | 586/5971 [06:09<56:28,  1.59it/s, loss=0.0971, v_num=0, train/loss_simple_step=0.0722, train/loss_vlb_step=0.000243, train/loss_step=0.0722, global_step=2937.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|▉         | 587/5971 [06:10<56:30,  1.59it/s, loss=0.097, v_num=0, train/loss_simple_step=0.0354, train/loss_vlb_step=0.000133, train/loss_step=0.0354, global_step=2937.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  10%|▉         | 588/5971 [06:12<56:43,  1.58it/s, loss=0.118, v_num=0, train/loss_simple_step=0.497, train/loss_vlb_step=0.00395, train/loss_step=0.497, global_step=2937.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  10%|▉         | 589/5971 [06:13<56:45,  1.58it/s, loss=0.118, v_num=0, train/loss_simple_step=0.497, train/loss_vlb_step=0.00395, train/loss_step=0.497, global_step=2937.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|▉         | 589/5971 [06:13<56:45,  1.58it/s, loss=0.122, v_num=0, train/loss_simple_step=0.388, train/loss_vlb_step=0.00315, train/loss_step=0.388, global_step=2938.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|▉         | 590/5971 [06:14<56:46,  1.58it/s, loss=0.123, v_num=0, train/loss_simple_step=0.043, train/loss_vlb_step=0.000157, train/loss_step=0.043, global_step=2938.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|▉         | 591/5971 [06:15<56:48,  1.58it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0219, train/loss_vlb_step=9.26e-5, train/loss_step=0.0219, global_step=2938.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|▉         | 592/5971 [06:17<57:01,  1.57it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0882, train/loss_vlb_step=0.000298, train/loss_step=0.0882, global_step=2938.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|▉         | 593/5971 [06:18<57:03,  1.57it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0882, train/loss_vlb_step=0.000298, train/loss_step=0.0882, global_step=2938.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|▉         | 593/5971 [06:18<57:03,  1.57it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0685, train/loss_vlb_step=0.000228, train/loss_step=0.0685, global_step=2939.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|▉         | 594/5971 [06:18<57:04,  1.57it/s, loss=0.141, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000723, train/loss_step=0.205, global_step=2939.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  10%|▉         | 595/5971 [06:19<57:06,  1.57it/s, loss=0.136, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.00036, train/loss_step=0.110, global_step=2939.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  10%|▉         | 596/5971 [06:22<57:21,  1.56it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0195, train/loss_vlb_step=8.41e-5, train/loss_step=0.0195, global_step=2939.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|▉         | 597/5971 [06:23<57:23,  1.56it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0195, train/loss_vlb_step=8.41e-5, train/loss_step=0.0195, global_step=2939.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|▉         | 597/5971 [06:23<57:23,  1.56it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0726, train/loss_vlb_step=0.000245, train/loss_step=0.0726, global_step=2940.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|█         | 598/5971 [06:24<57:24,  1.56it/s, loss=0.138, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000338, train/loss_step=0.102, global_step=2940.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  10%|█         | 599/5971 [06:24<57:26,  1.56it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0643, train/loss_vlb_step=0.00022, train/loss_step=0.0643, global_step=2940.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|█         | 600/5971 [06:27<57:38,  1.55it/s, loss=0.151, v_num=0, train/loss_simple_step=0.276, train/loss_vlb_step=0.00113, train/loss_step=0.276, global_step=2940.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  10%|█         | 601/5971 [06:27<57:40,  1.55it/s, loss=0.151, v_num=0, train/loss_simple_step=0.276, train/loss_vlb_step=0.00113, train/loss_step=0.276, global_step=2940.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|█         | 601/5971 [06:27<57:40,  1.55it/s, loss=0.173, v_num=0, train/loss_simple_step=0.473, train/loss_vlb_step=0.00314, train/loss_step=0.473, global_step=2941.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|█         | 602/5971 [06:28<57:41,  1.55it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00796, train/loss_vlb_step=3.69e-5, train/loss_step=0.00796, global_step=2941.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|█         | 603/5971 [06:29<57:43,  1.55it/s, loss=0.166, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00108, train/loss_step=0.267, global_step=2941.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  10%|█         | 604/5971 [06:32<58:00,  1.54it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0212, train/loss_vlb_step=7.91e-5, train/loss_step=0.0212, global_step=2941.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|█         | 605/5971 [06:33<58:02,  1.54it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0212, train/loss_vlb_step=7.91e-5, train/loss_step=0.0212, global_step=2941.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|█         | 605/5971 [06:33<58:02,  1.54it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0396, train/loss_vlb_step=0.000145, train/loss_step=0.0396, global_step=2942.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|█         | 606/5971 [06:34<58:03,  1.54it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0474, train/loss_vlb_step=0.00017, train/loss_step=0.0474, global_step=2942.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  10%|█         | 607/5971 [06:35<58:05,  1.54it/s, loss=0.169, v_num=0, train/loss_simple_step=0.565, train/loss_vlb_step=0.0049, train/loss_step=0.565, global_step=2942.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  10%|█         | 608/5971 [06:37<58:18,  1.53it/s, loss=0.155, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000769, train/loss_step=0.218, global_step=2942.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|█         | 609/5971 [06:38<58:20,  1.53it/s, loss=0.155, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000769, train/loss_step=0.218, global_step=2942.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|█         | 609/5971 [06:38<58:20,  1.53it/s, loss=0.145, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.000647, train/loss_step=0.186, global_step=2943.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|█         | 610/5971 [06:39<58:21,  1.53it/s, loss=0.157, v_num=0, train/loss_simple_step=0.295, train/loss_vlb_step=0.00124, train/loss_step=0.295, global_step=2943.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  10%|█         | 611/5971 [06:40<58:23,  1.53it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=4.71e-5, train/loss_step=0.0112, global_step=2943.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|█         | 612/5971 [06:42<58:35,  1.52it/s, loss=0.187, v_num=0, train/loss_simple_step=0.691, train/loss_vlb_step=0.0193, train/loss_step=0.691, global_step=2943.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  10%|█         | 613/5971 [06:43<58:36,  1.52it/s, loss=0.187, v_num=0, train/loss_simple_step=0.691, train/loss_vlb_step=0.0193, train/loss_step=0.691, global_step=2943.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|█         | 613/5971 [06:43<58:36,  1.52it/s, loss=0.189, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000364, train/loss_step=0.110, global_step=2944.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|█         | 614/5971 [06:43<58:38,  1.52it/s, loss=0.191, v_num=0, train/loss_simple_step=0.244, train/loss_vlb_step=0.00102, train/loss_step=0.244, global_step=2944.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  10%|█         | 615/5971 [06:44<58:39,  1.52it/s, loss=0.193, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000514, train/loss_step=0.154, global_step=2944.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|█         | 616/5971 [06:47<58:53,  1.52it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0187, train/loss_vlb_step=7.75e-5, train/loss_step=0.0187, global_step=2944.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|█         | 617/5971 [06:48<58:54,  1.51it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0187, train/loss_vlb_step=7.75e-5, train/loss_step=0.0187, global_step=2944.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|█         | 617/5971 [06:48<58:54,  1.51it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0194, train/loss_vlb_step=7.51e-5, train/loss_step=0.0194, global_step=2945.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|█         | 618/5971 [06:48<58:56,  1.51it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00223, train/loss_vlb_step=1.32e-5, train/loss_step=0.00223, global_step=2945.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|█         | 619/5971 [06:49<58:57,  1.51it/s, loss=0.206, v_num=0, train/loss_simple_step=0.466, train/loss_vlb_step=0.00382, train/loss_step=0.466, global_step=2945.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  10%|█         | 620/5971 [06:51<59:09,  1.51it/s, loss=0.203, v_num=0, train/loss_simple_step=0.221, train/loss_vlb_step=0.000803, train/loss_step=0.221, global_step=2945.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|█         | 621/5971 [06:52<59:10,  1.51it/s, loss=0.203, v_num=0, train/loss_simple_step=0.221, train/loss_vlb_step=0.000803, train/loss_step=0.221, global_step=2945.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|█         | 621/5971 [06:52<59:10,  1.51it/s, loss=0.193, v_num=0, train/loss_simple_step=0.273, train/loss_vlb_step=0.0012, train/loss_step=0.273, global_step=2946.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  10%|█         | 622/5971 [06:53<59:11,  1.51it/s, loss=0.23, v_num=0, train/loss_simple_step=0.757, train/loss_vlb_step=0.0304, train/loss_step=0.757, global_step=2946.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  10%|█         | 623/5971 [06:54<59:12,  1.51it/s, loss=0.222, v_num=0, train/loss_simple_step=0.0992, train/loss_vlb_step=0.000328, train/loss_step=0.0992, global_step=2946.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|█         | 624/5971 [06:56<59:26,  1.50it/s, loss=0.242, v_num=0, train/loss_simple_step=0.417, train/loss_vlb_step=0.00222, train/loss_step=0.417, global_step=2946.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  10%|█         | 625/5971 [06:57<59:27,  1.50it/s, loss=0.242, v_num=0, train/loss_simple_step=0.417, train/loss_vlb_step=0.00222, train/loss_step=0.417, global_step=2946.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|█         | 625/5971 [06:57<59:27,  1.50it/s, loss=0.241, v_num=0, train/loss_simple_step=0.0204, train/loss_vlb_step=8.46e-5, train/loss_step=0.0204, global_step=2947.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  10%|█         | 626/5971 [06:58<59:28,  1.50it/s, loss=0.246, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.000497, train/loss_step=0.146, global_step=2947.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  11%|█         | 627/5971 [06:59<59:30,  1.50it/s, loss=0.218, v_num=0, train/loss_simple_step=0.00623, train/loss_vlb_step=3.09e-5, train/loss_step=0.00623, global_step=2947.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  11%|█         | 628/5971 [07:01<59:41,  1.49it/s, loss=0.21, v_num=0, train/loss_simple_step=0.0726, train/loss_vlb_step=0.000239, train/loss_step=0.0726, global_step=2947.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  11%|█         | 629/5971 [07:02<59:42,  1.49it/s, loss=0.21, v_num=0, train/loss_simple_step=0.0726, train/loss_vlb_step=0.000239, train/loss_step=0.0726, global_step=2947.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  11%|█         | 629/5971 [07:02<59:42,  1.49it/s, loss=0.203, v_num=0, train/loss_simple_step=0.0387, train/loss_vlb_step=0.000143, train/loss_step=0.0387, global_step=2948.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  11%|█         | 630/5971 [07:03<59:43,  1.49it/s, loss=0.203, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.00105, train/loss_step=0.283, global_step=2948.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  11%|█         | 631/5971 [07:04<59:45,  1.49it/s, loss=0.226, v_num=0, train/loss_simple_step=0.489, train/loss_vlb_step=0.0045, train/loss_step=0.489, global_step=2948.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  11%|█         | 632/5971 [07:06<59:58,  1.48it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00163, train/loss_vlb_step=9.47e-6, train/loss_step=0.00163, global_step=2948.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  11%|█         | 633/5971 [07:07<1:00:00,  1.48it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00163, train/loss_vlb_step=9.47e-6, train/loss_step=0.00163, global_step=2948.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  11%|█         | 633/5971 [07:07<1:00:00,  1.48it/s, loss=0.187, v_num=0, train/loss_simple_step=0.00522, train/loss_vlb_step=2.65e-5, train/loss_step=0.00522, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  11%|█         | 634/5971 [07:08<1:00:01,  1.48it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0103, train/loss_vlb_step=4.82e-5, train/loss_step=0.0103, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  11%|█         | 635/5971 [07:09<1:00:02,  1.48it/s, loss=0.181, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.00118, train/loss_step=0.274, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  11%|█         | 636/5971 [07:11<1:00:15,  1.48it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  11%|█         | 637/5971 [07:11<1:00:09,  1.48it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:18,  2.11it/s][A

Validating:   1%|          | 2/167 [00:00<00:42,  3.89it/s][A
Epoch 5:  11%|█         | 641/5971 [07:12<59:49,  1.48it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  

Validating:   3%|▎         | 5/167 [00:00<00:16,  9.82it/s][A

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.93it/s][A
Epoch 5:  11%|█         | 645/5971 [07:12<59:25,  1.49it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   7%|▋         | 11/167 [00:00<00:08, 17.46it/s][A
Epoch 5:  11%|█         | 649/5971 [07:12<59:02,  1.50it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.90it/s][A
Epoch 5:  11%|█         | 653/5971 [07:12<58:39,  1.51it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  10%|█         | 17/167 [00:01<00:06, 21.45it/s][A

Validating:  12%|█▏        | 20/167 [00:01<00:06, 21.39it/s][A
Epoch 5:  11%|█         | 657/5971 [07:12<58:16,  1.52it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 23.22it/s][A
Epoch 5:  11%|█         | 661/5971 [07:13<57:54,  1.53it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 23.91it/s][A
Epoch 5:  11%|█         | 665/5971 [07:13<57:32,  1.54it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 24.26it/s][A

Validating:  19%|█▉        | 32/167 [00:01<00:05, 24.96it/s][A
Epoch 5:  11%|█         | 669/5971 [07:13<57:10,  1.55it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  22%|██▏       | 36/167 [00:01<00:04, 26.55it/s][A
Epoch 5:  11%|█▏        | 673/5971 [07:13<56:48,  1.55it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  23%|██▎       | 39/167 [00:02<00:04, 26.83it/s][A
Epoch 5:  11%|█▏        | 677/5971 [07:13<56:26,  1.56it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  26%|██▌       | 43/167 [00:02<00:04, 27.87it/s][A
Epoch 5:  11%|█▏        | 681/5971 [07:13<56:05,  1.57it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  28%|██▊       | 46/167 [00:02<00:04, 26.28it/s][A
Epoch 5:  11%|█▏        | 685/5971 [07:14<55:44,  1.58it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  29%|██▉       | 49/167 [00:02<00:04, 26.67it/s][A

Validating:  31%|███       | 52/167 [00:02<00:04, 26.89it/s][A
Epoch 5:  12%|█▏        | 689/5971 [07:14<55:23,  1.59it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  33%|███▎      | 55/167 [00:02<00:04, 26.57it/s][A
Epoch 5:  12%|█▏        | 693/5971 [07:14<55:03,  1.60it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  35%|███▍      | 58/167 [00:02<00:04, 26.61it/s][A
Epoch 5:  12%|█▏        | 697/5971 [07:14<54:42,  1.61it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  37%|███▋      | 61/167 [00:02<00:04, 25.22it/s][A
Epoch 5:  12%|█▏        | 701/5971 [07:14<54:22,  1.62it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  39%|███▉      | 65/167 [00:02<00:03, 26.90it/s][A

Validating:  41%|████      | 68/167 [00:03<00:03, 27.09it/s][A
Epoch 5:  12%|█▏        | 705/5971 [07:14<54:02,  1.62it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 26.52it/s][A
Epoch 5:  12%|█▏        | 709/5971 [07:14<53:43,  1.63it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 27.53it/s][A
Epoch 5:  12%|█▏        | 713/5971 [07:15<53:23,  1.64it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 27.31it/s][A
Epoch 5:  12%|█▏        | 717/5971 [07:15<53:04,  1.65it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 25.73it/s][A
Epoch 5:  12%|█▏        | 721/5971 [07:15<52:45,  1.66it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  51%|█████     | 85/167 [00:03<00:03, 27.08it/s][A

Validating:  53%|█████▎    | 88/167 [00:03<00:03, 25.32it/s][A
Epoch 5:  12%|█▏        | 725/5971 [07:15<52:27,  1.67it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  54%|█████▍    | 91/167 [00:03<00:03, 24.84it/s][A
Epoch 5:  12%|█▏        | 729/5971 [07:15<52:08,  1.68it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 24.99it/s][A
Epoch 5:  12%|█▏        | 733/5971 [07:15<51:50,  1.68it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 25.89it/s][A

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 26.87it/s][A
Epoch 5:  12%|█▏        | 737/5971 [07:15<51:32,  1.69it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 27.85it/s][A
Epoch 5:  12%|█▏        | 741/5971 [07:16<51:14,  1.70it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 27.95it/s][A
Epoch 5:  12%|█▏        | 745/5971 [07:16<50:56,  1.71it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  66%|██████▋   | 111/167 [00:04<00:01, 28.74it/s][A
Epoch 5:  13%|█▎        | 749/5971 [07:16<50:38,  1.72it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  68%|██████▊   | 114/167 [00:04<00:01, 28.65it/s][A
Epoch 5:  13%|█▎        | 753/5971 [07:16<50:21,  1.73it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  70%|███████   | 117/167 [00:04<00:01, 27.61it/s][A

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 27.95it/s][A
Epoch 5:  13%|█▎        | 757/5971 [07:16<50:03,  1.74it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 27.38it/s][A
Epoch 5:  13%|█▎        | 761/5971 [07:16<49:46,  1.74it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 26.63it/s][A
Epoch 5:  13%|█▎        | 765/5971 [07:17<49:30,  1.75it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 25.40it/s][A

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 25.68it/s][A
Epoch 5:  13%|█▎        | 769/5971 [07:17<49:13,  1.76it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  81%|████████  | 135/167 [00:05<00:01, 25.13it/s][A
Epoch 5:  13%|█▎        | 773/5971 [07:17<48:57,  1.77it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 26.36it/s][A
Epoch 5:  13%|█▎        | 777/5971 [07:17<48:40,  1.78it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  84%|████████▍ | 141/167 [00:05<00:01, 25.97it/s][A
Epoch 5:  13%|█▎        | 781/5971 [07:17<48:24,  1.79it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  87%|████████▋ | 145/167 [00:05<00:00, 27.67it/s][A

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 26.51it/s][A
Epoch 5:  13%|█▎        | 785/5971 [07:17<48:08,  1.80it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 26.42it/s][A
Epoch 5:  13%|█▎        | 789/5971 [07:17<47:52,  1.80it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 26.68it/s][A
Epoch 5:  13%|█▎        | 793/5971 [07:18<47:36,  1.81it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 26.82it/s][A

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 27.41it/s][A
Epoch 5:  13%|█▎        | 797/5971 [07:18<47:21,  1.82it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 27.24it/s][A
Epoch 5:  13%|█▎        | 801/5971 [07:18<47:05,  1.83it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 27.24it/s][A
Epoch 5:  13%|█▎        | 804/5971 [07:18<46:56,  1.83it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Epoch 5:  13%|█▎        | 805/5971 [07:19<46:58,  1.83it/s, loss=0.19, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000764, train/loss_step=0.200, global_step=2949.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  13%|█▎        | 805/5971 [07:19<46:58,  1.83it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0044, train/loss_vlb_step=2.31e-5, train/loss_step=0.0044, global_step=2950.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  13%|█▎        | 806/5971 [07:20<46:59,  1.83it/s, loss=0.19, v_num=0, train/loss_simple_step=0.020, train/loss_vlb_step=8.28e-5, train/loss_step=0.020, global_step=2950.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  14%|█▎        | 807/5971 [07:21<47:01,  1.83it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0568, train/loss_vlb_step=0.000194, train/loss_step=0.0568, global_step=2950.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▎        | 808/5971 [07:23<47:12,  1.82it/s, loss=0.185, v_num=0, train/loss_simple_step=0.533, train/loss_vlb_step=0.00416, train/loss_step=0.533, global_step=2950.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  14%|█▎        | 809/5971 [07:24<47:14,  1.82it/s, loss=0.185, v_num=0, train/loss_simple_step=0.533, train/loss_vlb_step=0.00416, train/loss_step=0.533, global_step=2950.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▎        | 809/5971 [07:24<47:14,  1.82it/s, loss=0.179, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000482, train/loss_step=0.145, global_step=2951.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▎        | 810/5971 [07:25<47:15,  1.82it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0067, train/loss_vlb_step=3.41e-5, train/loss_step=0.0067, global_step=2951.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▎        | 811/5971 [07:26<47:17,  1.82it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00332, train/loss_vlb_step=1.83e-5, train/loss_step=0.00332, global_step=2951.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▎        | 812/5971 [07:28<47:27,  1.81it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0823, train/loss_vlb_step=0.000273, train/loss_step=0.0823, global_step=2951.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  14%|█▎        | 813/5971 [07:29<47:29,  1.81it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0823, train/loss_vlb_step=0.000273, train/loss_step=0.0823, global_step=2951.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▎        | 813/5971 [07:29<47:29,  1.81it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=7.47e-5, train/loss_step=0.0198, global_step=2952.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  14%|█▎        | 814/5971 [07:30<47:30,  1.81it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0859, train/loss_vlb_step=0.000286, train/loss_step=0.0859, global_step=2952.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▎        | 815/5971 [07:31<47:32,  1.81it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00186, train/loss_vlb_step=1.06e-5, train/loss_step=0.00186, global_step=2952.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▎        | 816/5971 [07:33<47:41,  1.80it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00217, train/loss_vlb_step=1.28e-5, train/loss_step=0.00217, global_step=2952.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▎        | 817/5971 [07:34<47:43,  1.80it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00217, train/loss_vlb_step=1.28e-5, train/loss_step=0.00217, global_step=2952.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▎        | 817/5971 [07:34<47:43,  1.80it/s, loss=0.135, v_num=0, train/loss_simple_step=0.484, train/loss_vlb_step=0.00389, train/loss_step=0.484, global_step=2953.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  14%|█▎        | 818/5971 [07:35<47:45,  1.80it/s, loss=0.135, v_num=0, train/loss_simple_step=0.286, train/loss_vlb_step=0.00145, train/loss_step=0.286, global_step=2953.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▎        | 819/5971 [07:36<47:46,  1.80it/s, loss=0.125, v_num=0, train/loss_simple_step=0.270, train/loss_vlb_step=0.00119, train/loss_step=0.270, global_step=2953.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▎        | 820/5971 [07:38<47:57,  1.79it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0179, train/loss_vlb_step=7.51e-5, train/loss_step=0.0179, global_step=2953.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▎        | 821/5971 [07:39<47:59,  1.79it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0179, train/loss_vlb_step=7.51e-5, train/loss_step=0.0179, global_step=2953.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▎        | 821/5971 [07:39<47:59,  1.79it/s, loss=0.146, v_num=0, train/loss_simple_step=0.425, train/loss_vlb_step=0.00615, train/loss_step=0.425, global_step=2954.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  14%|█▍        | 822/5971 [07:40<48:00,  1.79it/s, loss=0.154, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000512, train/loss_step=0.154, global_step=2954.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 823/5971 [07:41<48:02,  1.79it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0646, train/loss_vlb_step=0.000222, train/loss_step=0.0646, global_step=2954.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 824/5971 [07:43<48:11,  1.78it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0075, train/loss_vlb_step=3.54e-5, train/loss_step=0.0075, global_step=2954.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  14%|█▍        | 825/5971 [07:44<48:13,  1.78it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0075, train/loss_vlb_step=3.54e-5, train/loss_step=0.0075, global_step=2954.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 825/5971 [07:44<48:13,  1.78it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0657, train/loss_vlb_step=0.000236, train/loss_step=0.0657, global_step=2955.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 826/5971 [07:45<48:14,  1.78it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0349, train/loss_vlb_step=0.000129, train/loss_step=0.0349, global_step=2955.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 827/5971 [07:46<48:16,  1.78it/s, loss=0.145, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000827, train/loss_step=0.203, global_step=2955.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  14%|█▍        | 828/5971 [07:48<48:25,  1.77it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=5.32e-5, train/loss_step=0.0118, global_step=2955.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 829/5971 [07:49<48:27,  1.77it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=5.32e-5, train/loss_step=0.0118, global_step=2955.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 829/5971 [07:49<48:27,  1.77it/s, loss=0.123, v_num=0, train/loss_simple_step=0.244, train/loss_vlb_step=0.000931, train/loss_step=0.244, global_step=2956.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  14%|█▍        | 830/5971 [07:50<48:28,  1.77it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00537, train/loss_vlb_step=2.74e-5, train/loss_step=0.00537, global_step=2956.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 831/5971 [07:51<48:30,  1.77it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00463, train/loss_vlb_step=2.44e-5, train/loss_step=0.00463, global_step=2956.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 832/5971 [07:53<48:39,  1.76it/s, loss=0.131, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.000881, train/loss_step=0.236, global_step=2956.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  14%|█▍        | 833/5971 [07:54<48:40,  1.76it/s, loss=0.131, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.000881, train/loss_step=0.236, global_step=2956.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 833/5971 [07:54<48:40,  1.76it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00918, train/loss_vlb_step=4.25e-5, train/loss_step=0.00918, global_step=2957.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 834/5971 [07:54<48:42,  1.76it/s, loss=0.144, v_num=0, train/loss_simple_step=0.361, train/loss_vlb_step=0.00158, train/loss_step=0.361, global_step=2957.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  14%|█▍        | 835/5971 [07:55<48:43,  1.76it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00252, train/loss_vlb_step=1.48e-5, train/loss_step=0.00252, global_step=2957.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 836/5971 [07:58<48:52,  1.75it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00136, train/loss_vlb_step=7.56e-6, train/loss_step=0.00136, global_step=2957.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 837/5971 [07:58<48:53,  1.75it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00136, train/loss_vlb_step=7.56e-6, train/loss_step=0.00136, global_step=2957.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 837/5971 [07:58<48:53,  1.75it/s, loss=0.135, v_num=0, train/loss_simple_step=0.301, train/loss_vlb_step=0.00148, train/loss_step=0.301, global_step=2958.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  14%|█▍        | 838/5971 [07:59<48:55,  1.75it/s, loss=0.139, v_num=0, train/loss_simple_step=0.370, train/loss_vlb_step=0.0017, train/loss_step=0.370, global_step=2958.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  14%|█▍        | 839/5971 [08:00<48:56,  1.75it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00155, train/loss_vlb_step=9.24e-6, train/loss_step=0.00155, global_step=2958.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 840/5971 [08:03<49:11,  1.74it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0268, train/loss_vlb_step=9.61e-5, train/loss_step=0.0268, global_step=2958.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  14%|█▍        | 841/5971 [08:04<49:13,  1.74it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0268, train/loss_vlb_step=9.61e-5, train/loss_step=0.0268, global_step=2958.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 841/5971 [08:04<49:13,  1.74it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0536, train/loss_vlb_step=0.000185, train/loss_step=0.0536, global_step=2959.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 842/5971 [08:05<49:14,  1.74it/s, loss=0.109, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000609, train/loss_step=0.181, global_step=2959.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  14%|█▍        | 843/5971 [08:06<49:15,  1.73it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0429, train/loss_vlb_step=0.000152, train/loss_step=0.0429, global_step=2959.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 844/5971 [08:08<49:24,  1.73it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0236, train/loss_vlb_step=9.32e-5, train/loss_step=0.0236, global_step=2959.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  14%|█▍        | 845/5971 [08:09<49:26,  1.73it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0236, train/loss_vlb_step=9.32e-5, train/loss_step=0.0236, global_step=2959.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 845/5971 [08:09<49:26,  1.73it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0425, train/loss_vlb_step=0.000143, train/loss_step=0.0425, global_step=2960.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 846/5971 [08:10<49:27,  1.73it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=6.27e-5, train/loss_step=0.0144, global_step=2960.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  14%|█▍        | 847/5971 [08:11<49:28,  1.73it/s, loss=0.0968, v_num=0, train/loss_simple_step=0.00374, train/loss_vlb_step=2.05e-5, train/loss_step=0.00374, global_step=2960.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 848/5971 [08:13<49:37,  1.72it/s, loss=0.0969, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.82e-5, train/loss_step=0.0132, global_step=2960.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  14%|█▍        | 849/5971 [08:14<49:39,  1.72it/s, loss=0.0969, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.82e-5, train/loss_step=0.0132, global_step=2960.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 849/5971 [08:14<49:39,  1.72it/s, loss=0.0848, v_num=0, train/loss_simple_step=0.00269, train/loss_vlb_step=1.43e-5, train/loss_step=0.00269, global_step=2961.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 850/5971 [08:15<49:40,  1.72it/s, loss=0.101, v_num=0, train/loss_simple_step=0.321, train/loss_vlb_step=0.00157, train/loss_step=0.321, global_step=2961.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:  14%|█▍        | 851/5971 [08:16<49:41,  1.72it/s, loss=0.1, v_num=0, train/loss_simple_step=0.00231, train/loss_vlb_step=1.27e-5, train/loss_step=0.00231, global_step=2961.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 852/5971 [08:18<49:52,  1.71it/s, loss=0.104, v_num=0, train/loss_simple_step=0.314, train/loss_vlb_step=0.00165, train/loss_step=0.314, global_step=2961.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  14%|█▍        | 853/5971 [08:19<49:53,  1.71it/s, loss=0.104, v_num=0, train/loss_simple_step=0.314, train/loss_vlb_step=0.00165, train/loss_step=0.314, global_step=2961.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 853/5971 [08:19<49:53,  1.71it/s, loss=0.125, v_num=0, train/loss_simple_step=0.420, train/loss_vlb_step=0.00256, train/loss_step=0.420, global_step=2962.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 854/5971 [08:20<49:54,  1.71it/s, loss=0.125, v_num=0, train/loss_simple_step=0.356, train/loss_vlb_step=0.00238, train/loss_step=0.356, global_step=2962.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 855/5971 [08:21<49:55,  1.71it/s, loss=0.138, v_num=0, train/loss_simple_step=0.259, train/loss_vlb_step=0.000985, train/loss_step=0.259, global_step=2962.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 856/5971 [08:23<50:04,  1.70it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00263, train/loss_vlb_step=1.47e-5, train/loss_step=0.00263, global_step=2962.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 857/5971 [08:24<50:05,  1.70it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00263, train/loss_vlb_step=1.47e-5, train/loss_step=0.00263, global_step=2962.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 857/5971 [08:24<50:05,  1.70it/s, loss=0.135, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000956, train/loss_step=0.240, global_step=2963.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  14%|█▍        | 858/5971 [08:25<50:06,  1.70it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00214, train/loss_vlb_step=1.24e-5, train/loss_step=0.00214, global_step=2963.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 859/5971 [08:26<50:07,  1.70it/s, loss=0.127, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000808, train/loss_step=0.209, global_step=2963.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  14%|█▍        | 860/5971 [08:28<50:18,  1.69it/s, loss=0.132, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000442, train/loss_step=0.131, global_step=2963.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 861/5971 [08:29<50:19,  1.69it/s, loss=0.132, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000442, train/loss_step=0.131, global_step=2963.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 861/5971 [08:29<50:19,  1.69it/s, loss=0.144, v_num=0, train/loss_simple_step=0.293, train/loss_vlb_step=0.00116, train/loss_step=0.293, global_step=2964.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  14%|█▍        | 862/5971 [08:30<50:20,  1.69it/s, loss=0.137, v_num=0, train/loss_simple_step=0.051, train/loss_vlb_step=0.000177, train/loss_step=0.051, global_step=2964.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 863/5971 [08:31<50:21,  1.69it/s, loss=0.136, v_num=0, train/loss_simple_step=0.025, train/loss_vlb_step=0.0001, train/loss_step=0.025, global_step=2964.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  14%|█▍        | 864/5971 [08:33<50:30,  1.69it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0418, train/loss_vlb_step=0.000156, train/loss_step=0.0418, global_step=2964.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 865/5971 [08:34<50:32,  1.68it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0418, train/loss_vlb_step=0.000156, train/loss_step=0.0418, global_step=2964.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  14%|█▍        | 865/5971 [08:34<50:32,  1.68it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00302, train/loss_vlb_step=1.64e-5, train/loss_step=0.00302, global_step=2965.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▍        | 866/5971 [08:35<50:33,  1.68it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000156, train/loss_step=0.0453, global_step=2965.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  15%|█▍        | 867/5971 [08:36<50:34,  1.68it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0167, train/loss_vlb_step=6.92e-5, train/loss_step=0.0167, global_step=2965.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  15%|█▍        | 868/5971 [08:38<50:43,  1.68it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0203, train/loss_vlb_step=8.11e-5, train/loss_step=0.0203, global_step=2965.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▍        | 869/5971 [08:39<50:44,  1.68it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0203, train/loss_vlb_step=8.11e-5, train/loss_step=0.0203, global_step=2965.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▍        | 869/5971 [08:39<50:44,  1.68it/s, loss=0.162, v_num=0, train/loss_simple_step=0.496, train/loss_vlb_step=0.0044, train/loss_step=0.496, global_step=2966.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  15%|█▍        | 870/5971 [08:40<50:45,  1.67it/s, loss=0.171, v_num=0, train/loss_simple_step=0.492, train/loss_vlb_step=0.00312, train/loss_step=0.492, global_step=2966.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▍        | 871/5971 [08:40<50:46,  1.67it/s, loss=0.178, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000462, train/loss_step=0.138, global_step=2966.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▍        | 872/5971 [08:42<50:54,  1.67it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00326, train/loss_vlb_step=1.77e-5, train/loss_step=0.00326, global_step=2966.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▍        | 873/5971 [08:43<50:55,  1.67it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00326, train/loss_vlb_step=1.77e-5, train/loss_step=0.00326, global_step=2966.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▍        | 873/5971 [08:43<50:55,  1.67it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0509, train/loss_vlb_step=0.000178, train/loss_step=0.0509, global_step=2967.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  15%|█▍        | 874/5971 [08:44<50:56,  1.67it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0528, train/loss_vlb_step=0.000185, train/loss_step=0.0528, global_step=2967.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▍        | 875/5971 [08:45<50:58,  1.67it/s, loss=0.136, v_num=0, train/loss_simple_step=0.398, train/loss_vlb_step=0.00323, train/loss_step=0.398, global_step=2967.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  15%|█▍        | 876/5971 [08:47<51:06,  1.66it/s, loss=0.146, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.000859, train/loss_step=0.210, global_step=2967.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▍        | 877/5971 [08:48<51:07,  1.66it/s, loss=0.146, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.000859, train/loss_step=0.210, global_step=2967.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▍        | 877/5971 [08:48<51:07,  1.66it/s, loss=0.135, v_num=0, train/loss_simple_step=0.020, train/loss_vlb_step=8.26e-5, train/loss_step=0.020, global_step=2968.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  15%|█▍        | 878/5971 [08:49<51:08,  1.66it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0338, train/loss_vlb_step=0.000124, train/loss_step=0.0338, global_step=2968.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▍        | 879/5971 [08:50<51:09,  1.66it/s, loss=0.147, v_num=0, train/loss_simple_step=0.411, train/loss_vlb_step=0.00252, train/loss_step=0.411, global_step=2968.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  15%|█▍        | 880/5971 [08:52<51:19,  1.65it/s, loss=0.145, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000348, train/loss_step=0.104, global_step=2968.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▍        | 881/5971 [08:53<51:20,  1.65it/s, loss=0.145, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000348, train/loss_step=0.104, global_step=2968.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▍        | 881/5971 [08:53<51:20,  1.65it/s, loss=0.143, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.00106, train/loss_step=0.249, global_step=2969.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  15%|█▍        | 882/5971 [08:54<51:21,  1.65it/s, loss=0.146, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000333, train/loss_step=0.101, global_step=2969.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▍        | 883/5971 [08:55<51:22,  1.65it/s, loss=0.152, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000504, train/loss_step=0.149, global_step=2969.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▍        | 884/5971 [08:57<51:30,  1.65it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0541, train/loss_vlb_step=0.000188, train/loss_step=0.0541, global_step=2969.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▍        | 885/5971 [08:58<51:31,  1.65it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0541, train/loss_vlb_step=0.000188, train/loss_step=0.0541, global_step=2969.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▍        | 885/5971 [08:58<51:31,  1.65it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.02e-5, train/loss_step=0.0115, global_step=2970.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  15%|█▍        | 886/5971 [08:59<51:32,  1.64it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0899, train/loss_vlb_step=0.000296, train/loss_step=0.0899, global_step=2970.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▍        | 887/5971 [09:00<51:33,  1.64it/s, loss=0.181, v_num=0, train/loss_simple_step=0.530, train/loss_vlb_step=0.00734, train/loss_step=0.530, global_step=2970.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  15%|█▍        | 888/5971 [09:02<51:42,  1.64it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0023, train/loss_vlb_step=1.34e-5, train/loss_step=0.0023, global_step=2970.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▍        | 889/5971 [09:03<51:43,  1.64it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0023, train/loss_vlb_step=1.34e-5, train/loss_step=0.0023, global_step=2970.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▍        | 889/5971 [09:03<51:43,  1.64it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0235, train/loss_vlb_step=9.49e-5, train/loss_step=0.0235, global_step=2971.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▍        | 890/5971 [09:04<51:44,  1.64it/s, loss=0.143, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.000931, train/loss_step=0.236, global_step=2971.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  15%|█▍        | 891/5971 [09:05<51:45,  1.64it/s, loss=0.147, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000731, train/loss_step=0.200, global_step=2971.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▍        | 892/5971 [09:07<51:55,  1.63it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0306, train/loss_vlb_step=0.000117, train/loss_step=0.0306, global_step=2971.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▍        | 893/5971 [09:08<51:56,  1.63it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0306, train/loss_vlb_step=0.000117, train/loss_step=0.0306, global_step=2971.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▍        | 893/5971 [09:08<51:56,  1.63it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00695, train/loss_vlb_step=3.46e-5, train/loss_step=0.00695, global_step=2972.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▍        | 894/5971 [09:09<51:58,  1.63it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0522, train/loss_vlb_step=0.000187, train/loss_step=0.0522, global_step=2972.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  15%|█▍        | 895/5971 [09:10<51:59,  1.63it/s, loss=0.135, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.00063, train/loss_step=0.189, global_step=2972.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  15%|█▌        | 896/5971 [09:12<52:07,  1.62it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0949, train/loss_vlb_step=0.000313, train/loss_step=0.0949, global_step=2972.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▌        | 897/5971 [09:13<52:08,  1.62it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0949, train/loss_vlb_step=0.000313, train/loss_step=0.0949, global_step=2972.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▌        | 897/5971 [09:13<52:08,  1.62it/s, loss=0.159, v_num=0, train/loss_simple_step=0.608, train/loss_vlb_step=0.0127, train/loss_step=0.608, global_step=2973.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  15%|█▌        | 898/5971 [09:14<52:08,  1.62it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0913, train/loss_vlb_step=0.0003, train/loss_step=0.0913, global_step=2973.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▌        | 899/5971 [09:15<52:09,  1.62it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00607, train/loss_vlb_step=3.08e-5, train/loss_step=0.00607, global_step=2973.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▌        | 900/5971 [09:17<52:18,  1.62it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00752, train/loss_vlb_step=3.55e-5, train/loss_step=0.00752, global_step=2973.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▌        | 901/5971 [09:18<52:19,  1.61it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00752, train/loss_vlb_step=3.55e-5, train/loss_step=0.00752, global_step=2973.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▌        | 901/5971 [09:18<52:19,  1.61it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0254, train/loss_vlb_step=9.89e-5, train/loss_step=0.0254, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  15%|█▌        | 902/5971 [09:19<52:20,  1.61it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00478, train/loss_vlb_step=2.52e-5, train/loss_step=0.00478, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  15%|█▌        | 903/5971 [09:20<52:21,  1.61it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0211, train/loss_vlb_step=8.57e-5, train/loss_step=0.0211, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  15%|█▌        | 904/5971 [09:22<52:29,  1.61it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  15%|█▌        | 905/5971 [09:22<52:25,  1.61it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:19,  2.08it/s][A

Validating:   2%|▏         | 3/167 [00:00<00:28,  5.85it/s][A
Epoch 5:  15%|█▌        | 909/5971 [09:23<52:13,  1.62it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   4%|▎         | 6/167 [00:00<00:14, 11.15it/s][A
Epoch 5:  15%|█▌        | 913/5971 [09:23<51:57,  1.62it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   5%|▌         | 9/167 [00:00<00:10, 15.18it/s][A

Validating:   7%|▋         | 12/167 [00:00<00:08, 18.39it/s][A
Epoch 5:  15%|█▌        | 917/5971 [09:23<51:42,  1.63it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   9%|▉         | 15/167 [00:01<00:07, 20.08it/s][A
Epoch 5:  15%|█▌        | 921/5971 [09:23<51:27,  1.64it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  11%|█         | 18/167 [00:01<00:06, 21.75it/s][A
Epoch 5:  15%|█▌        | 925/5971 [09:23<51:12,  1.64it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 22.95it/s][A

Validating:  14%|█▍        | 24/167 [00:01<00:06, 23.43it/s][A
Epoch 5:  16%|█▌        | 929/5971 [09:24<50:57,  1.65it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 24.26it/s][A
Epoch 5:  16%|█▌        | 933/5971 [09:24<50:43,  1.66it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  19%|█▊        | 31/167 [00:01<00:05, 26.40it/s][A
Epoch 5:  16%|█▌        | 937/5971 [09:24<50:28,  1.66it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  20%|██        | 34/167 [00:01<00:04, 26.68it/s][A
Epoch 5:  16%|█▌        | 941/5971 [09:24<50:13,  1.67it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  22%|██▏       | 37/167 [00:01<00:05, 25.39it/s][A

Validating:  24%|██▍       | 40/167 [00:02<00:05, 25.27it/s][A
Epoch 5:  16%|█▌        | 945/5971 [09:24<49:59,  1.68it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  26%|██▌       | 43/167 [00:02<00:04, 25.42it/s][A
Epoch 5:  16%|█▌        | 949/5971 [09:24<49:45,  1.68it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  28%|██▊       | 46/167 [00:02<00:04, 25.91it/s][A
Epoch 5:  16%|█▌        | 953/5971 [09:24<49:31,  1.69it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  29%|██▉       | 49/167 [00:02<00:04, 26.82it/s][A

Validating:  31%|███       | 52/167 [00:02<00:04, 26.07it/s][A
Epoch 5:  16%|█▌        | 957/5971 [09:25<49:17,  1.70it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  33%|███▎      | 55/167 [00:02<00:04, 26.23it/s][A
Epoch 5:  16%|█▌        | 961/5971 [09:25<49:03,  1.70it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  35%|███▍      | 58/167 [00:02<00:04, 26.28it/s][A
Epoch 5:  16%|█▌        | 965/5971 [09:25<48:49,  1.71it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  37%|███▋      | 61/167 [00:02<00:04, 25.50it/s][A

Validating:  38%|███▊      | 64/167 [00:02<00:03, 25.91it/s][A
Epoch 5:  16%|█▌        | 969/5971 [09:25<48:36,  1.72it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  40%|████      | 67/167 [00:03<00:03, 26.46it/s][A
Epoch 5:  16%|█▋        | 973/5971 [09:25<48:22,  1.72it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 26.19it/s][A
Epoch 5:  16%|█▋        | 977/5971 [09:25<48:09,  1.73it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 25.06it/s][A

Validating:  46%|████▌     | 76/167 [00:03<00:03, 25.40it/s][A
Epoch 5:  16%|█▋        | 981/5971 [09:25<47:56,  1.73it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 24.35it/s][A
Epoch 5:  16%|█▋        | 985/5971 [09:26<47:43,  1.74it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  49%|████▉     | 82/167 [00:03<00:03, 24.04it/s][A
Epoch 5:  17%|█▋        | 989/5971 [09:26<47:30,  1.75it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  51%|█████     | 85/167 [00:03<00:03, 23.21it/s][A

Validating:  53%|█████▎    | 88/167 [00:03<00:03, 23.56it/s][A
Epoch 5:  17%|█▋        | 993/5971 [09:26<47:17,  1.75it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  54%|█████▍    | 91/167 [00:04<00:03, 24.20it/s][A
Epoch 5:  17%|█▋        | 997/5971 [09:26<47:04,  1.76it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 24.62it/s][A
Epoch 5:  17%|█▋        | 1001/5971 [09:26<46:51,  1.77it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 24.38it/s][A
Epoch 5:  17%|█▋        | 1005/5971 [09:26<46:38,  1.77it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  60%|██████    | 101/167 [00:04<00:02, 26.16it/s][A

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 25.14it/s][A
Epoch 5:  17%|█▋        | 1009/5971 [09:27<46:26,  1.78it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 25.63it/s][A
Epoch 5:  17%|█▋        | 1013/5971 [09:27<46:13,  1.79it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 26.44it/s][A
Epoch 5:  17%|█▋        | 1017/5971 [09:27<46:01,  1.79it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  68%|██████▊   | 113/167 [00:04<00:02, 26.72it/s][A

Validating:  69%|██████▉   | 116/167 [00:04<00:01, 27.49it/s][A
Epoch 5:  17%|█▋        | 1021/5971 [09:27<45:49,  1.80it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 26.61it/s][A
Epoch 5:  17%|█▋        | 1025/5971 [09:27<45:36,  1.81it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 26.67it/s][A
Epoch 5:  17%|█▋        | 1029/5971 [09:27<45:24,  1.81it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 27.30it/s][A

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 26.84it/s][A
Epoch 5:  17%|█▋        | 1033/5971 [09:28<45:12,  1.82it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 27.49it/s][A
Epoch 5:  17%|█▋        | 1037/5971 [09:28<45:00,  1.83it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  80%|████████  | 134/167 [00:05<00:01, 27.90it/s][A
Epoch 5:  17%|█▋        | 1041/5971 [09:28<44:48,  1.83it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 26.35it/s][A

Validating:  84%|████████▍ | 140/167 [00:05<00:01, 26.74it/s][A
Epoch 5:  18%|█▊        | 1045/5971 [09:28<44:37,  1.84it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 25.38it/s][A
Epoch 5:  18%|█▊        | 1049/5971 [09:28<44:25,  1.85it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 25.95it/s][A
Epoch 5:  18%|█▊        | 1053/5971 [09:28<44:13,  1.85it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 25.92it/s][A

Validating:  91%|█████████ | 152/167 [00:06<00:00, 25.76it/s][A
Epoch 5:  18%|█▊        | 1057/5971 [09:28<44:02,  1.86it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 26.32it/s][A
Epoch 5:  18%|█▊        | 1061/5971 [09:29<43:51,  1.87it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 27.04it/s][A
Epoch 5:  18%|█▊        | 1065/5971 [09:29<43:39,  1.87it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 26.68it/s][A

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 25.73it/s][A
Epoch 5:  18%|█▊        | 1069/5971 [09:29<43:28,  1.88it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating: 100%|██████████| 167/167 [00:06<00:00, 26.81it/s][A
Epoch 5:  18%|█▊        | 1072/5971 [09:29<43:21,  1.88it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Epoch 5:  18%|█▊        | 1073/5971 [09:30<43:22,  1.88it/s, loss=0.133, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00271, train/loss_step=0.424, global_step=2974.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  18%|█▊        | 1073/5971 [09:30<43:22,  1.88it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.7e-5, train/loss_step=0.0102, global_step=2975.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  18%|█▊        | 1074/5971 [09:31<43:23,  1.88it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00167, train/loss_vlb_step=9.96e-6, train/loss_step=0.00167, global_step=2975.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  18%|█▊        | 1075/5971 [09:32<43:24,  1.88it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0117, train/loss_vlb_step=5.51e-5, train/loss_step=0.0117, global_step=2975.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  18%|█▊        | 1076/5971 [09:35<43:34,  1.87it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0195, train/loss_vlb_step=7.56e-5, train/loss_step=0.0195, global_step=2975.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  18%|█▊        | 1077/5971 [09:36<43:36,  1.87it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0195, train/loss_vlb_step=7.56e-5, train/loss_step=0.0195, global_step=2975.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  18%|█▊        | 1077/5971 [09:36<43:36,  1.87it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0171, train/loss_vlb_step=7.21e-5, train/loss_step=0.0171, global_step=2976.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  18%|█▊        | 1078/5971 [09:37<43:37,  1.87it/s, loss=0.104, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.00105, train/loss_step=0.262, global_step=2976.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  18%|█▊        | 1079/5971 [09:38<43:38,  1.87it/s, loss=0.101, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000425, train/loss_step=0.128, global_step=2976.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  18%|█▊        | 1080/5971 [09:40<43:44,  1.86it/s, loss=0.0995, v_num=0, train/loss_simple_step=0.0081, train/loss_vlb_step=3.65e-5, train/loss_step=0.0081, global_step=2976.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  18%|█▊        | 1081/5971 [09:41<43:45,  1.86it/s, loss=0.0995, v_num=0, train/loss_simple_step=0.0081, train/loss_vlb_step=3.65e-5, train/loss_step=0.0081, global_step=2976.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  18%|█▊        | 1081/5971 [09:41<43:45,  1.86it/s, loss=0.105, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000412, train/loss_step=0.124, global_step=2977.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  18%|█▊        | 1082/5971 [09:41<43:46,  1.86it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0551, train/loss_vlb_step=0.00019, train/loss_step=0.0551, global_step=2977.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  18%|█▊        | 1083/5971 [09:42<43:47,  1.86it/s, loss=0.0996, v_num=0, train/loss_simple_step=0.0707, train/loss_vlb_step=0.000248, train/loss_step=0.0707, global_step=2977.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  18%|█▊        | 1084/5971 [09:44<43:54,  1.85it/s, loss=0.102, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000538, train/loss_step=0.149, global_step=2977.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  18%|█▊        | 1085/5971 [09:45<43:55,  1.85it/s, loss=0.102, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000538, train/loss_step=0.149, global_step=2977.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  18%|█▊        | 1085/5971 [09:45<43:55,  1.85it/s, loss=0.0825, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000807, train/loss_step=0.213, global_step=2978.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  18%|█▊        | 1086/5971 [09:46<43:56,  1.85it/s, loss=0.0799, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000145, train/loss_step=0.0392, global_step=2978.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  18%|█▊        | 1087/5971 [09:47<43:57,  1.85it/s, loss=0.0871, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000513, train/loss_step=0.150, global_step=2978.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  18%|█▊        | 1088/5971 [09:49<44:04,  1.85it/s, loss=0.0873, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=5.3e-5, train/loss_step=0.0114, global_step=2978.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  18%|█▊        | 1089/5971 [09:50<44:05,  1.85it/s, loss=0.0873, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=5.3e-5, train/loss_step=0.0114, global_step=2978.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  18%|█▊        | 1089/5971 [09:50<44:05,  1.85it/s, loss=0.0951, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000632, train/loss_step=0.181, global_step=2979.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  18%|█▊        | 1090/5971 [09:51<44:07,  1.84it/s, loss=0.0973, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.00017, train/loss_step=0.0497, global_step=2979.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  18%|█▊        | 1091/5971 [09:52<44:08,  1.84it/s, loss=0.115, v_num=0, train/loss_simple_step=0.368, train/loss_vlb_step=0.00175, train/loss_step=0.368, global_step=2979.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  18%|█▊        | 1092/5971 [09:54<44:14,  1.84it/s, loss=0.0976, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000277, train/loss_step=0.083, global_step=2979.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  18%|█▊        | 1093/5971 [09:55<44:15,  1.84it/s, loss=0.0976, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000277, train/loss_step=0.083, global_step=2979.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  18%|█▊        | 1093/5971 [09:55<44:15,  1.84it/s, loss=0.0987, v_num=0, train/loss_simple_step=0.0304, train/loss_vlb_step=0.000121, train/loss_step=0.0304, global_step=2980.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  18%|█▊        | 1094/5971 [09:56<44:16,  1.84it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000214, train/loss_step=0.0619, global_step=2980.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  18%|█▊        | 1095/5971 [09:57<44:17,  1.83it/s, loss=0.111, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000686, train/loss_step=0.189, global_step=2980.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  18%|█▊        | 1096/5971 [10:00<44:26,  1.83it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0996, train/loss_vlb_step=0.000327, train/loss_step=0.0996, global_step=2980.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  18%|█▊        | 1097/5971 [10:00<44:27,  1.83it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0996, train/loss_vlb_step=0.000327, train/loss_step=0.0996, global_step=2980.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  18%|█▊        | 1097/5971 [10:00<44:27,  1.83it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0648, train/loss_vlb_step=0.000218, train/loss_step=0.0648, global_step=2981.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  18%|█▊        | 1098/5971 [10:01<44:28,  1.83it/s, loss=0.104, v_num=0, train/loss_simple_step=0.003, train/loss_vlb_step=1.72e-5, train/loss_step=0.003, global_step=2981.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  18%|█▊        | 1099/5971 [10:02<44:29,  1.83it/s, loss=0.0977, v_num=0, train/loss_simple_step=0.00291, train/loss_vlb_step=1.57e-5, train/loss_step=0.00291, global_step=2981.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  18%|█▊        | 1100/5971 [10:04<44:35,  1.82it/s, loss=0.0987, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000111, train/loss_step=0.0285, global_step=2981.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  18%|█▊        | 1101/5971 [10:05<44:36,  1.82it/s, loss=0.0987, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000111, train/loss_step=0.0285, global_step=2981.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  18%|█▊        | 1101/5971 [10:05<44:36,  1.82it/s, loss=0.0936, v_num=0, train/loss_simple_step=0.0208, train/loss_vlb_step=7.85e-5, train/loss_step=0.0208, global_step=2982.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  18%|█▊        | 1102/5971 [10:06<44:37,  1.82it/s, loss=0.094, v_num=0, train/loss_simple_step=0.0648, train/loss_vlb_step=0.000219, train/loss_step=0.0648, global_step=2982.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  18%|█▊        | 1103/5971 [10:07<44:38,  1.82it/s, loss=0.0953, v_num=0, train/loss_simple_step=0.0958, train/loss_vlb_step=0.000316, train/loss_step=0.0958, global_step=2982.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  18%|█▊        | 1104/5971 [10:09<44:45,  1.81it/s, loss=0.0951, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.00048, train/loss_step=0.145, global_step=2982.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  19%|█▊        | 1105/5971 [10:10<44:46,  1.81it/s, loss=0.0951, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.00048, train/loss_step=0.145, global_step=2982.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▊        | 1105/5971 [10:10<44:46,  1.81it/s, loss=0.0856, v_num=0, train/loss_simple_step=0.0234, train/loss_vlb_step=9.56e-5, train/loss_step=0.0234, global_step=2983.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▊        | 1106/5971 [10:11<44:47,  1.81it/s, loss=0.109, v_num=0, train/loss_simple_step=0.513, train/loss_vlb_step=0.00511, train/loss_step=0.513, global_step=2983.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  19%|█▊        | 1107/5971 [10:12<44:47,  1.81it/s, loss=0.129, v_num=0, train/loss_simple_step=0.536, train/loss_vlb_step=0.0112, train/loss_step=0.536, global_step=2983.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  19%|█▊        | 1108/5971 [10:14<44:54,  1.80it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00348, train/loss_vlb_step=1.84e-5, train/loss_step=0.00348, global_step=2983.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▊        | 1109/5971 [10:15<44:55,  1.80it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00348, train/loss_vlb_step=1.84e-5, train/loss_step=0.00348, global_step=2983.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▊        | 1109/5971 [10:15<44:55,  1.80it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0965, train/loss_vlb_step=0.000322, train/loss_step=0.0965, global_step=2984.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  19%|█▊        | 1110/5971 [10:16<44:56,  1.80it/s, loss=0.122, v_num=0, train/loss_simple_step=0.013, train/loss_vlb_step=5.28e-5, train/loss_step=0.013, global_step=2984.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  19%|█▊        | 1111/5971 [10:17<44:56,  1.80it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0523, train/loss_vlb_step=0.000188, train/loss_step=0.0523, global_step=2984.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▊        | 1112/5971 [10:19<45:03,  1.80it/s, loss=0.119, v_num=0, train/loss_simple_step=0.334, train/loss_vlb_step=0.0018, train/loss_step=0.334, global_step=2984.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  19%|█▊        | 1113/5971 [10:20<45:04,  1.80it/s, loss=0.119, v_num=0, train/loss_simple_step=0.334, train/loss_vlb_step=0.0018, train/loss_step=0.334, global_step=2984.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▊        | 1113/5971 [10:20<45:04,  1.80it/s, loss=0.14, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00245, train/loss_step=0.448, global_step=2985.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▊        | 1114/5971 [10:20<45:04,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0557, train/loss_vlb_step=0.000195, train/loss_step=0.0557, global_step=2985.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▊        | 1115/5971 [10:21<45:05,  1.79it/s, loss=0.141, v_num=0, train/loss_simple_step=0.221, train/loss_vlb_step=0.00091, train/loss_step=0.221, global_step=2985.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  19%|█▊        | 1116/5971 [10:24<45:13,  1.79it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00675, train/loss_vlb_step=3.26e-5, train/loss_step=0.00675, global_step=2985.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▊        | 1117/5971 [10:25<45:14,  1.79it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00675, train/loss_vlb_step=3.26e-5, train/loss_step=0.00675, global_step=2985.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▊        | 1117/5971 [10:25<45:14,  1.79it/s, loss=0.145, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.000921, train/loss_step=0.237, global_step=2986.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  19%|█▊        | 1118/5971 [10:26<45:15,  1.79it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0434, train/loss_vlb_step=0.000163, train/loss_step=0.0434, global_step=2986.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▊        | 1119/5971 [10:26<45:16,  1.79it/s, loss=0.177, v_num=0, train/loss_simple_step=0.594, train/loss_vlb_step=0.00703, train/loss_step=0.594, global_step=2986.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  19%|█▉        | 1120/5971 [10:29<45:22,  1.78it/s, loss=0.192, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00125, train/loss_step=0.327, global_step=2986.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1121/5971 [10:29<45:23,  1.78it/s, loss=0.192, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00125, train/loss_step=0.327, global_step=2986.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1121/5971 [10:29<45:23,  1.78it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0032, train/loss_vlb_step=1.72e-5, train/loss_step=0.0032, global_step=2987.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1122/5971 [10:30<45:23,  1.78it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00465, train/loss_vlb_step=2.46e-5, train/loss_step=0.00465, global_step=2987.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1123/5971 [10:31<45:24,  1.78it/s, loss=0.208, v_num=0, train/loss_simple_step=0.505, train/loss_vlb_step=0.00333, train/loss_step=0.505, global_step=2987.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  19%|█▉        | 1124/5971 [10:34<45:31,  1.77it/s, loss=0.211, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000713, train/loss_step=0.196, global_step=2987.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1125/5971 [10:34<45:32,  1.77it/s, loss=0.211, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000713, train/loss_step=0.196, global_step=2987.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1125/5971 [10:34<45:32,  1.77it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0778, train/loss_vlb_step=0.000256, train/loss_step=0.0778, global_step=2988.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1126/5971 [10:35<45:33,  1.77it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0038, train/loss_vlb_step=1.97e-5, train/loss_step=0.0038, global_step=2988.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  19%|█▉        | 1127/5971 [10:36<45:34,  1.77it/s, loss=0.17, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000586, train/loss_step=0.172, global_step=2988.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  19%|█▉        | 1128/5971 [10:38<45:40,  1.77it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00959, train/loss_vlb_step=4.32e-5, train/loss_step=0.00959, global_step=2988.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1129/5971 [10:39<45:41,  1.77it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00959, train/loss_vlb_step=4.32e-5, train/loss_step=0.00959, global_step=2988.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1129/5971 [10:39<45:41,  1.77it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0805, train/loss_vlb_step=0.000266, train/loss_step=0.0805, global_step=2989.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1130/5971 [10:40<45:42,  1.77it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0077, train/loss_vlb_step=3.51e-5, train/loss_step=0.0077, global_step=2989.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  19%|█▉        | 1131/5971 [10:41<45:43,  1.76it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00392, train/loss_vlb_step=2.06e-5, train/loss_step=0.00392, global_step=2989.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1132/5971 [10:43<45:49,  1.76it/s, loss=0.156, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000394, train/loss_step=0.120, global_step=2989.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  19%|█▉        | 1133/5971 [10:44<45:50,  1.76it/s, loss=0.156, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000394, train/loss_step=0.120, global_step=2989.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1133/5971 [10:44<45:50,  1.76it/s, loss=0.143, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000663, train/loss_step=0.185, global_step=2990.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1134/5971 [10:45<45:51,  1.76it/s, loss=0.147, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000484, train/loss_step=0.147, global_step=2990.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1135/5971 [10:46<45:51,  1.76it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0981, train/loss_vlb_step=0.000325, train/loss_step=0.0981, global_step=2990.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1136/5971 [10:48<45:59,  1.75it/s, loss=0.154, v_num=0, train/loss_simple_step=0.259, train/loss_vlb_step=0.0011, train/loss_step=0.259, global_step=2990.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  19%|█▉        | 1137/5971 [10:49<45:59,  1.75it/s, loss=0.154, v_num=0, train/loss_simple_step=0.259, train/loss_vlb_step=0.0011, train/loss_step=0.259, global_step=2990.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1137/5971 [10:49<46:00,  1.75it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00249, train/loss_vlb_step=1.37e-5, train/loss_step=0.00249, global_step=2991.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1138/5971 [10:50<46:00,  1.75it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00515, train/loss_vlb_step=2.72e-5, train/loss_step=0.00515, global_step=2991.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  19%|█▉        | 1139/5971 [10:51<46:01,  1.75it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0347, train/loss_vlb_step=0.000128, train/loss_step=0.0347, global_step=2991.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1140/5971 [10:53<46:07,  1.75it/s, loss=0.0959, v_num=0, train/loss_simple_step=0.0013, train/loss_vlb_step=7.91e-6, train/loss_step=0.0013, global_step=2991.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1141/5971 [10:54<46:08,  1.74it/s, loss=0.0959, v_num=0, train/loss_simple_step=0.0013, train/loss_vlb_step=7.91e-6, train/loss_step=0.0013, global_step=2991.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1141/5971 [10:54<46:08,  1.74it/s, loss=0.105, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000642, train/loss_step=0.188, global_step=2992.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  19%|█▉        | 1142/5971 [10:55<46:09,  1.74it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0188, train/loss_vlb_step=7.58e-5, train/loss_step=0.0188, global_step=2992.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1143/5971 [10:56<46:09,  1.74it/s, loss=0.107, v_num=0, train/loss_simple_step=0.528, train/loss_vlb_step=0.00519, train/loss_step=0.528, global_step=2992.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  19%|█▉        | 1144/5971 [10:58<46:16,  1.74it/s, loss=0.0973, v_num=0, train/loss_simple_step=0.00254, train/loss_vlb_step=1.36e-5, train/loss_step=0.00254, global_step=2992.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1145/5971 [10:59<46:17,  1.74it/s, loss=0.0973, v_num=0, train/loss_simple_step=0.00254, train/loss_vlb_step=1.36e-5, train/loss_step=0.00254, global_step=2992.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1145/5971 [10:59<46:17,  1.74it/s, loss=0.0936, v_num=0, train/loss_simple_step=0.00378, train/loss_vlb_step=2.06e-5, train/loss_step=0.00378, global_step=2993.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1146/5971 [11:00<46:18,  1.74it/s, loss=0.0936, v_num=0, train/loss_simple_step=0.00278, train/loss_vlb_step=1.57e-5, train/loss_step=0.00278, global_step=2993.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1147/5971 [11:01<46:19,  1.74it/s, loss=0.0894, v_num=0, train/loss_simple_step=0.0884, train/loss_vlb_step=0.00029, train/loss_step=0.0884, global_step=2993.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  19%|█▉        | 1148/5971 [11:03<46:24,  1.73it/s, loss=0.105, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00147, train/loss_step=0.329, global_step=2993.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  19%|█▉        | 1149/5971 [11:04<46:25,  1.73it/s, loss=0.105, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00147, train/loss_step=0.329, global_step=2993.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1149/5971 [11:04<46:25,  1.73it/s, loss=0.107, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000394, train/loss_step=0.120, global_step=2994.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1150/5971 [11:05<46:26,  1.73it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0905, train/loss_vlb_step=0.000297, train/loss_step=0.0905, global_step=2994.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1151/5971 [11:06<46:26,  1.73it/s, loss=0.129, v_num=0, train/loss_simple_step=0.354, train/loss_vlb_step=0.00162, train/loss_step=0.354, global_step=2994.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  19%|█▉        | 1152/5971 [11:08<46:33,  1.73it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00608, train/loss_vlb_step=2.92e-5, train/loss_step=0.00608, global_step=2994.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1153/5971 [11:09<46:34,  1.72it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00608, train/loss_vlb_step=2.92e-5, train/loss_step=0.00608, global_step=2994.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1153/5971 [11:09<46:34,  1.72it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00518, train/loss_vlb_step=2.52e-5, train/loss_step=0.00518, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  19%|█▉        | 1154/5971 [11:10<46:34,  1.72it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0687, train/loss_vlb_step=0.000232, train/loss_step=0.0687, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  19%|█▉        | 1155/5971 [11:10<46:35,  1.72it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00724, train/loss_vlb_step=3.57e-5, train/loss_step=0.00724, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1156/5971 [11:13<46:41,  1.72it/s, loss=0.106, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.00109, train/loss_step=0.268, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  19%|█▉        | 1157/5971 [11:14<46:42,  1.72it/s, loss=0.106, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.00109, train/loss_step=0.268, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1157/5971 [11:14<46:42,  1.72it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0427, train/loss_vlb_step=0.000154, train/loss_step=0.0427, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1158/5971 [11:15<46:43,  1.72it/s, loss=0.12, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.000987, train/loss_step=0.249, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  19%|█▉        | 1159/5971 [11:15<46:43,  1.72it/s, loss=0.132, v_num=0, train/loss_simple_step=0.273, train/loss_vlb_step=0.00114, train/loss_step=0.273, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1160/5971 [11:18<46:49,  1.71it/s, loss=0.147, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00121, train/loss_step=0.296, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1161/5971 [11:18<46:50,  1.71it/s, loss=0.147, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00121, train/loss_step=0.296, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1161/5971 [11:18<46:50,  1.71it/s, loss=0.144, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.00045, train/loss_step=0.132, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1162/5971 [11:19<46:50,  1.71it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00753, train/loss_vlb_step=3.22e-5, train/loss_step=0.00753, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  19%|█▉        | 1163/5971 [11:20<46:51,  1.71it/s, loss=0.135, v_num=0, train/loss_simple_step=0.350, train/loss_vlb_step=0.00193, train/loss_step=0.350, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  19%|█▉        | 1164/5971 [11:23<46:59,  1.71it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00214, train/loss_vlb_step=1.25e-5, train/loss_step=0.00214, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  20%|█▉        | 1165/5971 [11:24<46:59,  1.70it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00214, train/loss_vlb_step=1.25e-5, train/loss_step=0.00214, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  20%|█▉        | 1165/5971 [11:24<46:59,  1.70it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0153, train/loss_vlb_step=6.29e-5, train/loss_step=0.0153, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  20%|█▉        | 1166/5971 [11:24<47:00,  1.70it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0313, train/loss_vlb_step=0.000114, train/loss_step=0.0313, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  20%|█▉        | 1167/5971 [11:25<47:00,  1.70it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0665, train/loss_vlb_step=0.000226, train/loss_step=0.0665, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  20%|█▉        | 1168/5971 [11:28<47:06,  1.70it/s, loss=0.125, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000393, train/loss_step=0.119, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  20%|█▉        | 1169/5971 [11:28<47:07,  1.70it/s, loss=0.125, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000393, train/loss_step=0.119, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  20%|█▉        | 1169/5971 [11:28<47:07,  1.70it/s, loss=0.147, v_num=0, train/loss_simple_step=0.550, train/loss_vlb_step=0.00599, train/loss_step=0.550, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  20%|█▉        | 1169/5971 [11:41<47:58,  1.67it/s, loss=0.147, v_num=0, train/loss_simple_step=0.550, train/loss_vlb_step=0.00599, train/loss_step=0.550, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  20%|█▉        | 1170/5971 [12:05<49:33,  1.61it/s, loss=0.147, v_num=0, train/loss_simple_step=0.550, train/loss_vlb_step=0.00599, train/loss_step=0.550, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  20%|█▉        | 1170/5971 [12:05<49:33,  1.61it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0199, train/loss_vlb_step=7.6e-5, train/loss_step=0.0199, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  20%|█▉        | 1171/5971 [12:06<49:34,  1.61it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0199, train/loss_vlb_step=7.6e-5, train/loss_step=0.0199, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  20%|█▉        | 1171/5971 [12:06<49:34,  1.61it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00408, train/loss_vlb_step=2.12e-5, train/loss_step=0.00408, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  20%|█▉        | 1172/5971 [12:08<49:39,  1.61it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00408, train/loss_vlb_step=2.12e-5, train/loss_step=0.00408, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  20%|█▉        | 1172/5971 [12:08<49:39,  1.61it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:02,  2.68it/s][A
Epoch 5:  20%|█▉        | 1174/5971 [12:08<49:35,  1.61it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   1%|          | 2/167 [00:00<00:46,  3.56it/s][A
Epoch 5:  20%|█▉        | 1176/5971 [12:09<49:29,  1.61it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.38it/s][A
Epoch 5:  20%|█▉        | 1179/5971 [12:09<49:21,  1.62it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.75it/s][A
Epoch 5:  20%|█▉        | 1182/5971 [12:09<49:12,  1.62it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   7%|▋         | 11/167 [00:00<00:08, 17.34it/s][A
Epoch 5:  20%|█▉        | 1185/5971 [12:09<49:03,  1.63it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   8%|▊         | 14/167 [00:01<00:07, 20.38it/s][A
Epoch 5:  20%|█▉        | 1188/5971 [12:09<48:54,  1.63it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  10%|█         | 17/167 [00:01<00:06, 22.51it/s][A
Epoch 5:  20%|█▉        | 1191/5971 [12:09<48:45,  1.63it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 24.34it/s][A
Epoch 5:  20%|█▉        | 1194/5971 [12:09<48:36,  1.64it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  14%|█▍        | 23/167 [00:01<00:05, 25.03it/s][A
Epoch 5:  20%|██        | 1197/5971 [12:09<48:28,  1.64it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 26.05it/s][A
Epoch 5:  20%|██        | 1201/5971 [12:09<48:16,  1.65it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 26.62it/s][A

Validating:  19%|█▉        | 32/167 [00:01<00:05, 26.59it/s][A
Epoch 5:  20%|██        | 1205/5971 [12:10<48:05,  1.65it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  22%|██▏       | 36/167 [00:01<00:04, 27.52it/s][A
Epoch 5:  20%|██        | 1209/5971 [12:10<47:53,  1.66it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  23%|██▎       | 39/167 [00:01<00:04, 27.98it/s][A
Epoch 5:  20%|██        | 1213/5971 [12:10<47:42,  1.66it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 27.84it/s][A
Epoch 5:  20%|██        | 1217/5971 [12:10<47:31,  1.67it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  28%|██▊       | 46/167 [00:02<00:04, 28.77it/s][A
Epoch 5:  20%|██        | 1221/5971 [12:10<47:19,  1.67it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  29%|██▉       | 49/167 [00:02<00:04, 28.20it/s][A

Validating:  31%|███       | 52/167 [00:02<00:04, 27.95it/s][A
Epoch 5:  21%|██        | 1225/5971 [12:10<47:08,  1.68it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  33%|███▎      | 55/167 [00:02<00:04, 26.84it/s][A
Epoch 5:  21%|██        | 1229/5971 [12:10<46:57,  1.68it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  35%|███▍      | 58/167 [00:02<00:03, 27.29it/s][A
Epoch 5:  21%|██        | 1233/5971 [12:11<46:47,  1.69it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  37%|███▋      | 61/167 [00:02<00:03, 26.57it/s][A

Validating:  38%|███▊      | 64/167 [00:02<00:03, 26.87it/s][A
Epoch 5:  21%|██        | 1237/5971 [12:11<46:36,  1.69it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  40%|████      | 67/167 [00:02<00:03, 27.33it/s][A
Epoch 5:  21%|██        | 1241/5971 [12:11<46:25,  1.70it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 26.71it/s][A
Epoch 5:  21%|██        | 1245/5971 [12:11<46:14,  1.70it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 25.59it/s][A

Validating:  46%|████▌     | 76/167 [00:03<00:03, 26.03it/s][A
Epoch 5:  21%|██        | 1249/5971 [12:11<46:04,  1.71it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 25.60it/s][A
Epoch 5:  21%|██        | 1253/5971 [12:11<45:53,  1.71it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  49%|████▉     | 82/167 [00:03<00:03, 25.93it/s][A
Epoch 5:  21%|██        | 1257/5971 [12:11<45:42,  1.72it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  51%|█████     | 85/167 [00:03<00:03, 26.62it/s][A

Validating:  53%|█████▎    | 88/167 [00:03<00:02, 27.32it/s][A
Epoch 5:  21%|██        | 1261/5971 [12:12<45:32,  1.72it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  54%|█████▍    | 91/167 [00:03<00:02, 27.56it/s][A
Epoch 5:  21%|██        | 1265/5971 [12:12<45:22,  1.73it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  56%|█████▋    | 94/167 [00:03<00:02, 27.21it/s][A
Epoch 5:  21%|██▏       | 1269/5971 [12:12<45:11,  1.73it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 27.86it/s][A

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 25.47it/s][A
Epoch 5:  21%|██▏       | 1273/5971 [12:12<45:01,  1.74it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 24.21it/s][A
Epoch 5:  21%|██▏       | 1277/5971 [12:12<44:51,  1.74it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 25.01it/s][A
Epoch 5:  21%|██▏       | 1281/5971 [12:12<44:41,  1.75it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 26.60it/s][A
Epoch 5:  22%|██▏       | 1285/5971 [12:13<44:31,  1.75it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  68%|██████▊   | 114/167 [00:04<00:01, 27.86it/s][A
Epoch 5:  22%|██▏       | 1289/5971 [12:13<44:21,  1.76it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  70%|███████   | 117/167 [00:04<00:01, 28.37it/s][A

Validating:  72%|███████▏  | 120/167 [00:04<00:01, 27.41it/s][A
Epoch 5:  22%|██▏       | 1293/5971 [12:13<44:11,  1.76it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 27.26it/s][A
Epoch 5:  22%|██▏       | 1297/5971 [12:13<44:01,  1.77it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 26.50it/s][A
Epoch 5:  22%|██▏       | 1301/5971 [12:13<43:51,  1.77it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 27.67it/s][A
Epoch 5:  22%|██▏       | 1305/5971 [12:13<43:41,  1.78it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 28.20it/s][A

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 28.21it/s][A
Epoch 5:  22%|██▏       | 1309/5971 [12:13<43:31,  1.78it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 25.76it/s][A
Epoch 5:  22%|██▏       | 1313/5971 [12:14<43:22,  1.79it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  85%|████████▌ | 142/167 [00:05<00:00, 25.57it/s][A
Epoch 5:  22%|██▏       | 1317/5971 [12:14<43:12,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  87%|████████▋ | 145/167 [00:05<00:00, 25.38it/s][A

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 25.13it/s][A
Epoch 5:  22%|██▏       | 1321/5971 [12:14<43:03,  1.80it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 26.02it/s][A
Epoch 5:  22%|██▏       | 1325/5971 [12:14<42:53,  1.81it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 26.81it/s][A
Epoch 5:  22%|██▏       | 1329/5971 [12:14<42:44,  1.81it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 24.66it/s][A

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 24.44it/s][A
Epoch 5:  22%|██▏       | 1333/5971 [12:14<42:35,  1.82it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 25.57it/s][A
Epoch 5:  22%|██▏       | 1337/5971 [12:15<42:25,  1.82it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 25.29it/s][A
Epoch 5:  22%|██▏       | 1340/5971 [12:15<42:19,  1.82it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.35it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.42it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.23it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.83it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.29it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.59it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.86it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.99it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.13it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.24it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.24it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.21it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:07,  5.09it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:07,  5.09it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.08it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.08it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.08it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:06,  5.08it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:06,  5.12it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.13it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.12it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.14it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.19it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.27it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.34it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.37it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.40it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.39it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:06<00:03,  5.40it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.42it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.43it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.45it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.41it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:03,  5.22it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.24it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.27it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.31it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.30it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.32it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.30it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.33it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.31it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.30it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.32it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:00,  5.36it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.38it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.42it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.43it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.41it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.37it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.01it/s]

Epoch 5:  22%|██▏       | 1341/5971 [12:27<42:59,  1.79it/s, loss=0.139, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  22%|██▏       | 1341/5971 [12:27<42:59,  1.79it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00213, train/loss_vlb_step=1.21e-5, train/loss_step=0.00213, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.33it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.37it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.18it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.78it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.28it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.64it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.92it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.03it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.03it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.06it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.11it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.14it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:07,  5.19it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.22it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.25it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.17it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.17it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:06,  5.18it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.18it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.17it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.21it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.23it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.15it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:05,  5.10it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.07it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.06it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.11it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.11it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:06<00:04,  5.20it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.13it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.21it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.25it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.30it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:07<00:03,  5.31it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.37it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.36it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.37it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.23it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:08<00:02,  5.23it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.22it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.28it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.29it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.30it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.29it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:00,  5.30it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.36it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.40it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.42it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.47it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  5.52it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  4.98it/s]

Epoch 5:  22%|██▏       | 1342/5971 [12:40<43:39,  1.77it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00213, train/loss_vlb_step=1.21e-5, train/loss_step=0.00213, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  22%|██▏       | 1342/5971 [12:40<43:39,  1.77it/s, loss=0.148, v_num=0, train/loss_simple_step=0.256, train/loss_vlb_step=0.00118, train/loss_step=0.256, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.39it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.21it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.82it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.25it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.54it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.83it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.00it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.11it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.20it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.26it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.32it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.33it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.34it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.38it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.37it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.40it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.42it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.44it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.45it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.44it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.42it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.38it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.36it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.35it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.34it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.32it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.34it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.34it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.33it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.29it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.31it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.32it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:03,  5.31it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.32it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.31it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.30it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.30it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.33it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.36it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.39it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.43it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.46it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.48it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.49it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.44it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.45it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.47it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.49it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.52it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.08it/s]

Epoch 5:  22%|██▏       | 1343/5971 [12:52<44:19,  1.74it/s, loss=0.148, v_num=0, train/loss_simple_step=0.256, train/loss_vlb_step=0.00118, train/loss_step=0.256, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  22%|██▏       | 1343/5971 [12:52<44:19,  1.74it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0129, train/loss_vlb_step=5.55e-5, train/loss_step=0.0129, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.30it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.33it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:15,  3.12it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.70it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:11,  4.07it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:10,  4.38it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.67it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.88it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.09it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.20it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.22it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.34it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:06,  5.44it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.52it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.54it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.56it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.58it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.38it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.32it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.28it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.23it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.23it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.23it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:05,  5.19it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.22it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.32it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.43it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.49it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.48it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.33it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.24it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.24it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.30it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.35it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.39it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.44it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.42it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.41it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.39it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.41it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.46it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.50it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.50it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.38it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.42it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.44it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.47it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.48it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.51it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.53it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.07it/s]

Epoch 5:  23%|██▎       | 1344/5971 [13:06<45:04,  1.71it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0129, train/loss_vlb_step=5.55e-5, train/loss_step=0.0129, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1344/5971 [13:06<45:04,  1.71it/s, loss=0.165, v_num=0, train/loss_simple_step=0.599, train/loss_vlb_step=0.00847, train/loss_step=0.599, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  23%|██▎       | 1345/5971 [13:06<45:04,  1.71it/s, loss=0.165, v_num=0, train/loss_simple_step=0.599, train/loss_vlb_step=0.00847, train/loss_step=0.599, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1345/5971 [13:06<45:04,  1.71it/s, loss=0.182, v_num=0, train/loss_simple_step=0.399, train/loss_vlb_step=0.00213, train/loss_step=0.399, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1346/5971 [13:07<45:04,  1.71it/s, loss=0.182, v_num=0, train/loss_simple_step=0.399, train/loss_vlb_step=0.00213, train/loss_step=0.399, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1346/5971 [13:07<45:04,  1.71it/s, loss=0.179, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.000598, train/loss_step=0.171, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1347/5971 [13:08<45:05,  1.71it/s, loss=0.179, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.000598, train/loss_step=0.171, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1347/5971 [13:08<45:05,  1.71it/s, loss=0.172, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000456, train/loss_step=0.136, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1348/5971 [13:11<45:11,  1.71it/s, loss=0.172, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000456, train/loss_step=0.136, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1348/5971 [13:11<45:11,  1.71it/s, loss=0.169, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.001, train/loss_step=0.243, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  23%|██▎       | 1349/5971 [13:12<45:11,  1.70it/s, loss=0.169, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.001, train/loss_step=0.243, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1349/5971 [13:12<45:11,  1.70it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00989, train/loss_vlb_step=4.61e-5, train/loss_step=0.00989, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1350/5971 [13:12<45:12,  1.70it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00989, train/loss_vlb_step=4.61e-5, train/loss_step=0.00989, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1350/5971 [13:12<45:12,  1.70it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0998, train/loss_vlb_step=0.000328, train/loss_step=0.0998, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  23%|██▎       | 1351/5971 [13:13<45:12,  1.70it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0998, train/loss_vlb_step=0.000328, train/loss_step=0.0998, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1351/5971 [13:13<45:12,  1.70it/s, loss=0.159, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000648, train/loss_step=0.190, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  23%|██▎       | 1352/5971 [13:15<45:17,  1.70it/s, loss=0.159, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000648, train/loss_step=0.190, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1352/5971 [13:15<45:17,  1.70it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0522, train/loss_vlb_step=0.000184, train/loss_step=0.0522, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1353/5971 [13:16<45:17,  1.70it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0522, train/loss_vlb_step=0.000184, train/loss_step=0.0522, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1353/5971 [13:16<45:17,  1.70it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0277, train/loss_vlb_step=0.000105, train/loss_step=0.0277, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1354/5971 [13:17<45:18,  1.70it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0277, train/loss_vlb_step=0.000105, train/loss_step=0.0277, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1354/5971 [13:17<45:18,  1.70it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0168, train/loss_vlb_step=7.03e-5, train/loss_step=0.0168, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  23%|██▎       | 1355/5971 [13:18<45:18,  1.70it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0168, train/loss_vlb_step=7.03e-5, train/loss_step=0.0168, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1355/5971 [13:18<45:18,  1.70it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00454, train/loss_vlb_step=2.45e-5, train/loss_step=0.00454, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1356/5971 [13:21<45:24,  1.69it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00454, train/loss_vlb_step=2.45e-5, train/loss_step=0.00454, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1356/5971 [13:21<45:24,  1.69it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00209, train/loss_vlb_step=1.23e-5, train/loss_step=0.00209, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1357/5971 [13:21<45:24,  1.69it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00209, train/loss_vlb_step=1.23e-5, train/loss_step=0.00209, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1357/5971 [13:21<45:24,  1.69it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0178, train/loss_vlb_step=7.11e-5, train/loss_step=0.0178, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  23%|██▎       | 1358/5971 [13:22<45:24,  1.69it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0178, train/loss_vlb_step=7.11e-5, train/loss_step=0.0178, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1358/5971 [13:22<45:24,  1.69it/s, loss=0.147, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00288, train/loss_step=0.438, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  23%|██▎       | 1359/5971 [13:23<45:25,  1.69it/s, loss=0.147, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00288, train/loss_step=0.438, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1359/5971 [13:23<45:25,  1.69it/s, loss=0.155, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000524, train/loss_step=0.158, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1360/5971 [13:25<45:29,  1.69it/s, loss=0.155, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000524, train/loss_step=0.158, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1360/5971 [13:25<45:29,  1.69it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0163, train/loss_vlb_step=6.4e-5, train/loss_step=0.0163, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1361/5971 [13:26<45:30,  1.69it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0163, train/loss_vlb_step=6.4e-5, train/loss_step=0.0163, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1361/5971 [13:26<45:30,  1.69it/s, loss=0.148, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000335, train/loss_step=0.101, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1362/5971 [13:27<45:30,  1.69it/s, loss=0.148, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000335, train/loss_step=0.101, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1362/5971 [13:27<45:30,  1.69it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0137, train/loss_vlb_step=6.02e-5, train/loss_step=0.0137, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1363/5971 [13:28<45:31,  1.69it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0137, train/loss_vlb_step=6.02e-5, train/loss_step=0.0137, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1363/5971 [13:28<45:31,  1.69it/s, loss=0.151, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00141, train/loss_step=0.318, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  23%|██▎       | 1364/5971 [13:30<45:35,  1.68it/s, loss=0.151, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00141, train/loss_step=0.318, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1364/5971 [13:30<45:35,  1.68it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0922, train/loss_vlb_step=0.000309, train/loss_step=0.0922, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1365/5971 [13:31<45:36,  1.68it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0922, train/loss_vlb_step=0.000309, train/loss_step=0.0922, global_step=3e+3, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1365/5971 [13:31<45:36,  1.68it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0319, train/loss_vlb_step=0.000123, train/loss_step=0.0319, global_step=3006.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1366/5971 [13:32<45:36,  1.68it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0319, train/loss_vlb_step=0.000123, train/loss_step=0.0319, global_step=3006.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1366/5971 [13:32<45:36,  1.68it/s, loss=0.107, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.000572, train/loss_step=0.171, global_step=3006.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  23%|██▎       | 1367/5971 [13:33<45:36,  1.68it/s, loss=0.107, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.000572, train/loss_step=0.171, global_step=3006.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1367/5971 [13:33<45:36,  1.68it/s, loss=0.1, v_num=0, train/loss_simple_step=0.00395, train/loss_vlb_step=2.07e-5, train/loss_step=0.00395, global_step=3006.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1368/5971 [13:36<45:44,  1.68it/s, loss=0.1, v_num=0, train/loss_simple_step=0.00395, train/loss_vlb_step=2.07e-5, train/loss_step=0.00395, global_step=3006.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1368/5971 [13:36<45:44,  1.68it/s, loss=0.0894, v_num=0, train/loss_simple_step=0.0222, train/loss_vlb_step=9.07e-5, train/loss_step=0.0222, global_step=3006.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1369/5971 [13:37<45:45,  1.68it/s, loss=0.0894, v_num=0, train/loss_simple_step=0.0222, train/loss_vlb_step=9.07e-5, train/loss_step=0.0222, global_step=3006.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1369/5971 [13:37<45:45,  1.68it/s, loss=0.0891, v_num=0, train/loss_simple_step=0.00369, train/loss_vlb_step=2.09e-5, train/loss_step=0.00369, global_step=3007.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1370/5971 [13:38<45:45,  1.68it/s, loss=0.0891, v_num=0, train/loss_simple_step=0.00369, train/loss_vlb_step=2.09e-5, train/loss_step=0.00369, global_step=3007.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1370/5971 [13:38<45:45,  1.68it/s, loss=0.0874, v_num=0, train/loss_simple_step=0.0667, train/loss_vlb_step=0.000224, train/loss_step=0.0667, global_step=3007.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  23%|██▎       | 1371/5971 [13:39<45:46,  1.68it/s, loss=0.0874, v_num=0, train/loss_simple_step=0.0667, train/loss_vlb_step=0.000224, train/loss_step=0.0667, global_step=3007.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1371/5971 [13:39<45:46,  1.68it/s, loss=0.0889, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.000822, train/loss_step=0.219, global_step=3007.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  23%|██▎       | 1372/5971 [13:41<45:50,  1.67it/s, loss=0.0889, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.000822, train/loss_step=0.219, global_step=3007.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1372/5971 [13:41<45:50,  1.67it/s, loss=0.0865, v_num=0, train/loss_simple_step=0.00391, train/loss_vlb_step=1.9e-5, train/loss_step=0.00391, global_step=3007.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1373/5971 [13:42<45:51,  1.67it/s, loss=0.0865, v_num=0, train/loss_simple_step=0.00391, train/loss_vlb_step=1.9e-5, train/loss_step=0.00391, global_step=3007.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1373/5971 [13:42<45:51,  1.67it/s, loss=0.0853, v_num=0, train/loss_simple_step=0.00449, train/loss_vlb_step=2.38e-5, train/loss_step=0.00449, global_step=3008.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1374/5971 [13:43<45:51,  1.67it/s, loss=0.0853, v_num=0, train/loss_simple_step=0.00449, train/loss_vlb_step=2.38e-5, train/loss_step=0.00449, global_step=3008.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1374/5971 [13:43<45:51,  1.67it/s, loss=0.0873, v_num=0, train/loss_simple_step=0.0564, train/loss_vlb_step=0.000198, train/loss_step=0.0564, global_step=3008.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  23%|██▎       | 1375/5971 [13:44<45:52,  1.67it/s, loss=0.0873, v_num=0, train/loss_simple_step=0.0564, train/loss_vlb_step=0.000198, train/loss_step=0.0564, global_step=3008.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1375/5971 [13:44<45:52,  1.67it/s, loss=0.0946, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000504, train/loss_step=0.152, global_step=3008.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  23%|██▎       | 1376/5971 [13:47<45:59,  1.66it/s, loss=0.0946, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000504, train/loss_step=0.152, global_step=3008.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1376/5971 [13:47<45:59,  1.66it/s, loss=0.0946, v_num=0, train/loss_simple_step=0.00138, train/loss_vlb_step=8.14e-6, train/loss_step=0.00138, global_step=3008.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1377/5971 [13:47<46:00,  1.66it/s, loss=0.0946, v_num=0, train/loss_simple_step=0.00138, train/loss_vlb_step=8.14e-6, train/loss_step=0.00138, global_step=3008.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1377/5971 [13:47<46:00,  1.66it/s, loss=0.0939, v_num=0, train/loss_simple_step=0.00394, train/loss_vlb_step=2.07e-5, train/loss_step=0.00394, global_step=3009.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1378/5971 [13:48<46:00,  1.66it/s, loss=0.0939, v_num=0, train/loss_simple_step=0.00394, train/loss_vlb_step=2.07e-5, train/loss_step=0.00394, global_step=3009.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1378/5971 [13:48<46:00,  1.66it/s, loss=0.0727, v_num=0, train/loss_simple_step=0.0145, train/loss_vlb_step=5.89e-5, train/loss_step=0.0145, global_step=3009.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  23%|██▎       | 1379/5971 [13:49<46:00,  1.66it/s, loss=0.0727, v_num=0, train/loss_simple_step=0.0145, train/loss_vlb_step=5.89e-5, train/loss_step=0.0145, global_step=3009.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1379/5971 [13:49<46:00,  1.66it/s, loss=0.0719, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000471, train/loss_step=0.142, global_step=3009.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  23%|██▎       | 1380/5971 [13:52<46:06,  1.66it/s, loss=0.0719, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000471, train/loss_step=0.142, global_step=3009.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1380/5971 [13:52<46:06,  1.66it/s, loss=0.0799, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000659, train/loss_step=0.175, global_step=3009.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1381/5971 [13:53<46:06,  1.66it/s, loss=0.0799, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000659, train/loss_step=0.175, global_step=3009.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1381/5971 [13:53<46:06,  1.66it/s, loss=0.075, v_num=0, train/loss_simple_step=0.00392, train/loss_vlb_step=2.11e-5, train/loss_step=0.00392, global_step=3010.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1382/5971 [13:53<46:06,  1.66it/s, loss=0.075, v_num=0, train/loss_simple_step=0.00392, train/loss_vlb_step=2.11e-5, train/loss_step=0.00392, global_step=3010.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1382/5971 [13:53<46:06,  1.66it/s, loss=0.0853, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.000803, train/loss_step=0.220, global_step=3010.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  23%|██▎       | 1383/5971 [13:54<46:07,  1.66it/s, loss=0.0853, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.000803, train/loss_step=0.220, global_step=3010.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1383/5971 [13:54<46:07,  1.66it/s, loss=0.0742, v_num=0, train/loss_simple_step=0.0957, train/loss_vlb_step=0.000325, train/loss_step=0.0957, global_step=3010.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1384/5971 [13:57<46:12,  1.65it/s, loss=0.0742, v_num=0, train/loss_simple_step=0.0957, train/loss_vlb_step=0.000325, train/loss_step=0.0957, global_step=3010.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1384/5971 [13:57<46:12,  1.65it/s, loss=0.0709, v_num=0, train/loss_simple_step=0.0271, train/loss_vlb_step=0.000109, train/loss_step=0.0271, global_step=3010.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1385/5971 [13:57<46:12,  1.65it/s, loss=0.0709, v_num=0, train/loss_simple_step=0.0271, train/loss_vlb_step=0.000109, train/loss_step=0.0271, global_step=3010.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1385/5971 [13:57<46:12,  1.65it/s, loss=0.0696, v_num=0, train/loss_simple_step=0.00667, train/loss_vlb_step=3.14e-5, train/loss_step=0.00667, global_step=3011.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1386/5971 [13:58<46:12,  1.65it/s, loss=0.0696, v_num=0, train/loss_simple_step=0.00667, train/loss_vlb_step=3.14e-5, train/loss_step=0.00667, global_step=3011.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1386/5971 [13:58<46:12,  1.65it/s, loss=0.0611, v_num=0, train/loss_simple_step=0.00119, train/loss_vlb_step=7.22e-6, train/loss_step=0.00119, global_step=3011.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1387/5971 [13:59<46:13,  1.65it/s, loss=0.0611, v_num=0, train/loss_simple_step=0.00119, train/loss_vlb_step=7.22e-6, train/loss_step=0.00119, global_step=3011.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1387/5971 [13:59<46:13,  1.65it/s, loss=0.066, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000334, train/loss_step=0.101, global_step=3011.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  23%|██▎       | 1388/5971 [14:02<46:20,  1.65it/s, loss=0.066, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000334, train/loss_step=0.101, global_step=3011.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1388/5971 [14:02<46:20,  1.65it/s, loss=0.0728, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000523, train/loss_step=0.158, global_step=3011.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1389/5971 [14:03<46:21,  1.65it/s, loss=0.0728, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000523, train/loss_step=0.158, global_step=3011.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1389/5971 [14:03<46:21,  1.65it/s, loss=0.0731, v_num=0, train/loss_simple_step=0.00891, train/loss_vlb_step=4.18e-5, train/loss_step=0.00891, global_step=3012.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1390/5971 [14:04<46:21,  1.65it/s, loss=0.0731, v_num=0, train/loss_simple_step=0.00891, train/loss_vlb_step=4.18e-5, train/loss_step=0.00891, global_step=3012.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1390/5971 [14:04<46:21,  1.65it/s, loss=0.0706, v_num=0, train/loss_simple_step=0.0165, train/loss_vlb_step=6.98e-5, train/loss_step=0.0165, global_step=3012.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  23%|██▎       | 1391/5971 [14:05<46:21,  1.65it/s, loss=0.0706, v_num=0, train/loss_simple_step=0.0165, train/loss_vlb_step=6.98e-5, train/loss_step=0.0165, global_step=3012.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1391/5971 [14:05<46:21,  1.65it/s, loss=0.0598, v_num=0, train/loss_simple_step=0.00315, train/loss_vlb_step=1.77e-5, train/loss_step=0.00315, global_step=3012.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1392/5971 [14:07<46:26,  1.64it/s, loss=0.0598, v_num=0, train/loss_simple_step=0.00315, train/loss_vlb_step=1.77e-5, train/loss_step=0.00315, global_step=3012.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1392/5971 [14:07<46:26,  1.64it/s, loss=0.0696, v_num=0, train/loss_simple_step=0.199, train/loss_vlb_step=0.000701, train/loss_step=0.199, global_step=3012.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  23%|██▎       | 1393/5971 [14:08<46:27,  1.64it/s, loss=0.0696, v_num=0, train/loss_simple_step=0.199, train/loss_vlb_step=0.000701, train/loss_step=0.199, global_step=3012.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1393/5971 [14:08<46:27,  1.64it/s, loss=0.0765, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000486, train/loss_step=0.144, global_step=3013.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1394/5971 [14:09<46:27,  1.64it/s, loss=0.0765, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000486, train/loss_step=0.144, global_step=3013.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1394/5971 [14:09<46:27,  1.64it/s, loss=0.0817, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000528, train/loss_step=0.161, global_step=3013.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1395/5971 [14:10<46:27,  1.64it/s, loss=0.0817, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000528, train/loss_step=0.161, global_step=3013.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1395/5971 [14:10<46:27,  1.64it/s, loss=0.084, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000651, train/loss_step=0.196, global_step=3013.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  23%|██▎       | 1396/5971 [14:12<46:32,  1.64it/s, loss=0.084, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000651, train/loss_step=0.196, global_step=3013.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1396/5971 [14:12<46:32,  1.64it/s, loss=0.0856, v_num=0, train/loss_simple_step=0.0347, train/loss_vlb_step=0.000128, train/loss_step=0.0347, global_step=3013.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1397/5971 [14:13<46:32,  1.64it/s, loss=0.0856, v_num=0, train/loss_simple_step=0.0347, train/loss_vlb_step=0.000128, train/loss_step=0.0347, global_step=3013.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1397/5971 [14:13<46:32,  1.64it/s, loss=0.0865, v_num=0, train/loss_simple_step=0.0206, train/loss_vlb_step=8.74e-5, train/loss_step=0.0206, global_step=3014.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  23%|██▎       | 1398/5971 [14:14<46:32,  1.64it/s, loss=0.0865, v_num=0, train/loss_simple_step=0.0206, train/loss_vlb_step=8.74e-5, train/loss_step=0.0206, global_step=3014.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1398/5971 [14:14<46:32,  1.64it/s, loss=0.0884, v_num=0, train/loss_simple_step=0.053, train/loss_vlb_step=0.000185, train/loss_step=0.053, global_step=3014.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  23%|██▎       | 1399/5971 [14:15<46:33,  1.64it/s, loss=0.0884, v_num=0, train/loss_simple_step=0.053, train/loss_vlb_step=0.000185, train/loss_step=0.053, global_step=3014.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1399/5971 [14:15<46:33,  1.64it/s, loss=0.0954, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.00126, train/loss_step=0.283, global_step=3014.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  23%|██▎       | 1400/5971 [14:17<46:37,  1.63it/s, loss=0.0954, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.00126, train/loss_step=0.283, global_step=3014.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1400/5971 [14:17<46:37,  1.63it/s, loss=0.0871, v_num=0, train/loss_simple_step=0.00827, train/loss_vlb_step=3.82e-5, train/loss_step=0.00827, global_step=3014.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1401/5971 [14:18<46:37,  1.63it/s, loss=0.0871, v_num=0, train/loss_simple_step=0.00827, train/loss_vlb_step=3.82e-5, train/loss_step=0.00827, global_step=3014.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1401/5971 [14:18<46:37,  1.63it/s, loss=0.0874, v_num=0, train/loss_simple_step=0.0103, train/loss_vlb_step=4.69e-5, train/loss_step=0.0103, global_step=3015.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  23%|██▎       | 1402/5971 [14:19<46:38,  1.63it/s, loss=0.0874, v_num=0, train/loss_simple_step=0.0103, train/loss_vlb_step=4.69e-5, train/loss_step=0.0103, global_step=3015.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1402/5971 [14:19<46:38,  1.63it/s, loss=0.0965, v_num=0, train/loss_simple_step=0.402, train/loss_vlb_step=0.00188, train/loss_step=0.402, global_step=3015.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  23%|██▎       | 1403/5971 [14:20<46:38,  1.63it/s, loss=0.0965, v_num=0, train/loss_simple_step=0.402, train/loss_vlb_step=0.00188, train/loss_step=0.402, global_step=3015.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  23%|██▎       | 1403/5971 [14:20<46:38,  1.63it/s, loss=0.0992, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000495, train/loss_step=0.150, global_step=3015.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▎       | 1404/5971 [14:22<46:43,  1.63it/s, loss=0.0992, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000495, train/loss_step=0.150, global_step=3015.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▎       | 1404/5971 [14:22<46:43,  1.63it/s, loss=0.112, v_num=0, train/loss_simple_step=0.291, train/loss_vlb_step=0.0013, train/loss_step=0.291, global_step=3015.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  24%|██▎       | 1405/5971 [14:23<46:43,  1.63it/s, loss=0.112, v_num=0, train/loss_simple_step=0.291, train/loss_vlb_step=0.0013, train/loss_step=0.291, global_step=3015.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▎       | 1405/5971 [14:23<46:43,  1.63it/s, loss=0.14, v_num=0, train/loss_simple_step=0.567, train/loss_vlb_step=0.00503, train/loss_step=0.567, global_step=3016.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▎       | 1406/5971 [14:24<46:43,  1.63it/s, loss=0.14, v_num=0, train/loss_simple_step=0.567, train/loss_vlb_step=0.00503, train/loss_step=0.567, global_step=3016.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▎       | 1406/5971 [14:24<46:43,  1.63it/s, loss=0.148, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000566, train/loss_step=0.162, global_step=3016.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▎       | 1407/5971 [14:25<46:44,  1.63it/s, loss=0.148, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000566, train/loss_step=0.162, global_step=3016.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▎       | 1407/5971 [14:25<46:44,  1.63it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0855, train/loss_vlb_step=0.000281, train/loss_step=0.0855, global_step=3016.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▎       | 1408/5971 [14:27<46:49,  1.62it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0855, train/loss_vlb_step=0.000281, train/loss_step=0.0855, global_step=3016.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▎       | 1408/5971 [14:27<46:49,  1.62it/s, loss=0.147, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000496, train/loss_step=0.151, global_step=3016.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  24%|██▎       | 1409/5971 [14:28<46:50,  1.62it/s, loss=0.147, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000496, train/loss_step=0.151, global_step=3016.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▎       | 1409/5971 [14:28<46:50,  1.62it/s, loss=0.156, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000593, train/loss_step=0.173, global_step=3017.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▎       | 1410/5971 [14:29<46:50,  1.62it/s, loss=0.156, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000593, train/loss_step=0.173, global_step=3017.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▎       | 1410/5971 [14:29<46:50,  1.62it/s, loss=0.201, v_num=0, train/loss_simple_step=0.930, train/loss_vlb_step=0.468, train/loss_step=0.930, global_step=3017.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  24%|██▎       | 1411/5971 [14:30<46:51,  1.62it/s, loss=0.201, v_num=0, train/loss_simple_step=0.930, train/loss_vlb_step=0.468, train/loss_step=0.930, global_step=3017.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▎       | 1411/5971 [14:30<46:51,  1.62it/s, loss=0.201, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.06e-5, train/loss_step=0.00183, global_step=3017.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▎       | 1412/5971 [14:32<46:55,  1.62it/s, loss=0.201, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.06e-5, train/loss_step=0.00183, global_step=3017.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▎       | 1412/5971 [14:32<46:55,  1.62it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00894, train/loss_vlb_step=4.13e-5, train/loss_step=0.00894, global_step=3017.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▎       | 1413/5971 [14:33<46:55,  1.62it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00894, train/loss_vlb_step=4.13e-5, train/loss_step=0.00894, global_step=3017.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▎       | 1413/5971 [14:33<46:55,  1.62it/s, loss=0.21, v_num=0, train/loss_simple_step=0.514, train/loss_vlb_step=0.00743, train/loss_step=0.514, global_step=3018.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:  24%|██▎       | 1414/5971 [14:34<46:55,  1.62it/s, loss=0.21, v_num=0, train/loss_simple_step=0.514, train/loss_vlb_step=0.00743, train/loss_step=0.514, global_step=3018.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▎       | 1414/5971 [14:34<46:55,  1.62it/s, loss=0.219, v_num=0, train/loss_simple_step=0.343, train/loss_vlb_step=0.00154, train/loss_step=0.343, global_step=3018.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▎       | 1415/5971 [14:35<46:56,  1.62it/s, loss=0.219, v_num=0, train/loss_simple_step=0.343, train/loss_vlb_step=0.00154, train/loss_step=0.343, global_step=3018.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▎       | 1415/5971 [14:35<46:56,  1.62it/s, loss=0.21, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.27e-5, train/loss_step=0.018, global_step=3018.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  24%|██▎       | 1416/5971 [14:37<47:00,  1.62it/s, loss=0.21, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.27e-5, train/loss_step=0.018, global_step=3018.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▎       | 1416/5971 [14:37<47:00,  1.62it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0982, train/loss_vlb_step=0.000323, train/loss_step=0.0982, global_step=3018.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▎       | 1417/5971 [14:38<47:00,  1.61it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0982, train/loss_vlb_step=0.000323, train/loss_step=0.0982, global_step=3018.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▎       | 1417/5971 [14:38<47:00,  1.61it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0095, train/loss_vlb_step=4.43e-5, train/loss_step=0.0095, global_step=3019.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  24%|██▎       | 1418/5971 [14:39<47:00,  1.61it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0095, train/loss_vlb_step=4.43e-5, train/loss_step=0.0095, global_step=3019.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▎       | 1418/5971 [14:39<47:00,  1.61it/s, loss=0.21, v_num=0, train/loss_simple_step=0.0017, train/loss_vlb_step=9.63e-6, train/loss_step=0.0017, global_step=3019.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  24%|██▍       | 1419/5971 [14:40<47:01,  1.61it/s, loss=0.21, v_num=0, train/loss_simple_step=0.0017, train/loss_vlb_step=9.63e-6, train/loss_step=0.0017, global_step=3019.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1419/5971 [14:40<47:01,  1.61it/s, loss=0.226, v_num=0, train/loss_simple_step=0.596, train/loss_vlb_step=0.0079, train/loss_step=0.596, global_step=3019.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  24%|██▍       | 1420/5971 [14:42<47:05,  1.61it/s, loss=0.226, v_num=0, train/loss_simple_step=0.596, train/loss_vlb_step=0.0079, train/loss_step=0.596, global_step=3019.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1420/5971 [14:42<47:05,  1.61it/s, loss=0.23, v_num=0, train/loss_simple_step=0.0828, train/loss_vlb_step=0.000281, train/loss_step=0.0828, global_step=3019.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1421/5971 [14:43<47:06,  1.61it/s, loss=0.23, v_num=0, train/loss_simple_step=0.0828, train/loss_vlb_step=0.000281, train/loss_step=0.0828, global_step=3019.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1421/5971 [14:43<47:06,  1.61it/s, loss=0.231, v_num=0, train/loss_simple_step=0.0408, train/loss_vlb_step=0.000148, train/loss_step=0.0408, global_step=3020.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1422/5971 [14:44<47:06,  1.61it/s, loss=0.231, v_num=0, train/loss_simple_step=0.0408, train/loss_vlb_step=0.000148, train/loss_step=0.0408, global_step=3020.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1422/5971 [14:44<47:06,  1.61it/s, loss=0.213, v_num=0, train/loss_simple_step=0.030, train/loss_vlb_step=0.000111, train/loss_step=0.030, global_step=3020.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  24%|██▍       | 1423/5971 [14:45<47:06,  1.61it/s, loss=0.213, v_num=0, train/loss_simple_step=0.030, train/loss_vlb_step=0.000111, train/loss_step=0.030, global_step=3020.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1423/5971 [14:45<47:06,  1.61it/s, loss=0.235, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0127, train/loss_step=0.606, global_step=3020.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  24%|██▍       | 1424/5971 [14:47<47:11,  1.61it/s, loss=0.235, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0127, train/loss_step=0.606, global_step=3020.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1424/5971 [14:47<47:11,  1.61it/s, loss=0.237, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.00144, train/loss_step=0.313, global_step=3020.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1425/5971 [14:48<47:11,  1.61it/s, loss=0.237, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.00144, train/loss_step=0.313, global_step=3020.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1425/5971 [14:48<47:11,  1.61it/s, loss=0.216, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000691, train/loss_step=0.157, global_step=3021.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1426/5971 [14:49<47:11,  1.60it/s, loss=0.216, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000691, train/loss_step=0.157, global_step=3021.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1426/5971 [14:49<47:11,  1.60it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0537, train/loss_vlb_step=0.00018, train/loss_step=0.0537, global_step=3021.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1427/5971 [14:50<47:12,  1.60it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0537, train/loss_vlb_step=0.00018, train/loss_step=0.0537, global_step=3021.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1427/5971 [14:50<47:12,  1.60it/s, loss=0.209, v_num=0, train/loss_simple_step=0.0571, train/loss_vlb_step=0.000192, train/loss_step=0.0571, global_step=3021.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1428/5971 [14:52<47:17,  1.60it/s, loss=0.209, v_num=0, train/loss_simple_step=0.0571, train/loss_vlb_step=0.000192, train/loss_step=0.0571, global_step=3021.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1428/5971 [14:52<47:17,  1.60it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0879, train/loss_vlb_step=0.000291, train/loss_step=0.0879, global_step=3021.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1429/5971 [14:53<47:18,  1.60it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0879, train/loss_vlb_step=0.000291, train/loss_step=0.0879, global_step=3021.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1429/5971 [14:53<47:18,  1.60it/s, loss=0.206, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.000595, train/loss_step=0.176, global_step=3022.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  24%|██▍       | 1430/5971 [14:54<47:18,  1.60it/s, loss=0.206, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.000595, train/loss_step=0.176, global_step=3022.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1430/5971 [14:54<47:18,  1.60it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0842, train/loss_vlb_step=0.000277, train/loss_step=0.0842, global_step=3022.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1431/5971 [14:55<47:18,  1.60it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0842, train/loss_vlb_step=0.000277, train/loss_step=0.0842, global_step=3022.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1431/5971 [14:55<47:18,  1.60it/s, loss=0.203, v_num=0, train/loss_simple_step=0.782, train/loss_vlb_step=0.0208, train/loss_step=0.782, global_step=3022.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  24%|██▍       | 1432/5971 [14:57<47:23,  1.60it/s, loss=0.203, v_num=0, train/loss_simple_step=0.782, train/loss_vlb_step=0.0208, train/loss_step=0.782, global_step=3022.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1432/5971 [14:57<47:23,  1.60it/s, loss=0.213, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000846, train/loss_step=0.204, global_step=3022.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1433/5971 [14:58<47:23,  1.60it/s, loss=0.213, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000846, train/loss_step=0.204, global_step=3022.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1433/5971 [14:58<47:23,  1.60it/s, loss=0.193, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000426, train/loss_step=0.128, global_step=3023.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1434/5971 [14:59<47:23,  1.60it/s, loss=0.193, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000426, train/loss_step=0.128, global_step=3023.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1434/5971 [14:59<47:23,  1.60it/s, loss=0.226, v_num=0, train/loss_simple_step=0.991, train/loss_vlb_step=0.499, train/loss_step=0.991, global_step=3023.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  24%|██▍       | 1435/5971 [15:00<47:23,  1.60it/s, loss=0.226, v_num=0, train/loss_simple_step=0.991, train/loss_vlb_step=0.499, train/loss_step=0.991, global_step=3023.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1435/5971 [15:00<47:23,  1.60it/s, loss=0.225, v_num=0, train/loss_simple_step=0.0033, train/loss_vlb_step=1.72e-5, train/loss_step=0.0033, global_step=3023.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1436/5971 [15:02<47:27,  1.59it/s, loss=0.225, v_num=0, train/loss_simple_step=0.0033, train/loss_vlb_step=1.72e-5, train/loss_step=0.0033, global_step=3023.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1436/5971 [15:02<47:27,  1.59it/s, loss=0.22, v_num=0, train/loss_simple_step=0.00284, train/loss_vlb_step=1.66e-5, train/loss_step=0.00284, global_step=3023.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1437/5971 [15:03<47:27,  1.59it/s, loss=0.22, v_num=0, train/loss_simple_step=0.00284, train/loss_vlb_step=1.66e-5, train/loss_step=0.00284, global_step=3023.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1437/5971 [15:03<47:27,  1.59it/s, loss=0.22, v_num=0, train/loss_simple_step=0.00541, train/loss_vlb_step=2.91e-5, train/loss_step=0.00541, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1438/5971 [15:04<47:28,  1.59it/s, loss=0.22, v_num=0, train/loss_simple_step=0.00541, train/loss_vlb_step=2.91e-5, train/loss_step=0.00541, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1438/5971 [15:04<47:28,  1.59it/s, loss=0.223, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000209, train/loss_step=0.0619, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1439/5971 [15:05<47:28,  1.59it/s, loss=0.223, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000209, train/loss_step=0.0619, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1439/5971 [15:05<47:28,  1.59it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0351, train/loss_vlb_step=0.000125, train/loss_step=0.0351, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1440/5971 [15:07<47:33,  1.59it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0351, train/loss_vlb_step=0.000125, train/loss_step=0.0351, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  24%|██▍       | 1440/5971 [15:07<47:33,  1.59it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:02,  2.67it/s]
Epoch 5:  24%|██▍       | 1442/5971 [15:07<47:29,  1.59it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   1%|          | 2/167 [00:00<00:49,  3.36it/s]
Epoch 5:  24%|██▍       | 1444/5971 [15:08<47:24,  1.59it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   3%|▎         | 5/167 [00:00<00:18,  8.88it/s]
Epoch 5:  24%|██▍       | 1447/5971 [15:08<47:17,  1.59it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.75it/s]
Epoch 5:  24%|██▍       | 1450/5971 [15:08<47:09,  1.60it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   7%|▋         | 11/167 [00:00<00:08, 17.37it/s]
Epoch 5:  24%|██▍       | 1453/5971 [15:08<47:02,  1.60it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   8%|▊         | 14/167 [00:01<00:07, 20.13it/s]
Epoch 5:  24%|██▍       | 1456/5971 [15:08<46:55,  1.60it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  10%|█         | 17/167 [00:01<00:06, 22.22it/s]
Epoch 5:  24%|██▍       | 1459/5971 [15:08<46:47,  1.61it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 23.47it/s]
Epoch 5:  24%|██▍       | 1462/5971 [15:08<46:40,  1.61it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  14%|█▍        | 23/167 [00:01<00:05, 24.83it/s]
Epoch 5:  25%|██▍       | 1466/5971 [15:08<46:30,  1.61it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 25.69it/s]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 26.45it/s]
Epoch 5:  25%|██▍       | 1470/5971 [15:08<46:21,  1.62it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  20%|█▉        | 33/167 [00:01<00:04, 27.82it/s]
Epoch 5:  25%|██▍       | 1474/5971 [15:09<46:11,  1.62it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  22%|██▏       | 36/167 [00:01<00:04, 28.29it/s][A
Epoch 5:  25%|██▍       | 1478/5971 [15:09<46:02,  1.63it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  23%|██▎       | 39/167 [00:01<00:04, 27.99it/s][A
Epoch 5:  25%|██▍       | 1482/5971 [15:09<45:52,  1.63it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 27.55it/s][A
Epoch 5:  25%|██▍       | 1486/5971 [15:09<45:43,  1.63it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  28%|██▊       | 46/167 [00:02<00:04, 28.22it/s][A
Epoch 5:  25%|██▍       | 1490/5971 [15:09<45:33,  1.64it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 28.11it/s][A

Validating:  32%|███▏      | 53/167 [00:02<00:04, 28.05it/s][A
Epoch 5:  25%|██▌       | 1494/5971 [15:09<45:24,  1.64it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 27.44it/s][A
Epoch 5:  25%|██▌       | 1498/5971 [15:09<45:15,  1.65it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  35%|███▌      | 59/167 [00:02<00:04, 25.93it/s][A
Epoch 5:  25%|██▌       | 1502/5971 [15:10<45:06,  1.65it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  37%|███▋      | 62/167 [00:02<00:04, 25.79it/s][A

Validating:  39%|███▉      | 65/167 [00:02<00:03, 26.26it/s][A
Epoch 5:  25%|██▌       | 1506/5971 [15:10<44:57,  1.66it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  41%|████      | 68/167 [00:03<00:03, 26.88it/s][A
Epoch 5:  25%|██▌       | 1510/5971 [15:10<44:47,  1.66it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 27.42it/s][A
Epoch 5:  25%|██▌       | 1514/5971 [15:10<44:38,  1.66it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 27.24it/s][A

Validating:  46%|████▌     | 77/167 [00:03<00:03, 26.20it/s][A
Epoch 5:  25%|██▌       | 1518/5971 [15:10<44:29,  1.67it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 27.56it/s][A
Epoch 5:  25%|██▌       | 1522/5971 [15:10<44:20,  1.67it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  50%|█████     | 84/167 [00:03<00:03, 26.93it/s][A
Epoch 5:  26%|██▌       | 1526/5971 [15:11<44:11,  1.68it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  52%|█████▏    | 87/167 [00:03<00:03, 26.01it/s][A
Epoch 5:  26%|██▌       | 1530/5971 [15:11<44:03,  1.68it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  54%|█████▍    | 90/167 [00:03<00:02, 26.51it/s][A

Validating:  56%|█████▌    | 93/167 [00:03<00:02, 27.13it/s][A
Epoch 5:  26%|██▌       | 1534/5971 [15:11<43:54,  1.68it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 26.88it/s][A
Epoch 5:  26%|██▌       | 1538/5971 [15:11<43:45,  1.69it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 27.35it/s][A
Epoch 5:  26%|██▌       | 1542/5971 [15:11<43:36,  1.69it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  61%|██████    | 102/167 [00:04<00:02, 27.49it/s][A
Epoch 5:  26%|██▌       | 1546/5971 [15:11<43:27,  1.70it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 27.94it/s][A

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 26.71it/s][A
Epoch 5:  26%|██▌       | 1550/5971 [15:11<43:19,  1.70it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  67%|██████▋   | 112/167 [00:04<00:02, 27.18it/s][A
Epoch 5:  26%|██▌       | 1554/5971 [15:12<43:10,  1.70it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  69%|██████▉   | 115/167 [00:04<00:01, 26.66it/s][A
Epoch 5:  26%|██▌       | 1558/5971 [15:12<43:02,  1.71it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  71%|███████   | 118/167 [00:04<00:01, 26.62it/s][A

Validating:  72%|███████▏  | 121/167 [00:04<00:01, 26.97it/s][A
Epoch 5:  26%|██▌       | 1562/5971 [15:12<42:53,  1.71it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 27.23it/s][A
Epoch 5:  26%|██▌       | 1566/5971 [15:12<42:45,  1.72it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 25.58it/s][A
Epoch 5:  26%|██▋       | 1570/5971 [15:12<42:36,  1.72it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 26.25it/s][A

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 26.14it/s][A
Epoch 5:  26%|██▋       | 1574/5971 [15:12<42:28,  1.73it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 26.58it/s][A
Epoch 5:  26%|██▋       | 1578/5971 [15:12<42:20,  1.73it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 24.49it/s][A
Epoch 5:  26%|██▋       | 1582/5971 [15:13<42:11,  1.73it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  85%|████████▌ | 142/167 [00:05<00:00, 25.02it/s][A

Validating:  87%|████████▋ | 145/167 [00:05<00:00, 24.72it/s][A
Epoch 5:  27%|██▋       | 1586/5971 [15:13<42:03,  1.74it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 24.08it/s][A
Epoch 5:  27%|██▋       | 1590/5971 [15:13<41:55,  1.74it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 23.19it/s][A
Epoch 5:  27%|██▋       | 1594/5971 [15:13<41:47,  1.75it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 22.73it/s][A

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 23.53it/s][A
Epoch 5:  27%|██▋       | 1598/5971 [15:13<41:39,  1.75it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 24.17it/s][A
Epoch 5:  27%|██▋       | 1602/5971 [15:14<41:31,  1.75it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 24.99it/s][A
Epoch 5:  27%|██▋       | 1606/5971 [15:14<41:23,  1.76it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 25.37it/s][A
Epoch 5:  27%|██▋       | 1608/5971 [15:14<41:19,  1.76it/s, loss=0.197, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3024.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Epoch 5:  27%|██▋       | 1609/5971 [15:15<41:20,  1.76it/s, loss=0.208, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.000876, train/loss_step=0.249, global_step=3025.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1610/5971 [15:16<41:20,  1.76it/s, loss=0.208, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.000876, train/loss_step=0.249, global_step=3025.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1610/5971 [15:16<41:20,  1.76it/s, loss=0.214, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000496, train/loss_step=0.147, global_step=3025.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1611/5971 [15:17<41:20,  1.76it/s, loss=0.189, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.00038, train/loss_step=0.115, global_step=3025.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  27%|██▋       | 1612/5971 [15:19<41:25,  1.75it/s, loss=0.174, v_num=0, train/loss_simple_step=0.00172, train/loss_vlb_step=1.05e-5, train/loss_step=0.00172, global_step=3025.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1613/5971 [15:20<41:25,  1.75it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0394, train/loss_vlb_step=0.00014, train/loss_step=0.0394, global_step=3026.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  27%|██▋       | 1614/5971 [15:21<41:26,  1.75it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0394, train/loss_vlb_step=0.00014, train/loss_step=0.0394, global_step=3026.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1614/5971 [15:21<41:26,  1.75it/s, loss=0.18, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00132, train/loss_step=0.305, global_step=3026.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  27%|██▋       | 1615/5971 [15:22<41:26,  1.75it/s, loss=0.183, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000377, train/loss_step=0.114, global_step=3026.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1616/5971 [15:24<41:30,  1.75it/s, loss=0.181, v_num=0, train/loss_simple_step=0.047, train/loss_vlb_step=0.000163, train/loss_step=0.047, global_step=3026.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1617/5971 [15:25<41:31,  1.75it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0201, train/loss_vlb_step=7.8e-5, train/loss_step=0.0201, global_step=3027.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1618/5971 [15:26<41:31,  1.75it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0201, train/loss_vlb_step=7.8e-5, train/loss_step=0.0201, global_step=3027.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1618/5971 [15:26<41:31,  1.75it/s, loss=0.19, v_num=0, train/loss_simple_step=0.418, train/loss_vlb_step=0.00264, train/loss_step=0.418, global_step=3027.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  27%|██▋       | 1619/5971 [15:27<41:31,  1.75it/s, loss=0.158, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000522, train/loss_step=0.152, global_step=3027.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1620/5971 [15:30<41:36,  1.74it/s, loss=0.154, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000367, train/loss_step=0.112, global_step=3027.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1621/5971 [15:30<41:36,  1.74it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00652, train/loss_vlb_step=3.21e-5, train/loss_step=0.00652, global_step=3028.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1622/5971 [15:31<41:37,  1.74it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00652, train/loss_vlb_step=3.21e-5, train/loss_step=0.00652, global_step=3028.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1622/5971 [15:31<41:37,  1.74it/s, loss=0.0985, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.06e-5, train/loss_step=0.00382, global_step=3028.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1623/5971 [15:32<41:37,  1.74it/s, loss=0.0984, v_num=0, train/loss_simple_step=0.00221, train/loss_vlb_step=1.27e-5, train/loss_step=0.00221, global_step=3028.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1624/5971 [15:35<41:42,  1.74it/s, loss=0.123, v_num=0, train/loss_simple_step=0.486, train/loss_vlb_step=0.00484, train/loss_step=0.486, global_step=3028.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:  27%|██▋       | 1625/5971 [15:36<41:42,  1.74it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0848, train/loss_vlb_step=0.000279, train/loss_step=0.0848, global_step=3029.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1626/5971 [15:37<41:42,  1.74it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0848, train/loss_vlb_step=0.000279, train/loss_step=0.0848, global_step=3029.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1626/5971 [15:37<41:42,  1.74it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0692, train/loss_vlb_step=0.000231, train/loss_step=0.0692, global_step=3029.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1627/5971 [15:38<41:43,  1.74it/s, loss=0.143, v_num=0, train/loss_simple_step=0.352, train/loss_vlb_step=0.00181, train/loss_step=0.352, global_step=3029.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  27%|██▋       | 1628/5971 [15:40<41:46,  1.73it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00406, train/loss_vlb_step=2.23e-5, train/loss_step=0.00406, global_step=3029.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1629/5971 [15:41<41:47,  1.73it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00976, train/loss_vlb_step=4.57e-5, train/loss_step=0.00976, global_step=3030.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1630/5971 [15:42<41:47,  1.73it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00976, train/loss_vlb_step=4.57e-5, train/loss_step=0.00976, global_step=3030.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1630/5971 [15:42<41:47,  1.73it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00493, train/loss_vlb_step=2.65e-5, train/loss_step=0.00493, global_step=3030.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1631/5971 [15:42<41:47,  1.73it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0181, train/loss_vlb_step=7.75e-5, train/loss_step=0.0181, global_step=3030.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  27%|██▋       | 1632/5971 [15:45<41:51,  1.73it/s, loss=0.149, v_num=0, train/loss_simple_step=0.726, train/loss_vlb_step=0.0152, train/loss_step=0.726, global_step=3030.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  27%|██▋       | 1633/5971 [15:46<41:51,  1.73it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00195, train/loss_vlb_step=1.17e-5, train/loss_step=0.00195, global_step=3031.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1634/5971 [15:46<41:51,  1.73it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00195, train/loss_vlb_step=1.17e-5, train/loss_step=0.00195, global_step=3031.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1634/5971 [15:46<41:51,  1.73it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0221, train/loss_vlb_step=8.55e-5, train/loss_step=0.0221, global_step=3031.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  27%|██▋       | 1635/5971 [15:47<41:52,  1.73it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0202, train/loss_vlb_step=7.97e-5, train/loss_step=0.0202, global_step=3031.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1636/5971 [15:50<41:56,  1.72it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=6.19e-5, train/loss_step=0.0144, global_step=3031.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1637/5971 [15:51<41:56,  1.72it/s, loss=0.143, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.0018, train/loss_step=0.349, global_step=3032.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  27%|██▋       | 1638/5971 [15:51<41:56,  1.72it/s, loss=0.143, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.0018, train/loss_step=0.349, global_step=3032.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1638/5971 [15:51<41:56,  1.72it/s, loss=0.154, v_num=0, train/loss_simple_step=0.634, train/loss_vlb_step=0.0113, train/loss_step=0.634, global_step=3032.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1639/5971 [15:52<41:56,  1.72it/s, loss=0.153, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000458, train/loss_step=0.139, global_step=3032.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1640/5971 [15:55<42:00,  1.72it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0032, train/loss_vlb_step=1.72e-5, train/loss_step=0.0032, global_step=3032.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1641/5971 [15:55<42:00,  1.72it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0335, train/loss_vlb_step=0.000123, train/loss_step=0.0335, global_step=3033.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1642/5971 [15:56<42:00,  1.72it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0335, train/loss_vlb_step=0.000123, train/loss_step=0.0335, global_step=3033.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  27%|██▋       | 1642/5971 [15:56<42:00,  1.72it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0674, train/loss_vlb_step=0.00023, train/loss_step=0.0674, global_step=3033.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  28%|██▊       | 1643/5971 [15:57<42:01,  1.72it/s, loss=0.161, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000647, train/loss_step=0.172, global_step=3033.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  28%|██▊       | 1644/5971 [16:00<42:06,  1.71it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.33e-5, train/loss_step=0.0154, global_step=3033.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1645/5971 [16:01<42:07,  1.71it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0536, train/loss_vlb_step=0.000184, train/loss_step=0.0536, global_step=3034.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1646/5971 [16:02<42:07,  1.71it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0536, train/loss_vlb_step=0.000184, train/loss_step=0.0536, global_step=3034.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1646/5971 [16:02<42:07,  1.71it/s, loss=0.134, v_num=0, train/loss_simple_step=0.029, train/loss_vlb_step=0.00011, train/loss_step=0.029, global_step=3034.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  28%|██▊       | 1647/5971 [16:03<42:07,  1.71it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.5e-5, train/loss_step=0.0102, global_step=3034.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1648/5971 [16:05<42:10,  1.71it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0888, train/loss_vlb_step=0.000293, train/loss_step=0.0888, global_step=3034.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1649/5971 [16:06<42:11,  1.71it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0551, train/loss_vlb_step=0.000181, train/loss_step=0.0551, global_step=3035.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1650/5971 [16:07<42:11,  1.71it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0551, train/loss_vlb_step=0.000181, train/loss_step=0.0551, global_step=3035.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1650/5971 [16:07<42:11,  1.71it/s, loss=0.129, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000387, train/loss_step=0.118, global_step=3035.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  28%|██▊       | 1651/5971 [16:08<42:11,  1.71it/s, loss=0.13, v_num=0, train/loss_simple_step=0.039, train/loss_vlb_step=0.000142, train/loss_step=0.039, global_step=3035.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  28%|██▊       | 1652/5971 [16:10<42:15,  1.70it/s, loss=0.0934, v_num=0, train/loss_simple_step=0.00179, train/loss_vlb_step=1.09e-5, train/loss_step=0.00179, global_step=3035.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1653/5971 [16:11<42:15,  1.70it/s, loss=0.0946, v_num=0, train/loss_simple_step=0.0265, train/loss_vlb_step=0.000103, train/loss_step=0.0265, global_step=3036.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  28%|██▊       | 1654/5971 [16:12<42:15,  1.70it/s, loss=0.0946, v_num=0, train/loss_simple_step=0.0265, train/loss_vlb_step=0.000103, train/loss_step=0.0265, global_step=3036.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1654/5971 [16:12<42:15,  1.70it/s, loss=0.0937, v_num=0, train/loss_simple_step=0.00302, train/loss_vlb_step=1.63e-5, train/loss_step=0.00302, global_step=3036.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1655/5971 [16:12<42:15,  1.70it/s, loss=0.0966, v_num=0, train/loss_simple_step=0.0795, train/loss_vlb_step=0.000275, train/loss_step=0.0795, global_step=3036.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  28%|██▊       | 1656/5971 [16:15<42:19,  1.70it/s, loss=0.0975, v_num=0, train/loss_simple_step=0.0314, train/loss_vlb_step=0.000116, train/loss_step=0.0314, global_step=3036.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1657/5971 [16:15<42:19,  1.70it/s, loss=0.0802, v_num=0, train/loss_simple_step=0.00397, train/loss_vlb_step=2.14e-5, train/loss_step=0.00397, global_step=3037.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1658/5971 [16:16<42:19,  1.70it/s, loss=0.0802, v_num=0, train/loss_simple_step=0.00397, train/loss_vlb_step=2.14e-5, train/loss_step=0.00397, global_step=3037.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1658/5971 [16:16<42:19,  1.70it/s, loss=0.0494, v_num=0, train/loss_simple_step=0.0184, train/loss_vlb_step=7.3e-5, train/loss_step=0.0184, global_step=3037.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  28%|██▊       | 1659/5971 [16:17<42:19,  1.70it/s, loss=0.0434, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=7.44e-5, train/loss_step=0.0192, global_step=3037.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1660/5971 [16:19<42:22,  1.70it/s, loss=0.0436, v_num=0, train/loss_simple_step=0.00744, train/loss_vlb_step=3.28e-5, train/loss_step=0.00744, global_step=3037.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1661/5971 [16:20<42:23,  1.69it/s, loss=0.043, v_num=0, train/loss_simple_step=0.0214, train/loss_vlb_step=8.77e-5, train/loss_step=0.0214, global_step=3038.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  28%|██▊       | 1662/5971 [16:21<42:23,  1.69it/s, loss=0.043, v_num=0, train/loss_simple_step=0.0214, train/loss_vlb_step=8.77e-5, train/loss_step=0.0214, global_step=3038.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1662/5971 [16:21<42:23,  1.69it/s, loss=0.0397, v_num=0, train/loss_simple_step=0.00155, train/loss_vlb_step=9.24e-6, train/loss_step=0.00155, global_step=3038.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1663/5971 [16:22<42:23,  1.69it/s, loss=0.0315, v_num=0, train/loss_simple_step=0.00656, train/loss_vlb_step=3.31e-5, train/loss_step=0.00656, global_step=3038.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1664/5971 [16:24<42:27,  1.69it/s, loss=0.0309, v_num=0, train/loss_simple_step=0.00341, train/loss_vlb_step=1.78e-5, train/loss_step=0.00341, global_step=3038.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1665/5971 [16:25<42:28,  1.69it/s, loss=0.0284, v_num=0, train/loss_simple_step=0.00378, train/loss_vlb_step=1.98e-5, train/loss_step=0.00378, global_step=3039.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1666/5971 [16:26<42:28,  1.69it/s, loss=0.0284, v_num=0, train/loss_simple_step=0.00378, train/loss_vlb_step=1.98e-5, train/loss_step=0.00378, global_step=3039.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1666/5971 [16:26<42:28,  1.69it/s, loss=0.0272, v_num=0, train/loss_simple_step=0.00576, train/loss_vlb_step=2.77e-5, train/loss_step=0.00576, global_step=3039.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1667/5971 [16:27<42:28,  1.69it/s, loss=0.0281, v_num=0, train/loss_simple_step=0.0271, train/loss_vlb_step=0.00011, train/loss_step=0.0271, global_step=3039.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  28%|██▊       | 1668/5971 [16:29<42:31,  1.69it/s, loss=0.025, v_num=0, train/loss_simple_step=0.0267, train/loss_vlb_step=0.00011, train/loss_step=0.0267, global_step=3039.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  28%|██▊       | 1669/5971 [16:30<42:31,  1.69it/s, loss=0.0258, v_num=0, train/loss_simple_step=0.0709, train/loss_vlb_step=0.000242, train/loss_step=0.0709, global_step=3040.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1670/5971 [16:31<42:32,  1.69it/s, loss=0.0258, v_num=0, train/loss_simple_step=0.0709, train/loss_vlb_step=0.000242, train/loss_step=0.0709, global_step=3040.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1670/5971 [16:31<42:32,  1.69it/s, loss=0.0628, v_num=0, train/loss_simple_step=0.859, train/loss_vlb_step=0.0876, train/loss_step=0.859, global_step=3040.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  28%|██▊       | 1671/5971 [16:32<42:32,  1.68it/s, loss=0.0747, v_num=0, train/loss_simple_step=0.276, train/loss_vlb_step=0.00139, train/loss_step=0.276, global_step=3040.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1672/5971 [16:34<42:35,  1.68it/s, loss=0.0756, v_num=0, train/loss_simple_step=0.0205, train/loss_vlb_step=7.98e-5, train/loss_step=0.0205, global_step=3040.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1673/5971 [16:35<42:35,  1.68it/s, loss=0.076, v_num=0, train/loss_simple_step=0.0347, train/loss_vlb_step=0.000131, train/loss_step=0.0347, global_step=3041.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1674/5971 [16:36<42:35,  1.68it/s, loss=0.076, v_num=0, train/loss_simple_step=0.0347, train/loss_vlb_step=0.000131, train/loss_step=0.0347, global_step=3041.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1674/5971 [16:36<42:35,  1.68it/s, loss=0.087, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000861, train/loss_step=0.222, global_step=3041.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  28%|██▊       | 1675/5971 [16:37<42:36,  1.68it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.00102, train/loss_step=0.222, global_step=3041.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1676/5971 [16:39<42:39,  1.68it/s, loss=0.0949, v_num=0, train/loss_simple_step=0.0473, train/loss_vlb_step=0.000162, train/loss_step=0.0473, global_step=3041.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1677/5971 [16:40<42:39,  1.68it/s, loss=0.104, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000716, train/loss_step=0.193, global_step=3042.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  28%|██▊       | 1678/5971 [16:41<42:39,  1.68it/s, loss=0.104, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000716, train/loss_step=0.193, global_step=3042.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1678/5971 [16:41<42:39,  1.68it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0432, train/loss_vlb_step=0.00015, train/loss_step=0.0432, global_step=3042.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1679/5971 [16:42<42:40,  1.68it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0222, train/loss_vlb_step=8.78e-5, train/loss_step=0.0222, global_step=3042.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1680/5971 [16:44<42:43,  1.67it/s, loss=0.127, v_num=0, train/loss_simple_step=0.433, train/loss_vlb_step=0.00325, train/loss_step=0.433, global_step=3042.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  28%|██▊       | 1681/5971 [16:45<42:44,  1.67it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00409, train/loss_vlb_step=2.2e-5, train/loss_step=0.00409, global_step=3043.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1682/5971 [16:46<42:44,  1.67it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00409, train/loss_vlb_step=2.2e-5, train/loss_step=0.00409, global_step=3043.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1682/5971 [16:46<42:44,  1.67it/s, loss=0.145, v_num=0, train/loss_simple_step=0.373, train/loss_vlb_step=0.00249, train/loss_step=0.373, global_step=3043.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  28%|██▊       | 1683/5971 [16:47<42:44,  1.67it/s, loss=0.164, v_num=0, train/loss_simple_step=0.393, train/loss_vlb_step=0.00266, train/loss_step=0.393, global_step=3043.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1684/5971 [16:49<42:48,  1.67it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0049, train/loss_vlb_step=2.51e-5, train/loss_step=0.0049, global_step=3043.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1685/5971 [16:50<42:48,  1.67it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0672, train/loss_vlb_step=0.000226, train/loss_step=0.0672, global_step=3044.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1686/5971 [16:51<42:48,  1.67it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0672, train/loss_vlb_step=0.000226, train/loss_step=0.0672, global_step=3044.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1686/5971 [16:51<42:49,  1.67it/s, loss=0.173, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000393, train/loss_step=0.118, global_step=3044.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  28%|██▊       | 1687/5971 [16:52<42:49,  1.67it/s, loss=0.188, v_num=0, train/loss_simple_step=0.324, train/loss_vlb_step=0.00148, train/loss_step=0.324, global_step=3044.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  28%|██▊       | 1688/5971 [16:54<42:52,  1.66it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0459, train/loss_vlb_step=0.000163, train/loss_step=0.0459, global_step=3044.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1689/5971 [16:55<42:52,  1.66it/s, loss=0.186, v_num=0, train/loss_simple_step=0.025, train/loss_vlb_step=9.33e-5, train/loss_step=0.025, global_step=3045.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  28%|██▊       | 1690/5971 [16:56<42:52,  1.66it/s, loss=0.186, v_num=0, train/loss_simple_step=0.025, train/loss_vlb_step=9.33e-5, train/loss_step=0.025, global_step=3045.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1690/5971 [16:56<42:52,  1.66it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0546, train/loss_vlb_step=0.000188, train/loss_step=0.0546, global_step=3045.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1691/5971 [16:57<42:53,  1.66it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00192, train/loss_vlb_step=1.15e-5, train/loss_step=0.00192, global_step=3045.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1692/5971 [16:59<42:56,  1.66it/s, loss=0.169, v_num=0, train/loss_simple_step=0.756, train/loss_vlb_step=0.0152, train/loss_step=0.756, global_step=3045.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:  28%|██▊       | 1693/5971 [17:00<42:56,  1.66it/s, loss=0.183, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.00149, train/loss_step=0.313, global_step=3046.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1694/5971 [17:01<42:56,  1.66it/s, loss=0.183, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.00149, train/loss_step=0.313, global_step=3046.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1694/5971 [17:01<42:56,  1.66it/s, loss=0.206, v_num=0, train/loss_simple_step=0.683, train/loss_vlb_step=0.0201, train/loss_step=0.683, global_step=3046.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  28%|██▊       | 1695/5971 [17:02<42:56,  1.66it/s, loss=0.201, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000416, train/loss_step=0.126, global_step=3046.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1696/5971 [17:04<43:00,  1.66it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.49e-5, train/loss_step=0.0128, global_step=3046.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  28%|██▊       | 1697/5971 [17:05<43:00,  1.66it/s, loss=0.195, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000359, train/loss_step=0.109, global_step=3047.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1698/5971 [17:06<43:00,  1.66it/s, loss=0.195, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000359, train/loss_step=0.109, global_step=3047.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1698/5971 [17:06<43:00,  1.66it/s, loss=0.222, v_num=0, train/loss_simple_step=0.566, train/loss_vlb_step=0.00496, train/loss_step=0.566, global_step=3047.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  28%|██▊       | 1699/5971 [17:07<43:01,  1.66it/s, loss=0.224, v_num=0, train/loss_simple_step=0.0631, train/loss_vlb_step=0.000208, train/loss_step=0.0631, global_step=3047.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1700/5971 [17:09<43:04,  1.65it/s, loss=0.202, v_num=0, train/loss_simple_step=0.00531, train/loss_vlb_step=2.72e-5, train/loss_step=0.00531, global_step=3047.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  28%|██▊       | 1701/5971 [17:10<43:04,  1.65it/s, loss=0.21, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.00052, train/loss_step=0.158, global_step=3048.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:  29%|██▊       | 1702/5971 [17:10<43:04,  1.65it/s, loss=0.21, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.00052, train/loss_step=0.158, global_step=3048.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  29%|██▊       | 1702/5971 [17:10<43:04,  1.65it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0901, train/loss_vlb_step=0.000303, train/loss_step=0.0901, global_step=3048.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  29%|██▊       | 1703/5971 [17:11<43:04,  1.65it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0231, train/loss_vlb_step=9.43e-5, train/loss_step=0.0231, global_step=3048.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  29%|██▊       | 1704/5971 [17:14<43:09,  1.65it/s, loss=0.182, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000342, train/loss_step=0.104, global_step=3048.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  29%|██▊       | 1705/5971 [17:15<43:09,  1.65it/s, loss=0.179, v_num=0, train/loss_simple_step=0.00279, train/loss_vlb_step=1.57e-5, train/loss_step=0.00279, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  29%|██▊       | 1706/5971 [17:16<43:09,  1.65it/s, loss=0.179, v_num=0, train/loss_simple_step=0.00279, train/loss_vlb_step=1.57e-5, train/loss_step=0.00279, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  29%|██▊       | 1706/5971 [17:16<43:09,  1.65it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0327, train/loss_vlb_step=0.000125, train/loss_step=0.0327, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  29%|██▊       | 1707/5971 [17:17<43:09,  1.65it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0575, train/loss_vlb_step=0.000204, train/loss_step=0.0575, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  29%|██▊       | 1708/5971 [17:19<43:12,  1.64it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:20,  2.07it/s][A
Epoch 5:  29%|██▊       | 1710/5971 [17:19<43:09,  1.65it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   1%|          | 2/167 [00:00<00:46,  3.55it/s][A

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.03it/s][A
Epoch 5:  29%|██▊       | 1714/5971 [17:20<43:02,  1.65it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   4%|▍         | 7/167 [00:00<00:13, 11.54it/s][A
Epoch 5:  29%|██▉       | 1718/5971 [17:20<42:54,  1.65it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   6%|▌         | 10/167 [00:00<00:10, 14.84it/s][A

Validating:   8%|▊         | 13/167 [00:01<00:09, 15.68it/s][A
Epoch 5:  29%|██▉       | 1722/5971 [17:20<42:46,  1.66it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  10%|▉         | 16/167 [00:01<00:08, 18.05it/s][A
Epoch 5:  29%|██▉       | 1726/5971 [17:20<42:38,  1.66it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  11%|█▏        | 19/167 [00:01<00:07, 19.71it/s][A
Epoch 5:  29%|██▉       | 1730/5971 [17:21<42:30,  1.66it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  13%|█▎        | 22/167 [00:01<00:08, 17.25it/s][A

Validating:  14%|█▍        | 24/167 [00:01<00:08, 17.76it/s][A
Epoch 5:  29%|██▉       | 1734/5971 [17:21<42:22,  1.67it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  16%|█▌        | 27/167 [00:01<00:06, 20.34it/s][A
Epoch 5:  29%|██▉       | 1738/5971 [17:21<42:14,  1.67it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  18%|█▊        | 30/167 [00:01<00:06, 22.27it/s][A

Validating:  20%|█▉        | 33/167 [00:02<00:05, 24.18it/s][A
Epoch 5:  29%|██▉       | 1742/5971 [17:21<42:07,  1.67it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  22%|██▏       | 36/167 [00:02<00:05, 25.24it/s][A
Epoch 5:  29%|██▉       | 1746/5971 [17:21<41:59,  1.68it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  23%|██▎       | 39/167 [00:02<00:04, 26.09it/s][A
Epoch 5:  29%|██▉       | 1750/5971 [17:21<41:51,  1.68it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 26.33it/s][A

Validating:  27%|██▋       | 45/167 [00:02<00:04, 26.68it/s][A
Epoch 5:  29%|██▉       | 1754/5971 [17:21<41:43,  1.68it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 24.12it/s][A
Epoch 5:  29%|██▉       | 1758/5971 [17:22<41:36,  1.69it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  31%|███       | 51/167 [00:02<00:04, 23.77it/s][A
Epoch 5:  30%|██▉       | 1762/5971 [17:22<41:28,  1.69it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 24.01it/s][A

Validating:  34%|███▍      | 57/167 [00:02<00:04, 25.01it/s][A
Epoch 5:  30%|██▉       | 1766/5971 [17:22<41:20,  1.69it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  36%|███▌      | 60/167 [00:03<00:04, 24.00it/s][A
Epoch 5:  30%|██▉       | 1770/5971 [17:22<41:13,  1.70it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  38%|███▊      | 63/167 [00:03<00:04, 21.30it/s][A
Epoch 5:  30%|██▉       | 1774/5971 [17:22<41:06,  1.70it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  40%|███▉      | 66/167 [00:03<00:04, 21.28it/s][A

Validating:  41%|████▏     | 69/167 [00:03<00:04, 23.01it/s][A
Epoch 5:  30%|██▉       | 1778/5971 [17:23<40:58,  1.71it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  43%|████▎     | 72/167 [00:03<00:04, 19.20it/s][A
Epoch 5:  30%|██▉       | 1782/5971 [17:23<40:51,  1.71it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  45%|████▍     | 75/167 [00:03<00:04, 20.82it/s][A
Epoch 5:  30%|██▉       | 1786/5971 [17:23<40:43,  1.71it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  47%|████▋     | 78/167 [00:04<00:04, 21.82it/s][A

Validating:  49%|████▊     | 81/167 [00:04<00:03, 23.21it/s][A
Epoch 5:  30%|██▉       | 1790/5971 [17:23<40:36,  1.72it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  50%|█████     | 84/167 [00:04<00:03, 23.24it/s][A
Epoch 5:  30%|███       | 1794/5971 [17:23<40:28,  1.72it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  52%|█████▏    | 87/167 [00:04<00:03, 24.57it/s][A
Epoch 5:  30%|███       | 1798/5971 [17:23<40:21,  1.72it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  54%|█████▍    | 90/167 [00:04<00:03, 25.43it/s][A

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 25.07it/s][A
Epoch 5:  30%|███       | 1802/5971 [17:24<40:14,  1.73it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 24.92it/s][A
Epoch 5:  30%|███       | 1806/5971 [17:24<40:06,  1.73it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 24.14it/s][A
Epoch 5:  30%|███       | 1810/5971 [17:24<39:59,  1.73it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  61%|██████    | 102/167 [00:04<00:02, 23.67it/s][A

Validating:  63%|██████▎   | 105/167 [00:05<00:02, 24.34it/s][A
Epoch 5:  30%|███       | 1814/5971 [17:24<39:52,  1.74it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  65%|██████▍   | 108/167 [00:05<00:02, 25.18it/s][A
Epoch 5:  30%|███       | 1818/5971 [17:24<39:45,  1.74it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  66%|██████▋   | 111/167 [00:05<00:02, 24.49it/s][A
Epoch 5:  31%|███       | 1822/5971 [17:24<39:38,  1.74it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  68%|██████▊   | 114/167 [00:05<00:02, 23.97it/s][A

Validating:  70%|███████   | 117/167 [00:05<00:02, 24.18it/s][A
Epoch 5:  31%|███       | 1826/5971 [17:25<39:31,  1.75it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  72%|███████▏  | 120/167 [00:05<00:02, 23.35it/s][A
Epoch 5:  31%|███       | 1830/5971 [17:25<39:23,  1.75it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 23.92it/s][A
Epoch 5:  31%|███       | 1834/5971 [17:25<39:16,  1.76it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 23.94it/s][A

Validating:  77%|███████▋  | 129/167 [00:06<00:01, 23.37it/s][A
Epoch 5:  31%|███       | 1838/5971 [17:25<39:09,  1.76it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  79%|███████▉  | 132/167 [00:06<00:01, 24.17it/s][A
Epoch 5:  31%|███       | 1842/5971 [17:25<39:02,  1.76it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  81%|████████  | 135/167 [00:06<00:01, 24.17it/s][A
Epoch 5:  31%|███       | 1846/5971 [17:25<38:55,  1.77it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  83%|████████▎ | 138/167 [00:06<00:01, 24.64it/s][A

Validating:  84%|████████▍ | 141/167 [00:06<00:01, 21.22it/s][A
Epoch 5:  31%|███       | 1850/5971 [17:26<38:49,  1.77it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  86%|████████▌ | 144/167 [00:06<00:01, 22.89it/s][A
Epoch 5:  31%|███       | 1854/5971 [17:26<38:42,  1.77it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 24.28it/s][A
Epoch 5:  31%|███       | 1858/5971 [17:26<38:35,  1.78it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 24.43it/s][A

Validating:  92%|█████████▏| 153/167 [00:07<00:00, 25.23it/s][A
Epoch 5:  31%|███       | 1862/5971 [17:26<38:28,  1.78it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  93%|█████████▎| 156/167 [00:07<00:00, 25.49it/s][A
Epoch 5:  31%|███▏      | 1866/5971 [17:26<38:21,  1.78it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  95%|█████████▌| 159/167 [00:07<00:00, 26.33it/s][A
Epoch 5:  31%|███▏      | 1870/5971 [17:26<38:14,  1.79it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  97%|█████████▋| 162/167 [00:07<00:00, 26.96it/s][A

Validating:  99%|█████████▉| 165/167 [00:07<00:00, 27.09it/s][A
Epoch 5:  31%|███▏      | 1874/5971 [17:27<38:07,  1.79it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  31%|███▏      | 1876/5971 [17:27<38:04,  1.79it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.99e-5, train/loss_step=0.0111, global_step=3049.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Epoch 5:  31%|███▏      | 1877/5971 [17:28<38:05,  1.79it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0358, train/loss_vlb_step=0.000132, train/loss_step=0.0358, global_step=3050.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  31%|███▏      | 1878/5971 [17:29<38:05,  1.79it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0358, train/loss_vlb_step=0.000132, train/loss_step=0.0358, global_step=3050.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  31%|███▏      | 1878/5971 [17:29<38:05,  1.79it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00158, train/loss_vlb_step=9.16e-6, train/loss_step=0.00158, global_step=3050.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  31%|███▏      | 1879/5971 [17:30<38:05,  1.79it/s, loss=0.178, v_num=0, train/loss_simple_step=0.401, train/loss_vlb_step=0.00183, train/loss_step=0.401, global_step=3050.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  31%|███▏      | 1880/5971 [17:32<38:08,  1.79it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.47e-5, train/loss_step=0.0101, global_step=3050.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1881/5971 [17:33<38:08,  1.79it/s, loss=0.137, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000862, train/loss_step=0.240, global_step=3051.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1882/5971 [17:34<38:09,  1.79it/s, loss=0.137, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000862, train/loss_step=0.240, global_step=3051.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1882/5971 [17:34<38:09,  1.79it/s, loss=0.112, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000667, train/loss_step=0.189, global_step=3051.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1883/5971 [17:35<38:09,  1.79it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.56e-5, train/loss_step=0.0128, global_step=3051.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1884/5971 [17:37<38:12,  1.78it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00149, train/loss_vlb_step=8.87e-6, train/loss_step=0.00149, global_step=3051.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1885/5971 [17:38<38:12,  1.78it/s, loss=0.105, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=3052.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  32%|███▏      | 1886/5971 [17:38<38:12,  1.78it/s, loss=0.105, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=3052.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1886/5971 [17:38<38:12,  1.78it/s, loss=0.0774, v_num=0, train/loss_simple_step=0.0065, train/loss_vlb_step=2.99e-5, train/loss_step=0.0065, global_step=3052.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1887/5971 [17:39<38:12,  1.78it/s, loss=0.0746, v_num=0, train/loss_simple_step=0.0059, train/loss_vlb_step=2.98e-5, train/loss_step=0.0059, global_step=3052.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1888/5971 [17:41<38:15,  1.78it/s, loss=0.0872, v_num=0, train/loss_simple_step=0.257, train/loss_vlb_step=0.00108, train/loss_step=0.257, global_step=3052.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  32%|███▏      | 1889/5971 [17:42<38:15,  1.78it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000497, train/loss_step=0.151, global_step=3053.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1890/5971 [17:43<38:15,  1.78it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000497, train/loss_step=0.151, global_step=3053.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1890/5971 [17:43<38:15,  1.78it/s, loss=0.0835, v_num=0, train/loss_simple_step=0.0247, train/loss_vlb_step=0.000101, train/loss_step=0.0247, global_step=3053.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1891/5971 [17:44<38:15,  1.78it/s, loss=0.0989, v_num=0, train/loss_simple_step=0.330, train/loss_vlb_step=0.0017, train/loss_step=0.330, global_step=3053.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  32%|███▏      | 1892/5971 [17:46<38:18,  1.77it/s, loss=0.0938, v_num=0, train/loss_simple_step=0.00272, train/loss_vlb_step=1.54e-5, train/loss_step=0.00272, global_step=3053.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1893/5971 [17:47<38:18,  1.77it/s, loss=0.0949, v_num=0, train/loss_simple_step=0.0245, train/loss_vlb_step=0.000101, train/loss_step=0.0245, global_step=3054.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  32%|███▏      | 1894/5971 [17:48<38:18,  1.77it/s, loss=0.0949, v_num=0, train/loss_simple_step=0.0245, train/loss_vlb_step=0.000101, train/loss_step=0.0245, global_step=3054.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1894/5971 [17:48<38:18,  1.77it/s, loss=0.107, v_num=0, train/loss_simple_step=0.270, train/loss_vlb_step=0.0011, train/loss_step=0.270, global_step=3054.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:  32%|███▏      | 1895/5971 [17:49<38:19,  1.77it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00538, train/loss_vlb_step=2.66e-5, train/loss_step=0.00538, global_step=3054.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1896/5971 [17:51<38:21,  1.77it/s, loss=0.12, v_num=0, train/loss_simple_step=0.336, train/loss_vlb_step=0.0014, train/loss_step=0.336, global_step=3054.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]      
Epoch 5:  32%|███▏      | 1897/5971 [17:52<38:22,  1.77it/s, loss=0.146, v_num=0, train/loss_simple_step=0.543, train/loss_vlb_step=0.00557, train/loss_step=0.543, global_step=3055.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1898/5971 [17:53<38:22,  1.77it/s, loss=0.146, v_num=0, train/loss_simple_step=0.543, train/loss_vlb_step=0.00557, train/loss_step=0.543, global_step=3055.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1898/5971 [17:53<38:22,  1.77it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00778, train/loss_vlb_step=3.67e-5, train/loss_step=0.00778, global_step=3055.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1899/5971 [17:54<38:22,  1.77it/s, loss=0.137, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000821, train/loss_step=0.217, global_step=3055.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  32%|███▏      | 1900/5971 [17:56<38:25,  1.77it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0188, train/loss_vlb_step=7.83e-5, train/loss_step=0.0188, global_step=3055.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1901/5971 [17:57<38:25,  1.77it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00224, train/loss_vlb_step=1.31e-5, train/loss_step=0.00224, global_step=3056.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1902/5971 [17:58<38:25,  1.76it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00224, train/loss_vlb_step=1.31e-5, train/loss_step=0.00224, global_step=3056.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1902/5971 [17:58<38:25,  1.76it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0283, train/loss_vlb_step=0.000108, train/loss_step=0.0283, global_step=3056.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  32%|███▏      | 1903/5971 [17:59<38:25,  1.76it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.00032, train/loss_step=0.0971, global_step=3056.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  32%|███▏      | 1904/5971 [18:01<38:28,  1.76it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.57e-5, train/loss_step=0.00281, global_step=3056.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1905/5971 [18:02<38:28,  1.76it/s, loss=0.156, v_num=0, train/loss_simple_step=0.783, train/loss_vlb_step=0.0405, train/loss_step=0.783, global_step=3057.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:  32%|███▏      | 1906/5971 [18:03<38:29,  1.76it/s, loss=0.156, v_num=0, train/loss_simple_step=0.783, train/loss_vlb_step=0.0405, train/loss_step=0.783, global_step=3057.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1906/5971 [18:03<38:29,  1.76it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00767, train/loss_vlb_step=3.62e-5, train/loss_step=0.00767, global_step=3057.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1907/5971 [18:04<38:29,  1.76it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=5.04e-5, train/loss_step=0.0111, global_step=3057.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  32%|███▏      | 1908/5971 [18:06<38:32,  1.76it/s, loss=0.154, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000765, train/loss_step=0.224, global_step=3057.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  32%|███▏      | 1909/5971 [18:07<38:32,  1.76it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0357, train/loss_vlb_step=0.000127, train/loss_step=0.0357, global_step=3058.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1910/5971 [18:08<38:32,  1.76it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0357, train/loss_vlb_step=0.000127, train/loss_step=0.0357, global_step=3058.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1910/5971 [18:08<38:32,  1.76it/s, loss=0.17, v_num=0, train/loss_simple_step=0.449, train/loss_vlb_step=0.00265, train/loss_step=0.449, global_step=3058.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  32%|███▏      | 1911/5971 [18:09<38:32,  1.76it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0383, train/loss_vlb_step=0.000133, train/loss_step=0.0383, global_step=3058.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1912/5971 [18:11<38:35,  1.75it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0846, train/loss_vlb_step=0.000278, train/loss_step=0.0846, global_step=3058.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1913/5971 [18:12<38:35,  1.75it/s, loss=0.178, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.00317, train/loss_step=0.391, global_step=3059.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  32%|███▏      | 1914/5971 [18:13<38:35,  1.75it/s, loss=0.178, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.00317, train/loss_step=0.391, global_step=3059.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1914/5971 [18:13<38:35,  1.75it/s, loss=0.195, v_num=0, train/loss_simple_step=0.607, train/loss_vlb_step=0.0118, train/loss_step=0.607, global_step=3059.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  32%|███▏      | 1915/5971 [18:13<38:35,  1.75it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0148, train/loss_vlb_step=6.24e-5, train/loss_step=0.0148, global_step=3059.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1916/5971 [18:16<38:38,  1.75it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00133, train/loss_vlb_step=8.05e-6, train/loss_step=0.00133, global_step=3059.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1917/5971 [18:17<38:38,  1.75it/s, loss=0.169, v_num=0, train/loss_simple_step=0.355, train/loss_vlb_step=0.00206, train/loss_step=0.355, global_step=3060.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  32%|███▏      | 1918/5971 [18:17<38:38,  1.75it/s, loss=0.169, v_num=0, train/loss_simple_step=0.355, train/loss_vlb_step=0.00206, train/loss_step=0.355, global_step=3060.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1918/5971 [18:17<38:38,  1.75it/s, loss=0.174, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000398, train/loss_step=0.120, global_step=3060.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1919/5971 [18:18<38:38,  1.75it/s, loss=0.19, v_num=0, train/loss_simple_step=0.527, train/loss_vlb_step=0.00684, train/loss_step=0.527, global_step=3060.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  32%|███▏      | 1920/5971 [18:20<38:41,  1.74it/s, loss=0.189, v_num=0, train/loss_simple_step=0.00223, train/loss_vlb_step=1.31e-5, train/loss_step=0.00223, global_step=3060.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1921/5971 [18:21<38:41,  1.74it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0231, train/loss_vlb_step=9.26e-5, train/loss_step=0.0231, global_step=3061.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  32%|███▏      | 1922/5971 [18:22<38:41,  1.74it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0231, train/loss_vlb_step=9.26e-5, train/loss_step=0.0231, global_step=3061.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1922/5971 [18:22<38:41,  1.74it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.35e-5, train/loss_step=0.0102, global_step=3061.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1923/5971 [18:23<38:41,  1.74it/s, loss=0.196, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.00086, train/loss_step=0.231, global_step=3061.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  32%|███▏      | 1924/5971 [18:25<38:44,  1.74it/s, loss=0.213, v_num=0, train/loss_simple_step=0.351, train/loss_vlb_step=0.0016, train/loss_step=0.351, global_step=3061.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  32%|███▏      | 1925/5971 [18:26<38:44,  1.74it/s, loss=0.175, v_num=0, train/loss_simple_step=0.00896, train/loss_vlb_step=4.37e-5, train/loss_step=0.00896, global_step=3062.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1926/5971 [18:27<38:44,  1.74it/s, loss=0.175, v_num=0, train/loss_simple_step=0.00896, train/loss_vlb_step=4.37e-5, train/loss_step=0.00896, global_step=3062.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1926/5971 [18:27<38:44,  1.74it/s, loss=0.189, v_num=0, train/loss_simple_step=0.286, train/loss_vlb_step=0.00112, train/loss_step=0.286, global_step=3062.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  32%|███▏      | 1927/5971 [18:28<38:44,  1.74it/s, loss=0.201, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.00125, train/loss_step=0.268, global_step=3062.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1928/5971 [18:30<38:48,  1.74it/s, loss=0.2, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000775, train/loss_step=0.204, global_step=3062.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  32%|███▏      | 1929/5971 [18:31<38:48,  1.74it/s, loss=0.22, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00318, train/loss_step=0.424, global_step=3063.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1930/5971 [18:32<38:48,  1.74it/s, loss=0.22, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00318, train/loss_step=0.424, global_step=3063.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1930/5971 [18:32<38:48,  1.74it/s, loss=0.198, v_num=0, train/loss_simple_step=0.00388, train/loss_vlb_step=2.01e-5, train/loss_step=0.00388, global_step=3063.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1931/5971 [18:33<38:48,  1.74it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0206, train/loss_vlb_step=8.35e-5, train/loss_step=0.0206, global_step=3063.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  32%|███▏      | 1932/5971 [18:35<38:51,  1.73it/s, loss=0.193, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=8.75e-6, train/loss_step=0.00152, global_step=3063.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1933/5971 [18:36<38:51,  1.73it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0148, train/loss_vlb_step=6.19e-5, train/loss_step=0.0148, global_step=3064.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  32%|███▏      | 1934/5971 [18:37<38:51,  1.73it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0148, train/loss_vlb_step=6.19e-5, train/loss_step=0.0148, global_step=3064.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1934/5971 [18:37<38:51,  1.73it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0769, train/loss_vlb_step=0.000265, train/loss_step=0.0769, global_step=3064.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1935/5971 [18:38<38:51,  1.73it/s, loss=0.147, v_num=0, train/loss_simple_step=0.015, train/loss_vlb_step=5.9e-5, train/loss_step=0.015, global_step=3064.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  32%|███▏      | 1936/5971 [18:40<38:54,  1.73it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0374, train/loss_vlb_step=0.00014, train/loss_step=0.0374, global_step=3064.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1937/5971 [18:41<38:54,  1.73it/s, loss=0.165, v_num=0, train/loss_simple_step=0.672, train/loss_vlb_step=0.027, train/loss_step=0.672, global_step=3065.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  32%|███▏      | 1938/5971 [18:42<38:54,  1.73it/s, loss=0.165, v_num=0, train/loss_simple_step=0.672, train/loss_vlb_step=0.027, train/loss_step=0.672, global_step=3065.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1938/5971 [18:42<38:54,  1.73it/s, loss=0.172, v_num=0, train/loss_simple_step=0.263, train/loss_vlb_step=0.00136, train/loss_step=0.263, global_step=3065.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  32%|███▏      | 1939/5971 [18:43<38:54,  1.73it/s, loss=0.152, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.0004, train/loss_step=0.122, global_step=3065.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  32%|███▏      | 1940/5971 [18:45<38:57,  1.72it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0859, train/loss_vlb_step=0.000288, train/loss_step=0.0859, global_step=3065.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1941/5971 [18:46<38:57,  1.72it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00314, train/loss_vlb_step=1.62e-5, train/loss_step=0.00314, global_step=3066.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1942/5971 [18:47<38:57,  1.72it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00314, train/loss_vlb_step=1.62e-5, train/loss_step=0.00314, global_step=3066.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1942/5971 [18:47<38:57,  1.72it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0337, train/loss_vlb_step=0.00013, train/loss_step=0.0337, global_step=3066.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  33%|███▎      | 1943/5971 [18:48<38:57,  1.72it/s, loss=0.163, v_num=0, train/loss_simple_step=0.368, train/loss_vlb_step=0.00164, train/loss_step=0.368, global_step=3066.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  33%|███▎      | 1944/5971 [18:50<39:00,  1.72it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00602, train/loss_vlb_step=3.06e-5, train/loss_step=0.00602, global_step=3066.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1945/5971 [18:51<39:00,  1.72it/s, loss=0.174, v_num=0, train/loss_simple_step=0.580, train/loss_vlb_step=0.00612, train/loss_step=0.580, global_step=3067.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  33%|███▎      | 1946/5971 [18:52<39:00,  1.72it/s, loss=0.174, v_num=0, train/loss_simple_step=0.580, train/loss_vlb_step=0.00612, train/loss_step=0.580, global_step=3067.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1946/5971 [18:52<39:00,  1.72it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00166, train/loss_vlb_step=9.88e-6, train/loss_step=0.00166, global_step=3067.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1947/5971 [18:52<39:00,  1.72it/s, loss=0.164, v_num=0, train/loss_simple_step=0.338, train/loss_vlb_step=0.002, train/loss_step=0.338, global_step=3067.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:  33%|███▎      | 1948/5971 [18:55<39:03,  1.72it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00394, train/loss_vlb_step=2.14e-5, train/loss_step=0.00394, global_step=3067.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1949/5971 [18:56<39:03,  1.72it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0264, train/loss_vlb_step=0.000103, train/loss_step=0.0264, global_step=3068.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  33%|███▎      | 1950/5971 [18:57<39:03,  1.72it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0264, train/loss_vlb_step=0.000103, train/loss_step=0.0264, global_step=3068.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1950/5971 [18:57<39:03,  1.72it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.93e-5, train/loss_step=0.00347, global_step=3068.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1951/5971 [18:57<39:03,  1.72it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0646, train/loss_vlb_step=0.00022, train/loss_step=0.0646, global_step=3068.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  33%|███▎      | 1952/5971 [19:00<39:06,  1.71it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0357, train/loss_vlb_step=0.000132, train/loss_step=0.0357, global_step=3068.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1953/5971 [19:00<39:06,  1.71it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00593, train/loss_vlb_step=2.9e-5, train/loss_step=0.00593, global_step=3069.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1954/5971 [19:01<39:06,  1.71it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00593, train/loss_vlb_step=2.9e-5, train/loss_step=0.00593, global_step=3069.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1954/5971 [19:01<39:06,  1.71it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0476, train/loss_vlb_step=0.000174, train/loss_step=0.0476, global_step=3069.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1955/5971 [19:02<39:06,  1.71it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00404, train/loss_vlb_step=2.09e-5, train/loss_step=0.00404, global_step=3069.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1956/5971 [19:05<39:10,  1.71it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00289, train/loss_vlb_step=1.67e-5, train/loss_step=0.00289, global_step=3069.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1957/5971 [19:06<39:10,  1.71it/s, loss=0.1, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=4.79e-5, train/loss_step=0.011, global_step=3070.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]      
Epoch 5:  33%|███▎      | 1958/5971 [19:07<39:10,  1.71it/s, loss=0.1, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=4.79e-5, train/loss_step=0.011, global_step=3070.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1958/5971 [19:07<39:10,  1.71it/s, loss=0.102, v_num=0, train/loss_simple_step=0.290, train/loss_vlb_step=0.00119, train/loss_step=0.290, global_step=3070.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1959/5971 [19:08<39:10,  1.71it/s, loss=0.0961, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.55e-5, train/loss_step=0.0102, global_step=3070.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1960/5971 [19:11<39:15,  1.70it/s, loss=0.0921, v_num=0, train/loss_simple_step=0.006, train/loss_vlb_step=2.96e-5, train/loss_step=0.006, global_step=3070.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  33%|███▎      | 1961/5971 [19:13<39:17,  1.70it/s, loss=0.112, v_num=0, train/loss_simple_step=0.393, train/loss_vlb_step=0.00269, train/loss_step=0.393, global_step=3071.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  33%|███▎      | 1962/5971 [19:14<39:18,  1.70it/s, loss=0.112, v_num=0, train/loss_simple_step=0.393, train/loss_vlb_step=0.00269, train/loss_step=0.393, global_step=3071.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1962/5971 [19:14<39:18,  1.70it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0595, train/loss_vlb_step=0.000203, train/loss_step=0.0595, global_step=3071.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1963/5971 [19:15<39:18,  1.70it/s, loss=0.119, v_num=0, train/loss_simple_step=0.499, train/loss_vlb_step=0.00453, train/loss_step=0.499, global_step=3071.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  33%|███▎      | 1964/5971 [19:19<39:23,  1.70it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000246, train/loss_step=0.0699, global_step=3071.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1965/5971 [19:20<39:25,  1.69it/s, loss=0.0999, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000416, train/loss_step=0.126, global_step=3072.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  33%|███▎      | 1966/5971 [19:22<39:27,  1.69it/s, loss=0.0999, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000416, train/loss_step=0.126, global_step=3072.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1966/5971 [19:22<39:27,  1.69it/s, loss=0.0999, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.03e-6, train/loss_step=0.00151, global_step=3072.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1967/5971 [19:24<39:28,  1.69it/s, loss=0.0856, v_num=0, train/loss_simple_step=0.0523, train/loss_vlb_step=0.000179, train/loss_step=0.0523, global_step=3072.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  33%|███▎      | 1968/5971 [19:28<39:34,  1.69it/s, loss=0.0869, v_num=0, train/loss_simple_step=0.0291, train/loss_vlb_step=0.00012, train/loss_step=0.0291, global_step=3072.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  33%|███▎      | 1968/5971 [19:33<39:45,  1.68it/s, loss=0.0869, v_num=0, train/loss_simple_step=0.0291, train/loss_vlb_step=0.00012, train/loss_step=0.0291, global_step=3072.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1969/5971 [19:38<39:53,  1.67it/s, loss=0.0869, v_num=0, train/loss_simple_step=0.0291, train/loss_vlb_step=0.00012, train/loss_step=0.0291, global_step=3072.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1969/5971 [19:38<39:53,  1.67it/s, loss=0.088, v_num=0, train/loss_simple_step=0.0474, train/loss_vlb_step=0.000168, train/loss_step=0.0474, global_step=3073.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1970/5971 [19:41<39:58,  1.67it/s, loss=0.088, v_num=0, train/loss_simple_step=0.0474, train/loss_vlb_step=0.000168, train/loss_step=0.0474, global_step=3073.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1970/5971 [19:41<39:58,  1.67it/s, loss=0.0906, v_num=0, train/loss_simple_step=0.0561, train/loss_vlb_step=0.000194, train/loss_step=0.0561, global_step=3073.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1971/5971 [19:43<40:00,  1.67it/s, loss=0.0906, v_num=0, train/loss_simple_step=0.0561, train/loss_vlb_step=0.000194, train/loss_step=0.0561, global_step=3073.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1971/5971 [19:43<40:00,  1.67it/s, loss=0.0889, v_num=0, train/loss_simple_step=0.0308, train/loss_vlb_step=0.000113, train/loss_step=0.0308, global_step=3073.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1972/5971 [19:47<40:06,  1.66it/s, loss=0.0889, v_num=0, train/loss_simple_step=0.0308, train/loss_vlb_step=0.000113, train/loss_step=0.0308, global_step=3073.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1972/5971 [19:47<40:06,  1.66it/s, loss=0.0907, v_num=0, train/loss_simple_step=0.0727, train/loss_vlb_step=0.000241, train/loss_step=0.0727, global_step=3073.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1973/5971 [19:49<40:08,  1.66it/s, loss=0.0907, v_num=0, train/loss_simple_step=0.0727, train/loss_vlb_step=0.000241, train/loss_step=0.0727, global_step=3073.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1973/5971 [19:49<40:08,  1.66it/s, loss=0.0906, v_num=0, train/loss_simple_step=0.0024, train/loss_vlb_step=1.29e-5, train/loss_step=0.0024, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  33%|███▎      | 1974/5971 [19:50<40:09,  1.66it/s, loss=0.0906, v_num=0, train/loss_simple_step=0.0024, train/loss_vlb_step=1.29e-5, train/loss_step=0.0024, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1974/5971 [19:50<40:09,  1.66it/s, loss=0.0929, v_num=0, train/loss_simple_step=0.0941, train/loss_vlb_step=0.00031, train/loss_step=0.0941, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1975/5971 [19:52<40:11,  1.66it/s, loss=0.0929, v_num=0, train/loss_simple_step=0.0941, train/loss_vlb_step=0.00031, train/loss_step=0.0941, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1975/5971 [19:52<40:11,  1.66it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.0284, train/loss_vlb_step=0.000107, train/loss_step=0.0284, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1976/5971 [19:56<40:17,  1.65it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.0284, train/loss_vlb_step=0.000107, train/loss_step=0.0284, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  33%|███▎      | 1976/5971 [19:56<40:17,  1.65it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<02:15,  1.23it/s][A
Epoch 5:  33%|███▎      | 1978/5971 [19:57<40:15,  1.65it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   1%|          | 2/167 [00:01<01:32,  1.79it/s][A

Validating:   2%|▏         | 3/167 [00:01<00:58,  2.80it/s][A
Epoch 5:  33%|███▎      | 1980/5971 [19:57<40:12,  1.65it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   3%|▎         | 5/167 [00:01<00:30,  5.26it/s][A
Epoch 5:  33%|███▎      | 1982/5971 [19:57<40:09,  1.66it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   4%|▍         | 7/167 [00:01<00:21,  7.40it/s][A
Epoch 5:  33%|███▎      | 1984/5971 [19:57<40:06,  1.66it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   5%|▌         | 9/167 [00:01<00:22,  6.89it/s][A
Epoch 5:  33%|███▎      | 1986/5971 [19:58<40:03,  1.66it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   7%|▋         | 12/167 [00:02<00:24,  6.22it/s][A
Epoch 5:  33%|███▎      | 1989/5971 [19:58<39:58,  1.66it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   8%|▊         | 13/167 [00:02<00:25,  5.94it/s][A

Validating:   9%|▉         | 15/167 [00:02<00:21,  7.23it/s][A
Epoch 5:  33%|███▎      | 1992/5971 [19:59<39:54,  1.66it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  10%|█         | 17/167 [00:02<00:17,  8.65it/s][A
Epoch 5:  33%|███▎      | 1995/5971 [19:59<39:49,  1.66it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  11%|█▏        | 19/167 [00:03<00:14, 10.41it/s][A

Validating:  13%|█▎        | 21/167 [00:03<00:12, 11.98it/s][A
Epoch 5:  33%|███▎      | 1998/5971 [19:59<39:44,  1.67it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  14%|█▍        | 23/167 [00:03<00:11, 12.55it/s][A
Epoch 5:  34%|███▎      | 2001/5971 [19:59<39:39,  1.67it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  15%|█▍        | 25/167 [00:03<00:10, 13.87it/s][A

Validating:  16%|█▌        | 27/167 [00:03<00:10, 13.41it/s][A
Epoch 5:  34%|███▎      | 2004/5971 [19:59<39:34,  1.67it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  17%|█▋        | 29/167 [00:03<00:09, 14.08it/s][A
Epoch 5:  34%|███▎      | 2007/5971 [20:00<39:29,  1.67it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  19%|█▊        | 31/167 [00:03<00:09, 14.13it/s][A

Validating:  20%|█▉        | 33/167 [00:03<00:09, 13.59it/s][A
Epoch 5:  34%|███▎      | 2010/5971 [20:00<39:24,  1.68it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  21%|██        | 35/167 [00:04<00:09, 13.67it/s][A
Epoch 5:  34%|███▎      | 2013/5971 [20:00<39:19,  1.68it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  22%|██▏       | 37/167 [00:04<00:09, 13.22it/s][A

Validating:  23%|██▎       | 39/167 [00:04<00:09, 13.66it/s][A
Epoch 5:  34%|███▍      | 2016/5971 [20:00<39:14,  1.68it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  25%|██▍       | 41/167 [00:04<00:09, 13.68it/s][A
Epoch 5:  34%|███▍      | 2019/5971 [20:01<39:09,  1.68it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  26%|██▌       | 43/167 [00:04<00:09, 13.57it/s][A

Validating:  27%|██▋       | 45/167 [00:04<00:08, 14.50it/s][A
Epoch 5:  34%|███▍      | 2022/5971 [20:01<39:04,  1.68it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  28%|██▊       | 47/167 [00:05<00:08, 13.77it/s][A
Epoch 5:  34%|███▍      | 2025/5971 [20:01<39:00,  1.69it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  29%|██▉       | 49/167 [00:05<00:09, 12.62it/s][A

Validating:  31%|███       | 51/167 [00:05<00:08, 13.40it/s][A
Epoch 5:  34%|███▍      | 2028/5971 [20:01<38:55,  1.69it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  32%|███▏      | 54/167 [00:05<00:07, 15.97it/s][A
Epoch 5:  34%|███▍      | 2031/5971 [20:01<38:50,  1.69it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  34%|███▎      | 56/167 [00:05<00:06, 16.17it/s][A
Epoch 5:  34%|███▍      | 2034/5971 [20:02<38:45,  1.69it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  35%|███▍      | 58/167 [00:05<00:08, 13.43it/s][A

Validating:  36%|███▌      | 60/167 [00:06<00:11,  9.04it/s][A
Epoch 5:  34%|███▍      | 2037/5971 [20:02<38:41,  1.69it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  37%|███▋      | 62/167 [00:06<00:11,  9.33it/s][A
Epoch 5:  34%|███▍      | 2040/5971 [20:02<38:36,  1.70it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  38%|███▊      | 64/167 [00:06<00:10,  9.97it/s][A

Validating:  40%|███▉      | 66/167 [00:06<00:11,  9.18it/s][A
Epoch 5:  34%|███▍      | 2043/5971 [20:03<38:32,  1.70it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  41%|████      | 68/167 [00:06<00:10,  9.81it/s][A
Epoch 5:  34%|███▍      | 2046/5971 [20:03<38:27,  1.70it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  42%|████▏     | 70/167 [00:07<00:09, 10.15it/s][A

Validating:  43%|████▎     | 72/167 [00:07<00:08, 11.39it/s][A
Epoch 5:  34%|███▍      | 2049/5971 [20:03<38:22,  1.70it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  44%|████▍     | 74/167 [00:07<00:07, 12.94it/s][A
Epoch 5:  34%|███▍      | 2052/5971 [20:03<38:18,  1.71it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  46%|████▌     | 76/167 [00:07<00:06, 13.64it/s][A

Validating:  47%|████▋     | 78/167 [00:07<00:06, 13.72it/s][A
Epoch 5:  34%|███▍      | 2055/5971 [20:04<38:13,  1.71it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  48%|████▊     | 80/167 [00:07<00:06, 14.38it/s][A
Epoch 5:  34%|███▍      | 2058/5971 [20:04<38:08,  1.71it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  49%|████▉     | 82/167 [00:07<00:06, 13.84it/s][A

Validating:  50%|█████     | 84/167 [00:08<00:06, 13.73it/s][A
Epoch 5:  35%|███▍      | 2061/5971 [20:04<38:04,  1.71it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  51%|█████▏    | 86/167 [00:08<00:05, 14.08it/s][A
Epoch 5:  35%|███▍      | 2064/5971 [20:04<37:59,  1.71it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  53%|█████▎    | 88/167 [00:08<00:05, 14.52it/s][A

Validating:  54%|█████▍    | 90/167 [00:08<00:05, 15.30it/s][A
Epoch 5:  35%|███▍      | 2067/5971 [20:04<37:54,  1.72it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  55%|█████▌    | 92/167 [00:08<00:04, 15.11it/s][A
Epoch 5:  35%|███▍      | 2070/5971 [20:05<37:49,  1.72it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  56%|█████▋    | 94/167 [00:08<00:06, 11.50it/s][A

Validating:  57%|█████▋    | 96/167 [00:09<00:05, 11.85it/s][A
Epoch 5:  35%|███▍      | 2073/5971 [20:05<37:45,  1.72it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  59%|█████▊    | 98/167 [00:09<00:05, 12.69it/s][A
Epoch 5:  35%|███▍      | 2076/5971 [20:05<37:40,  1.72it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  60%|█████▉    | 100/167 [00:09<00:04, 13.72it/s][A
Epoch 5:  35%|███▍      | 2079/5971 [20:05<37:36,  1.73it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  62%|██████▏   | 103/167 [00:09<00:03, 16.03it/s][A

Validating:  63%|██████▎   | 105/167 [00:09<00:03, 15.50it/s][A
Epoch 5:  35%|███▍      | 2082/5971 [20:05<37:31,  1.73it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  65%|██████▍   | 108/167 [00:09<00:03, 17.34it/s][A
Epoch 5:  35%|███▍      | 2085/5971 [20:06<37:26,  1.73it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  66%|██████▋   | 111/167 [00:09<00:03, 18.12it/s][A
Epoch 5:  35%|███▍      | 2088/5971 [20:06<37:22,  1.73it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  68%|██████▊   | 114/167 [00:09<00:02, 19.63it/s][A
Epoch 5:  35%|███▌      | 2091/5971 [20:06<37:17,  1.73it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  70%|███████   | 117/167 [00:10<00:02, 20.29it/s][A
Epoch 5:  35%|███▌      | 2094/5971 [20:06<37:12,  1.74it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  72%|███████▏  | 120/167 [00:10<00:02, 16.38it/s][A
Epoch 5:  35%|███▌      | 2097/5971 [20:06<37:08,  1.74it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  73%|███████▎  | 122/167 [00:10<00:03, 13.49it/s][A
Epoch 5:  35%|███▌      | 2100/5971 [20:07<37:04,  1.74it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  74%|███████▍  | 124/167 [00:10<00:03, 12.87it/s][A

Validating:  75%|███████▌  | 126/167 [00:10<00:03, 13.42it/s][A
Epoch 5:  35%|███▌      | 2103/5971 [20:07<36:59,  1.74it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  77%|███████▋  | 128/167 [00:11<00:02, 14.35it/s][A
Epoch 5:  35%|███▌      | 2106/5971 [20:07<36:55,  1.74it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  78%|███████▊  | 130/167 [00:11<00:02, 13.36it/s][A

Validating:  79%|███████▉  | 132/167 [00:11<00:02, 13.14it/s][A
Epoch 5:  35%|███▌      | 2109/5971 [20:07<36:50,  1.75it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  80%|████████  | 134/167 [00:11<00:02, 13.16it/s][A
Epoch 5:  35%|███▌      | 2112/5971 [20:08<36:46,  1.75it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  81%|████████▏ | 136/167 [00:11<00:02, 14.11it/s][A

Validating:  83%|████████▎ | 138/167 [00:11<00:01, 14.72it/s][A
Epoch 5:  35%|███▌      | 2115/5971 [20:08<36:41,  1.75it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  84%|████████▍ | 140/167 [00:11<00:01, 13.86it/s][A
Epoch 5:  35%|███▌      | 2118/5971 [20:08<36:37,  1.75it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  85%|████████▌ | 142/167 [00:12<00:01, 13.72it/s][A

Validating:  86%|████████▌ | 144/167 [00:12<00:01, 13.28it/s][A
Epoch 5:  36%|███▌      | 2121/5971 [20:08<36:32,  1.76it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  87%|████████▋ | 146/167 [00:12<00:01, 13.72it/s][A
Epoch 5:  36%|███▌      | 2124/5971 [20:08<36:28,  1.76it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  89%|████████▊ | 148/167 [00:12<00:01, 13.56it/s][A

Validating:  90%|████████▉ | 150/167 [00:12<00:01, 14.62it/s][A
Epoch 5:  36%|███▌      | 2127/5971 [20:09<36:24,  1.76it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  91%|█████████ | 152/167 [00:12<00:01, 14.24it/s][A
Epoch 5:  36%|███▌      | 2130/5971 [20:09<36:19,  1.76it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  92%|█████████▏| 154/167 [00:12<00:01, 12.83it/s][A

Validating:  93%|█████████▎| 156/167 [00:13<00:00, 13.95it/s][A
Epoch 5:  36%|███▌      | 2133/5971 [20:09<36:15,  1.76it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  95%|█████████▍| 158/167 [00:13<00:00, 14.72it/s][A
Epoch 5:  36%|███▌      | 2136/5971 [20:09<36:10,  1.77it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  96%|█████████▌| 160/167 [00:13<00:00, 14.49it/s][A

Validating:  97%|█████████▋| 162/167 [00:13<00:00, 14.77it/s][A
Epoch 5:  36%|███▌      | 2139/5971 [20:09<36:06,  1.77it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  98%|█████████▊| 164/167 [00:13<00:00, 15.70it/s][A
Epoch 5:  36%|███▌      | 2142/5971 [20:10<36:02,  1.77it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating: 100%|██████████| 167/167 [00:13<00:00, 18.37it/s][A
Epoch 5:  36%|███▌      | 2144/5971 [20:10<35:59,  1.77it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Epoch 5:  36%|███▌      | 2145/5971 [20:12<36:01,  1.77it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.08e-5, train/loss_step=0.00183, global_step=3074.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▌      | 2145/5971 [20:12<36:01,  1.77it/s, loss=0.0978, v_num=0, train/loss_simple_step=0.0866, train/loss_vlb_step=0.000285, train/loss_step=0.0866, global_step=3075.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  36%|███▌      | 2146/5971 [20:13<36:02,  1.77it/s, loss=0.0895, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000419, train/loss_step=0.123, global_step=3075.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  36%|███▌      | 2147/5971 [20:15<36:03,  1.77it/s, loss=0.0953, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000418, train/loss_step=0.126, global_step=3075.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▌      | 2148/5971 [20:18<36:07,  1.76it/s, loss=0.0953, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000418, train/loss_step=0.126, global_step=3075.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▌      | 2148/5971 [20:18<36:07,  1.76it/s, loss=0.0977, v_num=0, train/loss_simple_step=0.0546, train/loss_vlb_step=0.000189, train/loss_step=0.0546, global_step=3075.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▌      | 2149/5971 [20:20<36:09,  1.76it/s, loss=0.079, v_num=0, train/loss_simple_step=0.0189, train/loss_vlb_step=7.57e-5, train/loss_step=0.0189, global_step=3076.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  36%|███▌      | 2150/5971 [20:21<36:10,  1.76it/s, loss=0.0874, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000811, train/loss_step=0.227, global_step=3076.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▌      | 2151/5971 [20:23<36:11,  1.76it/s, loss=0.0874, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000811, train/loss_step=0.227, global_step=3076.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▌      | 2151/5971 [20:23<36:11,  1.76it/s, loss=0.0652, v_num=0, train/loss_simple_step=0.0558, train/loss_vlb_step=0.000191, train/loss_step=0.0558, global_step=3076.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▌      | 2152/5971 [20:27<36:17,  1.75it/s, loss=0.062, v_num=0, train/loss_simple_step=0.00547, train/loss_vlb_step=2.76e-5, train/loss_step=0.00547, global_step=3076.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▌      | 2153/5971 [20:29<36:18,  1.75it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.622, train/loss_vlb_step=0.0121, train/loss_step=0.622, global_step=3077.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  36%|███▌      | 2154/5971 [20:30<36:19,  1.75it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.622, train/loss_vlb_step=0.0121, train/loss_step=0.622, global_step=3077.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▌      | 2154/5971 [20:30<36:19,  1.75it/s, loss=0.089, v_num=0, train/loss_simple_step=0.046, train/loss_vlb_step=0.000162, train/loss_step=0.046, global_step=3077.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▌      | 2155/5971 [20:32<36:21,  1.75it/s, loss=0.0865, v_num=0, train/loss_simple_step=0.00154, train/loss_vlb_step=9.3e-6, train/loss_step=0.00154, global_step=3077.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▌      | 2156/5971 [20:36<36:27,  1.74it/s, loss=0.092, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000461, train/loss_step=0.140, global_step=3077.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  36%|███▌      | 2157/5971 [20:38<36:28,  1.74it/s, loss=0.092, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000461, train/loss_step=0.140, global_step=3077.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▌      | 2157/5971 [20:38<36:28,  1.74it/s, loss=0.092, v_num=0, train/loss_simple_step=0.0471, train/loss_vlb_step=0.000171, train/loss_step=0.0471, global_step=3078.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▌      | 2158/5971 [20:40<36:30,  1.74it/s, loss=0.1, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000839, train/loss_step=0.222, global_step=3078.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  36%|███▌      | 2159/5971 [20:41<36:31,  1.74it/s, loss=0.0991, v_num=0, train/loss_simple_step=0.00673, train/loss_vlb_step=3.32e-5, train/loss_step=0.00673, global_step=3078.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▌      | 2160/5971 [20:46<36:37,  1.73it/s, loss=0.0991, v_num=0, train/loss_simple_step=0.00673, train/loss_vlb_step=3.32e-5, train/loss_step=0.00673, global_step=3078.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▌      | 2160/5971 [20:46<36:37,  1.73it/s, loss=0.111, v_num=0, train/loss_simple_step=0.312, train/loss_vlb_step=0.00132, train/loss_step=0.312, global_step=3078.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:  36%|███▌      | 2161/5971 [20:47<36:39,  1.73it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0093, train/loss_vlb_step=4.23e-5, train/loss_step=0.0093, global_step=3079.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▌      | 2162/5971 [20:49<36:40,  1.73it/s, loss=0.13, v_num=0, train/loss_simple_step=0.469, train/loss_vlb_step=0.00397, train/loss_step=0.469, global_step=3079.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  36%|███▌      | 2163/5971 [20:50<36:41,  1.73it/s, loss=0.13, v_num=0, train/loss_simple_step=0.469, train/loss_vlb_step=0.00397, train/loss_step=0.469, global_step=3079.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▌      | 2163/5971 [20:51<36:41,  1.73it/s, loss=0.139, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000754, train/loss_step=0.209, global_step=3079.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▌      | 2164/5971 [20:55<36:47,  1.72it/s, loss=0.146, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000484, train/loss_step=0.140, global_step=3079.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▋      | 2165/5971 [20:57<36:48,  1.72it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0512, train/loss_vlb_step=0.000181, train/loss_step=0.0512, global_step=3080.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▋      | 2166/5971 [20:58<36:49,  1.72it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0512, train/loss_vlb_step=0.000181, train/loss_step=0.0512, global_step=3080.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▋      | 2166/5971 [20:58<36:49,  1.72it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0137, train/loss_vlb_step=5.87e-5, train/loss_step=0.0137, global_step=3080.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  36%|███▋      | 2167/5971 [21:00<36:51,  1.72it/s, loss=0.148, v_num=0, train/loss_simple_step=0.304, train/loss_vlb_step=0.00121, train/loss_step=0.304, global_step=3080.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  36%|███▋      | 2168/5971 [21:04<36:57,  1.72it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00699, train/loss_vlb_step=3.4e-5, train/loss_step=0.00699, global_step=3080.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▋      | 2169/5971 [21:06<36:58,  1.71it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00699, train/loss_vlb_step=3.4e-5, train/loss_step=0.00699, global_step=3080.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▋      | 2169/5971 [21:06<36:58,  1.71it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0313, train/loss_vlb_step=0.000112, train/loss_step=0.0313, global_step=3081.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▋      | 2170/5971 [21:07<36:59,  1.71it/s, loss=0.171, v_num=0, train/loss_simple_step=0.727, train/loss_vlb_step=0.0468, train/loss_step=0.727, global_step=3081.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  36%|███▋      | 2171/5971 [21:09<37:00,  1.71it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00674, train/loss_vlb_step=3.34e-5, train/loss_step=0.00674, global_step=3081.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▋      | 2172/5971 [21:13<37:05,  1.71it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00674, train/loss_vlb_step=3.34e-5, train/loss_step=0.00674, global_step=3081.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▋      | 2172/5971 [21:13<37:05,  1.71it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0153, train/loss_vlb_step=6.49e-5, train/loss_step=0.0153, global_step=3081.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  36%|███▋      | 2173/5971 [21:14<37:06,  1.71it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0185, train/loss_vlb_step=7.99e-5, train/loss_step=0.0185, global_step=3082.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▋      | 2174/5971 [21:16<37:08,  1.70it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0956, train/loss_vlb_step=0.000315, train/loss_step=0.0956, global_step=3082.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▋      | 2175/5971 [21:17<37:09,  1.70it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0956, train/loss_vlb_step=0.000315, train/loss_step=0.0956, global_step=3082.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▋      | 2175/5971 [21:17<37:09,  1.70it/s, loss=0.15, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000619, train/loss_step=0.175, global_step=3082.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  36%|███▋      | 2176/5971 [21:25<37:21,  1.69it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0329, train/loss_vlb_step=0.000122, train/loss_step=0.0329, global_step=3082.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▋      | 2177/5971 [21:26<37:21,  1.69it/s, loss=0.152, v_num=0, train/loss_simple_step=0.191, train/loss_vlb_step=0.000707, train/loss_step=0.191, global_step=3083.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  36%|███▋      | 2178/5971 [21:28<37:22,  1.69it/s, loss=0.152, v_num=0, train/loss_simple_step=0.191, train/loss_vlb_step=0.000707, train/loss_step=0.191, global_step=3083.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▋      | 2178/5971 [21:28<37:23,  1.69it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0462, train/loss_vlb_step=0.000172, train/loss_step=0.0462, global_step=3083.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▋      | 2179/5971 [21:29<37:23,  1.69it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0462, train/loss_vlb_step=0.000172, train/loss_step=0.0462, global_step=3083.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  36%|███▋      | 2179/5971 [21:29<37:23,  1.69it/s, loss=0.166, v_num=0, train/loss_simple_step=0.461, train/loss_vlb_step=0.00391, train/loss_step=0.461, global_step=3083.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  37%|███▋      | 2180/5971 [21:34<37:30,  1.68it/s, loss=0.166, v_num=0, train/loss_simple_step=0.461, train/loss_vlb_step=0.00391, train/loss_step=0.461, global_step=3083.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2180/5971 [21:34<37:30,  1.68it/s, loss=0.172, v_num=0, train/loss_simple_step=0.444, train/loss_vlb_step=0.0026, train/loss_step=0.444, global_step=3083.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  37%|███▋      | 2181/5971 [21:36<37:31,  1.68it/s, loss=0.172, v_num=0, train/loss_simple_step=0.444, train/loss_vlb_step=0.0026, train/loss_step=0.444, global_step=3083.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2181/5971 [21:36<37:31,  1.68it/s, loss=0.186, v_num=0, train/loss_simple_step=0.279, train/loss_vlb_step=0.00115, train/loss_step=0.279, global_step=3084.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2182/5971 [21:37<37:32,  1.68it/s, loss=0.186, v_num=0, train/loss_simple_step=0.279, train/loss_vlb_step=0.00115, train/loss_step=0.279, global_step=3084.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2182/5971 [21:37<37:32,  1.68it/s, loss=0.186, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00357, train/loss_step=0.480, global_step=3084.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2183/5971 [21:39<37:33,  1.68it/s, loss=0.186, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00357, train/loss_step=0.480, global_step=3084.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2183/5971 [21:39<37:33,  1.68it/s, loss=0.199, v_num=0, train/loss_simple_step=0.466, train/loss_vlb_step=0.00272, train/loss_step=0.466, global_step=3084.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2184/5971 [21:43<37:38,  1.68it/s, loss=0.199, v_num=0, train/loss_simple_step=0.466, train/loss_vlb_step=0.00272, train/loss_step=0.466, global_step=3084.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2184/5971 [21:43<37:38,  1.68it/s, loss=0.206, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.001, train/loss_step=0.282, global_step=3084.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  37%|███▋      | 2185/5971 [21:44<37:39,  1.68it/s, loss=0.206, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.001, train/loss_step=0.282, global_step=3084.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2185/5971 [21:44<37:39,  1.68it/s, loss=0.207, v_num=0, train/loss_simple_step=0.0716, train/loss_vlb_step=0.000253, train/loss_step=0.0716, global_step=3085.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2186/5971 [21:46<37:40,  1.67it/s, loss=0.207, v_num=0, train/loss_simple_step=0.0716, train/loss_vlb_step=0.000253, train/loss_step=0.0716, global_step=3085.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2186/5971 [21:46<37:40,  1.67it/s, loss=0.212, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.00033, train/loss_step=0.100, global_step=3085.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  37%|███▋      | 2187/5971 [21:47<37:41,  1.67it/s, loss=0.212, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.00033, train/loss_step=0.100, global_step=3085.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2187/5971 [21:47<37:41,  1.67it/s, loss=0.205, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000617, train/loss_step=0.177, global_step=3085.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2188/5971 [21:52<37:47,  1.67it/s, loss=0.205, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000617, train/loss_step=0.177, global_step=3085.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2188/5971 [21:52<37:47,  1.67it/s, loss=0.221, v_num=0, train/loss_simple_step=0.314, train/loss_vlb_step=0.00142, train/loss_step=0.314, global_step=3085.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  37%|███▋      | 2189/5971 [21:53<37:48,  1.67it/s, loss=0.221, v_num=0, train/loss_simple_step=0.314, train/loss_vlb_step=0.00142, train/loss_step=0.314, global_step=3085.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2189/5971 [21:53<37:48,  1.67it/s, loss=0.226, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000502, train/loss_step=0.142, global_step=3086.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2190/5971 [21:55<37:49,  1.67it/s, loss=0.226, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000502, train/loss_step=0.142, global_step=3086.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2190/5971 [21:55<37:49,  1.67it/s, loss=0.213, v_num=0, train/loss_simple_step=0.453, train/loss_vlb_step=0.00364, train/loss_step=0.453, global_step=3086.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  37%|███▋      | 2191/5971 [21:56<37:50,  1.66it/s, loss=0.213, v_num=0, train/loss_simple_step=0.453, train/loss_vlb_step=0.00364, train/loss_step=0.453, global_step=3086.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2191/5971 [21:56<37:50,  1.66it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0985, train/loss_vlb_step=0.000328, train/loss_step=0.0985, global_step=3086.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2192/5971 [21:59<37:54,  1.66it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0985, train/loss_vlb_step=0.000328, train/loss_step=0.0985, global_step=3086.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2192/5971 [21:59<37:54,  1.66it/s, loss=0.229, v_num=0, train/loss_simple_step=0.252, train/loss_vlb_step=0.000998, train/loss_step=0.252, global_step=3086.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  37%|███▋      | 2193/5971 [22:01<37:55,  1.66it/s, loss=0.229, v_num=0, train/loss_simple_step=0.252, train/loss_vlb_step=0.000998, train/loss_step=0.252, global_step=3086.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2193/5971 [22:01<37:55,  1.66it/s, loss=0.232, v_num=0, train/loss_simple_step=0.0807, train/loss_vlb_step=0.000265, train/loss_step=0.0807, global_step=3087.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2194/5971 [22:02<37:56,  1.66it/s, loss=0.232, v_num=0, train/loss_simple_step=0.0807, train/loss_vlb_step=0.000265, train/loss_step=0.0807, global_step=3087.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2194/5971 [22:02<37:56,  1.66it/s, loss=0.228, v_num=0, train/loss_simple_step=0.00924, train/loss_vlb_step=4.41e-5, train/loss_step=0.00924, global_step=3087.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2195/5971 [22:04<37:57,  1.66it/s, loss=0.228, v_num=0, train/loss_simple_step=0.00924, train/loss_vlb_step=4.41e-5, train/loss_step=0.00924, global_step=3087.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2195/5971 [22:04<37:57,  1.66it/s, loss=0.224, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000366, train/loss_step=0.110, global_step=3087.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  37%|███▋      | 2196/5971 [22:08<38:02,  1.65it/s, loss=0.224, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000366, train/loss_step=0.110, global_step=3087.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2196/5971 [22:08<38:02,  1.65it/s, loss=0.224, v_num=0, train/loss_simple_step=0.0245, train/loss_vlb_step=0.000102, train/loss_step=0.0245, global_step=3087.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2197/5971 [22:10<38:03,  1.65it/s, loss=0.224, v_num=0, train/loss_simple_step=0.0245, train/loss_vlb_step=0.000102, train/loss_step=0.0245, global_step=3087.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2197/5971 [22:10<38:03,  1.65it/s, loss=0.215, v_num=0, train/loss_simple_step=0.00755, train/loss_vlb_step=3.6e-5, train/loss_step=0.00755, global_step=3088.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2198/5971 [22:11<38:05,  1.65it/s, loss=0.215, v_num=0, train/loss_simple_step=0.00755, train/loss_vlb_step=3.6e-5, train/loss_step=0.00755, global_step=3088.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2198/5971 [22:11<38:05,  1.65it/s, loss=0.222, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000815, train/loss_step=0.189, global_step=3088.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  37%|███▋      | 2199/5971 [22:13<38:06,  1.65it/s, loss=0.222, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000815, train/loss_step=0.189, global_step=3088.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2199/5971 [22:13<38:06,  1.65it/s, loss=0.211, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.000957, train/loss_step=0.237, global_step=3088.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2200/5971 [22:17<38:11,  1.65it/s, loss=0.211, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.000957, train/loss_step=0.237, global_step=3088.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2200/5971 [22:17<38:11,  1.65it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=3088.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2201/5971 [22:19<38:12,  1.64it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=3088.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2201/5971 [22:19<38:12,  1.64it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=4.62e-5, train/loss_step=0.0109, global_step=3089.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2202/5971 [22:20<38:13,  1.64it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=4.62e-5, train/loss_step=0.0109, global_step=3089.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2202/5971 [22:20<38:13,  1.64it/s, loss=0.162, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000789, train/loss_step=0.209, global_step=3089.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  37%|███▋      | 2203/5971 [22:22<38:14,  1.64it/s, loss=0.162, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000789, train/loss_step=0.209, global_step=3089.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2203/5971 [22:22<38:14,  1.64it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00592, train/loss_vlb_step=2.91e-5, train/loss_step=0.00592, global_step=3089.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2204/5971 [22:26<38:20,  1.64it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00592, train/loss_vlb_step=2.91e-5, train/loss_step=0.00592, global_step=3089.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2204/5971 [22:26<38:20,  1.64it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0593, train/loss_vlb_step=0.000212, train/loss_step=0.0593, global_step=3089.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  37%|███▋      | 2205/5971 [22:27<38:20,  1.64it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0593, train/loss_vlb_step=0.000212, train/loss_step=0.0593, global_step=3089.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2205/5971 [22:27<38:20,  1.64it/s, loss=0.127, v_num=0, train/loss_simple_step=0.049, train/loss_vlb_step=0.000164, train/loss_step=0.049, global_step=3090.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  37%|███▋      | 2206/5971 [22:29<38:21,  1.64it/s, loss=0.127, v_num=0, train/loss_simple_step=0.049, train/loss_vlb_step=0.000164, train/loss_step=0.049, global_step=3090.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2206/5971 [22:29<38:21,  1.64it/s, loss=0.132, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000627, train/loss_step=0.189, global_step=3090.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2207/5971 [22:30<38:22,  1.63it/s, loss=0.132, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000627, train/loss_step=0.189, global_step=3090.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2207/5971 [22:30<38:22,  1.63it/s, loss=0.124, v_num=0, train/loss_simple_step=0.019, train/loss_vlb_step=7.74e-5, train/loss_step=0.019, global_step=3090.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  37%|███▋      | 2208/5971 [22:34<38:27,  1.63it/s, loss=0.124, v_num=0, train/loss_simple_step=0.019, train/loss_vlb_step=7.74e-5, train/loss_step=0.019, global_step=3090.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2208/5971 [22:34<38:27,  1.63it/s, loss=0.114, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000376, train/loss_step=0.114, global_step=3090.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2209/5971 [22:35<38:28,  1.63it/s, loss=0.114, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000376, train/loss_step=0.114, global_step=3090.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2209/5971 [22:35<38:28,  1.63it/s, loss=0.107, v_num=0, train/loss_simple_step=0.00285, train/loss_vlb_step=1.58e-5, train/loss_step=0.00285, global_step=3091.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2210/5971 [22:37<38:29,  1.63it/s, loss=0.107, v_num=0, train/loss_simple_step=0.00285, train/loss_vlb_step=1.58e-5, train/loss_step=0.00285, global_step=3091.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2210/5971 [22:37<38:29,  1.63it/s, loss=0.0954, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000816, train/loss_step=0.225, global_step=3091.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  37%|███▋      | 2211/5971 [22:38<38:30,  1.63it/s, loss=0.0954, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000816, train/loss_step=0.225, global_step=3091.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2211/5971 [22:38<38:30,  1.63it/s, loss=0.0907, v_num=0, train/loss_simple_step=0.00371, train/loss_vlb_step=2e-5, train/loss_step=0.00371, global_step=3091.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2212/5971 [22:43<38:36,  1.62it/s, loss=0.0907, v_num=0, train/loss_simple_step=0.00371, train/loss_vlb_step=2e-5, train/loss_step=0.00371, global_step=3091.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2212/5971 [22:43<38:36,  1.62it/s, loss=0.0804, v_num=0, train/loss_simple_step=0.0467, train/loss_vlb_step=0.000164, train/loss_step=0.0467, global_step=3091.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2213/5971 [22:45<38:37,  1.62it/s, loss=0.0804, v_num=0, train/loss_simple_step=0.0467, train/loss_vlb_step=0.000164, train/loss_step=0.0467, global_step=3091.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2213/5971 [22:45<38:37,  1.62it/s, loss=0.0817, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000349, train/loss_step=0.106, global_step=3092.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  37%|███▋      | 2214/5971 [22:46<38:37,  1.62it/s, loss=0.0817, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000349, train/loss_step=0.106, global_step=3092.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2214/5971 [22:46<38:37,  1.62it/s, loss=0.0815, v_num=0, train/loss_simple_step=0.00639, train/loss_vlb_step=3.18e-5, train/loss_step=0.00639, global_step=3092.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2215/5971 [22:47<38:38,  1.62it/s, loss=0.0815, v_num=0, train/loss_simple_step=0.00639, train/loss_vlb_step=3.18e-5, train/loss_step=0.00639, global_step=3092.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2215/5971 [22:47<38:38,  1.62it/s, loss=0.0882, v_num=0, train/loss_simple_step=0.244, train/loss_vlb_step=0.001, train/loss_step=0.244, global_step=3092.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]      
Epoch 5:  37%|███▋      | 2216/5971 [22:52<38:44,  1.62it/s, loss=0.0882, v_num=0, train/loss_simple_step=0.244, train/loss_vlb_step=0.001, train/loss_step=0.244, global_step=3092.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2216/5971 [22:52<38:44,  1.62it/s, loss=0.0875, v_num=0, train/loss_simple_step=0.00941, train/loss_vlb_step=4.44e-5, train/loss_step=0.00941, global_step=3092.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2217/5971 [22:53<38:45,  1.61it/s, loss=0.0875, v_num=0, train/loss_simple_step=0.00941, train/loss_vlb_step=4.44e-5, train/loss_step=0.00941, global_step=3092.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2217/5971 [22:53<38:45,  1.61it/s, loss=0.0878, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.17e-5, train/loss_step=0.0149, global_step=3093.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  37%|███▋      | 2218/5971 [22:55<38:46,  1.61it/s, loss=0.0878, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.17e-5, train/loss_step=0.0149, global_step=3093.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2218/5971 [22:55<38:46,  1.61it/s, loss=0.0818, v_num=0, train/loss_simple_step=0.0684, train/loss_vlb_step=0.000233, train/loss_step=0.0684, global_step=3093.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2219/5971 [22:57<38:47,  1.61it/s, loss=0.0818, v_num=0, train/loss_simple_step=0.0684, train/loss_vlb_step=0.000233, train/loss_step=0.0684, global_step=3093.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2219/5971 [22:57<38:47,  1.61it/s, loss=0.0737, v_num=0, train/loss_simple_step=0.0743, train/loss_vlb_step=0.000248, train/loss_step=0.0743, global_step=3093.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2220/5971 [23:01<38:53,  1.61it/s, loss=0.0737, v_num=0, train/loss_simple_step=0.0743, train/loss_vlb_step=0.000248, train/loss_step=0.0743, global_step=3093.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2220/5971 [23:01<38:53,  1.61it/s, loss=0.0979, v_num=0, train/loss_simple_step=0.501, train/loss_vlb_step=0.00387, train/loss_step=0.501, global_step=3093.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  37%|███▋      | 2221/5971 [23:03<38:54,  1.61it/s, loss=0.0979, v_num=0, train/loss_simple_step=0.501, train/loss_vlb_step=0.00387, train/loss_step=0.501, global_step=3093.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2221/5971 [23:03<38:54,  1.61it/s, loss=0.105, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000507, train/loss_step=0.153, global_step=3094.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2222/5971 [23:05<38:55,  1.60it/s, loss=0.105, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000507, train/loss_step=0.153, global_step=3094.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2222/5971 [23:05<38:55,  1.60it/s, loss=0.0949, v_num=0, train/loss_simple_step=0.00652, train/loss_vlb_step=3.27e-5, train/loss_step=0.00652, global_step=3094.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2223/5971 [23:06<38:56,  1.60it/s, loss=0.0949, v_num=0, train/loss_simple_step=0.00652, train/loss_vlb_step=3.27e-5, train/loss_step=0.00652, global_step=3094.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2223/5971 [23:06<38:56,  1.60it/s, loss=0.0972, v_num=0, train/loss_simple_step=0.0513, train/loss_vlb_step=0.000184, train/loss_step=0.0513, global_step=3094.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  37%|███▋      | 2224/5971 [23:10<39:01,  1.60it/s, loss=0.0972, v_num=0, train/loss_simple_step=0.0513, train/loss_vlb_step=0.000184, train/loss_step=0.0513, global_step=3094.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2224/5971 [23:10<39:01,  1.60it/s, loss=0.0945, v_num=0, train/loss_simple_step=0.00516, train/loss_vlb_step=2.59e-5, train/loss_step=0.00516, global_step=3094.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2225/5971 [23:11<39:01,  1.60it/s, loss=0.0945, v_num=0, train/loss_simple_step=0.00516, train/loss_vlb_step=2.59e-5, train/loss_step=0.00516, global_step=3094.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2225/5971 [23:11<39:02,  1.60it/s, loss=0.108, v_num=0, train/loss_simple_step=0.328, train/loss_vlb_step=0.00138, train/loss_step=0.328, global_step=3095.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:  37%|███▋      | 2226/5971 [23:13<39:02,  1.60it/s, loss=0.108, v_num=0, train/loss_simple_step=0.328, train/loss_vlb_step=0.00138, train/loss_step=0.328, global_step=3095.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2226/5971 [23:13<39:02,  1.60it/s, loss=0.11, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.00102, train/loss_step=0.228, global_step=3095.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  37%|███▋      | 2227/5971 [23:14<39:03,  1.60it/s, loss=0.11, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.00102, train/loss_step=0.228, global_step=3095.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2227/5971 [23:14<39:03,  1.60it/s, loss=0.124, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.00106, train/loss_step=0.283, global_step=3095.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2228/5971 [23:18<39:08,  1.59it/s, loss=0.124, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.00106, train/loss_step=0.283, global_step=3095.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2228/5971 [23:18<39:08,  1.59it/s, loss=0.145, v_num=0, train/loss_simple_step=0.546, train/loss_vlb_step=0.00515, train/loss_step=0.546, global_step=3095.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2229/5971 [23:22<39:13,  1.59it/s, loss=0.145, v_num=0, train/loss_simple_step=0.546, train/loss_vlb_step=0.00515, train/loss_step=0.546, global_step=3095.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2229/5971 [23:22<39:13,  1.59it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0873, train/loss_vlb_step=0.000292, train/loss_step=0.0873, global_step=3096.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2230/5971 [23:23<39:14,  1.59it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0873, train/loss_vlb_step=0.000292, train/loss_step=0.0873, global_step=3096.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2230/5971 [23:23<39:14,  1.59it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0061, train/loss_vlb_step=2.87e-5, train/loss_step=0.0061, global_step=3096.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  37%|███▋      | 2231/5971 [23:25<39:14,  1.59it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0061, train/loss_vlb_step=2.87e-5, train/loss_step=0.0061, global_step=3096.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2231/5971 [23:25<39:14,  1.59it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0443, train/loss_vlb_step=0.000157, train/loss_step=0.0443, global_step=3096.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2232/5971 [23:28<39:18,  1.59it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0443, train/loss_vlb_step=0.000157, train/loss_step=0.0443, global_step=3096.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2232/5971 [23:28<39:18,  1.59it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0467, train/loss_vlb_step=0.000162, train/loss_step=0.0467, global_step=3096.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2233/5971 [23:30<39:19,  1.58it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0467, train/loss_vlb_step=0.000162, train/loss_step=0.0467, global_step=3096.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2233/5971 [23:30<39:19,  1.58it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0962, train/loss_vlb_step=0.000321, train/loss_step=0.0962, global_step=3097.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2234/5971 [23:31<39:20,  1.58it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0962, train/loss_vlb_step=0.000321, train/loss_step=0.0962, global_step=3097.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2234/5971 [23:31<39:20,  1.58it/s, loss=0.15, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.000874, train/loss_step=0.210, global_step=3097.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  37%|███▋      | 2235/5971 [23:33<39:21,  1.58it/s, loss=0.15, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.000874, train/loss_step=0.210, global_step=3097.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2235/5971 [23:33<39:21,  1.58it/s, loss=0.147, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000621, train/loss_step=0.181, global_step=3097.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2236/5971 [23:37<39:26,  1.58it/s, loss=0.147, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000621, train/loss_step=0.181, global_step=3097.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2236/5971 [23:37<39:26,  1.58it/s, loss=0.148, v_num=0, train/loss_simple_step=0.038, train/loss_vlb_step=0.000134, train/loss_step=0.038, global_step=3097.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2237/5971 [23:39<39:27,  1.58it/s, loss=0.148, v_num=0, train/loss_simple_step=0.038, train/loss_vlb_step=0.000134, train/loss_step=0.038, global_step=3097.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2237/5971 [23:39<39:27,  1.58it/s, loss=0.167, v_num=0, train/loss_simple_step=0.393, train/loss_vlb_step=0.00235, train/loss_step=0.393, global_step=3098.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  37%|███▋      | 2238/5971 [23:40<39:28,  1.58it/s, loss=0.167, v_num=0, train/loss_simple_step=0.393, train/loss_vlb_step=0.00235, train/loss_step=0.393, global_step=3098.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2238/5971 [23:40<39:28,  1.58it/s, loss=0.166, v_num=0, train/loss_simple_step=0.040, train/loss_vlb_step=0.000143, train/loss_step=0.040, global_step=3098.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2239/5971 [23:42<39:29,  1.57it/s, loss=0.166, v_num=0, train/loss_simple_step=0.040, train/loss_vlb_step=0.000143, train/loss_step=0.040, global_step=3098.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  37%|███▋      | 2239/5971 [23:42<39:29,  1.57it/s, loss=0.204, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.0396, train/loss_step=0.839, global_step=3098.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  38%|███▊      | 2240/5971 [23:45<39:33,  1.57it/s, loss=0.204, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.0396, train/loss_step=0.839, global_step=3098.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  38%|███▊      | 2240/5971 [23:45<39:33,  1.57it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0282, train/loss_vlb_step=0.000108, train/loss_step=0.0282, global_step=3098.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  38%|███▊      | 2241/5971 [23:47<39:35,  1.57it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0282, train/loss_vlb_step=0.000108, train/loss_step=0.0282, global_step=3098.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  38%|███▊      | 2241/5971 [23:47<39:35,  1.57it/s, loss=0.181, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000521, train/loss_step=0.157, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  38%|███▊      | 2242/5971 [23:49<39:36,  1.57it/s, loss=0.181, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000521, train/loss_step=0.157, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  38%|███▊      | 2242/5971 [23:49<39:36,  1.57it/s, loss=0.189, v_num=0, train/loss_simple_step=0.169, train/loss_vlb_step=0.000562, train/loss_step=0.169, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  38%|███▊      | 2243/5971 [23:51<39:37,  1.57it/s, loss=0.189, v_num=0, train/loss_simple_step=0.169, train/loss_vlb_step=0.000562, train/loss_step=0.169, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  38%|███▊      | 2243/5971 [23:51<39:37,  1.57it/s, loss=0.218, v_num=0, train/loss_simple_step=0.635, train/loss_vlb_step=0.00681, train/loss_step=0.635, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  38%|███▊      | 2244/5971 [23:55<39:42,  1.56it/s, loss=0.218, v_num=0, train/loss_simple_step=0.635, train/loss_vlb_step=0.00681, train/loss_step=0.635, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  38%|███▊      | 2244/5971 [23:55<39:42,  1.56it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:01<02:58,  1.07s/it][A
Epoch 5:  38%|███▊      | 2246/5971 [23:56<39:41,  1.56it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   2%|▏         | 3/167 [00:01<00:56,  2.90it/s][A
Epoch 5:  38%|███▊      | 2248/5971 [23:56<39:38,  1.57it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   3%|▎         | 5/167 [00:01<00:34,  4.66it/s][A
Epoch 5:  38%|███▊      | 2250/5971 [23:56<39:34,  1.57it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   4%|▎         | 6/167 [00:01<00:31,  5.08it/s][A

Validating:   4%|▍         | 7/167 [00:01<00:28,  5.57it/s][A
Epoch 5:  38%|███▊      | 2252/5971 [23:57<39:32,  1.57it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   5%|▌         | 9/167 [00:01<00:20,  7.57it/s][A
Epoch 5:  38%|███▊      | 2254/5971 [23:57<39:28,  1.57it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   7%|▋         | 11/167 [00:02<00:16,  9.28it/s][A
Epoch 5:  38%|███▊      | 2256/5971 [23:57<39:25,  1.57it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   8%|▊         | 13/167 [00:02<00:14, 10.50it/s][A
Epoch 5:  38%|███▊      | 2258/5971 [23:57<39:22,  1.57it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   9%|▉         | 15/167 [00:02<00:12, 12.08it/s][A
Epoch 5:  38%|███▊      | 2260/5971 [23:57<39:19,  1.57it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  10%|█         | 17/167 [00:02<00:11, 13.11it/s][A
Epoch 5:  38%|███▊      | 2262/5971 [23:57<39:16,  1.57it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  11%|█▏        | 19/167 [00:02<00:10, 13.90it/s][A
Epoch 5:  38%|███▊      | 2264/5971 [23:57<39:13,  1.58it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  13%|█▎        | 21/167 [00:02<00:10, 14.39it/s][A
Epoch 5:  38%|███▊      | 2266/5971 [23:57<39:10,  1.58it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  14%|█▍        | 23/167 [00:02<00:09, 15.34it/s][A
Epoch 5:  38%|███▊      | 2268/5971 [23:58<39:06,  1.58it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  15%|█▍        | 25/167 [00:02<00:08, 15.91it/s][A
Epoch 5:  38%|███▊      | 2270/5971 [23:58<39:03,  1.58it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  16%|█▌        | 27/167 [00:02<00:08, 16.05it/s][A
Epoch 5:  38%|███▊      | 2272/5971 [23:58<39:00,  1.58it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  18%|█▊        | 30/167 [00:03<00:07, 17.75it/s][A
Epoch 5:  38%|███▊      | 2275/5971 [23:58<38:55,  1.58it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  19%|█▉        | 32/167 [00:03<00:07, 17.98it/s][A
Epoch 5:  38%|███▊      | 2278/5971 [23:58<38:51,  1.58it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  20%|██        | 34/167 [00:03<00:07, 18.42it/s][A

Validating:  22%|██▏       | 36/167 [00:03<00:07, 18.26it/s][A
Epoch 5:  38%|███▊      | 2281/5971 [23:58<38:46,  1.59it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  23%|██▎       | 38/167 [00:03<00:07, 17.33it/s][A
Epoch 5:  38%|███▊      | 2284/5971 [23:58<38:41,  1.59it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  24%|██▍       | 40/167 [00:03<00:07, 17.38it/s][A

Validating:  25%|██▌       | 42/167 [00:03<00:07, 17.47it/s][A
Epoch 5:  38%|███▊      | 2287/5971 [23:59<38:37,  1.59it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  26%|██▋       | 44/167 [00:03<00:06, 18.12it/s][A
Epoch 5:  38%|███▊      | 2290/5971 [23:59<38:32,  1.59it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  28%|██▊       | 46/167 [00:04<00:06, 17.88it/s][A

Validating:  29%|██▊       | 48/167 [00:04<00:06, 17.58it/s][A
Epoch 5:  38%|███▊      | 2293/5971 [23:59<38:27,  1.59it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  30%|██▉       | 50/167 [00:04<00:06, 17.50it/s][A
Epoch 5:  38%|███▊      | 2296/5971 [23:59<38:23,  1.60it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  31%|███       | 52/167 [00:04<00:06, 17.26it/s][A

Validating:  32%|███▏      | 54/167 [00:04<00:06, 16.35it/s][A
Epoch 5:  39%|███▊      | 2299/5971 [23:59<38:18,  1.60it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  34%|███▎      | 56/167 [00:04<00:08, 13.34it/s][A
Epoch 5:  39%|███▊      | 2302/5971 [24:00<38:14,  1.60it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  35%|███▍      | 58/167 [00:04<00:07, 13.69it/s][A

Validating:  36%|███▌      | 60/167 [00:04<00:07, 14.62it/s][A
Epoch 5:  39%|███▊      | 2305/5971 [24:00<38:09,  1.60it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  37%|███▋      | 62/167 [00:05<00:06, 15.88it/s][A
Epoch 5:  39%|███▊      | 2308/5971 [24:00<38:05,  1.60it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  38%|███▊      | 64/167 [00:05<00:06, 14.84it/s][A

Validating:  40%|███▉      | 66/167 [00:05<00:06, 15.37it/s][A
Epoch 5:  39%|███▊      | 2311/5971 [24:00<38:00,  1.60it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  41%|████      | 68/167 [00:05<00:06, 15.65it/s][A
Epoch 5:  39%|███▉      | 2314/5971 [24:00<37:56,  1.61it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  42%|████▏     | 70/167 [00:05<00:06, 14.76it/s][A

Validating:  43%|████▎     | 72/167 [00:05<00:06, 14.36it/s][A
Epoch 5:  39%|███▉      | 2317/5971 [24:01<37:51,  1.61it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  44%|████▍     | 74/167 [00:05<00:06, 15.19it/s][A
Epoch 5:  39%|███▉      | 2320/5971 [24:01<37:47,  1.61it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  46%|████▌     | 77/167 [00:06<00:05, 17.17it/s][A
Epoch 5:  39%|███▉      | 2323/5971 [24:01<37:42,  1.61it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  47%|████▋     | 79/167 [00:06<00:04, 17.62it/s][A
Epoch 5:  39%|███▉      | 2326/5971 [24:01<37:38,  1.61it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  49%|████▉     | 82/167 [00:06<00:04, 18.41it/s][A

Validating:  50%|█████     | 84/167 [00:06<00:04, 17.91it/s][A
Epoch 5:  39%|███▉      | 2329/5971 [24:01<37:33,  1.62it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  51%|█████▏    | 86/167 [00:06<00:04, 17.63it/s][A
Epoch 5:  39%|███▉      | 2332/5971 [24:01<37:29,  1.62it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  53%|█████▎    | 89/167 [00:06<00:04, 18.76it/s][A
Epoch 5:  39%|███▉      | 2335/5971 [24:02<37:24,  1.62it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  55%|█████▌    | 92/167 [00:06<00:03, 19.49it/s][A
Epoch 5:  39%|███▉      | 2338/5971 [24:02<37:20,  1.62it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  57%|█████▋    | 95/167 [00:06<00:03, 20.07it/s][A
Epoch 5:  39%|███▉      | 2341/5971 [24:02<37:15,  1.62it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  59%|█████▊    | 98/167 [00:07<00:03, 20.52it/s][A
Epoch 5:  39%|███▉      | 2344/5971 [24:02<37:11,  1.63it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  60%|██████    | 101/167 [00:07<00:03, 20.46it/s][A
Epoch 5:  39%|███▉      | 2347/5971 [24:02<37:06,  1.63it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  62%|██████▏   | 104/167 [00:07<00:03, 20.14it/s][A
Epoch 5:  39%|███▉      | 2350/5971 [24:02<37:02,  1.63it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  64%|██████▍   | 107/167 [00:07<00:02, 20.21it/s][A
Epoch 5:  39%|███▉      | 2353/5971 [24:02<36:57,  1.63it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  66%|██████▌   | 110/167 [00:07<00:02, 19.90it/s][A
Epoch 5:  39%|███▉      | 2356/5971 [24:03<36:53,  1.63it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  68%|██████▊   | 113/167 [00:07<00:02, 19.98it/s][A
Epoch 5:  40%|███▉      | 2359/5971 [24:03<36:48,  1.64it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  69%|██████▉   | 116/167 [00:07<00:02, 21.15it/s][A
Epoch 5:  40%|███▉      | 2362/5971 [24:03<36:44,  1.64it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  71%|███████▏  | 119/167 [00:08<00:02, 21.41it/s][A
Epoch 5:  40%|███▉      | 2365/5971 [24:03<36:39,  1.64it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  73%|███████▎  | 122/167 [00:08<00:02, 21.73it/s][A
Epoch 5:  40%|███▉      | 2368/5971 [24:03<36:35,  1.64it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  75%|███████▍  | 125/167 [00:08<00:01, 22.23it/s][A
Epoch 5:  40%|███▉      | 2371/5971 [24:03<36:31,  1.64it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  77%|███████▋  | 128/167 [00:08<00:01, 22.39it/s][A
Epoch 5:  40%|███▉      | 2374/5971 [24:03<36:26,  1.64it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  78%|███████▊  | 131/167 [00:08<00:01, 21.57it/s][A
Epoch 5:  40%|███▉      | 2377/5971 [24:04<36:22,  1.65it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  80%|████████  | 134/167 [00:08<00:01, 20.74it/s][A
Epoch 5:  40%|███▉      | 2380/5971 [24:04<36:18,  1.65it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  82%|████████▏ | 137/167 [00:08<00:01, 21.13it/s][A
Epoch 5:  40%|███▉      | 2383/5971 [24:04<36:13,  1.65it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  84%|████████▍ | 140/167 [00:09<00:01, 20.54it/s][A
Epoch 5:  40%|███▉      | 2386/5971 [24:04<36:09,  1.65it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  86%|████████▌ | 143/167 [00:09<00:01, 20.06it/s][A
Epoch 5:  40%|████      | 2389/5971 [24:04<36:05,  1.65it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  87%|████████▋ | 146/167 [00:09<00:01, 20.37it/s][A
Epoch 5:  40%|████      | 2392/5971 [24:04<36:00,  1.66it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  89%|████████▉ | 149/167 [00:09<00:00, 19.32it/s][A
Epoch 5:  40%|████      | 2395/5971 [24:04<35:56,  1.66it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  91%|█████████ | 152/167 [00:09<00:00, 19.75it/s][A
Epoch 5:  40%|████      | 2398/5971 [24:05<35:52,  1.66it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  93%|█████████▎| 155/167 [00:09<00:00, 19.89it/s][A
Epoch 5:  40%|████      | 2401/5971 [24:05<35:47,  1.66it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  94%|█████████▍| 157/167 [00:09<00:00, 19.20it/s][A
Epoch 5:  40%|████      | 2404/5971 [24:05<35:43,  1.66it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  96%|█████████▌| 160/167 [00:10<00:00, 19.12it/s][A

Validating:  97%|█████████▋| 162/167 [00:10<00:00, 18.93it/s][A
Epoch 5:  40%|████      | 2407/5971 [24:05<35:39,  1.67it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  99%|█████████▉| 165/167 [00:10<00:00, 19.99it/s][A
Epoch 5:  40%|████      | 2410/5971 [24:05<35:35,  1.67it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating: 100%|██████████| 167/167 [00:10<00:00, 19.60it/s][A
Epoch 5:  40%|████      | 2412/5971 [24:06<35:32,  1.67it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:35,  1.36it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:01<00:27,  1.72it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:25,  1.86it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:02<00:23,  1.96it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:02<00:21,  2.05it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:02<00:19,  2.26it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:03<00:17,  2.43it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:03<00:16,  2.52it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:04<00:15,  2.65it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:04<00:13,  2.90it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:04<00:12,  3.15it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:04<00:11,  3.40it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:05<00:10,  3.67it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:05<00:09,  3.90it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:05<00:08,  4.08it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:05<00:08,  4.21it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:05<00:08,  4.02it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:06<00:07,  4.04it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:06<00:07,  4.03it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:06<00:07,  3.88it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:07<00:08,  3.41it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:07<00:08,  3.12it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:07<00:08,  3.07it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:08<00:07,  3.25it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:08<00:07,  3.41it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:08<00:08,  2.98it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:09<00:07,  3.16it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:09<00:06,  3.27it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:09<00:06,  3.44it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:09<00:05,  3.53it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:10<00:05,  3.59it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:10<00:04,  3.65it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:10<00:04,  3.60it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:10<00:04,  3.52it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:11<00:04,  3.49it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:11<00:04,  3.23it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:11<00:03,  3.34it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:12<00:03,  3.47it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:12<00:03,  3.28it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:12<00:03,  3.16it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:13<00:02,  3.25it/s][A
Epoch 5:  40%|████      | 2412/5971 [24:21<35:55,  1.65it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Spaced Sampler:  84%|████████▍ | 42/50 [00:13<00:02,  3.34it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:13<00:02,  3.40it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:13<00:01,  3.43it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:14<00:01,  3.43it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:14<00:01,  3.41it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:14<00:00,  3.42it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:15<00:00,  3.43it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:15<00:00,  3.33it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:15<00:00,  3.30it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:15<00:00,  3.17it/s]

Epoch 5:  40%|████      | 2413/5971 [24:25<36:00,  1.65it/s, loss=0.224, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000422, train/loss_step=0.125, global_step=3099.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  40%|████      | 2413/5971 [24:25<36:00,  1.65it/s, loss=0.208, v_num=0, train/loss_simple_step=0.00969, train/loss_vlb_step=4.54e-5, train/loss_step=0.00969, global_step=3100.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.31it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:01<00:23,  2.01it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:17,  2.66it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:15,  2.98it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:02<00:17,  2.62it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:02<00:17,  2.54it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:02<00:14,  2.98it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:12,  3.36it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:03<00:10,  3.76it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:03<00:09,  4.05it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:03<00:09,  3.98it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:03<00:11,  3.24it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:04<00:11,  3.27it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:04<00:10,  3.52it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:04<00:09,  3.65it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:05<00:08,  3.82it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:05<00:09,  3.60it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:05<00:10,  3.00it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:06<00:10,  2.95it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:06<00:09,  3.15it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:06<00:08,  3.36it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:06<00:08,  3.46it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:07<00:07,  3.46it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:07<00:07,  3.68it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:07<00:06,  3.90it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:07<00:05,  4.16it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:08<00:05,  4.05it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:08<00:05,  3.88it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:08<00:05,  3.87it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:08<00:05,  3.80it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:09<00:05,  3.69it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:09<00:05,  3.47it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:09<00:04,  3.46it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:10<00:04,  3.63it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:10<00:03,  3.84it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:10<00:03,  4.00it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:10<00:03,  3.95it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:11<00:03,  3.79it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:11<00:03,  3.57it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:11<00:02,  3.75it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:11<00:02,  3.93it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:12<00:02,  3.99it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:12<00:01,  4.03it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:12<00:01,  4.01it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:12<00:01,  4.02it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:13<00:00,  4.01it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:13<00:00,  3.99it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:13<00:00,  4.04it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:13<00:00,  4.07it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:14<00:00,  4.05it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:14<00:00,  3.55it/s]

Epoch 5:  40%|████      | 2414/5971 [24:42<36:23,  1.63it/s, loss=0.208, v_num=0, train/loss_simple_step=0.00969, train/loss_vlb_step=4.54e-5, train/loss_step=0.00969, global_step=3100.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  40%|████      | 2414/5971 [24:42<36:23,  1.63it/s, loss=0.202, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000334, train/loss_step=0.101, global_step=3100.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.35it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:01<00:25,  1.89it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:21,  2.19it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:19,  2.39it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:02<00:18,  2.45it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:02<00:17,  2.48it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:03<00:17,  2.50it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:03<00:16,  2.51it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:03<00:16,  2.55it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:04<00:15,  2.54it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:04<00:15,  2.50it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:05<00:15,  2.49it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:05<00:14,  2.52it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:05<00:13,  2.64it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:06<00:12,  2.84it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:06<00:11,  3.08it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:06<00:10,  3.25it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:06<00:09,  3.42it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:07<00:08,  3.53it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:07<00:08,  3.34it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:07<00:09,  3.17it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:08<00:09,  3.08it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:08<00:08,  3.21it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:08<00:08,  3.18it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:09<00:07,  3.27it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:09<00:07,  3.09it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:09<00:07,  2.98it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:10<00:07,  3.09it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:10<00:06,  3.23it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:10<00:06,  3.31it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:10<00:06,  3.12it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:11<00:06,  2.98it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:11<00:05,  3.13it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:11<00:04,  3.27it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:12<00:04,  3.38it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:12<00:03,  3.51it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:12<00:03,  3.56it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:12<00:03,  3.64it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:13<00:03,  3.62it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:13<00:02,  3.65it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:13<00:02,  3.66it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:14<00:02,  3.65it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:14<00:01,  3.61it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:14<00:01,  3.57it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:14<00:01,  3.45it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:15<00:01,  3.27it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:15<00:00,  3.17it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:15<00:00,  3.05it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:16<00:00,  3.02it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:16<00:00,  3.29it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:16<00:00,  3.02it/s]

Epoch 5:  40%|████      | 2415/5971 [25:02<36:51,  1.61it/s, loss=0.202, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000334, train/loss_step=0.101, global_step=3100.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  40%|████      | 2415/5971 [25:02<36:51,  1.61it/s, loss=0.205, v_num=0, train/loss_simple_step=0.356, train/loss_vlb_step=0.00181, train/loss_step=0.356, global_step=3100.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.35it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:01<00:25,  1.91it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:22,  2.14it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:20,  2.27it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:02<00:18,  2.39it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:02<00:17,  2.47it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:03<00:17,  2.51it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:03<00:16,  2.60it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:03<00:15,  2.64it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:04<00:14,  2.70it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:04<00:14,  2.75it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:04<00:13,  2.81it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:05<00:13,  2.81it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:05<00:13,  2.72it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:05<00:13,  2.68it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:06<00:12,  2.67it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:06<00:12,  2.65it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:07<00:12,  2.53it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:07<00:12,  2.52it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:07<00:11,  2.61it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:08<00:10,  2.76it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:08<00:09,  2.92it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:08<00:08,  3.08it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:09<00:09,  2.61it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:09<00:08,  2.91it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:09<00:07,  3.19it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:10<00:06,  3.40it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:10<00:06,  3.56it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:10<00:06,  3.08it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:10<00:06,  3.33it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:11<00:05,  3.51it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:11<00:04,  3.63it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:11<00:04,  3.68it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:12<00:04,  3.38it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:12<00:04,  3.12it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:12<00:04,  3.07it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:13<00:04,  3.00it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:13<00:03,  3.04it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:13<00:03,  2.98it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:14<00:03,  2.98it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:14<00:03,  2.97it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:14<00:02,  3.01it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:15<00:02,  2.96it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:15<00:02,  2.91it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:15<00:01,  2.86it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:16<00:01,  2.84it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:16<00:00,  3.06it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:16<00:00,  3.30it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:17<00:00,  3.45it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:17<00:00,  3.61it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:17<00:00,  2.89it/s]

Epoch 5:  40%|████      | 2416/5971 [25:25<37:23,  1.58it/s, loss=0.205, v_num=0, train/loss_simple_step=0.356, train/loss_vlb_step=0.00181, train/loss_step=0.356, global_step=3100.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  40%|████      | 2416/5971 [25:25<37:23,  1.58it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0195, train/loss_vlb_step=8.06e-5, train/loss_step=0.0195, global_step=3100.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  40%|████      | 2417/5971 [25:27<37:24,  1.58it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0195, train/loss_vlb_step=8.06e-5, train/loss_step=0.0195, global_step=3100.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  40%|████      | 2417/5971 [25:27<37:24,  1.58it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.45e-5, train/loss_step=0.0132, global_step=3101.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  40%|████      | 2418/5971 [25:28<37:25,  1.58it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.45e-5, train/loss_step=0.0132, global_step=3101.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  40%|████      | 2418/5971 [25:28<37:25,  1.58it/s, loss=0.198, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.00256, train/loss_step=0.454, global_step=3101.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  41%|████      | 2419/5971 [25:29<37:25,  1.58it/s, loss=0.198, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.00256, train/loss_step=0.454, global_step=3101.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2419/5971 [25:29<37:25,  1.58it/s, loss=0.202, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.00043, train/loss_step=0.126, global_step=3101.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2420/5971 [25:33<37:29,  1.58it/s, loss=0.202, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.00043, train/loss_step=0.126, global_step=3101.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2420/5971 [25:33<37:29,  1.58it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0131, train/loss_vlb_step=5.69e-5, train/loss_step=0.0131, global_step=3101.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2421/5971 [25:34<37:29,  1.58it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0131, train/loss_vlb_step=5.69e-5, train/loss_step=0.0131, global_step=3101.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2421/5971 [25:34<37:29,  1.58it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0259, train/loss_vlb_step=0.000101, train/loss_step=0.0259, global_step=3102.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2422/5971 [25:35<37:29,  1.58it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0259, train/loss_vlb_step=0.000101, train/loss_step=0.0259, global_step=3102.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2422/5971 [25:35<37:29,  1.58it/s, loss=0.199, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.000974, train/loss_step=0.249, global_step=3102.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  41%|████      | 2423/5971 [25:37<37:30,  1.58it/s, loss=0.199, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.000974, train/loss_step=0.249, global_step=3102.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2423/5971 [25:37<37:30,  1.58it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0718, train/loss_vlb_step=0.000242, train/loss_step=0.0718, global_step=3102.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2424/5971 [25:41<37:34,  1.57it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0718, train/loss_vlb_step=0.000242, train/loss_step=0.0718, global_step=3102.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2424/5971 [25:41<37:34,  1.57it/s, loss=0.197, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000391, train/loss_step=0.118, global_step=3102.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  41%|████      | 2425/5971 [25:42<37:34,  1.57it/s, loss=0.197, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000391, train/loss_step=0.118, global_step=3102.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2425/5971 [25:42<37:34,  1.57it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0177, train/loss_vlb_step=7.02e-5, train/loss_step=0.0177, global_step=3103.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2426/5971 [25:43<37:34,  1.57it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0177, train/loss_vlb_step=7.02e-5, train/loss_step=0.0177, global_step=3103.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2426/5971 [25:43<37:34,  1.57it/s, loss=0.18, v_num=0, train/loss_simple_step=0.076, train/loss_vlb_step=0.000251, train/loss_step=0.076, global_step=3103.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  41%|████      | 2427/5971 [25:45<37:35,  1.57it/s, loss=0.18, v_num=0, train/loss_simple_step=0.076, train/loss_vlb_step=0.000251, train/loss_step=0.076, global_step=3103.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2427/5971 [25:45<37:35,  1.57it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=5.15e-5, train/loss_step=0.0116, global_step=3103.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2428/5971 [25:48<37:37,  1.57it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=5.15e-5, train/loss_step=0.0116, global_step=3103.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2428/5971 [25:48<37:37,  1.57it/s, loss=0.152, v_num=0, train/loss_simple_step=0.299, train/loss_vlb_step=0.00148, train/loss_step=0.299, global_step=3103.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  41%|████      | 2429/5971 [25:49<37:38,  1.57it/s, loss=0.152, v_num=0, train/loss_simple_step=0.299, train/loss_vlb_step=0.00148, train/loss_step=0.299, global_step=3103.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2429/5971 [25:49<37:38,  1.57it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0799, train/loss_vlb_step=0.000269, train/loss_step=0.0799, global_step=3104.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2430/5971 [25:50<37:38,  1.57it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0799, train/loss_vlb_step=0.000269, train/loss_step=0.0799, global_step=3104.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2430/5971 [25:50<37:38,  1.57it/s, loss=0.158, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00175, train/loss_step=0.353, global_step=3104.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  41%|████      | 2431/5971 [25:51<37:37,  1.57it/s, loss=0.158, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00175, train/loss_step=0.353, global_step=3104.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2431/5971 [25:51<37:37,  1.57it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0911, train/loss_vlb_step=0.0003, train/loss_step=0.0911, global_step=3104.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2432/5971 [25:54<37:40,  1.57it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0911, train/loss_vlb_step=0.0003, train/loss_step=0.0911, global_step=3104.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2432/5971 [25:54<37:40,  1.57it/s, loss=0.131, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000473, train/loss_step=0.143, global_step=3104.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2433/5971 [25:55<37:40,  1.56it/s, loss=0.131, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000473, train/loss_step=0.143, global_step=3104.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2433/5971 [25:55<37:40,  1.56it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00325, train/loss_vlb_step=1.78e-5, train/loss_step=0.00325, global_step=3105.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2434/5971 [25:56<37:40,  1.56it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00325, train/loss_vlb_step=1.78e-5, train/loss_step=0.00325, global_step=3105.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2434/5971 [25:56<37:40,  1.56it/s, loss=0.126, v_num=0, train/loss_simple_step=0.002, train/loss_vlb_step=1.19e-5, train/loss_step=0.002, global_step=3105.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  41%|████      | 2435/5971 [25:57<37:40,  1.56it/s, loss=0.126, v_num=0, train/loss_simple_step=0.002, train/loss_vlb_step=1.19e-5, train/loss_step=0.002, global_step=3105.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2435/5971 [25:57<37:40,  1.56it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00429, train/loss_vlb_step=2.2e-5, train/loss_step=0.00429, global_step=3105.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2436/5971 [26:00<37:43,  1.56it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00429, train/loss_vlb_step=2.2e-5, train/loss_step=0.00429, global_step=3105.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2436/5971 [26:00<37:43,  1.56it/s, loss=0.123, v_num=0, train/loss_simple_step=0.307, train/loss_vlb_step=0.00202, train/loss_step=0.307, global_step=3105.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  41%|████      | 2437/5971 [26:01<37:43,  1.56it/s, loss=0.123, v_num=0, train/loss_simple_step=0.307, train/loss_vlb_step=0.00202, train/loss_step=0.307, global_step=3105.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2437/5971 [26:01<37:43,  1.56it/s, loss=0.123, v_num=0, train/loss_simple_step=0.019, train/loss_vlb_step=7.22e-5, train/loss_step=0.019, global_step=3106.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2438/5971 [26:02<37:43,  1.56it/s, loss=0.123, v_num=0, train/loss_simple_step=0.019, train/loss_vlb_step=7.22e-5, train/loss_step=0.019, global_step=3106.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2438/5971 [26:02<37:43,  1.56it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0474, train/loss_vlb_step=0.000166, train/loss_step=0.0474, global_step=3106.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2439/5971 [26:03<37:43,  1.56it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0474, train/loss_vlb_step=0.000166, train/loss_step=0.0474, global_step=3106.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2439/5971 [26:03<37:43,  1.56it/s, loss=0.0969, v_num=0, train/loss_simple_step=0.00569, train/loss_vlb_step=2.91e-5, train/loss_step=0.00569, global_step=3106.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2440/5971 [26:06<37:46,  1.56it/s, loss=0.0969, v_num=0, train/loss_simple_step=0.00569, train/loss_vlb_step=2.91e-5, train/loss_step=0.00569, global_step=3106.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2440/5971 [26:06<37:46,  1.56it/s, loss=0.111, v_num=0, train/loss_simple_step=0.304, train/loss_vlb_step=0.00119, train/loss_step=0.304, global_step=3106.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:  41%|████      | 2441/5971 [26:07<37:46,  1.56it/s, loss=0.111, v_num=0, train/loss_simple_step=0.304, train/loss_vlb_step=0.00119, train/loss_step=0.304, global_step=3106.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2441/5971 [26:07<37:46,  1.56it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0183, train/loss_vlb_step=7.45e-5, train/loss_step=0.0183, global_step=3107.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2442/5971 [26:09<37:46,  1.56it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0183, train/loss_vlb_step=7.45e-5, train/loss_step=0.0183, global_step=3107.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2442/5971 [26:09<37:46,  1.56it/s, loss=0.108, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000677, train/loss_step=0.188, global_step=3107.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  41%|████      | 2443/5971 [26:10<37:46,  1.56it/s, loss=0.108, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000677, train/loss_step=0.188, global_step=3107.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2443/5971 [26:10<37:46,  1.56it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0281, train/loss_vlb_step=0.00011, train/loss_step=0.0281, global_step=3107.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2444/5971 [26:13<37:50,  1.55it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0281, train/loss_vlb_step=0.00011, train/loss_step=0.0281, global_step=3107.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2444/5971 [26:13<37:50,  1.55it/s, loss=0.113, v_num=0, train/loss_simple_step=0.271, train/loss_vlb_step=0.00151, train/loss_step=0.271, global_step=3107.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  41%|████      | 2445/5971 [26:15<37:50,  1.55it/s, loss=0.113, v_num=0, train/loss_simple_step=0.271, train/loss_vlb_step=0.00151, train/loss_step=0.271, global_step=3107.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2445/5971 [26:15<37:50,  1.55it/s, loss=0.129, v_num=0, train/loss_simple_step=0.326, train/loss_vlb_step=0.0015, train/loss_step=0.326, global_step=3108.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  41%|████      | 2446/5971 [26:16<37:50,  1.55it/s, loss=0.129, v_num=0, train/loss_simple_step=0.326, train/loss_vlb_step=0.0015, train/loss_step=0.326, global_step=3108.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2446/5971 [26:16<37:50,  1.55it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0364, train/loss_vlb_step=0.000131, train/loss_step=0.0364, global_step=3108.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2447/5971 [26:17<37:50,  1.55it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0364, train/loss_vlb_step=0.000131, train/loss_step=0.0364, global_step=3108.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2447/5971 [26:17<37:50,  1.55it/s, loss=0.133, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000411, train/loss_step=0.125, global_step=3108.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  41%|████      | 2448/5971 [26:20<37:53,  1.55it/s, loss=0.133, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000411, train/loss_step=0.125, global_step=3108.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2448/5971 [26:20<37:53,  1.55it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0895, train/loss_vlb_step=0.000296, train/loss_step=0.0895, global_step=3108.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2449/5971 [26:21<37:53,  1.55it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0895, train/loss_vlb_step=0.000296, train/loss_step=0.0895, global_step=3108.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2449/5971 [26:21<37:53,  1.55it/s, loss=0.14, v_num=0, train/loss_simple_step=0.445, train/loss_vlb_step=0.00334, train/loss_step=0.445, global_step=3109.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  41%|████      | 2450/5971 [26:22<37:53,  1.55it/s, loss=0.14, v_num=0, train/loss_simple_step=0.445, train/loss_vlb_step=0.00334, train/loss_step=0.445, global_step=3109.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2450/5971 [26:22<37:53,  1.55it/s, loss=0.162, v_num=0, train/loss_simple_step=0.795, train/loss_vlb_step=0.0345, train/loss_step=0.795, global_step=3109.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2451/5971 [26:23<37:53,  1.55it/s, loss=0.162, v_num=0, train/loss_simple_step=0.795, train/loss_vlb_step=0.0345, train/loss_step=0.795, global_step=3109.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2451/5971 [26:23<37:53,  1.55it/s, loss=0.18, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00353, train/loss_step=0.448, global_step=3109.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2452/5971 [26:26<37:56,  1.55it/s, loss=0.18, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00353, train/loss_step=0.448, global_step=3109.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2452/5971 [26:26<37:56,  1.55it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00531, train/loss_vlb_step=2.74e-5, train/loss_step=0.00531, global_step=3109.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2453/5971 [26:27<37:56,  1.55it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00531, train/loss_vlb_step=2.74e-5, train/loss_step=0.00531, global_step=3109.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2453/5971 [26:27<37:56,  1.55it/s, loss=0.201, v_num=0, train/loss_simple_step=0.562, train/loss_vlb_step=0.00761, train/loss_step=0.562, global_step=3110.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  41%|████      | 2454/5971 [26:28<37:56,  1.55it/s, loss=0.201, v_num=0, train/loss_simple_step=0.562, train/loss_vlb_step=0.00761, train/loss_step=0.562, global_step=3110.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2454/5971 [26:28<37:56,  1.55it/s, loss=0.208, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000465, train/loss_step=0.140, global_step=3110.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2455/5971 [26:30<37:56,  1.54it/s, loss=0.208, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000465, train/loss_step=0.140, global_step=3110.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2455/5971 [26:30<37:56,  1.54it/s, loss=0.208, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=3110.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2456/5971 [26:34<38:00,  1.54it/s, loss=0.208, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=3110.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2456/5971 [26:34<38:00,  1.54it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0546, train/loss_vlb_step=0.000192, train/loss_step=0.0546, global_step=3110.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  41%|████      | 2457/5971 [26:35<38:00,  1.54it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0546, train/loss_vlb_step=0.000192, train/loss_step=0.0546, global_step=3110.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2457/5971 [26:35<38:00,  1.54it/s, loss=0.196, v_num=0, train/loss_simple_step=0.036, train/loss_vlb_step=0.000132, train/loss_step=0.036, global_step=3111.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  41%|████      | 2458/5971 [26:36<38:00,  1.54it/s, loss=0.196, v_num=0, train/loss_simple_step=0.036, train/loss_vlb_step=0.000132, train/loss_step=0.036, global_step=3111.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2458/5971 [26:36<38:00,  1.54it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0051, train/loss_vlb_step=2.49e-5, train/loss_step=0.0051, global_step=3111.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2459/5971 [26:37<38:00,  1.54it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0051, train/loss_vlb_step=2.49e-5, train/loss_step=0.0051, global_step=3111.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2459/5971 [26:37<38:00,  1.54it/s, loss=0.205, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000935, train/loss_step=0.225, global_step=3111.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  41%|████      | 2460/5971 [26:40<38:03,  1.54it/s, loss=0.205, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000935, train/loss_step=0.225, global_step=3111.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2460/5971 [26:40<38:03,  1.54it/s, loss=0.196, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000419, train/loss_step=0.127, global_step=3111.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2461/5971 [26:41<38:03,  1.54it/s, loss=0.196, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000419, train/loss_step=0.127, global_step=3111.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2461/5971 [26:41<38:03,  1.54it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0266, train/loss_vlb_step=0.000105, train/loss_step=0.0266, global_step=3112.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2462/5971 [26:42<38:03,  1.54it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0266, train/loss_vlb_step=0.000105, train/loss_step=0.0266, global_step=3112.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2462/5971 [26:42<38:03,  1.54it/s, loss=0.2, v_num=0, train/loss_simple_step=0.248, train/loss_vlb_step=0.00097, train/loss_step=0.248, global_step=3112.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:  41%|████      | 2463/5971 [26:43<38:03,  1.54it/s, loss=0.2, v_num=0, train/loss_simple_step=0.248, train/loss_vlb_step=0.00097, train/loss_step=0.248, global_step=3112.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████      | 2463/5971 [26:43<38:03,  1.54it/s, loss=0.223, v_num=0, train/loss_simple_step=0.497, train/loss_vlb_step=0.00395, train/loss_step=0.497, global_step=3112.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████▏     | 2464/5971 [26:47<38:06,  1.53it/s, loss=0.223, v_num=0, train/loss_simple_step=0.497, train/loss_vlb_step=0.00395, train/loss_step=0.497, global_step=3112.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████▏     | 2464/5971 [26:47<38:06,  1.53it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0342, train/loss_vlb_step=0.00012, train/loss_step=0.0342, global_step=3112.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████▏     | 2465/5971 [26:48<38:06,  1.53it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0342, train/loss_vlb_step=0.00012, train/loss_step=0.0342, global_step=3112.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████▏     | 2465/5971 [26:48<38:06,  1.53it/s, loss=0.195, v_num=0, train/loss_simple_step=0.00588, train/loss_vlb_step=2.76e-5, train/loss_step=0.00588, global_step=3113.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████▏     | 2466/5971 [26:49<38:06,  1.53it/s, loss=0.195, v_num=0, train/loss_simple_step=0.00588, train/loss_vlb_step=2.76e-5, train/loss_step=0.00588, global_step=3113.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████▏     | 2466/5971 [26:49<38:06,  1.53it/s, loss=0.205, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000893, train/loss_step=0.224, global_step=3113.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  41%|████▏     | 2467/5971 [26:50<38:06,  1.53it/s, loss=0.205, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000893, train/loss_step=0.224, global_step=3113.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████▏     | 2467/5971 [26:50<38:06,  1.53it/s, loss=0.199, v_num=0, train/loss_simple_step=0.00296, train/loss_vlb_step=1.62e-5, train/loss_step=0.00296, global_step=3113.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████▏     | 2468/5971 [26:54<38:10,  1.53it/s, loss=0.199, v_num=0, train/loss_simple_step=0.00296, train/loss_vlb_step=1.62e-5, train/loss_step=0.00296, global_step=3113.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████▏     | 2468/5971 [26:54<38:10,  1.53it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0248, train/loss_vlb_step=9.62e-5, train/loss_step=0.0248, global_step=3113.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  41%|████▏     | 2469/5971 [26:55<38:10,  1.53it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0248, train/loss_vlb_step=9.62e-5, train/loss_step=0.0248, global_step=3113.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████▏     | 2469/5971 [26:55<38:10,  1.53it/s, loss=0.177, v_num=0, train/loss_simple_step=0.084, train/loss_vlb_step=0.000277, train/loss_step=0.084, global_step=3114.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  41%|████▏     | 2470/5971 [26:57<38:11,  1.53it/s, loss=0.177, v_num=0, train/loss_simple_step=0.084, train/loss_vlb_step=0.000277, train/loss_step=0.084, global_step=3114.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████▏     | 2470/5971 [26:57<38:11,  1.53it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0131, train/loss_vlb_step=5.47e-5, train/loss_step=0.0131, global_step=3114.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████▏     | 2471/5971 [26:58<38:11,  1.53it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0131, train/loss_vlb_step=5.47e-5, train/loss_step=0.0131, global_step=3114.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████▏     | 2471/5971 [26:58<38:11,  1.53it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0917, train/loss_vlb_step=0.000307, train/loss_step=0.0917, global_step=3114.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████▏     | 2472/5971 [27:00<38:13,  1.53it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0917, train/loss_vlb_step=0.000307, train/loss_step=0.0917, global_step=3114.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████▏     | 2472/5971 [27:00<38:13,  1.53it/s, loss=0.133, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.000915, train/loss_step=0.251, global_step=3114.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  41%|████▏     | 2473/5971 [27:02<38:13,  1.53it/s, loss=0.133, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.000915, train/loss_step=0.251, global_step=3114.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████▏     | 2473/5971 [27:02<38:13,  1.53it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000319, train/loss_step=0.0967, global_step=3115.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████▏     | 2474/5971 [27:03<38:13,  1.52it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000319, train/loss_step=0.0967, global_step=3115.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████▏     | 2474/5971 [27:03<38:13,  1.52it/s, loss=0.126, v_num=0, train/loss_simple_step=0.465, train/loss_vlb_step=0.00287, train/loss_step=0.465, global_step=3115.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  41%|████▏     | 2475/5971 [27:04<38:13,  1.52it/s, loss=0.126, v_num=0, train/loss_simple_step=0.465, train/loss_vlb_step=0.00287, train/loss_step=0.465, global_step=3115.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████▏     | 2475/5971 [27:04<38:13,  1.52it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0809, train/loss_vlb_step=0.000268, train/loss_step=0.0809, global_step=3115.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████▏     | 2476/5971 [27:06<38:15,  1.52it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0809, train/loss_vlb_step=0.000268, train/loss_step=0.0809, global_step=3115.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████▏     | 2476/5971 [27:06<38:15,  1.52it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0047, train/loss_vlb_step=2.47e-5, train/loss_step=0.0047, global_step=3115.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████▏     | 2477/5971 [27:07<38:15,  1.52it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0047, train/loss_vlb_step=2.47e-5, train/loss_step=0.0047, global_step=3115.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  41%|████▏     | 2477/5971 [27:07<38:15,  1.52it/s, loss=0.136, v_num=0, train/loss_simple_step=0.221, train/loss_vlb_step=0.000914, train/loss_step=0.221, global_step=3116.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  42%|████▏     | 2478/5971 [27:08<38:14,  1.52it/s, loss=0.136, v_num=0, train/loss_simple_step=0.221, train/loss_vlb_step=0.000914, train/loss_step=0.221, global_step=3116.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2478/5971 [27:08<38:14,  1.52it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00237, train/loss_vlb_step=1.4e-5, train/loss_step=0.00237, global_step=3116.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2479/5971 [27:09<38:14,  1.52it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00237, train/loss_vlb_step=1.4e-5, train/loss_step=0.00237, global_step=3116.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2479/5971 [27:09<38:14,  1.52it/s, loss=0.131, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000405, train/loss_step=0.123, global_step=3116.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  42%|████▏     | 2480/5971 [27:14<38:19,  1.52it/s, loss=0.131, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000405, train/loss_step=0.123, global_step=3116.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2480/5971 [27:14<38:19,  1.52it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0895, train/loss_vlb_step=0.000296, train/loss_step=0.0895, global_step=3116.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2481/5971 [27:15<38:19,  1.52it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0895, train/loss_vlb_step=0.000296, train/loss_step=0.0895, global_step=3116.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2481/5971 [27:15<38:19,  1.52it/s, loss=0.151, v_num=0, train/loss_simple_step=0.466, train/loss_vlb_step=0.0027, train/loss_step=0.466, global_step=3117.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  42%|████▏     | 2482/5971 [27:16<38:19,  1.52it/s, loss=0.151, v_num=0, train/loss_simple_step=0.466, train/loss_vlb_step=0.0027, train/loss_step=0.466, global_step=3117.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2482/5971 [27:16<38:19,  1.52it/s, loss=0.144, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000344, train/loss_step=0.104, global_step=3117.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2483/5971 [27:17<38:19,  1.52it/s, loss=0.144, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000344, train/loss_step=0.104, global_step=3117.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2483/5971 [27:17<38:19,  1.52it/s, loss=0.154, v_num=0, train/loss_simple_step=0.702, train/loss_vlb_step=0.0363, train/loss_step=0.702, global_step=3117.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  42%|████▏     | 2484/5971 [27:23<38:26,  1.51it/s, loss=0.154, v_num=0, train/loss_simple_step=0.702, train/loss_vlb_step=0.0363, train/loss_step=0.702, global_step=3117.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2484/5971 [27:23<38:26,  1.51it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00239, train/loss_vlb_step=1.3e-5, train/loss_step=0.00239, global_step=3117.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2485/5971 [27:24<38:26,  1.51it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00239, train/loss_vlb_step=1.3e-5, train/loss_step=0.00239, global_step=3117.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2485/5971 [27:24<38:26,  1.51it/s, loss=0.16, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.00057, train/loss_step=0.160, global_step=3118.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  42%|████▏     | 2486/5971 [27:25<38:26,  1.51it/s, loss=0.16, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.00057, train/loss_step=0.160, global_step=3118.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2486/5971 [27:25<38:26,  1.51it/s, loss=0.155, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000384, train/loss_step=0.116, global_step=3118.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2487/5971 [27:26<38:25,  1.51it/s, loss=0.155, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000384, train/loss_step=0.116, global_step=3118.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2487/5971 [27:26<38:25,  1.51it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0303, train/loss_vlb_step=0.000113, train/loss_step=0.0303, global_step=3118.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2488/5971 [27:29<38:28,  1.51it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0303, train/loss_vlb_step=0.000113, train/loss_step=0.0303, global_step=3118.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2488/5971 [27:29<38:28,  1.51it/s, loss=0.175, v_num=0, train/loss_simple_step=0.396, train/loss_vlb_step=0.00271, train/loss_step=0.396, global_step=3118.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  42%|████▏     | 2489/5971 [27:30<38:28,  1.51it/s, loss=0.175, v_num=0, train/loss_simple_step=0.396, train/loss_vlb_step=0.00271, train/loss_step=0.396, global_step=3118.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2489/5971 [27:30<38:28,  1.51it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.69e-5, train/loss_step=0.0102, global_step=3119.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2490/5971 [27:32<38:28,  1.51it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.69e-5, train/loss_step=0.0102, global_step=3119.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2490/5971 [27:32<38:28,  1.51it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0151, train/loss_vlb_step=6.29e-5, train/loss_step=0.0151, global_step=3119.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2491/5971 [27:33<38:28,  1.51it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0151, train/loss_vlb_step=6.29e-5, train/loss_step=0.0151, global_step=3119.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2491/5971 [27:33<38:28,  1.51it/s, loss=0.18, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.0012, train/loss_step=0.255, global_step=3119.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  42%|████▏     | 2492/5971 [27:36<38:31,  1.51it/s, loss=0.18, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.0012, train/loss_step=0.255, global_step=3119.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2492/5971 [27:36<38:31,  1.51it/s, loss=0.213, v_num=0, train/loss_simple_step=0.919, train/loss_vlb_step=0.463, train/loss_step=0.919, global_step=3119.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2493/5971 [27:37<38:31,  1.50it/s, loss=0.213, v_num=0, train/loss_simple_step=0.919, train/loss_vlb_step=0.463, train/loss_step=0.919, global_step=3119.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2493/5971 [27:37<38:31,  1.50it/s, loss=0.223, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00147, train/loss_step=0.305, global_step=3120.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2494/5971 [27:38<38:31,  1.50it/s, loss=0.223, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00147, train/loss_step=0.305, global_step=3120.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2494/5971 [27:38<38:31,  1.50it/s, loss=0.207, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000438, train/loss_step=0.128, global_step=3120.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2495/5971 [27:39<38:31,  1.50it/s, loss=0.207, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000438, train/loss_step=0.128, global_step=3120.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2495/5971 [27:39<38:31,  1.50it/s, loss=0.21, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000506, train/loss_step=0.153, global_step=3120.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  42%|████▏     | 2496/5971 [27:42<38:34,  1.50it/s, loss=0.21, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000506, train/loss_step=0.153, global_step=3120.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2496/5971 [27:42<38:34,  1.50it/s, loss=0.221, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.00087, train/loss_step=0.220, global_step=3120.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2497/5971 [27:44<38:34,  1.50it/s, loss=0.221, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.00087, train/loss_step=0.220, global_step=3120.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2497/5971 [27:44<38:34,  1.50it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.51e-5, train/loss_step=0.0124, global_step=3121.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2498/5971 [27:45<38:34,  1.50it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.51e-5, train/loss_step=0.0124, global_step=3121.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2498/5971 [27:45<38:34,  1.50it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=4.95e-5, train/loss_step=0.0113, global_step=3121.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2499/5971 [27:46<38:34,  1.50it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=4.95e-5, train/loss_step=0.0113, global_step=3121.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2499/5971 [27:46<38:34,  1.50it/s, loss=0.214, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.000632, train/loss_step=0.187, global_step=3121.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  42%|████▏     | 2500/5971 [27:49<38:37,  1.50it/s, loss=0.214, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.000632, train/loss_step=0.187, global_step=3121.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2500/5971 [27:49<38:37,  1.50it/s, loss=0.23, v_num=0, train/loss_simple_step=0.410, train/loss_vlb_step=0.00234, train/loss_step=0.410, global_step=3121.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  42%|████▏     | 2501/5971 [27:50<38:37,  1.50it/s, loss=0.23, v_num=0, train/loss_simple_step=0.410, train/loss_vlb_step=0.00234, train/loss_step=0.410, global_step=3121.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2501/5971 [27:50<38:37,  1.50it/s, loss=0.216, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.000688, train/loss_step=0.187, global_step=3122.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2502/5971 [27:52<38:37,  1.50it/s, loss=0.216, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.000688, train/loss_step=0.187, global_step=3122.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2502/5971 [27:52<38:37,  1.50it/s, loss=0.218, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000435, train/loss_step=0.130, global_step=3122.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2503/5971 [27:53<38:37,  1.50it/s, loss=0.218, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000435, train/loss_step=0.130, global_step=3122.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2503/5971 [27:53<38:37,  1.50it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0324, train/loss_vlb_step=0.000118, train/loss_step=0.0324, global_step=3122.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2504/5971 [27:56<38:40,  1.49it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0324, train/loss_vlb_step=0.000118, train/loss_step=0.0324, global_step=3122.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2504/5971 [27:56<38:40,  1.49it/s, loss=0.199, v_num=0, train/loss_simple_step=0.299, train/loss_vlb_step=0.00167, train/loss_step=0.299, global_step=3122.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  42%|████▏     | 2505/5971 [27:57<38:40,  1.49it/s, loss=0.199, v_num=0, train/loss_simple_step=0.299, train/loss_vlb_step=0.00167, train/loss_step=0.299, global_step=3122.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2505/5971 [27:57<38:40,  1.49it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0155, train/loss_vlb_step=6.56e-5, train/loss_step=0.0155, global_step=3123.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2506/5971 [27:58<38:40,  1.49it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0155, train/loss_vlb_step=6.56e-5, train/loss_step=0.0155, global_step=3123.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2506/5971 [27:58<38:40,  1.49it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0133, train/loss_vlb_step=5.36e-5, train/loss_step=0.0133, global_step=3123.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2507/5971 [27:59<38:40,  1.49it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0133, train/loss_vlb_step=5.36e-5, train/loss_step=0.0133, global_step=3123.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2507/5971 [27:59<38:40,  1.49it/s, loss=0.198, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.000901, train/loss_step=0.253, global_step=3123.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  42%|████▏     | 2508/5971 [28:02<38:41,  1.49it/s, loss=0.198, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.000901, train/loss_step=0.253, global_step=3123.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2508/5971 [28:02<38:41,  1.49it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0259, train/loss_vlb_step=0.000104, train/loss_step=0.0259, global_step=3123.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2509/5971 [28:03<38:41,  1.49it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0259, train/loss_vlb_step=0.000104, train/loss_step=0.0259, global_step=3123.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2509/5971 [28:03<38:41,  1.49it/s, loss=0.189, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000832, train/loss_step=0.211, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  42%|████▏     | 2510/5971 [28:04<38:41,  1.49it/s, loss=0.189, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000832, train/loss_step=0.211, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2510/5971 [28:04<38:41,  1.49it/s, loss=0.189, v_num=0, train/loss_simple_step=0.00384, train/loss_vlb_step=2.07e-5, train/loss_step=0.00384, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2511/5971 [28:05<38:41,  1.49it/s, loss=0.189, v_num=0, train/loss_simple_step=0.00384, train/loss_vlb_step=2.07e-5, train/loss_step=0.00384, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2511/5971 [28:05<38:41,  1.49it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0652, train/loss_vlb_step=0.000214, train/loss_step=0.0652, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  42%|████▏     | 2512/5971 [28:07<38:43,  1.49it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0652, train/loss_vlb_step=0.000214, train/loss_step=0.0652, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  42%|████▏     | 2512/5971 [28:07<38:43,  1.49it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:25,  1.95it/s][A
Epoch 5:  42%|████▏     | 2514/5971 [28:08<38:40,  1.49it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   1%|          | 2/167 [00:00<00:55,  2.98it/s][A
Epoch 5:  42%|████▏     | 2516/5971 [28:08<38:38,  1.49it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   3%|▎         | 5/167 [00:00<00:21,  7.62it/s][A
Epoch 5:  42%|████▏     | 2518/5971 [28:08<38:35,  1.49it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   4%|▍         | 7/167 [00:00<00:15, 10.07it/s][A
Epoch 5:  42%|████▏     | 2520/5971 [28:08<38:32,  1.49it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   5%|▌         | 9/167 [00:01<00:13, 11.68it/s][A
Epoch 5:  42%|████▏     | 2522/5971 [28:09<38:29,  1.49it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   7%|▋         | 12/167 [00:01<00:10, 14.40it/s][A
Epoch 5:  42%|████▏     | 2525/5971 [28:09<38:24,  1.50it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   9%|▉         | 15/167 [00:01<00:08, 17.03it/s][A
Epoch 5:  42%|████▏     | 2528/5971 [28:09<38:19,  1.50it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  11%|█         | 18/167 [00:01<00:07, 19.11it/s][A
Epoch 5:  42%|████▏     | 2531/5971 [28:09<38:15,  1.50it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  13%|█▎        | 21/167 [00:01<00:07, 19.77it/s][A
Epoch 5:  42%|████▏     | 2534/5971 [28:09<38:10,  1.50it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  14%|█▍        | 24/167 [00:01<00:06, 20.86it/s][A
Epoch 5:  42%|████▏     | 2537/5971 [28:09<38:06,  1.50it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  16%|█▌        | 27/167 [00:01<00:06, 21.10it/s][A
Epoch 5:  43%|████▎     | 2540/5971 [28:09<38:01,  1.50it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  18%|█▊        | 30/167 [00:02<00:06, 21.13it/s][A
Epoch 5:  43%|████▎     | 2543/5971 [28:10<37:57,  1.51it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  20%|█▉        | 33/167 [00:02<00:06, 21.10it/s][A
Epoch 5:  43%|████▎     | 2546/5971 [28:10<37:52,  1.51it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  22%|██▏       | 36/167 [00:02<00:06, 20.60it/s][A
Epoch 5:  43%|████▎     | 2549/5971 [28:10<37:48,  1.51it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  23%|██▎       | 39/167 [00:02<00:06, 20.19it/s][A
Epoch 5:  43%|████▎     | 2552/5971 [28:10<37:43,  1.51it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  25%|██▌       | 42/167 [00:02<00:06, 19.84it/s][A
Epoch 5:  43%|████▎     | 2555/5971 [28:10<37:39,  1.51it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  26%|██▋       | 44/167 [00:02<00:06, 19.67it/s][A
Epoch 5:  43%|████▎     | 2558/5971 [28:10<37:35,  1.51it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  28%|██▊       | 46/167 [00:02<00:06, 19.56it/s][A

Validating:  29%|██▊       | 48/167 [00:02<00:06, 19.49it/s][A
Epoch 5:  43%|████▎     | 2561/5971 [28:10<37:30,  1.52it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  30%|██▉       | 50/167 [00:03<00:06, 18.74it/s][A
Epoch 5:  43%|████▎     | 2564/5971 [28:11<37:26,  1.52it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  32%|███▏      | 53/167 [00:03<00:05, 19.43it/s][A
Epoch 5:  43%|████▎     | 2567/5971 [28:11<37:21,  1.52it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  33%|███▎      | 55/167 [00:03<00:05, 19.34it/s][A

Validating:  34%|███▍      | 57/167 [00:03<00:05, 19.26it/s][A
Epoch 5:  43%|████▎     | 2570/5971 [28:11<37:17,  1.52it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  35%|███▌      | 59/167 [00:03<00:05, 18.56it/s][A
Epoch 5:  43%|████▎     | 2573/5971 [28:11<37:13,  1.52it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  37%|███▋      | 62/167 [00:03<00:05, 20.12it/s][A
Epoch 5:  43%|████▎     | 2576/5971 [28:11<37:08,  1.52it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  39%|███▉      | 65/167 [00:03<00:04, 21.02it/s][A
Epoch 5:  43%|████▎     | 2579/5971 [28:11<37:04,  1.52it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  41%|████      | 68/167 [00:03<00:04, 20.18it/s][A
Epoch 5:  43%|████▎     | 2582/5971 [28:11<36:59,  1.53it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  43%|████▎     | 71/167 [00:04<00:04, 20.56it/s][A
Epoch 5:  43%|████▎     | 2585/5971 [28:12<36:55,  1.53it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  44%|████▍     | 74/167 [00:04<00:04, 20.78it/s][A
Epoch 5:  43%|████▎     | 2588/5971 [28:12<36:51,  1.53it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  46%|████▌     | 77/167 [00:04<00:04, 21.87it/s][A
Epoch 5:  43%|████▎     | 2591/5971 [28:12<36:46,  1.53it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  48%|████▊     | 80/167 [00:04<00:03, 22.02it/s][A
Epoch 5:  43%|████▎     | 2594/5971 [28:12<36:42,  1.53it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  50%|████▉     | 83/167 [00:04<00:04, 20.87it/s][A
Epoch 5:  43%|████▎     | 2597/5971 [28:12<36:38,  1.53it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  51%|█████▏    | 86/167 [00:04<00:03, 21.47it/s][A
Epoch 5:  44%|████▎     | 2600/5971 [28:12<36:33,  1.54it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  53%|█████▎    | 89/167 [00:04<00:03, 22.58it/s][A
Epoch 5:  44%|████▎     | 2603/5971 [28:12<36:29,  1.54it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  55%|█████▌    | 92/167 [00:05<00:03, 22.29it/s][A
Epoch 5:  44%|████▎     | 2606/5971 [28:13<36:25,  1.54it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  57%|█████▋    | 95/167 [00:05<00:03, 21.95it/s][A
Epoch 5:  44%|████▎     | 2609/5971 [28:13<36:21,  1.54it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  59%|█████▊    | 98/167 [00:05<00:03, 21.40it/s][A
Epoch 5:  44%|████▎     | 2612/5971 [28:13<36:16,  1.54it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  60%|██████    | 101/167 [00:05<00:03, 21.62it/s][A
Epoch 5:  44%|████▍     | 2615/5971 [28:13<36:12,  1.54it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  62%|██████▏   | 104/167 [00:05<00:02, 21.43it/s][A
Epoch 5:  44%|████▍     | 2618/5971 [28:13<36:08,  1.55it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  64%|██████▍   | 107/167 [00:05<00:02, 21.53it/s][A
Epoch 5:  44%|████▍     | 2621/5971 [28:13<36:04,  1.55it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  66%|██████▌   | 110/167 [00:05<00:02, 20.80it/s][A
Epoch 5:  44%|████▍     | 2624/5971 [28:13<35:59,  1.55it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  68%|██████▊   | 113/167 [00:06<00:02, 20.70it/s][A
Epoch 5:  44%|████▍     | 2627/5971 [28:14<35:55,  1.55it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  69%|██████▉   | 116/167 [00:06<00:02, 20.63it/s][A
Epoch 5:  44%|████▍     | 2630/5971 [28:14<35:51,  1.55it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  71%|███████▏  | 119/167 [00:06<00:02, 21.28it/s][A
Epoch 5:  44%|████▍     | 2633/5971 [28:14<35:47,  1.55it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  73%|███████▎  | 122/167 [00:06<00:02, 21.12it/s][A
Epoch 5:  44%|████▍     | 2636/5971 [28:14<35:43,  1.56it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  75%|███████▍  | 125/167 [00:06<00:01, 21.19it/s][A
Epoch 5:  44%|████▍     | 2639/5971 [28:14<35:38,  1.56it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  77%|███████▋  | 128/167 [00:06<00:01, 21.57it/s][A
Epoch 5:  44%|████▍     | 2642/5971 [28:14<35:34,  1.56it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  78%|███████▊  | 131/167 [00:06<00:01, 22.22it/s][A
Epoch 5:  44%|████▍     | 2645/5971 [28:14<35:30,  1.56it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  80%|████████  | 134/167 [00:07<00:01, 22.69it/s][A
Epoch 5:  44%|████▍     | 2648/5971 [28:15<35:26,  1.56it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  82%|████████▏ | 137/167 [00:07<00:01, 22.62it/s][A
Epoch 5:  44%|████▍     | 2651/5971 [28:15<35:22,  1.56it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  84%|████████▍ | 140/167 [00:07<00:01, 22.57it/s][A
Epoch 5:  44%|████▍     | 2654/5971 [28:15<35:18,  1.57it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  86%|████████▌ | 143/167 [00:07<00:01, 21.88it/s][A
Epoch 5:  44%|████▍     | 2657/5971 [28:15<35:13,  1.57it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  87%|████████▋ | 146/167 [00:07<00:01, 19.96it/s][A
Epoch 5:  45%|████▍     | 2660/5971 [28:15<35:09,  1.57it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  89%|████████▉ | 149/167 [00:07<00:00, 21.22it/s][A
Epoch 5:  45%|████▍     | 2663/5971 [28:15<35:05,  1.57it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  91%|█████████ | 152/167 [00:07<00:00, 21.50it/s][A
Epoch 5:  45%|████▍     | 2666/5971 [28:15<35:01,  1.57it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  93%|█████████▎| 155/167 [00:07<00:00, 22.26it/s][A
Epoch 5:  45%|████▍     | 2669/5971 [28:16<34:57,  1.57it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  95%|█████████▍| 158/167 [00:08<00:00, 21.47it/s][A
Epoch 5:  45%|████▍     | 2672/5971 [28:16<34:53,  1.58it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  96%|█████████▋| 161/167 [00:08<00:00, 22.51it/s][A
Epoch 5:  45%|████▍     | 2675/5971 [28:16<34:49,  1.58it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  98%|█████████▊| 164/167 [00:08<00:00, 22.22it/s][A
Epoch 5:  45%|████▍     | 2678/5971 [28:16<34:45,  1.58it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating: 100%|██████████| 167/167 [00:08<00:00, 22.15it/s][A
Epoch 5:  45%|████▍     | 2680/5971 [28:16<34:42,  1.58it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Epoch 5:  45%|████▍     | 2681/5971 [28:18<34:42,  1.58it/s, loss=0.144, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000794, train/loss_step=0.214, global_step=3124.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▍     | 2681/5971 [28:18<34:42,  1.58it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0806, train/loss_vlb_step=0.000268, train/loss_step=0.0806, global_step=3125.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▍     | 2682/5971 [28:19<34:42,  1.58it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00235, train/loss_vlb_step=1.32e-5, train/loss_step=0.00235, global_step=3125.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▍     | 2683/5971 [28:20<34:43,  1.58it/s, loss=0.13, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000793, train/loss_step=0.225, global_step=3125.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  45%|████▍     | 2684/5971 [28:23<34:45,  1.58it/s, loss=0.13, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000793, train/loss_step=0.225, global_step=3125.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▍     | 2684/5971 [28:23<34:45,  1.58it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00232, train/loss_vlb_step=1.36e-5, train/loss_step=0.00232, global_step=3125.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▍     | 2685/5971 [28:24<34:45,  1.58it/s, loss=0.125, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000446, train/loss_step=0.133, global_step=3126.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  45%|████▍     | 2686/5971 [28:25<34:45,  1.58it/s, loss=0.141, v_num=0, train/loss_simple_step=0.336, train/loss_vlb_step=0.0019, train/loss_step=0.336, global_step=3126.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  45%|████▌     | 2687/5971 [28:26<34:45,  1.57it/s, loss=0.141, v_num=0, train/loss_simple_step=0.336, train/loss_vlb_step=0.0019, train/loss_step=0.336, global_step=3126.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2687/5971 [28:26<34:45,  1.57it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00624, train/loss_vlb_step=3.06e-5, train/loss_step=0.00624, global_step=3126.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2688/5971 [28:29<34:47,  1.57it/s, loss=0.119, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.00049, train/loss_step=0.149, global_step=3126.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  45%|████▌     | 2689/5971 [28:30<34:47,  1.57it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00568, train/loss_vlb_step=2.78e-5, train/loss_step=0.00568, global_step=3127.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2690/5971 [28:32<34:47,  1.57it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00568, train/loss_vlb_step=2.78e-5, train/loss_step=0.00568, global_step=3127.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2690/5971 [28:32<34:47,  1.57it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0544, train/loss_vlb_step=0.000194, train/loss_step=0.0544, global_step=3127.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2691/5971 [28:33<34:47,  1.57it/s, loss=0.105, v_num=0, train/loss_simple_step=0.00272, train/loss_vlb_step=1.5e-5, train/loss_step=0.00272, global_step=3127.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2692/5971 [28:35<34:48,  1.57it/s, loss=0.108, v_num=0, train/loss_simple_step=0.354, train/loss_vlb_step=0.00188, train/loss_step=0.354, global_step=3127.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  45%|████▌     | 2693/5971 [28:36<34:48,  1.57it/s, loss=0.108, v_num=0, train/loss_simple_step=0.354, train/loss_vlb_step=0.00188, train/loss_step=0.354, global_step=3127.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2693/5971 [28:36<34:48,  1.57it/s, loss=0.117, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.00073, train/loss_step=0.208, global_step=3128.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2694/5971 [28:37<34:48,  1.57it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0328, train/loss_vlb_step=0.000125, train/loss_step=0.0328, global_step=3128.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2695/5971 [28:38<34:48,  1.57it/s, loss=0.115, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000616, train/loss_step=0.184, global_step=3128.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  45%|████▌     | 2696/5971 [28:41<34:50,  1.57it/s, loss=0.115, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000616, train/loss_step=0.184, global_step=3128.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2696/5971 [28:41<34:50,  1.57it/s, loss=0.119, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000345, train/loss_step=0.104, global_step=3128.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2697/5971 [28:42<34:50,  1.57it/s, loss=0.122, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.000965, train/loss_step=0.269, global_step=3129.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2698/5971 [28:43<34:50,  1.57it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0353, train/loss_vlb_step=0.000126, train/loss_step=0.0353, global_step=3129.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2699/5971 [28:44<34:50,  1.57it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0353, train/loss_vlb_step=0.000126, train/loss_step=0.0353, global_step=3129.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2699/5971 [28:44<34:50,  1.57it/s, loss=0.13, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000673, train/loss_step=0.193, global_step=3129.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  45%|████▌     | 2700/5971 [28:47<34:52,  1.56it/s, loss=0.139, v_num=0, train/loss_simple_step=0.410, train/loss_vlb_step=0.00344, train/loss_step=0.410, global_step=3129.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2701/5971 [28:48<34:52,  1.56it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0129, train/loss_vlb_step=5.29e-5, train/loss_step=0.0129, global_step=3130.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2702/5971 [28:49<34:51,  1.56it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0129, train/loss_vlb_step=5.29e-5, train/loss_step=0.0129, global_step=3130.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2702/5971 [28:49<34:51,  1.56it/s, loss=0.154, v_num=0, train/loss_simple_step=0.367, train/loss_vlb_step=0.00243, train/loss_step=0.367, global_step=3130.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  45%|████▌     | 2703/5971 [28:50<34:51,  1.56it/s, loss=0.17, v_num=0, train/loss_simple_step=0.549, train/loss_vlb_step=0.00674, train/loss_step=0.549, global_step=3130.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  45%|████▌     | 2704/5971 [28:53<34:54,  1.56it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00147, train/loss_vlb_step=8.76e-6, train/loss_step=0.00147, global_step=3130.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2705/5971 [28:54<34:53,  1.56it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00147, train/loss_vlb_step=8.76e-6, train/loss_step=0.00147, global_step=3130.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2705/5971 [28:54<34:53,  1.56it/s, loss=0.184, v_num=0, train/loss_simple_step=0.410, train/loss_vlb_step=0.00215, train/loss_step=0.410, global_step=3131.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  45%|████▌     | 2706/5971 [28:55<34:53,  1.56it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0229, train/loss_vlb_step=9.17e-5, train/loss_step=0.0229, global_step=3131.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2707/5971 [28:56<34:53,  1.56it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0303, train/loss_vlb_step=0.000112, train/loss_step=0.0303, global_step=3131.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2708/5971 [28:59<34:55,  1.56it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0303, train/loss_vlb_step=0.000112, train/loss_step=0.0303, global_step=3131.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2708/5971 [28:59<34:55,  1.56it/s, loss=0.181, v_num=0, train/loss_simple_step=0.370, train/loss_vlb_step=0.002, train/loss_step=0.370, global_step=3131.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  45%|████▌     | 2709/5971 [29:00<34:55,  1.56it/s, loss=0.202, v_num=0, train/loss_simple_step=0.430, train/loss_vlb_step=0.00317, train/loss_step=0.430, global_step=3132.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2710/5971 [29:01<34:55,  1.56it/s, loss=0.215, v_num=0, train/loss_simple_step=0.315, train/loss_vlb_step=0.00188, train/loss_step=0.315, global_step=3132.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2711/5971 [29:02<34:55,  1.56it/s, loss=0.215, v_num=0, train/loss_simple_step=0.315, train/loss_vlb_step=0.00188, train/loss_step=0.315, global_step=3132.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2711/5971 [29:02<34:55,  1.56it/s, loss=0.233, v_num=0, train/loss_simple_step=0.354, train/loss_vlb_step=0.00169, train/loss_step=0.354, global_step=3132.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2712/5971 [29:05<34:56,  1.55it/s, loss=0.215, v_num=0, train/loss_simple_step=0.006, train/loss_vlb_step=2.96e-5, train/loss_step=0.006, global_step=3132.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2713/5971 [29:06<34:56,  1.55it/s, loss=0.226, v_num=0, train/loss_simple_step=0.423, train/loss_vlb_step=0.00253, train/loss_step=0.423, global_step=3133.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2714/5971 [29:07<34:56,  1.55it/s, loss=0.226, v_num=0, train/loss_simple_step=0.423, train/loss_vlb_step=0.00253, train/loss_step=0.423, global_step=3133.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2714/5971 [29:07<34:56,  1.55it/s, loss=0.234, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.000697, train/loss_step=0.197, global_step=3133.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2715/5971 [29:08<34:56,  1.55it/s, loss=0.235, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.000708, train/loss_step=0.206, global_step=3133.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  45%|████▌     | 2716/5971 [29:11<34:58,  1.55it/s, loss=0.231, v_num=0, train/loss_simple_step=0.0146, train/loss_vlb_step=6.49e-5, train/loss_step=0.0146, global_step=3133.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2717/5971 [29:12<34:58,  1.55it/s, loss=0.231, v_num=0, train/loss_simple_step=0.0146, train/loss_vlb_step=6.49e-5, train/loss_step=0.0146, global_step=3133.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2717/5971 [29:12<34:58,  1.55it/s, loss=0.218, v_num=0, train/loss_simple_step=0.00917, train/loss_vlb_step=4.2e-5, train/loss_step=0.00917, global_step=3134.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2718/5971 [29:13<34:58,  1.55it/s, loss=0.218, v_num=0, train/loss_simple_step=0.0474, train/loss_vlb_step=0.000178, train/loss_step=0.0474, global_step=3134.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2719/5971 [29:14<34:57,  1.55it/s, loss=0.212, v_num=0, train/loss_simple_step=0.0606, train/loss_vlb_step=0.000213, train/loss_step=0.0606, global_step=3134.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2720/5971 [29:17<34:59,  1.55it/s, loss=0.212, v_num=0, train/loss_simple_step=0.0606, train/loss_vlb_step=0.000213, train/loss_step=0.0606, global_step=3134.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2720/5971 [29:17<34:59,  1.55it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0237, train/loss_vlb_step=9.26e-5, train/loss_step=0.0237, global_step=3134.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  46%|████▌     | 2721/5971 [29:18<34:59,  1.55it/s, loss=0.21, v_num=0, train/loss_simple_step=0.367, train/loss_vlb_step=0.00156, train/loss_step=0.367, global_step=3135.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  46%|████▌     | 2722/5971 [29:19<34:59,  1.55it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0224, train/loss_vlb_step=9.26e-5, train/loss_step=0.0224, global_step=3135.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2723/5971 [29:20<34:59,  1.55it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0224, train/loss_vlb_step=9.26e-5, train/loss_step=0.0224, global_step=3135.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2723/5971 [29:20<34:59,  1.55it/s, loss=0.166, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.22e-5, train/loss_step=0.018, global_step=3135.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  46%|████▌     | 2724/5971 [29:23<35:00,  1.55it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0117, train/loss_vlb_step=5.42e-5, train/loss_step=0.0117, global_step=3135.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2725/5971 [29:24<35:00,  1.55it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0339, train/loss_vlb_step=0.000122, train/loss_step=0.0339, global_step=3136.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2726/5971 [29:25<35:00,  1.54it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0339, train/loss_vlb_step=0.000122, train/loss_step=0.0339, global_step=3136.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2726/5971 [29:25<35:00,  1.54it/s, loss=0.152, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000369, train/loss_step=0.110, global_step=3136.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  46%|████▌     | 2727/5971 [29:26<35:00,  1.54it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0506, train/loss_vlb_step=0.000177, train/loss_step=0.0506, global_step=3136.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2728/5971 [29:29<35:03,  1.54it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0396, train/loss_vlb_step=0.000145, train/loss_step=0.0396, global_step=3136.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2729/5971 [29:31<35:03,  1.54it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0396, train/loss_vlb_step=0.000145, train/loss_step=0.0396, global_step=3136.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2729/5971 [29:31<35:03,  1.54it/s, loss=0.117, v_num=0, train/loss_simple_step=0.034, train/loss_vlb_step=0.000123, train/loss_step=0.034, global_step=3137.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  46%|████▌     | 2730/5971 [29:32<35:03,  1.54it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0374, train/loss_vlb_step=0.000135, train/loss_step=0.0374, global_step=3137.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2731/5971 [29:33<35:03,  1.54it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.0238, train/loss_vlb_step=9.09e-5, train/loss_step=0.0238, global_step=3137.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2732/5971 [29:36<35:05,  1.54it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.0238, train/loss_vlb_step=9.09e-5, train/loss_step=0.0238, global_step=3137.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2732/5971 [29:36<35:05,  1.54it/s, loss=0.12, v_num=0, train/loss_simple_step=0.662, train/loss_vlb_step=0.0114, train/loss_step=0.662, global_step=3137.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:  46%|████▌     | 2733/5971 [29:37<35:04,  1.54it/s, loss=0.0987, v_num=0, train/loss_simple_step=0.0044, train/loss_vlb_step=2.36e-5, train/loss_step=0.0044, global_step=3138.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2734/5971 [29:38<35:04,  1.54it/s, loss=0.0929, v_num=0, train/loss_simple_step=0.0804, train/loss_vlb_step=0.000271, train/loss_step=0.0804, global_step=3138.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2735/5971 [29:39<35:04,  1.54it/s, loss=0.0929, v_num=0, train/loss_simple_step=0.0804, train/loss_vlb_step=0.000271, train/loss_step=0.0804, global_step=3138.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2735/5971 [29:39<35:04,  1.54it/s, loss=0.0835, v_num=0, train/loss_simple_step=0.0184, train/loss_vlb_step=7.47e-5, train/loss_step=0.0184, global_step=3138.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  46%|████▌     | 2736/5971 [29:41<35:05,  1.54it/s, loss=0.0845, v_num=0, train/loss_simple_step=0.0357, train/loss_vlb_step=0.000134, train/loss_step=0.0357, global_step=3138.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2737/5971 [29:42<35:05,  1.54it/s, loss=0.0853, v_num=0, train/loss_simple_step=0.0243, train/loss_vlb_step=9.43e-5, train/loss_step=0.0243, global_step=3139.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  46%|████▌     | 2738/5971 [29:43<35:05,  1.54it/s, loss=0.0853, v_num=0, train/loss_simple_step=0.0243, train/loss_vlb_step=9.43e-5, train/loss_step=0.0243, global_step=3139.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2738/5971 [29:43<35:05,  1.54it/s, loss=0.0898, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000474, train/loss_step=0.138, global_step=3139.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  46%|████▌     | 2739/5971 [29:44<35:05,  1.54it/s, loss=0.0869, v_num=0, train/loss_simple_step=0.00194, train/loss_vlb_step=1.16e-5, train/loss_step=0.00194, global_step=3139.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2740/5971 [29:48<35:07,  1.53it/s, loss=0.0947, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000656, train/loss_step=0.179, global_step=3139.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  46%|████▌     | 2741/5971 [29:49<35:07,  1.53it/s, loss=0.0947, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000656, train/loss_step=0.179, global_step=3139.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2741/5971 [29:49<35:07,  1.53it/s, loss=0.0765, v_num=0, train/loss_simple_step=0.00441, train/loss_vlb_step=2.28e-5, train/loss_step=0.00441, global_step=3140.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2742/5971 [29:50<35:07,  1.53it/s, loss=0.0925, v_num=0, train/loss_simple_step=0.343, train/loss_vlb_step=0.00133, train/loss_step=0.343, global_step=3140.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  46%|████▌     | 2743/5971 [29:51<35:07,  1.53it/s, loss=0.114, v_num=0, train/loss_simple_step=0.439, train/loss_vlb_step=0.00409, train/loss_step=0.439, global_step=3140.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  46%|████▌     | 2744/5971 [29:53<35:08,  1.53it/s, loss=0.114, v_num=0, train/loss_simple_step=0.439, train/loss_vlb_step=0.00409, train/loss_step=0.439, global_step=3140.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2744/5971 [29:53<35:08,  1.53it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0404, train/loss_vlb_step=0.000147, train/loss_step=0.0404, global_step=3140.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2745/5971 [29:54<35:08,  1.53it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0398, train/loss_vlb_step=0.000138, train/loss_step=0.0398, global_step=3141.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2746/5971 [29:55<35:08,  1.53it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00288, train/loss_vlb_step=1.57e-5, train/loss_step=0.00288, global_step=3141.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2747/5971 [29:56<35:07,  1.53it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00288, train/loss_vlb_step=1.57e-5, train/loss_step=0.00288, global_step=3141.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2747/5971 [29:56<35:07,  1.53it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0131, train/loss_vlb_step=5.45e-5, train/loss_step=0.0131, global_step=3141.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  46%|████▌     | 2748/5971 [29:59<35:09,  1.53it/s, loss=0.116, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000667, train/loss_step=0.200, global_step=3141.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  46%|████▌     | 2749/5971 [30:00<35:08,  1.53it/s, loss=0.131, v_num=0, train/loss_simple_step=0.342, train/loss_vlb_step=0.00208, train/loss_step=0.342, global_step=3142.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  46%|████▌     | 2750/5971 [30:00<35:08,  1.53it/s, loss=0.131, v_num=0, train/loss_simple_step=0.342, train/loss_vlb_step=0.00208, train/loss_step=0.342, global_step=3142.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2750/5971 [30:00<35:08,  1.53it/s, loss=0.135, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000352, train/loss_step=0.107, global_step=3142.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2751/5971 [30:01<35:08,  1.53it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0472, train/loss_vlb_step=0.000166, train/loss_step=0.0472, global_step=3142.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2752/5971 [30:04<35:10,  1.53it/s, loss=0.113, v_num=0, train/loss_simple_step=0.199, train/loss_vlb_step=0.000728, train/loss_step=0.199, global_step=3142.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  46%|████▌     | 2753/5971 [30:05<35:10,  1.53it/s, loss=0.113, v_num=0, train/loss_simple_step=0.199, train/loss_vlb_step=0.000728, train/loss_step=0.199, global_step=3142.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2753/5971 [30:05<35:10,  1.53it/s, loss=0.116, v_num=0, train/loss_simple_step=0.074, train/loss_vlb_step=0.000243, train/loss_step=0.074, global_step=3143.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2754/5971 [30:06<35:09,  1.52it/s, loss=0.119, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000428, train/loss_step=0.129, global_step=3143.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2755/5971 [30:07<35:09,  1.52it/s, loss=0.132, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.00133, train/loss_step=0.274, global_step=3143.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  46%|████▌     | 2756/5971 [30:10<35:11,  1.52it/s, loss=0.132, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.00133, train/loss_step=0.274, global_step=3143.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2756/5971 [30:10<35:11,  1.52it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00739, train/loss_vlb_step=3.4e-5, train/loss_step=0.00739, global_step=3143.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2757/5971 [30:11<35:10,  1.52it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000206, train/loss_step=0.0596, global_step=3144.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2758/5971 [30:12<35:10,  1.52it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00908, train/loss_vlb_step=3.98e-5, train/loss_step=0.00908, global_step=3144.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2759/5971 [30:13<35:10,  1.52it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00908, train/loss_vlb_step=3.98e-5, train/loss_step=0.00908, global_step=3144.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2759/5971 [30:13<35:10,  1.52it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0151, train/loss_vlb_step=6.19e-5, train/loss_step=0.0151, global_step=3144.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  46%|████▌     | 2760/5971 [30:15<35:11,  1.52it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0429, train/loss_vlb_step=0.000161, train/loss_step=0.0429, global_step=3144.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▌     | 2761/5971 [30:16<35:11,  1.52it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0624, train/loss_vlb_step=0.000206, train/loss_step=0.0624, global_step=3145.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▋     | 2762/5971 [30:17<35:11,  1.52it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0624, train/loss_vlb_step=0.000206, train/loss_step=0.0624, global_step=3145.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▋     | 2762/5971 [30:17<35:11,  1.52it/s, loss=0.119, v_num=0, train/loss_simple_step=0.271, train/loss_vlb_step=0.00117, train/loss_step=0.271, global_step=3145.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  46%|████▋     | 2763/5971 [30:18<35:10,  1.52it/s, loss=0.105, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000626, train/loss_step=0.172, global_step=3145.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▋     | 2764/5971 [30:21<35:12,  1.52it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00332, train/loss_vlb_step=1.87e-5, train/loss_step=0.00332, global_step=3145.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▋     | 2765/5971 [30:22<35:11,  1.52it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00332, train/loss_vlb_step=1.87e-5, train/loss_step=0.00332, global_step=3145.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▋     | 2765/5971 [30:22<35:11,  1.52it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0266, train/loss_vlb_step=0.000104, train/loss_step=0.0266, global_step=3146.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  46%|████▋     | 2766/5971 [30:23<35:11,  1.52it/s, loss=0.11, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.00048, train/loss_step=0.144, global_step=3146.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  46%|████▋     | 2767/5971 [30:24<35:11,  1.52it/s, loss=0.109, v_num=0, train/loss_simple_step=0.000947, train/loss_vlb_step=5.72e-6, train/loss_step=0.000947, global_step=3146.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▋     | 2768/5971 [30:26<35:12,  1.52it/s, loss=0.109, v_num=0, train/loss_simple_step=0.000947, train/loss_vlb_step=5.72e-6, train/loss_step=0.000947, global_step=3146.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▋     | 2768/5971 [30:26<35:12,  1.52it/s, loss=0.101, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000102, train/loss_step=0.026, global_step=3146.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:  46%|████▋     | 2769/5971 [30:27<35:12,  1.52it/s, loss=0.0845, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.23e-5, train/loss_step=0.0198, global_step=3147.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▋     | 2770/5971 [30:28<35:12,  1.52it/s, loss=0.0887, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000689, train/loss_step=0.190, global_step=3147.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  46%|████▋     | 2771/5971 [30:29<35:11,  1.52it/s, loss=0.0887, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000689, train/loss_step=0.190, global_step=3147.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▋     | 2771/5971 [30:29<35:11,  1.52it/s, loss=0.0965, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000751, train/loss_step=0.205, global_step=3147.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▋     | 2772/5971 [30:32<35:13,  1.51it/s, loss=0.1, v_num=0, train/loss_simple_step=0.273, train/loss_vlb_step=0.00127, train/loss_step=0.273, global_step=3147.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  46%|████▋     | 2773/5971 [30:32<35:13,  1.51it/s, loss=0.0967, v_num=0, train/loss_simple_step=0.00268, train/loss_vlb_step=1.52e-5, train/loss_step=0.00268, global_step=3148.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▋     | 2774/5971 [30:33<35:12,  1.51it/s, loss=0.0967, v_num=0, train/loss_simple_step=0.00268, train/loss_vlb_step=1.52e-5, train/loss_step=0.00268, global_step=3148.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▋     | 2774/5971 [30:33<35:12,  1.51it/s, loss=0.0908, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=5.19e-5, train/loss_step=0.0114, global_step=3148.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  46%|████▋     | 2775/5971 [30:34<35:12,  1.51it/s, loss=0.079, v_num=0, train/loss_simple_step=0.0381, train/loss_vlb_step=0.000128, train/loss_step=0.0381, global_step=3148.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  46%|████▋     | 2776/5971 [30:37<35:14,  1.51it/s, loss=0.0863, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000525, train/loss_step=0.153, global_step=3148.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  47%|████▋     | 2777/5971 [30:38<35:13,  1.51it/s, loss=0.0863, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000525, train/loss_step=0.153, global_step=3148.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  47%|████▋     | 2777/5971 [30:38<35:13,  1.51it/s, loss=0.0865, v_num=0, train/loss_simple_step=0.064, train/loss_vlb_step=0.000213, train/loss_step=0.064, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  47%|████▋     | 2778/5971 [30:39<35:13,  1.51it/s, loss=0.0862, v_num=0, train/loss_simple_step=0.0031, train/loss_vlb_step=1.67e-5, train/loss_step=0.0031, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  47%|████▋     | 2779/5971 [30:40<35:12,  1.51it/s, loss=0.108, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.00361, train/loss_step=0.454, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  47%|████▋     | 2780/5971 [30:42<35:14,  1.51it/s, loss=0.108, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.00361, train/loss_step=0.454, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  47%|████▋     | 2780/5971 [30:42<35:14,  1.51it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:31,  1.82it/s]

Validating:   1%|          | 2/167 [00:00<00:47,  3.46it/s]
Epoch 5:  47%|████▋     | 2783/5971 [30:43<35:11,  1.51it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   3%|▎         | 5/167 [00:00<00:18,  8.73it/s]
Epoch 5:  47%|████▋     | 2786/5971 [30:43<35:07,  1.51it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   4%|▍         | 7/167 [00:00<00:14, 11.30it/s]
Epoch 5:  47%|████▋     | 2789/5971 [30:43<35:02,  1.51it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   6%|▌         | 10/167 [00:01<00:10, 15.18it/s]
Epoch 5:  47%|████▋     | 2792/5971 [30:44<34:58,  1.51it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   8%|▊         | 13/167 [00:01<00:08, 17.73it/s]
Epoch 5:  47%|████▋     | 2795/5971 [30:44<34:54,  1.52it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  10%|█         | 17/167 [00:01<00:07, 21.16it/s]
Epoch 5:  47%|████▋     | 2798/5971 [30:44<34:50,  1.52it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 23.21it/s]
Epoch 5:  47%|████▋     | 2801/5971 [30:44<34:46,  1.52it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  14%|█▍        | 23/167 [00:01<00:05, 24.66it/s]
Epoch 5:  47%|████▋     | 2804/5971 [30:44<34:42,  1.52it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.40it/s]
Epoch 5:  47%|████▋     | 2807/5971 [30:44<34:38,  1.52it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 25.99it/s]
Epoch 5:  47%|████▋     | 2811/5971 [30:44<34:33,  1.52it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 26.39it/s]
Epoch 5:  47%|████▋     | 2815/5971 [30:44<34:27,  1.53it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  22%|██▏       | 36/167 [00:01<00:04, 26.69it/s]
Epoch 5:  47%|████▋     | 2819/5971 [30:45<34:22,  1.53it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  23%|██▎       | 39/167 [00:02<00:04, 27.14it/s]

Validating:  25%|██▌       | 42/167 [00:02<00:05, 24.86it/s]
Epoch 5:  47%|████▋     | 2823/5971 [30:45<34:16,  1.53it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 24.95it/s]
Epoch 5:  47%|████▋     | 2827/5971 [30:45<34:11,  1.53it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 25.20it/s]
Epoch 5:  47%|████▋     | 2831/5971 [30:45<34:06,  1.53it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  31%|███       | 51/167 [00:02<00:04, 24.91it/s]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 24.90it/s]
Epoch 5:  47%|████▋     | 2835/5971 [30:45<34:00,  1.54it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 25.18it/s]
Epoch 5:  48%|████▊     | 2839/5971 [30:45<33:55,  1.54it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  36%|███▌      | 60/167 [00:02<00:04, 25.40it/s]
Epoch 5:  48%|████▊     | 2843/5971 [30:45<33:50,  1.54it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  38%|███▊      | 63/167 [00:03<00:03, 26.22it/s]

Validating:  40%|███▉      | 66/167 [00:03<00:03, 26.79it/s]
Epoch 5:  48%|████▊     | 2847/5971 [30:46<33:45,  1.54it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 26.23it/s][A
Epoch 5:  48%|████▊     | 2851/5971 [30:46<33:39,  1.54it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 26.38it/s][A
Epoch 5:  48%|████▊     | 2855/5971 [30:46<33:34,  1.55it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 26.38it/s][A

Validating:  47%|████▋     | 78/167 [00:03<00:03, 25.72it/s][A
Epoch 5:  48%|████▊     | 2859/5971 [30:46<33:29,  1.55it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 25.89it/s][A
Epoch 5:  48%|████▊     | 2863/5971 [30:46<33:24,  1.55it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  50%|█████     | 84/167 [00:03<00:03, 24.65it/s][A
Epoch 5:  48%|████▊     | 2867/5971 [30:46<33:18,  1.55it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  52%|█████▏    | 87/167 [00:03<00:03, 24.69it/s][A

Validating:  54%|█████▍    | 90/167 [00:04<00:03, 25.54it/s][A
Epoch 5:  48%|████▊     | 2871/5971 [30:47<33:13,  1.55it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 26.17it/s][A
Epoch 5:  48%|████▊     | 2875/5971 [30:47<33:08,  1.56it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 27.07it/s][A
Epoch 5:  48%|████▊     | 2879/5971 [30:47<33:03,  1.56it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 27.53it/s][A

Validating:  61%|██████    | 102/167 [00:04<00:02, 27.55it/s][A
Epoch 5:  48%|████▊     | 2883/5971 [30:47<32:58,  1.56it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 27.34it/s][A
Epoch 5:  48%|████▊     | 2887/5971 [30:47<32:53,  1.56it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 26.08it/s][A
Epoch 5:  48%|████▊     | 2891/5971 [30:47<32:47,  1.57it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 26.51it/s][A

Validating:  68%|██████▊   | 114/167 [00:04<00:01, 27.02it/s][A
Epoch 5:  48%|████▊     | 2895/5971 [30:47<32:42,  1.57it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  70%|███████   | 117/167 [00:05<00:01, 26.21it/s][A
Epoch 5:  49%|████▊     | 2899/5971 [30:48<32:37,  1.57it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 25.65it/s][A
Epoch 5:  49%|████▊     | 2903/5971 [30:48<32:32,  1.57it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 24.98it/s][A

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 25.18it/s][A
Epoch 5:  49%|████▊     | 2907/5971 [30:48<32:27,  1.57it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 24.35it/s][A
Epoch 5:  49%|████▉     | 2911/5971 [30:48<32:22,  1.58it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 24.87it/s][A
Epoch 5:  49%|████▉     | 2915/5971 [30:48<32:17,  1.58it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  81%|████████  | 135/167 [00:05<00:01, 24.63it/s][A

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 24.21it/s][A
Epoch 5:  49%|████▉     | 2919/5971 [30:48<32:12,  1.58it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  84%|████████▍ | 141/167 [00:06<00:01, 24.54it/s][A
Epoch 5:  49%|████▉     | 2923/5971 [30:49<32:07,  1.58it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 25.22it/s][A
Epoch 5:  49%|████▉     | 2927/5971 [30:49<32:02,  1.58it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 25.52it/s][A

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 25.69it/s][A
Epoch 5:  49%|████▉     | 2931/5971 [30:49<31:57,  1.59it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 25.47it/s][A
Epoch 5:  49%|████▉     | 2935/5971 [30:49<31:52,  1.59it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 24.49it/s][A
Epoch 5:  49%|████▉     | 2939/5971 [30:49<31:47,  1.59it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 24.17it/s][A

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 25.62it/s][A
Epoch 5:  49%|████▉     | 2943/5971 [30:49<31:42,  1.59it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  99%|█████████▉| 165/167 [00:07<00:00, 26.09it/s][A
Epoch 5:  49%|████▉     | 2947/5971 [30:50<31:37,  1.59it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  49%|████▉     | 2948/5971 [30:50<31:36,  1.59it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.22e-5, train/loss_step=0.0138, global_step=3149.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Epoch 5:  49%|████▉     | 2949/5971 [30:51<31:36,  1.59it/s, loss=0.11, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000411, train/loss_step=0.122, global_step=3150.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  49%|████▉     | 2950/5971 [30:52<31:36,  1.59it/s, loss=0.0978, v_num=0, train/loss_simple_step=0.0338, train/loss_vlb_step=0.000114, train/loss_step=0.0338, global_step=3150.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  49%|████▉     | 2951/5971 [30:53<31:35,  1.59it/s, loss=0.0978, v_num=0, train/loss_simple_step=0.0338, train/loss_vlb_step=0.000114, train/loss_step=0.0338, global_step=3150.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  49%|████▉     | 2951/5971 [30:53<31:35,  1.59it/s, loss=0.0892, v_num=0, train/loss_simple_step=0.0011, train/loss_vlb_step=6.52e-6, train/loss_step=0.0011, global_step=3150.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  49%|████▉     | 2952/5971 [30:55<31:37,  1.59it/s, loss=0.103, v_num=0, train/loss_simple_step=0.273, train/loss_vlb_step=0.00117, train/loss_step=0.273, global_step=3150.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  49%|████▉     | 2953/5971 [30:56<31:37,  1.59it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00764, train/loss_vlb_step=3.52e-5, train/loss_step=0.00764, global_step=3151.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  49%|████▉     | 2954/5971 [30:57<31:36,  1.59it/s, loss=0.117, v_num=0, train/loss_simple_step=0.457, train/loss_vlb_step=0.00401, train/loss_step=0.457, global_step=3151.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  49%|████▉     | 2955/5971 [30:58<31:36,  1.59it/s, loss=0.117, v_num=0, train/loss_simple_step=0.457, train/loss_vlb_step=0.00401, train/loss_step=0.457, global_step=3151.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  49%|████▉     | 2955/5971 [30:58<31:36,  1.59it/s, loss=0.163, v_num=0, train/loss_simple_step=0.904, train/loss_vlb_step=0.115, train/loss_step=0.904, global_step=3151.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  50%|████▉     | 2956/5971 [31:01<31:37,  1.59it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00311, train/loss_vlb_step=1.59e-5, train/loss_step=0.00311, global_step=3151.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|████▉     | 2957/5971 [31:02<31:37,  1.59it/s, loss=0.172, v_num=0, train/loss_simple_step=0.238, train/loss_vlb_step=0.000902, train/loss_step=0.238, global_step=3152.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  50%|████▉     | 2958/5971 [31:02<31:36,  1.59it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00295, train/loss_vlb_step=1.65e-5, train/loss_step=0.00295, global_step=3152.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|████▉     | 2959/5971 [31:03<31:36,  1.59it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00295, train/loss_vlb_step=1.65e-5, train/loss_step=0.00295, global_step=3152.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|████▉     | 2959/5971 [31:03<31:36,  1.59it/s, loss=0.164, v_num=0, train/loss_simple_step=0.232, train/loss_vlb_step=0.000995, train/loss_step=0.232, global_step=3152.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  50%|████▉     | 2960/5971 [31:06<31:37,  1.59it/s, loss=0.158, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000497, train/loss_step=0.143, global_step=3152.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|████▉     | 2961/5971 [31:07<31:37,  1.59it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0014, train/loss_vlb_step=8.26e-6, train/loss_step=0.0014, global_step=3153.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|████▉     | 2962/5971 [31:07<31:36,  1.59it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0717, train/loss_vlb_step=0.000253, train/loss_step=0.0717, global_step=3153.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|████▉     | 2963/5971 [31:08<31:36,  1.59it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0717, train/loss_vlb_step=0.000253, train/loss_step=0.0717, global_step=3153.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|████▉     | 2963/5971 [31:08<31:36,  1.59it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00855, train/loss_vlb_step=4.01e-5, train/loss_step=0.00855, global_step=3153.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|████▉     | 2964/5971 [31:11<31:37,  1.58it/s, loss=0.186, v_num=0, train/loss_simple_step=0.679, train/loss_vlb_step=0.0254, train/loss_step=0.679, global_step=3153.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:  50%|████▉     | 2965/5971 [31:12<31:37,  1.58it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00529, train/loss_vlb_step=2.58e-5, train/loss_step=0.00529, global_step=3154.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|████▉     | 2966/5971 [31:12<31:36,  1.58it/s, loss=0.183, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.37e-5, train/loss_step=0.016, global_step=3154.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  50%|████▉     | 2967/5971 [31:13<31:36,  1.58it/s, loss=0.183, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.37e-5, train/loss_step=0.016, global_step=3154.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|████▉     | 2967/5971 [31:13<31:36,  1.58it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00938, train/loss_vlb_step=4.14e-5, train/loss_step=0.00938, global_step=3154.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|████▉     | 2968/5971 [31:16<31:38,  1.58it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.86e-5, train/loss_step=0.0132, global_step=3154.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  50%|████▉     | 2969/5971 [31:17<31:38,  1.58it/s, loss=0.165, v_num=0, train/loss_simple_step=0.199, train/loss_vlb_step=0.000731, train/loss_step=0.199, global_step=3155.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  50%|████▉     | 2970/5971 [31:18<31:37,  1.58it/s, loss=0.177, v_num=0, train/loss_simple_step=0.270, train/loss_vlb_step=0.00101, train/loss_step=0.270, global_step=3155.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  50%|████▉     | 2971/5971 [31:19<31:37,  1.58it/s, loss=0.177, v_num=0, train/loss_simple_step=0.270, train/loss_vlb_step=0.00101, train/loss_step=0.270, global_step=3155.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|████▉     | 2971/5971 [31:19<31:37,  1.58it/s, loss=0.205, v_num=0, train/loss_simple_step=0.569, train/loss_vlb_step=0.00726, train/loss_step=0.569, global_step=3155.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|████▉     | 2972/5971 [31:22<31:38,  1.58it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00698, train/loss_vlb_step=3.42e-5, train/loss_step=0.00698, global_step=3155.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|████▉     | 2973/5971 [31:22<31:38,  1.58it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00756, train/loss_vlb_step=3.75e-5, train/loss_step=0.00756, global_step=3156.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|████▉     | 2974/5971 [31:23<31:37,  1.58it/s, loss=0.208, v_num=0, train/loss_simple_step=0.774, train/loss_vlb_step=0.0498, train/loss_step=0.774, global_step=3156.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:  50%|████▉     | 2975/5971 [31:24<31:37,  1.58it/s, loss=0.208, v_num=0, train/loss_simple_step=0.774, train/loss_vlb_step=0.0498, train/loss_step=0.774, global_step=3156.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|████▉     | 2975/5971 [31:24<31:37,  1.58it/s, loss=0.183, v_num=0, train/loss_simple_step=0.402, train/loss_vlb_step=0.00273, train/loss_step=0.402, global_step=3156.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|████▉     | 2976/5971 [31:28<31:39,  1.58it/s, loss=0.196, v_num=0, train/loss_simple_step=0.261, train/loss_vlb_step=0.000903, train/loss_step=0.261, global_step=3156.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|████▉     | 2977/5971 [31:28<31:39,  1.58it/s, loss=0.192, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000615, train/loss_step=0.170, global_step=3157.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|████▉     | 2978/5971 [31:29<31:38,  1.58it/s, loss=0.197, v_num=0, train/loss_simple_step=0.092, train/loss_vlb_step=0.000302, train/loss_step=0.092, global_step=3157.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|████▉     | 2979/5971 [31:30<31:38,  1.58it/s, loss=0.197, v_num=0, train/loss_simple_step=0.092, train/loss_vlb_step=0.000302, train/loss_step=0.092, global_step=3157.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|████▉     | 2979/5971 [31:30<31:38,  1.58it/s, loss=0.185, v_num=0, train/loss_simple_step=0.00396, train/loss_vlb_step=2.11e-5, train/loss_step=0.00396, global_step=3157.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|████▉     | 2980/5971 [31:33<31:40,  1.57it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0601, train/loss_vlb_step=0.000207, train/loss_step=0.0601, global_step=3157.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  50%|████▉     | 2981/5971 [31:34<31:39,  1.57it/s, loss=0.21, v_num=0, train/loss_simple_step=0.575, train/loss_vlb_step=0.00598, train/loss_step=0.575, global_step=3158.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  50%|████▉     | 2982/5971 [31:35<31:39,  1.57it/s, loss=0.212, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000385, train/loss_step=0.116, global_step=3158.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|████▉     | 2983/5971 [31:36<31:39,  1.57it/s, loss=0.212, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000385, train/loss_step=0.116, global_step=3158.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|████▉     | 2983/5971 [31:36<31:39,  1.57it/s, loss=0.252, v_num=0, train/loss_simple_step=0.801, train/loss_vlb_step=0.102, train/loss_step=0.801, global_step=3158.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  50%|████▉     | 2984/5971 [31:39<31:40,  1.57it/s, loss=0.224, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000447, train/loss_step=0.135, global_step=3158.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|████▉     | 2985/5971 [31:40<31:40,  1.57it/s, loss=0.241, v_num=0, train/loss_simple_step=0.344, train/loss_vlb_step=0.00183, train/loss_step=0.344, global_step=3159.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  50%|█████     | 2986/5971 [31:41<31:40,  1.57it/s, loss=0.241, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=3159.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|█████     | 2987/5971 [31:42<31:39,  1.57it/s, loss=0.241, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=3159.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|█████     | 2987/5971 [31:42<31:39,  1.57it/s, loss=0.243, v_num=0, train/loss_simple_step=0.0506, train/loss_vlb_step=0.000178, train/loss_step=0.0506, global_step=3159.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  50%|█████     | 2988/5971 [31:45<31:41,  1.57it/s, loss=0.246, v_num=0, train/loss_simple_step=0.0724, train/loss_vlb_step=0.000238, train/loss_step=0.0724, global_step=3159.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|█████     | 2989/5971 [31:46<31:41,  1.57it/s, loss=0.236, v_num=0, train/loss_simple_step=0.00337, train/loss_vlb_step=1.8e-5, train/loss_step=0.00337, global_step=3160.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|█████     | 2990/5971 [31:47<31:40,  1.57it/s, loss=0.226, v_num=0, train/loss_simple_step=0.069, train/loss_vlb_step=0.000232, train/loss_step=0.069, global_step=3160.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  50%|█████     | 2991/5971 [31:48<31:40,  1.57it/s, loss=0.226, v_num=0, train/loss_simple_step=0.069, train/loss_vlb_step=0.000232, train/loss_step=0.069, global_step=3160.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|█████     | 2991/5971 [31:48<31:40,  1.57it/s, loss=0.231, v_num=0, train/loss_simple_step=0.664, train/loss_vlb_step=0.00723, train/loss_step=0.664, global_step=3160.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  50%|█████     | 2992/5971 [31:51<31:42,  1.57it/s, loss=0.231, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=7.75e-5, train/loss_step=0.0198, global_step=3160.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|█████     | 2993/5971 [31:52<31:41,  1.57it/s, loss=0.232, v_num=0, train/loss_simple_step=0.0164, train/loss_vlb_step=7.06e-5, train/loss_step=0.0164, global_step=3161.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|█████     | 2994/5971 [31:53<31:41,  1.57it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0107, train/loss_vlb_step=5.01e-5, train/loss_step=0.0107, global_step=3161.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|█████     | 2995/5971 [31:53<31:41,  1.57it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0107, train/loss_vlb_step=5.01e-5, train/loss_step=0.0107, global_step=3161.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|█████     | 2995/5971 [31:53<31:41,  1.57it/s, loss=0.174, v_num=0, train/loss_simple_step=0.00867, train/loss_vlb_step=4.1e-5, train/loss_step=0.00867, global_step=3161.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|█████     | 2996/5971 [31:56<31:42,  1.56it/s, loss=0.167, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000429, train/loss_step=0.131, global_step=3161.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  50%|█████     | 2997/5971 [31:57<31:42,  1.56it/s, loss=0.165, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000385, train/loss_step=0.117, global_step=3162.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|█████     | 2998/5971 [31:58<31:41,  1.56it/s, loss=0.168, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000571, train/loss_step=0.168, global_step=3162.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|█████     | 2999/5971 [31:59<31:41,  1.56it/s, loss=0.168, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000571, train/loss_step=0.168, global_step=3162.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|█████     | 2999/5971 [31:59<31:41,  1.56it/s, loss=0.209, v_num=0, train/loss_simple_step=0.820, train/loss_vlb_step=0.0836, train/loss_step=0.820, global_step=3162.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  50%|█████     | 3000/5971 [32:01<31:42,  1.56it/s, loss=0.213, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000461, train/loss_step=0.137, global_step=3162.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|█████     | 3001/5971 [32:02<31:42,  1.56it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0476, train/loss_vlb_step=0.000172, train/loss_step=0.0476, global_step=3163.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|█████     | 3002/5971 [32:03<31:41,  1.56it/s, loss=0.181, v_num=0, train/loss_simple_step=0.00126, train/loss_vlb_step=7.54e-6, train/loss_step=0.00126, global_step=3163.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|█████     | 3003/5971 [32:04<31:41,  1.56it/s, loss=0.181, v_num=0, train/loss_simple_step=0.00126, train/loss_vlb_step=7.54e-6, train/loss_step=0.00126, global_step=3163.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|█████     | 3003/5971 [32:04<31:41,  1.56it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0569, train/loss_vlb_step=0.000191, train/loss_step=0.0569, global_step=3163.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  50%|█████     | 3004/5971 [32:07<31:42,  1.56it/s, loss=0.167, v_num=0, train/loss_simple_step=0.594, train/loss_vlb_step=0.00534, train/loss_step=0.594, global_step=3163.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  50%|█████     | 3005/5971 [32:08<31:42,  1.56it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0276, train/loss_vlb_step=0.000107, train/loss_step=0.0276, global_step=3164.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|█████     | 3006/5971 [32:09<31:42,  1.56it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00784, train/loss_vlb_step=3.95e-5, train/loss_step=0.00784, global_step=3164.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|█████     | 3007/5971 [32:10<31:41,  1.56it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00784, train/loss_vlb_step=3.95e-5, train/loss_step=0.00784, global_step=3164.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|█████     | 3007/5971 [32:10<31:41,  1.56it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0233, train/loss_vlb_step=9.62e-5, train/loss_step=0.0233, global_step=3164.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  50%|█████     | 3008/5971 [32:12<31:43,  1.56it/s, loss=0.156, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000697, train/loss_step=0.189, global_step=3164.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|█████     | 3009/5971 [32:13<31:42,  1.56it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0342, train/loss_vlb_step=0.000119, train/loss_step=0.0342, global_step=3165.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|█████     | 3010/5971 [32:14<31:42,  1.56it/s, loss=0.17, v_num=0, train/loss_simple_step=0.330, train/loss_vlb_step=0.00137, train/loss_step=0.330, global_step=3165.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  50%|█████     | 3011/5971 [32:15<31:42,  1.56it/s, loss=0.17, v_num=0, train/loss_simple_step=0.330, train/loss_vlb_step=0.00137, train/loss_step=0.330, global_step=3165.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|█████     | 3011/5971 [32:15<31:42,  1.56it/s, loss=0.164, v_num=0, train/loss_simple_step=0.538, train/loss_vlb_step=0.00761, train/loss_step=0.538, global_step=3165.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|█████     | 3012/5971 [32:18<31:43,  1.55it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0269, train/loss_vlb_step=0.000104, train/loss_step=0.0269, global_step=3165.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|█████     | 3013/5971 [32:19<31:43,  1.55it/s, loss=0.193, v_num=0, train/loss_simple_step=0.586, train/loss_vlb_step=0.0254, train/loss_step=0.586, global_step=3166.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  50%|█████     | 3014/5971 [32:20<31:43,  1.55it/s, loss=0.196, v_num=0, train/loss_simple_step=0.070, train/loss_vlb_step=0.000231, train/loss_step=0.070, global_step=3166.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|█████     | 3015/5971 [32:21<31:42,  1.55it/s, loss=0.196, v_num=0, train/loss_simple_step=0.070, train/loss_vlb_step=0.000231, train/loss_step=0.070, global_step=3166.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  50%|█████     | 3015/5971 [32:21<31:42,  1.55it/s, loss=0.199, v_num=0, train/loss_simple_step=0.068, train/loss_vlb_step=0.000231, train/loss_step=0.068, global_step=3166.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  51%|█████     | 3016/5971 [32:24<31:44,  1.55it/s, loss=0.201, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000659, train/loss_step=0.184, global_step=3166.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  51%|█████     | 3017/5971 [32:25<31:44,  1.55it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.24e-5, train/loss_step=0.0157, global_step=3167.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  51%|█████     | 3018/5971 [32:26<31:44,  1.55it/s, loss=0.195, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000434, train/loss_step=0.132, global_step=3167.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  51%|█████     | 3019/5971 [32:27<31:43,  1.55it/s, loss=0.195, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000434, train/loss_step=0.132, global_step=3167.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  51%|█████     | 3019/5971 [32:27<31:43,  1.55it/s, loss=0.172, v_num=0, train/loss_simple_step=0.379, train/loss_vlb_step=0.002, train/loss_step=0.379, global_step=3167.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  51%|█████     | 3020/5971 [32:30<31:44,  1.55it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0656, train/loss_vlb_step=0.000241, train/loss_step=0.0656, global_step=3167.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  51%|█████     | 3021/5971 [32:31<31:44,  1.55it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00698, train/loss_vlb_step=3.29e-5, train/loss_step=0.00698, global_step=3168.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  51%|█████     | 3022/5971 [32:31<31:44,  1.55it/s, loss=0.169, v_num=0, train/loss_simple_step=0.052, train/loss_vlb_step=0.000187, train/loss_step=0.052, global_step=3168.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  51%|█████     | 3023/5971 [32:32<31:43,  1.55it/s, loss=0.169, v_num=0, train/loss_simple_step=0.052, train/loss_vlb_step=0.000187, train/loss_step=0.052, global_step=3168.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  51%|█████     | 3023/5971 [32:32<31:43,  1.55it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0011, train/loss_vlb_step=6.59e-6, train/loss_step=0.0011, global_step=3168.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  51%|█████     | 3024/5971 [32:35<31:44,  1.55it/s, loss=0.147, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.000778, train/loss_step=0.206, global_step=3168.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  51%|█████     | 3025/5971 [32:36<31:44,  1.55it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0269, train/loss_vlb_step=0.000105, train/loss_step=0.0269, global_step=3169.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  51%|█████     | 3026/5971 [32:36<31:43,  1.55it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0319, train/loss_vlb_step=0.000122, train/loss_step=0.0319, global_step=3169.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  51%|█████     | 3027/5971 [32:37<31:43,  1.55it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0319, train/loss_vlb_step=0.000122, train/loss_step=0.0319, global_step=3169.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  51%|█████     | 3027/5971 [32:37<31:43,  1.55it/s, loss=0.151, v_num=0, train/loss_simple_step=0.084, train/loss_vlb_step=0.000284, train/loss_step=0.084, global_step=3169.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  51%|█████     | 3028/5971 [32:40<31:44,  1.55it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0227, train/loss_vlb_step=9.15e-5, train/loss_step=0.0227, global_step=3169.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  51%|█████     | 3029/5971 [32:41<31:44,  1.54it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0191, train/loss_vlb_step=7.82e-5, train/loss_step=0.0191, global_step=3170.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  51%|█████     | 3030/5971 [32:42<31:44,  1.54it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00221, train/loss_vlb_step=1.27e-5, train/loss_step=0.00221, global_step=3170.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  51%|█████     | 3031/5971 [32:43<31:43,  1.54it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00221, train/loss_vlb_step=1.27e-5, train/loss_step=0.00221, global_step=3170.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  51%|█████     | 3031/5971 [32:43<31:43,  1.54it/s, loss=0.0992, v_num=0, train/loss_simple_step=0.00312, train/loss_vlb_step=1.79e-5, train/loss_step=0.00312, global_step=3170.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  51%|█████     | 3032/5971 [32:45<31:45,  1.54it/s, loss=0.103, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000358, train/loss_step=0.109, global_step=3170.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  51%|█████     | 3033/5971 [32:46<31:44,  1.54it/s, loss=0.0784, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.000291, train/loss_step=0.0883, global_step=3171.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  51%|█████     | 3034/5971 [32:47<31:44,  1.54it/s, loss=0.0754, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.81e-5, train/loss_step=0.0102, global_step=3171.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  51%|█████     | 3035/5971 [32:48<31:43,  1.54it/s, loss=0.0754, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.81e-5, train/loss_step=0.0102, global_step=3171.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  51%|█████     | 3035/5971 [32:48<31:43,  1.54it/s, loss=0.0784, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000429, train/loss_step=0.128, global_step=3171.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  51%|█████     | 3036/5971 [32:52<31:45,  1.54it/s, loss=0.0699, v_num=0, train/loss_simple_step=0.0137, train/loss_vlb_step=5.71e-5, train/loss_step=0.0137, global_step=3171.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  51%|█████     | 3037/5971 [32:53<31:45,  1.54it/s, loss=0.0742, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=3172.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  51%|█████     | 3038/5971 [32:54<31:45,  1.54it/s, loss=0.0703, v_num=0, train/loss_simple_step=0.0529, train/loss_vlb_step=0.000181, train/loss_step=0.0529, global_step=3172.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  51%|█████     | 3039/5971 [32:55<31:44,  1.54it/s, loss=0.0703, v_num=0, train/loss_simple_step=0.0529, train/loss_vlb_step=0.000181, train/loss_step=0.0529, global_step=3172.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  51%|█████     | 3039/5971 [32:55<31:44,  1.54it/s, loss=0.0535, v_num=0, train/loss_simple_step=0.0441, train/loss_vlb_step=0.000154, train/loss_step=0.0441, global_step=3172.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  51%|█████     | 3040/5971 [32:57<31:45,  1.54it/s, loss=0.0832, v_num=0, train/loss_simple_step=0.659, train/loss_vlb_step=0.0108, train/loss_step=0.659, global_step=3172.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  51%|█████     | 3041/5971 [32:58<31:45,  1.54it/s, loss=0.0922, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.000675, train/loss_step=0.187, global_step=3173.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  51%|█████     | 3042/5971 [32:59<31:45,  1.54it/s, loss=0.0913, v_num=0, train/loss_simple_step=0.0343, train/loss_vlb_step=0.000125, train/loss_step=0.0343, global_step=3173.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  51%|█████     | 3043/5971 [33:00<31:44,  1.54it/s, loss=0.0913, v_num=0, train/loss_simple_step=0.0343, train/loss_vlb_step=0.000125, train/loss_step=0.0343, global_step=3173.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  51%|█████     | 3043/5971 [33:00<31:44,  1.54it/s, loss=0.0914, v_num=0, train/loss_simple_step=0.00296, train/loss_vlb_step=1.61e-5, train/loss_step=0.00296, global_step=3173.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  51%|█████     | 3044/5971 [33:03<31:46,  1.54it/s, loss=0.0816, v_num=0, train/loss_simple_step=0.00942, train/loss_vlb_step=4.34e-5, train/loss_step=0.00942, global_step=3173.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  51%|█████     | 3045/5971 [33:04<31:46,  1.54it/s, loss=0.121, v_num=0, train/loss_simple_step=0.822, train/loss_vlb_step=0.0256, train/loss_step=0.822, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]      
Epoch 5:  51%|█████     | 3046/5971 [33:05<31:45,  1.53it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.22e-5, train/loss_step=0.00844, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  51%|█████     | 3047/5971 [33:05<31:45,  1.53it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00844, train/loss_vlb_step=4.22e-5, train/loss_step=0.00844, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  51%|█████     | 3047/5971 [33:05<31:45,  1.53it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00162, train/loss_vlb_step=9.41e-6, train/loss_step=0.00162, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  51%|█████     | 3048/5971 [33:08<31:46,  1.53it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:14,  2.24it/s][A

Validating:   1%|          | 2/167 [00:00<00:43,  3.84it/s][A
Epoch 5:  51%|█████     | 3051/5971 [33:09<31:43,  1.53it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   3%|▎         | 5/167 [00:00<00:16,  9.53it/s][A
Epoch 5:  51%|█████     | 3055/5971 [33:09<31:38,  1.54it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.93it/s][A
Epoch 5:  51%|█████     | 3059/5971 [33:09<31:33,  1.54it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   7%|▋         | 11/167 [00:00<00:09, 17.04it/s][A

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.30it/s][A
Epoch 5:  51%|█████▏    | 3063/5971 [33:09<31:28,  1.54it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  10%|█         | 17/167 [00:01<00:07, 20.43it/s][A
Epoch 5:  51%|█████▏    | 3067/5971 [33:09<31:23,  1.54it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 21.92it/s][A
Epoch 5:  51%|█████▏    | 3071/5971 [33:10<31:18,  1.54it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 22.43it/s][A
Epoch 5:  51%|█████▏    | 3075/5971 [33:10<31:13,  1.55it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 25.01it/s][A

Validating:  18%|█▊        | 30/167 [00:01<00:05, 25.62it/s][A
Epoch 5:  52%|█████▏    | 3079/5971 [33:10<31:08,  1.55it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 26.09it/s][A
Epoch 5:  52%|█████▏    | 3083/5971 [33:10<31:04,  1.55it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  22%|██▏       | 36/167 [00:01<00:04, 26.62it/s][A
Epoch 5:  52%|█████▏    | 3087/5971 [33:10<30:59,  1.55it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  23%|██▎       | 39/167 [00:02<00:05, 25.47it/s][A

Validating:  25%|██▌       | 42/167 [00:02<00:05, 24.68it/s][A
Epoch 5:  52%|█████▏    | 3091/5971 [33:10<30:54,  1.55it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 25.40it/s][A
Epoch 5:  52%|█████▏    | 3095/5971 [33:11<30:49,  1.55it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 26.43it/s][A
Epoch 5:  52%|█████▏    | 3099/5971 [33:11<30:44,  1.56it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  31%|███       | 51/167 [00:02<00:04, 26.28it/s][A

Validating:  32%|███▏      | 54/167 [00:02<00:04, 25.85it/s][A
Epoch 5:  52%|█████▏    | 3103/5971 [33:11<30:39,  1.56it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 25.55it/s][A
Epoch 5:  52%|█████▏    | 3107/5971 [33:11<30:35,  1.56it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  36%|███▌      | 60/167 [00:02<00:04, 26.02it/s][A
Epoch 5:  52%|█████▏    | 3111/5971 [33:11<30:30,  1.56it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  38%|███▊      | 63/167 [00:02<00:04, 25.85it/s][A

Validating:  40%|███▉      | 66/167 [00:03<00:03, 25.36it/s][A
Epoch 5:  52%|█████▏    | 3115/5971 [33:11<30:25,  1.56it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 25.41it/s][A
Epoch 5:  52%|█████▏    | 3119/5971 [33:11<30:20,  1.57it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 26.18it/s][A
Epoch 5:  52%|█████▏    | 3123/5971 [33:12<30:16,  1.57it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 25.88it/s][A

Validating:  47%|████▋     | 78/167 [00:03<00:03, 26.14it/s][A
Epoch 5:  52%|█████▏    | 3127/5971 [33:12<30:11,  1.57it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 25.58it/s][A
Epoch 5:  52%|█████▏    | 3131/5971 [33:12<30:06,  1.57it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  50%|█████     | 84/167 [00:03<00:03, 26.51it/s][A
Epoch 5:  53%|█████▎    | 3135/5971 [33:12<30:01,  1.57it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  52%|█████▏    | 87/167 [00:03<00:03, 26.37it/s][A

Validating:  54%|█████▍    | 90/167 [00:04<00:03, 25.61it/s][A
Epoch 5:  53%|█████▎    | 3139/5971 [33:12<29:57,  1.58it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 25.89it/s][A
Epoch 5:  53%|█████▎    | 3143/5971 [33:12<29:52,  1.58it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 24.51it/s][A
Epoch 5:  53%|█████▎    | 3147/5971 [33:13<29:47,  1.58it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 23.91it/s][A

Validating:  61%|██████    | 102/167 [00:04<00:02, 24.99it/s][A
Epoch 5:  53%|█████▎    | 3151/5971 [33:13<29:43,  1.58it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 23.68it/s][A
Epoch 5:  53%|█████▎    | 3155/5971 [33:13<29:38,  1.58it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 23.30it/s][A
Epoch 5:  53%|█████▎    | 3159/5971 [33:13<29:34,  1.59it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 24.12it/s][A

Validating:  68%|██████▊   | 114/167 [00:04<00:02, 25.49it/s][A
Epoch 5:  53%|█████▎    | 3163/5971 [33:13<29:29,  1.59it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  70%|███████   | 117/167 [00:05<00:01, 26.20it/s][A
Epoch 5:  53%|█████▎    | 3167/5971 [33:13<29:24,  1.59it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 26.98it/s][A
Epoch 5:  53%|█████▎    | 3171/5971 [33:13<29:20,  1.59it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 25.93it/s][A

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 25.42it/s][A
Epoch 5:  53%|█████▎    | 3175/5971 [33:14<29:15,  1.59it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 24.57it/s][A
Epoch 5:  53%|█████▎    | 3179/5971 [33:14<29:11,  1.59it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 23.65it/s][A
Epoch 5:  53%|█████▎    | 3183/5971 [33:14<29:06,  1.60it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  81%|████████  | 135/167 [00:05<00:01, 22.64it/s][A

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 22.47it/s][A
Epoch 5:  53%|█████▎    | 3187/5971 [33:14<29:01,  1.60it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  84%|████████▍ | 141/167 [00:06<00:01, 23.35it/s][A
Epoch 5:  53%|█████▎    | 3191/5971 [33:14<28:57,  1.60it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 23.19it/s][A
Epoch 5:  54%|█████▎    | 3195/5971 [33:15<28:52,  1.60it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 24.38it/s][A

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 24.71it/s][A
Epoch 5:  54%|█████▎    | 3199/5971 [33:15<28:48,  1.60it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 25.30it/s][A
Epoch 5:  54%|█████▎    | 3203/5971 [33:15<28:43,  1.61it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 24.62it/s][A
Epoch 5:  54%|█████▎    | 3207/5971 [33:15<28:39,  1.61it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 25.50it/s][A

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 26.49it/s][A
Epoch 5:  54%|█████▍    | 3211/5971 [33:15<28:34,  1.61it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  99%|█████████▉| 165/167 [00:07<00:00, 27.04it/s][A
Epoch 5:  54%|█████▍    | 3215/5971 [33:15<28:30,  1.61it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3216/5971 [33:16<28:29,  1.61it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.18e-5, train/loss_step=0.00199, global_step=3174.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Epoch 5:  54%|█████▍    | 3217/5971 [33:17<28:29,  1.61it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0856, train/loss_vlb_step=0.000282, train/loss_step=0.0856, global_step=3175.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  54%|█████▍    | 3218/5971 [33:18<28:28,  1.61it/s, loss=0.149, v_num=0, train/loss_simple_step=0.623, train/loss_vlb_step=0.00901, train/loss_step=0.623, global_step=3175.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  54%|█████▍    | 3219/5971 [33:18<28:28,  1.61it/s, loss=0.149, v_num=0, train/loss_simple_step=0.623, train/loss_vlb_step=0.00901, train/loss_step=0.623, global_step=3175.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3219/5971 [33:18<28:28,  1.61it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0107, train/loss_vlb_step=4.8e-5, train/loss_step=0.0107, global_step=3175.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3220/5971 [33:21<28:29,  1.61it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0208, train/loss_vlb_step=8.39e-5, train/loss_step=0.0208, global_step=3175.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3221/5971 [33:22<28:29,  1.61it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0539, train/loss_vlb_step=0.000188, train/loss_step=0.0539, global_step=3176.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3222/5971 [33:23<28:28,  1.61it/s, loss=0.152, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000569, train/loss_step=0.168, global_step=3176.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  54%|█████▍    | 3223/5971 [33:24<28:28,  1.61it/s, loss=0.152, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000569, train/loss_step=0.168, global_step=3176.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3223/5971 [33:24<28:28,  1.61it/s, loss=0.182, v_num=0, train/loss_simple_step=0.736, train/loss_vlb_step=0.0258, train/loss_step=0.736, global_step=3176.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  54%|█████▍    | 3224/5971 [33:27<28:29,  1.61it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0094, train/loss_vlb_step=4.28e-5, train/loss_step=0.0094, global_step=3176.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3225/5971 [33:28<28:29,  1.61it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0895, train/loss_vlb_step=0.000296, train/loss_step=0.0895, global_step=3177.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3226/5971 [33:29<28:29,  1.61it/s, loss=0.185, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.00041, train/loss_step=0.125, global_step=3177.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  54%|█████▍    | 3227/5971 [33:30<28:28,  1.61it/s, loss=0.185, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.00041, train/loss_step=0.125, global_step=3177.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3227/5971 [33:30<28:28,  1.61it/s, loss=0.186, v_num=0, train/loss_simple_step=0.080, train/loss_vlb_step=0.000264, train/loss_step=0.080, global_step=3177.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3228/5971 [33:32<28:29,  1.60it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0483, train/loss_vlb_step=0.000169, train/loss_step=0.0483, global_step=3177.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3229/5971 [33:33<28:29,  1.60it/s, loss=0.158, v_num=0, train/loss_simple_step=0.223, train/loss_vlb_step=0.000797, train/loss_step=0.223, global_step=3178.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  54%|█████▍    | 3230/5971 [33:34<28:28,  1.60it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0939, train/loss_vlb_step=0.000309, train/loss_step=0.0939, global_step=3178.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3231/5971 [33:35<28:28,  1.60it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0939, train/loss_vlb_step=0.000309, train/loss_step=0.0939, global_step=3178.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3231/5971 [33:35<28:28,  1.60it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.6e-5, train/loss_step=0.0157, global_step=3178.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  54%|█████▍    | 3232/5971 [33:37<28:29,  1.60it/s, loss=0.167, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000383, train/loss_step=0.117, global_step=3178.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3233/5971 [33:38<28:28,  1.60it/s, loss=0.143, v_num=0, train/loss_simple_step=0.351, train/loss_vlb_step=0.00192, train/loss_step=0.351, global_step=3179.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  54%|█████▍    | 3234/5971 [33:39<28:28,  1.60it/s, loss=0.148, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.00035, train/loss_step=0.106, global_step=3179.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3235/5971 [33:40<28:28,  1.60it/s, loss=0.148, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.00035, train/loss_step=0.106, global_step=3179.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3235/5971 [33:40<28:28,  1.60it/s, loss=0.177, v_num=0, train/loss_simple_step=0.578, train/loss_vlb_step=0.00876, train/loss_step=0.578, global_step=3179.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3236/5971 [33:43<28:29,  1.60it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0731, train/loss_vlb_step=0.000242, train/loss_step=0.0731, global_step=3179.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3237/5971 [33:44<28:29,  1.60it/s, loss=0.181, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000351, train/loss_step=0.106, global_step=3180.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  54%|█████▍    | 3238/5971 [33:45<28:29,  1.60it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0018, train/loss_vlb_step=1.07e-5, train/loss_step=0.0018, global_step=3180.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3239/5971 [33:46<28:28,  1.60it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0018, train/loss_vlb_step=1.07e-5, train/loss_step=0.0018, global_step=3180.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3239/5971 [33:46<28:28,  1.60it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00304, train/loss_vlb_step=1.63e-5, train/loss_step=0.00304, global_step=3180.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3240/5971 [33:49<28:30,  1.60it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0357, train/loss_vlb_step=0.000126, train/loss_step=0.0357, global_step=3180.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3241/5971 [33:50<28:30,  1.60it/s, loss=0.17, v_num=0, train/loss_simple_step=0.446, train/loss_vlb_step=0.00248, train/loss_step=0.446, global_step=3181.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  54%|█████▍    | 3242/5971 [33:51<28:29,  1.60it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.85e-5, train/loss_step=0.0108, global_step=3181.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3243/5971 [33:52<28:29,  1.60it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.85e-5, train/loss_step=0.0108, global_step=3181.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3243/5971 [33:52<28:29,  1.60it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0106, train/loss_vlb_step=4.87e-5, train/loss_step=0.0106, global_step=3181.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3244/5971 [33:54<28:30,  1.59it/s, loss=0.147, v_num=0, train/loss_simple_step=0.415, train/loss_vlb_step=0.00241, train/loss_step=0.415, global_step=3181.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  54%|█████▍    | 3245/5971 [33:55<28:29,  1.59it/s, loss=0.157, v_num=0, train/loss_simple_step=0.303, train/loss_vlb_step=0.00132, train/loss_step=0.303, global_step=3182.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3246/5971 [33:56<28:29,  1.59it/s, loss=0.163, v_num=0, train/loss_simple_step=0.244, train/loss_vlb_step=0.00103, train/loss_step=0.244, global_step=3182.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3247/5971 [33:57<28:28,  1.59it/s, loss=0.163, v_num=0, train/loss_simple_step=0.244, train/loss_vlb_step=0.00103, train/loss_step=0.244, global_step=3182.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3247/5971 [33:57<28:28,  1.59it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0472, train/loss_vlb_step=0.000158, train/loss_step=0.0472, global_step=3182.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3248/5971 [33:59<28:29,  1.59it/s, loss=0.178, v_num=0, train/loss_simple_step=0.379, train/loss_vlb_step=0.0019, train/loss_step=0.379, global_step=3182.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  54%|█████▍    | 3249/5971 [34:00<28:29,  1.59it/s, loss=0.169, v_num=0, train/loss_simple_step=0.042, train/loss_vlb_step=0.000148, train/loss_step=0.042, global_step=3183.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3250/5971 [34:01<28:28,  1.59it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0409, train/loss_vlb_step=0.000149, train/loss_step=0.0409, global_step=3183.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3251/5971 [34:02<28:28,  1.59it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0409, train/loss_vlb_step=0.000149, train/loss_step=0.0409, global_step=3183.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3251/5971 [34:02<28:28,  1.59it/s, loss=0.176, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000721, train/loss_step=0.203, global_step=3183.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  54%|█████▍    | 3252/5971 [34:05<28:29,  1.59it/s, loss=0.173, v_num=0, train/loss_simple_step=0.053, train/loss_vlb_step=0.000187, train/loss_step=0.053, global_step=3183.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3253/5971 [34:06<28:29,  1.59it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00622, train/loss_vlb_step=3.14e-5, train/loss_step=0.00622, global_step=3184.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  54%|█████▍    | 3254/5971 [34:07<28:28,  1.59it/s, loss=0.152, v_num=0, train/loss_simple_step=0.042, train/loss_vlb_step=0.000152, train/loss_step=0.042, global_step=3184.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  55%|█████▍    | 3255/5971 [34:07<28:28,  1.59it/s, loss=0.152, v_num=0, train/loss_simple_step=0.042, train/loss_vlb_step=0.000152, train/loss_step=0.042, global_step=3184.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▍    | 3255/5971 [34:07<28:28,  1.59it/s, loss=0.132, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000621, train/loss_step=0.181, global_step=3184.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▍    | 3256/5971 [34:10<28:29,  1.59it/s, loss=0.139, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000681, train/loss_step=0.204, global_step=3184.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▍    | 3257/5971 [34:11<28:29,  1.59it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0718, train/loss_vlb_step=0.00024, train/loss_step=0.0718, global_step=3185.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▍    | 3258/5971 [34:12<28:28,  1.59it/s, loss=0.137, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.18e-5, train/loss_step=0.004, global_step=3185.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  55%|█████▍    | 3259/5971 [34:13<28:28,  1.59it/s, loss=0.137, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.18e-5, train/loss_step=0.004, global_step=3185.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▍    | 3259/5971 [34:13<28:28,  1.59it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0489, train/loss_vlb_step=0.000172, train/loss_step=0.0489, global_step=3185.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▍    | 3260/5971 [34:15<28:28,  1.59it/s, loss=0.152, v_num=0, train/loss_simple_step=0.281, train/loss_vlb_step=0.00108, train/loss_step=0.281, global_step=3185.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  55%|█████▍    | 3261/5971 [34:16<28:28,  1.59it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0235, train/loss_vlb_step=9.35e-5, train/loss_step=0.0235, global_step=3186.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▍    | 3262/5971 [34:17<28:28,  1.59it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0333, train/loss_vlb_step=0.000122, train/loss_step=0.0333, global_step=3186.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▍    | 3263/5971 [34:18<28:27,  1.59it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0333, train/loss_vlb_step=0.000122, train/loss_step=0.0333, global_step=3186.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▍    | 3263/5971 [34:18<28:27,  1.59it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0226, train/loss_vlb_step=9.01e-5, train/loss_step=0.0226, global_step=3186.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  55%|█████▍    | 3264/5971 [34:20<28:28,  1.58it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0395, train/loss_vlb_step=0.000144, train/loss_step=0.0395, global_step=3186.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▍    | 3265/5971 [34:21<28:28,  1.58it/s, loss=0.0988, v_num=0, train/loss_simple_step=0.00815, train/loss_vlb_step=3.92e-5, train/loss_step=0.00815, global_step=3187.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▍    | 3266/5971 [34:22<28:27,  1.58it/s, loss=0.135, v_num=0, train/loss_simple_step=0.977, train/loss_vlb_step=0.492, train/loss_step=0.977, global_step=3187.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]       
Epoch 5:  55%|█████▍    | 3267/5971 [34:23<28:27,  1.58it/s, loss=0.135, v_num=0, train/loss_simple_step=0.977, train/loss_vlb_step=0.492, train/loss_step=0.977, global_step=3187.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▍    | 3267/5971 [34:23<28:27,  1.58it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00176, train/loss_vlb_step=9.99e-6, train/loss_step=0.00176, global_step=3187.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▍    | 3268/5971 [34:26<28:28,  1.58it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0977, train/loss_vlb_step=0.000321, train/loss_step=0.0977, global_step=3187.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  55%|█████▍    | 3269/5971 [34:27<28:28,  1.58it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.39e-6, train/loss_step=0.00156, global_step=3188.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▍    | 3270/5971 [34:28<28:27,  1.58it/s, loss=0.13, v_num=0, train/loss_simple_step=0.306, train/loss_vlb_step=0.00168, train/loss_step=0.306, global_step=3188.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:  55%|█████▍    | 3271/5971 [34:29<28:27,  1.58it/s, loss=0.13, v_num=0, train/loss_simple_step=0.306, train/loss_vlb_step=0.00168, train/loss_step=0.306, global_step=3188.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▍    | 3271/5971 [34:29<28:27,  1.58it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00213, train/loss_vlb_step=1.22e-5, train/loss_step=0.00213, global_step=3188.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▍    | 3272/5971 [34:31<28:28,  1.58it/s, loss=0.148, v_num=0, train/loss_simple_step=0.609, train/loss_vlb_step=0.0059, train/loss_step=0.609, global_step=3188.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  55%|█████▍    | 3273/5971 [34:32<28:27,  1.58it/s, loss=0.152, v_num=0, train/loss_simple_step=0.077, train/loss_vlb_step=0.000271, train/loss_step=0.077, global_step=3189.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▍    | 3274/5971 [34:33<28:27,  1.58it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00125, train/loss_vlb_step=7.47e-6, train/loss_step=0.00125, global_step=3189.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▍    | 3275/5971 [34:34<28:27,  1.58it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00125, train/loss_vlb_step=7.47e-6, train/loss_step=0.00125, global_step=3189.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▍    | 3275/5971 [34:34<28:27,  1.58it/s, loss=0.147, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.00045, train/loss_step=0.121, global_step=3189.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  55%|█████▍    | 3276/5971 [34:36<28:27,  1.58it/s, loss=0.167, v_num=0, train/loss_simple_step=0.608, train/loss_vlb_step=0.0102, train/loss_step=0.608, global_step=3189.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  55%|█████▍    | 3277/5971 [34:37<28:27,  1.58it/s, loss=0.174, v_num=0, train/loss_simple_step=0.212, train/loss_vlb_step=0.000779, train/loss_step=0.212, global_step=3190.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▍    | 3278/5971 [34:38<28:26,  1.58it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0299, train/loss_vlb_step=0.000116, train/loss_step=0.0299, global_step=3190.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▍    | 3279/5971 [34:39<28:26,  1.58it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0299, train/loss_vlb_step=0.000116, train/loss_step=0.0299, global_step=3190.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▍    | 3279/5971 [34:39<28:26,  1.58it/s, loss=0.18, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000499, train/loss_step=0.149, global_step=3190.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  55%|█████▍    | 3280/5971 [34:41<28:27,  1.58it/s, loss=0.194, v_num=0, train/loss_simple_step=0.556, train/loss_vlb_step=0.00804, train/loss_step=0.556, global_step=3190.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▍    | 3281/5971 [34:42<28:26,  1.58it/s, loss=0.212, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00246, train/loss_step=0.382, global_step=3191.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▍    | 3282/5971 [34:43<28:26,  1.58it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0256, train/loss_vlb_step=9.81e-5, train/loss_step=0.0256, global_step=3191.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▍    | 3283/5971 [34:44<28:26,  1.58it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0256, train/loss_vlb_step=9.81e-5, train/loss_step=0.0256, global_step=3191.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▍    | 3283/5971 [34:44<28:26,  1.58it/s, loss=0.215, v_num=0, train/loss_simple_step=0.092, train/loss_vlb_step=0.000302, train/loss_step=0.092, global_step=3191.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  55%|█████▍    | 3284/5971 [34:47<28:27,  1.57it/s, loss=0.228, v_num=0, train/loss_simple_step=0.303, train/loss_vlb_step=0.00142, train/loss_step=0.303, global_step=3191.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  55%|█████▌    | 3285/5971 [34:48<28:27,  1.57it/s, loss=0.231, v_num=0, train/loss_simple_step=0.0673, train/loss_vlb_step=0.000227, train/loss_step=0.0673, global_step=3192.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▌    | 3286/5971 [34:49<28:26,  1.57it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0237, train/loss_vlb_step=9.29e-5, train/loss_step=0.0237, global_step=3192.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  55%|█████▌    | 3287/5971 [34:50<28:26,  1.57it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0237, train/loss_vlb_step=9.29e-5, train/loss_step=0.0237, global_step=3192.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▌    | 3287/5971 [34:50<28:26,  1.57it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00139, train/loss_vlb_step=8.42e-6, train/loss_step=0.00139, global_step=3192.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▌    | 3288/5971 [34:52<28:26,  1.57it/s, loss=0.179, v_num=0, train/loss_simple_step=0.015, train/loss_vlb_step=6.39e-5, train/loss_step=0.015, global_step=3192.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  55%|█████▌    | 3289/5971 [34:53<28:26,  1.57it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0809, train/loss_vlb_step=0.000267, train/loss_step=0.0809, global_step=3193.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▌    | 3290/5971 [34:54<28:26,  1.57it/s, loss=0.177, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.000631, train/loss_step=0.187, global_step=3193.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  55%|█████▌    | 3291/5971 [34:55<28:25,  1.57it/s, loss=0.177, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.000631, train/loss_step=0.187, global_step=3193.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▌    | 3291/5971 [34:55<28:25,  1.57it/s, loss=0.191, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.00108, train/loss_step=0.283, global_step=3193.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  55%|█████▌    | 3292/5971 [34:57<28:26,  1.57it/s, loss=0.168, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000492, train/loss_step=0.149, global_step=3193.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▌    | 3293/5971 [34:58<28:25,  1.57it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0433, train/loss_vlb_step=0.00016, train/loss_step=0.0433, global_step=3194.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▌    | 3294/5971 [34:59<28:25,  1.57it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0916, train/loss_vlb_step=0.000303, train/loss_step=0.0916, global_step=3194.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▌    | 3295/5971 [35:00<28:25,  1.57it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0916, train/loss_vlb_step=0.000303, train/loss_step=0.0916, global_step=3194.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▌    | 3295/5971 [35:00<28:25,  1.57it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00338, train/loss_vlb_step=1.65e-5, train/loss_step=0.00338, global_step=3194.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▌    | 3296/5971 [35:02<28:25,  1.57it/s, loss=0.145, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000748, train/loss_step=0.203, global_step=3194.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  55%|█████▌    | 3297/5971 [35:03<28:25,  1.57it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00711, train/loss_vlb_step=3.42e-5, train/loss_step=0.00711, global_step=3195.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▌    | 3298/5971 [35:04<28:24,  1.57it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00153, train/loss_vlb_step=9.23e-6, train/loss_step=0.00153, global_step=3195.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▌    | 3299/5971 [35:04<28:24,  1.57it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00153, train/loss_vlb_step=9.23e-6, train/loss_step=0.00153, global_step=3195.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▌    | 3299/5971 [35:04<28:24,  1.57it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00294, train/loss_vlb_step=1.66e-5, train/loss_step=0.00294, global_step=3195.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▌    | 3300/5971 [35:07<28:25,  1.57it/s, loss=0.105, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000471, train/loss_step=0.142, global_step=3195.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  55%|█████▌    | 3301/5971 [35:08<28:24,  1.57it/s, loss=0.0876, v_num=0, train/loss_simple_step=0.0304, train/loss_vlb_step=0.000115, train/loss_step=0.0304, global_step=3196.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▌    | 3302/5971 [35:09<28:24,  1.57it/s, loss=0.0908, v_num=0, train/loss_simple_step=0.0894, train/loss_vlb_step=0.000297, train/loss_step=0.0894, global_step=3196.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▌    | 3303/5971 [35:09<28:23,  1.57it/s, loss=0.0908, v_num=0, train/loss_simple_step=0.0894, train/loss_vlb_step=0.000297, train/loss_step=0.0894, global_step=3196.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▌    | 3303/5971 [35:09<28:23,  1.57it/s, loss=0.0864, v_num=0, train/loss_simple_step=0.00376, train/loss_vlb_step=1.97e-5, train/loss_step=0.00376, global_step=3196.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▌    | 3304/5971 [35:12<28:24,  1.56it/s, loss=0.0724, v_num=0, train/loss_simple_step=0.0242, train/loss_vlb_step=9.37e-5, train/loss_step=0.0242, global_step=3196.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  55%|█████▌    | 3305/5971 [35:13<28:24,  1.56it/s, loss=0.0945, v_num=0, train/loss_simple_step=0.508, train/loss_vlb_step=0.00624, train/loss_step=0.508, global_step=3197.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  55%|█████▌    | 3306/5971 [35:14<28:23,  1.56it/s, loss=0.0939, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=5.27e-5, train/loss_step=0.0114, global_step=3197.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▌    | 3307/5971 [35:15<28:23,  1.56it/s, loss=0.0939, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=5.27e-5, train/loss_step=0.0114, global_step=3197.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▌    | 3307/5971 [35:15<28:23,  1.56it/s, loss=0.12, v_num=0, train/loss_simple_step=0.519, train/loss_vlb_step=0.00484, train/loss_step=0.519, global_step=3197.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  55%|█████▌    | 3308/5971 [35:17<28:23,  1.56it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0923, train/loss_vlb_step=0.000304, train/loss_step=0.0923, global_step=3197.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▌    | 3309/5971 [35:18<28:23,  1.56it/s, loss=0.13, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000831, train/loss_step=0.216, global_step=3198.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  55%|█████▌    | 3310/5971 [35:19<28:23,  1.56it/s, loss=0.131, v_num=0, train/loss_simple_step=0.191, train/loss_vlb_step=0.000739, train/loss_step=0.191, global_step=3198.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▌    | 3311/5971 [35:19<28:22,  1.56it/s, loss=0.131, v_num=0, train/loss_simple_step=0.191, train/loss_vlb_step=0.000739, train/loss_step=0.191, global_step=3198.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▌    | 3311/5971 [35:19<28:22,  1.56it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0974, train/loss_vlb_step=0.00032, train/loss_step=0.0974, global_step=3198.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▌    | 3312/5971 [35:22<28:23,  1.56it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00285, train/loss_vlb_step=1.58e-5, train/loss_step=0.00285, global_step=3198.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  55%|█████▌    | 3313/5971 [35:23<28:23,  1.56it/s, loss=0.119, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000468, train/loss_step=0.142, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  56%|█████▌    | 3314/5971 [35:24<28:22,  1.56it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0863, train/loss_vlb_step=0.000285, train/loss_step=0.0863, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  56%|█████▌    | 3315/5971 [35:25<28:22,  1.56it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0863, train/loss_vlb_step=0.000285, train/loss_step=0.0863, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  56%|█████▌    | 3315/5971 [35:25<28:22,  1.56it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0205, train/loss_vlb_step=7.94e-5, train/loss_step=0.0205, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  56%|█████▌    | 3316/5971 [35:27<28:23,  1.56it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:19,  2.08it/s]
Epoch 5:  56%|█████▌    | 3319/5971 [35:28<28:20,  1.56it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   2%|▏         | 3/167 [00:00<00:28,  5.78it/s]

Validating:   4%|▎         | 6/167 [00:00<00:14, 11.15it/s]
Epoch 5:  56%|█████▌    | 3323/5971 [35:28<28:15,  1.56it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   5%|▌         | 9/167 [00:00<00:10, 15.60it/s]
Epoch 5:  56%|█████▌    | 3327/5971 [35:28<28:11,  1.56it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   7%|▋         | 12/167 [00:00<00:08, 18.42it/s]
Epoch 5:  56%|█████▌    | 3331/5971 [35:28<28:06,  1.57it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   9%|▉         | 15/167 [00:01<00:07, 20.27it/s]

Validating:  11%|█         | 18/167 [00:01<00:06, 22.76it/s]
Epoch 5:  56%|█████▌    | 3335/5971 [35:29<28:02,  1.57it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 24.01it/s]
Epoch 5:  56%|█████▌    | 3339/5971 [35:29<27:57,  1.57it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  14%|█▍        | 24/167 [00:01<00:06, 21.50it/s]
Epoch 5:  56%|█████▌    | 3343/5971 [35:29<27:53,  1.57it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  16%|█▌        | 27/167 [00:01<00:06, 22.69it/s]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 23.89it/s]
Epoch 5:  56%|█████▌    | 3347/5971 [35:29<27:49,  1.57it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  20%|█▉        | 33/167 [00:02<00:11, 12.05it/s]
Epoch 5:  56%|█████▌    | 3351/5971 [35:30<27:44,  1.57it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  21%|██        | 35/167 [00:02<00:10, 13.19it/s]

Validating:  22%|██▏       | 37/167 [00:02<00:10, 12.22it/s]
Epoch 5:  56%|█████▌    | 3355/5971 [35:30<27:40,  1.58it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  23%|██▎       | 39/167 [00:03<00:16,  7.66it/s][A

Validating:  25%|██▍       | 41/167 [00:03<00:17,  7.39it/s][A
Epoch 5:  56%|█████▋    | 3359/5971 [35:31<27:36,  1.58it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  26%|██▌       | 43/167 [00:03<00:14,  8.79it/s][A

Validating:  27%|██▋       | 45/167 [00:03<00:12,  9.92it/s][A
Epoch 5:  56%|█████▋    | 3363/5971 [35:31<27:32,  1.58it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  28%|██▊       | 47/167 [00:03<00:13,  9.04it/s][A

Validating:  29%|██▉       | 49/167 [00:04<00:13,  8.46it/s][A
Epoch 5:  56%|█████▋    | 3367/5971 [35:32<27:28,  1.58it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  31%|███       | 52/167 [00:04<00:09, 11.64it/s][A
Epoch 5:  56%|█████▋    | 3371/5971 [35:32<27:24,  1.58it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  34%|███▎      | 56/167 [00:04<00:07, 15.65it/s][A
Epoch 5:  57%|█████▋    | 3375/5971 [35:32<27:19,  1.58it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  36%|███▌      | 60/167 [00:04<00:05, 19.29it/s][A
Epoch 5:  57%|█████▋    | 3379/5971 [35:32<27:15,  1.59it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  38%|███▊      | 64/167 [00:04<00:04, 22.15it/s][A
Epoch 5:  57%|█████▋    | 3383/5971 [35:32<27:10,  1.59it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  41%|████      | 68/167 [00:04<00:04, 24.52it/s][A
Epoch 5:  57%|█████▋    | 3387/5971 [35:32<27:06,  1.59it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  43%|████▎     | 71/167 [00:04<00:03, 24.49it/s][A

Validating:  44%|████▍     | 74/167 [00:05<00:03, 25.01it/s][A
Epoch 5:  57%|█████▋    | 3391/5971 [35:32<27:02,  1.59it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  46%|████▌     | 77/167 [00:05<00:03, 24.60it/s][A
Epoch 5:  57%|█████▋    | 3395/5971 [35:33<26:57,  1.59it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  48%|████▊     | 80/167 [00:05<00:03, 24.37it/s][A
Epoch 5:  57%|█████▋    | 3399/5971 [35:33<26:53,  1.59it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  50%|████▉     | 83/167 [00:05<00:03, 25.21it/s][A

Validating:  51%|█████▏    | 86/167 [00:05<00:03, 25.30it/s][A
Epoch 5:  57%|█████▋    | 3403/5971 [35:33<26:49,  1.60it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  54%|█████▍    | 90/167 [00:05<00:02, 27.06it/s][A
Epoch 5:  57%|█████▋    | 3407/5971 [35:33<26:45,  1.60it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  56%|█████▌    | 93/167 [00:05<00:02, 26.08it/s][A
Epoch 5:  57%|█████▋    | 3411/5971 [35:33<26:40,  1.60it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  57%|█████▋    | 96/167 [00:05<00:02, 26.84it/s][A
Epoch 5:  57%|█████▋    | 3415/5971 [35:33<26:36,  1.60it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  59%|█████▉    | 99/167 [00:05<00:02, 27.35it/s][A

Validating:  61%|██████    | 102/167 [00:06<00:02, 27.14it/s][A
Epoch 5:  57%|█████▋    | 3419/5971 [35:33<26:32,  1.60it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  63%|██████▎   | 105/167 [00:06<00:02, 26.96it/s][A
Epoch 5:  57%|█████▋    | 3423/5971 [35:34<26:28,  1.60it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  65%|██████▍   | 108/167 [00:06<00:02, 27.26it/s][A
Epoch 5:  57%|█████▋    | 3427/5971 [35:34<26:23,  1.61it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  66%|██████▋   | 111/167 [00:06<00:02, 25.88it/s][A

Validating:  68%|██████▊   | 114/167 [00:06<00:02, 26.31it/s][A
Epoch 5:  57%|█████▋    | 3431/5971 [35:34<26:19,  1.61it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  70%|███████   | 117/167 [00:06<00:01, 26.71it/s][A
Epoch 5:  58%|█████▊    | 3435/5971 [35:34<26:15,  1.61it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  72%|███████▏  | 120/167 [00:06<00:01, 25.75it/s][A
Epoch 5:  58%|█████▊    | 3439/5971 [35:34<26:11,  1.61it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  74%|███████▍  | 124/167 [00:06<00:01, 27.40it/s][A
Epoch 5:  58%|█████▊    | 3443/5971 [35:34<26:07,  1.61it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  76%|███████▌  | 127/167 [00:07<00:01, 26.52it/s][A

Validating:  78%|███████▊  | 130/167 [00:07<00:01, 27.26it/s][A
Epoch 5:  58%|█████▊    | 3447/5971 [35:34<26:02,  1.62it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  80%|███████▉  | 133/167 [00:07<00:01, 26.91it/s][A
Epoch 5:  58%|█████▊    | 3451/5971 [35:35<25:58,  1.62it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  81%|████████▏ | 136/167 [00:07<00:01, 27.15it/s][A
Epoch 5:  58%|█████▊    | 3455/5971 [35:35<25:54,  1.62it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  83%|████████▎ | 139/167 [00:07<00:01, 26.26it/s][A

Validating:  85%|████████▌ | 142/167 [00:07<00:00, 26.17it/s][A
Epoch 5:  58%|█████▊    | 3459/5971 [35:35<25:50,  1.62it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  87%|████████▋ | 145/167 [00:07<00:00, 22.26it/s][A
Epoch 5:  58%|█████▊    | 3463/5971 [35:35<25:46,  1.62it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  89%|████████▊ | 148/167 [00:07<00:00, 22.34it/s][A
Epoch 5:  58%|█████▊    | 3467/5971 [35:35<25:42,  1.62it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  90%|█████████ | 151/167 [00:08<00:00, 22.63it/s][A

Validating:  92%|█████████▏| 154/167 [00:08<00:00, 23.67it/s][A
Epoch 5:  58%|█████▊    | 3471/5971 [35:35<25:38,  1.63it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  95%|█████████▍| 158/167 [00:08<00:00, 25.51it/s][A
Epoch 5:  58%|█████▊    | 3475/5971 [35:36<25:33,  1.63it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  96%|█████████▋| 161/167 [00:08<00:00, 25.66it/s][A
Epoch 5:  58%|█████▊    | 3479/5971 [35:36<25:29,  1.63it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  98%|█████████▊| 164/167 [00:08<00:00, 24.99it/s][A
Epoch 5:  58%|█████▊    | 3483/5971 [35:36<25:25,  1.63it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating: 100%|██████████| 167/167 [00:08<00:00, 25.74it/s][A
Epoch 5:  58%|█████▊    | 3484/5971 [35:36<25:24,  1.63it/s, loss=0.128, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3199.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.35it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.37it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.14it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.72it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.12it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.42it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.71it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.93it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.05it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.16it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.17it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.17it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:06,  5.30it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.39it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.47it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.53it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.57it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.59it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.56it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.58it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.54it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.47it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.42it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.44it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.42it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.41it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.40it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.27it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.31it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.33it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.33it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.34it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.35it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.42it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.50it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.52it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.51it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.54it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.56it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.55it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.51it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.53it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.55it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.38it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.37it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.37it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.41it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.46it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.52it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.50it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.11it/s]

Epoch 5:  58%|█████▊    | 3485/5971 [35:48<25:32,  1.62it/s, loss=0.147, v_num=0, train/loss_simple_step=0.397, train/loss_vlb_step=0.00211, train/loss_step=0.397, global_step=3200.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.35it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.43it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.27it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.91it/s][A
Epoch 5:  58%|█████▊    | 3485/5971 [35:51<25:34,  1.62it/s, loss=0.147, v_num=0, train/loss_simple_step=0.397, train/loss_vlb_step=0.00211, train/loss_step=0.397, global_step=3200.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.39it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.74it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.00it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.18it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.20it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.34it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.39it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.45it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.41it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.36it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.32it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.30it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.29it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:06,  5.29it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.29it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.11it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.08it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.10it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.18it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:05,  5.07it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.14it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.24it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:05,  4.46it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  4.64it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:06<00:04,  4.81it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:04,  4.94it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.03it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.10it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.16it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:07<00:03,  5.20it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.16it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.17it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.18it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.22it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:08<00:02,  5.22it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.21it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.22it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.25it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.17it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.09it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:00,  5.15it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.27it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.35it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.41it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.45it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  5.49it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  4.96it/s]

Epoch 5:  58%|█████▊    | 3486/5971 [36:01<25:40,  1.61it/s, loss=0.147, v_num=0, train/loss_simple_step=0.397, train/loss_vlb_step=0.00211, train/loss_step=0.397, global_step=3200.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  58%|█████▊    | 3486/5971 [36:01<25:40,  1.61it/s, loss=0.154, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000507, train/loss_step=0.145, global_step=3200.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.42it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.27it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.90it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.39it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.72it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.98it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.16it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.30it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.39it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.45it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.48it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.52it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.54it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.55it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.57it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.58it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.60it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.62it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.63it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.64it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.65it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.61it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.61it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.62it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.61it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.62it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.64it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.62it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.63it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.63it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.63it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.65it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.65it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.53it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.43it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.37it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.32it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.29it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.25it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.31it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.41it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.47it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.51it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.53it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.53it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.55it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.58it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.59it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.61it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.23it/s]

Epoch 5:  58%|█████▊    | 3487/5971 [36:13<25:47,  1.60it/s, loss=0.154, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000507, train/loss_step=0.145, global_step=3200.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  58%|█████▊    | 3487/5971 [36:13<25:47,  1.60it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0281, train/loss_vlb_step=0.000111, train/loss_step=0.0281, global_step=3200.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.33it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.41it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.27it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.91it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.39it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.74it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.94it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.07it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.20it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.29it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.38it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.44it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.46it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.47it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.49it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.51it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.52it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.52it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.52it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.52it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.53it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.53it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.54it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.55it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.53it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.48it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.47it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.49it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.51it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.50it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.47it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.48it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.50it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.51it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.54it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.55it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.57it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.57it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.53it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.38it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.28it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.26it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.32it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.28it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.35it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.29it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.28it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.37it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.45it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.53it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.16it/s]

Epoch 5:  58%|█████▊    | 3488/5971 [36:27<25:56,  1.60it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0281, train/loss_vlb_step=0.000111, train/loss_step=0.0281, global_step=3200.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  58%|█████▊    | 3488/5971 [36:27<25:56,  1.60it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0269, train/loss_vlb_step=0.000104, train/loss_step=0.0269, global_step=3200.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  58%|█████▊    | 3489/5971 [36:28<25:56,  1.59it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0269, train/loss_vlb_step=0.000104, train/loss_step=0.0269, global_step=3200.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  58%|█████▊    | 3489/5971 [36:28<25:56,  1.59it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00305, train/loss_vlb_step=1.66e-5, train/loss_step=0.00305, global_step=3201.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  58%|█████▊    | 3490/5971 [36:29<25:55,  1.59it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00305, train/loss_vlb_step=1.66e-5, train/loss_step=0.00305, global_step=3201.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  58%|█████▊    | 3490/5971 [36:29<25:55,  1.59it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=3201.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  58%|█████▊    | 3491/5971 [36:30<25:55,  1.59it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.3e-5, train/loss_step=0.00216, global_step=3201.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  58%|█████▊    | 3491/5971 [36:30<25:55,  1.59it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00442, train/loss_vlb_step=2.17e-5, train/loss_step=0.00442, global_step=3201.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  58%|█████▊    | 3492/5971 [36:32<25:55,  1.59it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00442, train/loss_vlb_step=2.17e-5, train/loss_step=0.00442, global_step=3201.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  58%|█████▊    | 3492/5971 [36:32<25:55,  1.59it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00933, train/loss_vlb_step=4.17e-5, train/loss_step=0.00933, global_step=3201.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  58%|█████▊    | 3493/5971 [36:33<25:55,  1.59it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00933, train/loss_vlb_step=4.17e-5, train/loss_step=0.00933, global_step=3201.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  58%|█████▊    | 3493/5971 [36:33<25:55,  1.59it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00529, train/loss_vlb_step=2.62e-5, train/loss_step=0.00529, global_step=3202.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▊    | 3494/5971 [36:34<25:54,  1.59it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00529, train/loss_vlb_step=2.62e-5, train/loss_step=0.00529, global_step=3202.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▊    | 3494/5971 [36:34<25:54,  1.59it/s, loss=0.123, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000378, train/loss_step=0.115, global_step=3202.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  59%|█████▊    | 3495/5971 [36:34<25:54,  1.59it/s, loss=0.123, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000378, train/loss_step=0.115, global_step=3202.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▊    | 3495/5971 [36:34<25:54,  1.59it/s, loss=0.125, v_num=0, train/loss_simple_step=0.553, train/loss_vlb_step=0.00449, train/loss_step=0.553, global_step=3202.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  59%|█████▊    | 3496/5971 [36:37<25:55,  1.59it/s, loss=0.125, v_num=0, train/loss_simple_step=0.553, train/loss_vlb_step=0.00449, train/loss_step=0.553, global_step=3202.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▊    | 3496/5971 [36:37<25:55,  1.59it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0455, train/loss_vlb_step=0.00017, train/loss_step=0.0455, global_step=3202.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▊    | 3497/5971 [36:38<25:54,  1.59it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0455, train/loss_vlb_step=0.00017, train/loss_step=0.0455, global_step=3202.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▊    | 3497/5971 [36:38<25:54,  1.59it/s, loss=0.126, v_num=0, train/loss_simple_step=0.285, train/loss_vlb_step=0.00116, train/loss_step=0.285, global_step=3203.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  59%|█████▊    | 3498/5971 [36:39<25:54,  1.59it/s, loss=0.126, v_num=0, train/loss_simple_step=0.285, train/loss_vlb_step=0.00116, train/loss_step=0.285, global_step=3203.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▊    | 3498/5971 [36:39<25:54,  1.59it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00349, train/loss_vlb_step=1.89e-5, train/loss_step=0.00349, global_step=3203.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▊    | 3499/5971 [36:40<25:53,  1.59it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00349, train/loss_vlb_step=1.89e-5, train/loss_step=0.00349, global_step=3203.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▊    | 3499/5971 [36:40<25:53,  1.59it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00738, train/loss_vlb_step=3.65e-5, train/loss_step=0.00738, global_step=3203.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▊    | 3500/5971 [36:42<25:54,  1.59it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00738, train/loss_vlb_step=3.65e-5, train/loss_step=0.00738, global_step=3203.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▊    | 3500/5971 [36:42<25:54,  1.59it/s, loss=0.13, v_num=0, train/loss_simple_step=0.357, train/loss_vlb_step=0.00166, train/loss_step=0.357, global_step=3203.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:  59%|█████▊    | 3501/5971 [36:43<25:53,  1.59it/s, loss=0.13, v_num=0, train/loss_simple_step=0.357, train/loss_vlb_step=0.00166, train/loss_step=0.357, global_step=3203.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▊    | 3501/5971 [36:43<25:53,  1.59it/s, loss=0.132, v_num=0, train/loss_simple_step=0.182, train/loss_vlb_step=0.000642, train/loss_step=0.182, global_step=3204.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▊    | 3502/5971 [36:43<25:53,  1.59it/s, loss=0.132, v_num=0, train/loss_simple_step=0.182, train/loss_vlb_step=0.000642, train/loss_step=0.182, global_step=3204.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▊    | 3502/5971 [36:43<25:53,  1.59it/s, loss=0.138, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000693, train/loss_step=0.201, global_step=3204.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▊    | 3503/5971 [36:45<25:53,  1.59it/s, loss=0.138, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000693, train/loss_step=0.201, global_step=3204.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▊    | 3503/5971 [36:45<25:53,  1.59it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0887, train/loss_vlb_step=0.000293, train/loss_step=0.0887, global_step=3204.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▊    | 3504/5971 [36:47<25:53,  1.59it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0887, train/loss_vlb_step=0.000293, train/loss_step=0.0887, global_step=3204.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▊    | 3504/5971 [36:47<25:53,  1.59it/s, loss=0.143, v_num=0, train/loss_simple_step=0.410, train/loss_vlb_step=0.00283, train/loss_step=0.410, global_step=3204.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  59%|█████▊    | 3505/5971 [36:48<25:53,  1.59it/s, loss=0.143, v_num=0, train/loss_simple_step=0.410, train/loss_vlb_step=0.00283, train/loss_step=0.410, global_step=3204.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▊    | 3505/5971 [36:48<25:53,  1.59it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0197, train/loss_vlb_step=8.01e-5, train/loss_step=0.0197, global_step=3205.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▊    | 3506/5971 [36:48<25:52,  1.59it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0197, train/loss_vlb_step=8.01e-5, train/loss_step=0.0197, global_step=3205.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▊    | 3506/5971 [36:48<25:52,  1.59it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0939, train/loss_vlb_step=0.000314, train/loss_step=0.0939, global_step=3205.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▊    | 3507/5971 [36:49<25:52,  1.59it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0939, train/loss_vlb_step=0.000314, train/loss_step=0.0939, global_step=3205.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▊    | 3507/5971 [36:49<25:52,  1.59it/s, loss=0.136, v_num=0, train/loss_simple_step=0.304, train/loss_vlb_step=0.00133, train/loss_step=0.304, global_step=3205.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  59%|█████▉    | 3508/5971 [36:51<25:52,  1.59it/s, loss=0.136, v_num=0, train/loss_simple_step=0.304, train/loss_vlb_step=0.00133, train/loss_step=0.304, global_step=3205.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3508/5971 [36:51<25:52,  1.59it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00256, train/loss_vlb_step=1.41e-5, train/loss_step=0.00256, global_step=3205.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3509/5971 [36:52<25:52,  1.59it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00256, train/loss_vlb_step=1.41e-5, train/loss_step=0.00256, global_step=3205.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3509/5971 [36:52<25:52,  1.59it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0229, train/loss_vlb_step=9.38e-5, train/loss_step=0.0229, global_step=3206.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  59%|█████▉    | 3510/5971 [36:53<25:51,  1.59it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0229, train/loss_vlb_step=9.38e-5, train/loss_step=0.0229, global_step=3206.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3510/5971 [36:53<25:51,  1.59it/s, loss=0.142, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000421, train/loss_step=0.126, global_step=3206.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  59%|█████▉    | 3511/5971 [36:54<25:51,  1.59it/s, loss=0.142, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000421, train/loss_step=0.126, global_step=3206.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3511/5971 [36:54<25:51,  1.59it/s, loss=0.143, v_num=0, train/loss_simple_step=0.037, train/loss_vlb_step=0.00014, train/loss_step=0.037, global_step=3206.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  59%|█████▉    | 3512/5971 [36:56<25:51,  1.58it/s, loss=0.143, v_num=0, train/loss_simple_step=0.037, train/loss_vlb_step=0.00014, train/loss_step=0.037, global_step=3206.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3512/5971 [36:56<25:51,  1.58it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0148, train/loss_vlb_step=6.59e-5, train/loss_step=0.0148, global_step=3206.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3513/5971 [36:57<25:51,  1.58it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0148, train/loss_vlb_step=6.59e-5, train/loss_step=0.0148, global_step=3206.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3513/5971 [36:57<25:51,  1.58it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0197, train/loss_vlb_step=7.89e-5, train/loss_step=0.0197, global_step=3207.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3514/5971 [36:58<25:50,  1.58it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0197, train/loss_vlb_step=7.89e-5, train/loss_step=0.0197, global_step=3207.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3514/5971 [36:58<25:50,  1.58it/s, loss=0.141, v_num=0, train/loss_simple_step=0.048, train/loss_vlb_step=0.000165, train/loss_step=0.048, global_step=3207.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  59%|█████▉    | 3515/5971 [36:59<25:50,  1.58it/s, loss=0.141, v_num=0, train/loss_simple_step=0.048, train/loss_vlb_step=0.000165, train/loss_step=0.048, global_step=3207.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3515/5971 [36:59<25:50,  1.58it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0881, train/loss_vlb_step=0.000294, train/loss_step=0.0881, global_step=3207.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3516/5971 [37:02<25:51,  1.58it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0881, train/loss_vlb_step=0.000294, train/loss_step=0.0881, global_step=3207.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3516/5971 [37:02<25:51,  1.58it/s, loss=0.117, v_num=0, train/loss_simple_step=0.037, train/loss_vlb_step=0.000126, train/loss_step=0.037, global_step=3207.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  59%|█████▉    | 3517/5971 [37:03<25:50,  1.58it/s, loss=0.117, v_num=0, train/loss_simple_step=0.037, train/loss_vlb_step=0.000126, train/loss_step=0.037, global_step=3207.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3517/5971 [37:03<25:50,  1.58it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00161, train/loss_vlb_step=9.64e-6, train/loss_step=0.00161, global_step=3208.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3518/5971 [37:03<25:50,  1.58it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00161, train/loss_vlb_step=9.64e-6, train/loss_step=0.00161, global_step=3208.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3518/5971 [37:03<25:50,  1.58it/s, loss=0.122, v_num=0, train/loss_simple_step=0.385, train/loss_vlb_step=0.00194, train/loss_step=0.385, global_step=3208.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  59%|█████▉    | 3519/5971 [37:04<25:49,  1.58it/s, loss=0.122, v_num=0, train/loss_simple_step=0.385, train/loss_vlb_step=0.00194, train/loss_step=0.385, global_step=3208.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3519/5971 [37:04<25:49,  1.58it/s, loss=0.127, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000334, train/loss_step=0.101, global_step=3208.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3520/5971 [37:06<25:50,  1.58it/s, loss=0.127, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000334, train/loss_step=0.101, global_step=3208.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3520/5971 [37:06<25:50,  1.58it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00167, train/loss_vlb_step=8.63e-6, train/loss_step=0.00167, global_step=3208.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3521/5971 [37:07<25:49,  1.58it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00167, train/loss_vlb_step=8.63e-6, train/loss_step=0.00167, global_step=3208.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3521/5971 [37:07<25:49,  1.58it/s, loss=0.129, v_num=0, train/loss_simple_step=0.568, train/loss_vlb_step=0.00452, train/loss_step=0.568, global_step=3209.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  59%|█████▉    | 3522/5971 [37:08<25:49,  1.58it/s, loss=0.129, v_num=0, train/loss_simple_step=0.568, train/loss_vlb_step=0.00452, train/loss_step=0.568, global_step=3209.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3522/5971 [37:08<25:49,  1.58it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0645, train/loss_vlb_step=0.000218, train/loss_step=0.0645, global_step=3209.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3523/5971 [37:09<25:48,  1.58it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0645, train/loss_vlb_step=0.000218, train/loss_step=0.0645, global_step=3209.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3523/5971 [37:09<25:48,  1.58it/s, loss=0.141, v_num=0, train/loss_simple_step=0.467, train/loss_vlb_step=0.00252, train/loss_step=0.467, global_step=3209.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  59%|█████▉    | 3524/5971 [37:11<25:49,  1.58it/s, loss=0.141, v_num=0, train/loss_simple_step=0.467, train/loss_vlb_step=0.00252, train/loss_step=0.467, global_step=3209.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3524/5971 [37:11<25:49,  1.58it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0309, train/loss_vlb_step=0.000114, train/loss_step=0.0309, global_step=3209.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3525/5971 [37:12<25:48,  1.58it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0309, train/loss_vlb_step=0.000114, train/loss_step=0.0309, global_step=3209.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3525/5971 [37:12<25:48,  1.58it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00314, train/loss_vlb_step=1.74e-5, train/loss_step=0.00314, global_step=3210.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3526/5971 [37:13<25:48,  1.58it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00314, train/loss_vlb_step=1.74e-5, train/loss_step=0.00314, global_step=3210.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3526/5971 [37:13<25:48,  1.58it/s, loss=0.122, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000382, train/loss_step=0.116, global_step=3210.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  59%|█████▉    | 3527/5971 [37:14<25:47,  1.58it/s, loss=0.122, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000382, train/loss_step=0.116, global_step=3210.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3527/5971 [37:14<25:47,  1.58it/s, loss=0.115, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000555, train/loss_step=0.165, global_step=3210.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3528/5971 [37:16<25:48,  1.58it/s, loss=0.115, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000555, train/loss_step=0.165, global_step=3210.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3528/5971 [37:16<25:48,  1.58it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00104, train/loss_vlb_step=6.24e-6, train/loss_step=0.00104, global_step=3210.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3529/5971 [37:17<25:47,  1.58it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00104, train/loss_vlb_step=6.24e-6, train/loss_step=0.00104, global_step=3210.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3529/5971 [37:17<25:47,  1.58it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0287, train/loss_vlb_step=0.000106, train/loss_step=0.0287, global_step=3211.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  59%|█████▉    | 3530/5971 [37:18<25:47,  1.58it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0287, train/loss_vlb_step=0.000106, train/loss_step=0.0287, global_step=3211.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3530/5971 [37:18<25:47,  1.58it/s, loss=0.121, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.00106, train/loss_step=0.250, global_step=3211.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  59%|█████▉    | 3531/5971 [37:19<25:47,  1.58it/s, loss=0.121, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.00106, train/loss_step=0.250, global_step=3211.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3531/5971 [37:19<25:47,  1.58it/s, loss=0.127, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000525, train/loss_step=0.157, global_step=3211.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3532/5971 [37:21<25:47,  1.58it/s, loss=0.127, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000525, train/loss_step=0.157, global_step=3211.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3532/5971 [37:21<25:47,  1.58it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0536, train/loss_vlb_step=0.000185, train/loss_step=0.0536, global_step=3211.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3533/5971 [37:22<25:47,  1.58it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0536, train/loss_vlb_step=0.000185, train/loss_step=0.0536, global_step=3211.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3533/5971 [37:22<25:47,  1.58it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00402, train/loss_vlb_step=2.21e-5, train/loss_step=0.00402, global_step=3212.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3534/5971 [37:23<25:46,  1.58it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00402, train/loss_vlb_step=2.21e-5, train/loss_step=0.00402, global_step=3212.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3534/5971 [37:23<25:46,  1.58it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00513, train/loss_vlb_step=2.65e-5, train/loss_step=0.00513, global_step=3212.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3535/5971 [37:24<25:46,  1.58it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00513, train/loss_vlb_step=2.65e-5, train/loss_step=0.00513, global_step=3212.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3535/5971 [37:24<25:46,  1.58it/s, loss=0.124, v_num=0, train/loss_simple_step=0.049, train/loss_vlb_step=0.000167, train/loss_step=0.049, global_step=3212.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  59%|█████▉    | 3536/5971 [37:26<25:46,  1.57it/s, loss=0.124, v_num=0, train/loss_simple_step=0.049, train/loss_vlb_step=0.000167, train/loss_step=0.049, global_step=3212.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3536/5971 [37:26<25:46,  1.57it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000105, train/loss_step=0.0285, global_step=3212.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3537/5971 [37:27<25:46,  1.57it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000105, train/loss_step=0.0285, global_step=3212.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3537/5971 [37:27<25:46,  1.57it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00514, train/loss_vlb_step=2.63e-5, train/loss_step=0.00514, global_step=3213.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3538/5971 [37:28<25:45,  1.57it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00514, train/loss_vlb_step=2.63e-5, train/loss_step=0.00514, global_step=3213.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3538/5971 [37:28<25:45,  1.57it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0369, train/loss_vlb_step=0.000134, train/loss_step=0.0369, global_step=3213.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  59%|█████▉    | 3539/5971 [37:29<25:45,  1.57it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0369, train/loss_vlb_step=0.000134, train/loss_step=0.0369, global_step=3213.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3539/5971 [37:29<25:45,  1.57it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00398, train/loss_vlb_step=2.06e-5, train/loss_step=0.00398, global_step=3213.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3540/5971 [37:32<25:46,  1.57it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00398, train/loss_vlb_step=2.06e-5, train/loss_step=0.00398, global_step=3213.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3540/5971 [37:32<25:46,  1.57it/s, loss=0.113, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000898, train/loss_step=0.216, global_step=3213.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  59%|█████▉    | 3541/5971 [37:33<25:45,  1.57it/s, loss=0.113, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000898, train/loss_step=0.216, global_step=3213.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3541/5971 [37:33<25:45,  1.57it/s, loss=0.087, v_num=0, train/loss_simple_step=0.0544, train/loss_vlb_step=0.000193, train/loss_step=0.0544, global_step=3214.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3542/5971 [37:33<25:45,  1.57it/s, loss=0.087, v_num=0, train/loss_simple_step=0.0544, train/loss_vlb_step=0.000193, train/loss_step=0.0544, global_step=3214.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3542/5971 [37:33<25:45,  1.57it/s, loss=0.084, v_num=0, train/loss_simple_step=0.00478, train/loss_vlb_step=2.53e-5, train/loss_step=0.00478, global_step=3214.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3543/5971 [37:34<25:44,  1.57it/s, loss=0.084, v_num=0, train/loss_simple_step=0.00478, train/loss_vlb_step=2.53e-5, train/loss_step=0.00478, global_step=3214.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3543/5971 [37:34<25:44,  1.57it/s, loss=0.0933, v_num=0, train/loss_simple_step=0.652, train/loss_vlb_step=0.00882, train/loss_step=0.652, global_step=3214.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  59%|█████▉    | 3544/5971 [37:36<25:45,  1.57it/s, loss=0.0933, v_num=0, train/loss_simple_step=0.652, train/loss_vlb_step=0.00882, train/loss_step=0.652, global_step=3214.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3544/5971 [37:36<25:45,  1.57it/s, loss=0.105, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00104, train/loss_step=0.264, global_step=3214.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  59%|█████▉    | 3545/5971 [37:37<25:44,  1.57it/s, loss=0.105, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00104, train/loss_step=0.264, global_step=3214.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3545/5971 [37:37<25:44,  1.57it/s, loss=0.105, v_num=0, train/loss_simple_step=0.00796, train/loss_vlb_step=3.86e-5, train/loss_step=0.00796, global_step=3215.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3546/5971 [37:38<25:44,  1.57it/s, loss=0.105, v_num=0, train/loss_simple_step=0.00796, train/loss_vlb_step=3.86e-5, train/loss_step=0.00796, global_step=3215.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3546/5971 [37:38<25:44,  1.57it/s, loss=0.122, v_num=0, train/loss_simple_step=0.461, train/loss_vlb_step=0.00382, train/loss_step=0.461, global_step=3215.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  59%|█████▉    | 3547/5971 [37:39<25:43,  1.57it/s, loss=0.122, v_num=0, train/loss_simple_step=0.461, train/loss_vlb_step=0.00382, train/loss_step=0.461, global_step=3215.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3547/5971 [37:39<25:43,  1.57it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0242, train/loss_vlb_step=0.0001, train/loss_step=0.0242, global_step=3215.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3548/5971 [37:41<25:44,  1.57it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0242, train/loss_vlb_step=0.0001, train/loss_step=0.0242, global_step=3215.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3548/5971 [37:41<25:44,  1.57it/s, loss=0.128, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.00113, train/loss_step=0.250, global_step=3215.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  59%|█████▉    | 3549/5971 [37:42<25:43,  1.57it/s, loss=0.128, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.00113, train/loss_step=0.250, global_step=3215.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3549/5971 [37:42<25:43,  1.57it/s, loss=0.131, v_num=0, train/loss_simple_step=0.089, train/loss_vlb_step=0.000295, train/loss_step=0.089, global_step=3216.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3550/5971 [37:43<25:43,  1.57it/s, loss=0.131, v_num=0, train/loss_simple_step=0.089, train/loss_vlb_step=0.000295, train/loss_step=0.089, global_step=3216.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3550/5971 [37:43<25:43,  1.57it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0567, train/loss_vlb_step=0.000191, train/loss_step=0.0567, global_step=3216.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3551/5971 [37:44<25:42,  1.57it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0567, train/loss_vlb_step=0.000191, train/loss_step=0.0567, global_step=3216.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3551/5971 [37:44<25:42,  1.57it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00287, train/loss_vlb_step=1.57e-5, train/loss_step=0.00287, global_step=3216.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3552/5971 [37:46<25:43,  1.57it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00287, train/loss_vlb_step=1.57e-5, train/loss_step=0.00287, global_step=3216.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  59%|█████▉    | 3552/5971 [37:46<25:43,  1.57it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000228, train/loss_step=0.0655, global_step=3216.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  60%|█████▉    | 3553/5971 [37:47<25:42,  1.57it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000228, train/loss_step=0.0655, global_step=3216.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3553/5971 [37:47<25:42,  1.57it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00277, train/loss_vlb_step=1.6e-5, train/loss_step=0.00277, global_step=3217.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3554/5971 [37:48<25:42,  1.57it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00277, train/loss_vlb_step=1.6e-5, train/loss_step=0.00277, global_step=3217.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3554/5971 [37:48<25:42,  1.57it/s, loss=0.121, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000461, train/loss_step=0.138, global_step=3217.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  60%|█████▉    | 3555/5971 [37:49<25:41,  1.57it/s, loss=0.121, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000461, train/loss_step=0.138, global_step=3217.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3555/5971 [37:49<25:41,  1.57it/s, loss=0.132, v_num=0, train/loss_simple_step=0.273, train/loss_vlb_step=0.00131, train/loss_step=0.273, global_step=3217.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  60%|█████▉    | 3556/5971 [37:51<25:42,  1.57it/s, loss=0.132, v_num=0, train/loss_simple_step=0.273, train/loss_vlb_step=0.00131, train/loss_step=0.273, global_step=3217.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3556/5971 [37:51<25:42,  1.57it/s, loss=0.145, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00118, train/loss_step=0.296, global_step=3217.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3557/5971 [37:52<25:42,  1.57it/s, loss=0.145, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00118, train/loss_step=0.296, global_step=3217.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3557/5971 [37:52<25:42,  1.57it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0168, train/loss_vlb_step=6.97e-5, train/loss_step=0.0168, global_step=3218.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3558/5971 [37:53<25:41,  1.57it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0168, train/loss_vlb_step=6.97e-5, train/loss_step=0.0168, global_step=3218.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3558/5971 [37:53<25:41,  1.57it/s, loss=0.173, v_num=0, train/loss_simple_step=0.590, train/loss_vlb_step=0.0165, train/loss_step=0.590, global_step=3218.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  60%|█████▉    | 3559/5971 [37:54<25:41,  1.57it/s, loss=0.173, v_num=0, train/loss_simple_step=0.590, train/loss_vlb_step=0.0165, train/loss_step=0.590, global_step=3218.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3559/5971 [37:54<25:41,  1.57it/s, loss=0.178, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000342, train/loss_step=0.104, global_step=3218.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3560/5971 [37:56<25:41,  1.56it/s, loss=0.178, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000342, train/loss_step=0.104, global_step=3218.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3560/5971 [37:56<25:41,  1.56it/s, loss=0.186, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00219, train/loss_step=0.375, global_step=3218.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  60%|█████▉    | 3561/5971 [37:57<25:41,  1.56it/s, loss=0.186, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00219, train/loss_step=0.375, global_step=3218.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3561/5971 [37:57<25:41,  1.56it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0956, train/loss_vlb_step=0.000314, train/loss_step=0.0956, global_step=3219.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3562/5971 [37:58<25:40,  1.56it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0956, train/loss_vlb_step=0.000314, train/loss_step=0.0956, global_step=3219.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3562/5971 [37:58<25:40,  1.56it/s, loss=0.196, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000567, train/loss_step=0.165, global_step=3219.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  60%|█████▉    | 3563/5971 [37:59<25:40,  1.56it/s, loss=0.196, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000567, train/loss_step=0.165, global_step=3219.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3563/5971 [37:59<25:40,  1.56it/s, loss=0.172, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.000576, train/loss_step=0.171, global_step=3219.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3564/5971 [38:01<25:40,  1.56it/s, loss=0.172, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.000576, train/loss_step=0.171, global_step=3219.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3564/5971 [38:01<25:40,  1.56it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00212, train/loss_vlb_step=1.23e-5, train/loss_step=0.00212, global_step=3219.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3565/5971 [38:02<25:40,  1.56it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00212, train/loss_vlb_step=1.23e-5, train/loss_step=0.00212, global_step=3219.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3565/5971 [38:02<25:40,  1.56it/s, loss=0.16, v_num=0, train/loss_simple_step=0.019, train/loss_vlb_step=8.22e-5, train/loss_step=0.019, global_step=3220.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:  60%|█████▉    | 3566/5971 [38:03<25:39,  1.56it/s, loss=0.16, v_num=0, train/loss_simple_step=0.019, train/loss_vlb_step=8.22e-5, train/loss_step=0.019, global_step=3220.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3566/5971 [38:03<25:39,  1.56it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00475, train/loss_vlb_step=2.34e-5, train/loss_step=0.00475, global_step=3220.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3567/5971 [38:04<25:39,  1.56it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00475, train/loss_vlb_step=2.34e-5, train/loss_step=0.00475, global_step=3220.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3567/5971 [38:04<25:39,  1.56it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0437, train/loss_vlb_step=0.000156, train/loss_step=0.0437, global_step=3220.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  60%|█████▉    | 3568/5971 [38:07<25:40,  1.56it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0437, train/loss_vlb_step=0.000156, train/loss_step=0.0437, global_step=3220.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3568/5971 [38:07<25:40,  1.56it/s, loss=0.133, v_num=0, train/loss_simple_step=0.159, train/loss_vlb_step=0.000536, train/loss_step=0.159, global_step=3220.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  60%|█████▉    | 3569/5971 [38:09<25:40,  1.56it/s, loss=0.133, v_num=0, train/loss_simple_step=0.159, train/loss_vlb_step=0.000536, train/loss_step=0.159, global_step=3220.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3569/5971 [38:09<25:40,  1.56it/s, loss=0.164, v_num=0, train/loss_simple_step=0.702, train/loss_vlb_step=0.0142, train/loss_step=0.702, global_step=3221.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  60%|█████▉    | 3570/5971 [38:10<25:39,  1.56it/s, loss=0.164, v_num=0, train/loss_simple_step=0.702, train/loss_vlb_step=0.0142, train/loss_step=0.702, global_step=3221.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3570/5971 [38:10<25:39,  1.56it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00337, train/loss_vlb_step=1.8e-5, train/loss_step=0.00337, global_step=3221.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3571/5971 [38:11<25:39,  1.56it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00337, train/loss_vlb_step=1.8e-5, train/loss_step=0.00337, global_step=3221.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3571/5971 [38:11<25:39,  1.56it/s, loss=0.166, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000339, train/loss_step=0.102, global_step=3221.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  60%|█████▉    | 3572/5971 [38:13<25:39,  1.56it/s, loss=0.166, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000339, train/loss_step=0.102, global_step=3221.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3572/5971 [38:13<25:39,  1.56it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00186, train/loss_vlb_step=1.12e-5, train/loss_step=0.00186, global_step=3221.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3573/5971 [38:13<25:39,  1.56it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00186, train/loss_vlb_step=1.12e-5, train/loss_step=0.00186, global_step=3221.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3573/5971 [38:13<25:39,  1.56it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0454, train/loss_vlb_step=0.000162, train/loss_step=0.0454, global_step=3222.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  60%|█████▉    | 3574/5971 [38:14<25:38,  1.56it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0454, train/loss_vlb_step=0.000162, train/loss_step=0.0454, global_step=3222.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3574/5971 [38:14<25:38,  1.56it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0689, train/loss_vlb_step=0.000238, train/loss_step=0.0689, global_step=3222.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3575/5971 [38:15<25:38,  1.56it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0689, train/loss_vlb_step=0.000238, train/loss_step=0.0689, global_step=3222.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3575/5971 [38:15<25:38,  1.56it/s, loss=0.149, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=4.81e-5, train/loss_step=0.011, global_step=3222.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  60%|█████▉    | 3576/5971 [38:19<25:39,  1.56it/s, loss=0.149, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=4.81e-5, train/loss_step=0.011, global_step=3222.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3576/5971 [38:19<25:39,  1.56it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0171, train/loss_vlb_step=7.47e-5, train/loss_step=0.0171, global_step=3222.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3577/5971 [38:20<25:38,  1.56it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0171, train/loss_vlb_step=7.47e-5, train/loss_step=0.0171, global_step=3222.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3577/5971 [38:20<25:38,  1.56it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0018, train/loss_vlb_step=1.06e-5, train/loss_step=0.0018, global_step=3223.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3578/5971 [38:20<25:38,  1.56it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0018, train/loss_vlb_step=1.06e-5, train/loss_step=0.0018, global_step=3223.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3578/5971 [38:20<25:38,  1.56it/s, loss=0.121, v_num=0, train/loss_simple_step=0.338, train/loss_vlb_step=0.00151, train/loss_step=0.338, global_step=3223.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  60%|█████▉    | 3579/5971 [38:21<25:37,  1.56it/s, loss=0.121, v_num=0, train/loss_simple_step=0.338, train/loss_vlb_step=0.00151, train/loss_step=0.338, global_step=3223.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3579/5971 [38:21<25:37,  1.56it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0639, train/loss_vlb_step=0.000216, train/loss_step=0.0639, global_step=3223.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3580/5971 [38:24<25:38,  1.55it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0639, train/loss_vlb_step=0.000216, train/loss_step=0.0639, global_step=3223.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3580/5971 [38:24<25:38,  1.55it/s, loss=0.134, v_num=0, train/loss_simple_step=0.669, train/loss_vlb_step=0.0187, train/loss_step=0.669, global_step=3223.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  60%|█████▉    | 3581/5971 [38:25<25:38,  1.55it/s, loss=0.134, v_num=0, train/loss_simple_step=0.669, train/loss_vlb_step=0.0187, train/loss_step=0.669, global_step=3223.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3581/5971 [38:25<25:38,  1.55it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0291, train/loss_vlb_step=0.000114, train/loss_step=0.0291, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3582/5971 [38:26<25:37,  1.55it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0291, train/loss_vlb_step=0.000114, train/loss_step=0.0291, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|█████▉    | 3582/5971 [38:26<25:37,  1.55it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0481, train/loss_vlb_step=0.000175, train/loss_step=0.0481, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|██████    | 3583/5971 [38:27<25:37,  1.55it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0481, train/loss_vlb_step=0.000175, train/loss_step=0.0481, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|██████    | 3583/5971 [38:27<25:37,  1.55it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0352, train/loss_vlb_step=0.000128, train/loss_step=0.0352, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|██████    | 3584/5971 [38:29<25:37,  1.55it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0352, train/loss_vlb_step=0.000128, train/loss_step=0.0352, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  60%|██████    | 3584/5971 [38:29<25:37,  1.55it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:11,  2.34it/s]
Epoch 5:  60%|██████    | 3586/5971 [38:30<25:36,  1.55it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   1%|          | 2/167 [00:00<00:54,  3.00it/s]
Epoch 5:  60%|██████    | 3588/5971 [38:30<25:34,  1.55it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   3%|▎         | 5/167 [00:00<00:20,  7.95it/s]
Epoch 5:  60%|██████    | 3591/5971 [38:30<25:30,  1.55it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   5%|▍         | 8/167 [00:00<00:12, 12.42it/s]
Epoch 5:  60%|██████    | 3594/5971 [38:30<25:27,  1.56it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   7%|▋         | 11/167 [00:01<00:09, 16.12it/s]
Epoch 5:  60%|██████    | 3597/5971 [38:30<25:24,  1.56it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   8%|▊         | 14/167 [00:01<00:08, 18.14it/s]
Epoch 5:  60%|██████    | 3600/5971 [38:30<25:21,  1.56it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  10%|█         | 17/167 [00:01<00:07, 20.43it/s]
Epoch 5:  60%|██████    | 3603/5971 [38:31<25:18,  1.56it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 21.50it/s]
Epoch 5:  60%|██████    | 3606/5971 [38:31<25:15,  1.56it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 22.75it/s]
Epoch 5:  60%|██████    | 3609/5971 [38:31<25:12,  1.56it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 23.70it/s]
Epoch 5:  60%|██████    | 3612/5971 [38:31<25:09,  1.56it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 24.70it/s]
Epoch 5:  61%|██████    | 3615/5971 [38:31<25:06,  1.56it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 25.87it/s]
Epoch 5:  61%|██████    | 3618/5971 [38:31<25:02,  1.57it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  21%|██        | 35/167 [00:01<00:04, 26.51it/s]
Epoch 5:  61%|██████    | 3621/5971 [38:31<24:59,  1.57it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  23%|██▎       | 38/167 [00:02<00:04, 27.29it/s]
Epoch 5:  61%|██████    | 3625/5971 [38:31<24:55,  1.57it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 27.92it/s]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 26.76it/s]
Epoch 5:  61%|██████    | 3629/5971 [38:32<24:51,  1.57it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 25.85it/s]
Epoch 5:  61%|██████    | 3633/5971 [38:32<24:47,  1.57it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 26.06it/s]
Epoch 5:  61%|██████    | 3637/5971 [38:32<24:43,  1.57it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 26.03it/s]
Epoch 5:  61%|██████    | 3641/5971 [38:32<24:39,  1.57it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 26.90it/s]

Validating:  36%|███▌      | 60/167 [00:02<00:03, 27.17it/s]
Epoch 5:  61%|██████    | 3645/5971 [38:32<24:35,  1.58it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  38%|███▊      | 64/167 [00:03<00:03, 28.20it/s]
Epoch 5:  61%|██████    | 3649/5971 [38:32<24:31,  1.58it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  40%|████      | 67/167 [00:03<00:03, 27.35it/s]
Epoch 5:  61%|██████    | 3653/5971 [38:32<24:27,  1.58it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 26.90it/s]
Epoch 5:  61%|██████    | 3657/5971 [38:33<24:23,  1.58it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 26.01it/s]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 24.89it/s]
Epoch 5:  61%|██████▏   | 3661/5971 [38:33<24:19,  1.58it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 22.82it/s]
Epoch 5:  61%|██████▏   | 3665/5971 [38:33<24:15,  1.58it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  49%|████▉     | 82/167 [00:03<00:03, 23.47it/s]
Epoch 5:  61%|██████▏   | 3669/5971 [38:33<24:11,  1.59it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  51%|█████     | 85/167 [00:03<00:03, 24.18it/s]

Validating:  53%|█████▎    | 88/167 [00:04<00:03, 24.81it/s]
Epoch 5:  62%|██████▏   | 3673/5971 [38:33<24:07,  1.59it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  54%|█████▍    | 91/167 [00:04<00:03, 24.02it/s]
Epoch 5:  62%|██████▏   | 3677/5971 [38:33<24:03,  1.59it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 24.60it/s]
Epoch 5:  62%|██████▏   | 3681/5971 [38:34<23:59,  1.59it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 24.62it/s]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 25.23it/s]
Epoch 5:  62%|██████▏   | 3685/5971 [38:34<23:55,  1.59it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 24.98it/s]
Epoch 5:  62%|██████▏   | 3689/5971 [38:34<23:51,  1.59it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 25.31it/s]
Epoch 5:  62%|██████▏   | 3693/5971 [38:34<23:47,  1.60it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 24.83it/s]

Validating:  67%|██████▋   | 112/167 [00:04<00:02, 25.64it/s]
Epoch 5:  62%|██████▏   | 3697/5971 [38:34<23:43,  1.60it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  69%|██████▉   | 115/167 [00:05<00:02, 25.13it/s]
Epoch 5:  62%|██████▏   | 3701/5971 [38:34<23:39,  1.60it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  71%|███████   | 118/167 [00:05<00:01, 24.58it/s]
Epoch 5:  62%|██████▏   | 3705/5971 [38:35<23:35,  1.60it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 24.38it/s]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 23.64it/s]
Epoch 5:  62%|██████▏   | 3709/5971 [38:35<23:31,  1.60it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 23.91it/s]
Epoch 5:  62%|██████▏   | 3713/5971 [38:35<23:27,  1.60it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 24.64it/s]
Epoch 5:  62%|██████▏   | 3717/5971 [38:35<23:23,  1.61it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 24.25it/s]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 24.66it/s]
Epoch 5:  62%|██████▏   | 3721/5971 [38:35<23:19,  1.61it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  83%|████████▎ | 139/167 [00:06<00:01, 24.59it/s]
Epoch 5:  62%|██████▏   | 3725/5971 [38:35<23:15,  1.61it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  85%|████████▌ | 142/167 [00:06<00:01, 23.85it/s]
Epoch 5:  62%|██████▏   | 3729/5971 [38:36<23:12,  1.61it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 24.36it/s]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 24.81it/s]
Epoch 5:  63%|██████▎   | 3733/5971 [38:36<23:08,  1.61it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 25.93it/s]
Epoch 5:  63%|██████▎   | 3737/5971 [38:36<23:04,  1.61it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 25.24it/s]
Epoch 5:  63%|██████▎   | 3741/5971 [38:36<23:00,  1.62it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 26.15it/s]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 26.66it/s]
Epoch 5:  63%|██████▎   | 3745/5971 [38:36<22:56,  1.62it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  98%|█████████▊| 163/167 [00:07<00:00, 27.02it/s]
Epoch 5:  63%|██████▎   | 3749/5971 [38:36<22:52,  1.62it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  99%|█████████▉| 166/167 [00:07<00:00, 26.65it/s]
Epoch 5:  63%|██████▎   | 3752/5971 [38:37<22:50,  1.62it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Epoch 5:  63%|██████▎   | 3753/5971 [38:38<22:49,  1.62it/s, loss=0.126, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3224.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3753/5971 [38:38<22:49,  1.62it/s, loss=0.134, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000629, train/loss_step=0.180, global_step=3225.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3754/5971 [38:39<22:49,  1.62it/s, loss=0.134, v_num=0, train/loss_simple_step=0.002, train/loss_vlb_step=1.2e-5, train/loss_step=0.002, global_step=3225.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  63%|██████▎   | 3755/5971 [38:39<22:48,  1.62it/s, loss=0.169, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0263, train/loss_step=0.750, global_step=3225.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3756/5971 [38:43<22:49,  1.62it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00549, train/loss_vlb_step=2.84e-5, train/loss_step=0.00549, global_step=3225.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3757/5971 [38:44<22:49,  1.62it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00549, train/loss_vlb_step=2.84e-5, train/loss_step=0.00549, global_step=3225.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3757/5971 [38:44<22:49,  1.62it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0292, train/loss_vlb_step=0.000113, train/loss_step=0.0292, global_step=3226.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  63%|██████▎   | 3758/5971 [38:44<22:48,  1.62it/s, loss=0.167, v_num=0, train/loss_simple_step=0.788, train/loss_vlb_step=0.397, train/loss_step=0.788, global_step=3226.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:  63%|██████▎   | 3759/5971 [38:45<22:48,  1.62it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=4.84e-5, train/loss_step=0.0116, global_step=3226.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3760/5971 [38:48<22:49,  1.62it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.24e-5, train/loss_step=0.0124, global_step=3226.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3761/5971 [38:49<22:48,  1.61it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.24e-5, train/loss_step=0.0124, global_step=3226.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3761/5971 [38:49<22:48,  1.61it/s, loss=0.166, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000355, train/loss_step=0.108, global_step=3227.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  63%|██████▎   | 3762/5971 [38:50<22:48,  1.61it/s, loss=0.198, v_num=0, train/loss_simple_step=0.704, train/loss_vlb_step=0.0264, train/loss_step=0.704, global_step=3227.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  63%|██████▎   | 3763/5971 [38:51<22:47,  1.61it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0479, train/loss_vlb_step=0.000165, train/loss_step=0.0479, global_step=3227.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3764/5971 [38:53<22:47,  1.61it/s, loss=0.218, v_num=0, train/loss_simple_step=0.377, train/loss_vlb_step=0.00234, train/loss_step=0.377, global_step=3227.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  63%|██████▎   | 3765/5971 [38:54<22:47,  1.61it/s, loss=0.218, v_num=0, train/loss_simple_step=0.377, train/loss_vlb_step=0.00234, train/loss_step=0.377, global_step=3227.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3765/5971 [38:54<22:47,  1.61it/s, loss=0.229, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.000852, train/loss_step=0.234, global_step=3228.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3766/5971 [38:55<22:47,  1.61it/s, loss=0.22, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000489, train/loss_step=0.148, global_step=3228.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  63%|██████▎   | 3767/5971 [38:56<22:46,  1.61it/s, loss=0.217, v_num=0, train/loss_simple_step=0.00222, train/loss_vlb_step=1.29e-5, train/loss_step=0.00222, global_step=3228.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3768/5971 [38:58<22:46,  1.61it/s, loss=0.201, v_num=0, train/loss_simple_step=0.368, train/loss_vlb_step=0.00186, train/loss_step=0.368, global_step=3228.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  63%|██████▎   | 3769/5971 [38:59<22:46,  1.61it/s, loss=0.201, v_num=0, train/loss_simple_step=0.368, train/loss_vlb_step=0.00186, train/loss_step=0.368, global_step=3228.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3769/5971 [38:59<22:46,  1.61it/s, loss=0.2, v_num=0, train/loss_simple_step=0.00531, train/loss_vlb_step=2.67e-5, train/loss_step=0.00531, global_step=3229.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3770/5971 [39:00<22:45,  1.61it/s, loss=0.198, v_num=0, train/loss_simple_step=0.00749, train/loss_vlb_step=3.67e-5, train/loss_step=0.00749, global_step=3229.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3771/5971 [39:01<22:45,  1.61it/s, loss=0.203, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000423, train/loss_step=0.128, global_step=3229.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  63%|██████▎   | 3772/5971 [39:03<22:45,  1.61it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0246, train/loss_vlb_step=9.6e-5, train/loss_step=0.0246, global_step=3229.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3773/5971 [39:04<22:45,  1.61it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0246, train/loss_vlb_step=9.6e-5, train/loss_step=0.0246, global_step=3229.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3773/5971 [39:04<22:45,  1.61it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0234, train/loss_vlb_step=9.04e-5, train/loss_step=0.0234, global_step=3230.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3774/5971 [39:05<22:44,  1.61it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.82e-5, train/loss_step=0.0108, global_step=3230.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3775/5971 [39:05<22:44,  1.61it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0813, train/loss_vlb_step=0.00027, train/loss_step=0.0813, global_step=3230.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3776/5971 [39:08<22:44,  1.61it/s, loss=0.158, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000191, train/loss_step=0.055, global_step=3230.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  63%|██████▎   | 3777/5971 [39:09<22:44,  1.61it/s, loss=0.158, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000191, train/loss_step=0.055, global_step=3230.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3777/5971 [39:09<22:44,  1.61it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0218, train/loss_vlb_step=8.77e-5, train/loss_step=0.0218, global_step=3231.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3778/5971 [39:10<22:43,  1.61it/s, loss=0.126, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000522, train/loss_step=0.151, global_step=3231.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  63%|██████▎   | 3779/5971 [39:11<22:43,  1.61it/s, loss=0.142, v_num=0, train/loss_simple_step=0.331, train/loss_vlb_step=0.00139, train/loss_step=0.331, global_step=3231.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  63%|██████▎   | 3780/5971 [39:13<22:43,  1.61it/s, loss=0.155, v_num=0, train/loss_simple_step=0.265, train/loss_vlb_step=0.00114, train/loss_step=0.265, global_step=3231.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3781/5971 [39:14<22:43,  1.61it/s, loss=0.155, v_num=0, train/loss_simple_step=0.265, train/loss_vlb_step=0.00114, train/loss_step=0.265, global_step=3231.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3781/5971 [39:14<22:43,  1.61it/s, loss=0.157, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.00051, train/loss_step=0.153, global_step=3232.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3782/5971 [39:14<22:42,  1.61it/s, loss=0.127, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000374, train/loss_step=0.113, global_step=3232.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3783/5971 [39:15<22:42,  1.61it/s, loss=0.147, v_num=0, train/loss_simple_step=0.439, train/loss_vlb_step=0.00321, train/loss_step=0.439, global_step=3232.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  63%|██████▎   | 3784/5971 [39:18<22:42,  1.61it/s, loss=0.136, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000545, train/loss_step=0.163, global_step=3232.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3785/5971 [39:18<22:42,  1.60it/s, loss=0.136, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000545, train/loss_step=0.163, global_step=3232.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3785/5971 [39:18<22:42,  1.60it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00771, train/loss_vlb_step=3.4e-5, train/loss_step=0.00771, global_step=3233.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3786/5971 [39:19<22:41,  1.60it/s, loss=0.128, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.000757, train/loss_step=0.219, global_step=3233.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  63%|██████▎   | 3787/5971 [39:20<22:41,  1.60it/s, loss=0.134, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000356, train/loss_step=0.108, global_step=3233.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3788/5971 [39:22<22:41,  1.60it/s, loss=0.134, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00188, train/loss_step=0.364, global_step=3233.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  63%|██████▎   | 3789/5971 [39:23<22:40,  1.60it/s, loss=0.134, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00188, train/loss_step=0.364, global_step=3233.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3789/5971 [39:23<22:40,  1.60it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0553, train/loss_vlb_step=0.000201, train/loss_step=0.0553, global_step=3234.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  63%|██████▎   | 3790/5971 [39:24<22:40,  1.60it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0145, train/loss_vlb_step=5.64e-5, train/loss_step=0.0145, global_step=3234.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  63%|██████▎   | 3791/5971 [39:25<22:39,  1.60it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0217, train/loss_vlb_step=8.73e-5, train/loss_step=0.0217, global_step=3234.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▎   | 3792/5971 [39:27<22:40,  1.60it/s, loss=0.131, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.56e-5, train/loss_step=0.018, global_step=3234.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  64%|██████▎   | 3793/5971 [39:28<22:39,  1.60it/s, loss=0.131, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.56e-5, train/loss_step=0.018, global_step=3234.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▎   | 3793/5971 [39:28<22:39,  1.60it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00182, train/loss_vlb_step=1.05e-5, train/loss_step=0.00182, global_step=3235.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▎   | 3794/5971 [39:29<22:39,  1.60it/s, loss=0.131, v_num=0, train/loss_simple_step=0.036, train/loss_vlb_step=0.000136, train/loss_step=0.036, global_step=3235.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  64%|██████▎   | 3795/5971 [39:30<22:38,  1.60it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00121, train/loss_vlb_step=7.32e-6, train/loss_step=0.00121, global_step=3235.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▎   | 3796/5971 [39:33<22:39,  1.60it/s, loss=0.144, v_num=0, train/loss_simple_step=0.401, train/loss_vlb_step=0.00211, train/loss_step=0.401, global_step=3235.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  64%|██████▎   | 3797/5971 [39:34<22:39,  1.60it/s, loss=0.144, v_num=0, train/loss_simple_step=0.401, train/loss_vlb_step=0.00211, train/loss_step=0.401, global_step=3235.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▎   | 3797/5971 [39:34<22:39,  1.60it/s, loss=0.151, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000532, train/loss_step=0.156, global_step=3236.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▎   | 3798/5971 [39:35<22:38,  1.60it/s, loss=0.152, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000592, train/loss_step=0.174, global_step=3236.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▎   | 3799/5971 [39:36<22:38,  1.60it/s, loss=0.174, v_num=0, train/loss_simple_step=0.775, train/loss_vlb_step=0.0366, train/loss_step=0.775, global_step=3236.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  64%|██████▎   | 3800/5971 [39:38<22:38,  1.60it/s, loss=0.17, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000574, train/loss_step=0.170, global_step=3236.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▎   | 3801/5971 [39:39<22:38,  1.60it/s, loss=0.17, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000574, train/loss_step=0.170, global_step=3236.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▎   | 3801/5971 [39:39<22:38,  1.60it/s, loss=0.176, v_num=0, train/loss_simple_step=0.273, train/loss_vlb_step=0.00125, train/loss_step=0.273, global_step=3237.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▎   | 3802/5971 [39:40<22:37,  1.60it/s, loss=0.182, v_num=0, train/loss_simple_step=0.244, train/loss_vlb_step=0.00105, train/loss_step=0.244, global_step=3237.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▎   | 3803/5971 [39:41<22:37,  1.60it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0455, train/loss_vlb_step=0.000155, train/loss_step=0.0455, global_step=3237.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▎   | 3804/5971 [39:43<22:37,  1.60it/s, loss=0.169, v_num=0, train/loss_simple_step=0.306, train/loss_vlb_step=0.00152, train/loss_step=0.306, global_step=3237.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  64%|██████▎   | 3805/5971 [39:44<22:37,  1.60it/s, loss=0.169, v_num=0, train/loss_simple_step=0.306, train/loss_vlb_step=0.00152, train/loss_step=0.306, global_step=3237.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▎   | 3805/5971 [39:44<22:37,  1.60it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0433, train/loss_vlb_step=0.000161, train/loss_step=0.0433, global_step=3238.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▎   | 3806/5971 [39:45<22:36,  1.60it/s, loss=0.166, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000345, train/loss_step=0.104, global_step=3238.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  64%|██████▍   | 3807/5971 [39:46<22:36,  1.60it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0259, train/loss_vlb_step=9.94e-5, train/loss_step=0.0259, global_step=3238.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3808/5971 [39:48<22:36,  1.59it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00478, train/loss_vlb_step=2.57e-5, train/loss_step=0.00478, global_step=3238.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3809/5971 [39:49<22:35,  1.59it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00478, train/loss_vlb_step=2.57e-5, train/loss_step=0.00478, global_step=3238.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3809/5971 [39:49<22:35,  1.59it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00291, train/loss_vlb_step=1.65e-5, train/loss_step=0.00291, global_step=3239.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3810/5971 [39:50<22:35,  1.59it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.75e-5, train/loss_step=0.00347, global_step=3239.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  64%|██████▍   | 3811/5971 [39:51<22:34,  1.59it/s, loss=0.147, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000559, train/loss_step=0.164, global_step=3239.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  64%|██████▍   | 3812/5971 [39:53<22:35,  1.59it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00188, train/loss_vlb_step=1.13e-5, train/loss_step=0.00188, global_step=3239.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3813/5971 [39:54<22:34,  1.59it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00188, train/loss_vlb_step=1.13e-5, train/loss_step=0.00188, global_step=3239.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3813/5971 [39:54<22:34,  1.59it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00985, train/loss_vlb_step=4.6e-5, train/loss_step=0.00985, global_step=3240.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  64%|██████▍   | 3814/5971 [39:55<22:34,  1.59it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00484, train/loss_vlb_step=2.6e-5, train/loss_step=0.00484, global_step=3240.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3815/5971 [39:56<22:33,  1.59it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00193, train/loss_vlb_step=1.12e-5, train/loss_step=0.00193, global_step=3240.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3816/5971 [39:58<22:34,  1.59it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00479, train/loss_vlb_step=2.42e-5, train/loss_step=0.00479, global_step=3240.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3817/5971 [39:59<22:33,  1.59it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00479, train/loss_vlb_step=2.42e-5, train/loss_step=0.00479, global_step=3240.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3817/5971 [39:59<22:33,  1.59it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0992, train/loss_vlb_step=0.000326, train/loss_step=0.0992, global_step=3241.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  64%|██████▍   | 3818/5971 [40:00<22:33,  1.59it/s, loss=0.135, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00239, train/loss_step=0.424, global_step=3241.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  64%|██████▍   | 3819/5971 [40:01<22:32,  1.59it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0923, train/loss_vlb_step=0.000304, train/loss_step=0.0923, global_step=3241.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3820/5971 [40:03<22:33,  1.59it/s, loss=0.116, v_num=0, train/loss_simple_step=0.467, train/loss_vlb_step=0.00309, train/loss_step=0.467, global_step=3241.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  64%|██████▍   | 3821/5971 [40:04<22:32,  1.59it/s, loss=0.116, v_num=0, train/loss_simple_step=0.467, train/loss_vlb_step=0.00309, train/loss_step=0.467, global_step=3241.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3821/5971 [40:04<22:32,  1.59it/s, loss=0.121, v_num=0, train/loss_simple_step=0.362, train/loss_vlb_step=0.00169, train/loss_step=0.362, global_step=3242.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3822/5971 [40:05<22:32,  1.59it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0826, train/loss_vlb_step=0.000283, train/loss_step=0.0826, global_step=3242.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3823/5971 [40:06<22:31,  1.59it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.2e-5, train/loss_step=0.0125, global_step=3242.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  64%|██████▍   | 3824/5971 [40:08<22:31,  1.59it/s, loss=0.101, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000388, train/loss_step=0.118, global_step=3242.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3825/5971 [40:09<22:31,  1.59it/s, loss=0.101, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000388, train/loss_step=0.118, global_step=3242.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3825/5971 [40:09<22:31,  1.59it/s, loss=0.11, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000777, train/loss_step=0.211, global_step=3243.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  64%|██████▍   | 3826/5971 [40:10<22:30,  1.59it/s, loss=0.135, v_num=0, train/loss_simple_step=0.615, train/loss_vlb_step=0.00869, train/loss_step=0.615, global_step=3243.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3827/5971 [40:10<22:30,  1.59it/s, loss=0.142, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000595, train/loss_step=0.162, global_step=3243.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3828/5971 [40:13<22:30,  1.59it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=7.27e-5, train/loss_step=0.0166, global_step=3243.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3829/5971 [40:14<22:30,  1.59it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=7.27e-5, train/loss_step=0.0166, global_step=3243.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3829/5971 [40:14<22:30,  1.59it/s, loss=0.154, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.000978, train/loss_step=0.230, global_step=3244.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  64%|██████▍   | 3830/5971 [40:15<22:29,  1.59it/s, loss=0.168, v_num=0, train/loss_simple_step=0.281, train/loss_vlb_step=0.00131, train/loss_step=0.281, global_step=3244.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  64%|██████▍   | 3831/5971 [40:16<22:29,  1.59it/s, loss=0.168, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.000571, train/loss_step=0.171, global_step=3244.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3832/5971 [40:18<22:29,  1.59it/s, loss=0.178, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.00064, train/loss_step=0.187, global_step=3244.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  64%|██████▍   | 3833/5971 [40:19<22:29,  1.58it/s, loss=0.178, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.00064, train/loss_step=0.187, global_step=3244.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3833/5971 [40:19<22:29,  1.58it/s, loss=0.192, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00114, train/loss_step=0.292, global_step=3245.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3834/5971 [40:20<22:28,  1.58it/s, loss=0.198, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.00041, train/loss_step=0.124, global_step=3245.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3835/5971 [40:20<22:28,  1.58it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0581, train/loss_vlb_step=0.000208, train/loss_step=0.0581, global_step=3245.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3836/5971 [40:23<22:28,  1.58it/s, loss=0.208, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000524, train/loss_step=0.153, global_step=3245.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  64%|██████▍   | 3837/5971 [40:24<22:27,  1.58it/s, loss=0.208, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000524, train/loss_step=0.153, global_step=3245.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3837/5971 [40:24<22:27,  1.58it/s, loss=0.211, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000627, train/loss_step=0.168, global_step=3246.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3838/5971 [40:25<22:27,  1.58it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0137, train/loss_vlb_step=6.32e-5, train/loss_step=0.0137, global_step=3246.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3839/5971 [40:26<22:26,  1.58it/s, loss=0.192, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000374, train/loss_step=0.112, global_step=3246.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  64%|██████▍   | 3840/5971 [40:28<22:27,  1.58it/s, loss=0.183, v_num=0, train/loss_simple_step=0.284, train/loss_vlb_step=0.00106, train/loss_step=0.284, global_step=3246.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  64%|██████▍   | 3841/5971 [40:29<22:26,  1.58it/s, loss=0.183, v_num=0, train/loss_simple_step=0.284, train/loss_vlb_step=0.00106, train/loss_step=0.284, global_step=3246.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3841/5971 [40:29<22:26,  1.58it/s, loss=0.177, v_num=0, train/loss_simple_step=0.244, train/loss_vlb_step=0.000929, train/loss_step=0.244, global_step=3247.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3842/5971 [40:30<22:26,  1.58it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0215, train/loss_vlb_step=8.32e-5, train/loss_step=0.0215, global_step=3247.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3843/5971 [40:30<22:25,  1.58it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0161, train/loss_vlb_step=7.19e-5, train/loss_step=0.0161, global_step=3247.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3844/5971 [40:33<22:26,  1.58it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.67e-5, train/loss_step=0.0104, global_step=3247.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3845/5971 [40:34<22:25,  1.58it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.67e-5, train/loss_step=0.0104, global_step=3247.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3845/5971 [40:34<22:25,  1.58it/s, loss=0.168, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000724, train/loss_step=0.207, global_step=3248.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  64%|██████▍   | 3846/5971 [40:35<22:25,  1.58it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00571, train/loss_vlb_step=2.87e-5, train/loss_step=0.00571, global_step=3248.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3847/5971 [40:35<22:24,  1.58it/s, loss=0.149, v_num=0, train/loss_simple_step=0.379, train/loss_vlb_step=0.00198, train/loss_step=0.379, global_step=3248.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  64%|██████▍   | 3848/5971 [40:38<22:24,  1.58it/s, loss=0.148, v_num=0, train/loss_simple_step=0.012, train/loss_vlb_step=4.74e-5, train/loss_step=0.012, global_step=3248.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3849/5971 [40:39<22:24,  1.58it/s, loss=0.148, v_num=0, train/loss_simple_step=0.012, train/loss_vlb_step=4.74e-5, train/loss_step=0.012, global_step=3248.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3849/5971 [40:39<22:24,  1.58it/s, loss=0.144, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000447, train/loss_step=0.135, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  64%|██████▍   | 3850/5971 [40:40<22:23,  1.58it/s, loss=0.147, v_num=0, train/loss_simple_step=0.355, train/loss_vlb_step=0.00247, train/loss_step=0.355, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  64%|██████▍   | 3851/5971 [40:40<22:23,  1.58it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=7.01e-5, train/loss_step=0.0166, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  65%|██████▍   | 3852/5971 [40:43<22:23,  1.58it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  65%|██████▍   | 3853/5971 [40:43<22:22,  1.58it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:01<03:03,  1.11s/it][A

Validating:   2%|▏         | 4/167 [00:01<00:39,  4.16it/s][A
Epoch 5:  65%|██████▍   | 3857/5971 [40:44<22:19,  1.58it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   4%|▍         | 7/167 [00:01<00:21,  7.56it/s][A
Epoch 5:  65%|██████▍   | 3861/5971 [40:44<22:15,  1.58it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   6%|▌         | 10/167 [00:01<00:14, 10.78it/s][A
Epoch 5:  65%|██████▍   | 3865/5971 [40:44<22:11,  1.58it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   8%|▊         | 13/167 [00:01<00:11, 13.46it/s][A

Validating:  10%|▉         | 16/167 [00:01<00:09, 15.89it/s][A
Epoch 5:  65%|██████▍   | 3869/5971 [40:44<22:07,  1.58it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  11%|█▏        | 19/167 [00:01<00:08, 18.23it/s][A
Epoch 5:  65%|██████▍   | 3873/5971 [40:44<22:04,  1.58it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  13%|█▎        | 22/167 [00:01<00:07, 20.10it/s][A
Epoch 5:  65%|██████▍   | 3877/5971 [40:45<22:00,  1.59it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  15%|█▍        | 25/167 [00:02<00:06, 20.94it/s][A

Validating:  17%|█▋        | 28/167 [00:02<00:06, 22.36it/s][A
Epoch 5:  65%|██████▍   | 3881/5971 [40:45<21:56,  1.59it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  19%|█▊        | 31/167 [00:02<00:06, 22.32it/s][A
Epoch 5:  65%|██████▌   | 3885/5971 [40:45<21:52,  1.59it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  20%|██        | 34/167 [00:02<00:05, 23.25it/s][A
Epoch 5:  65%|██████▌   | 3889/5971 [40:45<21:48,  1.59it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  22%|██▏       | 37/167 [00:02<00:05, 24.59it/s][A
Epoch 5:  65%|██████▌   | 3893/5971 [40:45<21:45,  1.59it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 25.86it/s][A
Epoch 5:  65%|██████▌   | 3897/5971 [40:45<21:41,  1.59it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 27.09it/s][A

Validating:  29%|██▊       | 48/167 [00:02<00:04, 27.44it/s][A
Epoch 5:  65%|██████▌   | 3901/5971 [40:46<21:37,  1.60it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  31%|███       | 51/167 [00:03<00:04, 27.39it/s][A
Epoch 5:  65%|██████▌   | 3905/5971 [40:46<21:33,  1.60it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  32%|███▏      | 54/167 [00:03<00:04, 27.09it/s][A
Epoch 5:  65%|██████▌   | 3909/5971 [40:46<21:30,  1.60it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  34%|███▍      | 57/167 [00:03<00:04, 27.31it/s][A

Validating:  36%|███▌      | 60/167 [00:03<00:04, 26.61it/s][A
Epoch 5:  66%|██████▌   | 3913/5971 [40:46<21:26,  1.60it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  38%|███▊      | 63/167 [00:03<00:03, 27.30it/s][A
Epoch 5:  66%|██████▌   | 3917/5971 [40:46<21:22,  1.60it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  40%|███▉      | 66/167 [00:03<00:03, 26.03it/s][A
Epoch 5:  66%|██████▌   | 3921/5971 [40:46<21:18,  1.60it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 25.53it/s][A

Validating:  43%|████▎     | 72/167 [00:03<00:03, 25.02it/s][A
Epoch 5:  66%|██████▌   | 3925/5971 [40:46<21:15,  1.60it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 24.72it/s][A
Epoch 5:  66%|██████▌   | 3929/5971 [40:47<21:11,  1.61it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  47%|████▋     | 78/167 [00:04<00:03, 25.46it/s][A
Epoch 5:  66%|██████▌   | 3933/5971 [40:47<21:07,  1.61it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  49%|████▊     | 81/167 [00:04<00:03, 25.09it/s][A

Validating:  50%|█████     | 84/167 [00:04<00:03, 25.50it/s][A
Epoch 5:  66%|██████▌   | 3937/5971 [40:47<21:04,  1.61it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  52%|█████▏    | 87/167 [00:04<00:03, 26.29it/s][A
Epoch 5:  66%|██████▌   | 3941/5971 [40:47<21:00,  1.61it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  54%|█████▍    | 90/167 [00:04<00:02, 27.13it/s][A
Epoch 5:  66%|██████▌   | 3945/5971 [40:47<20:56,  1.61it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 27.58it/s][A

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 28.01it/s][A
Epoch 5:  66%|██████▌   | 3949/5971 [40:47<20:53,  1.61it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 27.24it/s][A
Epoch 5:  66%|██████▌   | 3953/5971 [40:47<20:49,  1.62it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  61%|██████    | 102/167 [00:04<00:02, 27.04it/s][A
Epoch 5:  66%|██████▋   | 3957/5971 [40:48<20:45,  1.62it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  63%|██████▎   | 105/167 [00:05<00:02, 27.37it/s][A

Validating:  65%|██████▍   | 108/167 [00:05<00:02, 26.75it/s][A
Epoch 5:  66%|██████▋   | 3961/5971 [40:48<20:42,  1.62it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  66%|██████▋   | 111/167 [00:05<00:02, 25.67it/s][A
Epoch 5:  66%|██████▋   | 3965/5971 [40:48<20:38,  1.62it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  68%|██████▊   | 114/167 [00:05<00:02, 25.09it/s][A
Epoch 5:  66%|██████▋   | 3969/5971 [40:48<20:34,  1.62it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  70%|███████   | 117/167 [00:05<00:01, 26.03it/s][A

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 25.98it/s][A
Epoch 5:  67%|██████▋   | 3973/5971 [40:48<20:31,  1.62it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 26.51it/s][A
Epoch 5:  67%|██████▋   | 3977/5971 [40:48<20:27,  1.62it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 26.65it/s][A
Epoch 5:  67%|██████▋   | 3981/5971 [40:49<20:23,  1.63it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  77%|███████▋  | 129/167 [00:06<00:01, 27.34it/s][A

Validating:  79%|███████▉  | 132/167 [00:06<00:01, 27.93it/s][A
Epoch 5:  67%|██████▋   | 3985/5971 [40:49<20:20,  1.63it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  81%|████████  | 135/167 [00:06<00:01, 28.10it/s][A
Epoch 5:  67%|██████▋   | 3989/5971 [40:49<20:16,  1.63it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  83%|████████▎ | 138/167 [00:06<00:01, 27.61it/s][A
Epoch 5:  67%|██████▋   | 3993/5971 [40:49<20:13,  1.63it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  84%|████████▍ | 141/167 [00:06<00:00, 26.25it/s][A

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 25.58it/s][A
Epoch 5:  67%|██████▋   | 3997/5971 [40:49<20:09,  1.63it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 26.08it/s][A
Epoch 5:  67%|██████▋   | 4001/5971 [40:49<20:05,  1.63it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 25.51it/s][A
Epoch 5:  67%|██████▋   | 4005/5971 [40:49<20:02,  1.64it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 25.77it/s][A

Validating:  93%|█████████▎| 156/167 [00:07<00:00, 26.66it/s][A
Epoch 5:  67%|██████▋   | 4009/5971 [40:50<19:58,  1.64it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  95%|█████████▌| 159/167 [00:07<00:00, 26.95it/s][A
Epoch 5:  67%|██████▋   | 4013/5971 [40:50<19:55,  1.64it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  97%|█████████▋| 162/167 [00:07<00:00, 27.14it/s][A
Epoch 5:  67%|██████▋   | 4017/5971 [40:50<19:51,  1.64it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  99%|█████████▉| 165/167 [00:07<00:00, 25.74it/s][A
Epoch 5:  67%|██████▋   | 4020/5971 [40:50<19:49,  1.64it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Epoch 5:  67%|██████▋   | 4021/5971 [40:51<19:48,  1.64it/s, loss=0.136, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=3249.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  67%|██████▋   | 4021/5971 [40:51<19:48,  1.64it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0232, train/loss_vlb_step=9.44e-5, train/loss_step=0.0232, global_step=3250.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  67%|██████▋   | 4022/5971 [40:52<19:48,  1.64it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0161, train/loss_vlb_step=7.2e-5, train/loss_step=0.0161, global_step=3250.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  67%|██████▋   | 4023/5971 [40:53<19:47,  1.64it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0529, train/loss_vlb_step=0.000191, train/loss_step=0.0529, global_step=3250.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  67%|██████▋   | 4024/5971 [40:56<19:48,  1.64it/s, loss=0.129, v_num=0, train/loss_simple_step=0.380, train/loss_vlb_step=0.00189, train/loss_step=0.380, global_step=3250.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  67%|██████▋   | 4025/5971 [40:57<19:47,  1.64it/s, loss=0.129, v_num=0, train/loss_simple_step=0.380, train/loss_vlb_step=0.00189, train/loss_step=0.380, global_step=3250.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  67%|██████▋   | 4025/5971 [40:57<19:47,  1.64it/s, loss=0.138, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00229, train/loss_step=0.363, global_step=3251.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  67%|██████▋   | 4026/5971 [40:57<19:47,  1.64it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0137, train/loss_vlb_step=6.02e-5, train/loss_step=0.0137, global_step=3251.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  67%|██████▋   | 4027/5971 [40:58<19:46,  1.64it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00224, train/loss_vlb_step=1.27e-5, train/loss_step=0.00224, global_step=3251.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  67%|██████▋   | 4028/5971 [41:00<19:46,  1.64it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.47e-5, train/loss_step=0.0122, global_step=3251.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  67%|██████▋   | 4029/5971 [41:01<19:46,  1.64it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.47e-5, train/loss_step=0.0122, global_step=3251.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  67%|██████▋   | 4029/5971 [41:01<19:46,  1.64it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0798, train/loss_vlb_step=0.000267, train/loss_step=0.0798, global_step=3252.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  67%|██████▋   | 4030/5971 [41:02<19:45,  1.64it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0169, train/loss_vlb_step=6.85e-5, train/loss_step=0.0169, global_step=3252.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  68%|██████▊   | 4031/5971 [41:03<19:45,  1.64it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0258, train/loss_vlb_step=0.000101, train/loss_step=0.0258, global_step=3252.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4032/5971 [41:05<19:45,  1.64it/s, loss=0.12, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000684, train/loss_step=0.192, global_step=3252.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  68%|██████▊   | 4033/5971 [41:06<19:45,  1.64it/s, loss=0.12, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000684, train/loss_step=0.192, global_step=3252.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4033/5971 [41:06<19:45,  1.64it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0257, train/loss_vlb_step=9.84e-5, train/loss_step=0.0257, global_step=3253.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4034/5971 [41:07<19:44,  1.64it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0782, train/loss_vlb_step=0.000257, train/loss_step=0.0782, global_step=3253.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4035/5971 [41:08<19:44,  1.63it/s, loss=0.0961, v_num=0, train/loss_simple_step=0.00172, train/loss_vlb_step=1e-5, train/loss_step=0.00172, global_step=3253.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  68%|██████▊   | 4036/5971 [41:10<19:44,  1.63it/s, loss=0.0971, v_num=0, train/loss_simple_step=0.0309, train/loss_vlb_step=0.000124, train/loss_step=0.0309, global_step=3253.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4037/5971 [41:11<19:43,  1.63it/s, loss=0.0971, v_num=0, train/loss_simple_step=0.0309, train/loss_vlb_step=0.000124, train/loss_step=0.0309, global_step=3253.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4037/5971 [41:11<19:43,  1.63it/s, loss=0.0909, v_num=0, train/loss_simple_step=0.0123, train/loss_vlb_step=5.34e-5, train/loss_step=0.0123, global_step=3254.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  68%|██████▊   | 4038/5971 [41:12<19:43,  1.63it/s, loss=0.0763, v_num=0, train/loss_simple_step=0.0627, train/loss_vlb_step=0.000216, train/loss_step=0.0627, global_step=3254.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4039/5971 [41:13<19:42,  1.63it/s, loss=0.0776, v_num=0, train/loss_simple_step=0.0429, train/loss_vlb_step=0.000162, train/loss_step=0.0429, global_step=3254.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4040/5971 [41:15<19:43,  1.63it/s, loss=0.0721, v_num=0, train/loss_simple_step=0.00882, train/loss_vlb_step=4.11e-5, train/loss_step=0.00882, global_step=3254.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4041/5971 [41:16<19:42,  1.63it/s, loss=0.0721, v_num=0, train/loss_simple_step=0.00882, train/loss_vlb_step=4.11e-5, train/loss_step=0.00882, global_step=3254.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4041/5971 [41:16<19:42,  1.63it/s, loss=0.0711, v_num=0, train/loss_simple_step=0.00309, train/loss_vlb_step=1.71e-5, train/loss_step=0.00309, global_step=3255.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4042/5971 [41:17<19:42,  1.63it/s, loss=0.0738, v_num=0, train/loss_simple_step=0.0708, train/loss_vlb_step=0.000239, train/loss_step=0.0708, global_step=3255.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  68%|██████▊   | 4043/5971 [41:18<19:41,  1.63it/s, loss=0.108, v_num=0, train/loss_simple_step=0.731, train/loss_vlb_step=0.0195, train/loss_step=0.731, global_step=3255.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:  68%|██████▊   | 4044/5971 [41:21<19:42,  1.63it/s, loss=0.0957, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000457, train/loss_step=0.139, global_step=3255.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4045/5971 [41:22<19:41,  1.63it/s, loss=0.0957, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000457, train/loss_step=0.139, global_step=3255.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4045/5971 [41:22<19:41,  1.63it/s, loss=0.109, v_num=0, train/loss_simple_step=0.637, train/loss_vlb_step=0.011, train/loss_step=0.637, global_step=3256.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  68%|██████▊   | 4046/5971 [41:23<19:41,  1.63it/s, loss=0.131, v_num=0, train/loss_simple_step=0.456, train/loss_vlb_step=0.0043, train/loss_step=0.456, global_step=3256.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4047/5971 [41:24<19:40,  1.63it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.32e-5, train/loss_step=0.0101, global_step=3256.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4048/5971 [41:26<19:40,  1.63it/s, loss=0.161, v_num=0, train/loss_simple_step=0.594, train/loss_vlb_step=0.00773, train/loss_step=0.594, global_step=3256.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  68%|██████▊   | 4049/5971 [41:27<19:40,  1.63it/s, loss=0.161, v_num=0, train/loss_simple_step=0.594, train/loss_vlb_step=0.00773, train/loss_step=0.594, global_step=3256.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4049/5971 [41:27<19:40,  1.63it/s, loss=0.176, v_num=0, train/loss_simple_step=0.385, train/loss_vlb_step=0.00183, train/loss_step=0.385, global_step=3257.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4050/5971 [41:28<19:40,  1.63it/s, loss=0.182, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000459, train/loss_step=0.133, global_step=3257.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4051/5971 [41:29<19:39,  1.63it/s, loss=0.187, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000386, train/loss_step=0.117, global_step=3257.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4052/5971 [41:31<19:39,  1.63it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00577, train/loss_vlb_step=2.99e-5, train/loss_step=0.00577, global_step=3257.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4053/5971 [41:32<19:39,  1.63it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00577, train/loss_vlb_step=2.99e-5, train/loss_step=0.00577, global_step=3257.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4053/5971 [41:32<19:39,  1.63it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0574, train/loss_vlb_step=0.000195, train/loss_step=0.0574, global_step=3258.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  68%|██████▊   | 4054/5971 [41:33<19:38,  1.63it/s, loss=0.187, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.000961, train/loss_step=0.236, global_step=3258.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  68%|██████▊   | 4055/5971 [41:34<19:38,  1.63it/s, loss=0.192, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.00036, train/loss_step=0.109, global_step=3258.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  68%|██████▊   | 4056/5971 [41:37<19:38,  1.62it/s, loss=0.204, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00106, train/loss_step=0.264, global_step=3258.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4057/5971 [41:38<19:38,  1.62it/s, loss=0.204, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00106, train/loss_step=0.264, global_step=3258.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4057/5971 [41:38<19:38,  1.62it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0343, train/loss_vlb_step=0.000122, train/loss_step=0.0343, global_step=3259.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4058/5971 [41:39<19:37,  1.62it/s, loss=0.223, v_num=0, train/loss_simple_step=0.416, train/loss_vlb_step=0.0027, train/loss_step=0.416, global_step=3259.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  68%|██████▊   | 4059/5971 [41:39<19:37,  1.62it/s, loss=0.225, v_num=0, train/loss_simple_step=0.0818, train/loss_vlb_step=0.000269, train/loss_step=0.0818, global_step=3259.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4060/5971 [41:42<19:37,  1.62it/s, loss=0.226, v_num=0, train/loss_simple_step=0.0436, train/loss_vlb_step=0.000154, train/loss_step=0.0436, global_step=3259.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4061/5971 [41:43<19:37,  1.62it/s, loss=0.226, v_num=0, train/loss_simple_step=0.0436, train/loss_vlb_step=0.000154, train/loss_step=0.0436, global_step=3259.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4061/5971 [41:43<19:37,  1.62it/s, loss=0.251, v_num=0, train/loss_simple_step=0.507, train/loss_vlb_step=0.00322, train/loss_step=0.507, global_step=3260.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  68%|██████▊   | 4062/5971 [41:44<19:36,  1.62it/s, loss=0.249, v_num=0, train/loss_simple_step=0.0316, train/loss_vlb_step=0.000117, train/loss_step=0.0316, global_step=3260.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4063/5971 [41:44<19:36,  1.62it/s, loss=0.22, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000514, train/loss_step=0.149, global_step=3260.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  68%|██████▊   | 4064/5971 [41:47<19:36,  1.62it/s, loss=0.216, v_num=0, train/loss_simple_step=0.0562, train/loss_vlb_step=0.00019, train/loss_step=0.0562, global_step=3260.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4065/5971 [41:48<19:35,  1.62it/s, loss=0.216, v_num=0, train/loss_simple_step=0.0562, train/loss_vlb_step=0.00019, train/loss_step=0.0562, global_step=3260.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4065/5971 [41:48<19:35,  1.62it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0277, train/loss_vlb_step=0.00011, train/loss_step=0.0277, global_step=3261.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4066/5971 [41:49<19:35,  1.62it/s, loss=0.169, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=3261.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  68%|██████▊   | 4067/5971 [41:50<19:34,  1.62it/s, loss=0.187, v_num=0, train/loss_simple_step=0.358, train/loss_vlb_step=0.00205, train/loss_step=0.358, global_step=3261.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  68%|██████▊   | 4068/5971 [41:52<19:35,  1.62it/s, loss=0.166, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000655, train/loss_step=0.185, global_step=3261.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4069/5971 [41:53<19:34,  1.62it/s, loss=0.166, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000655, train/loss_step=0.185, global_step=3261.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4069/5971 [41:53<19:34,  1.62it/s, loss=0.158, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000918, train/loss_step=0.233, global_step=3262.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4070/5971 [41:54<19:34,  1.62it/s, loss=0.158, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3262.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  68%|██████▊   | 4071/5971 [41:55<19:33,  1.62it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0267, train/loss_vlb_step=0.000106, train/loss_step=0.0267, global_step=3262.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4072/5971 [41:58<19:34,  1.62it/s, loss=0.162, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000694, train/loss_step=0.179, global_step=3262.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  68%|██████▊   | 4073/5971 [41:59<19:33,  1.62it/s, loss=0.162, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000694, train/loss_step=0.179, global_step=3262.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4073/5971 [41:59<19:33,  1.62it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0845, train/loss_vlb_step=0.000278, train/loss_step=0.0845, global_step=3263.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4074/5971 [42:00<19:33,  1.62it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0259, train/loss_vlb_step=0.000102, train/loss_step=0.0259, global_step=3263.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4075/5971 [42:01<19:32,  1.62it/s, loss=0.171, v_num=0, train/loss_simple_step=0.485, train/loss_vlb_step=0.0044, train/loss_step=0.485, global_step=3263.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  68%|██████▊   | 4076/5971 [42:04<19:33,  1.61it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0887, train/loss_vlb_step=0.000291, train/loss_step=0.0887, global_step=3263.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4077/5971 [42:05<19:33,  1.61it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0887, train/loss_vlb_step=0.000291, train/loss_step=0.0887, global_step=3263.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4077/5971 [42:05<19:33,  1.61it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00817, train/loss_vlb_step=3.97e-5, train/loss_step=0.00817, global_step=3264.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4078/5971 [42:06<19:32,  1.61it/s, loss=0.153, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.000881, train/loss_step=0.249, global_step=3264.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  68%|██████▊   | 4079/5971 [42:07<19:32,  1.61it/s, loss=0.16, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000747, train/loss_step=0.213, global_step=3264.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  68%|██████▊   | 4080/5971 [42:09<19:32,  1.61it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00497, train/loss_vlb_step=2.36e-5, train/loss_step=0.00497, global_step=3264.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4081/5971 [42:10<19:31,  1.61it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00497, train/loss_vlb_step=2.36e-5, train/loss_step=0.00497, global_step=3264.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4081/5971 [42:10<19:31,  1.61it/s, loss=0.139, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000446, train/loss_step=0.135, global_step=3265.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  68%|██████▊   | 4082/5971 [42:11<19:31,  1.61it/s, loss=0.145, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000467, train/loss_step=0.141, global_step=3265.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4083/5971 [42:12<19:30,  1.61it/s, loss=0.153, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00167, train/loss_step=0.327, global_step=3265.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  68%|██████▊   | 4084/5971 [42:14<19:30,  1.61it/s, loss=0.174, v_num=0, train/loss_simple_step=0.464, train/loss_vlb_step=0.00469, train/loss_step=0.464, global_step=3265.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4085/5971 [42:15<19:30,  1.61it/s, loss=0.174, v_num=0, train/loss_simple_step=0.464, train/loss_vlb_step=0.00469, train/loss_step=0.464, global_step=3265.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4085/5971 [42:15<19:30,  1.61it/s, loss=0.179, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000435, train/loss_step=0.130, global_step=3266.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4086/5971 [42:16<19:29,  1.61it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=7.98e-5, train/loss_step=0.0192, global_step=3266.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4087/5971 [42:17<19:29,  1.61it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0823, train/loss_vlb_step=0.000277, train/loss_step=0.0823, global_step=3266.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4088/5971 [42:20<19:29,  1.61it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0737, train/loss_vlb_step=0.000246, train/loss_step=0.0737, global_step=3266.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4089/5971 [42:21<19:29,  1.61it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0737, train/loss_vlb_step=0.000246, train/loss_step=0.0737, global_step=3266.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4089/5971 [42:21<19:29,  1.61it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00253, train/loss_vlb_step=1.42e-5, train/loss_step=0.00253, global_step=3267.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  68%|██████▊   | 4090/5971 [42:22<19:28,  1.61it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00364, train/loss_vlb_step=1.92e-5, train/loss_step=0.00364, global_step=3267.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  69%|██████▊   | 4091/5971 [42:23<19:28,  1.61it/s, loss=0.158, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00367, train/loss_step=0.438, global_step=3267.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  69%|██████▊   | 4092/5971 [42:25<19:28,  1.61it/s, loss=0.164, v_num=0, train/loss_simple_step=0.315, train/loss_vlb_step=0.00144, train/loss_step=0.315, global_step=3267.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  69%|██████▊   | 4093/5971 [42:26<19:28,  1.61it/s, loss=0.164, v_num=0, train/loss_simple_step=0.315, train/loss_vlb_step=0.00144, train/loss_step=0.315, global_step=3267.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  69%|██████▊   | 4093/5971 [42:26<19:28,  1.61it/s, loss=0.169, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000549, train/loss_step=0.165, global_step=3268.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  69%|██████▊   | 4094/5971 [42:27<19:27,  1.61it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00744, train/loss_vlb_step=3.58e-5, train/loss_step=0.00744, global_step=3268.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  69%|██████▊   | 4095/5971 [42:28<19:27,  1.61it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00333, train/loss_vlb_step=1.81e-5, train/loss_step=0.00333, global_step=3268.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  69%|██████▊   | 4096/5971 [42:30<19:27,  1.61it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00365, train/loss_vlb_step=1.95e-5, train/loss_step=0.00365, global_step=3268.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  69%|██████▊   | 4097/5971 [42:31<19:26,  1.61it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00365, train/loss_vlb_step=1.95e-5, train/loss_step=0.00365, global_step=3268.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  69%|██████▊   | 4097/5971 [42:31<19:26,  1.61it/s, loss=0.153, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.00132, train/loss_step=0.278, global_step=3269.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  69%|██████▊   | 4098/5971 [42:32<19:26,  1.61it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00918, train/loss_vlb_step=4.38e-5, train/loss_step=0.00918, global_step=3269.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  69%|██████▊   | 4099/5971 [42:33<19:25,  1.61it/s, loss=0.14, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000685, train/loss_step=0.190, global_step=3269.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  69%|██████▊   | 4100/5971 [42:35<19:25,  1.60it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0889, train/loss_vlb_step=0.000293, train/loss_step=0.0889, global_step=3269.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  69%|██████▊   | 4101/5971 [42:36<19:25,  1.60it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0889, train/loss_vlb_step=0.000293, train/loss_step=0.0889, global_step=3269.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  69%|██████▊   | 4101/5971 [42:36<19:25,  1.60it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0518, train/loss_vlb_step=0.000175, train/loss_step=0.0518, global_step=3270.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  69%|██████▊   | 4102/5971 [42:37<19:24,  1.60it/s, loss=0.139, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000426, train/loss_step=0.129, global_step=3270.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  69%|██████▊   | 4103/5971 [42:38<19:24,  1.60it/s, loss=0.142, v_num=0, train/loss_simple_step=0.395, train/loss_vlb_step=0.00241, train/loss_step=0.395, global_step=3270.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  69%|██████▊   | 4104/5971 [42:40<19:24,  1.60it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0296, train/loss_vlb_step=0.000108, train/loss_step=0.0296, global_step=3270.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  69%|██████▊   | 4105/5971 [42:41<19:24,  1.60it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0296, train/loss_vlb_step=0.000108, train/loss_step=0.0296, global_step=3270.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  69%|██████▊   | 4105/5971 [42:41<19:24,  1.60it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0064, train/loss_vlb_step=3.18e-5, train/loss_step=0.0064, global_step=3271.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  69%|██████▉   | 4106/5971 [42:42<19:23,  1.60it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=5.83e-5, train/loss_step=0.0136, global_step=3271.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  69%|██████▉   | 4107/5971 [42:43<19:23,  1.60it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00834, train/loss_vlb_step=3.86e-5, train/loss_step=0.00834, global_step=3271.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  69%|██████▉   | 4108/5971 [42:46<19:23,  1.60it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0165, train/loss_vlb_step=7.02e-5, train/loss_step=0.0165, global_step=3271.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  69%|██████▉   | 4109/5971 [42:47<19:23,  1.60it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0165, train/loss_vlb_step=7.02e-5, train/loss_step=0.0165, global_step=3271.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  69%|██████▉   | 4109/5971 [42:47<19:23,  1.60it/s, loss=0.123, v_num=0, train/loss_simple_step=0.303, train/loss_vlb_step=0.00127, train/loss_step=0.303, global_step=3272.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  69%|██████▉   | 4110/5971 [42:48<19:22,  1.60it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00265, train/loss_vlb_step=1.51e-5, train/loss_step=0.00265, global_step=3272.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  69%|██████▉   | 4111/5971 [42:49<19:22,  1.60it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.67e-5, train/loss_step=0.0102, global_step=3272.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  69%|██████▉   | 4112/5971 [42:51<19:22,  1.60it/s, loss=0.0858, v_num=0, train/loss_simple_step=0.00409, train/loss_vlb_step=2.24e-5, train/loss_step=0.00409, global_step=3272.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  69%|██████▉   | 4113/5971 [42:52<19:21,  1.60it/s, loss=0.0858, v_num=0, train/loss_simple_step=0.00409, train/loss_vlb_step=2.24e-5, train/loss_step=0.00409, global_step=3272.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  69%|██████▉   | 4113/5971 [42:52<19:21,  1.60it/s, loss=0.0908, v_num=0, train/loss_simple_step=0.265, train/loss_vlb_step=0.00097, train/loss_step=0.265, global_step=3273.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  69%|██████▉   | 4114/5971 [42:53<19:21,  1.60it/s, loss=0.0906, v_num=0, train/loss_simple_step=0.003, train/loss_vlb_step=1.7e-5, train/loss_step=0.003, global_step=3273.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  69%|██████▉   | 4115/5971 [42:54<19:20,  1.60it/s, loss=0.0926, v_num=0, train/loss_simple_step=0.0436, train/loss_vlb_step=0.00016, train/loss_step=0.0436, global_step=3273.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  69%|██████▉   | 4116/5971 [42:56<19:20,  1.60it/s, loss=0.0935, v_num=0, train/loss_simple_step=0.023, train/loss_vlb_step=9.23e-5, train/loss_step=0.023, global_step=3273.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  69%|██████▉   | 4117/5971 [42:57<19:20,  1.60it/s, loss=0.0935, v_num=0, train/loss_simple_step=0.023, train/loss_vlb_step=9.23e-5, train/loss_step=0.023, global_step=3273.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  69%|██████▉   | 4117/5971 [42:57<19:20,  1.60it/s, loss=0.113, v_num=0, train/loss_simple_step=0.657, train/loss_vlb_step=0.0133, train/loss_step=0.657, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  69%|██████▉   | 4118/5971 [42:58<19:19,  1.60it/s, loss=0.126, v_num=0, train/loss_simple_step=0.284, train/loss_vlb_step=0.00129, train/loss_step=0.284, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  69%|██████▉   | 4119/5971 [42:59<19:19,  1.60it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0992, train/loss_vlb_step=0.00033, train/loss_step=0.0992, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  69%|██████▉   | 4120/5971 [43:01<19:19,  1.60it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  69%|██████▉   | 4121/5971 [43:01<19:18,  1.60it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:21,  2.04it/s][A

Validating:   1%|          | 2/167 [00:00<00:55,  2.97it/s][A
Epoch 5:  69%|██████▉   | 4125/5971 [43:02<19:15,  1.60it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   3%|▎         | 5/167 [00:00<00:20,  7.82it/s][A
Epoch 5:  69%|██████▉   | 4129/5971 [43:02<19:11,  1.60it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   5%|▌         | 9/167 [00:00<00:11, 13.62it/s][A

Validating:   7%|▋         | 12/167 [00:01<00:09, 16.94it/s][A
Epoch 5:  69%|██████▉   | 4133/5971 [43:02<19:08,  1.60it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   9%|▉         | 15/167 [00:01<00:07, 20.01it/s][A
Epoch 5:  69%|██████▉   | 4137/5971 [43:02<19:04,  1.60it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  11%|█         | 18/167 [00:01<00:07, 20.92it/s][A
Epoch 5:  69%|██████▉   | 4141/5971 [43:02<19:01,  1.60it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 22.94it/s][A
Epoch 5:  69%|██████▉   | 4145/5971 [43:02<18:57,  1.61it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  15%|█▍        | 25/167 [00:01<00:05, 25.66it/s][A

Validating:  17%|█▋        | 28/167 [00:01<00:05, 26.65it/s][A
Epoch 5:  69%|██████▉   | 4149/5971 [43:03<18:54,  1.61it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  19%|█▊        | 31/167 [00:01<00:05, 26.44it/s][A
Epoch 5:  70%|██████▉   | 4153/5971 [43:03<18:50,  1.61it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  20%|██        | 34/167 [00:01<00:05, 25.71it/s][A
Epoch 5:  70%|██████▉   | 4157/5971 [43:03<18:47,  1.61it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  22%|██▏       | 37/167 [00:02<00:05, 25.84it/s][A

Validating:  24%|██▍       | 40/167 [00:02<00:04, 25.53it/s][A
Epoch 5:  70%|██████▉   | 4161/5971 [43:03<18:43,  1.61it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  26%|██▌       | 43/167 [00:02<00:04, 25.57it/s][A
Epoch 5:  70%|██████▉   | 4165/5971 [43:03<18:40,  1.61it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  28%|██▊       | 46/167 [00:02<00:04, 25.52it/s][A
Epoch 5:  70%|██████▉   | 4169/5971 [43:03<18:36,  1.61it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  29%|██▉       | 49/167 [00:02<00:04, 25.39it/s][A

Validating:  31%|███       | 52/167 [00:02<00:04, 24.59it/s][A
Epoch 5:  70%|██████▉   | 4173/5971 [43:04<18:33,  1.62it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  33%|███▎      | 55/167 [00:02<00:04, 25.65it/s][A
Epoch 5:  70%|██████▉   | 4177/5971 [43:04<18:29,  1.62it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  35%|███▍      | 58/167 [00:02<00:04, 24.88it/s][A
Epoch 5:  70%|███████   | 4181/5971 [43:04<18:26,  1.62it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  37%|███▋      | 61/167 [00:02<00:04, 25.70it/s][A

Validating:  38%|███▊      | 64/167 [00:03<00:04, 24.33it/s][A
Epoch 5:  70%|███████   | 4185/5971 [43:04<18:22,  1.62it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  40%|████      | 67/167 [00:03<00:03, 25.38it/s][A
Epoch 5:  70%|███████   | 4189/5971 [43:04<18:19,  1.62it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 25.05it/s][A
Epoch 5:  70%|███████   | 4193/5971 [43:04<18:15,  1.62it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 25.32it/s][A
Epoch 5:  70%|███████   | 4197/5971 [43:04<18:12,  1.62it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 26.87it/s][A

Validating:  48%|████▊     | 80/167 [00:03<00:03, 26.39it/s][A
Epoch 5:  70%|███████   | 4201/5971 [43:05<18:08,  1.63it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 25.78it/s][A
Epoch 5:  70%|███████   | 4205/5971 [43:05<18:05,  1.63it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  51%|█████▏    | 86/167 [00:03<00:03, 26.29it/s][A
Epoch 5:  70%|███████   | 4209/5971 [43:05<18:02,  1.63it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  53%|█████▎    | 89/167 [00:04<00:03, 25.79it/s][A

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 26.17it/s][A
Epoch 5:  71%|███████   | 4213/5971 [43:05<17:58,  1.63it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 27.06it/s][A
Epoch 5:  71%|███████   | 4217/5971 [43:05<17:55,  1.63it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 26.24it/s][A
Epoch 5:  71%|███████   | 4221/5971 [43:05<17:51,  1.63it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  60%|██████    | 101/167 [00:04<00:02, 25.70it/s][A

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 24.74it/s][A
Epoch 5:  71%|███████   | 4225/5971 [43:06<17:48,  1.63it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 25.34it/s][A
Epoch 5:  71%|███████   | 4229/5971 [43:06<17:45,  1.64it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 24.76it/s][A
Epoch 5:  71%|███████   | 4233/5971 [43:06<17:41,  1.64it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  68%|██████▊   | 113/167 [00:05<00:02, 24.64it/s][A

Validating:  69%|██████▉   | 116/167 [00:05<00:02, 25.44it/s][A
Epoch 5:  71%|███████   | 4237/5971 [43:06<17:38,  1.64it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 26.21it/s][A
Epoch 5:  71%|███████   | 4241/5971 [43:06<17:34,  1.64it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 26.43it/s][A
Epoch 5:  71%|███████   | 4245/5971 [43:06<17:31,  1.64it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 27.25it/s][A

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 25.70it/s][A
Epoch 5:  71%|███████   | 4249/5971 [43:07<17:28,  1.64it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 25.70it/s][A
Epoch 5:  71%|███████   | 4253/5971 [43:07<17:24,  1.64it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  80%|████████  | 134/167 [00:05<00:01, 25.73it/s][A
Epoch 5:  71%|███████▏  | 4257/5971 [43:07<17:21,  1.65it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 26.35it/s][A

Validating:  84%|████████▍ | 140/167 [00:06<00:01, 26.98it/s][A
Epoch 5:  71%|███████▏  | 4261/5971 [43:07<17:18,  1.65it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 27.53it/s][A
Epoch 5:  71%|███████▏  | 4265/5971 [43:07<17:14,  1.65it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 28.08it/s][A
Epoch 5:  71%|███████▏  | 4269/5971 [43:07<17:11,  1.65it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 26.45it/s][A
Epoch 5:  72%|███████▏  | 4273/5971 [43:07<17:08,  1.65it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 27.62it/s][A

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 25.76it/s][A
Epoch 5:  72%|███████▏  | 4277/5971 [43:08<17:04,  1.65it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 26.58it/s][A
Epoch 5:  72%|███████▏  | 4281/5971 [43:08<17:01,  1.65it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 26.63it/s][A
Epoch 5:  72%|███████▏  | 4285/5971 [43:08<16:58,  1.66it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 27.38it/s][A
Epoch 5:  72%|███████▏  | 4288/5971 [43:08<16:55,  1.66it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Epoch 5:  72%|███████▏  | 4289/5971 [43:09<16:55,  1.66it/s, loss=0.159, v_num=0, train/loss_simple_step=0.839, train/loss_vlb_step=0.142, train/loss_step=0.839, global_step=3274.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4289/5971 [43:09<16:55,  1.66it/s, loss=0.163, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000394, train/loss_step=0.120, global_step=3275.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4290/5971 [43:10<16:54,  1.66it/s, loss=0.163, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.00047, train/loss_step=0.141, global_step=3275.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  72%|███████▏  | 4291/5971 [43:11<16:54,  1.66it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00693, train/loss_vlb_step=3.24e-5, train/loss_step=0.00693, global_step=3275.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4292/5971 [43:13<16:54,  1.66it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0314, train/loss_vlb_step=0.000118, train/loss_step=0.0314, global_step=3275.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  72%|███████▏  | 4293/5971 [43:14<16:53,  1.66it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0314, train/loss_vlb_step=0.000118, train/loss_step=0.0314, global_step=3275.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4293/5971 [43:14<16:53,  1.66it/s, loss=0.158, v_num=0, train/loss_simple_step=0.280, train/loss_vlb_step=0.00113, train/loss_step=0.280, global_step=3276.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  72%|███████▏  | 4294/5971 [43:15<16:53,  1.65it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0775, train/loss_vlb_step=0.00026, train/loss_step=0.0775, global_step=3276.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4295/5971 [43:16<16:52,  1.65it/s, loss=0.173, v_num=0, train/loss_simple_step=0.257, train/loss_vlb_step=0.00104, train/loss_step=0.257, global_step=3276.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  72%|███████▏  | 4296/5971 [43:18<16:53,  1.65it/s, loss=0.185, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.00103, train/loss_step=0.255, global_step=3276.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4297/5971 [43:19<16:52,  1.65it/s, loss=0.185, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.00103, train/loss_step=0.255, global_step=3276.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4297/5971 [43:19<16:52,  1.65it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0275, train/loss_vlb_step=0.000101, train/loss_step=0.0275, global_step=3277.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4298/5971 [43:20<16:52,  1.65it/s, loss=0.199, v_num=0, train/loss_simple_step=0.558, train/loss_vlb_step=0.00825, train/loss_step=0.558, global_step=3277.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  72%|███████▏  | 4299/5971 [43:21<16:51,  1.65it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0719, train/loss_vlb_step=0.000247, train/loss_step=0.0719, global_step=3277.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4300/5971 [43:23<16:51,  1.65it/s, loss=0.219, v_num=0, train/loss_simple_step=0.336, train/loss_vlb_step=0.00179, train/loss_step=0.336, global_step=3277.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  72%|███████▏  | 4301/5971 [43:24<16:51,  1.65it/s, loss=0.219, v_num=0, train/loss_simple_step=0.336, train/loss_vlb_step=0.00179, train/loss_step=0.336, global_step=3277.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4301/5971 [43:24<16:51,  1.65it/s, loss=0.21, v_num=0, train/loss_simple_step=0.0802, train/loss_vlb_step=0.000267, train/loss_step=0.0802, global_step=3278.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4302/5971 [43:25<16:50,  1.65it/s, loss=0.225, v_num=0, train/loss_simple_step=0.311, train/loss_vlb_step=0.0015, train/loss_step=0.311, global_step=3278.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  72%|███████▏  | 4303/5971 [43:26<16:50,  1.65it/s, loss=0.263, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0465, train/loss_step=0.812, global_step=3278.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4304/5971 [43:28<16:50,  1.65it/s, loss=0.262, v_num=0, train/loss_simple_step=0.00449, train/loss_vlb_step=2.34e-5, train/loss_step=0.00449, global_step=3278.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4305/5971 [43:29<16:49,  1.65it/s, loss=0.262, v_num=0, train/loss_simple_step=0.00449, train/loss_vlb_step=2.34e-5, train/loss_step=0.00449, global_step=3278.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4305/5971 [43:29<16:49,  1.65it/s, loss=0.23, v_num=0, train/loss_simple_step=0.00517, train/loss_vlb_step=2.62e-5, train/loss_step=0.00517, global_step=3279.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  72%|███████▏  | 4306/5971 [43:30<16:49,  1.65it/s, loss=0.22, v_num=0, train/loss_simple_step=0.0845, train/loss_vlb_step=0.000284, train/loss_step=0.0845, global_step=3279.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  72%|███████▏  | 4307/5971 [43:31<16:48,  1.65it/s, loss=0.216, v_num=0, train/loss_simple_step=0.0279, train/loss_vlb_step=0.000109, train/loss_step=0.0279, global_step=3279.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4308/5971 [43:33<16:48,  1.65it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0464, train/loss_vlb_step=0.000165, train/loss_step=0.0464, global_step=3279.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4309/5971 [43:34<16:48,  1.65it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0464, train/loss_vlb_step=0.000165, train/loss_step=0.0464, global_step=3279.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4309/5971 [43:34<16:48,  1.65it/s, loss=0.171, v_num=0, train/loss_simple_step=0.00402, train/loss_vlb_step=2.18e-5, train/loss_step=0.00402, global_step=3280.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4310/5971 [43:35<16:47,  1.65it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0511, train/loss_vlb_step=0.00018, train/loss_step=0.0511, global_step=3280.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  72%|███████▏  | 4311/5971 [43:36<16:47,  1.65it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0148, train/loss_vlb_step=6.03e-5, train/loss_step=0.0148, global_step=3280.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4312/5971 [43:38<16:47,  1.65it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0551, train/loss_vlb_step=0.000188, train/loss_step=0.0551, global_step=3280.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4313/5971 [43:39<16:46,  1.65it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0551, train/loss_vlb_step=0.000188, train/loss_step=0.0551, global_step=3280.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4313/5971 [43:39<16:46,  1.65it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00182, train/loss_vlb_step=1.09e-5, train/loss_step=0.00182, global_step=3281.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4314/5971 [43:40<16:46,  1.65it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0576, train/loss_vlb_step=0.0002, train/loss_step=0.0576, global_step=3281.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  72%|███████▏  | 4315/5971 [43:41<16:45,  1.65it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00655, train/loss_vlb_step=3.14e-5, train/loss_step=0.00655, global_step=3281.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4316/5971 [43:43<16:45,  1.65it/s, loss=0.144, v_num=0, train/loss_simple_step=0.328, train/loss_vlb_step=0.00159, train/loss_step=0.328, global_step=3281.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  72%|███████▏  | 4317/5971 [43:44<16:45,  1.65it/s, loss=0.144, v_num=0, train/loss_simple_step=0.328, train/loss_vlb_step=0.00159, train/loss_step=0.328, global_step=3281.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4317/5971 [43:44<16:45,  1.65it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0595, train/loss_vlb_step=0.000209, train/loss_step=0.0595, global_step=3282.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4318/5971 [43:45<16:44,  1.65it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0371, train/loss_vlb_step=0.00014, train/loss_step=0.0371, global_step=3282.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  72%|███████▏  | 4319/5971 [43:46<16:44,  1.64it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0333, train/loss_vlb_step=0.000123, train/loss_step=0.0333, global_step=3282.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4320/5971 [43:48<16:44,  1.64it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0289, train/loss_vlb_step=0.000111, train/loss_step=0.0289, global_step=3282.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4321/5971 [43:49<16:43,  1.64it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0289, train/loss_vlb_step=0.000111, train/loss_step=0.0289, global_step=3282.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4321/5971 [43:49<16:43,  1.64it/s, loss=0.108, v_num=0, train/loss_simple_step=0.195, train/loss_vlb_step=0.000699, train/loss_step=0.195, global_step=3283.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  72%|███████▏  | 4322/5971 [43:50<16:43,  1.64it/s, loss=0.0933, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.71e-5, train/loss_step=0.0127, global_step=3283.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4323/5971 [43:51<16:42,  1.64it/s, loss=0.0537, v_num=0, train/loss_simple_step=0.0204, train/loss_vlb_step=8.63e-5, train/loss_step=0.0204, global_step=3283.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4324/5971 [43:53<16:42,  1.64it/s, loss=0.0536, v_num=0, train/loss_simple_step=0.00246, train/loss_vlb_step=1.42e-5, train/loss_step=0.00246, global_step=3283.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4325/5971 [43:54<16:42,  1.64it/s, loss=0.0536, v_num=0, train/loss_simple_step=0.00246, train/loss_vlb_step=1.42e-5, train/loss_step=0.00246, global_step=3283.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4325/5971 [43:54<16:42,  1.64it/s, loss=0.0534, v_num=0, train/loss_simple_step=0.00217, train/loss_vlb_step=1.17e-5, train/loss_step=0.00217, global_step=3284.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  72%|███████▏  | 4326/5971 [43:55<16:41,  1.64it/s, loss=0.0499, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=6.19e-5, train/loss_step=0.0144, global_step=3284.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  72%|███████▏  | 4327/5971 [43:56<16:41,  1.64it/s, loss=0.0617, v_num=0, train/loss_simple_step=0.263, train/loss_vlb_step=0.00101, train/loss_step=0.263, global_step=3284.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  72%|███████▏  | 4328/5971 [43:58<16:41,  1.64it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000365, train/loss_step=0.111, global_step=3284.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4329/5971 [43:59<16:40,  1.64it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000365, train/loss_step=0.111, global_step=3284.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4329/5971 [43:59<16:40,  1.64it/s, loss=0.0667, v_num=0, train/loss_simple_step=0.0395, train/loss_vlb_step=0.000146, train/loss_step=0.0395, global_step=3285.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4330/5971 [44:00<16:40,  1.64it/s, loss=0.0643, v_num=0, train/loss_simple_step=0.00384, train/loss_vlb_step=1.96e-5, train/loss_step=0.00384, global_step=3285.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4331/5971 [44:00<16:39,  1.64it/s, loss=0.0952, v_num=0, train/loss_simple_step=0.633, train/loss_vlb_step=0.00689, train/loss_step=0.633, global_step=3285.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  73%|███████▎  | 4332/5971 [44:03<16:39,  1.64it/s, loss=0.126, v_num=0, train/loss_simple_step=0.674, train/loss_vlb_step=0.0164, train/loss_step=0.674, global_step=3285.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  73%|███████▎  | 4333/5971 [44:04<16:39,  1.64it/s, loss=0.126, v_num=0, train/loss_simple_step=0.674, train/loss_vlb_step=0.0164, train/loss_step=0.674, global_step=3285.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4333/5971 [44:04<16:39,  1.64it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=3286.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4334/5971 [44:04<16:38,  1.64it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=4.88e-5, train/loss_step=0.0109, global_step=3286.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  73%|███████▎  | 4335/5971 [44:05<16:38,  1.64it/s, loss=0.143, v_num=0, train/loss_simple_step=0.398, train/loss_vlb_step=0.00274, train/loss_step=0.398, global_step=3286.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  73%|███████▎  | 4336/5971 [44:07<16:38,  1.64it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0459, train/loss_vlb_step=0.000162, train/loss_step=0.0459, global_step=3286.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4337/5971 [44:08<16:37,  1.64it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0459, train/loss_vlb_step=0.000162, train/loss_step=0.0459, global_step=3286.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4337/5971 [44:08<16:37,  1.64it/s, loss=0.142, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.00121, train/loss_step=0.313, global_step=3287.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  73%|███████▎  | 4338/5971 [44:09<16:37,  1.64it/s, loss=0.16, v_num=0, train/loss_simple_step=0.390, train/loss_vlb_step=0.00248, train/loss_step=0.390, global_step=3287.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  73%|███████▎  | 4339/5971 [44:10<16:36,  1.64it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00166, train/loss_vlb_step=9.83e-6, train/loss_step=0.00166, global_step=3287.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4340/5971 [44:12<16:36,  1.64it/s, loss=0.177, v_num=0, train/loss_simple_step=0.413, train/loss_vlb_step=0.00266, train/loss_step=0.413, global_step=3287.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  73%|███████▎  | 4341/5971 [44:13<16:36,  1.64it/s, loss=0.177, v_num=0, train/loss_simple_step=0.413, train/loss_vlb_step=0.00266, train/loss_step=0.413, global_step=3287.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4341/5971 [44:13<16:36,  1.64it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0496, train/loss_vlb_step=0.000164, train/loss_step=0.0496, global_step=3288.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4342/5971 [44:14<16:35,  1.64it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0551, train/loss_vlb_step=0.000194, train/loss_step=0.0551, global_step=3288.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4343/5971 [44:15<16:35,  1.64it/s, loss=0.183, v_num=0, train/loss_simple_step=0.246, train/loss_vlb_step=0.000962, train/loss_step=0.246, global_step=3288.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  73%|███████▎  | 4344/5971 [44:17<16:35,  1.63it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0404, train/loss_vlb_step=0.000146, train/loss_step=0.0404, global_step=3288.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4345/5971 [44:18<16:34,  1.63it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0404, train/loss_vlb_step=0.000146, train/loss_step=0.0404, global_step=3288.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4345/5971 [44:18<16:34,  1.63it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0046, train/loss_vlb_step=2.4e-5, train/loss_step=0.0046, global_step=3289.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  73%|███████▎  | 4346/5971 [44:19<16:34,  1.63it/s, loss=0.194, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.000627, train/loss_step=0.187, global_step=3289.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4347/5971 [44:20<16:33,  1.63it/s, loss=0.198, v_num=0, train/loss_simple_step=0.334, train/loss_vlb_step=0.00138, train/loss_step=0.334, global_step=3289.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  73%|███████▎  | 4348/5971 [44:22<16:33,  1.63it/s, loss=0.2, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.00058, train/loss_step=0.168, global_step=3289.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  73%|███████▎  | 4349/5971 [44:23<16:33,  1.63it/s, loss=0.2, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.00058, train/loss_step=0.168, global_step=3289.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4349/5971 [44:23<16:33,  1.63it/s, loss=0.239, v_num=0, train/loss_simple_step=0.818, train/loss_vlb_step=0.0469, train/loss_step=0.818, global_step=3290.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4350/5971 [44:24<16:32,  1.63it/s, loss=0.239, v_num=0, train/loss_simple_step=0.00357, train/loss_vlb_step=1.92e-5, train/loss_step=0.00357, global_step=3290.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4351/5971 [44:25<16:32,  1.63it/s, loss=0.213, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000336, train/loss_step=0.102, global_step=3290.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  73%|███████▎  | 4352/5971 [44:27<16:32,  1.63it/s, loss=0.193, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00133, train/loss_step=0.282, global_step=3290.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  73%|███████▎  | 4353/5971 [44:28<16:31,  1.63it/s, loss=0.193, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00133, train/loss_step=0.282, global_step=3290.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4353/5971 [44:28<16:31,  1.63it/s, loss=0.193, v_num=0, train/loss_simple_step=0.00659, train/loss_vlb_step=2.98e-5, train/loss_step=0.00659, global_step=3291.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4354/5971 [44:29<16:31,  1.63it/s, loss=0.198, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000366, train/loss_step=0.111, global_step=3291.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  73%|███████▎  | 4355/5971 [44:30<16:30,  1.63it/s, loss=0.179, v_num=0, train/loss_simple_step=0.00959, train/loss_vlb_step=4.53e-5, train/loss_step=0.00959, global_step=3291.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4356/5971 [44:32<16:30,  1.63it/s, loss=0.2, v_num=0, train/loss_simple_step=0.472, train/loss_vlb_step=0.00395, train/loss_step=0.472, global_step=3291.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]      
Epoch 5:  73%|███████▎  | 4357/5971 [44:33<16:29,  1.63it/s, loss=0.2, v_num=0, train/loss_simple_step=0.472, train/loss_vlb_step=0.00395, train/loss_step=0.472, global_step=3291.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4357/5971 [44:33<16:29,  1.63it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0077, train/loss_vlb_step=3.8e-5, train/loss_step=0.0077, global_step=3292.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4358/5971 [44:33<16:29,  1.63it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00198, train/loss_vlb_step=1.18e-5, train/loss_step=0.00198, global_step=3292.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4359/5971 [44:34<16:28,  1.63it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0137, train/loss_vlb_step=5.97e-5, train/loss_step=0.0137, global_step=3292.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  73%|███████▎  | 4360/5971 [44:36<16:28,  1.63it/s, loss=0.154, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000572, train/loss_step=0.165, global_step=3292.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  73%|███████▎  | 4361/5971 [44:37<16:28,  1.63it/s, loss=0.154, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000572, train/loss_step=0.165, global_step=3292.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4361/5971 [44:37<16:28,  1.63it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0357, train/loss_vlb_step=0.000128, train/loss_step=0.0357, global_step=3293.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4362/5971 [44:38<16:27,  1.63it/s, loss=0.178, v_num=0, train/loss_simple_step=0.547, train/loss_vlb_step=0.0085, train/loss_step=0.547, global_step=3293.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  73%|███████▎  | 4363/5971 [44:39<16:27,  1.63it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00817, train/loss_vlb_step=3.78e-5, train/loss_step=0.00817, global_step=3293.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4364/5971 [44:41<16:27,  1.63it/s, loss=0.17, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000437, train/loss_step=0.132, global_step=3293.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  73%|███████▎  | 4365/5971 [44:42<16:26,  1.63it/s, loss=0.17, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000437, train/loss_step=0.132, global_step=3293.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4365/5971 [44:42<16:26,  1.63it/s, loss=0.177, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000416, train/loss_step=0.127, global_step=3294.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4366/5971 [44:43<16:26,  1.63it/s, loss=0.168, v_num=0, train/loss_simple_step=0.022, train/loss_vlb_step=8.63e-5, train/loss_step=0.022, global_step=3294.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  73%|███████▎  | 4367/5971 [44:44<16:25,  1.63it/s, loss=0.164, v_num=0, train/loss_simple_step=0.239, train/loss_vlb_step=0.000824, train/loss_step=0.239, global_step=3294.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4368/5971 [44:46<16:25,  1.63it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00456, train/loss_vlb_step=2.41e-5, train/loss_step=0.00456, global_step=3294.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4369/5971 [44:47<16:25,  1.63it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00456, train/loss_vlb_step=2.41e-5, train/loss_step=0.00456, global_step=3294.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4369/5971 [44:47<16:25,  1.63it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0146, train/loss_vlb_step=6.03e-5, train/loss_step=0.0146, global_step=3295.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  73%|███████▎  | 4370/5971 [44:48<16:24,  1.63it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0253, train/loss_vlb_step=0.000102, train/loss_step=0.0253, global_step=3295.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4371/5971 [44:49<16:24,  1.63it/s, loss=0.124, v_num=0, train/loss_simple_step=0.265, train/loss_vlb_step=0.00124, train/loss_step=0.265, global_step=3295.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  73%|███████▎  | 4372/5971 [44:51<16:24,  1.62it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0632, train/loss_vlb_step=0.000214, train/loss_step=0.0632, global_step=3295.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4373/5971 [44:52<16:23,  1.62it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0632, train/loss_vlb_step=0.000214, train/loss_step=0.0632, global_step=3295.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4373/5971 [44:52<16:23,  1.62it/s, loss=0.121, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000554, train/loss_step=0.158, global_step=3296.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  73%|███████▎  | 4374/5971 [44:53<16:23,  1.62it/s, loss=0.121, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000354, train/loss_step=0.107, global_step=3296.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4375/5971 [44:54<16:22,  1.62it/s, loss=0.134, v_num=0, train/loss_simple_step=0.281, train/loss_vlb_step=0.00121, train/loss_step=0.281, global_step=3296.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  73%|███████▎  | 4376/5971 [44:56<16:22,  1.62it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0014, train/loss_vlb_step=8.28e-6, train/loss_step=0.0014, global_step=3296.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4377/5971 [44:57<16:22,  1.62it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0014, train/loss_vlb_step=8.28e-6, train/loss_step=0.0014, global_step=3296.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4377/5971 [44:57<16:22,  1.62it/s, loss=0.123, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000968, train/loss_step=0.242, global_step=3297.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  73%|███████▎  | 4378/5971 [44:58<16:21,  1.62it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00682, train/loss_vlb_step=3.32e-5, train/loss_step=0.00682, global_step=3297.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4379/5971 [44:58<16:20,  1.62it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=5.92e-5, train/loss_step=0.0138, global_step=3297.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  73%|███████▎  | 4380/5971 [45:01<16:20,  1.62it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0123, train/loss_vlb_step=5.6e-5, train/loss_step=0.0123, global_step=3297.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  73%|███████▎  | 4381/5971 [45:02<16:20,  1.62it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0123, train/loss_vlb_step=5.6e-5, train/loss_step=0.0123, global_step=3297.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4381/5971 [45:02<16:20,  1.62it/s, loss=0.114, v_num=0, train/loss_simple_step=0.015, train/loss_vlb_step=6.39e-5, train/loss_step=0.015, global_step=3298.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  73%|███████▎  | 4382/5971 [45:02<16:19,  1.62it/s, loss=0.0884, v_num=0, train/loss_simple_step=0.031, train/loss_vlb_step=0.000116, train/loss_step=0.031, global_step=3298.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4383/5971 [45:03<16:19,  1.62it/s, loss=0.0882, v_num=0, train/loss_simple_step=0.00383, train/loss_vlb_step=1.98e-5, train/loss_step=0.00383, global_step=3298.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4384/5971 [45:05<16:19,  1.62it/s, loss=0.0819, v_num=0, train/loss_simple_step=0.00596, train/loss_vlb_step=2.7e-5, train/loss_step=0.00596, global_step=3298.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  73%|███████▎  | 4385/5971 [45:06<16:18,  1.62it/s, loss=0.0819, v_num=0, train/loss_simple_step=0.00596, train/loss_vlb_step=2.7e-5, train/loss_step=0.00596, global_step=3298.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4385/5971 [45:06<16:18,  1.62it/s, loss=0.0761, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=4.74e-5, train/loss_step=0.0109, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  73%|███████▎  | 4386/5971 [45:07<16:18,  1.62it/s, loss=0.0752, v_num=0, train/loss_simple_step=0.00329, train/loss_vlb_step=1.87e-5, train/loss_step=0.00329, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  73%|███████▎  | 4387/5971 [45:08<16:17,  1.62it/s, loss=0.065, v_num=0, train/loss_simple_step=0.0351, train/loss_vlb_step=0.000131, train/loss_step=0.0351, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  73%|███████▎  | 4388/5971 [45:10<16:17,  1.62it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  74%|███████▎  | 4389/5971 [45:10<16:16,  1.62it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:15,  2.20it/s][A

Validating:   2%|▏         | 3/167 [00:00<00:26,  6.17it/s][A
Epoch 5:  74%|███████▎  | 4393/5971 [45:11<16:13,  1.62it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   4%|▎         | 6/167 [00:00<00:13, 11.50it/s][A
Epoch 5:  74%|███████▎  | 4397/5971 [45:11<16:10,  1.62it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   5%|▌         | 9/167 [00:00<00:09, 15.98it/s][A

Validating:   7%|▋         | 12/167 [00:00<00:08, 19.33it/s][A
Epoch 5:  74%|███████▎  | 4401/5971 [45:11<16:07,  1.62it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   9%|▉         | 15/167 [00:01<00:07, 20.97it/s][A
Epoch 5:  74%|███████▍  | 4405/5971 [45:11<16:03,  1.62it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  11%|█         | 18/167 [00:01<00:06, 22.88it/s][A
Epoch 5:  74%|███████▍  | 4409/5971 [45:11<16:00,  1.63it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  13%|█▎        | 22/167 [00:01<00:05, 25.52it/s][A
Epoch 5:  74%|███████▍  | 4413/5971 [45:12<15:57,  1.63it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 27.30it/s][A
Epoch 5:  74%|███████▍  | 4417/5971 [45:12<15:53,  1.63it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  18%|█▊        | 30/167 [00:01<00:04, 28.22it/s][A
Epoch 5:  74%|███████▍  | 4421/5971 [45:12<15:50,  1.63it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  20%|██        | 34/167 [00:01<00:04, 28.29it/s][A
Epoch 5:  74%|███████▍  | 4425/5971 [45:12<15:47,  1.63it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  23%|██▎       | 38/167 [00:01<00:04, 27.80it/s][A
Epoch 5:  74%|███████▍  | 4429/5971 [45:12<15:44,  1.63it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  25%|██▍       | 41/167 [00:01<00:04, 28.09it/s][A

Validating:  26%|██▋       | 44/167 [00:02<00:04, 28.53it/s][A
Epoch 5:  74%|███████▍  | 4433/5971 [45:12<15:40,  1.63it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 28.06it/s][A
Epoch 5:  74%|███████▍  | 4437/5971 [45:12<15:37,  1.64it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 27.44it/s][A
Epoch 5:  74%|███████▍  | 4441/5971 [45:13<15:34,  1.64it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 26.75it/s][A

Validating:  34%|███▎      | 56/167 [00:02<00:04, 25.70it/s][A
Epoch 5:  74%|███████▍  | 4445/5971 [45:13<15:31,  1.64it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  35%|███▌      | 59/167 [00:02<00:04, 26.07it/s][A
Epoch 5:  75%|███████▍  | 4449/5971 [45:13<15:28,  1.64it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  37%|███▋      | 62/167 [00:02<00:04, 25.41it/s][A
Epoch 5:  75%|███████▍  | 4453/5971 [45:13<15:24,  1.64it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  39%|███▉      | 65/167 [00:02<00:03, 25.97it/s][A
Epoch 5:  75%|███████▍  | 4457/5971 [45:13<15:21,  1.64it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  41%|████▏     | 69/167 [00:02<00:03, 27.24it/s][A

Validating:  43%|████▎     | 72/167 [00:03<00:03, 27.28it/s][A
Epoch 5:  75%|███████▍  | 4461/5971 [45:13<15:18,  1.64it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 26.11it/s][A
Epoch 5:  75%|███████▍  | 4465/5971 [45:13<15:15,  1.65it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 25.20it/s][A
Epoch 5:  75%|███████▍  | 4469/5971 [45:14<15:11,  1.65it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 26.08it/s][A

Validating:  50%|█████     | 84/167 [00:03<00:03, 25.94it/s][A
Epoch 5:  75%|███████▍  | 4473/5971 [45:14<15:08,  1.65it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  52%|█████▏    | 87/167 [00:03<00:03, 25.84it/s][A
Epoch 5:  75%|███████▍  | 4477/5971 [45:14<15:05,  1.65it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  54%|█████▍    | 90/167 [00:03<00:03, 25.26it/s][A
Epoch 5:  75%|███████▌  | 4481/5971 [45:14<15:02,  1.65it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  56%|█████▋    | 94/167 [00:03<00:02, 27.08it/s][A
Epoch 5:  75%|███████▌  | 4485/5971 [45:14<14:59,  1.65it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 27.07it/s][A

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 25.52it/s][A
Epoch 5:  75%|███████▌  | 4489/5971 [45:14<14:56,  1.65it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 25.78it/s][A
Epoch 5:  75%|███████▌  | 4493/5971 [45:15<14:52,  1.66it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 25.84it/s][A
Epoch 5:  75%|███████▌  | 4497/5971 [45:15<14:49,  1.66it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 25.26it/s][A
Epoch 5:  75%|███████▌  | 4501/5971 [45:15<14:46,  1.66it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  68%|██████▊   | 113/167 [00:04<00:02, 26.90it/s][A

Validating:  69%|██████▉   | 116/167 [00:04<00:01, 26.58it/s][A
Epoch 5:  75%|███████▌  | 4505/5971 [45:15<14:43,  1.66it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  71%|███████▏  | 119/167 [00:04<00:01, 25.94it/s][A
Epoch 5:  76%|███████▌  | 4509/5971 [45:15<14:40,  1.66it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 25.91it/s][A
Epoch 5:  76%|███████▌  | 4513/5971 [45:15<14:37,  1.66it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 25.26it/s][A

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 25.11it/s][A
Epoch 5:  76%|███████▌  | 4517/5971 [45:15<14:34,  1.66it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 25.08it/s][A
Epoch 5:  76%|███████▌  | 4521/5971 [45:16<14:30,  1.66it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  80%|████████  | 134/167 [00:05<00:01, 25.03it/s][A
Epoch 5:  76%|███████▌  | 4525/5971 [45:16<14:27,  1.67it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 24.33it/s][A

Validating:  84%|████████▍ | 140/167 [00:05<00:01, 25.73it/s][A
Epoch 5:  76%|███████▌  | 4529/5971 [45:16<14:24,  1.67it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  86%|████████▌ | 143/167 [00:05<00:00, 26.17it/s][A
Epoch 5:  76%|███████▌  | 4533/5971 [45:16<14:21,  1.67it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  87%|████████▋ | 146/167 [00:05<00:00, 26.36it/s][A
Epoch 5:  76%|███████▌  | 4537/5971 [45:16<14:18,  1.67it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 24.83it/s][A
Epoch 5:  76%|███████▌  | 4541/5971 [45:16<14:15,  1.67it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 26.32it/s][A

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 25.91it/s][A
Epoch 5:  76%|███████▌  | 4545/5971 [45:17<14:12,  1.67it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 27.80it/s][A
Epoch 5:  76%|███████▌  | 4549/5971 [45:17<14:09,  1.67it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 28.17it/s][A
Epoch 5:  76%|███████▋  | 4553/5971 [45:17<14:06,  1.68it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 27.14it/s][A
Epoch 5:  76%|███████▋  | 4556/5971 [45:17<14:03,  1.68it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:30,  1.63it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:17,  2.67it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:13,  3.37it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.82it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.23it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.56it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.88it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.09it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.26it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.38it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.45it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.49it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.54it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.57it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.54it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.44it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.38it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.33it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.33it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.42it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.44it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.43it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.42it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.50it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.53it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.45it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.42it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.40it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.39it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.42it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.49it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.55it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.58it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.61it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.63it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.65it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.55it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.55it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.54it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.55it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.51it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.50it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.53it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.55it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.47it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.46it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.45it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.45it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.39it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.41it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.20it/s]

Epoch 5:  76%|███████▋  | 4557/5971 [45:29<14:06,  1.67it/s, loss=0.0649, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.71e-5, train/loss_step=0.00303, global_step=3299.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  76%|███████▋  | 4557/5971 [45:29<14:06,  1.67it/s, loss=0.0647, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.72e-5, train/loss_step=0.0105, global_step=3300.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.30it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.33it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:15,  3.11it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.70it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:11,  4.03it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:10,  4.33it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.63it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.85it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  4.85it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:08,  4.87it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  4.96it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.10it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:07,  5.18it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.26it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.29it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.28it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.30it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:04<00:05,  5.33it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.39it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.40it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.36it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.31it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.30it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.32it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.33it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.32it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.31it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.33it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:06<00:03,  5.34it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.38it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.43it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.44it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.45it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.49it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.39it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.40it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.42it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.42it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.45it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.50it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.54it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.55it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.48it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.45it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:00,  5.45it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.46it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.50it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.44it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.42it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.40it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.03it/s]

Epoch 5:  76%|███████▋  | 4558/5971 [45:42<14:09,  1.66it/s, loss=0.0647, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.72e-5, train/loss_step=0.0105, global_step=3300.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  76%|███████▋  | 4558/5971 [45:42<14:09,  1.66it/s, loss=0.0656, v_num=0, train/loss_simple_step=0.0421, train/loss_vlb_step=0.000158, train/loss_step=0.0421, global_step=3300.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.33it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.37it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.21it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.85it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.34it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.72it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.99it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.20it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.34it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.36it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.41it/s]
Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.45it/s]
Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.44it/s]
Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.46it/s]
Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.51it/s]
Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.57it/s]
Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.49it/s]
Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.45it/s]
Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.40it/s]
Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.41it/s]
Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.46it/s]
Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.49it/s]
Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.38it/s]
Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.45it/s]
Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.51it/s]
Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.55it/s]
Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.37it/s]
Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.37it/s]
Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.30it/s]
Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.22it/s]
Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.17it/s]
Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.20it/s]
Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.34it/s]
Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.44it/s]
Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.51it/s]
Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.50it/s]
Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.48it/s]
Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.52it/s]
Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.57it/s]
Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.61it/s]
Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.55it/s]
Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.53it/s]
Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.53it/s]
Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.53it/s]
Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.52it/s]
Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.47it/s]
Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.49it/s]
Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.43it/s]
Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.33it/s]
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.30it/s]
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.14it/s]

Epoch 5:  76%|███████▋  | 4559/5971 [45:54<14:12,  1.66it/s, loss=0.0656, v_num=0, train/loss_simple_step=0.0421, train/loss_vlb_step=0.000158, train/loss_step=0.0421, global_step=3300.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  76%|███████▋  | 4559/5971 [45:54<14:12,  1.66it/s, loss=0.0726, v_num=0, train/loss_simple_step=0.407, train/loss_vlb_step=0.00243, train/loss_step=0.407, global_step=3300.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s]
Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s]
Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.42it/s]
Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.25it/s]
Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.88it/s]
Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.33it/s]
Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.64it/s]
Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.86it/s]
Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.98it/s]
Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.05it/s]
Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.11it/s]
Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.21it/s]
Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.28it/s]
Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.33it/s]
Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.39it/s]
Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.37it/s]
Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.38it/s]
Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.41it/s]
Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.44it/s]
Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.47it/s]
Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.49it/s]
Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.50it/s]
Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.50it/s]
Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.50it/s]
Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.50it/s]
Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.49it/s]
Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.49it/s]
Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.49it/s]
Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.43it/s]
Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.45it/s]
Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.48it/s]
Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.45it/s]
Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.46it/s]
Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.37it/s]
Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.41it/s]
Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.45it/s]
Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.51it/s]
Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.57it/s]
Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.61it/s]
Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.60it/s]
Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.50it/s]
Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.51it/s]
Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.51it/s]
Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.50it/s]
Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.49it/s]
Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.47it/s]
Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.53it/s]
Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.57it/s]
Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.60it/s]
Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.61it/s]
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.65it/s]
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.16it/s]

Epoch 5:  76%|███████▋  | 4560/5971 [46:08<14:16,  1.65it/s, loss=0.0726, v_num=0, train/loss_simple_step=0.407, train/loss_vlb_step=0.00243, train/loss_step=0.407, global_step=3300.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  76%|███████▋  | 4560/5971 [46:08<14:16,  1.65it/s, loss=0.0782, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000583, train/loss_step=0.175, global_step=3300.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  76%|███████▋  | 4561/5971 [46:08<14:15,  1.65it/s, loss=0.0782, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000583, train/loss_step=0.175, global_step=3300.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  76%|███████▋  | 4561/5971 [46:08<14:15,  1.65it/s, loss=0.0712, v_num=0, train/loss_simple_step=0.0183, train/loss_vlb_step=7.33e-5, train/loss_step=0.0183, global_step=3301.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  76%|███████▋  | 4562/5971 [46:09<14:15,  1.65it/s, loss=0.0712, v_num=0, train/loss_simple_step=0.0183, train/loss_vlb_step=7.33e-5, train/loss_step=0.0183, global_step=3301.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  76%|███████▋  | 4562/5971 [46:09<14:15,  1.65it/s, loss=0.0796, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.000961, train/loss_step=0.274, global_step=3301.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  76%|███████▋  | 4563/5971 [46:10<14:14,  1.65it/s, loss=0.0796, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.000961, train/loss_step=0.274, global_step=3301.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  76%|███████▋  | 4563/5971 [46:10<14:14,  1.65it/s, loss=0.0661, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.1e-5, train/loss_step=0.0112, global_step=3301.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  76%|███████▋  | 4564/5971 [46:12<14:14,  1.65it/s, loss=0.0661, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.1e-5, train/loss_step=0.0112, global_step=3301.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  76%|███████▋  | 4564/5971 [46:12<14:14,  1.65it/s, loss=0.0661, v_num=0, train/loss_simple_step=0.0022, train/loss_vlb_step=1.3e-5, train/loss_step=0.0022, global_step=3301.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  76%|███████▋  | 4565/5971 [46:13<14:14,  1.65it/s, loss=0.0661, v_num=0, train/loss_simple_step=0.0022, train/loss_vlb_step=1.3e-5, train/loss_step=0.0022, global_step=3301.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  76%|███████▋  | 4565/5971 [46:13<14:14,  1.65it/s, loss=0.0589, v_num=0, train/loss_simple_step=0.0984, train/loss_vlb_step=0.000326, train/loss_step=0.0984, global_step=3302.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  76%|███████▋  | 4566/5971 [46:14<14:13,  1.65it/s, loss=0.0589, v_num=0, train/loss_simple_step=0.0984, train/loss_vlb_step=0.000326, train/loss_step=0.0984, global_step=3302.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  76%|███████▋  | 4566/5971 [46:14<14:13,  1.65it/s, loss=0.0589, v_num=0, train/loss_simple_step=0.00504, train/loss_vlb_step=2.63e-5, train/loss_step=0.00504, global_step=3302.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  76%|███████▋  | 4567/5971 [46:15<14:13,  1.65it/s, loss=0.0589, v_num=0, train/loss_simple_step=0.00504, train/loss_vlb_step=2.63e-5, train/loss_step=0.00504, global_step=3302.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  76%|███████▋  | 4567/5971 [46:15<14:13,  1.65it/s, loss=0.0697, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.000796, train/loss_step=0.230, global_step=3302.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  77%|███████▋  | 4568/5971 [46:17<14:12,  1.64it/s, loss=0.0697, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.000796, train/loss_step=0.230, global_step=3302.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4568/5971 [46:17<14:12,  1.64it/s, loss=0.076, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000465, train/loss_step=0.140, global_step=3302.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  77%|███████▋  | 4569/5971 [46:18<14:12,  1.64it/s, loss=0.076, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000465, train/loss_step=0.140, global_step=3302.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4569/5971 [46:18<14:12,  1.64it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.0103, train/loss_vlb_step=4.48e-5, train/loss_step=0.0103, global_step=3303.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4570/5971 [46:19<14:11,  1.64it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.0103, train/loss_vlb_step=4.48e-5, train/loss_step=0.0103, global_step=3303.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4570/5971 [46:19<14:11,  1.64it/s, loss=0.0748, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.84e-5, train/loss_step=0.0108, global_step=3303.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4571/5971 [46:20<14:11,  1.64it/s, loss=0.0748, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.84e-5, train/loss_step=0.0108, global_step=3303.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4571/5971 [46:20<14:11,  1.64it/s, loss=0.0755, v_num=0, train/loss_simple_step=0.0189, train/loss_vlb_step=7.71e-5, train/loss_step=0.0189, global_step=3303.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4572/5971 [46:22<14:11,  1.64it/s, loss=0.0755, v_num=0, train/loss_simple_step=0.0189, train/loss_vlb_step=7.71e-5, train/loss_step=0.0189, global_step=3303.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4572/5971 [46:22<14:11,  1.64it/s, loss=0.105, v_num=0, train/loss_simple_step=0.605, train/loss_vlb_step=0.0108, train/loss_step=0.605, global_step=3303.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  77%|███████▋  | 4573/5971 [46:23<14:10,  1.64it/s, loss=0.105, v_num=0, train/loss_simple_step=0.605, train/loss_vlb_step=0.0108, train/loss_step=0.605, global_step=3303.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4573/5971 [46:23<14:10,  1.64it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0693, train/loss_vlb_step=0.000232, train/loss_step=0.0693, global_step=3304.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4574/5971 [46:24<14:10,  1.64it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0693, train/loss_vlb_step=0.000232, train/loss_step=0.0693, global_step=3304.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4574/5971 [46:24<14:10,  1.64it/s, loss=0.126, v_num=0, train/loss_simple_step=0.359, train/loss_vlb_step=0.00202, train/loss_step=0.359, global_step=3304.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  77%|███████▋  | 4575/5971 [46:25<14:09,  1.64it/s, loss=0.126, v_num=0, train/loss_simple_step=0.359, train/loss_vlb_step=0.00202, train/loss_step=0.359, global_step=3304.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4575/5971 [46:25<14:09,  1.64it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00977, train/loss_vlb_step=4.43e-5, train/loss_step=0.00977, global_step=3304.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4576/5971 [46:27<14:09,  1.64it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00977, train/loss_vlb_step=4.43e-5, train/loss_step=0.00977, global_step=3304.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4576/5971 [46:27<14:09,  1.64it/s, loss=0.143, v_num=0, train/loss_simple_step=0.370, train/loss_vlb_step=0.00189, train/loss_step=0.370, global_step=3304.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  77%|███████▋  | 4577/5971 [46:28<14:09,  1.64it/s, loss=0.143, v_num=0, train/loss_simple_step=0.370, train/loss_vlb_step=0.00189, train/loss_step=0.370, global_step=3304.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4577/5971 [46:28<14:09,  1.64it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=6.13e-5, train/loss_step=0.0143, global_step=3305.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4578/5971 [46:29<14:08,  1.64it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=6.13e-5, train/loss_step=0.0143, global_step=3305.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4578/5971 [46:29<14:08,  1.64it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0431, train/loss_vlb_step=0.000166, train/loss_step=0.0431, global_step=3305.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4579/5971 [46:30<14:07,  1.64it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0431, train/loss_vlb_step=0.000166, train/loss_step=0.0431, global_step=3305.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4579/5971 [46:30<14:07,  1.64it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=5.74e-5, train/loss_step=0.0136, global_step=3305.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  77%|███████▋  | 4580/5971 [46:32<14:07,  1.64it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=5.74e-5, train/loss_step=0.0136, global_step=3305.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4580/5971 [46:32<14:07,  1.64it/s, loss=0.138, v_num=0, train/loss_simple_step=0.449, train/loss_vlb_step=0.00236, train/loss_step=0.449, global_step=3305.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  77%|███████▋  | 4581/5971 [46:33<14:07,  1.64it/s, loss=0.138, v_num=0, train/loss_simple_step=0.449, train/loss_vlb_step=0.00236, train/loss_step=0.449, global_step=3305.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4581/5971 [46:33<14:07,  1.64it/s, loss=0.14, v_num=0, train/loss_simple_step=0.062, train/loss_vlb_step=0.000217, train/loss_step=0.062, global_step=3306.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4582/5971 [46:34<14:06,  1.64it/s, loss=0.14, v_num=0, train/loss_simple_step=0.062, train/loss_vlb_step=0.000217, train/loss_step=0.062, global_step=3306.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4582/5971 [46:34<14:06,  1.64it/s, loss=0.138, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.00095, train/loss_step=0.237, global_step=3306.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4583/5971 [46:34<14:06,  1.64it/s, loss=0.138, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.00095, train/loss_step=0.237, global_step=3306.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4583/5971 [46:34<14:06,  1.64it/s, loss=0.151, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00132, train/loss_step=0.267, global_step=3306.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4584/5971 [46:37<14:06,  1.64it/s, loss=0.151, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00132, train/loss_step=0.267, global_step=3306.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4584/5971 [46:37<14:06,  1.64it/s, loss=0.165, v_num=0, train/loss_simple_step=0.284, train/loss_vlb_step=0.00135, train/loss_step=0.284, global_step=3306.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4585/5971 [46:38<14:05,  1.64it/s, loss=0.165, v_num=0, train/loss_simple_step=0.284, train/loss_vlb_step=0.00135, train/loss_step=0.284, global_step=3306.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4585/5971 [46:38<14:05,  1.64it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0146, train/loss_vlb_step=6.27e-5, train/loss_step=0.0146, global_step=3307.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4586/5971 [46:39<14:05,  1.64it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0146, train/loss_vlb_step=6.27e-5, train/loss_step=0.0146, global_step=3307.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4586/5971 [46:39<14:05,  1.64it/s, loss=0.193, v_num=0, train/loss_simple_step=0.656, train/loss_vlb_step=0.0128, train/loss_step=0.656, global_step=3307.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  77%|███████▋  | 4587/5971 [46:39<14:04,  1.64it/s, loss=0.193, v_num=0, train/loss_simple_step=0.656, train/loss_vlb_step=0.0128, train/loss_step=0.656, global_step=3307.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4587/5971 [46:39<14:04,  1.64it/s, loss=0.217, v_num=0, train/loss_simple_step=0.705, train/loss_vlb_step=0.0137, train/loss_step=0.705, global_step=3307.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4588/5971 [46:41<14:04,  1.64it/s, loss=0.217, v_num=0, train/loss_simple_step=0.705, train/loss_vlb_step=0.0137, train/loss_step=0.705, global_step=3307.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4588/5971 [46:41<14:04,  1.64it/s, loss=0.21, v_num=0, train/loss_simple_step=0.00654, train/loss_vlb_step=3.01e-5, train/loss_step=0.00654, global_step=3307.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4589/5971 [46:42<14:03,  1.64it/s, loss=0.21, v_num=0, train/loss_simple_step=0.00654, train/loss_vlb_step=3.01e-5, train/loss_step=0.00654, global_step=3307.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4589/5971 [46:42<14:03,  1.64it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0335, train/loss_vlb_step=0.000127, train/loss_step=0.0335, global_step=3308.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4590/5971 [46:43<14:03,  1.64it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0335, train/loss_vlb_step=0.000127, train/loss_step=0.0335, global_step=3308.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4590/5971 [46:43<14:03,  1.64it/s, loss=0.22, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000614, train/loss_step=0.180, global_step=3308.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  77%|███████▋  | 4591/5971 [46:44<14:02,  1.64it/s, loss=0.22, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000614, train/loss_step=0.180, global_step=3308.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4591/5971 [46:44<14:02,  1.64it/s, loss=0.219, v_num=0, train/loss_simple_step=0.00843, train/loss_vlb_step=3.87e-5, train/loss_step=0.00843, global_step=3308.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4592/5971 [46:46<14:02,  1.64it/s, loss=0.219, v_num=0, train/loss_simple_step=0.00843, train/loss_vlb_step=3.87e-5, train/loss_step=0.00843, global_step=3308.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4592/5971 [46:46<14:02,  1.64it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.35e-5, train/loss_step=0.0101, global_step=3308.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  77%|███████▋  | 4593/5971 [46:47<14:02,  1.64it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.35e-5, train/loss_step=0.0101, global_step=3308.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4593/5971 [46:47<14:02,  1.64it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00351, train/loss_vlb_step=1.89e-5, train/loss_step=0.00351, global_step=3309.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4594/5971 [46:48<14:01,  1.64it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00351, train/loss_vlb_step=1.89e-5, train/loss_step=0.00351, global_step=3309.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4594/5971 [46:48<14:01,  1.64it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00721, train/loss_vlb_step=3.46e-5, train/loss_step=0.00721, global_step=3309.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4595/5971 [46:49<14:01,  1.64it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00721, train/loss_vlb_step=3.46e-5, train/loss_step=0.00721, global_step=3309.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4595/5971 [46:49<14:01,  1.64it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00503, train/loss_vlb_step=2.56e-5, train/loss_step=0.00503, global_step=3309.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4596/5971 [46:51<14:01,  1.63it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00503, train/loss_vlb_step=2.56e-5, train/loss_step=0.00503, global_step=3309.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4596/5971 [46:51<14:01,  1.63it/s, loss=0.161, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000836, train/loss_step=0.216, global_step=3309.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  77%|███████▋  | 4597/5971 [46:52<14:00,  1.63it/s, loss=0.161, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000836, train/loss_step=0.216, global_step=3309.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4597/5971 [46:52<14:00,  1.63it/s, loss=0.191, v_num=0, train/loss_simple_step=0.619, train/loss_vlb_step=0.00875, train/loss_step=0.619, global_step=3310.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  77%|███████▋  | 4598/5971 [46:53<13:59,  1.63it/s, loss=0.191, v_num=0, train/loss_simple_step=0.619, train/loss_vlb_step=0.00875, train/loss_step=0.619, global_step=3310.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4598/5971 [46:53<13:59,  1.63it/s, loss=0.2, v_num=0, train/loss_simple_step=0.215, train/loss_vlb_step=0.000789, train/loss_step=0.215, global_step=3310.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  77%|███████▋  | 4599/5971 [46:54<13:59,  1.63it/s, loss=0.2, v_num=0, train/loss_simple_step=0.215, train/loss_vlb_step=0.000789, train/loss_step=0.215, global_step=3310.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4599/5971 [46:54<13:59,  1.63it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0407, train/loss_vlb_step=0.000145, train/loss_step=0.0407, global_step=3310.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4600/5971 [46:56<13:59,  1.63it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0407, train/loss_vlb_step=0.000145, train/loss_step=0.0407, global_step=3310.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4600/5971 [46:56<13:59,  1.63it/s, loss=0.19, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.000992, train/loss_step=0.229, global_step=3310.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  77%|███████▋  | 4601/5971 [46:57<13:58,  1.63it/s, loss=0.19, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.000992, train/loss_step=0.229, global_step=3310.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4601/5971 [46:57<13:58,  1.63it/s, loss=0.212, v_num=0, train/loss_simple_step=0.508, train/loss_vlb_step=0.00581, train/loss_step=0.508, global_step=3311.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4602/5971 [46:58<13:58,  1.63it/s, loss=0.212, v_num=0, train/loss_simple_step=0.508, train/loss_vlb_step=0.00581, train/loss_step=0.508, global_step=3311.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4602/5971 [46:58<13:58,  1.63it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0721, train/loss_vlb_step=0.000237, train/loss_step=0.0721, global_step=3311.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4603/5971 [46:59<13:57,  1.63it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0721, train/loss_vlb_step=0.000237, train/loss_step=0.0721, global_step=3311.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4603/5971 [46:59<13:57,  1.63it/s, loss=0.204, v_num=0, train/loss_simple_step=0.259, train/loss_vlb_step=0.000956, train/loss_step=0.259, global_step=3311.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  77%|███████▋  | 4604/5971 [47:01<13:57,  1.63it/s, loss=0.204, v_num=0, train/loss_simple_step=0.259, train/loss_vlb_step=0.000956, train/loss_step=0.259, global_step=3311.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4604/5971 [47:01<13:57,  1.63it/s, loss=0.227, v_num=0, train/loss_simple_step=0.751, train/loss_vlb_step=0.0176, train/loss_step=0.751, global_step=3311.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  77%|███████▋  | 4605/5971 [47:02<13:57,  1.63it/s, loss=0.227, v_num=0, train/loss_simple_step=0.751, train/loss_vlb_step=0.0176, train/loss_step=0.751, global_step=3311.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4605/5971 [47:02<13:57,  1.63it/s, loss=0.23, v_num=0, train/loss_simple_step=0.0742, train/loss_vlb_step=0.000246, train/loss_step=0.0742, global_step=3312.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4606/5971 [47:03<13:56,  1.63it/s, loss=0.23, v_num=0, train/loss_simple_step=0.0742, train/loss_vlb_step=0.000246, train/loss_step=0.0742, global_step=3312.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4606/5971 [47:03<13:56,  1.63it/s, loss=0.199, v_num=0, train/loss_simple_step=0.0371, train/loss_vlb_step=0.000135, train/loss_step=0.0371, global_step=3312.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4607/5971 [47:04<13:55,  1.63it/s, loss=0.199, v_num=0, train/loss_simple_step=0.0371, train/loss_vlb_step=0.000135, train/loss_step=0.0371, global_step=3312.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4607/5971 [47:04<13:55,  1.63it/s, loss=0.171, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000486, train/loss_step=0.147, global_step=3312.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  77%|███████▋  | 4608/5971 [47:06<13:55,  1.63it/s, loss=0.171, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000486, train/loss_step=0.147, global_step=3312.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4608/5971 [47:06<13:55,  1.63it/s, loss=0.199, v_num=0, train/loss_simple_step=0.559, train/loss_vlb_step=0.00619, train/loss_step=0.559, global_step=3312.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  77%|███████▋  | 4609/5971 [47:07<13:55,  1.63it/s, loss=0.199, v_num=0, train/loss_simple_step=0.559, train/loss_vlb_step=0.00619, train/loss_step=0.559, global_step=3312.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4609/5971 [47:07<13:55,  1.63it/s, loss=0.219, v_num=0, train/loss_simple_step=0.442, train/loss_vlb_step=0.00252, train/loss_step=0.442, global_step=3313.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4610/5971 [47:08<13:54,  1.63it/s, loss=0.219, v_num=0, train/loss_simple_step=0.442, train/loss_vlb_step=0.00252, train/loss_step=0.442, global_step=3313.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4610/5971 [47:08<13:54,  1.63it/s, loss=0.218, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000521, train/loss_step=0.150, global_step=3313.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4611/5971 [47:09<13:54,  1.63it/s, loss=0.218, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000521, train/loss_step=0.150, global_step=3313.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4611/5971 [47:09<13:54,  1.63it/s, loss=0.218, v_num=0, train/loss_simple_step=0.00885, train/loss_vlb_step=3.95e-5, train/loss_step=0.00885, global_step=3313.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4612/5971 [47:11<13:54,  1.63it/s, loss=0.218, v_num=0, train/loss_simple_step=0.00885, train/loss_vlb_step=3.95e-5, train/loss_step=0.00885, global_step=3313.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4612/5971 [47:11<13:54,  1.63it/s, loss=0.218, v_num=0, train/loss_simple_step=0.0187, train/loss_vlb_step=7.52e-5, train/loss_step=0.0187, global_step=3313.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  77%|███████▋  | 4613/5971 [47:12<13:53,  1.63it/s, loss=0.218, v_num=0, train/loss_simple_step=0.0187, train/loss_vlb_step=7.52e-5, train/loss_step=0.0187, global_step=3313.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4613/5971 [47:12<13:53,  1.63it/s, loss=0.238, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00282, train/loss_step=0.406, global_step=3314.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  77%|███████▋  | 4614/5971 [47:12<13:53,  1.63it/s, loss=0.238, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00282, train/loss_step=0.406, global_step=3314.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4614/5971 [47:12<13:53,  1.63it/s, loss=0.238, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.01e-5, train/loss_step=0.00174, global_step=3314.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4615/5971 [47:13<13:52,  1.63it/s, loss=0.238, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.01e-5, train/loss_step=0.00174, global_step=3314.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4615/5971 [47:13<13:52,  1.63it/s, loss=0.248, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000685, train/loss_step=0.196, global_step=3314.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  77%|███████▋  | 4616/5971 [47:15<13:52,  1.63it/s, loss=0.248, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000685, train/loss_step=0.196, global_step=3314.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4616/5971 [47:15<13:52,  1.63it/s, loss=0.237, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.16e-5, train/loss_step=0.0122, global_step=3314.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4617/5971 [47:16<13:51,  1.63it/s, loss=0.237, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.16e-5, train/loss_step=0.0122, global_step=3314.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4617/5971 [47:16<13:51,  1.63it/s, loss=0.214, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000514, train/loss_step=0.153, global_step=3315.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  77%|███████▋  | 4618/5971 [47:17<13:51,  1.63it/s, loss=0.214, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000514, train/loss_step=0.153, global_step=3315.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4618/5971 [47:17<13:51,  1.63it/s, loss=0.208, v_num=0, train/loss_simple_step=0.0955, train/loss_vlb_step=0.000316, train/loss_step=0.0955, global_step=3315.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4619/5971 [47:18<13:50,  1.63it/s, loss=0.208, v_num=0, train/loss_simple_step=0.0955, train/loss_vlb_step=0.000316, train/loss_step=0.0955, global_step=3315.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4619/5971 [47:18<13:50,  1.63it/s, loss=0.22, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00125, train/loss_step=0.282, global_step=3315.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  77%|███████▋  | 4620/5971 [47:20<13:50,  1.63it/s, loss=0.22, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00125, train/loss_step=0.282, global_step=3315.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4620/5971 [47:20<13:50,  1.63it/s, loss=0.223, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00142, train/loss_step=0.292, global_step=3315.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4621/5971 [47:21<13:50,  1.63it/s, loss=0.223, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00142, train/loss_step=0.292, global_step=3315.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4621/5971 [47:21<13:50,  1.63it/s, loss=0.198, v_num=0, train/loss_simple_step=0.00751, train/loss_vlb_step=3.62e-5, train/loss_step=0.00751, global_step=3316.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4622/5971 [47:22<13:49,  1.63it/s, loss=0.198, v_num=0, train/loss_simple_step=0.00751, train/loss_vlb_step=3.62e-5, train/loss_step=0.00751, global_step=3316.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4622/5971 [47:22<13:49,  1.63it/s, loss=0.203, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000559, train/loss_step=0.165, global_step=3316.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  77%|███████▋  | 4623/5971 [47:23<13:48,  1.63it/s, loss=0.203, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000559, train/loss_step=0.165, global_step=3316.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4623/5971 [47:23<13:48,  1.63it/s, loss=0.198, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000528, train/loss_step=0.152, global_step=3316.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4624/5971 [47:25<13:48,  1.63it/s, loss=0.198, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000528, train/loss_step=0.152, global_step=3316.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4624/5971 [47:25<13:48,  1.63it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00883, train/loss_vlb_step=4.25e-5, train/loss_step=0.00883, global_step=3316.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4625/5971 [47:26<13:48,  1.63it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00883, train/loss_vlb_step=4.25e-5, train/loss_step=0.00883, global_step=3316.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4625/5971 [47:26<13:48,  1.63it/s, loss=0.166, v_num=0, train/loss_simple_step=0.183, train/loss_vlb_step=0.00063, train/loss_step=0.183, global_step=3317.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  77%|███████▋  | 4626/5971 [47:27<13:47,  1.62it/s, loss=0.166, v_num=0, train/loss_simple_step=0.183, train/loss_vlb_step=0.00063, train/loss_step=0.183, global_step=3317.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4626/5971 [47:27<13:47,  1.62it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00294, train/loss_vlb_step=1.65e-5, train/loss_step=0.00294, global_step=3317.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4627/5971 [47:28<13:47,  1.62it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00294, train/loss_vlb_step=1.65e-5, train/loss_step=0.00294, global_step=3317.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  77%|███████▋  | 4627/5971 [47:28<13:47,  1.62it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0544, train/loss_vlb_step=0.000181, train/loss_step=0.0544, global_step=3317.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  78%|███████▊  | 4628/5971 [47:30<13:47,  1.62it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0544, train/loss_vlb_step=0.000181, train/loss_step=0.0544, global_step=3317.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4628/5971 [47:30<13:47,  1.62it/s, loss=0.14, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000562, train/loss_step=0.165, global_step=3317.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  78%|███████▊  | 4629/5971 [47:31<13:46,  1.62it/s, loss=0.14, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000562, train/loss_step=0.165, global_step=3317.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4629/5971 [47:31<13:46,  1.62it/s, loss=0.126, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000546, train/loss_step=0.160, global_step=3318.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4630/5971 [47:32<13:45,  1.62it/s, loss=0.126, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000546, train/loss_step=0.160, global_step=3318.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4630/5971 [47:32<13:45,  1.62it/s, loss=0.145, v_num=0, train/loss_simple_step=0.530, train/loss_vlb_step=0.00543, train/loss_step=0.530, global_step=3318.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  78%|███████▊  | 4631/5971 [47:33<13:45,  1.62it/s, loss=0.145, v_num=0, train/loss_simple_step=0.530, train/loss_vlb_step=0.00543, train/loss_step=0.530, global_step=3318.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4631/5971 [47:33<13:45,  1.62it/s, loss=0.171, v_num=0, train/loss_simple_step=0.525, train/loss_vlb_step=0.00477, train/loss_step=0.525, global_step=3318.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4632/5971 [47:35<13:45,  1.62it/s, loss=0.171, v_num=0, train/loss_simple_step=0.525, train/loss_vlb_step=0.00477, train/loss_step=0.525, global_step=3318.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4632/5971 [47:35<13:45,  1.62it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00517, train/loss_vlb_step=2.53e-5, train/loss_step=0.00517, global_step=3318.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4633/5971 [47:36<13:44,  1.62it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00517, train/loss_vlb_step=2.53e-5, train/loss_step=0.00517, global_step=3318.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4633/5971 [47:36<13:44,  1.62it/s, loss=0.164, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00131, train/loss_step=0.292, global_step=3319.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  78%|███████▊  | 4634/5971 [47:37<13:44,  1.62it/s, loss=0.164, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00131, train/loss_step=0.292, global_step=3319.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4634/5971 [47:37<13:44,  1.62it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00311, train/loss_vlb_step=1.72e-5, train/loss_step=0.00311, global_step=3319.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4635/5971 [47:38<13:43,  1.62it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00311, train/loss_vlb_step=1.72e-5, train/loss_step=0.00311, global_step=3319.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4635/5971 [47:38<13:43,  1.62it/s, loss=0.2, v_num=0, train/loss_simple_step=0.906, train/loss_vlb_step=0.456, train/loss_step=0.906, global_step=3319.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]        
Epoch 5:  78%|███████▊  | 4636/5971 [47:40<13:43,  1.62it/s, loss=0.2, v_num=0, train/loss_simple_step=0.906, train/loss_vlb_step=0.456, train/loss_step=0.906, global_step=3319.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4636/5971 [47:40<13:43,  1.62it/s, loss=0.207, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000537, train/loss_step=0.157, global_step=3319.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4637/5971 [47:41<13:42,  1.62it/s, loss=0.207, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000537, train/loss_step=0.157, global_step=3319.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4637/5971 [47:41<13:42,  1.62it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0324, train/loss_vlb_step=0.000121, train/loss_step=0.0324, global_step=3320.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4638/5971 [47:42<13:42,  1.62it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0324, train/loss_vlb_step=0.000121, train/loss_step=0.0324, global_step=3320.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4638/5971 [47:42<13:42,  1.62it/s, loss=0.206, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000712, train/loss_step=0.204, global_step=3320.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  78%|███████▊  | 4639/5971 [47:42<13:41,  1.62it/s, loss=0.206, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000712, train/loss_step=0.204, global_step=3320.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4639/5971 [47:42<13:41,  1.62it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00201, train/loss_vlb_step=1.14e-5, train/loss_step=0.00201, global_step=3320.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4640/5971 [47:45<13:41,  1.62it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00201, train/loss_vlb_step=1.14e-5, train/loss_step=0.00201, global_step=3320.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4640/5971 [47:45<13:41,  1.62it/s, loss=0.191, v_num=0, train/loss_simple_step=0.270, train/loss_vlb_step=0.00121, train/loss_step=0.270, global_step=3320.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  78%|███████▊  | 4641/5971 [47:46<13:41,  1.62it/s, loss=0.191, v_num=0, train/loss_simple_step=0.270, train/loss_vlb_step=0.00121, train/loss_step=0.270, global_step=3320.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4641/5971 [47:46<13:41,  1.62it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00181, train/loss_vlb_step=1.09e-5, train/loss_step=0.00181, global_step=3321.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4642/5971 [47:47<13:40,  1.62it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00181, train/loss_vlb_step=1.09e-5, train/loss_step=0.00181, global_step=3321.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4642/5971 [47:47<13:40,  1.62it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0457, train/loss_vlb_step=0.000165, train/loss_step=0.0457, global_step=3321.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  78%|███████▊  | 4643/5971 [47:47<13:40,  1.62it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0457, train/loss_vlb_step=0.000165, train/loss_step=0.0457, global_step=3321.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4643/5971 [47:47<13:40,  1.62it/s, loss=0.183, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000366, train/loss_step=0.111, global_step=3321.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  78%|███████▊  | 4644/5971 [47:50<13:40,  1.62it/s, loss=0.183, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000366, train/loss_step=0.111, global_step=3321.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4644/5971 [47:50<13:40,  1.62it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.72e-5, train/loss_step=0.00303, global_step=3321.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4645/5971 [47:51<13:39,  1.62it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.72e-5, train/loss_step=0.00303, global_step=3321.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4645/5971 [47:51<13:39,  1.62it/s, loss=0.174, v_num=0, train/loss_simple_step=0.00191, train/loss_vlb_step=1.15e-5, train/loss_step=0.00191, global_step=3322.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4646/5971 [47:52<13:38,  1.62it/s, loss=0.174, v_num=0, train/loss_simple_step=0.00191, train/loss_vlb_step=1.15e-5, train/loss_step=0.00191, global_step=3322.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4646/5971 [47:52<13:38,  1.62it/s, loss=0.185, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000937, train/loss_step=0.228, global_step=3322.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  78%|███████▊  | 4647/5971 [47:53<13:38,  1.62it/s, loss=0.185, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000937, train/loss_step=0.228, global_step=3322.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4647/5971 [47:53<13:38,  1.62it/s, loss=0.193, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000817, train/loss_step=0.224, global_step=3322.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4648/5971 [47:55<13:38,  1.62it/s, loss=0.193, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000817, train/loss_step=0.224, global_step=3322.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4648/5971 [47:55<13:38,  1.62it/s, loss=0.205, v_num=0, train/loss_simple_step=0.394, train/loss_vlb_step=0.00185, train/loss_step=0.394, global_step=3322.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  78%|███████▊  | 4649/5971 [47:56<13:37,  1.62it/s, loss=0.205, v_num=0, train/loss_simple_step=0.394, train/loss_vlb_step=0.00185, train/loss_step=0.394, global_step=3322.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4649/5971 [47:56<13:37,  1.62it/s, loss=0.229, v_num=0, train/loss_simple_step=0.639, train/loss_vlb_step=0.00619, train/loss_step=0.639, global_step=3323.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4650/5971 [47:57<13:37,  1.62it/s, loss=0.229, v_num=0, train/loss_simple_step=0.639, train/loss_vlb_step=0.00619, train/loss_step=0.639, global_step=3323.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4650/5971 [47:57<13:37,  1.62it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0443, train/loss_vlb_step=0.000163, train/loss_step=0.0443, global_step=3323.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4651/5971 [47:58<13:36,  1.62it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0443, train/loss_vlb_step=0.000163, train/loss_step=0.0443, global_step=3323.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4651/5971 [47:58<13:36,  1.62it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0256, train/loss_vlb_step=0.000104, train/loss_step=0.0256, global_step=3323.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4652/5971 [48:00<13:36,  1.62it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0256, train/loss_vlb_step=0.000104, train/loss_step=0.0256, global_step=3323.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4652/5971 [48:00<13:36,  1.62it/s, loss=0.18, v_num=0, train/loss_simple_step=0.00744, train/loss_vlb_step=3.3e-5, train/loss_step=0.00744, global_step=3323.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  78%|███████▊  | 4653/5971 [48:01<13:35,  1.62it/s, loss=0.18, v_num=0, train/loss_simple_step=0.00744, train/loss_vlb_step=3.3e-5, train/loss_step=0.00744, global_step=3323.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4653/5971 [48:01<13:35,  1.62it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00314, train/loss_vlb_step=1.75e-5, train/loss_step=0.00314, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4654/5971 [48:01<13:35,  1.62it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00314, train/loss_vlb_step=1.75e-5, train/loss_step=0.00314, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4654/5971 [48:01<13:35,  1.62it/s, loss=0.171, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.00041, train/loss_step=0.125, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  78%|███████▊  | 4655/5971 [48:02<13:34,  1.62it/s, loss=0.171, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.00041, train/loss_step=0.125, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4655/5971 [48:02<13:34,  1.62it/s, loss=0.138, v_num=0, train/loss_simple_step=0.241, train/loss_vlb_step=0.000946, train/loss_step=0.241, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4656/5971 [48:04<13:34,  1.61it/s, loss=0.138, v_num=0, train/loss_simple_step=0.241, train/loss_vlb_step=0.000946, train/loss_step=0.241, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  78%|███████▊  | 4656/5971 [48:04<13:34,  1.61it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:13,  2.25it/s]
Epoch 5:  78%|███████▊  | 4658/5971 [48:05<13:33,  1.61it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   1%|          | 2/167 [00:00<00:43,  3.76it/s]
Epoch 5:  78%|███████▊  | 4660/5971 [48:05<13:31,  1.62it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   3%|▎         | 5/167 [00:00<00:16,  9.56it/s]
Epoch 5:  78%|███████▊  | 4663/5971 [48:05<13:29,  1.62it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.41it/s]
Epoch 5:  78%|███████▊  | 4666/5971 [48:05<13:26,  1.62it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.53it/s]
Epoch 5:  78%|███████▊  | 4669/5971 [48:05<13:24,  1.62it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   8%|▊         | 14/167 [00:01<00:08, 18.53it/s]
Epoch 5:  78%|███████▊  | 4672/5971 [48:06<13:22,  1.62it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  10%|█         | 17/167 [00:01<00:07, 20.36it/s]
Epoch 5:  78%|███████▊  | 4675/5971 [48:06<13:19,  1.62it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 21.93it/s]
Epoch 5:  78%|███████▊  | 4678/5971 [48:06<13:17,  1.62it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 22.71it/s]
Epoch 5:  78%|███████▊  | 4681/5971 [48:06<13:15,  1.62it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  16%|█▌        | 26/167 [00:01<00:06, 23.40it/s]
Epoch 5:  78%|███████▊  | 4684/5971 [48:06<13:12,  1.62it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 24.27it/s]
Epoch 5:  78%|███████▊  | 4687/5971 [48:06<13:10,  1.62it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 22.94it/s]
Epoch 5:  79%|███████▊  | 4690/5971 [48:06<13:08,  1.62it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  21%|██        | 35/167 [00:01<00:05, 24.28it/s]
Epoch 5:  79%|███████▊  | 4693/5971 [48:06<13:06,  1.63it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  23%|██▎       | 38/167 [00:02<00:05, 23.88it/s]
Epoch 5:  79%|███████▊  | 4696/5971 [48:07<13:03,  1.63it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  25%|██▍       | 41/167 [00:02<00:05, 24.46it/s]
Epoch 5:  79%|███████▊  | 4699/5971 [48:07<13:01,  1.63it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 25.69it/s]
Epoch 5:  79%|███████▊  | 4702/5971 [48:07<12:59,  1.63it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 26.26it/s]
Epoch 5:  79%|███████▉  | 4705/5971 [48:07<12:56,  1.63it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 25.80it/s]
Epoch 5:  79%|███████▉  | 4708/5971 [48:07<12:54,  1.63it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 25.80it/s]
Epoch 5:  79%|███████▉  | 4711/5971 [48:07<12:52,  1.63it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 25.93it/s]
Epoch 5:  79%|███████▉  | 4714/5971 [48:07<12:49,  1.63it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  35%|███▌      | 59/167 [00:02<00:04, 26.65it/s][A
Epoch 5:  79%|███████▉  | 4717/5971 [48:07<12:47,  1.63it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  37%|███▋      | 62/167 [00:02<00:04, 26.03it/s][A
Epoch 5:  79%|███████▉  | 4720/5971 [48:07<12:45,  1.63it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  39%|███▉      | 65/167 [00:03<00:03, 26.98it/s][A
Epoch 5:  79%|███████▉  | 4724/5971 [48:08<12:42,  1.64it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  41%|████      | 68/167 [00:03<00:03, 27.12it/s][A

Validating:  43%|████▎     | 71/167 [00:03<00:03, 25.62it/s][A
Epoch 5:  79%|███████▉  | 4728/5971 [48:08<12:39,  1.64it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 26.61it/s][A
Epoch 5:  79%|███████▉  | 4732/5971 [48:08<12:36,  1.64it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 25.97it/s][A
Epoch 5:  79%|███████▉  | 4736/5971 [48:08<12:33,  1.64it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 25.66it/s][A

Validating:  50%|████▉     | 83/167 [00:03<00:03, 26.26it/s][A
Epoch 5:  79%|███████▉  | 4740/5971 [48:08<12:30,  1.64it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  51%|█████▏    | 86/167 [00:03<00:03, 26.70it/s][A
Epoch 5:  79%|███████▉  | 4744/5971 [48:08<12:27,  1.64it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  53%|█████▎    | 89/167 [00:03<00:02, 27.05it/s][A
Epoch 5:  80%|███████▉  | 4748/5971 [48:09<12:24,  1.64it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 28.04it/s][A
Epoch 5:  80%|███████▉  | 4752/5971 [48:09<12:20,  1.65it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 28.26it/s][A

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 27.36it/s][A
Epoch 5:  80%|███████▉  | 4756/5971 [48:09<12:17,  1.65it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  61%|██████    | 102/167 [00:04<00:02, 27.94it/s][A
Epoch 5:  80%|███████▉  | 4760/5971 [48:09<12:14,  1.65it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 27.20it/s][A
Epoch 5:  80%|███████▉  | 4764/5971 [48:09<12:11,  1.65it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 26.71it/s][A

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 26.80it/s][A
Epoch 5:  80%|███████▉  | 4768/5971 [48:09<12:08,  1.65it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  68%|██████▊   | 114/167 [00:04<00:02, 26.45it/s][A
Epoch 5:  80%|███████▉  | 4772/5971 [48:09<12:05,  1.65it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  70%|███████   | 117/167 [00:05<00:01, 26.42it/s][A
Epoch 5:  80%|███████▉  | 4776/5971 [48:10<12:02,  1.65it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 28.09it/s][A
Epoch 5:  80%|████████  | 4780/5971 [48:10<11:59,  1.65it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 28.03it/s][A
Epoch 5:  80%|████████  | 4784/5971 [48:10<11:57,  1.66it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 27.46it/s][A

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 27.44it/s][A
Epoch 5:  80%|████████  | 4788/5971 [48:10<11:54,  1.66it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  80%|████████  | 134/167 [00:05<00:01, 27.93it/s][A
Epoch 5:  80%|████████  | 4792/5971 [48:10<11:51,  1.66it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 26.38it/s][A
Epoch 5:  80%|████████  | 4796/5971 [48:10<11:48,  1.66it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  84%|████████▍ | 140/167 [00:05<00:01, 26.90it/s][A

Validating:  86%|████████▌ | 143/167 [00:05<00:00, 26.38it/s][A
Epoch 5:  80%|████████  | 4800/5971 [48:10<11:45,  1.66it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 26.41it/s][A
Epoch 5:  80%|████████  | 4804/5971 [48:11<11:42,  1.66it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 26.32it/s][A
Epoch 5:  81%|████████  | 4808/5971 [48:11<11:39,  1.66it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 27.16it/s][A

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 27.26it/s][A
Epoch 5:  81%|████████  | 4812/5971 [48:11<11:36,  1.66it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 26.46it/s][A
Epoch 5:  81%|████████  | 4816/5971 [48:11<11:33,  1.67it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 27.71it/s][A
Epoch 5:  81%|████████  | 4820/5971 [48:11<11:30,  1.67it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  99%|█████████▉| 165/167 [00:06<00:00, 25.94it/s][A
Epoch 5:  81%|████████  | 4824/5971 [48:11<11:27,  1.67it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████  | 4824/5971 [48:12<11:27,  1.67it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00141, train/loss_step=0.308, global_step=3324.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Epoch 5:  81%|████████  | 4825/5971 [48:13<11:27,  1.67it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00524, train/loss_vlb_step=2.76e-5, train/loss_step=0.00524, global_step=3325.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████  | 4826/5971 [48:14<11:26,  1.67it/s, loss=0.171, v_num=0, train/loss_simple_step=0.734, train/loss_vlb_step=0.0159, train/loss_step=0.734, global_step=3325.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:  81%|████████  | 4827/5971 [48:15<11:26,  1.67it/s, loss=0.205, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0175, train/loss_step=0.687, global_step=3325.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████  | 4828/5971 [48:18<11:25,  1.67it/s, loss=0.205, v_num=0, train/loss_simple_step=0.687, train/loss_vlb_step=0.0175, train/loss_step=0.687, global_step=3325.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████  | 4828/5971 [48:18<11:25,  1.67it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00473, train/loss_vlb_step=2.47e-5, train/loss_step=0.00473, global_step=3325.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████  | 4829/5971 [48:19<11:25,  1.67it/s, loss=0.203, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.000935, train/loss_step=0.229, global_step=3326.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  81%|████████  | 4830/5971 [48:20<11:24,  1.67it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0894, train/loss_vlb_step=0.000295, train/loss_step=0.0894, global_step=3326.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████  | 4831/5971 [48:21<11:24,  1.67it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=4.83e-5, train/loss_step=0.0115, global_step=3326.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  81%|████████  | 4832/5971 [48:23<11:24,  1.66it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=4.83e-5, train/loss_step=0.0115, global_step=3326.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████  | 4832/5971 [48:23<11:24,  1.66it/s, loss=0.203, v_num=0, train/loss_simple_step=0.0613, train/loss_vlb_step=0.000203, train/loss_step=0.0613, global_step=3326.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████  | 4833/5971 [48:24<11:23,  1.66it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0165, train/loss_vlb_step=7.05e-5, train/loss_step=0.0165, global_step=3327.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  81%|████████  | 4834/5971 [48:24<11:23,  1.66it/s, loss=0.193, v_num=0, train/loss_simple_step=0.00247, train/loss_vlb_step=1.37e-5, train/loss_step=0.00247, global_step=3327.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████  | 4835/5971 [48:25<11:22,  1.66it/s, loss=0.182, v_num=0, train/loss_simple_step=0.00942, train/loss_vlb_step=4.42e-5, train/loss_step=0.00942, global_step=3327.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████  | 4836/5971 [48:28<11:22,  1.66it/s, loss=0.182, v_num=0, train/loss_simple_step=0.00942, train/loss_vlb_step=4.42e-5, train/loss_step=0.00942, global_step=3327.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████  | 4836/5971 [48:28<11:22,  1.66it/s, loss=0.181, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00259, train/loss_step=0.382, global_step=3327.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  81%|████████  | 4837/5971 [48:29<11:21,  1.66it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00958, train/loss_vlb_step=4.34e-5, train/loss_step=0.00958, global_step=3328.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████  | 4838/5971 [48:29<11:21,  1.66it/s, loss=0.155, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000521, train/loss_step=0.155, global_step=3328.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  81%|████████  | 4839/5971 [48:30<11:20,  1.66it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0977, train/loss_vlb_step=0.000329, train/loss_step=0.0977, global_step=3328.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████  | 4840/5971 [48:32<11:20,  1.66it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0977, train/loss_vlb_step=0.000329, train/loss_step=0.0977, global_step=3328.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████  | 4840/5971 [48:32<11:20,  1.66it/s, loss=0.159, v_num=0, train/loss_simple_step=0.005, train/loss_vlb_step=2.59e-5, train/loss_step=0.005, global_step=3328.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  81%|████████  | 4841/5971 [48:33<11:20,  1.66it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0041, train/loss_vlb_step=2.12e-5, train/loss_step=0.0041, global_step=3329.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████  | 4842/5971 [48:34<11:19,  1.66it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0197, train/loss_vlb_step=7.69e-5, train/loss_step=0.0197, global_step=3329.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████  | 4843/5971 [48:35<11:18,  1.66it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00851, train/loss_vlb_step=3.73e-5, train/loss_step=0.00851, global_step=3329.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████  | 4844/5971 [48:37<11:18,  1.66it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00851, train/loss_vlb_step=3.73e-5, train/loss_step=0.00851, global_step=3329.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████  | 4844/5971 [48:37<11:18,  1.66it/s, loss=0.135, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000573, train/loss_step=0.165, global_step=3329.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  81%|████████  | 4845/5971 [48:38<11:18,  1.66it/s, loss=0.138, v_num=0, train/loss_simple_step=0.069, train/loss_vlb_step=0.000233, train/loss_step=0.069, global_step=3330.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████  | 4846/5971 [48:39<11:17,  1.66it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00667, train/loss_vlb_step=3.2e-5, train/loss_step=0.00667, global_step=3330.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████  | 4847/5971 [48:40<11:17,  1.66it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.0346, train/loss_vlb_step=0.000128, train/loss_step=0.0346, global_step=3330.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████  | 4848/5971 [48:42<11:16,  1.66it/s, loss=0.0691, v_num=0, train/loss_simple_step=0.0346, train/loss_vlb_step=0.000128, train/loss_step=0.0346, global_step=3330.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████  | 4848/5971 [48:42<11:16,  1.66it/s, loss=0.0861, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00173, train/loss_step=0.345, global_step=3330.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  81%|████████  | 4849/5971 [48:43<11:16,  1.66it/s, loss=0.0807, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000406, train/loss_step=0.121, global_step=3331.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████  | 4850/5971 [48:44<11:15,  1.66it/s, loss=0.0802, v_num=0, train/loss_simple_step=0.0807, train/loss_vlb_step=0.000273, train/loss_step=0.0807, global_step=3331.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████  | 4851/5971 [48:45<11:15,  1.66it/s, loss=0.104, v_num=0, train/loss_simple_step=0.489, train/loss_vlb_step=0.00286, train/loss_step=0.489, global_step=3331.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  81%|████████▏ | 4852/5971 [48:47<11:15,  1.66it/s, loss=0.104, v_num=0, train/loss_simple_step=0.489, train/loss_vlb_step=0.00286, train/loss_step=0.489, global_step=3331.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████▏ | 4852/5971 [48:47<11:15,  1.66it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00341, train/loss_vlb_step=1.91e-5, train/loss_step=0.00341, global_step=3331.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████▏ | 4853/5971 [48:48<11:14,  1.66it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00346, train/loss_vlb_step=1.76e-5, train/loss_step=0.00346, global_step=3332.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████▏ | 4854/5971 [48:49<11:13,  1.66it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0249, train/loss_vlb_step=9.97e-5, train/loss_step=0.0249, global_step=3332.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  81%|████████▏ | 4855/5971 [48:50<11:13,  1.66it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0369, train/loss_vlb_step=0.000138, train/loss_step=0.0369, global_step=3332.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████▏ | 4856/5971 [48:52<11:13,  1.66it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0369, train/loss_vlb_step=0.000138, train/loss_step=0.0369, global_step=3332.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████▏ | 4856/5971 [48:52<11:13,  1.66it/s, loss=0.0842, v_num=0, train/loss_simple_step=0.00426, train/loss_vlb_step=2.17e-5, train/loss_step=0.00426, global_step=3332.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████▏ | 4857/5971 [48:53<11:12,  1.66it/s, loss=0.0843, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.53e-5, train/loss_step=0.0125, global_step=3333.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  81%|████████▏ | 4858/5971 [48:54<11:12,  1.66it/s, loss=0.0838, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000498, train/loss_step=0.145, global_step=3333.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  81%|████████▏ | 4859/5971 [48:55<11:11,  1.66it/s, loss=0.079, v_num=0, train/loss_simple_step=0.00229, train/loss_vlb_step=1.29e-5, train/loss_step=0.00229, global_step=3333.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████▏ | 4860/5971 [48:57<11:11,  1.66it/s, loss=0.079, v_num=0, train/loss_simple_step=0.00229, train/loss_vlb_step=1.29e-5, train/loss_step=0.00229, global_step=3333.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████▏ | 4860/5971 [48:57<11:11,  1.66it/s, loss=0.0793, v_num=0, train/loss_simple_step=0.00975, train/loss_vlb_step=4.3e-5, train/loss_step=0.00975, global_step=3333.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████▏ | 4861/5971 [48:58<11:10,  1.65it/s, loss=0.0793, v_num=0, train/loss_simple_step=0.00364, train/loss_vlb_step=1.93e-5, train/loss_step=0.00364, global_step=3334.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████▏ | 4862/5971 [48:58<11:10,  1.65it/s, loss=0.0811, v_num=0, train/loss_simple_step=0.0567, train/loss_vlb_step=0.000199, train/loss_step=0.0567, global_step=3334.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  81%|████████▏ | 4863/5971 [48:59<11:09,  1.65it/s, loss=0.117, v_num=0, train/loss_simple_step=0.736, train/loss_vlb_step=0.0381, train/loss_step=0.736, global_step=3334.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:  81%|████████▏ | 4864/5971 [49:02<11:09,  1.65it/s, loss=0.117, v_num=0, train/loss_simple_step=0.736, train/loss_vlb_step=0.0381, train/loss_step=0.736, global_step=3334.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████▏ | 4864/5971 [49:02<11:09,  1.65it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0827, train/loss_vlb_step=0.000272, train/loss_step=0.0827, global_step=3334.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  81%|████████▏ | 4865/5971 [49:03<11:08,  1.65it/s, loss=0.122, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.00109, train/loss_step=0.237, global_step=3335.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  81%|████████▏ | 4866/5971 [49:03<11:08,  1.65it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0208, train/loss_vlb_step=8.82e-5, train/loss_step=0.0208, global_step=3335.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4867/5971 [49:04<11:07,  1.65it/s, loss=0.122, v_num=0, train/loss_simple_step=0.023, train/loss_vlb_step=9.12e-5, train/loss_step=0.023, global_step=3335.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  82%|████████▏ | 4868/5971 [49:06<11:07,  1.65it/s, loss=0.122, v_num=0, train/loss_simple_step=0.023, train/loss_vlb_step=9.12e-5, train/loss_step=0.023, global_step=3335.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4868/5971 [49:06<11:07,  1.65it/s, loss=0.106, v_num=0, train/loss_simple_step=0.022, train/loss_vlb_step=8.86e-5, train/loss_step=0.022, global_step=3335.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4869/5971 [49:07<11:07,  1.65it/s, loss=0.122, v_num=0, train/loss_simple_step=0.451, train/loss_vlb_step=0.0061, train/loss_step=0.451, global_step=3336.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  82%|████████▏ | 4870/5971 [49:08<11:06,  1.65it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0548, train/loss_vlb_step=0.000194, train/loss_step=0.0548, global_step=3336.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4871/5971 [49:09<11:05,  1.65it/s, loss=0.0969, v_num=0, train/loss_simple_step=0.00791, train/loss_vlb_step=3.79e-5, train/loss_step=0.00791, global_step=3336.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4872/5971 [49:11<11:05,  1.65it/s, loss=0.0969, v_num=0, train/loss_simple_step=0.00791, train/loss_vlb_step=3.79e-5, train/loss_step=0.00791, global_step=3336.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4872/5971 [49:11<11:05,  1.65it/s, loss=0.0971, v_num=0, train/loss_simple_step=0.00789, train/loss_vlb_step=3.88e-5, train/loss_step=0.00789, global_step=3336.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4873/5971 [49:12<11:05,  1.65it/s, loss=0.0971, v_num=0, train/loss_simple_step=0.00446, train/loss_vlb_step=2.24e-5, train/loss_step=0.00446, global_step=3337.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4874/5971 [49:13<11:04,  1.65it/s, loss=0.119, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00225, train/loss_step=0.452, global_step=3337.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:  82%|████████▏ | 4875/5971 [49:14<11:04,  1.65it/s, loss=0.117, v_num=0, train/loss_simple_step=0.010, train/loss_vlb_step=4.45e-5, train/loss_step=0.010, global_step=3337.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4876/5971 [49:16<11:03,  1.65it/s, loss=0.117, v_num=0, train/loss_simple_step=0.010, train/loss_vlb_step=4.45e-5, train/loss_step=0.010, global_step=3337.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4876/5971 [49:16<11:03,  1.65it/s, loss=0.125, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000546, train/loss_step=0.162, global_step=3337.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4877/5971 [49:17<11:03,  1.65it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0743, train/loss_vlb_step=0.000255, train/loss_step=0.0743, global_step=3338.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4878/5971 [49:18<11:02,  1.65it/s, loss=0.128, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000458, train/loss_step=0.139, global_step=3338.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  82%|████████▏ | 4879/5971 [49:19<11:02,  1.65it/s, loss=0.144, v_num=0, train/loss_simple_step=0.323, train/loss_vlb_step=0.00159, train/loss_step=0.323, global_step=3338.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  82%|████████▏ | 4880/5971 [49:21<11:01,  1.65it/s, loss=0.144, v_num=0, train/loss_simple_step=0.323, train/loss_vlb_step=0.00159, train/loss_step=0.323, global_step=3338.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4880/5971 [49:21<11:01,  1.65it/s, loss=0.155, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.00107, train/loss_step=0.233, global_step=3338.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4881/5971 [49:22<11:01,  1.65it/s, loss=0.163, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000517, train/loss_step=0.157, global_step=3339.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4882/5971 [49:23<11:00,  1.65it/s, loss=0.173, v_num=0, train/loss_simple_step=0.259, train/loss_vlb_step=0.00105, train/loss_step=0.259, global_step=3339.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  82%|████████▏ | 4883/5971 [49:24<11:00,  1.65it/s, loss=0.137, v_num=0, train/loss_simple_step=0.012, train/loss_vlb_step=5.35e-5, train/loss_step=0.012, global_step=3339.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4884/5971 [49:26<11:00,  1.65it/s, loss=0.137, v_num=0, train/loss_simple_step=0.012, train/loss_vlb_step=5.35e-5, train/loss_step=0.012, global_step=3339.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4884/5971 [49:26<11:00,  1.65it/s, loss=0.15, v_num=0, train/loss_simple_step=0.359, train/loss_vlb_step=0.00206, train/loss_step=0.359, global_step=3339.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  82%|████████▏ | 4885/5971 [49:27<10:59,  1.65it/s, loss=0.139, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=7.07e-5, train/loss_step=0.016, global_step=3340.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4886/5971 [49:28<10:59,  1.65it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0652, train/loss_vlb_step=0.000223, train/loss_step=0.0652, global_step=3340.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4887/5971 [49:29<10:58,  1.65it/s, loss=0.15, v_num=0, train/loss_simple_step=0.183, train/loss_vlb_step=0.000652, train/loss_step=0.183, global_step=3340.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  82%|████████▏ | 4888/5971 [49:31<10:58,  1.65it/s, loss=0.15, v_num=0, train/loss_simple_step=0.183, train/loss_vlb_step=0.000652, train/loss_step=0.183, global_step=3340.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4888/5971 [49:31<10:58,  1.65it/s, loss=0.158, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000691, train/loss_step=0.189, global_step=3340.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4889/5971 [49:32<10:57,  1.65it/s, loss=0.144, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000559, train/loss_step=0.165, global_step=3341.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4890/5971 [49:33<10:57,  1.65it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0491, train/loss_vlb_step=0.000181, train/loss_step=0.0491, global_step=3341.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4891/5971 [49:34<10:56,  1.64it/s, loss=0.163, v_num=0, train/loss_simple_step=0.393, train/loss_vlb_step=0.00251, train/loss_step=0.393, global_step=3341.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  82%|████████▏ | 4892/5971 [49:36<10:56,  1.64it/s, loss=0.163, v_num=0, train/loss_simple_step=0.393, train/loss_vlb_step=0.00251, train/loss_step=0.393, global_step=3341.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4892/5971 [49:36<10:56,  1.64it/s, loss=0.171, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000622, train/loss_step=0.179, global_step=3341.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4893/5971 [49:37<10:55,  1.64it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0989, train/loss_vlb_step=0.000325, train/loss_step=0.0989, global_step=3342.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4894/5971 [49:38<10:55,  1.64it/s, loss=0.179, v_num=0, train/loss_simple_step=0.505, train/loss_vlb_step=0.00406, train/loss_step=0.505, global_step=3342.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  82%|████████▏ | 4895/5971 [49:39<10:54,  1.64it/s, loss=0.189, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000786, train/loss_step=0.216, global_step=3342.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4896/5971 [49:41<10:54,  1.64it/s, loss=0.189, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000786, train/loss_step=0.216, global_step=3342.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4896/5971 [49:41<10:54,  1.64it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0133, train/loss_vlb_step=5.89e-5, train/loss_step=0.0133, global_step=3342.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4897/5971 [49:42<10:53,  1.64it/s, loss=0.192, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00112, train/loss_step=0.282, global_step=3343.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  82%|████████▏ | 4898/5971 [49:43<10:53,  1.64it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0201, train/loss_vlb_step=8.21e-5, train/loss_step=0.0201, global_step=3343.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4899/5971 [49:44<10:52,  1.64it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00219, train/loss_vlb_step=1.26e-5, train/loss_step=0.00219, global_step=3343.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4900/5971 [49:46<10:52,  1.64it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00219, train/loss_vlb_step=1.26e-5, train/loss_step=0.00219, global_step=3343.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4900/5971 [49:46<10:52,  1.64it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0731, train/loss_vlb_step=0.000242, train/loss_step=0.0731, global_step=3343.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4901/5971 [49:47<10:52,  1.64it/s, loss=0.159, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000352, train/loss_step=0.107, global_step=3344.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  82%|████████▏ | 4902/5971 [49:47<10:51,  1.64it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00244, train/loss_vlb_step=1.45e-5, train/loss_step=0.00244, global_step=3344.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4903/5971 [49:48<10:50,  1.64it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0926, train/loss_vlb_step=0.000304, train/loss_step=0.0926, global_step=3344.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  82%|████████▏ | 4904/5971 [49:50<10:50,  1.64it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0926, train/loss_vlb_step=0.000304, train/loss_step=0.0926, global_step=3344.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4904/5971 [49:50<10:50,  1.64it/s, loss=0.159, v_num=0, train/loss_simple_step=0.529, train/loss_vlb_step=0.00444, train/loss_step=0.529, global_step=3344.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  82%|████████▏ | 4905/5971 [49:51<10:50,  1.64it/s, loss=0.159, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.96e-5, train/loss_step=0.016, global_step=3345.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4906/5971 [49:52<10:49,  1.64it/s, loss=0.158, v_num=0, train/loss_simple_step=0.048, train/loss_vlb_step=0.000166, train/loss_step=0.048, global_step=3345.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4907/5971 [49:53<10:48,  1.64it/s, loss=0.16, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000775, train/loss_step=0.214, global_step=3345.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  82%|████████▏ | 4908/5971 [49:55<10:48,  1.64it/s, loss=0.16, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000775, train/loss_step=0.214, global_step=3345.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4908/5971 [49:55<10:48,  1.64it/s, loss=0.175, v_num=0, train/loss_simple_step=0.492, train/loss_vlb_step=0.00417, train/loss_step=0.492, global_step=3345.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4909/5971 [49:56<10:48,  1.64it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0236, train/loss_vlb_step=9.51e-5, train/loss_step=0.0236, global_step=3346.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4910/5971 [49:57<10:47,  1.64it/s, loss=0.176, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000752, train/loss_step=0.204, global_step=3346.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  82%|████████▏ | 4911/5971 [49:58<10:47,  1.64it/s, loss=0.176, v_num=0, train/loss_simple_step=0.394, train/loss_vlb_step=0.0019, train/loss_step=0.394, global_step=3346.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  82%|████████▏ | 4912/5971 [50:00<10:46,  1.64it/s, loss=0.176, v_num=0, train/loss_simple_step=0.394, train/loss_vlb_step=0.0019, train/loss_step=0.394, global_step=3346.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4912/5971 [50:00<10:46,  1.64it/s, loss=0.207, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0465, train/loss_step=0.812, global_step=3346.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4913/5971 [50:01<10:46,  1.64it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0327, train/loss_vlb_step=0.000111, train/loss_step=0.0327, global_step=3347.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4914/5971 [50:02<10:45,  1.64it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.45e-5, train/loss_step=0.0149, global_step=3347.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  82%|████████▏ | 4915/5971 [50:03<10:45,  1.64it/s, loss=0.206, v_num=0, train/loss_simple_step=0.757, train/loss_vlb_step=0.0235, train/loss_step=0.757, global_step=3347.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  82%|████████▏ | 4916/5971 [50:05<10:44,  1.64it/s, loss=0.206, v_num=0, train/loss_simple_step=0.757, train/loss_vlb_step=0.0235, train/loss_step=0.757, global_step=3347.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4916/5971 [50:05<10:44,  1.64it/s, loss=0.207, v_num=0, train/loss_simple_step=0.0246, train/loss_vlb_step=0.000101, train/loss_step=0.0246, global_step=3347.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4917/5971 [50:06<10:44,  1.64it/s, loss=0.217, v_num=0, train/loss_simple_step=0.488, train/loss_vlb_step=0.00423, train/loss_step=0.488, global_step=3348.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  82%|████████▏ | 4918/5971 [50:07<10:43,  1.64it/s, loss=0.233, v_num=0, train/loss_simple_step=0.338, train/loss_vlb_step=0.00154, train/loss_step=0.338, global_step=3348.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4919/5971 [50:08<10:43,  1.64it/s, loss=0.234, v_num=0, train/loss_simple_step=0.0174, train/loss_vlb_step=6.61e-5, train/loss_step=0.0174, global_step=3348.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4920/5971 [50:10<10:42,  1.63it/s, loss=0.234, v_num=0, train/loss_simple_step=0.0174, train/loss_vlb_step=6.61e-5, train/loss_step=0.0174, global_step=3348.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4920/5971 [50:10<10:42,  1.63it/s, loss=0.237, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000444, train/loss_step=0.133, global_step=3348.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  82%|████████▏ | 4921/5971 [50:11<10:42,  1.63it/s, loss=0.239, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.000482, train/loss_step=0.146, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4922/5971 [50:12<10:41,  1.63it/s, loss=0.24, v_num=0, train/loss_simple_step=0.028, train/loss_vlb_step=0.000111, train/loss_step=0.028, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  82%|████████▏ | 4923/5971 [50:13<10:41,  1.63it/s, loss=0.24, v_num=0, train/loss_simple_step=0.0845, train/loss_vlb_step=0.000278, train/loss_step=0.0845, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4924/5971 [50:15<10:41,  1.63it/s, loss=0.24, v_num=0, train/loss_simple_step=0.0845, train/loss_vlb_step=0.000278, train/loss_step=0.0845, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  82%|████████▏ | 4924/5971 [50:15<10:41,  1.63it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:01,  2.69it/s][A

Validating:   1%|          | 2/167 [00:00<00:47,  3.49it/s][A
Epoch 5:  83%|████████▎ | 4928/5971 [50:16<10:38,  1.63it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   3%|▎         | 5/167 [00:00<00:18,  8.89it/s][A
Epoch 5:  83%|████████▎ | 4932/5971 [50:16<10:35,  1.64it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   5%|▌         | 9/167 [00:00<00:10, 14.88it/s][A
Epoch 5:  83%|████████▎ | 4936/5971 [50:16<10:32,  1.64it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   7%|▋         | 12/167 [00:00<00:08, 18.02it/s][A

Validating:   9%|▉         | 15/167 [00:01<00:07, 20.32it/s][A
Epoch 5:  83%|████████▎ | 4940/5971 [50:16<10:29,  1.64it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  11%|█         | 18/167 [00:01<00:06, 21.49it/s][A
Epoch 5:  83%|████████▎ | 4944/5971 [50:16<10:26,  1.64it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 23.08it/s][A
Epoch 5:  83%|████████▎ | 4948/5971 [50:16<10:23,  1.64it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  14%|█▍        | 24/167 [00:01<00:05, 24.17it/s][A

Validating:  16%|█▌        | 27/167 [00:01<00:05, 24.58it/s][A
Epoch 5:  83%|████████▎ | 4952/5971 [50:17<10:20,  1.64it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 23.62it/s][A
Epoch 5:  83%|████████▎ | 4956/5971 [50:17<10:17,  1.64it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  20%|██        | 34/167 [00:01<00:05, 25.34it/s][A
Epoch 5:  83%|████████▎ | 4960/5971 [50:17<10:14,  1.64it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  22%|██▏       | 37/167 [00:01<00:04, 26.35it/s][A
Epoch 5:  83%|████████▎ | 4964/5971 [50:17<10:11,  1.65it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  24%|██▍       | 40/167 [00:02<00:04, 26.48it/s][A

Validating:  26%|██▌       | 43/167 [00:02<00:04, 27.27it/s][A
Epoch 5:  83%|████████▎ | 4968/5971 [50:17<10:09,  1.65it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 28.95it/s][A
Epoch 5:  83%|████████▎ | 4972/5971 [50:17<10:06,  1.65it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  31%|███       | 51/167 [00:02<00:03, 29.87it/s][A
Epoch 5:  83%|████████▎ | 4976/5971 [50:17<10:03,  1.65it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  32%|███▏      | 54/167 [00:02<00:03, 29.86it/s][A
Epoch 5:  83%|████████▎ | 4980/5971 [50:17<10:00,  1.65it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  34%|███▍      | 57/167 [00:02<00:03, 29.84it/s][A
Epoch 5:  83%|████████▎ | 4984/5971 [50:18<09:57,  1.65it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  37%|███▋      | 61/167 [00:02<00:03, 29.97it/s][A
Epoch 5:  84%|████████▎ | 4988/5971 [50:18<09:54,  1.65it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  38%|███▊      | 64/167 [00:02<00:03, 29.01it/s][A

Validating:  40%|████      | 67/167 [00:02<00:03, 28.72it/s][A
Epoch 5:  84%|████████▎ | 4992/5971 [50:18<09:51,  1.65it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 26.99it/s][A
Epoch 5:  84%|████████▎ | 4996/5971 [50:18<09:48,  1.66it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 27.24it/s][A
Epoch 5:  84%|████████▎ | 5000/5971 [50:18<09:46,  1.66it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 27.84it/s][A

Validating:  47%|████▋     | 79/167 [00:03<00:03, 27.14it/s][A
Epoch 5:  84%|████████▍ | 5004/5971 [50:18<09:43,  1.66it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  49%|████▉     | 82/167 [00:03<00:03, 27.56it/s][A
Epoch 5:  84%|████████▍ | 5008/5971 [50:18<09:40,  1.66it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  51%|█████     | 85/167 [00:03<00:02, 27.93it/s][A
Epoch 5:  84%|████████▍ | 5012/5971 [50:19<09:37,  1.66it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  53%|█████▎    | 88/167 [00:03<00:02, 27.30it/s][A

Validating:  54%|█████▍    | 91/167 [00:03<00:02, 27.17it/s][A
Epoch 5:  84%|████████▍ | 5016/5971 [50:19<09:34,  1.66it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  56%|█████▋    | 94/167 [00:03<00:02, 27.12it/s][A
Epoch 5:  84%|████████▍ | 5020/5971 [50:19<09:31,  1.66it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 27.91it/s][A
Epoch 5:  84%|████████▍ | 5024/5971 [50:19<09:29,  1.66it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 25.88it/s][A
Epoch 5:  84%|████████▍ | 5028/5971 [50:19<09:26,  1.67it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 26.79it/s][A

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 27.31it/s][A
Epoch 5:  84%|████████▍ | 5032/5971 [50:19<09:23,  1.67it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  66%|██████▋   | 111/167 [00:04<00:01, 28.04it/s][A
Epoch 5:  84%|████████▍ | 5036/5971 [50:20<09:20,  1.67it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  68%|██████▊   | 114/167 [00:04<00:01, 26.70it/s][A
Epoch 5:  84%|████████▍ | 5040/5971 [50:20<09:17,  1.67it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  70%|███████   | 117/167 [00:04<00:01, 27.37it/s][A
Epoch 5:  84%|████████▍ | 5044/5971 [50:20<09:14,  1.67it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  72%|███████▏  | 120/167 [00:04<00:01, 26.83it/s][A

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 27.33it/s][A
Epoch 5:  85%|████████▍ | 5048/5971 [50:20<09:12,  1.67it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 25.36it/s][A
Epoch 5:  85%|████████▍ | 5052/5971 [50:20<09:09,  1.67it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 25.70it/s][A
Epoch 5:  85%|████████▍ | 5056/5971 [50:20<09:06,  1.67it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 27.38it/s][A
Epoch 5:  85%|████████▍ | 5060/5971 [50:20<09:03,  1.68it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 26.56it/s][A

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 26.47it/s][A
Epoch 5:  85%|████████▍ | 5064/5971 [50:21<09:00,  1.68it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  86%|████████▌ | 143/167 [00:05<00:00, 27.70it/s][A
Epoch 5:  85%|████████▍ | 5068/5971 [50:21<08:58,  1.68it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  87%|████████▋ | 146/167 [00:05<00:00, 27.92it/s][A
Epoch 5:  85%|████████▍ | 5072/5971 [50:21<08:55,  1.68it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  89%|████████▉ | 149/167 [00:05<00:00, 27.49it/s][A
Epoch 5:  85%|████████▌ | 5076/5971 [50:21<08:52,  1.68it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 27.26it/s][A

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 24.83it/s][A
Epoch 5:  85%|████████▌ | 5080/5971 [50:21<08:49,  1.68it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 25.33it/s][A
Epoch 5:  85%|████████▌ | 5084/5971 [50:21<08:47,  1.68it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 26.03it/s][A
Epoch 5:  85%|████████▌ | 5088/5971 [50:21<08:44,  1.68it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 26.31it/s][A

Validating: 100%|██████████| 167/167 [00:06<00:00, 26.18it/s][A
Epoch 5:  85%|████████▌ | 5092/5971 [50:22<08:41,  1.69it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0655, train/loss_vlb_step=0.000223, train/loss_step=0.0655, global_step=3349.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Epoch 5:  85%|████████▌ | 5093/5971 [50:23<08:41,  1.68it/s, loss=0.222, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000391, train/loss_step=0.118, global_step=3350.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  85%|████████▌ | 5094/5971 [50:24<08:40,  1.68it/s, loss=0.22, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.55e-5, train/loss_step=0.0121, global_step=3350.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  85%|████████▌ | 5095/5971 [50:25<08:40,  1.68it/s, loss=0.222, v_num=0, train/loss_simple_step=0.260, train/loss_vlb_step=0.00115, train/loss_step=0.260, global_step=3350.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  85%|████████▌ | 5096/5971 [50:27<08:39,  1.68it/s, loss=0.222, v_num=0, train/loss_simple_step=0.260, train/loss_vlb_step=0.00115, train/loss_step=0.260, global_step=3350.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  85%|████████▌ | 5096/5971 [50:27<08:39,  1.68it/s, loss=0.198, v_num=0, train/loss_simple_step=0.00258, train/loss_vlb_step=1.5e-5, train/loss_step=0.00258, global_step=3350.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  85%|████████▌ | 5097/5971 [50:28<08:39,  1.68it/s, loss=0.209, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000928, train/loss_step=0.242, global_step=3351.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  85%|████████▌ | 5098/5971 [50:29<08:38,  1.68it/s, loss=0.205, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000411, train/loss_step=0.125, global_step=3351.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  85%|████████▌ | 5099/5971 [50:30<08:38,  1.68it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0644, train/loss_vlb_step=0.000214, train/loss_step=0.0644, global_step=3351.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  85%|████████▌ | 5100/5971 [50:32<08:37,  1.68it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0644, train/loss_vlb_step=0.000214, train/loss_step=0.0644, global_step=3351.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  85%|████████▌ | 5100/5971 [50:32<08:37,  1.68it/s, loss=0.164, v_num=0, train/loss_simple_step=0.328, train/loss_vlb_step=0.00164, train/loss_step=0.328, global_step=3351.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  85%|████████▌ | 5101/5971 [50:33<08:37,  1.68it/s, loss=0.168, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000376, train/loss_step=0.114, global_step=3352.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  85%|████████▌ | 5102/5971 [50:34<08:36,  1.68it/s, loss=0.182, v_num=0, train/loss_simple_step=0.294, train/loss_vlb_step=0.00162, train/loss_step=0.294, global_step=3352.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  85%|████████▌ | 5103/5971 [50:35<08:36,  1.68it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0194, train/loss_vlb_step=8.17e-5, train/loss_step=0.0194, global_step=3352.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  85%|████████▌ | 5104/5971 [50:37<08:35,  1.68it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0194, train/loss_vlb_step=8.17e-5, train/loss_step=0.0194, global_step=3352.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  85%|████████▌ | 5104/5971 [50:37<08:35,  1.68it/s, loss=0.173, v_num=0, train/loss_simple_step=0.586, train/loss_vlb_step=0.00868, train/loss_step=0.586, global_step=3352.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  85%|████████▌ | 5105/5971 [50:38<08:35,  1.68it/s, loss=0.155, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000414, train/loss_step=0.125, global_step=3353.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5106/5971 [50:39<08:34,  1.68it/s, loss=0.166, v_num=0, train/loss_simple_step=0.560, train/loss_vlb_step=0.00775, train/loss_step=0.560, global_step=3353.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  86%|████████▌ | 5107/5971 [50:39<08:34,  1.68it/s, loss=0.187, v_num=0, train/loss_simple_step=0.431, train/loss_vlb_step=0.00252, train/loss_step=0.431, global_step=3353.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5108/5971 [50:42<08:33,  1.68it/s, loss=0.187, v_num=0, train/loss_simple_step=0.431, train/loss_vlb_step=0.00252, train/loss_step=0.431, global_step=3353.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5108/5971 [50:42<08:33,  1.68it/s, loss=0.181, v_num=0, train/loss_simple_step=0.00672, train/loss_vlb_step=3.46e-5, train/loss_step=0.00672, global_step=3353.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5109/5971 [50:43<08:33,  1.68it/s, loss=0.186, v_num=0, train/loss_simple_step=0.258, train/loss_vlb_step=0.000959, train/loss_step=0.258, global_step=3354.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  86%|████████▌ | 5110/5971 [50:44<08:32,  1.68it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0377, train/loss_vlb_step=0.00014, train/loss_step=0.0377, global_step=3354.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5111/5971 [50:44<08:32,  1.68it/s, loss=0.197, v_num=0, train/loss_simple_step=0.294, train/loss_vlb_step=0.00124, train/loss_step=0.294, global_step=3354.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  86%|████████▌ | 5112/5971 [50:47<08:31,  1.68it/s, loss=0.197, v_num=0, train/loss_simple_step=0.294, train/loss_vlb_step=0.00124, train/loss_step=0.294, global_step=3354.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5112/5971 [50:47<08:31,  1.68it/s, loss=0.207, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.00119, train/loss_step=0.268, global_step=3354.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5113/5971 [50:47<08:31,  1.68it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0705, train/loss_vlb_step=0.000235, train/loss_step=0.0705, global_step=3355.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5114/5971 [50:48<08:30,  1.68it/s, loss=0.205, v_num=0, train/loss_simple_step=0.00644, train/loss_vlb_step=3.21e-5, train/loss_step=0.00644, global_step=3355.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5115/5971 [50:49<08:30,  1.68it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0484, train/loss_vlb_step=0.000167, train/loss_step=0.0484, global_step=3355.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  86%|████████▌ | 5116/5971 [50:52<08:29,  1.68it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0484, train/loss_vlb_step=0.000167, train/loss_step=0.0484, global_step=3355.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5116/5971 [50:52<08:29,  1.68it/s, loss=0.208, v_num=0, train/loss_simple_step=0.285, train/loss_vlb_step=0.00139, train/loss_step=0.285, global_step=3355.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  86%|████████▌ | 5117/5971 [50:53<08:29,  1.68it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0176, train/loss_vlb_step=7.19e-5, train/loss_step=0.0176, global_step=3356.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5118/5971 [50:53<08:28,  1.68it/s, loss=0.2, v_num=0, train/loss_simple_step=0.182, train/loss_vlb_step=0.000641, train/loss_step=0.182, global_step=3356.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  86%|████████▌ | 5119/5971 [50:54<08:28,  1.68it/s, loss=0.197, v_num=0, train/loss_simple_step=0.00442, train/loss_vlb_step=2.23e-5, train/loss_step=0.00442, global_step=3356.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5120/5971 [50:56<08:28,  1.68it/s, loss=0.197, v_num=0, train/loss_simple_step=0.00442, train/loss_vlb_step=2.23e-5, train/loss_step=0.00442, global_step=3356.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5120/5971 [50:56<08:28,  1.68it/s, loss=0.193, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.000941, train/loss_step=0.247, global_step=3356.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  86%|████████▌ | 5121/5971 [50:57<08:27,  1.68it/s, loss=0.211, v_num=0, train/loss_simple_step=0.479, train/loss_vlb_step=0.00401, train/loss_step=0.479, global_step=3357.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  86%|████████▌ | 5122/5971 [50:58<08:26,  1.67it/s, loss=0.197, v_num=0, train/loss_simple_step=0.00403, train/loss_vlb_step=2.25e-5, train/loss_step=0.00403, global_step=3357.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5123/5971 [50:59<08:26,  1.67it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0512, train/loss_vlb_step=0.000172, train/loss_step=0.0512, global_step=3357.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  86%|████████▌ | 5124/5971 [51:01<08:26,  1.67it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0512, train/loss_vlb_step=0.000172, train/loss_step=0.0512, global_step=3357.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5124/5971 [51:01<08:26,  1.67it/s, loss=0.187, v_num=0, train/loss_simple_step=0.359, train/loss_vlb_step=0.00176, train/loss_step=0.359, global_step=3357.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  86%|████████▌ | 5125/5971 [51:02<08:25,  1.67it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0512, train/loss_vlb_step=0.000183, train/loss_step=0.0512, global_step=3358.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5126/5971 [51:03<08:24,  1.67it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00304, train/loss_vlb_step=1.67e-5, train/loss_step=0.00304, global_step=3358.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5127/5971 [51:04<08:24,  1.67it/s, loss=0.163, v_num=0, train/loss_simple_step=0.594, train/loss_vlb_step=0.00534, train/loss_step=0.594, global_step=3358.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  86%|████████▌ | 5128/5971 [51:06<08:24,  1.67it/s, loss=0.163, v_num=0, train/loss_simple_step=0.594, train/loss_vlb_step=0.00534, train/loss_step=0.594, global_step=3358.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5128/5971 [51:06<08:24,  1.67it/s, loss=0.177, v_num=0, train/loss_simple_step=0.272, train/loss_vlb_step=0.00103, train/loss_step=0.272, global_step=3358.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5129/5971 [51:07<08:23,  1.67it/s, loss=0.198, v_num=0, train/loss_simple_step=0.685, train/loss_vlb_step=0.0297, train/loss_step=0.685, global_step=3359.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  86%|████████▌ | 5130/5971 [51:08<08:22,  1.67it/s, loss=0.196, v_num=0, train/loss_simple_step=0.00839, train/loss_vlb_step=4.04e-5, train/loss_step=0.00839, global_step=3359.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5131/5971 [51:09<08:22,  1.67it/s, loss=0.194, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.000886, train/loss_step=0.251, global_step=3359.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  86%|████████▌ | 5132/5971 [51:11<08:22,  1.67it/s, loss=0.194, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.000886, train/loss_step=0.251, global_step=3359.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5132/5971 [51:11<08:22,  1.67it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0174, train/loss_vlb_step=7.45e-5, train/loss_step=0.0174, global_step=3359.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5133/5971 [51:12<08:21,  1.67it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00374, train/loss_vlb_step=1.97e-5, train/loss_step=0.00374, global_step=3360.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5134/5971 [51:13<08:20,  1.67it/s, loss=0.184, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000386, train/loss_step=0.117, global_step=3360.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  86%|████████▌ | 5135/5971 [51:14<08:20,  1.67it/s, loss=0.214, v_num=0, train/loss_simple_step=0.641, train/loss_vlb_step=0.015, train/loss_step=0.641, global_step=3360.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  86%|████████▌ | 5136/5971 [51:16<08:20,  1.67it/s, loss=0.214, v_num=0, train/loss_simple_step=0.641, train/loss_vlb_step=0.015, train/loss_step=0.641, global_step=3360.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5136/5971 [51:16<08:20,  1.67it/s, loss=0.233, v_num=0, train/loss_simple_step=0.673, train/loss_vlb_step=0.0136, train/loss_step=0.673, global_step=3360.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5137/5971 [51:17<08:19,  1.67it/s, loss=0.237, v_num=0, train/loss_simple_step=0.096, train/loss_vlb_step=0.000316, train/loss_step=0.096, global_step=3361.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5138/5971 [51:18<08:18,  1.67it/s, loss=0.271, v_num=0, train/loss_simple_step=0.853, train/loss_vlb_step=0.215, train/loss_step=0.853, global_step=3361.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  86%|████████▌ | 5139/5971 [51:19<08:18,  1.67it/s, loss=0.275, v_num=0, train/loss_simple_step=0.0981, train/loss_vlb_step=0.000325, train/loss_step=0.0981, global_step=3361.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5140/5971 [51:21<08:18,  1.67it/s, loss=0.275, v_num=0, train/loss_simple_step=0.0981, train/loss_vlb_step=0.000325, train/loss_step=0.0981, global_step=3361.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5140/5971 [51:21<08:18,  1.67it/s, loss=0.264, v_num=0, train/loss_simple_step=0.0216, train/loss_vlb_step=8.75e-5, train/loss_step=0.0216, global_step=3361.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  86%|████████▌ | 5141/5971 [51:22<08:17,  1.67it/s, loss=0.247, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000434, train/loss_step=0.130, global_step=3362.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  86%|████████▌ | 5142/5971 [51:23<08:17,  1.67it/s, loss=0.247, v_num=0, train/loss_simple_step=0.00698, train/loss_vlb_step=3.25e-5, train/loss_step=0.00698, global_step=3362.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5143/5971 [51:24<08:16,  1.67it/s, loss=0.278, v_num=0, train/loss_simple_step=0.674, train/loss_vlb_step=0.0199, train/loss_step=0.674, global_step=3362.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:  86%|████████▌ | 5144/5971 [51:26<08:16,  1.67it/s, loss=0.278, v_num=0, train/loss_simple_step=0.674, train/loss_vlb_step=0.0199, train/loss_step=0.674, global_step=3362.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5144/5971 [51:26<08:16,  1.67it/s, loss=0.26, v_num=0, train/loss_simple_step=0.00591, train/loss_vlb_step=2.93e-5, train/loss_step=0.00591, global_step=3362.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5145/5971 [51:27<08:15,  1.67it/s, loss=0.262, v_num=0, train/loss_simple_step=0.0838, train/loss_vlb_step=0.000276, train/loss_step=0.0838, global_step=3363.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5146/5971 [51:28<08:14,  1.67it/s, loss=0.268, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000442, train/loss_step=0.131, global_step=3363.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  86%|████████▌ | 5147/5971 [51:28<08:14,  1.67it/s, loss=0.239, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.2e-5, train/loss_step=0.0124, global_step=3363.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5148/5971 [51:31<08:14,  1.67it/s, loss=0.239, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.2e-5, train/loss_step=0.0124, global_step=3363.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5148/5971 [51:31<08:14,  1.67it/s, loss=0.226, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.43e-5, train/loss_step=0.0135, global_step=3363.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▌ | 5149/5971 [51:32<08:13,  1.67it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0399, train/loss_vlb_step=0.000147, train/loss_step=0.0399, global_step=3364.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▋ | 5150/5971 [51:33<08:13,  1.67it/s, loss=0.199, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000388, train/loss_step=0.118, global_step=3364.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  86%|████████▋ | 5151/5971 [51:34<08:12,  1.67it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0238, train/loss_vlb_step=9.13e-5, train/loss_step=0.0238, global_step=3364.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▋ | 5152/5971 [51:36<08:12,  1.66it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0238, train/loss_vlb_step=9.13e-5, train/loss_step=0.0238, global_step=3364.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▋ | 5152/5971 [51:36<08:12,  1.66it/s, loss=0.196, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000611, train/loss_step=0.184, global_step=3364.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  86%|████████▋ | 5153/5971 [51:37<08:11,  1.66it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0708, train/loss_vlb_step=0.000238, train/loss_step=0.0708, global_step=3365.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▋ | 5154/5971 [51:37<08:10,  1.66it/s, loss=0.194, v_num=0, train/loss_simple_step=0.00209, train/loss_vlb_step=1.21e-5, train/loss_step=0.00209, global_step=3365.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▋ | 5155/5971 [51:38<08:10,  1.66it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0776, train/loss_vlb_step=0.000257, train/loss_step=0.0776, global_step=3365.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  86%|████████▋ | 5156/5971 [51:41<08:10,  1.66it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0776, train/loss_vlb_step=0.000257, train/loss_step=0.0776, global_step=3365.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▋ | 5156/5971 [51:41<08:10,  1.66it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0653, train/loss_vlb_step=0.000218, train/loss_step=0.0653, global_step=3365.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▋ | 5157/5971 [51:42<08:09,  1.66it/s, loss=0.137, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000433, train/loss_step=0.132, global_step=3366.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  86%|████████▋ | 5158/5971 [51:43<08:09,  1.66it/s, loss=0.103, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000551, train/loss_step=0.163, global_step=3366.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▋ | 5159/5971 [51:44<08:08,  1.66it/s, loss=0.0991, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=0.000101, train/loss_step=0.0272, global_step=3366.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▋ | 5160/5971 [51:46<08:08,  1.66it/s, loss=0.0991, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=0.000101, train/loss_step=0.0272, global_step=3366.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▋ | 5160/5971 [51:46<08:08,  1.66it/s, loss=0.132, v_num=0, train/loss_simple_step=0.671, train/loss_vlb_step=0.0157, train/loss_step=0.671, global_step=3366.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:  86%|████████▋ | 5161/5971 [51:47<08:07,  1.66it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0484, train/loss_vlb_step=0.000176, train/loss_step=0.0484, global_step=3367.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▋ | 5162/5971 [51:47<08:06,  1.66it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00422, train/loss_vlb_step=2.17e-5, train/loss_step=0.00422, global_step=3367.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▋ | 5163/5971 [51:48<08:06,  1.66it/s, loss=0.094, v_num=0, train/loss_simple_step=0.00746, train/loss_vlb_step=3.31e-5, train/loss_step=0.00746, global_step=3367.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▋ | 5164/5971 [51:50<08:06,  1.66it/s, loss=0.094, v_num=0, train/loss_simple_step=0.00746, train/loss_vlb_step=3.31e-5, train/loss_step=0.00746, global_step=3367.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  86%|████████▋ | 5164/5971 [51:50<08:06,  1.66it/s, loss=0.106, v_num=0, train/loss_simple_step=0.248, train/loss_vlb_step=0.001, train/loss_step=0.248, global_step=3367.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]      
Epoch 5:  87%|████████▋ | 5165/5971 [51:51<08:05,  1.66it/s, loss=0.114, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00097, train/loss_step=0.234, global_step=3368.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  87%|████████▋ | 5166/5971 [51:52<08:04,  1.66it/s, loss=0.118, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000778, train/loss_step=0.209, global_step=3368.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  87%|████████▋ | 5167/5971 [51:53<08:04,  1.66it/s, loss=0.124, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000496, train/loss_step=0.143, global_step=3368.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  87%|████████▋ | 5168/5971 [51:55<08:04,  1.66it/s, loss=0.124, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000496, train/loss_step=0.143, global_step=3368.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  87%|████████▋ | 5168/5971 [51:55<08:04,  1.66it/s, loss=0.143, v_num=0, train/loss_simple_step=0.398, train/loss_vlb_step=0.00206, train/loss_step=0.398, global_step=3368.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  87%|████████▋ | 5169/5971 [51:56<08:03,  1.66it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.01e-5, train/loss_step=0.00665, global_step=3369.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  87%|████████▋ | 5170/5971 [51:57<08:02,  1.66it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0155, train/loss_vlb_step=6.08e-5, train/loss_step=0.0155, global_step=3369.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  87%|████████▋ | 5171/5971 [51:58<08:02,  1.66it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00906, train/loss_vlb_step=4.19e-5, train/loss_step=0.00906, global_step=3369.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  87%|████████▋ | 5172/5971 [52:00<08:01,  1.66it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00906, train/loss_vlb_step=4.19e-5, train/loss_step=0.00906, global_step=3369.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  87%|████████▋ | 5172/5971 [52:00<08:01,  1.66it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0913, train/loss_vlb_step=0.0003, train/loss_step=0.0913, global_step=3369.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  87%|████████▋ | 5173/5971 [52:01<08:01,  1.66it/s, loss=0.143, v_num=0, train/loss_simple_step=0.315, train/loss_vlb_step=0.0014, train/loss_step=0.315, global_step=3370.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  87%|████████▋ | 5174/5971 [52:02<08:00,  1.66it/s, loss=0.161, v_num=0, train/loss_simple_step=0.361, train/loss_vlb_step=0.00208, train/loss_step=0.361, global_step=3370.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  87%|████████▋ | 5175/5971 [52:03<08:00,  1.66it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000148, train/loss_step=0.0386, global_step=3370.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  87%|████████▋ | 5176/5971 [52:05<07:59,  1.66it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000148, train/loss_step=0.0386, global_step=3370.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  87%|████████▋ | 5176/5971 [52:05<07:59,  1.66it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0098, train/loss_vlb_step=4.62e-5, train/loss_step=0.0098, global_step=3370.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  87%|████████▋ | 5177/5971 [52:06<07:59,  1.66it/s, loss=0.173, v_num=0, train/loss_simple_step=0.456, train/loss_vlb_step=0.00269, train/loss_step=0.456, global_step=3371.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  87%|████████▋ | 5178/5971 [52:07<07:58,  1.66it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0521, train/loss_vlb_step=0.000175, train/loss_step=0.0521, global_step=3371.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  87%|████████▋ | 5179/5971 [52:08<07:58,  1.66it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0341, train/loss_vlb_step=0.000122, train/loss_step=0.0341, global_step=3371.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  87%|████████▋ | 5180/5971 [52:10<07:57,  1.66it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0341, train/loss_vlb_step=0.000122, train/loss_step=0.0341, global_step=3371.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  87%|████████▋ | 5180/5971 [52:10<07:57,  1.66it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0211, train/loss_vlb_step=8.27e-5, train/loss_step=0.0211, global_step=3371.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  87%|████████▋ | 5181/5971 [52:11<07:57,  1.65it/s, loss=0.151, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00182, train/loss_step=0.365, global_step=3372.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  87%|████████▋ | 5182/5971 [52:12<07:56,  1.65it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00444, train/loss_vlb_step=2.39e-5, train/loss_step=0.00444, global_step=3372.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  87%|████████▋ | 5183/5971 [52:13<07:56,  1.65it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0862, train/loss_vlb_step=0.000283, train/loss_step=0.0862, global_step=3372.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  87%|████████▋ | 5184/5971 [52:15<07:55,  1.65it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0862, train/loss_vlb_step=0.000283, train/loss_step=0.0862, global_step=3372.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  87%|████████▋ | 5184/5971 [52:15<07:55,  1.65it/s, loss=0.166, v_num=0, train/loss_simple_step=0.478, train/loss_vlb_step=0.00346, train/loss_step=0.478, global_step=3372.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  87%|████████▋ | 5185/5971 [52:16<07:55,  1.65it/s, loss=0.157, v_num=0, train/loss_simple_step=0.047, train/loss_vlb_step=0.000167, train/loss_step=0.047, global_step=3373.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  87%|████████▋ | 5186/5971 [52:17<07:54,  1.65it/s, loss=0.162, v_num=0, train/loss_simple_step=0.303, train/loss_vlb_step=0.00143, train/loss_step=0.303, global_step=3373.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  87%|████████▋ | 5187/5971 [52:17<07:54,  1.65it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00318, train/loss_vlb_step=1.68e-5, train/loss_step=0.00318, global_step=3373.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  87%|████████▋ | 5188/5971 [52:20<07:53,  1.65it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00318, train/loss_vlb_step=1.68e-5, train/loss_step=0.00318, global_step=3373.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  87%|████████▋ | 5188/5971 [52:20<07:53,  1.65it/s, loss=0.14, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000358, train/loss_step=0.108, global_step=3373.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  87%|████████▋ | 5189/5971 [52:20<07:53,  1.65it/s, loss=0.166, v_num=0, train/loss_simple_step=0.523, train/loss_vlb_step=0.0063, train/loss_step=0.523, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  87%|████████▋ | 5190/5971 [52:21<07:52,  1.65it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0798, train/loss_vlb_step=0.000274, train/loss_step=0.0798, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  87%|████████▋ | 5191/5971 [52:22<07:52,  1.65it/s, loss=0.175, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  87%|████████▋ | 5192/5971 [52:24<07:51,  1.65it/s, loss=0.175, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  87%|████████▋ | 5192/5971 [52:24<07:51,  1.65it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:08,  2.41it/s]

Validating:   1%|          | 2/167 [00:00<00:44,  3.68it/s]
Epoch 5:  87%|████████▋ | 5196/5971 [52:25<07:49,  1.65it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   3%|▎         | 5/167 [00:00<00:16,  9.59it/s]
Epoch 5:  87%|████████▋ | 5200/5971 [52:25<07:46,  1.65it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.53it/s]

Validating:   7%|▋         | 11/167 [00:00<00:08, 17.46it/s]
Epoch 5:  87%|████████▋ | 5204/5971 [52:25<07:43,  1.65it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.45it/s]
Epoch 5:  87%|████████▋ | 5208/5971 [52:25<07:40,  1.66it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  10%|█         | 17/167 [00:01<00:06, 22.07it/s]
Epoch 5:  87%|████████▋ | 5212/5971 [52:26<07:38,  1.66it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  13%|█▎        | 21/167 [00:01<00:05, 24.78it/s]
Epoch 5:  87%|████████▋ | 5216/5971 [52:26<07:35,  1.66it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  14%|█▍        | 24/167 [00:01<00:05, 25.40it/s]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 25.67it/s]
Epoch 5:  87%|████████▋ | 5220/5971 [52:26<07:32,  1.66it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 25.53it/s]
Epoch 5:  87%|████████▋ | 5224/5971 [52:26<07:29,  1.66it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 26.48it/s][A
Epoch 5:  88%|████████▊ | 5228/5971 [52:26<07:27,  1.66it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  22%|██▏       | 36/167 [00:01<00:04, 26.92it/s][A

Validating:  23%|██▎       | 39/167 [00:01<00:04, 27.37it/s][A
Epoch 5:  88%|████████▊ | 5232/5971 [52:26<07:24,  1.66it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 26.38it/s][A
Epoch 5:  88%|████████▊ | 5236/5971 [52:26<07:21,  1.66it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 25.41it/s][A
Epoch 5:  88%|████████▊ | 5240/5971 [52:27<07:18,  1.67it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 26.35it/s][A

Validating:  31%|███       | 51/167 [00:02<00:04, 26.77it/s][A
Epoch 5:  88%|████████▊ | 5244/5971 [52:27<07:16,  1.67it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 26.20it/s][A
Epoch 5:  88%|████████▊ | 5248/5971 [52:27<07:13,  1.67it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 26.29it/s][A
Epoch 5:  88%|████████▊ | 5252/5971 [52:27<07:10,  1.67it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  36%|███▌      | 60/167 [00:02<00:03, 27.01it/s][A
Epoch 5:  88%|████████▊ | 5256/5971 [52:27<07:08,  1.67it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  38%|███▊      | 64/167 [00:02<00:03, 27.80it/s][A

Validating:  40%|████      | 67/167 [00:03<00:03, 27.44it/s][A
Epoch 5:  88%|████████▊ | 5260/5971 [52:27<07:05,  1.67it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 26.80it/s][A
Epoch 5:  88%|████████▊ | 5264/5971 [52:28<07:02,  1.67it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 24.86it/s][A
Epoch 5:  88%|████████▊ | 5268/5971 [52:28<07:00,  1.67it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 24.51it/s][A

Validating:  47%|████▋     | 79/167 [00:03<00:03, 25.59it/s][A
Epoch 5:  88%|████████▊ | 5272/5971 [52:28<06:57,  1.67it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  49%|████▉     | 82/167 [00:03<00:03, 26.28it/s][A
Epoch 5:  88%|████████▊ | 5276/5971 [52:28<06:54,  1.68it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  51%|█████     | 85/167 [00:03<00:03, 26.06it/s][A
Epoch 5:  88%|████████▊ | 5280/5971 [52:28<06:51,  1.68it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  53%|█████▎    | 88/167 [00:03<00:03, 25.27it/s][A

Validating:  54%|█████▍    | 91/167 [00:03<00:02, 25.93it/s][A
Epoch 5:  88%|████████▊ | 5284/5971 [52:28<06:49,  1.68it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 26.14it/s][A
Epoch 5:  89%|████████▊ | 5288/5971 [52:29<06:46,  1.68it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 25.85it/s][A
Epoch 5:  89%|████████▊ | 5292/5971 [52:29<06:43,  1.68it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 26.38it/s][A

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 27.10it/s][A
Epoch 5:  89%|████████▊ | 5296/5971 [52:29<06:41,  1.68it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 26.10it/s][A
Epoch 5:  89%|████████▉ | 5300/5971 [52:29<06:38,  1.68it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 24.65it/s][A
Epoch 5:  89%|████████▉ | 5304/5971 [52:29<06:36,  1.68it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  67%|██████▋   | 112/167 [00:04<00:02, 24.57it/s][A

Validating:  69%|██████▉   | 115/167 [00:04<00:02, 25.63it/s][A
Epoch 5:  89%|████████▉ | 5308/5971 [52:29<06:33,  1.69it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  71%|███████   | 118/167 [00:04<00:01, 26.33it/s][A
Epoch 5:  89%|████████▉ | 5312/5971 [52:29<06:30,  1.69it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 25.83it/s][A
Epoch 5:  89%|████████▉ | 5316/5971 [52:30<06:28,  1.69it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 25.22it/s][A

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 26.40it/s][A
Epoch 5:  89%|████████▉ | 5320/5971 [52:30<06:25,  1.69it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 27.22it/s][A
Epoch 5:  89%|████████▉ | 5324/5971 [52:30<06:22,  1.69it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 27.69it/s][A
Epoch 5:  89%|████████▉ | 5328/5971 [52:30<06:20,  1.69it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 25.13it/s][A

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 25.78it/s][A
Epoch 5:  89%|████████▉ | 5332/5971 [52:30<06:17,  1.69it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  85%|████████▌ | 142/167 [00:05<00:01, 24.97it/s][A
Epoch 5:  89%|████████▉ | 5336/5971 [52:30<06:14,  1.69it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 25.67it/s][A
Epoch 5:  89%|████████▉ | 5340/5971 [52:30<06:12,  1.70it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 26.24it/s][A

Validating:  90%|█████████ | 151/167 [00:06<00:00, 26.50it/s][A
Epoch 5:  89%|████████▉ | 5344/5971 [52:31<06:09,  1.70it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 26.52it/s][A
Epoch 5:  90%|████████▉ | 5348/5971 [52:31<06:07,  1.70it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 26.98it/s][A
Epoch 5:  90%|████████▉ | 5352/5971 [52:31<06:04,  1.70it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 26.25it/s][A

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 25.13it/s][A
Epoch 5:  90%|████████▉ | 5356/5971 [52:31<06:01,  1.70it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 25.20it/s][A
Epoch 5:  90%|████████▉ | 5360/5971 [52:31<05:59,  1.70it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|████████▉ | 5360/5971 [52:32<05:59,  1.70it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.75e-5, train/loss_step=0.0135, global_step=3374.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|████████▉ | 5361/5971 [52:33<05:58,  1.70it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0257, train/loss_vlb_step=9.88e-5, train/loss_step=0.0257, global_step=3375.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|████████▉ | 5362/5971 [52:34<05:58,  1.70it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0314, train/loss_vlb_step=0.000112, train/loss_step=0.0314, global_step=3375.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|████████▉ | 5363/5971 [52:35<05:57,  1.70it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0308, train/loss_vlb_step=0.000116, train/loss_step=0.0308, global_step=3375.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|████████▉ | 5364/5971 [52:37<05:57,  1.70it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0308, train/loss_vlb_step=0.000116, train/loss_step=0.0308, global_step=3375.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|████████▉ | 5364/5971 [52:37<05:57,  1.70it/s, loss=0.148, v_num=0, train/loss_simple_step=0.178, train/loss_vlb_step=0.000613, train/loss_step=0.178, global_step=3375.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  90%|████████▉ | 5365/5971 [52:38<05:56,  1.70it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0327, train/loss_vlb_step=0.000125, train/loss_step=0.0327, global_step=3376.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|████████▉ | 5366/5971 [52:39<05:56,  1.70it/s, loss=0.13, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000365, train/loss_step=0.111, global_step=3376.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  90%|████████▉ | 5367/5971 [52:40<05:55,  1.70it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0117, train/loss_vlb_step=5.01e-5, train/loss_step=0.0117, global_step=3376.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|████████▉ | 5368/5971 [52:42<05:55,  1.70it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0117, train/loss_vlb_step=5.01e-5, train/loss_step=0.0117, global_step=3376.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|████████▉ | 5368/5971 [52:42<05:55,  1.70it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000135, train/loss_step=0.0359, global_step=3376.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|████████▉ | 5369/5971 [52:43<05:54,  1.70it/s, loss=0.12, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000574, train/loss_step=0.166, global_step=3377.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  90%|████████▉ | 5370/5971 [52:44<05:54,  1.70it/s, loss=0.134, v_num=0, train/loss_simple_step=0.294, train/loss_vlb_step=0.00149, train/loss_step=0.294, global_step=3377.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|████████▉ | 5371/5971 [52:45<05:53,  1.70it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0624, train/loss_vlb_step=0.000212, train/loss_step=0.0624, global_step=3377.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|████████▉ | 5372/5971 [52:47<05:53,  1.70it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0624, train/loss_vlb_step=0.000212, train/loss_step=0.0624, global_step=3377.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|████████▉ | 5372/5971 [52:47<05:53,  1.70it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00886, train/loss_vlb_step=3.81e-5, train/loss_step=0.00886, global_step=3377.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|████████▉ | 5373/5971 [52:48<05:52,  1.70it/s, loss=0.129, v_num=0, train/loss_simple_step=0.442, train/loss_vlb_step=0.00352, train/loss_step=0.442, global_step=3378.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  90%|█████████ | 5374/5971 [52:49<05:52,  1.70it/s, loss=0.121, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.000493, train/loss_step=0.146, global_step=3378.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|█████████ | 5375/5971 [52:50<05:51,  1.70it/s, loss=0.143, v_num=0, train/loss_simple_step=0.429, train/loss_vlb_step=0.00238, train/loss_step=0.429, global_step=3378.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  90%|█████████ | 5376/5971 [52:52<05:51,  1.69it/s, loss=0.143, v_num=0, train/loss_simple_step=0.429, train/loss_vlb_step=0.00238, train/loss_step=0.429, global_step=3378.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|█████████ | 5376/5971 [52:52<05:51,  1.69it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00395, train/loss_vlb_step=2.08e-5, train/loss_step=0.00395, global_step=3378.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|█████████ | 5377/5971 [52:53<05:50,  1.69it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0265, train/loss_vlb_step=0.000106, train/loss_step=0.0265, global_step=3379.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  90%|█████████ | 5378/5971 [52:54<05:49,  1.69it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0899, train/loss_vlb_step=0.000298, train/loss_step=0.0899, global_step=3379.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|█████████ | 5379/5971 [52:55<05:49,  1.69it/s, loss=0.115, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000497, train/loss_step=0.150, global_step=3379.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  90%|█████████ | 5380/5971 [52:57<05:49,  1.69it/s, loss=0.115, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000497, train/loss_step=0.150, global_step=3379.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|█████████ | 5380/5971 [52:57<05:49,  1.69it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0759, train/loss_vlb_step=0.000258, train/loss_step=0.0759, global_step=3379.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|█████████ | 5381/5971 [52:58<05:48,  1.69it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0129, train/loss_vlb_step=5.77e-5, train/loss_step=0.0129, global_step=3380.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  90%|█████████ | 5382/5971 [52:59<05:47,  1.69it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0802, train/loss_vlb_step=0.000264, train/loss_step=0.0802, global_step=3380.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|█████████ | 5383/5971 [53:00<05:47,  1.69it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0537, train/loss_vlb_step=0.00018, train/loss_step=0.0537, global_step=3380.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  90%|█████████ | 5384/5971 [53:02<05:46,  1.69it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0537, train/loss_vlb_step=0.00018, train/loss_step=0.0537, global_step=3380.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|█████████ | 5384/5971 [53:02<05:46,  1.69it/s, loss=0.119, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000518, train/loss_step=0.156, global_step=3380.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  90%|█████████ | 5385/5971 [53:03<05:46,  1.69it/s, loss=0.142, v_num=0, train/loss_simple_step=0.491, train/loss_vlb_step=0.00314, train/loss_step=0.491, global_step=3381.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  90%|█████████ | 5386/5971 [53:04<05:45,  1.69it/s, loss=0.161, v_num=0, train/loss_simple_step=0.478, train/loss_vlb_step=0.00586, train/loss_step=0.478, global_step=3381.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|█████████ | 5387/5971 [53:05<05:45,  1.69it/s, loss=0.175, v_num=0, train/loss_simple_step=0.304, train/loss_vlb_step=0.00131, train/loss_step=0.304, global_step=3381.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|█████████ | 5388/5971 [53:07<05:44,  1.69it/s, loss=0.175, v_num=0, train/loss_simple_step=0.304, train/loss_vlb_step=0.00131, train/loss_step=0.304, global_step=3381.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|█████████ | 5388/5971 [53:07<05:44,  1.69it/s, loss=0.213, v_num=0, train/loss_simple_step=0.789, train/loss_vlb_step=0.0578, train/loss_step=0.789, global_step=3381.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  90%|█████████ | 5389/5971 [53:08<05:44,  1.69it/s, loss=0.21, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000365, train/loss_step=0.108, global_step=3382.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|█████████ | 5390/5971 [53:09<05:43,  1.69it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0253, train/loss_vlb_step=0.0001, train/loss_step=0.0253, global_step=3382.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|█████████ | 5391/5971 [53:10<05:43,  1.69it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0901, train/loss_vlb_step=0.0003, train/loss_step=0.0901, global_step=3382.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|█████████ | 5392/5971 [53:12<05:42,  1.69it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0901, train/loss_vlb_step=0.0003, train/loss_step=0.0901, global_step=3382.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|█████████ | 5392/5971 [53:12<05:42,  1.69it/s, loss=0.198, v_num=0, train/loss_simple_step=0.00896, train/loss_vlb_step=4.14e-5, train/loss_step=0.00896, global_step=3382.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|█████████ | 5393/5971 [53:13<05:42,  1.69it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0661, train/loss_vlb_step=0.000222, train/loss_step=0.0661, global_step=3383.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  90%|█████████ | 5394/5971 [53:14<05:41,  1.69it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0909, train/loss_vlb_step=0.000299, train/loss_step=0.0909, global_step=3383.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|█████████ | 5395/5971 [53:15<05:41,  1.69it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0697, train/loss_vlb_step=0.000236, train/loss_step=0.0697, global_step=3383.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|█████████ | 5396/5971 [53:17<05:40,  1.69it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0697, train/loss_vlb_step=0.000236, train/loss_step=0.0697, global_step=3383.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|█████████ | 5396/5971 [53:17<05:40,  1.69it/s, loss=0.164, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000363, train/loss_step=0.110, global_step=3383.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  90%|█████████ | 5397/5971 [53:18<05:40,  1.69it/s, loss=0.165, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000196, train/loss_step=0.055, global_step=3384.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|█████████ | 5398/5971 [53:19<05:39,  1.69it/s, loss=0.167, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000389, train/loss_step=0.118, global_step=3384.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|█████████ | 5399/5971 [53:20<05:39,  1.69it/s, loss=0.163, v_num=0, train/loss_simple_step=0.082, train/loss_vlb_step=0.000277, train/loss_step=0.082, global_step=3384.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|█████████ | 5400/5971 [53:22<05:38,  1.69it/s, loss=0.163, v_num=0, train/loss_simple_step=0.082, train/loss_vlb_step=0.000277, train/loss_step=0.082, global_step=3384.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|█████████ | 5400/5971 [53:22<05:38,  1.69it/s, loss=0.177, v_num=0, train/loss_simple_step=0.351, train/loss_vlb_step=0.00161, train/loss_step=0.351, global_step=3384.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  90%|█████████ | 5401/5971 [53:23<05:38,  1.69it/s, loss=0.204, v_num=0, train/loss_simple_step=0.544, train/loss_vlb_step=0.00867, train/loss_step=0.544, global_step=3385.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|█████████ | 5402/5971 [53:24<05:37,  1.69it/s, loss=0.205, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000359, train/loss_step=0.109, global_step=3385.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  90%|█████████ | 5403/5971 [53:25<05:36,  1.69it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0247, train/loss_vlb_step=9.95e-5, train/loss_step=0.0247, global_step=3385.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5404/5971 [53:27<05:36,  1.69it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0247, train/loss_vlb_step=9.95e-5, train/loss_step=0.0247, global_step=3385.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5404/5971 [53:27<05:36,  1.69it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0364, train/loss_vlb_step=0.000133, train/loss_step=0.0364, global_step=3385.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5405/5971 [53:28<05:35,  1.69it/s, loss=0.173, v_num=0, train/loss_simple_step=0.003, train/loss_vlb_step=1.6e-5, train/loss_step=0.003, global_step=3386.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  91%|█████████ | 5406/5971 [53:29<05:35,  1.68it/s, loss=0.17, v_num=0, train/loss_simple_step=0.423, train/loss_vlb_step=0.00271, train/loss_step=0.423, global_step=3386.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5407/5971 [53:29<05:34,  1.68it/s, loss=0.175, v_num=0, train/loss_simple_step=0.403, train/loss_vlb_step=0.00267, train/loss_step=0.403, global_step=3386.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5408/5971 [53:32<05:34,  1.68it/s, loss=0.175, v_num=0, train/loss_simple_step=0.403, train/loss_vlb_step=0.00267, train/loss_step=0.403, global_step=3386.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5408/5971 [53:32<05:34,  1.68it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00252, train/loss_vlb_step=1.46e-5, train/loss_step=0.00252, global_step=3386.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5409/5971 [53:33<05:33,  1.68it/s, loss=0.139, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000563, train/loss_step=0.168, global_step=3387.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  91%|█████████ | 5410/5971 [53:33<05:33,  1.68it/s, loss=0.149, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000851, train/loss_step=0.225, global_step=3387.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5411/5971 [53:34<05:32,  1.68it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0405, train/loss_vlb_step=0.000148, train/loss_step=0.0405, global_step=3387.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5412/5971 [53:37<05:32,  1.68it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0405, train/loss_vlb_step=0.000148, train/loss_step=0.0405, global_step=3387.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5412/5971 [53:37<05:32,  1.68it/s, loss=0.161, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00112, train/loss_step=0.296, global_step=3387.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  91%|█████████ | 5413/5971 [53:38<05:31,  1.68it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00451, train/loss_vlb_step=2.38e-5, train/loss_step=0.00451, global_step=3388.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5414/5971 [53:39<05:31,  1.68it/s, loss=0.162, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000557, train/loss_step=0.165, global_step=3388.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  91%|█████████ | 5415/5971 [53:39<05:30,  1.68it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0182, train/loss_vlb_step=7.21e-5, train/loss_step=0.0182, global_step=3388.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5416/5971 [53:42<05:30,  1.68it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0182, train/loss_vlb_step=7.21e-5, train/loss_step=0.0182, global_step=3388.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5416/5971 [53:42<05:30,  1.68it/s, loss=0.184, v_num=0, train/loss_simple_step=0.620, train/loss_vlb_step=0.0057, train/loss_step=0.620, global_step=3388.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  91%|█████████ | 5417/5971 [53:42<05:29,  1.68it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0117, train/loss_vlb_step=5.09e-5, train/loss_step=0.0117, global_step=3389.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5418/5971 [53:43<05:28,  1.68it/s, loss=0.224, v_num=0, train/loss_simple_step=0.959, train/loss_vlb_step=0.483, train/loss_step=0.959, global_step=3389.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  91%|█████████ | 5419/5971 [53:44<05:28,  1.68it/s, loss=0.222, v_num=0, train/loss_simple_step=0.0413, train/loss_vlb_step=0.000156, train/loss_step=0.0413, global_step=3389.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5420/5971 [53:46<05:27,  1.68it/s, loss=0.222, v_num=0, train/loss_simple_step=0.0413, train/loss_vlb_step=0.000156, train/loss_step=0.0413, global_step=3389.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5420/5971 [53:46<05:27,  1.68it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0268, train/loss_vlb_step=0.000101, train/loss_step=0.0268, global_step=3389.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5421/5971 [53:47<05:27,  1.68it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0184, train/loss_vlb_step=7.35e-5, train/loss_step=0.0184, global_step=3390.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  91%|█████████ | 5422/5971 [53:48<05:26,  1.68it/s, loss=0.185, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000772, train/loss_step=0.211, global_step=3390.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5423/5971 [53:49<05:26,  1.68it/s, loss=0.196, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.00105, train/loss_step=0.251, global_step=3390.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  91%|█████████ | 5424/5971 [53:51<05:25,  1.68it/s, loss=0.196, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.00105, train/loss_step=0.251, global_step=3390.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5424/5971 [53:51<05:25,  1.68it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0218, train/loss_vlb_step=8.91e-5, train/loss_step=0.0218, global_step=3390.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5425/5971 [53:52<05:25,  1.68it/s, loss=0.196, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.4e-5, train/loss_step=0.00458, global_step=3391.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5426/5971 [53:53<05:24,  1.68it/s, loss=0.175, v_num=0, train/loss_simple_step=0.00462, train/loss_vlb_step=2.41e-5, train/loss_step=0.00462, global_step=3391.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5427/5971 [53:54<05:24,  1.68it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.71e-5, train/loss_step=0.0102, global_step=3391.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  91%|█████████ | 5428/5971 [53:56<05:23,  1.68it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.71e-5, train/loss_step=0.0102, global_step=3391.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5428/5971 [53:56<05:23,  1.68it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0331, train/loss_vlb_step=0.000129, train/loss_step=0.0331, global_step=3391.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5429/5971 [53:57<05:23,  1.68it/s, loss=0.161, v_num=0, train/loss_simple_step=0.248, train/loss_vlb_step=0.00121, train/loss_step=0.248, global_step=3392.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  91%|█████████ | 5430/5971 [53:58<05:22,  1.68it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0312, train/loss_vlb_step=0.000118, train/loss_step=0.0312, global_step=3392.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5431/5971 [53:59<05:22,  1.68it/s, loss=0.16, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.00075, train/loss_step=0.214, global_step=3392.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  91%|█████████ | 5432/5971 [54:01<05:21,  1.68it/s, loss=0.16, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.00075, train/loss_step=0.214, global_step=3392.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5432/5971 [54:01<05:21,  1.68it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00275, train/loss_vlb_step=1.56e-5, train/loss_step=0.00275, global_step=3392.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5433/5971 [54:02<05:21,  1.68it/s, loss=0.159, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00119, train/loss_step=0.292, global_step=3393.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  91%|█████████ | 5434/5971 [54:03<05:20,  1.68it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0364, train/loss_vlb_step=0.000131, train/loss_step=0.0364, global_step=3393.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5435/5971 [54:04<05:19,  1.68it/s, loss=0.176, v_num=0, train/loss_simple_step=0.483, train/loss_vlb_step=0.00366, train/loss_step=0.483, global_step=3393.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  91%|█████████ | 5436/5971 [54:06<05:19,  1.67it/s, loss=0.176, v_num=0, train/loss_simple_step=0.483, train/loss_vlb_step=0.00366, train/loss_step=0.483, global_step=3393.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5436/5971 [54:06<05:19,  1.67it/s, loss=0.149, v_num=0, train/loss_simple_step=0.079, train/loss_vlb_step=0.000262, train/loss_step=0.079, global_step=3393.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5437/5971 [54:07<05:18,  1.67it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0656, train/loss_vlb_step=0.000222, train/loss_step=0.0656, global_step=3394.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5438/5971 [54:08<05:18,  1.67it/s, loss=0.133, v_num=0, train/loss_simple_step=0.578, train/loss_vlb_step=0.00799, train/loss_step=0.578, global_step=3394.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  91%|█████████ | 5439/5971 [54:09<05:17,  1.67it/s, loss=0.14, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.000631, train/loss_step=0.186, global_step=3394.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5440/5971 [54:11<05:17,  1.67it/s, loss=0.14, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.000631, train/loss_step=0.186, global_step=3394.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5440/5971 [54:11<05:17,  1.67it/s, loss=0.156, v_num=0, train/loss_simple_step=0.346, train/loss_vlb_step=0.00165, train/loss_step=0.346, global_step=3394.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5441/5971 [54:12<05:16,  1.67it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0415, train/loss_vlb_step=0.000147, train/loss_step=0.0415, global_step=3395.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5442/5971 [54:13<05:16,  1.67it/s, loss=0.154, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000509, train/loss_step=0.153, global_step=3395.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  91%|█████████ | 5443/5971 [54:14<05:15,  1.67it/s, loss=0.15, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000548, train/loss_step=0.162, global_step=3395.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  91%|█████████ | 5444/5971 [54:16<05:15,  1.67it/s, loss=0.15, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000548, train/loss_step=0.162, global_step=3395.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5444/5971 [54:16<05:15,  1.67it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0487, train/loss_vlb_step=0.000167, train/loss_step=0.0487, global_step=3395.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5445/5971 [54:17<05:14,  1.67it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0262, train/loss_vlb_step=0.000104, train/loss_step=0.0262, global_step=3396.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5446/5971 [54:18<05:14,  1.67it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0717, train/loss_vlb_step=0.000238, train/loss_step=0.0717, global_step=3396.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5447/5971 [54:19<05:13,  1.67it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00836, train/loss_vlb_step=4.06e-5, train/loss_step=0.00836, global_step=3396.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5448/5971 [54:21<05:13,  1.67it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00836, train/loss_vlb_step=4.06e-5, train/loss_step=0.00836, global_step=3396.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████ | 5448/5971 [54:21<05:13,  1.67it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0755, train/loss_vlb_step=0.00025, train/loss_step=0.0755, global_step=3396.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  91%|█████████▏| 5449/5971 [54:22<05:12,  1.67it/s, loss=0.172, v_num=0, train/loss_simple_step=0.533, train/loss_vlb_step=0.00737, train/loss_step=0.533, global_step=3397.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  91%|█████████▏| 5450/5971 [54:22<05:11,  1.67it/s, loss=0.187, v_num=0, train/loss_simple_step=0.330, train/loss_vlb_step=0.00144, train/loss_step=0.330, global_step=3397.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████▏| 5451/5971 [54:23<05:11,  1.67it/s, loss=0.187, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.000836, train/loss_step=0.230, global_step=3397.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████▏| 5452/5971 [54:26<05:10,  1.67it/s, loss=0.187, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.000836, train/loss_step=0.230, global_step=3397.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████▏| 5452/5971 [54:26<05:10,  1.67it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=4.8e-5, train/loss_step=0.0115, global_step=3397.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████▏| 5453/5971 [54:27<05:10,  1.67it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00349, train/loss_vlb_step=1.82e-5, train/loss_step=0.00349, global_step=3398.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████▏| 5454/5971 [54:27<05:09,  1.67it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00238, train/loss_vlb_step=1.36e-5, train/loss_step=0.00238, global_step=3398.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████▏| 5455/5971 [54:28<05:09,  1.67it/s, loss=0.153, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.000332, train/loss_step=0.100, global_step=3398.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  91%|█████████▏| 5456/5971 [54:30<05:08,  1.67it/s, loss=0.153, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.000332, train/loss_step=0.100, global_step=3398.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████▏| 5456/5971 [54:30<05:08,  1.67it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0608, train/loss_vlb_step=0.000213, train/loss_step=0.0608, global_step=3398.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████▏| 5457/5971 [54:31<05:08,  1.67it/s, loss=0.159, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000709, train/loss_step=0.208, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  91%|█████████▏| 5458/5971 [54:32<05:07,  1.67it/s, loss=0.142, v_num=0, train/loss_simple_step=0.246, train/loss_vlb_step=0.00101, train/loss_step=0.246, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  91%|█████████▏| 5459/5971 [54:33<05:06,  1.67it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0589, train/loss_vlb_step=0.000216, train/loss_step=0.0589, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████▏| 5460/5971 [54:35<05:06,  1.67it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0589, train/loss_vlb_step=0.000216, train/loss_step=0.0589, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  91%|█████████▏| 5460/5971 [54:35<05:06,  1.67it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:01<03:41,  1.33s/it][A

Validating:   2%|▏         | 3/167 [00:01<01:04,  2.56it/s][A
Epoch 5:  92%|█████████▏| 5464/5971 [54:37<05:04,  1.67it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   4%|▎         | 6/167 [00:01<00:27,  5.81it/s][A
Epoch 5:  92%|█████████▏| 5468/5971 [54:37<05:01,  1.67it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   5%|▌         | 9/167 [00:01<00:17,  8.95it/s][A
Epoch 5:  92%|█████████▏| 5472/5971 [54:37<04:58,  1.67it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   7%|▋         | 12/167 [00:01<00:13, 11.92it/s][A

Validating:   9%|▉         | 15/167 [00:01<00:10, 14.80it/s][A
Epoch 5:  92%|█████████▏| 5476/5971 [54:37<04:56,  1.67it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  11%|█         | 18/167 [00:02<00:08, 16.98it/s][A
Epoch 5:  92%|█████████▏| 5480/5971 [54:38<04:53,  1.67it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  13%|█▎        | 21/167 [00:02<00:07, 19.50it/s][A
Epoch 5:  92%|█████████▏| 5484/5971 [54:38<04:51,  1.67it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  14%|█▍        | 24/167 [00:02<00:06, 21.34it/s][A

Validating:  16%|█▌        | 27/167 [00:02<00:06, 22.29it/s][A
Epoch 5:  92%|█████████▏| 5488/5971 [54:38<04:48,  1.67it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  18%|█▊        | 30/167 [00:02<00:06, 22.19it/s][A
Epoch 5:  92%|█████████▏| 5492/5971 [54:38<04:45,  1.68it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  20%|█▉        | 33/167 [00:02<00:05, 23.34it/s][A
Epoch 5:  92%|█████████▏| 5496/5971 [54:38<04:43,  1.68it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  22%|██▏       | 36/167 [00:02<00:05, 23.71it/s][A

Validating:  23%|██▎       | 39/167 [00:02<00:05, 24.67it/s][A
Epoch 5:  92%|█████████▏| 5500/5971 [54:38<04:40,  1.68it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 25.09it/s][A
Epoch 5:  92%|█████████▏| 5504/5971 [54:39<04:38,  1.68it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  27%|██▋       | 45/167 [00:03<00:04, 25.92it/s][A
Epoch 5:  92%|█████████▏| 5508/5971 [54:39<04:35,  1.68it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  29%|██▊       | 48/167 [00:03<00:04, 25.17it/s][A

Validating:  31%|███       | 51/167 [00:03<00:04, 26.35it/s][A
Epoch 5:  92%|█████████▏| 5512/5971 [54:39<04:33,  1.68it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  32%|███▏      | 54/167 [00:03<00:04, 26.48it/s][A
Epoch 5:  92%|█████████▏| 5516/5971 [54:39<04:30,  1.68it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  34%|███▍      | 57/167 [00:03<00:04, 26.82it/s][A
Epoch 5:  92%|█████████▏| 5520/5971 [54:39<04:27,  1.68it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  37%|███▋      | 61/167 [00:03<00:03, 27.98it/s][A
Epoch 5:  93%|█████████▎| 5524/5971 [54:39<04:25,  1.68it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  38%|███▊      | 64/167 [00:03<00:03, 28.23it/s][A

Validating:  40%|████      | 67/167 [00:03<00:03, 27.73it/s][A
Epoch 5:  93%|█████████▎| 5528/5971 [54:39<04:22,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  42%|████▏     | 70/167 [00:04<00:03, 27.14it/s][A
Epoch 5:  93%|█████████▎| 5532/5971 [54:40<04:20,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  44%|████▎     | 73/167 [00:04<00:03, 27.40it/s][A
Epoch 5:  93%|█████████▎| 5536/5971 [54:40<04:17,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  46%|████▌     | 76/167 [00:04<00:03, 27.26it/s][A

Validating:  47%|████▋     | 79/167 [00:04<00:03, 26.56it/s][A
Epoch 5:  93%|█████████▎| 5540/5971 [54:40<04:15,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  49%|████▉     | 82/167 [00:04<00:03, 26.13it/s][A
Epoch 5:  93%|█████████▎| 5544/5971 [54:40<04:12,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  51%|█████     | 85/167 [00:04<00:03, 26.93it/s][A
Epoch 5:  93%|█████████▎| 5548/5971 [54:40<04:10,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  53%|█████▎    | 88/167 [00:04<00:03, 25.00it/s][A

Validating:  54%|█████▍    | 91/167 [00:04<00:02, 25.61it/s][A
Epoch 5:  93%|█████████▎| 5552/5971 [54:40<04:07,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 24.90it/s][A
Epoch 5:  93%|█████████▎| 5556/5971 [54:40<04:05,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  58%|█████▊    | 97/167 [00:05<00:02, 24.84it/s][A
Epoch 5:  93%|█████████▎| 5560/5971 [54:41<04:02,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  60%|██████    | 101/167 [00:05<00:02, 26.68it/s][A
Epoch 5:  93%|█████████▎| 5564/5971 [54:41<03:59,  1.70it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  62%|██████▏   | 104/167 [00:05<00:02, 25.75it/s][A

Validating:  64%|██████▍   | 107/167 [00:05<00:02, 25.50it/s][A
Epoch 5:  93%|█████████▎| 5568/5971 [54:41<03:57,  1.70it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  66%|██████▌   | 110/167 [00:05<00:02, 26.14it/s][A
Epoch 5:  93%|█████████▎| 5572/5971 [54:41<03:54,  1.70it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  68%|██████▊   | 113/167 [00:05<00:02, 25.49it/s][A
Epoch 5:  93%|█████████▎| 5576/5971 [54:41<03:52,  1.70it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  69%|██████▉   | 116/167 [00:05<00:01, 25.58it/s][A

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 26.57it/s][A
Epoch 5:  93%|█████████▎| 5580/5971 [54:41<03:49,  1.70it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  73%|███████▎  | 122/167 [00:06<00:01, 26.53it/s][A
Epoch 5:  94%|█████████▎| 5584/5971 [54:42<03:47,  1.70it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  75%|███████▍  | 125/167 [00:06<00:01, 26.67it/s][A
Epoch 5:  94%|█████████▎| 5588/5971 [54:42<03:44,  1.70it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  77%|███████▋  | 128/167 [00:06<00:01, 27.16it/s][A

Validating:  78%|███████▊  | 131/167 [00:06<00:01, 27.08it/s][A
Epoch 5:  94%|█████████▎| 5592/5971 [54:42<03:42,  1.70it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  81%|████████  | 135/167 [00:06<00:01, 28.65it/s][A
Epoch 5:  94%|█████████▎| 5596/5971 [54:42<03:39,  1.71it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  83%|████████▎ | 139/167 [00:06<00:00, 29.36it/s][A
Epoch 5:  94%|█████████▍| 5600/5971 [54:42<03:37,  1.71it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 29.28it/s][A
Epoch 5:  94%|█████████▍| 5604/5971 [54:42<03:34,  1.71it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 28.23it/s][A
Epoch 5:  94%|█████████▍| 5608/5971 [54:42<03:32,  1.71it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 27.70it/s][A
Epoch 5:  94%|█████████▍| 5612/5971 [54:43<03:29,  1.71it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  91%|█████████ | 152/167 [00:07<00:00, 28.29it/s][A

Validating:  93%|█████████▎| 155/167 [00:07<00:00, 28.03it/s][A
Epoch 5:  94%|█████████▍| 5616/5971 [54:43<03:27,  1.71it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  95%|█████████▍| 158/167 [00:07<00:00, 27.20it/s][A
Epoch 5:  94%|█████████▍| 5620/5971 [54:43<03:25,  1.71it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  96%|█████████▋| 161/167 [00:07<00:00, 27.86it/s][A
Epoch 5:  94%|█████████▍| 5624/5971 [54:43<03:22,  1.71it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  98%|█████████▊| 164/167 [00:07<00:00, 27.26it/s][A
Epoch 5:  94%|█████████▍| 5628/5971 [54:43<03:20,  1.71it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Epoch 5:  94%|█████████▍| 5628/5971 [54:53<03:20,  1.71it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.32it/s]

Epoch 5:  94%|█████████▍| 5629/5971 [54:55<03:20,  1.71it/s, loss=0.129, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000758, train/loss_step=0.213, global_step=3399.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  94%|█████████▍| 5629/5971 [54:55<03:20,  1.71it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0274, train/loss_vlb_step=0.000109, train/loss_step=0.0274, global_step=3400.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.10it/s]

Epoch 5:  94%|█████████▍| 5630/5971 [55:07<03:20,  1.70it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0274, train/loss_vlb_step=0.000109, train/loss_step=0.0274, global_step=3400.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  94%|█████████▍| 5630/5971 [55:07<03:20,  1.70it/s, loss=0.14, v_num=0, train/loss_simple_step=0.378, train/loss_vlb_step=0.00212, train/loss_step=0.378, global_step=3400.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.07it/s]

Epoch 5:  94%|█████████▍| 5631/5971 [55:20<03:20,  1.70it/s, loss=0.14, v_num=0, train/loss_simple_step=0.378, train/loss_vlb_step=0.00212, train/loss_step=0.378, global_step=3400.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  94%|█████████▍| 5631/5971 [55:20<03:20,  1.70it/s, loss=0.149, v_num=0, train/loss_simple_step=0.344, train/loss_vlb_step=0.00173, train/loss_step=0.344, global_step=3400.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.05it/s]

Epoch 5:  94%|█████████▍| 5632/5971 [55:33<03:20,  1.69it/s, loss=0.149, v_num=0, train/loss_simple_step=0.344, train/loss_vlb_step=0.00173, train/loss_step=0.344, global_step=3400.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  94%|█████████▍| 5632/5971 [55:33<03:20,  1.69it/s, loss=0.164, v_num=0, train/loss_simple_step=0.355, train/loss_vlb_step=0.00164, train/loss_step=0.355, global_step=3400.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  94%|█████████▍| 5633/5971 [55:34<03:20,  1.69it/s, loss=0.164, v_num=0, train/loss_simple_step=0.355, train/loss_vlb_step=0.00164, train/loss_step=0.355, global_step=3400.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  94%|█████████▍| 5633/5971 [55:34<03:20,  1.69it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0091, train/loss_vlb_step=4.27e-5, train/loss_step=0.0091, global_step=3401.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  94%|█████████▍| 5634/5971 [55:35<03:19,  1.69it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0091, train/loss_vlb_step=4.27e-5, train/loss_step=0.0091, global_step=3401.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  94%|█████████▍| 5634/5971 [55:35<03:19,  1.69it/s, loss=0.173, v_num=0, train/loss_simple_step=0.257, train/loss_vlb_step=0.00107, train/loss_step=0.257, global_step=3401.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  94%|█████████▍| 5635/5971 [55:36<03:18,  1.69it/s, loss=0.173, v_num=0, train/loss_simple_step=0.257, train/loss_vlb_step=0.00107, train/loss_step=0.257, global_step=3401.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  94%|█████████▍| 5635/5971 [55:36<03:18,  1.69it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.55e-5, train/loss_step=0.0132, global_step=3401.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  94%|█████████▍| 5636/5971 [55:38<03:18,  1.69it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.55e-5, train/loss_step=0.0132, global_step=3401.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  94%|█████████▍| 5636/5971 [55:38<03:18,  1.69it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00536, train/loss_vlb_step=2.83e-5, train/loss_step=0.00536, global_step=3401.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  94%|█████████▍| 5637/5971 [55:39<03:17,  1.69it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00536, train/loss_vlb_step=2.83e-5, train/loss_step=0.00536, global_step=3401.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  94%|█████████▍| 5637/5971 [55:39<03:17,  1.69it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00248, train/loss_vlb_step=1.41e-5, train/loss_step=0.00248, global_step=3402.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  94%|█████████▍| 5638/5971 [55:40<03:17,  1.69it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00248, train/loss_vlb_step=1.41e-5, train/loss_step=0.00248, global_step=3402.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  94%|█████████▍| 5638/5971 [55:40<03:17,  1.69it/s, loss=0.146, v_num=0, train/loss_simple_step=0.404, train/loss_vlb_step=0.00196, train/loss_step=0.404, global_step=3402.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  94%|█████████▍| 5639/5971 [55:41<03:16,  1.69it/s, loss=0.146, v_num=0, train/loss_simple_step=0.404, train/loss_vlb_step=0.00196, train/loss_step=0.404, global_step=3402.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  94%|█████████▍| 5639/5971 [55:41<03:16,  1.69it/s, loss=0.149, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.00111, train/loss_step=0.288, global_step=3402.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  94%|█████████▍| 5640/5971 [55:43<03:16,  1.69it/s, loss=0.149, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.00111, train/loss_step=0.288, global_step=3402.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  94%|█████████▍| 5640/5971 [55:43<03:16,  1.69it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0893, train/loss_vlb_step=0.000294, train/loss_step=0.0893, global_step=3402.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  94%|█████████▍| 5641/5971 [55:44<03:15,  1.69it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0893, train/loss_vlb_step=0.000294, train/loss_step=0.0893, global_step=3402.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  94%|█████████▍| 5641/5971 [55:44<03:15,  1.69it/s, loss=0.164, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000825, train/loss_step=0.225, global_step=3403.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  94%|█████████▍| 5642/5971 [55:45<03:15,  1.69it/s, loss=0.164, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000825, train/loss_step=0.225, global_step=3403.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  94%|█████████▍| 5642/5971 [55:45<03:15,  1.69it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0189, train/loss_vlb_step=7.98e-5, train/loss_step=0.0189, global_step=3403.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5643/5971 [55:46<03:14,  1.69it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0189, train/loss_vlb_step=7.98e-5, train/loss_step=0.0189, global_step=3403.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5643/5971 [55:46<03:14,  1.69it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00256, train/loss_vlb_step=1.52e-5, train/loss_step=0.00256, global_step=3403.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5644/5971 [55:48<03:13,  1.69it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00256, train/loss_vlb_step=1.52e-5, train/loss_step=0.00256, global_step=3403.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5644/5971 [55:48<03:13,  1.69it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0674, train/loss_vlb_step=0.000233, train/loss_step=0.0674, global_step=3403.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5645/5971 [55:49<03:13,  1.69it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0674, train/loss_vlb_step=0.000233, train/loss_step=0.0674, global_step=3403.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5645/5971 [55:49<03:13,  1.69it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=5.01e-5, train/loss_step=0.0113, global_step=3404.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  95%|█████████▍| 5646/5971 [55:50<03:12,  1.69it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=5.01e-5, train/loss_step=0.0113, global_step=3404.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5646/5971 [55:50<03:12,  1.69it/s, loss=0.145, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.00046, train/loss_step=0.139, global_step=3404.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  95%|█████████▍| 5647/5971 [55:51<03:12,  1.69it/s, loss=0.145, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.00046, train/loss_step=0.139, global_step=3404.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5647/5971 [55:51<03:12,  1.69it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0965, train/loss_vlb_step=0.000318, train/loss_step=0.0965, global_step=3404.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5648/5971 [55:54<03:11,  1.68it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0965, train/loss_vlb_step=0.000318, train/loss_step=0.0965, global_step=3404.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5648/5971 [55:54<03:11,  1.68it/s, loss=0.143, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000398, train/loss_step=0.121, global_step=3404.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  95%|█████████▍| 5649/5971 [55:55<03:11,  1.68it/s, loss=0.143, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000398, train/loss_step=0.121, global_step=3404.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5649/5971 [55:55<03:11,  1.68it/s, loss=0.158, v_num=0, train/loss_simple_step=0.339, train/loss_vlb_step=0.00141, train/loss_step=0.339, global_step=3405.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  95%|█████████▍| 5650/5971 [55:55<03:10,  1.68it/s, loss=0.158, v_num=0, train/loss_simple_step=0.339, train/loss_vlb_step=0.00141, train/loss_step=0.339, global_step=3405.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5650/5971 [55:55<03:10,  1.68it/s, loss=0.146, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000409, train/loss_step=0.125, global_step=3405.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5651/5971 [55:56<03:10,  1.68it/s, loss=0.146, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000409, train/loss_step=0.125, global_step=3405.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5651/5971 [55:56<03:10,  1.68it/s, loss=0.135, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000476, train/loss_step=0.139, global_step=3405.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5652/5971 [55:59<03:09,  1.68it/s, loss=0.135, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000476, train/loss_step=0.139, global_step=3405.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5652/5971 [55:59<03:09,  1.68it/s, loss=0.128, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000714, train/loss_step=0.214, global_step=3405.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5653/5971 [56:00<03:08,  1.68it/s, loss=0.128, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000714, train/loss_step=0.214, global_step=3405.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5653/5971 [56:00<03:08,  1.68it/s, loss=0.151, v_num=0, train/loss_simple_step=0.473, train/loss_vlb_step=0.004, train/loss_step=0.473, global_step=3406.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  95%|█████████▍| 5654/5971 [56:00<03:08,  1.68it/s, loss=0.151, v_num=0, train/loss_simple_step=0.473, train/loss_vlb_step=0.004, train/loss_step=0.473, global_step=3406.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5654/5971 [56:00<03:08,  1.68it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0918, train/loss_vlb_step=0.00031, train/loss_step=0.0918, global_step=3406.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5655/5971 [56:01<03:07,  1.68it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0918, train/loss_vlb_step=0.00031, train/loss_step=0.0918, global_step=3406.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5655/5971 [56:01<03:07,  1.68it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0203, train/loss_vlb_step=7.59e-5, train/loss_step=0.0203, global_step=3406.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5656/5971 [56:04<03:07,  1.68it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0203, train/loss_vlb_step=7.59e-5, train/loss_step=0.0203, global_step=3406.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5656/5971 [56:04<03:07,  1.68it/s, loss=0.164, v_num=0, train/loss_simple_step=0.422, train/loss_vlb_step=0.00284, train/loss_step=0.422, global_step=3406.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  95%|█████████▍| 5657/5971 [56:05<03:06,  1.68it/s, loss=0.164, v_num=0, train/loss_simple_step=0.422, train/loss_vlb_step=0.00284, train/loss_step=0.422, global_step=3406.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5657/5971 [56:05<03:06,  1.68it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00753, train/loss_vlb_step=3.66e-5, train/loss_step=0.00753, global_step=3407.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5658/5971 [56:05<03:06,  1.68it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00753, train/loss_vlb_step=3.66e-5, train/loss_step=0.00753, global_step=3407.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5658/5971 [56:05<03:06,  1.68it/s, loss=0.153, v_num=0, train/loss_simple_step=0.167, train/loss_vlb_step=0.000565, train/loss_step=0.167, global_step=3407.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  95%|█████████▍| 5659/5971 [56:06<03:05,  1.68it/s, loss=0.153, v_num=0, train/loss_simple_step=0.167, train/loss_vlb_step=0.000565, train/loss_step=0.167, global_step=3407.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5659/5971 [56:06<03:05,  1.68it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0014, train/loss_vlb_step=8.5e-6, train/loss_step=0.0014, global_step=3407.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5660/5971 [56:08<03:05,  1.68it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0014, train/loss_vlb_step=8.5e-6, train/loss_step=0.0014, global_step=3407.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5660/5971 [56:08<03:05,  1.68it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00285, train/loss_vlb_step=1.62e-5, train/loss_step=0.00285, global_step=3407.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5661/5971 [56:09<03:04,  1.68it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00285, train/loss_vlb_step=1.62e-5, train/loss_step=0.00285, global_step=3407.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5661/5971 [56:09<03:04,  1.68it/s, loss=0.134, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000827, train/loss_step=0.217, global_step=3408.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  95%|█████████▍| 5662/5971 [56:10<03:03,  1.68it/s, loss=0.134, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000827, train/loss_step=0.217, global_step=3408.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5662/5971 [56:10<03:03,  1.68it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0792, train/loss_vlb_step=0.000268, train/loss_step=0.0792, global_step=3408.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5663/5971 [56:11<03:03,  1.68it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0792, train/loss_vlb_step=0.000268, train/loss_step=0.0792, global_step=3408.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5663/5971 [56:11<03:03,  1.68it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.75e-5, train/loss_step=0.0128, global_step=3408.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  95%|█████████▍| 5664/5971 [56:13<03:02,  1.68it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.75e-5, train/loss_step=0.0128, global_step=3408.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5664/5971 [56:13<03:02,  1.68it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.21e-6, train/loss_step=0.00152, global_step=3408.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5665/5971 [56:14<03:02,  1.68it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.21e-6, train/loss_step=0.00152, global_step=3408.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5665/5971 [56:14<03:02,  1.68it/s, loss=0.154, v_num=0, train/loss_simple_step=0.421, train/loss_vlb_step=0.00228, train/loss_step=0.421, global_step=3409.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  95%|█████████▍| 5666/5971 [56:15<03:01,  1.68it/s, loss=0.154, v_num=0, train/loss_simple_step=0.421, train/loss_vlb_step=0.00228, train/loss_step=0.421, global_step=3409.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5666/5971 [56:15<03:01,  1.68it/s, loss=0.172, v_num=0, train/loss_simple_step=0.493, train/loss_vlb_step=0.00286, train/loss_step=0.493, global_step=3409.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5667/5971 [56:16<03:01,  1.68it/s, loss=0.172, v_num=0, train/loss_simple_step=0.493, train/loss_vlb_step=0.00286, train/loss_step=0.493, global_step=3409.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5667/5971 [56:16<03:01,  1.68it/s, loss=0.195, v_num=0, train/loss_simple_step=0.546, train/loss_vlb_step=0.00696, train/loss_step=0.546, global_step=3409.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5668/5971 [56:18<03:00,  1.68it/s, loss=0.195, v_num=0, train/loss_simple_step=0.546, train/loss_vlb_step=0.00696, train/loss_step=0.546, global_step=3409.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5668/5971 [56:18<03:00,  1.68it/s, loss=0.189, v_num=0, train/loss_simple_step=0.00477, train/loss_vlb_step=2.33e-5, train/loss_step=0.00477, global_step=3409.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5669/5971 [56:19<03:00,  1.68it/s, loss=0.189, v_num=0, train/loss_simple_step=0.00477, train/loss_vlb_step=2.33e-5, train/loss_step=0.00477, global_step=3409.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5669/5971 [56:19<03:00,  1.68it/s, loss=0.184, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.00104, train/loss_step=0.249, global_step=3410.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  95%|█████████▍| 5670/5971 [56:20<02:59,  1.68it/s, loss=0.184, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.00104, train/loss_step=0.249, global_step=3410.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5670/5971 [56:20<02:59,  1.68it/s, loss=0.197, v_num=0, train/loss_simple_step=0.381, train/loss_vlb_step=0.00313, train/loss_step=0.381, global_step=3410.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5671/5971 [56:21<02:58,  1.68it/s, loss=0.197, v_num=0, train/loss_simple_step=0.381, train/loss_vlb_step=0.00313, train/loss_step=0.381, global_step=3410.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5671/5971 [56:21<02:58,  1.68it/s, loss=0.238, v_num=0, train/loss_simple_step=0.957, train/loss_vlb_step=0.482, train/loss_step=0.957, global_step=3410.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  95%|█████████▍| 5672/5971 [56:24<02:58,  1.68it/s, loss=0.238, v_num=0, train/loss_simple_step=0.957, train/loss_vlb_step=0.482, train/loss_step=0.957, global_step=3410.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▍| 5672/5971 [56:24<02:58,  1.68it/s, loss=0.227, v_num=0, train/loss_simple_step=0.00132, train/loss_vlb_step=7.71e-6, train/loss_step=0.00132, global_step=3410.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5673/5971 [56:25<02:57,  1.68it/s, loss=0.227, v_num=0, train/loss_simple_step=0.00132, train/loss_vlb_step=7.71e-6, train/loss_step=0.00132, global_step=3410.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5673/5971 [56:25<02:57,  1.68it/s, loss=0.226, v_num=0, train/loss_simple_step=0.435, train/loss_vlb_step=0.00255, train/loss_step=0.435, global_step=3411.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  95%|█████████▌| 5674/5971 [56:25<02:57,  1.68it/s, loss=0.226, v_num=0, train/loss_simple_step=0.435, train/loss_vlb_step=0.00255, train/loss_step=0.435, global_step=3411.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5674/5971 [56:25<02:57,  1.68it/s, loss=0.223, v_num=0, train/loss_simple_step=0.0341, train/loss_vlb_step=0.000123, train/loss_step=0.0341, global_step=3411.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5675/5971 [56:26<02:56,  1.68it/s, loss=0.223, v_num=0, train/loss_simple_step=0.0341, train/loss_vlb_step=0.000123, train/loss_step=0.0341, global_step=3411.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5675/5971 [56:26<02:56,  1.68it/s, loss=0.222, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=5.08e-5, train/loss_step=0.011, global_step=3411.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  95%|█████████▌| 5676/5971 [56:28<02:56,  1.68it/s, loss=0.222, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=5.08e-5, train/loss_step=0.011, global_step=3411.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5676/5971 [56:28<02:56,  1.68it/s, loss=0.201, v_num=0, train/loss_simple_step=0.00415, train/loss_vlb_step=2.12e-5, train/loss_step=0.00415, global_step=3411.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5677/5971 [56:29<02:55,  1.68it/s, loss=0.201, v_num=0, train/loss_simple_step=0.00415, train/loss_vlb_step=2.12e-5, train/loss_step=0.00415, global_step=3411.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5677/5971 [56:29<02:55,  1.68it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0841, train/loss_vlb_step=0.000285, train/loss_step=0.0841, global_step=3412.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  95%|█████████▌| 5678/5971 [56:30<02:54,  1.67it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0841, train/loss_vlb_step=0.000285, train/loss_step=0.0841, global_step=3412.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5678/5971 [56:30<02:54,  1.67it/s, loss=0.204, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000467, train/loss_step=0.138, global_step=3412.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  95%|█████████▌| 5679/5971 [56:31<02:54,  1.67it/s, loss=0.204, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000467, train/loss_step=0.138, global_step=3412.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5679/5971 [56:31<02:54,  1.67it/s, loss=0.233, v_num=0, train/loss_simple_step=0.590, train/loss_vlb_step=0.0112, train/loss_step=0.590, global_step=3412.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  95%|█████████▌| 5680/5971 [56:33<02:53,  1.67it/s, loss=0.233, v_num=0, train/loss_simple_step=0.590, train/loss_vlb_step=0.0112, train/loss_step=0.590, global_step=3412.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5680/5971 [56:33<02:53,  1.67it/s, loss=0.235, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000126, train/loss_step=0.0318, global_step=3412.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5681/5971 [56:34<02:53,  1.67it/s, loss=0.235, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000126, train/loss_step=0.0318, global_step=3412.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5681/5971 [56:34<02:53,  1.67it/s, loss=0.226, v_num=0, train/loss_simple_step=0.0436, train/loss_vlb_step=0.000152, train/loss_step=0.0436, global_step=3413.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5682/5971 [56:35<02:52,  1.67it/s, loss=0.226, v_num=0, train/loss_simple_step=0.0436, train/loss_vlb_step=0.000152, train/loss_step=0.0436, global_step=3413.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5682/5971 [56:35<02:52,  1.67it/s, loss=0.236, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.00104, train/loss_step=0.283, global_step=3413.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  95%|█████████▌| 5683/5971 [56:36<02:52,  1.67it/s, loss=0.236, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.00104, train/loss_step=0.283, global_step=3413.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5683/5971 [56:36<02:52,  1.67it/s, loss=0.24, v_num=0, train/loss_simple_step=0.0931, train/loss_vlb_step=0.000313, train/loss_step=0.0931, global_step=3413.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5684/5971 [56:38<02:51,  1.67it/s, loss=0.24, v_num=0, train/loss_simple_step=0.0931, train/loss_vlb_step=0.000313, train/loss_step=0.0931, global_step=3413.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5684/5971 [56:38<02:51,  1.67it/s, loss=0.245, v_num=0, train/loss_simple_step=0.0965, train/loss_vlb_step=0.000317, train/loss_step=0.0965, global_step=3413.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5685/5971 [56:39<02:50,  1.67it/s, loss=0.245, v_num=0, train/loss_simple_step=0.0965, train/loss_vlb_step=0.000317, train/loss_step=0.0965, global_step=3413.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5685/5971 [56:39<02:50,  1.67it/s, loss=0.227, v_num=0, train/loss_simple_step=0.0578, train/loss_vlb_step=0.000202, train/loss_step=0.0578, global_step=3414.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5686/5971 [56:40<02:50,  1.67it/s, loss=0.227, v_num=0, train/loss_simple_step=0.0578, train/loss_vlb_step=0.000202, train/loss_step=0.0578, global_step=3414.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5686/5971 [56:40<02:50,  1.67it/s, loss=0.202, v_num=0, train/loss_simple_step=0.00945, train/loss_vlb_step=4.34e-5, train/loss_step=0.00945, global_step=3414.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5687/5971 [56:41<02:49,  1.67it/s, loss=0.202, v_num=0, train/loss_simple_step=0.00945, train/loss_vlb_step=4.34e-5, train/loss_step=0.00945, global_step=3414.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5687/5971 [56:41<02:49,  1.67it/s, loss=0.182, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.00046, train/loss_step=0.139, global_step=3414.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  95%|█████████▌| 5688/5971 [56:43<02:49,  1.67it/s, loss=0.182, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.00046, train/loss_step=0.139, global_step=3414.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5688/5971 [56:43<02:49,  1.67it/s, loss=0.188, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000414, train/loss_step=0.126, global_step=3414.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5689/5971 [56:44<02:48,  1.67it/s, loss=0.188, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000414, train/loss_step=0.126, global_step=3414.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5689/5971 [56:44<02:48,  1.67it/s, loss=0.196, v_num=0, train/loss_simple_step=0.402, train/loss_vlb_step=0.00259, train/loss_step=0.402, global_step=3415.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  95%|█████████▌| 5690/5971 [56:45<02:48,  1.67it/s, loss=0.196, v_num=0, train/loss_simple_step=0.402, train/loss_vlb_step=0.00259, train/loss_step=0.402, global_step=3415.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5690/5971 [56:45<02:48,  1.67it/s, loss=0.177, v_num=0, train/loss_simple_step=0.012, train/loss_vlb_step=5.41e-5, train/loss_step=0.012, global_step=3415.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5691/5971 [56:46<02:47,  1.67it/s, loss=0.177, v_num=0, train/loss_simple_step=0.012, train/loss_vlb_step=5.41e-5, train/loss_step=0.012, global_step=3415.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5691/5971 [56:46<02:47,  1.67it/s, loss=0.136, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000445, train/loss_step=0.135, global_step=3415.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5692/5971 [56:48<02:47,  1.67it/s, loss=0.136, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000445, train/loss_step=0.135, global_step=3415.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5692/5971 [56:48<02:47,  1.67it/s, loss=0.169, v_num=0, train/loss_simple_step=0.646, train/loss_vlb_step=0.0115, train/loss_step=0.646, global_step=3415.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  95%|█████████▌| 5693/5971 [56:49<02:46,  1.67it/s, loss=0.169, v_num=0, train/loss_simple_step=0.646, train/loss_vlb_step=0.0115, train/loss_step=0.646, global_step=3415.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5693/5971 [56:49<02:46,  1.67it/s, loss=0.178, v_num=0, train/loss_simple_step=0.623, train/loss_vlb_step=0.00468, train/loss_step=0.623, global_step=3416.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5694/5971 [56:50<02:45,  1.67it/s, loss=0.178, v_num=0, train/loss_simple_step=0.623, train/loss_vlb_step=0.00468, train/loss_step=0.623, global_step=3416.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5694/5971 [56:50<02:45,  1.67it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0243, train/loss_vlb_step=9.93e-5, train/loss_step=0.0243, global_step=3416.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5695/5971 [56:51<02:45,  1.67it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0243, train/loss_vlb_step=9.93e-5, train/loss_step=0.0243, global_step=3416.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5695/5971 [56:51<02:45,  1.67it/s, loss=0.212, v_num=0, train/loss_simple_step=0.704, train/loss_vlb_step=0.0219, train/loss_step=0.704, global_step=3416.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  95%|█████████▌| 5696/5971 [56:53<02:44,  1.67it/s, loss=0.212, v_num=0, train/loss_simple_step=0.704, train/loss_vlb_step=0.0219, train/loss_step=0.704, global_step=3416.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5696/5971 [56:53<02:44,  1.67it/s, loss=0.217, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=3416.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5697/5971 [56:54<02:44,  1.67it/s, loss=0.217, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=3416.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5697/5971 [56:54<02:44,  1.67it/s, loss=0.22, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.00047, train/loss_step=0.143, global_step=3417.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  95%|█████████▌| 5698/5971 [56:55<02:43,  1.67it/s, loss=0.22, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.00047, train/loss_step=0.143, global_step=3417.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5698/5971 [56:55<02:43,  1.67it/s, loss=0.248, v_num=0, train/loss_simple_step=0.690, train/loss_vlb_step=0.0168, train/loss_step=0.690, global_step=3417.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5699/5971 [56:56<02:43,  1.67it/s, loss=0.248, v_num=0, train/loss_simple_step=0.690, train/loss_vlb_step=0.0168, train/loss_step=0.690, global_step=3417.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5699/5971 [56:56<02:43,  1.67it/s, loss=0.222, v_num=0, train/loss_simple_step=0.0807, train/loss_vlb_step=0.000266, train/loss_step=0.0807, global_step=3417.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5700/5971 [56:58<02:42,  1.67it/s, loss=0.222, v_num=0, train/loss_simple_step=0.0807, train/loss_vlb_step=0.000266, train/loss_step=0.0807, global_step=3417.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5700/5971 [56:58<02:42,  1.67it/s, loss=0.221, v_num=0, train/loss_simple_step=0.00659, train/loss_vlb_step=3.17e-5, train/loss_step=0.00659, global_step=3417.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5701/5971 [56:59<02:41,  1.67it/s, loss=0.221, v_num=0, train/loss_simple_step=0.00659, train/loss_vlb_step=3.17e-5, train/loss_step=0.00659, global_step=3417.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5701/5971 [56:59<02:41,  1.67it/s, loss=0.225, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000427, train/loss_step=0.129, global_step=3418.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  95%|█████████▌| 5702/5971 [57:00<02:41,  1.67it/s, loss=0.225, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000427, train/loss_step=0.129, global_step=3418.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  95%|█████████▌| 5702/5971 [57:00<02:41,  1.67it/s, loss=0.212, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000102, train/loss_step=0.026, global_step=3418.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5703/5971 [57:01<02:40,  1.67it/s, loss=0.212, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000102, train/loss_step=0.026, global_step=3418.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5703/5971 [57:01<02:40,  1.67it/s, loss=0.231, v_num=0, train/loss_simple_step=0.459, train/loss_vlb_step=0.00348, train/loss_step=0.459, global_step=3418.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  96%|█████████▌| 5704/5971 [57:03<02:40,  1.67it/s, loss=0.231, v_num=0, train/loss_simple_step=0.459, train/loss_vlb_step=0.00348, train/loss_step=0.459, global_step=3418.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5704/5971 [57:03<02:40,  1.67it/s, loss=0.226, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.9e-5, train/loss_step=0.0108, global_step=3418.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5705/5971 [57:04<02:39,  1.67it/s, loss=0.226, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.9e-5, train/loss_step=0.0108, global_step=3418.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5705/5971 [57:04<02:39,  1.67it/s, loss=0.231, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000499, train/loss_step=0.152, global_step=3419.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5706/5971 [57:05<02:39,  1.67it/s, loss=0.231, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000499, train/loss_step=0.152, global_step=3419.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5706/5971 [57:05<02:39,  1.67it/s, loss=0.246, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00149, train/loss_step=0.305, global_step=3419.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  96%|█████████▌| 5707/5971 [57:06<02:38,  1.67it/s, loss=0.246, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00149, train/loss_step=0.305, global_step=3419.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5707/5971 [57:06<02:38,  1.67it/s, loss=0.239, v_num=0, train/loss_simple_step=0.00579, train/loss_vlb_step=2.91e-5, train/loss_step=0.00579, global_step=3419.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5708/5971 [57:08<02:37,  1.67it/s, loss=0.239, v_num=0, train/loss_simple_step=0.00579, train/loss_vlb_step=2.91e-5, train/loss_step=0.00579, global_step=3419.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5708/5971 [57:08<02:37,  1.67it/s, loss=0.251, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00249, train/loss_step=0.364, global_step=3419.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  96%|█████████▌| 5709/5971 [57:09<02:37,  1.67it/s, loss=0.251, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00249, train/loss_step=0.364, global_step=3419.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5709/5971 [57:09<02:37,  1.67it/s, loss=0.244, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00107, train/loss_step=0.267, global_step=3420.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5710/5971 [57:10<02:36,  1.66it/s, loss=0.244, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00107, train/loss_step=0.267, global_step=3420.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5710/5971 [57:10<02:36,  1.66it/s, loss=0.245, v_num=0, train/loss_simple_step=0.033, train/loss_vlb_step=0.000124, train/loss_step=0.033, global_step=3420.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5711/5971 [57:10<02:36,  1.66it/s, loss=0.245, v_num=0, train/loss_simple_step=0.033, train/loss_vlb_step=0.000124, train/loss_step=0.033, global_step=3420.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5711/5971 [57:10<02:36,  1.66it/s, loss=0.241, v_num=0, train/loss_simple_step=0.0437, train/loss_vlb_step=0.000159, train/loss_step=0.0437, global_step=3420.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5712/5971 [57:13<02:35,  1.66it/s, loss=0.241, v_num=0, train/loss_simple_step=0.0437, train/loss_vlb_step=0.000159, train/loss_step=0.0437, global_step=3420.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5712/5971 [57:13<02:35,  1.66it/s, loss=0.209, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=6e-5, train/loss_step=0.0135, global_step=3420.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  96%|█████████▌| 5713/5971 [57:14<02:35,  1.66it/s, loss=0.209, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=6e-5, train/loss_step=0.0135, global_step=3420.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5713/5971 [57:14<02:35,  1.66it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=4.95e-5, train/loss_step=0.0114, global_step=3421.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5714/5971 [57:15<02:34,  1.66it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=4.95e-5, train/loss_step=0.0114, global_step=3421.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5714/5971 [57:15<02:34,  1.66it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00157, train/loss_vlb_step=9.44e-6, train/loss_step=0.00157, global_step=3421.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5715/5971 [57:16<02:33,  1.66it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00157, train/loss_vlb_step=9.44e-6, train/loss_step=0.00157, global_step=3421.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5715/5971 [57:16<02:33,  1.66it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00256, train/loss_vlb_step=1.44e-5, train/loss_step=0.00256, global_step=3421.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5716/5971 [57:18<02:33,  1.66it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00256, train/loss_vlb_step=1.44e-5, train/loss_step=0.00256, global_step=3421.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5716/5971 [57:18<02:33,  1.66it/s, loss=0.138, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=7.93e-5, train/loss_step=0.021, global_step=3421.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  96%|█████████▌| 5717/5971 [57:19<02:32,  1.66it/s, loss=0.138, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=7.93e-5, train/loss_step=0.021, global_step=3421.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5717/5971 [57:19<02:32,  1.66it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0051, train/loss_vlb_step=2.59e-5, train/loss_step=0.0051, global_step=3422.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5718/5971 [57:20<02:32,  1.66it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0051, train/loss_vlb_step=2.59e-5, train/loss_step=0.0051, global_step=3422.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5718/5971 [57:20<02:32,  1.66it/s, loss=0.0973, v_num=0, train/loss_simple_step=0.00787, train/loss_vlb_step=3.51e-5, train/loss_step=0.00787, global_step=3422.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5719/5971 [57:20<02:31,  1.66it/s, loss=0.0973, v_num=0, train/loss_simple_step=0.00787, train/loss_vlb_step=3.51e-5, train/loss_step=0.00787, global_step=3422.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5719/5971 [57:20<02:31,  1.66it/s, loss=0.094, v_num=0, train/loss_simple_step=0.014, train/loss_vlb_step=6.16e-5, train/loss_step=0.014, global_step=3422.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:  96%|█████████▌| 5720/5971 [57:23<02:31,  1.66it/s, loss=0.094, v_num=0, train/loss_simple_step=0.014, train/loss_vlb_step=6.16e-5, train/loss_step=0.014, global_step=3422.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5720/5971 [57:23<02:31,  1.66it/s, loss=0.0952, v_num=0, train/loss_simple_step=0.0312, train/loss_vlb_step=0.000119, train/loss_step=0.0312, global_step=3422.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5721/5971 [57:24<02:30,  1.66it/s, loss=0.0952, v_num=0, train/loss_simple_step=0.0312, train/loss_vlb_step=0.000119, train/loss_step=0.0312, global_step=3422.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5721/5971 [57:24<02:30,  1.66it/s, loss=0.094, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000346, train/loss_step=0.105, global_step=3423.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  96%|█████████▌| 5722/5971 [57:24<02:29,  1.66it/s, loss=0.094, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000346, train/loss_step=0.105, global_step=3423.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5722/5971 [57:24<02:29,  1.66it/s, loss=0.0937, v_num=0, train/loss_simple_step=0.0204, train/loss_vlb_step=8.08e-5, train/loss_step=0.0204, global_step=3423.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5723/5971 [57:25<02:29,  1.66it/s, loss=0.0937, v_num=0, train/loss_simple_step=0.0204, train/loss_vlb_step=8.08e-5, train/loss_step=0.0204, global_step=3423.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5723/5971 [57:25<02:29,  1.66it/s, loss=0.0955, v_num=0, train/loss_simple_step=0.496, train/loss_vlb_step=0.00334, train/loss_step=0.496, global_step=3423.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  96%|█████████▌| 5724/5971 [57:28<02:28,  1.66it/s, loss=0.0955, v_num=0, train/loss_simple_step=0.496, train/loss_vlb_step=0.00334, train/loss_step=0.496, global_step=3423.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5724/5971 [57:28<02:28,  1.66it/s, loss=0.103, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000549, train/loss_step=0.164, global_step=3423.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5725/5971 [57:28<02:28,  1.66it/s, loss=0.103, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000549, train/loss_step=0.164, global_step=3423.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5725/5971 [57:28<02:28,  1.66it/s, loss=0.109, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.00103, train/loss_step=0.268, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  96%|█████████▌| 5726/5971 [57:29<02:27,  1.66it/s, loss=0.109, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.00103, train/loss_step=0.268, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5726/5971 [57:29<02:27,  1.66it/s, loss=0.094, v_num=0, train/loss_simple_step=0.00535, train/loss_vlb_step=2.8e-5, train/loss_step=0.00535, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5727/5971 [57:30<02:26,  1.66it/s, loss=0.094, v_num=0, train/loss_simple_step=0.00535, train/loss_vlb_step=2.8e-5, train/loss_step=0.00535, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5727/5971 [57:30<02:26,  1.66it/s, loss=0.0942, v_num=0, train/loss_simple_step=0.00905, train/loss_vlb_step=4.31e-5, train/loss_step=0.00905, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5728/5971 [57:32<02:26,  1.66it/s, loss=0.0942, v_num=0, train/loss_simple_step=0.00905, train/loss_vlb_step=4.31e-5, train/loss_step=0.00905, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  96%|█████████▌| 5728/5971 [57:33<02:26,  1.66it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:14,  2.22it/s]
Epoch 5:  96%|█████████▌| 5730/5971 [57:33<02:25,  1.66it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   1%|          | 2/167 [00:00<00:42,  3.85it/s]
Epoch 5:  96%|█████████▌| 5732/5971 [57:33<02:23,  1.66it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   3%|▎         | 5/167 [00:00<00:16,  9.98it/s]
Epoch 5:  96%|█████████▌| 5735/5971 [57:33<02:22,  1.66it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.84it/s]
Epoch 5:  96%|█████████▌| 5738/5971 [57:33<02:20,  1.66it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.71it/s]
Epoch 5:  96%|█████████▌| 5741/5971 [57:34<02:18,  1.66it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.63it/s]
Epoch 5:  96%|█████████▌| 5745/5971 [57:34<02:15,  1.66it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  10%|█         | 17/167 [00:01<00:06, 21.87it/s]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.24it/s]
Epoch 5:  96%|█████████▋| 5749/5971 [57:34<02:13,  1.66it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 23.67it/s]
Epoch 5:  96%|█████████▋| 5753/5971 [57:34<02:10,  1.67it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.61it/s]
Epoch 5:  96%|█████████▋| 5757/5971 [57:34<02:08,  1.67it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 24.11it/s]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 24.44it/s]
Epoch 5:  96%|█████████▋| 5761/5971 [57:34<02:05,  1.67it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  21%|██        | 35/167 [00:01<00:05, 24.86it/s]
Epoch 5:  97%|█████████▋| 5765/5971 [57:34<02:03,  1.67it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  23%|██▎       | 39/167 [00:02<00:04, 26.84it/s]
Epoch 5:  97%|█████████▋| 5769/5971 [57:35<02:00,  1.67it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 27.06it/s]
Epoch 5:  97%|█████████▋| 5773/5971 [57:35<01:58,  1.67it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 26.74it/s]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 27.43it/s]
Epoch 5:  97%|█████████▋| 5777/5971 [57:35<01:56,  1.67it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  31%|███       | 51/167 [00:02<00:04, 26.64it/s]
Epoch 5:  97%|█████████▋| 5781/5971 [57:35<01:53,  1.67it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 26.77it/s]
Epoch 5:  97%|█████████▋| 5785/5971 [57:35<01:51,  1.67it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 27.38it/s]

Validating:  36%|███▌      | 60/167 [00:02<00:03, 26.93it/s]
Epoch 5:  97%|█████████▋| 5789/5971 [57:35<01:48,  1.68it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  38%|███▊      | 64/167 [00:02<00:03, 28.19it/s]
Epoch 5:  97%|█████████▋| 5793/5971 [57:35<01:46,  1.68it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  40%|████      | 67/167 [00:03<00:03, 27.97it/s]
Epoch 5:  97%|█████████▋| 5797/5971 [57:36<01:43,  1.68it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 26.34it/s]
Epoch 5:  97%|█████████▋| 5801/5971 [57:36<01:41,  1.68it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 26.86it/s]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 25.63it/s]
Epoch 5:  97%|█████████▋| 5805/5971 [57:36<01:38,  1.68it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 25.10it/s]
Epoch 5:  97%|█████████▋| 5809/5971 [57:36<01:36,  1.68it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  49%|████▉     | 82/167 [00:03<00:03, 25.03it/s]
Epoch 5:  97%|█████████▋| 5813/5971 [57:36<01:33,  1.68it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  51%|█████     | 85/167 [00:03<00:03, 24.95it/s]
Epoch 5:  97%|█████████▋| 5817/5971 [57:36<01:31,  1.68it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  53%|█████▎    | 89/167 [00:03<00:02, 26.28it/s]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 26.56it/s]
Epoch 5:  97%|█████████▋| 5821/5971 [57:37<01:29,  1.68it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 26.64it/s]
Epoch 5:  98%|█████████▊| 5825/5971 [57:37<01:26,  1.69it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 26.37it/s]
Epoch 5:  98%|█████████▊| 5829/5971 [57:37<01:24,  1.69it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  60%|██████    | 101/167 [00:04<00:02, 25.79it/s]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 25.28it/s]
Epoch 5:  98%|█████████▊| 5833/5971 [57:37<01:21,  1.69it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 26.09it/s]
Epoch 5:  98%|█████████▊| 5837/5971 [57:37<01:19,  1.69it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 25.58it/s]
Epoch 5:  98%|█████████▊| 5841/5971 [57:37<01:16,  1.69it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  68%|██████▊   | 113/167 [00:04<00:02, 25.71it/s]

Validating:  69%|██████▉   | 116/167 [00:04<00:01, 25.82it/s]
Epoch 5:  98%|█████████▊| 5845/5971 [57:37<01:14,  1.69it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 26.98it/s]
Epoch 5:  98%|█████████▊| 5849/5971 [57:38<01:12,  1.69it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 25.20it/s]
Epoch 5:  98%|█████████▊| 5853/5971 [57:38<01:09,  1.69it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 25.19it/s]
Epoch 5:  98%|█████████▊| 5857/5971 [57:38<01:07,  1.69it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 26.76it/s]
Epoch 5:  98%|█████████▊| 5861/5971 [57:38<01:04,  1.69it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 25.99it/s]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 26.53it/s]
Epoch 5:  98%|█████████▊| 5865/5971 [57:38<01:02,  1.70it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 26.64it/s]
Epoch 5:  98%|█████████▊| 5869/5971 [57:38<01:00,  1.70it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  85%|████████▌ | 142/167 [00:05<00:00, 26.54it/s]
Epoch 5:  98%|█████████▊| 5873/5971 [57:39<00:57,  1.70it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 26.85it/s]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 27.26it/s]
Epoch 5:  98%|█████████▊| 5877/5971 [57:39<00:55,  1.70it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 27.68it/s]
Epoch 5:  98%|█████████▊| 5881/5971 [57:39<00:52,  1.70it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 26.90it/s]
Epoch 5:  99%|█████████▊| 5885/5971 [57:39<00:50,  1.70it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 26.16it/s]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 25.97it/s]
Epoch 5:  99%|█████████▊| 5889/5971 [57:39<00:48,  1.70it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 26.39it/s]
Epoch 5:  99%|█████████▊| 5893/5971 [57:39<00:45,  1.70it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 26.58it/s]
Epoch 5:  99%|█████████▊| 5896/5971 [57:40<00:44,  1.70it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]

Epoch 5:  99%|█████████▉| 5897/5971 [57:41<00:43,  1.70it/s, loss=0.0808, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3424.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5897/5971 [57:41<00:43,  1.70it/s, loss=0.0676, v_num=0, train/loss_simple_step=0.00224, train/loss_vlb_step=1.33e-5, train/loss_step=0.00224, global_step=3425.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5898/5971 [57:42<00:42,  1.70it/s, loss=0.0668, v_num=0, train/loss_simple_step=0.0175, train/loss_vlb_step=6.47e-5, train/loss_step=0.0175, global_step=3425.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  99%|█████████▉| 5899/5971 [57:42<00:42,  1.70it/s, loss=0.0647, v_num=0, train/loss_simple_step=0.0017, train/loss_vlb_step=9.98e-6, train/loss_step=0.0017, global_step=3425.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5900/5971 [57:45<00:41,  1.70it/s, loss=0.0952, v_num=0, train/loss_simple_step=0.625, train/loss_vlb_step=0.0111, train/loss_step=0.625, global_step=3425.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  99%|█████████▉| 5901/5971 [57:46<00:41,  1.70it/s, loss=0.0952, v_num=0, train/loss_simple_step=0.625, train/loss_vlb_step=0.0111, train/loss_step=0.625, global_step=3425.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5901/5971 [57:46<00:41,  1.70it/s, loss=0.0982, v_num=0, train/loss_simple_step=0.0706, train/loss_vlb_step=0.000232, train/loss_step=0.0706, global_step=3426.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5902/5971 [57:47<00:40,  1.70it/s, loss=0.0987, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.92e-5, train/loss_step=0.0105, global_step=3426.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  99%|█████████▉| 5903/5971 [57:48<00:39,  1.70it/s, loss=0.0987, v_num=0, train/loss_simple_step=0.00262, train/loss_vlb_step=1.45e-5, train/loss_step=0.00262, global_step=3426.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5904/5971 [57:51<00:39,  1.70it/s, loss=0.112, v_num=0, train/loss_simple_step=0.298, train/loss_vlb_step=0.00119, train/loss_step=0.298, global_step=3426.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:  99%|█████████▉| 5905/5971 [57:52<00:38,  1.70it/s, loss=0.112, v_num=0, train/loss_simple_step=0.298, train/loss_vlb_step=0.00119, train/loss_step=0.298, global_step=3426.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5905/5971 [57:52<00:38,  1.70it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0032, train/loss_vlb_step=1.75e-5, train/loss_step=0.0032, global_step=3427.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5906/5971 [57:53<00:38,  1.70it/s, loss=0.138, v_num=0, train/loss_simple_step=0.524, train/loss_vlb_step=0.0059, train/loss_step=0.524, global_step=3427.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  99%|█████████▉| 5907/5971 [57:54<00:37,  1.70it/s, loss=0.138, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=7.06e-5, train/loss_step=0.016, global_step=3427.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5908/5971 [57:56<00:37,  1.70it/s, loss=0.172, v_num=0, train/loss_simple_step=0.699, train/loss_vlb_step=0.0178, train/loss_step=0.699, global_step=3427.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  99%|█████████▉| 5909/5971 [57:57<00:36,  1.70it/s, loss=0.172, v_num=0, train/loss_simple_step=0.699, train/loss_vlb_step=0.0178, train/loss_step=0.699, global_step=3427.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5909/5971 [57:57<00:36,  1.70it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00321, train/loss_vlb_step=1.7e-5, train/loss_step=0.00321, global_step=3428.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5910/5971 [57:58<00:35,  1.70it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00781, train/loss_vlb_step=3.72e-5, train/loss_step=0.00781, global_step=3428.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5911/5971 [57:59<00:35,  1.70it/s, loss=0.17, v_num=0, train/loss_simple_step=0.570, train/loss_vlb_step=0.00712, train/loss_step=0.570, global_step=3428.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]     
Epoch 5:  99%|█████████▉| 5912/5971 [58:01<00:34,  1.70it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00391, train/loss_vlb_step=2e-5, train/loss_step=0.00391, global_step=3428.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5913/5971 [58:02<00:34,  1.70it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00391, train/loss_vlb_step=2e-5, train/loss_step=0.00391, global_step=3428.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5913/5971 [58:02<00:34,  1.70it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00602, train/loss_vlb_step=3.03e-5, train/loss_step=0.00602, global_step=3429.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5914/5971 [58:03<00:33,  1.70it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0055, train/loss_vlb_step=2.82e-5, train/loss_step=0.0055, global_step=3429.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  99%|█████████▉| 5915/5971 [58:04<00:32,  1.70it/s, loss=0.158, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000701, train/loss_step=0.196, global_step=3429.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  99%|█████████▉| 5916/5971 [58:06<00:32,  1.70it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00808, train/loss_vlb_step=3.99e-5, train/loss_step=0.00808, global_step=3429.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5917/5971 [58:07<00:31,  1.70it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00808, train/loss_vlb_step=3.99e-5, train/loss_step=0.00808, global_step=3429.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5917/5971 [58:07<00:31,  1.70it/s, loss=0.164, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.000714, train/loss_step=0.206, global_step=3430.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  99%|█████████▉| 5918/5971 [58:08<00:31,  1.70it/s, loss=0.183, v_num=0, train/loss_simple_step=0.412, train/loss_vlb_step=0.00187, train/loss_step=0.412, global_step=3430.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  99%|█████████▉| 5919/5971 [58:09<00:30,  1.70it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0956, train/loss_vlb_step=0.000314, train/loss_step=0.0956, global_step=3430.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5920/5971 [58:11<00:30,  1.70it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00916, train/loss_vlb_step=4.49e-5, train/loss_step=0.00916, global_step=3430.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5921/5971 [58:12<00:29,  1.70it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00916, train/loss_vlb_step=4.49e-5, train/loss_step=0.00916, global_step=3430.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5921/5971 [58:12<00:29,  1.70it/s, loss=0.16, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000431, train/loss_step=0.124, global_step=3431.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  99%|█████████▉| 5922/5971 [58:13<00:28,  1.70it/s, loss=0.174, v_num=0, train/loss_simple_step=0.298, train/loss_vlb_step=0.00146, train/loss_step=0.298, global_step=3431.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5923/5971 [58:14<00:28,  1.70it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0317, train/loss_vlb_step=0.000118, train/loss_step=0.0317, global_step=3431.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5924/5971 [58:16<00:27,  1.69it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0381, train/loss_vlb_step=0.000139, train/loss_step=0.0381, global_step=3431.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5925/5971 [58:17<00:27,  1.69it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0381, train/loss_vlb_step=0.000139, train/loss_step=0.0381, global_step=3431.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5925/5971 [58:17<00:27,  1.69it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0407, train/loss_vlb_step=0.000143, train/loss_step=0.0407, global_step=3432.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5926/5971 [58:17<00:26,  1.69it/s, loss=0.149, v_num=0, train/loss_simple_step=0.212, train/loss_vlb_step=0.000865, train/loss_step=0.212, global_step=3432.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5:  99%|█████████▉| 5927/5971 [58:18<00:25,  1.69it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0438, train/loss_vlb_step=0.000151, train/loss_step=0.0438, global_step=3432.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5928/5971 [58:21<00:25,  1.69it/s, loss=0.127, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.000885, train/loss_step=0.234, global_step=3432.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  99%|█████████▉| 5929/5971 [58:22<00:24,  1.69it/s, loss=0.127, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.000885, train/loss_step=0.234, global_step=3432.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5929/5971 [58:22<00:24,  1.69it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0457, train/loss_vlb_step=0.000165, train/loss_step=0.0457, global_step=3433.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5930/5971 [58:22<00:24,  1.69it/s, loss=0.165, v_num=0, train/loss_simple_step=0.716, train/loss_vlb_step=0.0104, train/loss_step=0.716, global_step=3433.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5:  99%|█████████▉| 5931/5971 [58:23<00:23,  1.69it/s, loss=0.152, v_num=0, train/loss_simple_step=0.309, train/loss_vlb_step=0.00122, train/loss_step=0.309, global_step=3433.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5932/5971 [58:26<00:23,  1.69it/s, loss=0.191, v_num=0, train/loss_simple_step=0.796, train/loss_vlb_step=0.0456, train/loss_step=0.796, global_step=3433.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5:  99%|█████████▉| 5933/5971 [58:26<00:22,  1.69it/s, loss=0.191, v_num=0, train/loss_simple_step=0.796, train/loss_vlb_step=0.0456, train/loss_step=0.796, global_step=3433.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5933/5971 [58:26<00:22,  1.69it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.77e-5, train/loss_step=0.0108, global_step=3434.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5934/5971 [58:27<00:21,  1.69it/s, loss=0.2, v_num=0, train/loss_simple_step=0.167, train/loss_vlb_step=0.000578, train/loss_step=0.167, global_step=3434.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5:  99%|█████████▉| 5935/5971 [58:28<00:21,  1.69it/s, loss=0.215, v_num=0, train/loss_simple_step=0.512, train/loss_vlb_step=0.00503, train/loss_step=0.512, global_step=3434.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5936/5971 [58:30<00:20,  1.69it/s, loss=0.233, v_num=0, train/loss_simple_step=0.358, train/loss_vlb_step=0.00195, train/loss_step=0.358, global_step=3434.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5937/5971 [58:31<00:20,  1.69it/s, loss=0.233, v_num=0, train/loss_simple_step=0.358, train/loss_vlb_step=0.00195, train/loss_step=0.358, global_step=3434.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5937/5971 [58:31<00:20,  1.69it/s, loss=0.223, v_num=0, train/loss_simple_step=0.0167, train/loss_vlb_step=7.06e-5, train/loss_step=0.0167, global_step=3435.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5938/5971 [58:32<00:19,  1.69it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0629, train/loss_vlb_step=0.000217, train/loss_step=0.0629, global_step=3435.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5939/5971 [58:33<00:18,  1.69it/s, loss=0.203, v_num=0, train/loss_simple_step=0.0399, train/loss_vlb_step=0.000151, train/loss_step=0.0399, global_step=3435.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5940/5971 [58:35<00:18,  1.69it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0496, train/loss_vlb_step=0.000183, train/loss_step=0.0496, global_step=3435.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5941/5971 [58:36<00:17,  1.69it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0496, train/loss_vlb_step=0.000183, train/loss_step=0.0496, global_step=3435.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:  99%|█████████▉| 5941/5971 [58:36<00:17,  1.69it/s, loss=0.199, v_num=0, train/loss_simple_step=0.00154, train/loss_vlb_step=9.17e-6, train/loss_step=0.00154, global_step=3436.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|█████████▉| 5942/5971 [58:37<00:17,  1.69it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0175, train/loss_vlb_step=6.57e-5, train/loss_step=0.0175, global_step=3436.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5: 100%|█████████▉| 5943/5971 [58:38<00:16,  1.69it/s, loss=0.19, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000445, train/loss_step=0.135, global_step=3436.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5: 100%|█████████▉| 5944/5971 [58:40<00:15,  1.69it/s, loss=0.217, v_num=0, train/loss_simple_step=0.568, train/loss_vlb_step=0.00419, train/loss_step=0.568, global_step=3436.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|█████████▉| 5945/5971 [58:41<00:15,  1.69it/s, loss=0.217, v_num=0, train/loss_simple_step=0.568, train/loss_vlb_step=0.00419, train/loss_step=0.568, global_step=3436.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|█████████▉| 5945/5971 [58:41<00:15,  1.69it/s, loss=0.215, v_num=0, train/loss_simple_step=0.00189, train/loss_vlb_step=1.12e-5, train/loss_step=0.00189, global_step=3437.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|█████████▉| 5946/5971 [58:42<00:14,  1.69it/s, loss=0.219, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.00164, train/loss_step=0.288, global_step=3437.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5: 100%|█████████▉| 5947/5971 [58:43<00:14,  1.69it/s, loss=0.217, v_num=0, train/loss_simple_step=0.00406, train/loss_vlb_step=2.1e-5, train/loss_step=0.00406, global_step=3437.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|█████████▉| 5948/5971 [58:46<00:13,  1.69it/s, loss=0.205, v_num=0, train/loss_simple_step=0.00103, train/loss_vlb_step=6.15e-6, train/loss_step=0.00103, global_step=3437.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|█████████▉| 5949/5971 [58:46<00:13,  1.69it/s, loss=0.205, v_num=0, train/loss_simple_step=0.00103, train/loss_vlb_step=6.15e-6, train/loss_step=0.00103, global_step=3437.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|█████████▉| 5949/5971 [58:46<00:13,  1.69it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0306, train/loss_vlb_step=0.000116, train/loss_step=0.0306, global_step=3438.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5: 100%|█████████▉| 5950/5971 [58:47<00:12,  1.69it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0236, train/loss_vlb_step=8.57e-5, train/loss_step=0.0236, global_step=3438.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5: 100%|█████████▉| 5951/5971 [58:48<00:11,  1.69it/s, loss=0.179, v_num=0, train/loss_simple_step=0.501, train/loss_vlb_step=0.00354, train/loss_step=0.501, global_step=3438.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5: 100%|█████████▉| 5952/5971 [58:50<00:11,  1.69it/s, loss=0.148, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000568, train/loss_step=0.166, global_step=3438.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|█████████▉| 5953/5971 [58:51<00:10,  1.69it/s, loss=0.148, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000568, train/loss_step=0.166, global_step=3438.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|█████████▉| 5953/5971 [58:51<00:10,  1.69it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00206, train/loss_vlb_step=1.17e-5, train/loss_step=0.00206, global_step=3439.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|█████████▉| 5954/5971 [58:52<00:10,  1.69it/s, loss=0.153, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.00121, train/loss_step=0.288, global_step=3439.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5: 100%|█████████▉| 5955/5971 [58:53<00:09,  1.69it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00392, train/loss_vlb_step=2.07e-5, train/loss_step=0.00392, global_step=3439.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|█████████▉| 5956/5971 [58:55<00:08,  1.68it/s, loss=0.118, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000558, train/loss_step=0.160, global_step=3439.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5: 100%|█████████▉| 5957/5971 [58:56<00:08,  1.68it/s, loss=0.118, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000558, train/loss_step=0.160, global_step=3439.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|█████████▉| 5957/5971 [58:56<00:08,  1.68it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00222, train/loss_vlb_step=1.27e-5, train/loss_step=0.00222, global_step=3440.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|█████████▉| 5958/5971 [58:57<00:07,  1.68it/s, loss=0.124, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000681, train/loss_step=0.203, global_step=3440.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5: 100%|█████████▉| 5959/5971 [58:58<00:07,  1.68it/s, loss=0.124, v_num=0, train/loss_simple_step=0.032, train/loss_vlb_step=0.000115, train/loss_step=0.032, global_step=3440.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|█████████▉| 5960/5971 [59:00<00:06,  1.68it/s, loss=0.134, v_num=0, train/loss_simple_step=0.257, train/loss_vlb_step=0.001, train/loss_step=0.257, global_step=3440.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5: 100%|█████████▉| 5961/5971 [59:01<00:05,  1.68it/s, loss=0.134, v_num=0, train/loss_simple_step=0.257, train/loss_vlb_step=0.001, train/loss_step=0.257, global_step=3440.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|█████████▉| 5961/5971 [59:01<00:05,  1.68it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0107, train/loss_vlb_step=4.86e-5, train/loss_step=0.0107, global_step=3441.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|█████████▉| 5962/5971 [59:02<00:05,  1.68it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.03e-5, train/loss_step=0.0139, global_step=3441.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|█████████▉| 5963/5971 [59:03<00:04,  1.68it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0809, train/loss_vlb_step=0.000267, train/loss_step=0.0809, global_step=3441.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|█████████▉| 5964/5971 [59:05<00:04,  1.68it/s, loss=0.137, v_num=0, train/loss_simple_step=0.662, train/loss_vlb_step=0.0125, train/loss_step=0.662, global_step=3441.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]    
Epoch 5: 100%|█████████▉| 5965/5971 [59:06<00:03,  1.68it/s, loss=0.137, v_num=0, train/loss_simple_step=0.662, train/loss_vlb_step=0.0125, train/loss_step=0.662, global_step=3441.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|█████████▉| 5965/5971 [59:06<00:03,  1.68it/s, loss=0.149, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000903, train/loss_step=0.242, global_step=3442.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|█████████▉| 5966/5971 [59:07<00:02,  1.68it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0263, train/loss_vlb_step=0.0001, train/loss_step=0.0263, global_step=3442.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|█████████▉| 5967/5971 [59:08<00:02,  1.68it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.000125, train/loss_step=0.0325, global_step=3442.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|█████████▉| 5968/5971 [59:10<00:01,  1.68it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0297, train/loss_vlb_step=0.000109, train/loss_step=0.0297, global_step=3442.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|█████████▉| 5969/5971 [59:11<00:01,  1.68it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0297, train/loss_vlb_step=0.000109, train/loss_step=0.0297, global_step=3442.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|█████████▉| 5969/5971 [59:11<00:01,  1.68it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0651, train/loss_vlb_step=0.000225, train/loss_step=0.0651, global_step=3443.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5: 100%|█████████▉| 5970/5971 [59:12<00:00,  1.68it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00508, train/loss_vlb_step=2.73e-5, train/loss_step=0.00508, global_step=3443.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|██████████| 5971/5971 [59:13<00:00,  1.68it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00373, train/loss_vlb_step=1.98e-5, train/loss_step=0.00373, global_step=3443.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|██████████| 5971/5971 [59:16<00:00,  1.68it/s, loss=0.118, v_num=0, train/loss_simple_step=0.241, train/loss_vlb_step=0.000914, train/loss_step=0.241, global_step=3443.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5: 100%|██████████| 5971/5971 [59:17<00:00,  1.68it/s, loss=0.136, v_num=0, train/loss_simple_step=0.366, train/loss_vlb_step=0.00194, train/loss_step=0.366, global_step=3444.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5: 100%|██████████| 5971/5971 [59:18<00:00,  1.68it/s, loss=0.136, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.0013, train/loss_step=0.282, global_step=3444.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5: 100%|██████████| 5971/5971 [59:19<00:00,  1.68it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00208, train/loss_vlb_step=1.25e-5, train/loss_step=0.00208, global_step=3444.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|██████████| 5971/5971 [59:21<00:00,  1.68it/s, loss=0.134, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000376, train/loss_step=0.113, global_step=3444.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5: 100%|██████████| 5971/5971 [59:22<00:00,  1.68it/s, loss=0.14, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000411, train/loss_step=0.123, global_step=3445.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5: 100%|██████████| 5971/5971 [59:23<00:00,  1.68it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00691, train/loss_vlb_step=3.18e-5, train/loss_step=0.00691, global_step=3445.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|██████████| 5971/5971 [59:24<00:00,  1.68it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00465, train/loss_vlb_step=2.13e-5, train/loss_step=0.00465, global_step=3445.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|██████████| 5971/5971 [59:26<00:00,  1.67it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0187, train/loss_vlb_step=7.5e-5, train/loss_step=0.0187, global_step=3445.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]   
Epoch 5: 100%|██████████| 5971/5971 [59:27<00:00,  1.67it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0479, train/loss_vlb_step=0.000165, train/loss_step=0.0479, global_step=3446.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|██████████| 5971/5971 [59:28<00:00,  1.67it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00384, train/loss_vlb_step=2.08e-5, train/loss_step=0.00384, global_step=3446.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|██████████| 5971/5971 [59:29<00:00,  1.67it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0159, train/loss_vlb_step=6.67e-5, train/loss_step=0.0159, global_step=3446.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5: 100%|██████████| 5971/5971 [59:31<00:00,  1.67it/s, loss=0.0952, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.00111, train/loss_step=0.274, global_step=3446.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5: 100%|██████████| 5971/5971 [59:32<00:00,  1.67it/s, loss=0.0833, v_num=0, train/loss_simple_step=0.0048, train/loss_vlb_step=2.55e-5, train/loss_step=0.0048, global_step=3447.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|██████████| 5971/5971 [59:33<00:00,  1.67it/s, loss=0.0826, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.46e-5, train/loss_step=0.0121, global_step=3447.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|██████████| 5971/5971 [59:34<00:00,  1.67it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.000997, train/loss_step=0.262, global_step=3447.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5: 100%|██████████| 5971/5971 [59:36<00:00,  1.67it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00692, train/loss_vlb_step=3.44e-5, train/loss_step=0.00692, global_step=3447.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|██████████| 5971/5971 [59:37<00:00,  1.67it/s, loss=0.0913, v_num=0, train/loss_simple_step=0.0314, train/loss_vlb_step=0.000118, train/loss_step=0.0314, global_step=3448.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5: 100%|██████████| 5971/5971 [59:38<00:00,  1.67it/s, loss=0.0911, v_num=0, train/loss_simple_step=0.0014, train/loss_vlb_step=8.44e-6, train/loss_step=0.0014, global_step=3448.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5: 100%|██████████| 5971/5971 [59:39<00:00,  1.67it/s, loss=0.0999, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000632, train/loss_step=0.180, global_step=3448.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 5: 100%|██████████| 5971/5971 [59:41<00:00,  1.67it/s, loss=0.108, v_num=0, train/loss_simple_step=0.397, train/loss_vlb_step=0.00257, train/loss_step=0.397, global_step=3448.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]  
Epoch 5: 100%|██████████| 5971/5971 [59:44<00:00,  1.67it/s, loss=0.0895, v_num=0, train/loss_simple_step=0.00173, train/loss_vlb_step=1.03e-5, train/loss_step=0.00173, global_step=3449.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 5:   0%|          | 0/5971 [00:00<00:00, 9892.23it/s, loss=0.0895, v_num=0, train/loss_simple_step=0.00173, train/loss_vlb_step=1.03e-5, train/loss_step=0.00173, global_step=3449.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149] 
Epoch 6:   0%|          | 0/5971 [00:00<00:02, 2813.08it/s, loss=0.0895, v_num=0, train/loss_simple_step=0.00173, train/loss_vlb_step=1.03e-5, train/loss_step=0.00173, global_step=3449.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 6:   0%|          | 1/5971 [00:02<1:56:17,  1.17s/it, loss=0.0895, v_num=0, train/loss_simple_step=0.00173, train/loss_vlb_step=1.03e-5, train/loss_step=0.00173, global_step=3449.0, train/loss_simple_epoch=0.149, train/loss_vlb_epoch=0.00294, train/loss_epoch=0.149]
Epoch 6:   0%|          | 1/5971 [00:02<1:56:21,  1.17s/it, loss=0.0771, v_num=0, train/loss_simple_step=0.0335, train/loss_vlb_step=0.000131, train/loss_step=0.0335, global_step=3450.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:   0%|          | 2/5971 [00:03<1:48:04,  1.09s/it, loss=0.0771, v_num=0, train/loss_simple_step=0.0335, train/loss_vlb_step=0.000131, train/loss_step=0.0335, global_step=3450.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 2/5971 [00:03<1:48:06,  1.09s/it, loss=0.0797, v_num=0, train/loss_simple_step=0.0558, train/loss_vlb_step=0.000191, train/loss_step=0.0558, global_step=3450.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 3/5971 [00:04<1:43:50,  1.04s/it, loss=0.0797, v_num=0, train/loss_simple_step=0.0558, train/loss_vlb_step=0.000191, train/loss_step=0.0558, global_step=3450.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 3/5971 [00:04<1:43:51,  1.04s/it, loss=0.0763, v_num=0, train/loss_simple_step=0.0443, train/loss_vlb_step=0.000168, train/loss_step=0.0443, global_step=3450.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 4/5971 [00:06<2:11:18,  1.32s/it, loss=0.0763, v_num=0, train/loss_simple_step=0.0443, train/loss_vlb_step=0.000168, train/loss_step=0.0443, global_step=3450.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 4/5971 [00:06<2:11:19,  1.32s/it, loss=0.0702, v_num=0, train/loss_simple_step=0.00136, train/loss_vlb_step=8.1e-6, train/loss_step=0.00136, global_step=3450.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 5/5971 [00:07<2:04:25,  1.25s/it, loss=0.0702, v_num=0, train/loss_simple_step=0.00136, train/loss_vlb_step=8.1e-6, train/loss_step=0.00136, global_step=3450.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 5/5971 [00:07<2:04:26,  1.25s/it, loss=0.0711, v_num=0, train/loss_simple_step=0.0256, train/loss_vlb_step=0.0001, train/loss_step=0.0256, global_step=3451.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:   0%|          | 6/5971 [00:08<1:59:15,  1.20s/it, loss=0.0711, v_num=0, train/loss_simple_step=0.0256, train/loss_vlb_step=0.0001, train/loss_step=0.0256, global_step=3451.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 6/5971 [00:08<1:59:16,  1.20s/it, loss=0.0712, v_num=0, train/loss_simple_step=0.00633, train/loss_vlb_step=3.19e-5, train/loss_step=0.00633, global_step=3451.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 7/5971 [00:09<1:55:33,  1.16s/it, loss=0.0712, v_num=0, train/loss_simple_step=0.00633, train/loss_vlb_step=3.19e-5, train/loss_step=0.00633, global_step=3451.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 7/5971 [00:09<1:55:34,  1.16s/it, loss=0.0711, v_num=0, train/loss_simple_step=0.0168, train/loss_vlb_step=6.74e-5, train/loss_step=0.0168, global_step=3451.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:   0%|          | 8/5971 [00:11<2:07:03,  1.28s/it, loss=0.0711, v_num=0, train/loss_simple_step=0.0168, train/loss_vlb_step=6.74e-5, train/loss_step=0.0168, global_step=3451.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 8/5971 [00:11<2:07:04,  1.28s/it, loss=0.0774, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000584, train/loss_step=0.174, global_step=3451.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   0%|          | 9/5971 [00:12<2:03:23,  1.24s/it, loss=0.0774, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000584, train/loss_step=0.174, global_step=3451.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 9/5971 [00:12<2:03:23,  1.24s/it, loss=0.0775, v_num=0, train/loss_simple_step=0.00504, train/loss_vlb_step=2.54e-5, train/loss_step=0.00504, global_step=3452.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 10/5971 [00:13<2:00:09,  1.21s/it, loss=0.0775, v_num=0, train/loss_simple_step=0.00504, train/loss_vlb_step=2.54e-5, train/loss_step=0.00504, global_step=3452.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 10/5971 [00:13<2:00:10,  1.21s/it, loss=0.0844, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000527, train/loss_step=0.154, global_step=3452.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:   0%|          | 11/5971 [00:14<1:57:25,  1.18s/it, loss=0.0844, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000527, train/loss_step=0.154, global_step=3452.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 11/5971 [00:14<1:57:25,  1.18s/it, loss=0.073, v_num=0, train/loss_simple_step=0.0464, train/loss_vlb_step=0.00016, train/loss_step=0.0464, global_step=3452.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 12/5971 [00:16<2:06:51,  1.28s/it, loss=0.073, v_num=0, train/loss_simple_step=0.0464, train/loss_vlb_step=0.00016, train/loss_step=0.0464, global_step=3452.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 12/5971 [00:16<2:06:51,  1.28s/it, loss=0.0803, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000508, train/loss_step=0.151, global_step=3452.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 13/5971 [00:17<2:04:11,  1.25s/it, loss=0.0803, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000508, train/loss_step=0.151, global_step=3452.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 13/5971 [00:17<2:04:12,  1.25s/it, loss=0.082, v_num=0, train/loss_simple_step=0.0462, train/loss_vlb_step=0.000169, train/loss_step=0.0462, global_step=3453.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 14/5971 [00:18<2:01:51,  1.23s/it, loss=0.082, v_num=0, train/loss_simple_step=0.0462, train/loss_vlb_step=0.000169, train/loss_step=0.0462, global_step=3453.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 14/5971 [00:18<2:01:52,  1.23s/it, loss=0.0811, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.000919, train/loss_step=0.243, global_step=3453.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   0%|          | 15/5971 [00:19<1:59:50,  1.21s/it, loss=0.0811, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.000919, train/loss_step=0.243, global_step=3453.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 15/5971 [00:19<1:59:51,  1.21s/it, loss=0.0821, v_num=0, train/loss_simple_step=0.0276, train/loss_vlb_step=0.000102, train/loss_step=0.0276, global_step=3453.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 16/5971 [00:21<2:05:17,  1.26s/it, loss=0.0821, v_num=0, train/loss_simple_step=0.0276, train/loss_vlb_step=0.000102, train/loss_step=0.0276, global_step=3453.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 16/5971 [00:21<2:05:18,  1.26s/it, loss=0.0941, v_num=0, train/loss_simple_step=0.271, train/loss_vlb_step=0.00118, train/loss_step=0.271, global_step=3453.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:   0%|          | 17/5971 [00:22<2:03:20,  1.24s/it, loss=0.0941, v_num=0, train/loss_simple_step=0.271, train/loss_vlb_step=0.00118, train/loss_step=0.271, global_step=3453.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 17/5971 [00:22<2:03:20,  1.24s/it, loss=0.0999, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000388, train/loss_step=0.118, global_step=3454.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 18/5971 [00:23<2:01:28,  1.22s/it, loss=0.0999, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000388, train/loss_step=0.118, global_step=3454.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 18/5971 [00:23<2:01:28,  1.22s/it, loss=0.0923, v_num=0, train/loss_simple_step=0.028, train/loss_vlb_step=0.000111, train/loss_step=0.028, global_step=3454.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 19/5971 [00:24<1:59:53,  1.21s/it, loss=0.0923, v_num=0, train/loss_simple_step=0.028, train/loss_vlb_step=0.000111, train/loss_step=0.028, global_step=3454.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 19/5971 [00:24<1:59:53,  1.21s/it, loss=0.0767, v_num=0, train/loss_simple_step=0.0849, train/loss_vlb_step=0.000284, train/loss_step=0.0849, global_step=3454.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 20/5971 [00:26<2:04:04,  1.25s/it, loss=0.0767, v_num=0, train/loss_simple_step=0.0849, train/loss_vlb_step=0.000284, train/loss_step=0.0849, global_step=3454.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 20/5971 [00:26<2:04:04,  1.25s/it, loss=0.0885, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.000907, train/loss_step=0.237, global_step=3454.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:   0%|          | 21/5971 [00:27<2:02:29,  1.24s/it, loss=0.0885, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.000907, train/loss_step=0.237, global_step=3454.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 21/5971 [00:27<2:02:29,  1.24s/it, loss=0.0953, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000579, train/loss_step=0.170, global_step=3455.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 22/5971 [00:28<2:00:56,  1.22s/it, loss=0.0953, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000579, train/loss_step=0.170, global_step=3455.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 22/5971 [00:28<2:00:57,  1.22s/it, loss=0.112, v_num=0, train/loss_simple_step=0.386, train/loss_vlb_step=0.00275, train/loss_step=0.386, global_step=3455.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:   0%|          | 23/5971 [00:28<1:59:29,  1.21s/it, loss=0.112, v_num=0, train/loss_simple_step=0.386, train/loss_vlb_step=0.00275, train/loss_step=0.386, global_step=3455.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 23/5971 [00:28<1:59:29,  1.21s/it, loss=0.114, v_num=0, train/loss_simple_step=0.0948, train/loss_vlb_step=0.000319, train/loss_step=0.0948, global_step=3455.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 24/5971 [00:31<2:03:16,  1.24s/it, loss=0.114, v_num=0, train/loss_simple_step=0.0948, train/loss_vlb_step=0.000319, train/loss_step=0.0948, global_step=3455.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 24/5971 [00:31<2:03:16,  1.24s/it, loss=0.118, v_num=0, train/loss_simple_step=0.0772, train/loss_vlb_step=0.000255, train/loss_step=0.0772, global_step=3455.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 25/5971 [00:32<2:01:58,  1.23s/it, loss=0.118, v_num=0, train/loss_simple_step=0.0772, train/loss_vlb_step=0.000255, train/loss_step=0.0772, global_step=3455.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 25/5971 [00:32<2:01:58,  1.23s/it, loss=0.127, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000805, train/loss_step=0.201, global_step=3456.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:   0%|          | 26/5971 [00:32<2:00:42,  1.22s/it, loss=0.127, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000805, train/loss_step=0.201, global_step=3456.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 26/5971 [00:32<2:00:42,  1.22s/it, loss=0.131, v_num=0, train/loss_simple_step=0.0815, train/loss_vlb_step=0.00027, train/loss_step=0.0815, global_step=3456.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 27/5971 [00:33<1:59:31,  1.21s/it, loss=0.131, v_num=0, train/loss_simple_step=0.0815, train/loss_vlb_step=0.00027, train/loss_step=0.0815, global_step=3456.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 27/5971 [00:33<1:59:31,  1.21s/it, loss=0.132, v_num=0, train/loss_simple_step=0.036, train/loss_vlb_step=0.000135, train/loss_step=0.036, global_step=3456.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   0%|          | 28/5971 [00:35<2:02:42,  1.24s/it, loss=0.132, v_num=0, train/loss_simple_step=0.036, train/loss_vlb_step=0.000135, train/loss_step=0.036, global_step=3456.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 28/5971 [00:35<2:02:43,  1.24s/it, loss=0.152, v_num=0, train/loss_simple_step=0.577, train/loss_vlb_step=0.006, train/loss_step=0.577, global_step=3456.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:   0%|          | 29/5971 [00:36<2:01:54,  1.23s/it, loss=0.152, v_num=0, train/loss_simple_step=0.577, train/loss_vlb_step=0.006, train/loss_step=0.577, global_step=3456.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   0%|          | 29/5971 [00:36<2:01:54,  1.23s/it, loss=0.152, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=5.83e-5, train/loss_step=0.0139, global_step=3457.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 30/5971 [00:37<2:00:49,  1.22s/it, loss=0.152, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=5.83e-5, train/loss_step=0.0139, global_step=3457.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 30/5971 [00:37<2:00:49,  1.22s/it, loss=0.146, v_num=0, train/loss_simple_step=0.0257, train/loss_vlb_step=0.000101, train/loss_step=0.0257, global_step=3457.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 31/5971 [00:38<1:59:46,  1.21s/it, loss=0.146, v_num=0, train/loss_simple_step=0.0257, train/loss_vlb_step=0.000101, train/loss_step=0.0257, global_step=3457.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 31/5971 [00:38<1:59:46,  1.21s/it, loss=0.144, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.03e-5, train/loss_step=0.0139, global_step=3457.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   1%|          | 32/5971 [00:41<2:03:19,  1.25s/it, loss=0.144, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.03e-5, train/loss_step=0.0139, global_step=3457.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 32/5971 [00:41<2:03:19,  1.25s/it, loss=0.167, v_num=0, train/loss_simple_step=0.602, train/loss_vlb_step=0.0114, train/loss_step=0.602, global_step=3457.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:   1%|          | 33/5971 [00:42<2:02:19,  1.24s/it, loss=0.167, v_num=0, train/loss_simple_step=0.602, train/loss_vlb_step=0.0114, train/loss_step=0.602, global_step=3457.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 33/5971 [00:42<2:02:19,  1.24s/it, loss=0.167, v_num=0, train/loss_simple_step=0.0573, train/loss_vlb_step=0.000201, train/loss_step=0.0573, global_step=3458.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 34/5971 [00:42<2:01:17,  1.23s/it, loss=0.167, v_num=0, train/loss_simple_step=0.0573, train/loss_vlb_step=0.000201, train/loss_step=0.0573, global_step=3458.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 34/5971 [00:42<2:01:17,  1.23s/it, loss=0.157, v_num=0, train/loss_simple_step=0.0287, train/loss_vlb_step=0.000111, train/loss_step=0.0287, global_step=3458.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 35/5971 [00:43<2:00:21,  1.22s/it, loss=0.157, v_num=0, train/loss_simple_step=0.0287, train/loss_vlb_step=0.000111, train/loss_step=0.0287, global_step=3458.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 35/5971 [00:43<2:00:21,  1.22s/it, loss=0.178, v_num=0, train/loss_simple_step=0.458, train/loss_vlb_step=0.00323, train/loss_step=0.458, global_step=3458.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:   1%|          | 36/5971 [00:45<2:02:57,  1.24s/it, loss=0.178, v_num=0, train/loss_simple_step=0.458, train/loss_vlb_step=0.00323, train/loss_step=0.458, global_step=3458.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 36/5971 [00:45<2:02:57,  1.24s/it, loss=0.165, v_num=0, train/loss_simple_step=0.00509, train/loss_vlb_step=2.67e-5, train/loss_step=0.00509, global_step=3458.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 37/5971 [00:46<2:02:01,  1.23s/it, loss=0.165, v_num=0, train/loss_simple_step=0.00509, train/loss_vlb_step=2.67e-5, train/loss_step=0.00509, global_step=3458.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 37/5971 [00:46<2:02:01,  1.23s/it, loss=0.181, v_num=0, train/loss_simple_step=0.447, train/loss_vlb_step=0.00208, train/loss_step=0.447, global_step=3459.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:   1%|          | 38/5971 [00:47<2:01:08,  1.23s/it, loss=0.181, v_num=0, train/loss_simple_step=0.447, train/loss_vlb_step=0.00208, train/loss_step=0.447, global_step=3459.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 38/5971 [00:47<2:01:08,  1.23s/it, loss=0.18, v_num=0, train/loss_simple_step=0.00715, train/loss_vlb_step=3.36e-5, train/loss_step=0.00715, global_step=3459.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 39/5971 [00:48<2:00:16,  1.22s/it, loss=0.18, v_num=0, train/loss_simple_step=0.00715, train/loss_vlb_step=3.36e-5, train/loss_step=0.00715, global_step=3459.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 39/5971 [00:48<2:00:16,  1.22s/it, loss=0.178, v_num=0, train/loss_simple_step=0.0382, train/loss_vlb_step=0.000146, train/loss_step=0.0382, global_step=3459.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 40/5971 [00:51<2:03:29,  1.25s/it, loss=0.178, v_num=0, train/loss_simple_step=0.0382, train/loss_vlb_step=0.000146, train/loss_step=0.0382, global_step=3459.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 40/5971 [00:51<2:03:29,  1.25s/it, loss=0.179, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.00095, train/loss_step=0.268, global_step=3459.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:   1%|          | 41/5971 [00:52<2:04:25,  1.26s/it, loss=0.179, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.00095, train/loss_step=0.268, global_step=3459.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 41/5971 [00:52<2:04:25,  1.26s/it, loss=0.176, v_num=0, train/loss_simple_step=0.0947, train/loss_vlb_step=0.000313, train/loss_step=0.0947, global_step=3460.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 42/5971 [00:54<2:05:10,  1.27s/it, loss=0.176, v_num=0, train/loss_simple_step=0.0947, train/loss_vlb_step=0.000313, train/loss_step=0.0947, global_step=3460.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 42/5971 [00:54<2:05:10,  1.27s/it, loss=0.177, v_num=0, train/loss_simple_step=0.415, train/loss_vlb_step=0.00288, train/loss_step=0.415, global_step=3460.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:   1%|          | 43/5971 [00:55<2:05:42,  1.27s/it, loss=0.177, v_num=0, train/loss_simple_step=0.415, train/loss_vlb_step=0.00288, train/loss_step=0.415, global_step=3460.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 43/5971 [00:55<2:05:42,  1.27s/it, loss=0.199, v_num=0, train/loss_simple_step=0.529, train/loss_vlb_step=0.00487, train/loss_step=0.529, global_step=3460.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 44/5971 [00:59<2:11:01,  1.33s/it, loss=0.199, v_num=0, train/loss_simple_step=0.529, train/loss_vlb_step=0.00487, train/loss_step=0.529, global_step=3460.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 44/5971 [00:59<2:11:03,  1.33s/it, loss=0.199, v_num=0, train/loss_simple_step=0.0741, train/loss_vlb_step=0.00025, train/loss_step=0.0741, global_step=3460.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 45/5971 [01:01<2:12:36,  1.34s/it, loss=0.199, v_num=0, train/loss_simple_step=0.0741, train/loss_vlb_step=0.00025, train/loss_step=0.0741, global_step=3460.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 45/5971 [01:01<2:12:36,  1.34s/it, loss=0.191, v_num=0, train/loss_simple_step=0.0469, train/loss_vlb_step=0.000161, train/loss_step=0.0469, global_step=3461.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 46/5971 [01:03<2:13:08,  1.35s/it, loss=0.191, v_num=0, train/loss_simple_step=0.0469, train/loss_vlb_step=0.000161, train/loss_step=0.0469, global_step=3461.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 46/5971 [01:03<2:13:08,  1.35s/it, loss=0.189, v_num=0, train/loss_simple_step=0.0354, train/loss_vlb_step=0.000132, train/loss_step=0.0354, global_step=3461.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 47/5971 [01:04<2:13:31,  1.35s/it, loss=0.189, v_num=0, train/loss_simple_step=0.0354, train/loss_vlb_step=0.000132, train/loss_step=0.0354, global_step=3461.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 47/5971 [01:04<2:13:31,  1.35s/it, loss=0.187, v_num=0, train/loss_simple_step=0.00136, train/loss_vlb_step=8.11e-6, train/loss_step=0.00136, global_step=3461.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 48/5971 [01:08<2:17:03,  1.39s/it, loss=0.187, v_num=0, train/loss_simple_step=0.00136, train/loss_vlb_step=8.11e-6, train/loss_step=0.00136, global_step=3461.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 48/5971 [01:08<2:17:03,  1.39s/it, loss=0.187, v_num=0, train/loss_simple_step=0.578, train/loss_vlb_step=0.0056, train/loss_step=0.578, global_step=3461.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:   1%|          | 49/5971 [01:09<2:17:13,  1.39s/it, loss=0.187, v_num=0, train/loss_simple_step=0.578, train/loss_vlb_step=0.0056, train/loss_step=0.578, global_step=3461.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 49/5971 [01:09<2:17:13,  1.39s/it, loss=0.19, v_num=0, train/loss_simple_step=0.0671, train/loss_vlb_step=0.000228, train/loss_step=0.0671, global_step=3462.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 50/5971 [01:10<2:16:52,  1.39s/it, loss=0.19, v_num=0, train/loss_simple_step=0.0671, train/loss_vlb_step=0.000228, train/loss_step=0.0671, global_step=3462.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 50/5971 [01:10<2:16:52,  1.39s/it, loss=0.191, v_num=0, train/loss_simple_step=0.0458, train/loss_vlb_step=0.00016, train/loss_step=0.0458, global_step=3462.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 51/5971 [01:12<2:16:58,  1.39s/it, loss=0.191, v_num=0, train/loss_simple_step=0.0458, train/loss_vlb_step=0.00016, train/loss_step=0.0458, global_step=3462.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 51/5971 [01:12<2:16:58,  1.39s/it, loss=0.192, v_num=0, train/loss_simple_step=0.039, train/loss_vlb_step=0.000146, train/loss_step=0.039, global_step=3462.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   1%|          | 52/5971 [01:15<2:21:07,  1.43s/it, loss=0.192, v_num=0, train/loss_simple_step=0.039, train/loss_vlb_step=0.000146, train/loss_step=0.039, global_step=3462.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 52/5971 [01:15<2:21:07,  1.43s/it, loss=0.162, v_num=0, train/loss_simple_step=0.00428, train/loss_vlb_step=2.25e-5, train/loss_step=0.00428, global_step=3462.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 53/5971 [01:17<2:21:20,  1.43s/it, loss=0.162, v_num=0, train/loss_simple_step=0.00428, train/loss_vlb_step=2.25e-5, train/loss_step=0.00428, global_step=3462.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 53/5971 [01:17<2:21:20,  1.43s/it, loss=0.163, v_num=0, train/loss_simple_step=0.0869, train/loss_vlb_step=0.000286, train/loss_step=0.0869, global_step=3463.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   1%|          | 54/5971 [01:18<2:21:23,  1.43s/it, loss=0.163, v_num=0, train/loss_simple_step=0.0869, train/loss_vlb_step=0.000286, train/loss_step=0.0869, global_step=3463.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 54/5971 [01:18<2:21:26,  1.43s/it, loss=0.163, v_num=0, train/loss_simple_step=0.020, train/loss_vlb_step=8.07e-5, train/loss_step=0.020, global_step=3463.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:   1%|          | 55/5971 [01:20<2:21:36,  1.44s/it, loss=0.163, v_num=0, train/loss_simple_step=0.020, train/loss_vlb_step=8.07e-5, train/loss_step=0.020, global_step=3463.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 55/5971 [01:20<2:21:37,  1.44s/it, loss=0.142, v_num=0, train/loss_simple_step=0.0337, train/loss_vlb_step=0.000119, train/loss_step=0.0337, global_step=3463.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 56/5971 [01:23<2:24:52,  1.47s/it, loss=0.142, v_num=0, train/loss_simple_step=0.0337, train/loss_vlb_step=0.000119, train/loss_step=0.0337, global_step=3463.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 56/5971 [01:23<2:24:53,  1.47s/it, loss=0.144, v_num=0, train/loss_simple_step=0.0442, train/loss_vlb_step=0.000154, train/loss_step=0.0442, global_step=3463.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 57/5971 [01:25<2:25:08,  1.47s/it, loss=0.144, v_num=0, train/loss_simple_step=0.0442, train/loss_vlb_step=0.000154, train/loss_step=0.0442, global_step=3463.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 57/5971 [01:25<2:25:10,  1.47s/it, loss=0.122, v_num=0, train/loss_simple_step=0.00429, train/loss_vlb_step=2.23e-5, train/loss_step=0.00429, global_step=3464.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 58/5971 [01:26<2:24:59,  1.47s/it, loss=0.122, v_num=0, train/loss_simple_step=0.00429, train/loss_vlb_step=2.23e-5, train/loss_step=0.00429, global_step=3464.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 58/5971 [01:26<2:24:59,  1.47s/it, loss=0.122, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.04e-5, train/loss_step=0.0119, global_step=3464.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:   1%|          | 59/5971 [01:28<2:25:18,  1.47s/it, loss=0.122, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.04e-5, train/loss_step=0.0119, global_step=3464.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 59/5971 [01:28<2:25:18,  1.47s/it, loss=0.148, v_num=0, train/loss_simple_step=0.558, train/loss_vlb_step=0.00564, train/loss_step=0.558, global_step=3464.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:   1%|          | 60/5971 [01:32<2:29:34,  1.52s/it, loss=0.148, v_num=0, train/loss_simple_step=0.558, train/loss_vlb_step=0.00564, train/loss_step=0.558, global_step=3464.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 60/5971 [01:32<2:29:34,  1.52s/it, loss=0.145, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000709, train/loss_step=0.204, global_step=3464.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 61/5971 [01:34<2:29:22,  1.52s/it, loss=0.145, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000709, train/loss_step=0.204, global_step=3464.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 61/5971 [01:34<2:29:22,  1.52s/it, loss=0.15, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.00072, train/loss_step=0.197, global_step=3465.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:   1%|          | 62/5971 [01:35<2:28:51,  1.51s/it, loss=0.15, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.00072, train/loss_step=0.197, global_step=3465.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 62/5971 [01:35<2:28:51,  1.51s/it, loss=0.136, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000435, train/loss_step=0.131, global_step=3465.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 63/5971 [01:36<2:28:30,  1.51s/it, loss=0.136, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000435, train/loss_step=0.131, global_step=3465.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 63/5971 [01:36<2:28:30,  1.51s/it, loss=0.119, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000749, train/loss_step=0.202, global_step=3465.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 64/5971 [01:39<2:31:23,  1.54s/it, loss=0.119, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000749, train/loss_step=0.202, global_step=3465.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 64/5971 [01:39<2:31:23,  1.54s/it, loss=0.117, v_num=0, train/loss_simple_step=0.0276, train/loss_vlb_step=9.68e-5, train/loss_step=0.0276, global_step=3465.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 65/5971 [01:41<2:31:01,  1.53s/it, loss=0.117, v_num=0, train/loss_simple_step=0.0276, train/loss_vlb_step=9.68e-5, train/loss_step=0.0276, global_step=3465.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 65/5971 [01:41<2:31:01,  1.53s/it, loss=0.115, v_num=0, train/loss_simple_step=0.00633, train/loss_vlb_step=3.15e-5, train/loss_step=0.00633, global_step=3466.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 66/5971 [01:42<2:30:54,  1.53s/it, loss=0.115, v_num=0, train/loss_simple_step=0.00633, train/loss_vlb_step=3.15e-5, train/loss_step=0.00633, global_step=3466.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 66/5971 [01:42<2:30:54,  1.53s/it, loss=0.116, v_num=0, train/loss_simple_step=0.059, train/loss_vlb_step=0.000202, train/loss_step=0.059, global_step=3466.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:   1%|          | 67/5971 [01:44<2:30:48,  1.53s/it, loss=0.116, v_num=0, train/loss_simple_step=0.059, train/loss_vlb_step=0.000202, train/loss_step=0.059, global_step=3466.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 67/5971 [01:44<2:30:51,  1.53s/it, loss=0.143, v_num=0, train/loss_simple_step=0.545, train/loss_vlb_step=0.00681, train/loss_step=0.545, global_step=3466.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   1%|          | 68/5971 [01:47<2:32:59,  1.56s/it, loss=0.143, v_num=0, train/loss_simple_step=0.545, train/loss_vlb_step=0.00681, train/loss_step=0.545, global_step=3466.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 68/5971 [01:47<2:32:59,  1.56s/it, loss=0.149, v_num=0, train/loss_simple_step=0.701, train/loss_vlb_step=0.0262, train/loss_step=0.701, global_step=3466.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   1%|          | 69/5971 [01:48<2:32:44,  1.55s/it, loss=0.149, v_num=0, train/loss_simple_step=0.701, train/loss_vlb_step=0.0262, train/loss_step=0.701, global_step=3466.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 69/5971 [01:48<2:32:44,  1.55s/it, loss=0.149, v_num=0, train/loss_simple_step=0.0587, train/loss_vlb_step=0.000196, train/loss_step=0.0587, global_step=3467.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 70/5971 [01:49<2:32:10,  1.55s/it, loss=0.149, v_num=0, train/loss_simple_step=0.0587, train/loss_vlb_step=0.000196, train/loss_step=0.0587, global_step=3467.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 70/5971 [01:49<2:32:10,  1.55s/it, loss=0.147, v_num=0, train/loss_simple_step=0.00404, train/loss_vlb_step=2.06e-5, train/loss_step=0.00404, global_step=3467.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 71/5971 [01:51<2:32:31,  1.55s/it, loss=0.147, v_num=0, train/loss_simple_step=0.00404, train/loss_vlb_step=2.06e-5, train/loss_step=0.00404, global_step=3467.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 71/5971 [01:51<2:32:31,  1.55s/it, loss=0.145, v_num=0, train/loss_simple_step=0.00243, train/loss_vlb_step=1.31e-5, train/loss_step=0.00243, global_step=3467.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 72/5971 [01:54<2:34:26,  1.57s/it, loss=0.145, v_num=0, train/loss_simple_step=0.00243, train/loss_vlb_step=1.31e-5, train/loss_step=0.00243, global_step=3467.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 72/5971 [01:54<2:34:26,  1.57s/it, loss=0.153, v_num=0, train/loss_simple_step=0.159, train/loss_vlb_step=0.000555, train/loss_step=0.159, global_step=3467.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:   1%|          | 73/5971 [01:56<2:34:33,  1.57s/it, loss=0.153, v_num=0, train/loss_simple_step=0.159, train/loss_vlb_step=0.000555, train/loss_step=0.159, global_step=3467.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 73/5971 [01:56<2:34:33,  1.57s/it, loss=0.15, v_num=0, train/loss_simple_step=0.0301, train/loss_vlb_step=0.000115, train/loss_step=0.0301, global_step=3468.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 74/5971 [01:57<2:34:18,  1.57s/it, loss=0.15, v_num=0, train/loss_simple_step=0.0301, train/loss_vlb_step=0.000115, train/loss_step=0.0301, global_step=3468.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|          | 74/5971 [01:57<2:34:18,  1.57s/it, loss=0.149, v_num=0, train/loss_simple_step=0.0054, train/loss_vlb_step=2.78e-5, train/loss_step=0.0054, global_step=3468.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|▏         | 75/5971 [01:58<2:33:47,  1.57s/it, loss=0.149, v_num=0, train/loss_simple_step=0.0054, train/loss_vlb_step=2.78e-5, train/loss_step=0.0054, global_step=3468.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|▏         | 75/5971 [01:58<2:33:48,  1.57s/it, loss=0.162, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00138, train/loss_step=0.287, global_step=3468.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:   1%|▏         | 76/5971 [02:01<2:35:13,  1.58s/it, loss=0.162, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00138, train/loss_step=0.287, global_step=3468.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|▏         | 76/5971 [02:01<2:35:13,  1.58s/it, loss=0.162, v_num=0, train/loss_simple_step=0.0393, train/loss_vlb_step=0.000146, train/loss_step=0.0393, global_step=3468.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|▏         | 77/5971 [02:03<2:34:55,  1.58s/it, loss=0.162, v_num=0, train/loss_simple_step=0.0393, train/loss_vlb_step=0.000146, train/loss_step=0.0393, global_step=3468.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|▏         | 77/5971 [02:03<2:34:55,  1.58s/it, loss=0.163, v_num=0, train/loss_simple_step=0.0261, train/loss_vlb_step=0.000103, train/loss_step=0.0261, global_step=3469.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|▏         | 78/5971 [02:04<2:34:32,  1.57s/it, loss=0.163, v_num=0, train/loss_simple_step=0.0261, train/loss_vlb_step=0.000103, train/loss_step=0.0261, global_step=3469.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|▏         | 78/5971 [02:04<2:34:32,  1.57s/it, loss=0.164, v_num=0, train/loss_simple_step=0.0299, train/loss_vlb_step=0.00011, train/loss_step=0.0299, global_step=3469.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   1%|▏         | 79/5971 [02:05<2:34:14,  1.57s/it, loss=0.164, v_num=0, train/loss_simple_step=0.0299, train/loss_vlb_step=0.00011, train/loss_step=0.0299, global_step=3469.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|▏         | 79/5971 [02:05<2:34:14,  1.57s/it, loss=0.16, v_num=0, train/loss_simple_step=0.494, train/loss_vlb_step=0.00418, train/loss_step=0.494, global_step=3469.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:   1%|▏         | 80/5971 [02:08<2:36:20,  1.59s/it, loss=0.16, v_num=0, train/loss_simple_step=0.494, train/loss_vlb_step=0.00418, train/loss_step=0.494, global_step=3469.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|▏         | 80/5971 [02:08<2:36:21,  1.59s/it, loss=0.152, v_num=0, train/loss_simple_step=0.0275, train/loss_vlb_step=0.000107, train/loss_step=0.0275, global_step=3469.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|▏         | 81/5971 [02:10<2:35:54,  1.59s/it, loss=0.152, v_num=0, train/loss_simple_step=0.0275, train/loss_vlb_step=0.000107, train/loss_step=0.0275, global_step=3469.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|▏         | 81/5971 [02:10<2:35:54,  1.59s/it, loss=0.143, v_num=0, train/loss_simple_step=0.0311, train/loss_vlb_step=0.000117, train/loss_step=0.0311, global_step=3470.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|▏         | 82/5971 [02:11<2:35:22,  1.58s/it, loss=0.143, v_num=0, train/loss_simple_step=0.0311, train/loss_vlb_step=0.000117, train/loss_step=0.0311, global_step=3470.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|▏         | 82/5971 [02:11<2:35:22,  1.58s/it, loss=0.137, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=4.93e-5, train/loss_step=0.0114, global_step=3470.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   1%|▏         | 83/5971 [02:12<2:35:01,  1.58s/it, loss=0.137, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=4.93e-5, train/loss_step=0.0114, global_step=3470.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|▏         | 83/5971 [02:12<2:35:01,  1.58s/it, loss=0.137, v_num=0, train/loss_simple_step=0.195, train/loss_vlb_step=0.000692, train/loss_step=0.195, global_step=3470.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   1%|▏         | 84/5971 [02:15<2:36:44,  1.60s/it, loss=0.137, v_num=0, train/loss_simple_step=0.195, train/loss_vlb_step=0.000692, train/loss_step=0.195, global_step=3470.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|▏         | 84/5971 [02:15<2:36:45,  1.60s/it, loss=0.161, v_num=0, train/loss_simple_step=0.514, train/loss_vlb_step=0.0044, train/loss_step=0.514, global_step=3470.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:   1%|▏         | 85/5971 [02:17<2:36:46,  1.60s/it, loss=0.161, v_num=0, train/loss_simple_step=0.514, train/loss_vlb_step=0.0044, train/loss_step=0.514, global_step=3470.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|▏         | 85/5971 [02:17<2:36:46,  1.60s/it, loss=0.166, v_num=0, train/loss_simple_step=0.0995, train/loss_vlb_step=0.000327, train/loss_step=0.0995, global_step=3471.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|▏         | 86/5971 [02:18<2:36:37,  1.60s/it, loss=0.166, v_num=0, train/loss_simple_step=0.0995, train/loss_vlb_step=0.000327, train/loss_step=0.0995, global_step=3471.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|▏         | 86/5971 [02:18<2:36:37,  1.60s/it, loss=0.163, v_num=0, train/loss_simple_step=0.00761, train/loss_vlb_step=3.65e-5, train/loss_step=0.00761, global_step=3471.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|▏         | 87/5971 [02:20<2:36:14,  1.59s/it, loss=0.163, v_num=0, train/loss_simple_step=0.00761, train/loss_vlb_step=3.65e-5, train/loss_step=0.00761, global_step=3471.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|▏         | 87/5971 [02:20<2:36:14,  1.59s/it, loss=0.137, v_num=0, train/loss_simple_step=0.00936, train/loss_vlb_step=4.35e-5, train/loss_step=0.00936, global_step=3471.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|▏         | 88/5971 [02:23<2:38:06,  1.61s/it, loss=0.137, v_num=0, train/loss_simple_step=0.00936, train/loss_vlb_step=4.35e-5, train/loss_step=0.00936, global_step=3471.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|▏         | 88/5971 [02:23<2:38:06,  1.61s/it, loss=0.115, v_num=0, train/loss_simple_step=0.272, train/loss_vlb_step=0.000937, train/loss_step=0.272, global_step=3471.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:   1%|▏         | 89/5971 [02:24<2:37:37,  1.61s/it, loss=0.115, v_num=0, train/loss_simple_step=0.272, train/loss_vlb_step=0.000937, train/loss_step=0.272, global_step=3471.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   1%|▏         | 89/5971 [02:24<2:37:37,  1.61s/it, loss=0.113, v_num=0, train/loss_simple_step=0.00641, train/loss_vlb_step=2.84e-5, train/loss_step=0.00641, global_step=3472.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   2%|▏         | 90/5971 [02:26<2:37:16,  1.60s/it, loss=0.113, v_num=0, train/loss_simple_step=0.00641, train/loss_vlb_step=2.84e-5, train/loss_step=0.00641, global_step=3472.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   2%|▏         | 90/5971 [02:26<2:37:16,  1.60s/it, loss=0.122, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000719, train/loss_step=0.194, global_step=3472.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:   2%|▏         | 91/5971 [02:27<2:36:57,  1.60s/it, loss=0.122, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000719, train/loss_step=0.194, global_step=3472.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   2%|▏         | 91/5971 [02:27<2:36:57,  1.60s/it, loss=0.13, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000657, train/loss_step=0.170, global_step=3472.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   2%|▏         | 92/5971 [02:30<2:38:35,  1.62s/it, loss=0.13, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000657, train/loss_step=0.170, global_step=3472.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   2%|▏         | 92/5971 [02:30<2:38:35,  1.62s/it, loss=0.136, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.000961, train/loss_step=0.262, global_step=3472.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   2%|▏         | 93/5971 [02:31<2:38:14,  1.62s/it, loss=0.136, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.000961, train/loss_step=0.262, global_step=3472.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   2%|▏         | 93/5971 [02:31<2:38:14,  1.62s/it, loss=0.134, v_num=0, train/loss_simple_step=0.00166, train/loss_vlb_step=9.97e-6, train/loss_step=0.00166, global_step=3473.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   2%|▏         | 94/5971 [02:33<2:38:12,  1.62s/it, loss=0.134, v_num=0, train/loss_simple_step=0.00166, train/loss_vlb_step=9.97e-6, train/loss_step=0.00166, global_step=3473.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   2%|▏         | 94/5971 [02:33<2:38:12,  1.62s/it, loss=0.134, v_num=0, train/loss_simple_step=0.00296, train/loss_vlb_step=1.61e-5, train/loss_step=0.00296, global_step=3473.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   2%|▏         | 95/5971 [02:34<2:38:01,  1.61s/it, loss=0.134, v_num=0, train/loss_simple_step=0.00296, train/loss_vlb_step=1.61e-5, train/loss_step=0.00296, global_step=3473.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   2%|▏         | 95/5971 [02:34<2:38:02,  1.61s/it, loss=0.123, v_num=0, train/loss_simple_step=0.0664, train/loss_vlb_step=0.000222, train/loss_step=0.0664, global_step=3473.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   2%|▏         | 96/5971 [02:37<2:39:24,  1.63s/it, loss=0.123, v_num=0, train/loss_simple_step=0.0664, train/loss_vlb_step=0.000222, train/loss_step=0.0664, global_step=3473.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   2%|▏         | 96/5971 [02:37<2:39:27,  1.63s/it, loss=0.122, v_num=0, train/loss_simple_step=0.0103, train/loss_vlb_step=4.45e-5, train/loss_step=0.0103, global_step=3473.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   2%|▏         | 97/5971 [02:39<2:39:20,  1.63s/it, loss=0.122, v_num=0, train/loss_simple_step=0.0103, train/loss_vlb_step=4.45e-5, train/loss_step=0.0103, global_step=3473.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   2%|▏         | 97/5971 [02:39<2:39:20,  1.63s/it, loss=0.127, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000456, train/loss_step=0.139, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   2%|▏         | 98/5971 [02:40<2:39:06,  1.63s/it, loss=0.127, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000456, train/loss_step=0.139, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   2%|▏         | 98/5971 [02:40<2:39:07,  1.63s/it, loss=0.129, v_num=0, train/loss_simple_step=0.0703, train/loss_vlb_step=0.00024, train/loss_step=0.0703, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   2%|▏         | 99/5971 [02:42<2:38:44,  1.62s/it, loss=0.129, v_num=0, train/loss_simple_step=0.0703, train/loss_vlb_step=0.00024, train/loss_step=0.0703, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   2%|▏         | 99/5971 [02:42<2:38:45,  1.62s/it, loss=0.112, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000462, train/loss_step=0.140, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   2%|▏         | 100/5971 [02:45<2:40:13,  1.64s/it, loss=0.112, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000462, train/loss_step=0.140, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   2%|▏         | 100/5971 [02:45<2:40:13,  1.64s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<02:38,  1.05it/s][A
Epoch 6:   2%|▏         | 102/5971 [02:46<2:38:02,  1.62s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   1%|          | 2/167 [00:01<01:21,  2.02it/s][A
Epoch 6:   2%|▏         | 104/5971 [02:46<2:35:12,  1.59s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   2%|▏         | 4/167 [00:01<00:36,  4.52it/s][A

Validating:   3%|▎         | 5/167 [00:01<00:31,  5.22it/s][A
Epoch 6:   2%|▏         | 106/5971 [02:46<2:32:24,  1.56s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   4%|▍         | 7/167 [00:01<00:21,  7.39it/s][A
Epoch 6:   2%|▏         | 108/5971 [02:46<2:29:41,  1.53s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   5%|▌         | 9/167 [00:01<00:16,  9.44it/s][A
Epoch 6:   2%|▏         | 110/5971 [02:47<2:27:03,  1.51s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   7%|▋         | 11/167 [00:01<00:17,  8.82it/s][A
Epoch 6:   2%|▏         | 112/5971 [02:47<2:24:37,  1.48s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   8%|▊         | 13/167 [00:02<00:16,  9.53it/s][A
Epoch 6:   2%|▏         | 114/5971 [02:47<2:22:12,  1.46s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   9%|▉         | 15/167 [00:02<00:13, 10.93it/s][A
Epoch 6:   2%|▏         | 116/5971 [02:47<2:19:50,  1.43s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  10%|█         | 17/167 [00:02<00:12, 11.93it/s][A
Epoch 6:   2%|▏         | 118/5971 [02:47<2:17:33,  1.41s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  11%|█▏        | 19/167 [00:02<00:11, 12.70it/s][A
Epoch 6:   2%|▏         | 120/5971 [02:47<2:15:20,  1.39s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  13%|█▎        | 21/167 [00:02<00:11, 13.10it/s][A
Epoch 6:   2%|▏         | 122/5971 [02:48<2:13:12,  1.37s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  14%|█▍        | 23/167 [00:02<00:10, 13.99it/s][A
Epoch 6:   2%|▏         | 124/5971 [02:48<2:11:07,  1.35s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  15%|█▍        | 25/167 [00:02<00:10, 13.92it/s][A
Epoch 6:   2%|▏         | 126/5971 [02:48<2:09:07,  1.33s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  16%|█▌        | 27/167 [00:03<00:09, 14.27it/s][A
Epoch 6:   2%|▏         | 128/5971 [02:48<2:07:10,  1.31s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  17%|█▋        | 29/167 [00:03<00:09, 14.29it/s][A
Epoch 6:   2%|▏         | 130/5971 [02:48<2:05:18,  1.29s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  19%|█▊        | 31/167 [00:03<00:09, 13.97it/s][A
Epoch 6:   2%|▏         | 132/5971 [02:48<2:03:29,  1.27s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  20%|█▉        | 33/167 [00:03<00:09, 14.33it/s][A
Epoch 6:   2%|▏         | 134/5971 [02:48<2:01:42,  1.25s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  21%|██        | 35/167 [00:03<00:08, 14.93it/s][A
Epoch 6:   2%|▏         | 136/5971 [02:49<1:59:58,  1.23s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  22%|██▏       | 37/167 [00:03<00:08, 15.55it/s][A
Epoch 6:   2%|▏         | 138/5971 [02:49<1:58:17,  1.22s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  23%|██▎       | 39/167 [00:03<00:08, 15.38it/s][A
Epoch 6:   2%|▏         | 140/5971 [02:49<1:56:39,  1.20s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  25%|██▍       | 41/167 [00:03<00:07, 15.97it/s][A
Epoch 6:   2%|▏         | 142/5971 [02:49<1:55:04,  1.18s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  26%|██▋       | 44/167 [00:04<00:08, 14.96it/s][A
Epoch 6:   2%|▏         | 145/5971 [02:49<1:52:47,  1.16s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  28%|██▊       | 46/167 [00:04<00:08, 13.64it/s][A
Epoch 6:   2%|▏         | 148/5971 [02:49<1:50:37,  1.14s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  29%|██▊       | 48/167 [00:04<00:08, 14.38it/s][A

Validating:  30%|██▉       | 50/167 [00:04<00:08, 14.12it/s][A
Epoch 6:   3%|▎         | 151/5971 [02:50<1:48:31,  1.12s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  31%|███       | 52/167 [00:04<00:08, 13.34it/s][A
Epoch 6:   3%|▎         | 154/5971 [02:50<1:46:29,  1.10s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  32%|███▏      | 54/167 [00:04<00:07, 14.53it/s][A

Validating:  34%|███▎      | 56/167 [00:04<00:07, 15.75it/s][A
Epoch 6:   3%|▎         | 157/5971 [02:50<1:44:31,  1.08s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  35%|███▍      | 58/167 [00:05<00:08, 12.14it/s][A
Epoch 6:   3%|▎         | 160/5971 [02:50<1:42:42,  1.06s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  36%|███▌      | 60/167 [00:05<00:08, 13.06it/s][A

Validating:  37%|███▋      | 62/167 [00:05<00:07, 14.25it/s][A
Epoch 6:   3%|▎         | 163/5971 [02:50<1:40:52,  1.04s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  38%|███▊      | 64/167 [00:05<00:07, 14.12it/s][A
Epoch 6:   3%|▎         | 166/5971 [02:51<1:39:08,  1.02s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  40%|███▉      | 66/167 [00:05<00:07, 13.84it/s][A

Validating:  41%|████      | 68/167 [00:05<00:07, 13.62it/s][A
Epoch 6:   3%|▎         | 169/5971 [02:51<1:37:28,  1.01s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  42%|████▏     | 70/167 [00:06<00:06, 14.68it/s][A
Epoch 6:   3%|▎         | 172/5971 [02:51<1:35:50,  1.01it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  43%|████▎     | 72/167 [00:06<00:06, 14.49it/s][A

Validating:  44%|████▍     | 74/167 [00:06<00:05, 15.75it/s][A
Epoch 6:   3%|▎         | 175/5971 [02:51<1:34:15,  1.02it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  46%|████▌     | 76/167 [00:06<00:05, 16.06it/s][A
Epoch 6:   3%|▎         | 178/5971 [02:51<1:32:45,  1.04it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  47%|████▋     | 78/167 [00:06<00:06, 14.32it/s][A

Validating:  48%|████▊     | 80/167 [00:06<00:06, 13.42it/s][A
Epoch 6:   3%|▎         | 181/5971 [02:52<1:31:17,  1.06it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  49%|████▉     | 82/167 [00:07<00:08, 10.46it/s][A
Epoch 6:   3%|▎         | 184/5971 [02:52<1:29:57,  1.07it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  50%|█████     | 84/167 [00:07<00:06, 12.01it/s][A

Validating:  51%|█████▏    | 86/167 [00:07<00:06, 12.74it/s][A
Epoch 6:   3%|▎         | 187/5971 [02:52<1:28:33,  1.09it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  53%|█████▎    | 88/167 [00:07<00:05, 13.56it/s][A
Epoch 6:   3%|▎         | 190/5971 [02:52<1:27:13,  1.10it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  54%|█████▍    | 90/167 [00:07<00:05, 13.65it/s][A

Validating:  55%|█████▌    | 92/167 [00:07<00:05, 13.33it/s][A
Epoch 6:   3%|▎         | 193/5971 [02:53<1:25:56,  1.12it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  56%|█████▋    | 94/167 [00:07<00:06, 11.00it/s][A
Epoch 6:   3%|▎         | 196/5971 [02:53<1:24:47,  1.14it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  57%|█████▋    | 96/167 [00:08<00:06, 10.85it/s][A

Validating:  59%|█████▊    | 98/167 [00:08<00:05, 11.86it/s][A
Epoch 6:   3%|▎         | 199/5971 [02:53<1:23:33,  1.15it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  60%|██████    | 101/167 [00:08<00:04, 14.32it/s][A
Epoch 6:   3%|▎         | 202/5971 [02:53<1:22:21,  1.17it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  62%|██████▏   | 103/167 [00:08<00:04, 15.50it/s][A
Epoch 6:   3%|▎         | 205/5971 [02:54<1:21:10,  1.18it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  63%|██████▎   | 105/167 [00:08<00:04, 15.29it/s][A

Validating:  64%|██████▍   | 107/167 [00:08<00:05, 10.86it/s][A
Epoch 6:   3%|▎         | 208/5971 [02:54<1:20:09,  1.20it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  65%|██████▌   | 109/167 [00:09<00:05, 10.22it/s][A
Epoch 6:   4%|▎         | 211/5971 [02:54<1:19:10,  1.21it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  66%|██████▋   | 111/167 [00:09<00:06,  8.48it/s][A

Validating:  68%|██████▊   | 113/167 [00:09<00:06,  8.48it/s][A
Epoch 6:   4%|▎         | 214/5971 [02:55<1:18:11,  1.23it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  69%|██████▉   | 115/167 [00:09<00:05,  9.01it/s][A
Epoch 6:   4%|▎         | 217/5971 [02:55<1:17:11,  1.24it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  71%|███████   | 118/167 [00:10<00:04, 11.42it/s][A
Epoch 6:   4%|▎         | 220/5971 [02:55<1:16:10,  1.26it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  72%|███████▏  | 120/167 [00:10<00:03, 12.01it/s][A

Validating:  73%|███████▎  | 122/167 [00:10<00:03, 11.98it/s][A
Epoch 6:   4%|▎         | 223/5971 [02:55<1:15:12,  1.27it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  74%|███████▍  | 124/167 [00:10<00:03, 11.96it/s][A
Epoch 6:   4%|▍         | 226/5971 [02:56<1:14:16,  1.29it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  75%|███████▌  | 126/167 [00:10<00:03, 12.97it/s][A

Validating:  77%|███████▋  | 128/167 [00:10<00:03, 12.93it/s][A
Epoch 6:   4%|▍         | 229/5971 [02:56<1:13:21,  1.30it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  78%|███████▊  | 130/167 [00:11<00:02, 12.56it/s][A
Epoch 6:   4%|▍         | 232/5971 [02:56<1:12:28,  1.32it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  80%|███████▉  | 133/167 [00:11<00:02, 14.22it/s][A
Epoch 6:   4%|▍         | 235/5971 [02:56<1:11:34,  1.34it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  81%|████████  | 135/167 [00:11<00:02, 14.82it/s][A

Validating:  82%|████████▏ | 137/167 [00:11<00:02, 14.60it/s][A
Epoch 6:   4%|▍         | 238/5971 [02:56<1:10:43,  1.35it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  83%|████████▎ | 139/167 [00:11<00:01, 15.57it/s][A
Epoch 6:   4%|▍         | 241/5971 [02:57<1:09:53,  1.37it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  84%|████████▍ | 141/167 [00:11<00:01, 14.77it/s][A

Validating:  86%|████████▌ | 143/167 [00:11<00:01, 15.15it/s][A
Epoch 6:   4%|▍         | 244/5971 [02:57<1:09:04,  1.38it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  87%|████████▋ | 145/167 [00:11<00:01, 14.58it/s][A
Epoch 6:   4%|▍         | 247/5971 [02:57<1:08:16,  1.40it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  88%|████████▊ | 147/167 [00:12<00:01, 15.43it/s][A

Validating:  89%|████████▉ | 149/167 [00:12<00:01, 15.42it/s][A
Epoch 6:   4%|▍         | 250/5971 [02:57<1:07:29,  1.41it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  90%|█████████ | 151/167 [00:12<00:01, 14.88it/s][A
Epoch 6:   4%|▍         | 253/5971 [02:57<1:06:44,  1.43it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  92%|█████████▏| 154/167 [00:12<00:00, 17.02it/s][A
Epoch 6:   4%|▍         | 256/5971 [02:58<1:05:58,  1.44it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  94%|█████████▍| 157/167 [00:12<00:00, 18.47it/s][A
Epoch 6:   4%|▍         | 259/5971 [02:58<1:05:14,  1.46it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  95%|█████████▌| 159/167 [00:12<00:00, 18.51it/s][A
Epoch 6:   4%|▍         | 262/5971 [02:58<1:04:30,  1.48it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  97%|█████████▋| 162/167 [00:12<00:00, 19.84it/s][A
Epoch 6:   4%|▍         | 265/5971 [02:58<1:03:47,  1.49it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  99%|█████████▉| 165/167 [00:13<00:00, 20.42it/s][A
Epoch 6:   4%|▍         | 268/5971 [02:58<1:03:06,  1.51it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   4%|▍         | 268/5971 [02:58<1:03:11,  1.50it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.11e-5, train/loss_step=0.0147, global_step=3474.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Epoch 6:   5%|▍         | 269/5971 [03:00<1:03:30,  1.50it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0495, train/loss_vlb_step=0.000169, train/loss_step=0.0495, global_step=3475.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▍         | 270/5971 [03:01<1:03:45,  1.49it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0483, train/loss_vlb_step=0.000172, train/loss_step=0.0483, global_step=3475.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▍         | 271/5971 [03:03<1:04:05,  1.48it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0483, train/loss_vlb_step=0.000172, train/loss_step=0.0483, global_step=3475.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▍         | 271/5971 [03:03<1:04:06,  1.48it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00706, train/loss_vlb_step=3.11e-5, train/loss_step=0.00706, global_step=3475.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▍         | 272/5971 [03:06<1:04:49,  1.47it/s, loss=0.0791, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=5.35e-5, train/loss_step=0.0118, global_step=3475.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   5%|▍         | 273/5971 [03:07<1:05:00,  1.46it/s, loss=0.0751, v_num=0, train/loss_simple_step=0.0185, train/loss_vlb_step=7.91e-5, train/loss_step=0.0185, global_step=3476.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▍         | 274/5971 [03:08<1:05:13,  1.46it/s, loss=0.0751, v_num=0, train/loss_simple_step=0.0185, train/loss_vlb_step=7.91e-5, train/loss_step=0.0185, global_step=3476.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▍         | 274/5971 [03:08<1:05:13,  1.46it/s, loss=0.0825, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000523, train/loss_step=0.155, global_step=3476.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   5%|▍         | 275/5971 [03:10<1:05:24,  1.45it/s, loss=0.0823, v_num=0, train/loss_simple_step=0.00711, train/loss_vlb_step=3.33e-5, train/loss_step=0.00711, global_step=3476.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▍         | 276/5971 [03:14<1:06:38,  1.42it/s, loss=0.0935, v_num=0, train/loss_simple_step=0.494, train/loss_vlb_step=0.0053, train/loss_step=0.494, global_step=3476.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:   5%|▍         | 277/5971 [03:15<1:06:49,  1.42it/s, loss=0.0935, v_num=0, train/loss_simple_step=0.494, train/loss_vlb_step=0.0053, train/loss_step=0.494, global_step=3476.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▍         | 277/5971 [03:15<1:06:49,  1.42it/s, loss=0.111, v_num=0, train/loss_simple_step=0.350, train/loss_vlb_step=0.00171, train/loss_step=0.350, global_step=3477.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▍         | 278/5971 [03:17<1:07:01,  1.42it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00349, train/loss_vlb_step=1.86e-5, train/loss_step=0.00349, global_step=3477.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▍         | 279/5971 [03:18<1:07:10,  1.41it/s, loss=0.0929, v_num=0, train/loss_simple_step=0.00636, train/loss_vlb_step=2.97e-5, train/loss_step=0.00636, global_step=3477.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▍         | 280/5971 [03:24<1:09:00,  1.37it/s, loss=0.0929, v_num=0, train/loss_simple_step=0.00636, train/loss_vlb_step=2.97e-5, train/loss_step=0.00636, global_step=3477.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▍         | 280/5971 [03:24<1:09:01,  1.37it/s, loss=0.0918, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000986, train/loss_step=0.240, global_step=3477.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:   5%|▍         | 281/5971 [03:26<1:09:18,  1.37it/s, loss=0.0973, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000385, train/loss_step=0.113, global_step=3478.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▍         | 282/5971 [03:27<1:09:31,  1.36it/s, loss=0.113, v_num=0, train/loss_simple_step=0.311, train/loss_vlb_step=0.00124, train/loss_step=0.311, global_step=3478.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:   5%|▍         | 283/5971 [03:29<1:09:46,  1.36it/s, loss=0.113, v_num=0, train/loss_simple_step=0.311, train/loss_vlb_step=0.00124, train/loss_step=0.311, global_step=3478.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▍         | 283/5971 [03:29<1:09:46,  1.36it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00436, train/loss_vlb_step=2.1e-5, train/loss_step=0.00436, global_step=3478.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▍         | 284/5971 [03:32<1:10:42,  1.34it/s, loss=0.123, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.00116, train/loss_step=0.274, global_step=3478.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:   5%|▍         | 285/5971 [03:34<1:10:57,  1.34it/s, loss=0.123, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000465, train/loss_step=0.138, global_step=3479.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▍         | 286/5971 [03:35<1:11:08,  1.33it/s, loss=0.123, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000465, train/loss_step=0.138, global_step=3479.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▍         | 286/5971 [03:35<1:11:08,  1.33it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00345, train/loss_vlb_step=1.89e-5, train/loss_step=0.00345, global_step=3479.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▍         | 287/5971 [03:36<1:11:20,  1.33it/s, loss=0.118, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.0004, train/loss_step=0.121, global_step=3479.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:   5%|▍         | 288/5971 [03:39<1:12:04,  1.31it/s, loss=0.149, v_num=0, train/loss_simple_step=0.621, train/loss_vlb_step=0.0086, train/loss_step=0.621, global_step=3479.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▍         | 289/5971 [03:41<1:12:15,  1.31it/s, loss=0.149, v_num=0, train/loss_simple_step=0.621, train/loss_vlb_step=0.0086, train/loss_step=0.621, global_step=3479.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▍         | 289/5971 [03:41<1:12:15,  1.31it/s, loss=0.155, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000568, train/loss_step=0.170, global_step=3480.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▍         | 290/5971 [03:42<1:12:24,  1.31it/s, loss=0.181, v_num=0, train/loss_simple_step=0.574, train/loss_vlb_step=0.0085, train/loss_step=0.574, global_step=3480.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:   5%|▍         | 291/5971 [03:43<1:12:33,  1.30it/s, loss=0.195, v_num=0, train/loss_simple_step=0.277, train/loss_vlb_step=0.00108, train/loss_step=0.277, global_step=3480.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▍         | 292/5971 [03:47<1:13:22,  1.29it/s, loss=0.195, v_num=0, train/loss_simple_step=0.277, train/loss_vlb_step=0.00108, train/loss_step=0.277, global_step=3480.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▍         | 292/5971 [03:47<1:13:22,  1.29it/s, loss=0.204, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000872, train/loss_step=0.208, global_step=3480.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▍         | 293/5971 [03:48<1:13:32,  1.29it/s, loss=0.206, v_num=0, train/loss_simple_step=0.051, train/loss_vlb_step=0.000178, train/loss_step=0.051, global_step=3481.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▍         | 294/5971 [03:49<1:13:42,  1.28it/s, loss=0.228, v_num=0, train/loss_simple_step=0.596, train/loss_vlb_step=0.0145, train/loss_step=0.596, global_step=3481.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:   5%|▍         | 295/5971 [03:51<1:13:50,  1.28it/s, loss=0.228, v_num=0, train/loss_simple_step=0.596, train/loss_vlb_step=0.0145, train/loss_step=0.596, global_step=3481.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▍         | 295/5971 [03:51<1:13:50,  1.28it/s, loss=0.234, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000399, train/loss_step=0.121, global_step=3481.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▍         | 296/5971 [03:53<1:14:29,  1.27it/s, loss=0.224, v_num=0, train/loss_simple_step=0.290, train/loss_vlb_step=0.00109, train/loss_step=0.290, global_step=3481.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   5%|▍         | 297/5971 [03:55<1:14:43,  1.27it/s, loss=0.209, v_num=0, train/loss_simple_step=0.057, train/loss_vlb_step=0.00019, train/loss_step=0.057, global_step=3482.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▍         | 298/5971 [03:57<1:14:58,  1.26it/s, loss=0.209, v_num=0, train/loss_simple_step=0.057, train/loss_vlb_step=0.00019, train/loss_step=0.057, global_step=3482.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▍         | 298/5971 [03:57<1:14:58,  1.26it/s, loss=0.209, v_num=0, train/loss_simple_step=0.010, train/loss_vlb_step=4.8e-5, train/loss_step=0.010, global_step=3482.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   5%|▌         | 299/5971 [03:58<1:15:15,  1.26it/s, loss=0.209, v_num=0, train/loss_simple_step=0.00423, train/loss_vlb_step=2.34e-5, train/loss_step=0.00423, global_step=3482.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▌         | 300/5971 [04:01<1:15:50,  1.25it/s, loss=0.197, v_num=0, train/loss_simple_step=0.00366, train/loss_vlb_step=1.87e-5, train/loss_step=0.00366, global_step=3482.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▌         | 301/5971 [04:03<1:16:03,  1.24it/s, loss=0.197, v_num=0, train/loss_simple_step=0.00366, train/loss_vlb_step=1.87e-5, train/loss_step=0.00366, global_step=3482.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▌         | 301/5971 [04:03<1:16:03,  1.24it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0328, train/loss_vlb_step=0.000119, train/loss_step=0.0328, global_step=3483.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   5%|▌         | 302/5971 [04:04<1:16:13,  1.24it/s, loss=0.197, v_num=0, train/loss_simple_step=0.381, train/loss_vlb_step=0.00196, train/loss_step=0.381, global_step=3483.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:   5%|▌         | 303/5971 [04:05<1:16:18,  1.24it/s, loss=0.203, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000415, train/loss_step=0.126, global_step=3483.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▌         | 304/5971 [04:08<1:17:00,  1.23it/s, loss=0.203, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000415, train/loss_step=0.126, global_step=3483.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▌         | 304/5971 [04:08<1:17:00,  1.23it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=2e-5, train/loss_step=0.0036, global_step=3483.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:   5%|▌         | 305/5971 [04:10<1:17:11,  1.22it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00538, train/loss_vlb_step=2.56e-5, train/loss_step=0.00538, global_step=3484.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▌         | 306/5971 [04:11<1:17:22,  1.22it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0399, train/loss_vlb_step=0.000139, train/loss_step=0.0399, global_step=3484.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   5%|▌         | 307/5971 [04:12<1:17:32,  1.22it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0399, train/loss_vlb_step=0.000139, train/loss_step=0.0399, global_step=3484.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▌         | 307/5971 [04:12<1:17:32,  1.22it/s, loss=0.183, v_num=0, train/loss_simple_step=0.094, train/loss_vlb_step=0.000309, train/loss_step=0.094, global_step=3484.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:   5%|▌         | 308/5971 [04:15<1:18:08,  1.21it/s, loss=0.156, v_num=0, train/loss_simple_step=0.076, train/loss_vlb_step=0.00025, train/loss_step=0.076, global_step=3484.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   5%|▌         | 309/5971 [04:17<1:18:17,  1.21it/s, loss=0.155, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000532, train/loss_step=0.154, global_step=3485.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▌         | 310/5971 [04:18<1:18:27,  1.20it/s, loss=0.155, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000532, train/loss_step=0.154, global_step=3485.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▌         | 310/5971 [04:18<1:18:27,  1.20it/s, loss=0.139, v_num=0, train/loss_simple_step=0.254, train/loss_vlb_step=0.00107, train/loss_step=0.254, global_step=3485.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   5%|▌         | 311/5971 [04:20<1:18:36,  1.20it/s, loss=0.17, v_num=0, train/loss_simple_step=0.900, train/loss_vlb_step=0.453, train/loss_step=0.900, global_step=3485.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:   5%|▌         | 312/5971 [04:22<1:19:12,  1.19it/s, loss=0.17, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000755, train/loss_step=0.208, global_step=3485.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▌         | 313/5971 [04:24<1:19:24,  1.19it/s, loss=0.17, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000755, train/loss_step=0.208, global_step=3485.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▌         | 313/5971 [04:24<1:19:24,  1.19it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0745, train/loss_vlb_step=0.000248, train/loss_step=0.0745, global_step=3486.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▌         | 314/5971 [04:25<1:19:34,  1.18it/s, loss=0.148, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.00041, train/loss_step=0.125, global_step=3486.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:   5%|▌         | 315/5971 [04:27<1:19:47,  1.18it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.09e-5, train/loss_step=0.0154, global_step=3486.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▌         | 316/5971 [04:30<1:20:31,  1.17it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.09e-5, train/loss_step=0.0154, global_step=3486.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▌         | 316/5971 [04:30<1:20:31,  1.17it/s, loss=0.138, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000664, train/loss_step=0.190, global_step=3486.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   5%|▌         | 317/5971 [04:32<1:20:41,  1.17it/s, loss=0.149, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.00135, train/loss_step=0.283, global_step=3487.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   5%|▌         | 318/5971 [04:33<1:20:49,  1.17it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.87e-5, train/loss_step=0.0132, global_step=3487.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▌         | 319/5971 [04:35<1:20:58,  1.16it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.87e-5, train/loss_step=0.0132, global_step=3487.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▌         | 319/5971 [04:35<1:20:58,  1.16it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00965, train/loss_vlb_step=4.42e-5, train/loss_step=0.00965, global_step=3487.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▌         | 320/5971 [04:38<1:21:40,  1.15it/s, loss=0.182, v_num=0, train/loss_simple_step=0.657, train/loss_vlb_step=0.0142, train/loss_step=0.657, global_step=3487.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:   5%|▌         | 321/5971 [04:39<1:21:49,  1.15it/s, loss=0.181, v_num=0, train/loss_simple_step=0.00845, train/loss_vlb_step=4.13e-5, train/loss_step=0.00845, global_step=3488.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▌         | 322/5971 [04:41<1:21:59,  1.15it/s, loss=0.181, v_num=0, train/loss_simple_step=0.00845, train/loss_vlb_step=4.13e-5, train/loss_step=0.00845, global_step=3488.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▌         | 322/5971 [04:41<1:21:59,  1.15it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00816, train/loss_vlb_step=3.72e-5, train/loss_step=0.00816, global_step=3488.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▌         | 323/5971 [04:42<1:22:09,  1.15it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0511, train/loss_vlb_step=0.000177, train/loss_step=0.0511, global_step=3488.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   5%|▌         | 324/5971 [04:45<1:22:48,  1.14it/s, loss=0.181, v_num=0, train/loss_simple_step=0.459, train/loss_vlb_step=0.00319, train/loss_step=0.459, global_step=3488.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:   5%|▌         | 325/5971 [04:47<1:22:59,  1.13it/s, loss=0.181, v_num=0, train/loss_simple_step=0.459, train/loss_vlb_step=0.00319, train/loss_step=0.459, global_step=3488.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▌         | 325/5971 [04:47<1:22:59,  1.13it/s, loss=0.183, v_num=0, train/loss_simple_step=0.033, train/loss_vlb_step=0.000119, train/loss_step=0.033, global_step=3489.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▌         | 326/5971 [04:48<1:23:07,  1.13it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0795, train/loss_vlb_step=0.000267, train/loss_step=0.0795, global_step=3489.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▌         | 327/5971 [04:50<1:23:17,  1.13it/s, loss=0.227, v_num=0, train/loss_simple_step=0.938, train/loss_vlb_step=0.472, train/loss_step=0.938, global_step=3489.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:   5%|▌         | 328/5971 [04:53<1:23:53,  1.12it/s, loss=0.227, v_num=0, train/loss_simple_step=0.938, train/loss_vlb_step=0.472, train/loss_step=0.938, global_step=3489.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   5%|▌         | 328/5971 [04:53<1:23:53,  1.12it/s, loss=0.223, v_num=0, train/loss_simple_step=0.0027, train/loss_vlb_step=1.52e-5, train/loss_step=0.0027, global_step=3489.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 329/5971 [04:54<1:24:02,  1.12it/s, loss=0.235, v_num=0, train/loss_simple_step=0.398, train/loss_vlb_step=0.00314, train/loss_step=0.398, global_step=3490.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:   6%|▌         | 330/5971 [04:56<1:24:16,  1.12it/s, loss=0.232, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000684, train/loss_step=0.194, global_step=3490.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 331/5971 [04:58<1:24:24,  1.11it/s, loss=0.232, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000684, train/loss_step=0.194, global_step=3490.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 331/5971 [04:58<1:24:24,  1.11it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0403, train/loss_vlb_step=0.000136, train/loss_step=0.0403, global_step=3490.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 332/5971 [05:01<1:25:00,  1.11it/s, loss=0.18, v_num=0, train/loss_simple_step=0.00947, train/loss_vlb_step=4.12e-5, train/loss_step=0.00947, global_step=3490.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 333/5971 [05:02<1:25:07,  1.10it/s, loss=0.19, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00134, train/loss_step=0.287, global_step=3491.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:   6%|▌         | 334/5971 [05:03<1:25:14,  1.10it/s, loss=0.19, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00134, train/loss_step=0.287, global_step=3491.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 334/5971 [05:03<1:25:14,  1.10it/s, loss=0.184, v_num=0, train/loss_simple_step=0.00247, train/loss_vlb_step=1.41e-5, train/loss_step=0.00247, global_step=3491.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 335/5971 [05:05<1:25:23,  1.10it/s, loss=0.191, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.00052, train/loss_step=0.157, global_step=3491.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:   6%|▌         | 336/5971 [05:08<1:25:58,  1.09it/s, loss=0.208, v_num=0, train/loss_simple_step=0.527, train/loss_vlb_step=0.00456, train/loss_step=0.527, global_step=3491.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 337/5971 [05:09<1:26:02,  1.09it/s, loss=0.208, v_num=0, train/loss_simple_step=0.527, train/loss_vlb_step=0.00456, train/loss_step=0.527, global_step=3491.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 337/5971 [05:09<1:26:02,  1.09it/s, loss=0.218, v_num=0, train/loss_simple_step=0.489, train/loss_vlb_step=0.0048, train/loss_step=0.489, global_step=3492.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   6%|▌         | 338/5971 [05:11<1:26:10,  1.09it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0377, train/loss_vlb_step=0.000132, train/loss_step=0.0377, global_step=3492.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 339/5971 [05:12<1:26:17,  1.09it/s, loss=0.221, v_num=0, train/loss_simple_step=0.041, train/loss_vlb_step=0.00015, train/loss_step=0.041, global_step=3492.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:   6%|▌         | 340/5971 [05:15<1:26:51,  1.08it/s, loss=0.221, v_num=0, train/loss_simple_step=0.041, train/loss_vlb_step=0.00015, train/loss_step=0.041, global_step=3492.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 340/5971 [05:15<1:26:51,  1.08it/s, loss=0.193, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000343, train/loss_step=0.104, global_step=3492.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 341/5971 [05:17<1:26:58,  1.08it/s, loss=0.201, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000551, train/loss_step=0.158, global_step=3493.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 342/5971 [05:18<1:27:05,  1.08it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0397, train/loss_vlb_step=0.000149, train/loss_step=0.0397, global_step=3493.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 343/5971 [05:19<1:27:09,  1.08it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0397, train/loss_vlb_step=0.000149, train/loss_step=0.0397, global_step=3493.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 343/5971 [05:19<1:27:09,  1.08it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0183, train/loss_vlb_step=7.22e-5, train/loss_step=0.0183, global_step=3493.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   6%|▌         | 344/5971 [05:22<1:27:40,  1.07it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0859, train/loss_vlb_step=0.000287, train/loss_step=0.0859, global_step=3493.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 345/5971 [05:23<1:27:46,  1.07it/s, loss=0.188, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000487, train/loss_step=0.140, global_step=3494.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:   6%|▌         | 346/5971 [05:25<1:27:51,  1.07it/s, loss=0.188, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000487, train/loss_step=0.140, global_step=3494.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 346/5971 [05:25<1:27:51,  1.07it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=5.93e-5, train/loss_step=0.0141, global_step=3494.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 347/5971 [05:26<1:27:57,  1.07it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0398, train/loss_vlb_step=0.000145, train/loss_step=0.0398, global_step=3494.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 348/5971 [05:29<1:28:28,  1.06it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0635, train/loss_vlb_step=0.000209, train/loss_step=0.0635, global_step=3494.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 349/5971 [05:30<1:28:29,  1.06it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0635, train/loss_vlb_step=0.000209, train/loss_step=0.0635, global_step=3494.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 349/5971 [05:30<1:28:29,  1.06it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0017, train/loss_vlb_step=1.02e-5, train/loss_step=0.0017, global_step=3495.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   6%|▌         | 350/5971 [05:31<1:28:30,  1.06it/s, loss=0.121, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.00052, train/loss_step=0.155, global_step=3495.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:   6%|▌         | 351/5971 [05:32<1:28:29,  1.06it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0855, train/loss_vlb_step=0.000283, train/loss_step=0.0855, global_step=3495.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 352/5971 [05:36<1:29:15,  1.05it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0855, train/loss_vlb_step=0.000283, train/loss_step=0.0855, global_step=3495.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 352/5971 [05:36<1:29:15,  1.05it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00478, train/loss_vlb_step=2.45e-5, train/loss_step=0.00478, global_step=3495.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 353/5971 [05:37<1:29:16,  1.05it/s, loss=0.115, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000475, train/loss_step=0.144, global_step=3496.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:   6%|▌         | 354/5971 [05:38<1:29:16,  1.05it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00176, train/loss_vlb_step=1.04e-5, train/loss_step=0.00176, global_step=3496.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 355/5971 [05:39<1:29:16,  1.05it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00176, train/loss_vlb_step=1.04e-5, train/loss_step=0.00176, global_step=3496.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 355/5971 [05:39<1:29:16,  1.05it/s, loss=0.113, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.00036, train/loss_step=0.109, global_step=3496.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:   6%|▌         | 356/5971 [05:42<1:29:49,  1.04it/s, loss=0.116, v_num=0, train/loss_simple_step=0.591, train/loss_vlb_step=0.00739, train/loss_step=0.591, global_step=3496.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 357/5971 [05:43<1:29:51,  1.04it/s, loss=0.0957, v_num=0, train/loss_simple_step=0.0772, train/loss_vlb_step=0.000259, train/loss_step=0.0772, global_step=3497.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 358/5971 [05:44<1:29:53,  1.04it/s, loss=0.0957, v_num=0, train/loss_simple_step=0.0772, train/loss_vlb_step=0.000259, train/loss_step=0.0772, global_step=3497.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 358/5971 [05:44<1:29:53,  1.04it/s, loss=0.0939, v_num=0, train/loss_simple_step=0.0024, train/loss_vlb_step=1.38e-5, train/loss_step=0.0024, global_step=3497.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   6%|▌         | 359/5971 [05:46<1:29:55,  1.04it/s, loss=0.0987, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000451, train/loss_step=0.137, global_step=3497.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   6%|▌         | 360/5971 [05:49<1:30:34,  1.03it/s, loss=0.0965, v_num=0, train/loss_simple_step=0.0584, train/loss_vlb_step=0.000201, train/loss_step=0.0584, global_step=3497.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 361/5971 [05:50<1:30:39,  1.03it/s, loss=0.0965, v_num=0, train/loss_simple_step=0.0584, train/loss_vlb_step=0.000201, train/loss_step=0.0584, global_step=3497.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 361/5971 [05:50<1:30:39,  1.03it/s, loss=0.0898, v_num=0, train/loss_simple_step=0.0245, train/loss_vlb_step=8.62e-5, train/loss_step=0.0245, global_step=3498.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   6%|▌         | 362/5971 [05:52<1:30:43,  1.03it/s, loss=0.0881, v_num=0, train/loss_simple_step=0.00674, train/loss_vlb_step=3.38e-5, train/loss_step=0.00674, global_step=3498.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 363/5971 [05:53<1:30:44,  1.03it/s, loss=0.0873, v_num=0, train/loss_simple_step=0.00141, train/loss_vlb_step=8.3e-6, train/loss_step=0.00141, global_step=3498.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   6%|▌         | 364/5971 [05:56<1:31:16,  1.02it/s, loss=0.0873, v_num=0, train/loss_simple_step=0.00141, train/loss_vlb_step=8.3e-6, train/loss_step=0.00141, global_step=3498.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 364/5971 [05:56<1:31:16,  1.02it/s, loss=0.103, v_num=0, train/loss_simple_step=0.398, train/loss_vlb_step=0.00184, train/loss_step=0.398, global_step=3498.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:   6%|▌         | 365/5971 [05:57<1:31:22,  1.02it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0938, train/loss_vlb_step=0.00031, train/loss_step=0.0938, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 365/5971 [06:07<1:33:45,  1.00s/it, loss=0.101, v_num=0, train/loss_simple_step=0.0938, train/loss_vlb_step=0.00031, train/loss_step=0.0938, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 366/5971 [06:42<1:42:30,  1.10s/it, loss=0.101, v_num=0, train/loss_simple_step=0.0938, train/loss_vlb_step=0.00031, train/loss_step=0.0938, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 366/5971 [06:42<1:42:30,  1.10s/it, loss=0.0999, v_num=0, train/loss_simple_step=0.00185, train/loss_vlb_step=1.12e-5, train/loss_step=0.00185, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 367/5971 [06:44<1:42:34,  1.10s/it, loss=0.0999, v_num=0, train/loss_simple_step=0.00185, train/loss_vlb_step=1.12e-5, train/loss_step=0.00185, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 367/5971 [06:44<1:42:34,  1.10s/it, loss=0.109, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000774, train/loss_step=0.222, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:   6%|▌         | 368/5971 [06:47<1:43:04,  1.10s/it, loss=0.109, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000774, train/loss_step=0.222, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   6%|▌         | 368/5971 [06:47<1:43:04,  1.10s/it, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:33,  1.77it/s][A
Epoch 6:   6%|▌         | 370/5971 [06:47<1:42:38,  1.10s/it, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   1%|          | 2/167 [00:00<01:04,  2.55it/s][A
Epoch 6:   6%|▌         | 372/5971 [06:48<1:42:08,  1.09s/it, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   2%|▏         | 4/167 [00:00<00:31,  5.20it/s][A
Epoch 6:   6%|▋         | 375/5971 [06:48<1:41:18,  1.09s/it, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   4%|▍         | 7/167 [00:01<00:16,  9.43it/s][A

Validating:   5%|▌         | 9/167 [00:01<00:14, 10.76it/s][A
Epoch 6:   6%|▋         | 378/5971 [06:48<1:40:29,  1.08s/it, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   7%|▋         | 11/167 [00:01<00:13, 11.96it/s][A
Epoch 6:   6%|▋         | 381/5971 [06:48<1:39:42,  1.07s/it, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   8%|▊         | 13/167 [00:01<00:11, 13.23it/s][A

Validating:   9%|▉         | 15/167 [00:01<00:10, 14.10it/s][A
Epoch 6:   6%|▋         | 384/5971 [06:48<1:38:54,  1.06s/it, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  10%|█         | 17/167 [00:01<00:09, 15.49it/s][A
Epoch 6:   6%|▋         | 387/5971 [06:49<1:38:07,  1.05s/it, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  12%|█▏        | 20/167 [00:01<00:08, 17.01it/s][A
Epoch 6:   7%|▋         | 390/5971 [06:49<1:37:21,  1.05s/it, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  13%|█▎        | 22/167 [00:01<00:08, 17.65it/s][A
Epoch 6:   7%|▋         | 393/5971 [06:49<1:36:36,  1.04s/it, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  15%|█▍        | 25/167 [00:02<00:07, 19.11it/s][A

Validating:  16%|█▌        | 27/167 [00:02<00:07, 18.03it/s][A
Epoch 6:   7%|▋         | 396/5971 [06:49<1:35:51,  1.03s/it, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  17%|█▋        | 29/167 [00:02<00:07, 18.41it/s][A
Epoch 6:   7%|▋         | 399/5971 [06:49<1:35:07,  1.02s/it, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  19%|█▉        | 32/167 [00:02<00:07, 19.19it/s][A
Epoch 6:   7%|▋         | 402/5971 [06:49<1:34:24,  1.02s/it, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  21%|██        | 35/167 [00:02<00:06, 19.61it/s][A
Epoch 6:   7%|▋         | 405/5971 [06:50<1:33:41,  1.01s/it, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  22%|██▏       | 37/167 [00:02<00:06, 19.40it/s][A

Validating:  23%|██▎       | 39/167 [00:02<00:06, 19.37it/s][A
Epoch 6:   7%|▋         | 408/5971 [06:50<1:32:59,  1.00s/it, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  25%|██▌       | 42/167 [00:02<00:06, 19.45it/s][A
Epoch 6:   7%|▋         | 411/5971 [06:50<1:32:17,  1.00it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  26%|██▋       | 44/167 [00:03<00:06, 19.44it/s][A
Epoch 6:   7%|▋         | 414/5971 [06:50<1:31:36,  1.01it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  28%|██▊       | 46/167 [00:03<00:06, 19.25it/s][A
Epoch 6:   7%|▋         | 417/5971 [06:50<1:30:56,  1.02it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  29%|██▉       | 49/167 [00:03<00:05, 20.02it/s][A
Epoch 6:   7%|▋         | 420/5971 [06:50<1:30:16,  1.02it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  31%|███       | 52/167 [00:03<00:05, 20.34it/s][A
Epoch 6:   7%|▋         | 423/5971 [06:50<1:29:36,  1.03it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  33%|███▎      | 55/167 [00:03<00:05, 20.90it/s][A
Epoch 6:   7%|▋         | 426/5971 [06:51<1:28:58,  1.04it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  35%|███▍      | 58/167 [00:03<00:05, 20.99it/s][A
Epoch 6:   7%|▋         | 429/5971 [06:51<1:28:19,  1.05it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  37%|███▋      | 61/167 [00:03<00:05, 20.61it/s][A
Epoch 6:   7%|▋         | 432/5971 [06:51<1:27:42,  1.05it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  38%|███▊      | 64/167 [00:04<00:05, 20.59it/s][A
Epoch 6:   7%|▋         | 435/5971 [06:51<1:27:05,  1.06it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  40%|████      | 67/167 [00:04<00:04, 20.64it/s][A
Epoch 6:   7%|▋         | 438/5971 [06:51<1:26:28,  1.07it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  42%|████▏     | 70/167 [00:04<00:04, 20.81it/s][A
Epoch 6:   7%|▋         | 441/5971 [06:51<1:25:52,  1.07it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  44%|████▎     | 73/167 [00:04<00:04, 18.85it/s][A
Epoch 6:   7%|▋         | 444/5971 [06:51<1:25:17,  1.08it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  46%|████▌     | 76/167 [00:04<00:04, 19.35it/s][A
Epoch 6:   7%|▋         | 447/5971 [06:52<1:24:41,  1.09it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  47%|████▋     | 79/167 [00:04<00:04, 19.64it/s][A

Validating:  49%|████▊     | 81/167 [00:04<00:04, 19.14it/s][A
Epoch 6:   8%|▊         | 450/5971 [06:52<1:24:07,  1.09it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  50%|████▉     | 83/167 [00:05<00:04, 19.32it/s][A
Epoch 6:   8%|▊         | 453/5971 [06:52<1:23:33,  1.10it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  51%|█████     | 85/167 [00:05<00:04, 17.09it/s][A

Validating:  52%|█████▏    | 87/167 [00:05<00:04, 16.65it/s][A
Epoch 6:   8%|▊         | 456/5971 [06:52<1:23:00,  1.11it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  53%|█████▎    | 89/167 [00:05<00:04, 16.40it/s][A
Epoch 6:   8%|▊         | 459/5971 [06:52<1:22:27,  1.11it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  54%|█████▍    | 91/167 [00:05<00:05, 15.13it/s][A

Validating:  56%|█████▌    | 93/167 [00:05<00:04, 15.86it/s][A
Epoch 6:   8%|▊         | 462/5971 [06:53<1:21:55,  1.12it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  57%|█████▋    | 96/167 [00:05<00:03, 18.19it/s][A
Epoch 6:   8%|▊         | 465/5971 [06:53<1:21:22,  1.13it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  59%|█████▉    | 99/167 [00:05<00:03, 19.76it/s][A
Epoch 6:   8%|▊         | 468/5971 [06:53<1:20:49,  1.13it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  61%|██████    | 102/167 [00:06<00:03, 20.64it/s][A
Epoch 6:   8%|▊         | 471/5971 [06:53<1:20:17,  1.14it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  63%|██████▎   | 105/167 [00:06<00:02, 20.76it/s][A
Epoch 6:   8%|▊         | 474/5971 [06:53<1:19:46,  1.15it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  65%|██████▍   | 108/167 [00:06<00:02, 21.52it/s][A
Epoch 6:   8%|▊         | 477/5971 [06:53<1:19:15,  1.16it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  66%|██████▋   | 111/167 [00:06<00:02, 22.08it/s][A
Epoch 6:   8%|▊         | 480/5971 [06:53<1:18:44,  1.16it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  68%|██████▊   | 114/167 [00:06<00:02, 21.09it/s][A
Epoch 6:   8%|▊         | 483/5971 [06:54<1:18:14,  1.17it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  70%|███████   | 117/167 [00:06<00:02, 21.33it/s][A
Epoch 6:   8%|▊         | 486/5971 [06:54<1:17:44,  1.18it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  72%|███████▏  | 120/167 [00:06<00:02, 21.79it/s][A
Epoch 6:   8%|▊         | 489/5971 [06:54<1:17:15,  1.18it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  74%|███████▎  | 123/167 [00:07<00:02, 21.44it/s][A
Epoch 6:   8%|▊         | 492/5971 [06:54<1:16:45,  1.19it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  75%|███████▌  | 126/167 [00:07<00:02, 20.48it/s][A
Epoch 6:   8%|▊         | 495/5971 [06:54<1:16:17,  1.20it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  77%|███████▋  | 129/167 [00:07<00:01, 20.38it/s][A
Epoch 6:   8%|▊         | 498/5971 [06:54<1:15:48,  1.20it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  79%|███████▉  | 132/167 [00:07<00:01, 19.72it/s][A
Epoch 6:   8%|▊         | 501/5971 [06:54<1:15:21,  1.21it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  81%|████████  | 135/167 [00:07<00:01, 19.80it/s][A
Epoch 6:   8%|▊         | 504/5971 [06:55<1:14:53,  1.22it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  82%|████████▏ | 137/167 [00:07<00:01, 19.25it/s][A
Epoch 6:   8%|▊         | 507/5971 [06:55<1:14:26,  1.22it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  83%|████████▎ | 139/167 [00:07<00:01, 18.79it/s][A

Validating:  84%|████████▍ | 141/167 [00:08<00:01, 18.91it/s][A
Epoch 6:   9%|▊         | 510/5971 [06:55<1:13:59,  1.23it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  86%|████████▌ | 144/167 [00:08<00:01, 18.99it/s][A
Epoch 6:   9%|▊         | 513/5971 [06:55<1:13:32,  1.24it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  87%|████████▋ | 146/167 [00:08<00:01, 18.95it/s][A
Epoch 6:   9%|▊         | 516/5971 [06:55<1:13:06,  1.24it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  89%|████████▉ | 149/167 [00:08<00:00, 18.92it/s][A
Epoch 6:   9%|▊         | 519/5971 [06:55<1:12:40,  1.25it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  91%|█████████ | 152/167 [00:08<00:00, 19.48it/s][A
Epoch 6:   9%|▊         | 522/5971 [06:56<1:12:14,  1.26it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  92%|█████████▏| 154/167 [00:08<00:00, 18.87it/s][A
Epoch 6:   9%|▉         | 525/5971 [06:56<1:11:48,  1.26it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  94%|█████████▍| 157/167 [00:08<00:00, 19.14it/s][A
Epoch 6:   9%|▉         | 528/5971 [06:56<1:11:23,  1.27it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  96%|█████████▌| 160/167 [00:09<00:00, 19.86it/s][A

Validating:  97%|█████████▋| 162/167 [00:09<00:00, 19.37it/s][A
Epoch 6:   9%|▉         | 531/5971 [06:56<1:10:58,  1.28it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  99%|█████████▉| 165/167 [00:09<00:00, 19.85it/s][A
Epoch 6:   9%|▉         | 534/5971 [06:56<1:10:34,  1.28it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 536/5971 [06:57<1:10:25,  1.29it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.31it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:21,  2.24it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:16,  2.83it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:14,  3.28it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:12,  3.64it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:11,  3.85it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:02<00:10,  4.04it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:10,  4.11it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:10,  4.09it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:09,  4.08it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:03<00:09,  4.06it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:03<00:10,  3.69it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:09,  3.76it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:09,  3.70it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:04<00:09,  3.68it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:04<00:09,  3.77it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:04<00:08,  3.85it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:05<00:08,  3.81it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:05<00:08,  3.82it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:05<00:07,  3.83it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:05<00:07,  3.87it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:06<00:07,  3.81it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:06<00:07,  3.47it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:06<00:08,  3.20it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:07<00:07,  3.24it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:07<00:07,  3.36it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:07<00:06,  3.44it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:07<00:06,  3.52it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:08<00:05,  3.52it/s][A
Epoch 6:   9%|▉         | 536/5971 [07:07<1:12:04,  1.26it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Spaced Sampler:  60%|██████    | 30/50 [00:08<00:05,  3.57it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:08<00:05,  3.60it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:08<00:04,  3.63it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:09<00:04,  3.63it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:09<00:04,  3.66it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:09<00:04,  3.69it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:10<00:03,  3.71it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:10<00:03,  3.72it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:10<00:03,  3.74it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:10<00:03,  3.64it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:11<00:03,  3.31it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:11<00:02,  3.11it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:11<00:02,  3.29it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:12<00:02,  3.40it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:12<00:01,  3.50it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:12<00:01,  3.53it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:12<00:01,  3.61it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:13<00:00,  3.65it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:13<00:00,  3.69it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:13<00:00,  3.72it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:14<00:00,  3.71it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:14<00:00,  3.56it/s]

Epoch 6:   9%|▉         | 537/5971 [07:14<1:13:07,  1.24it/s, loss=0.107, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.9e-5, train/loss_step=0.016, global_step=3499.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 537/5971 [07:14<1:13:07,  1.24it/s, loss=0.114, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.00051, train/loss_step=0.152, global_step=3500.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.30it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:01<00:22,  2.12it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:17,  2.65it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:16,  2.73it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:02<00:16,  2.70it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:02<00:15,  2.83it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:02<00:13,  3.14it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:12,  3.39it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:03<00:11,  3.54it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:03<00:11,  3.56it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:03<00:11,  3.51it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:03<00:10,  3.57it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:04<00:10,  3.39it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:04<00:11,  3.17it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:04<00:10,  3.21it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:05<00:10,  3.33it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:05<00:09,  3.40it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:05<00:09,  3.55it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:05<00:08,  3.67it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:06<00:07,  3.76it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:06<00:07,  3.77it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:06<00:07,  3.85it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:06<00:06,  3.97it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:07<00:06,  4.09it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:07<00:06,  4.05it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:07<00:05,  4.11it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:07<00:05,  4.09it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:08<00:05,  4.00it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:08<00:05,  4.05it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:08<00:04,  4.09it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:09<00:05,  3.23it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:09<00:05,  3.17it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:09<00:04,  3.40it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:09<00:04,  3.58it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:10<00:04,  3.58it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:10<00:04,  3.32it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:10<00:03,  3.48it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:11<00:03,  3.62it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:11<00:03,  3.65it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:11<00:02,  3.61it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:12<00:02,  3.18it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:12<00:02,  3.12it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:12<00:02,  3.26it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:12<00:01,  3.41it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:13<00:01,  3.50it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:13<00:01,  3.55it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:13<00:00,  3.65it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:13<00:00,  3.69it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:14<00:00,  3.70it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:14<00:00,  3.78it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:14<00:00,  3.45it/s]

Epoch 6:   9%|▉         | 538/5971 [07:31<1:15:53,  1.19it/s, loss=0.114, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.00051, train/loss_step=0.152, global_step=3500.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 538/5971 [07:31<1:15:53,  1.19it/s, loss=0.107, v_num=0, train/loss_simple_step=0.00194, train/loss_vlb_step=1.09e-5, train/loss_step=0.00194, global_step=3500.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.35it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:01<00:22,  2.13it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:18,  2.58it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:15,  2.95it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:13,  3.22it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:02<00:13,  3.22it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:02<00:12,  3.36it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:12,  3.47it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:11,  3.53it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:03<00:11,  3.62it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:03<00:10,  3.63it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:03<00:11,  3.40it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:04<00:11,  3.14it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:04<00:10,  3.30it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:04<00:10,  3.41it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:04<00:09,  3.53it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:05<00:09,  3.61it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:05<00:08,  3.64it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:05<00:08,  3.67it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:06<00:08,  3.75it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:06<00:07,  3.77it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:06<00:07,  3.77it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:06<00:07,  3.74it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:07<00:06,  3.75it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:07<00:06,  3.72it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:07<00:06,  3.72it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:07<00:06,  3.74it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:08<00:05,  3.67it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:08<00:05,  3.74it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:08<00:05,  3.81it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:09<00:05,  3.51it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:09<00:05,  3.19it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:09<00:05,  3.30it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:09<00:04,  3.40it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:10<00:04,  3.46it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:10<00:03,  3.50it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:10<00:03,  3.57it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:11<00:03,  3.61it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:11<00:03,  3.64it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:11<00:02,  3.66it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:11<00:02,  3.68it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:12<00:02,  3.74it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:12<00:01,  3.79it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:12<00:01,  3.80it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:12<00:01,  3.81it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:13<00:01,  3.79it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:13<00:00,  3.79it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:13<00:00,  3.64it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:14<00:00,  3.54it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:14<00:00,  3.63it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:14<00:00,  3.49it/s]

Epoch 6:   9%|▉         | 539/5971 [07:49<1:18:40,  1.15it/s, loss=0.107, v_num=0, train/loss_simple_step=0.00194, train/loss_vlb_step=1.09e-5, train/loss_step=0.00194, global_step=3500.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 539/5971 [07:49<1:18:40,  1.15it/s, loss=0.113, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.00095, train/loss_step=0.224, global_step=3500.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:40,  1.21it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:01<00:25,  1.86it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:20,  2.25it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:17,  2.70it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:02<00:14,  3.00it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:02<00:13,  3.21it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:02<00:14,  3.04it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:14,  3.00it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:03<00:12,  3.30it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:03<00:11,  3.50it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:03<00:10,  3.67it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:03<00:10,  3.78it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:04<00:09,  3.85it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:04<00:09,  3.87it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:04<00:08,  3.92it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:04<00:08,  3.88it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:05<00:08,  3.82it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:05<00:08,  3.76it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:05<00:08,  3.75it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:06<00:08,  3.73it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:06<00:07,  3.73it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:06<00:07,  3.77it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:06<00:07,  3.77it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:07<00:06,  3.76it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:07<00:06,  3.78it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:07<00:06,  3.77it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:07<00:06,  3.79it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:08<00:05,  3.81it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:08<00:05,  3.85it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:08<00:05,  3.89it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:08<00:04,  3.81it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:09<00:04,  3.65it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:09<00:05,  3.26it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:09<00:04,  3.37it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:10<00:04,  3.48it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:10<00:03,  3.53it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:10<00:03,  3.59it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:10<00:03,  3.67it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:11<00:02,  3.75it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:11<00:02,  3.76it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:11<00:02,  3.79it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:12<00:02,  3.85it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:12<00:01,  3.84it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:12<00:01,  3.79it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:12<00:01,  3.74it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:13<00:01,  3.62it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:13<00:00,  3.60it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:13<00:00,  3.60it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:13<00:00,  3.63it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:14<00:00,  3.62it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:14<00:00,  3.51it/s]

Epoch 6:   9%|▉         | 540/5971 [08:07<1:21:36,  1.11it/s, loss=0.113, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.00095, train/loss_step=0.224, global_step=3500.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 540/5971 [08:07<1:21:36,  1.11it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00135, train/loss_vlb_step=8.1e-6, train/loss_step=0.00135, global_step=3500.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 541/5971 [08:08<1:21:37,  1.11it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00135, train/loss_vlb_step=8.1e-6, train/loss_step=0.00135, global_step=3500.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 541/5971 [08:08<1:21:37,  1.11it/s, loss=0.114, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.00055, train/loss_step=0.166, global_step=3501.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:   9%|▉         | 542/5971 [08:09<1:21:38,  1.11it/s, loss=0.114, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.00055, train/loss_step=0.166, global_step=3501.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 542/5971 [08:09<1:21:38,  1.11it/s, loss=0.124, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000777, train/loss_step=0.194, global_step=3501.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 543/5971 [08:11<1:21:39,  1.11it/s, loss=0.124, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000777, train/loss_step=0.194, global_step=3501.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 543/5971 [08:11<1:21:39,  1.11it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00815, train/loss_vlb_step=3.79e-5, train/loss_step=0.00815, global_step=3501.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 544/5971 [08:13<1:21:57,  1.10it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00815, train/loss_vlb_step=3.79e-5, train/loss_step=0.00815, global_step=3501.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 544/5971 [08:13<1:21:57,  1.10it/s, loss=0.0928, v_num=0, train/loss_simple_step=0.0689, train/loss_vlb_step=0.000234, train/loss_step=0.0689, global_step=3501.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 545/5971 [08:15<1:22:01,  1.10it/s, loss=0.0928, v_num=0, train/loss_simple_step=0.0689, train/loss_vlb_step=0.000234, train/loss_step=0.0689, global_step=3501.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 545/5971 [08:15<1:22:01,  1.10it/s, loss=0.0973, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000594, train/loss_step=0.168, global_step=3502.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:   9%|▉         | 546/5971 [08:16<1:22:04,  1.10it/s, loss=0.0973, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000594, train/loss_step=0.168, global_step=3502.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 546/5971 [08:16<1:22:04,  1.10it/s, loss=0.0976, v_num=0, train/loss_simple_step=0.00693, train/loss_vlb_step=3.44e-5, train/loss_step=0.00693, global_step=3502.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 547/5971 [08:17<1:22:04,  1.10it/s, loss=0.0976, v_num=0, train/loss_simple_step=0.00693, train/loss_vlb_step=3.44e-5, train/loss_step=0.00693, global_step=3502.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 547/5971 [08:17<1:22:04,  1.10it/s, loss=0.0917, v_num=0, train/loss_simple_step=0.0195, train/loss_vlb_step=7.97e-5, train/loss_step=0.0195, global_step=3502.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:   9%|▉         | 548/5971 [08:20<1:22:20,  1.10it/s, loss=0.0917, v_num=0, train/loss_simple_step=0.0195, train/loss_vlb_step=7.97e-5, train/loss_step=0.0195, global_step=3502.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 548/5971 [08:20<1:22:20,  1.10it/s, loss=0.0888, v_num=0, train/loss_simple_step=0.00166, train/loss_vlb_step=9.47e-6, train/loss_step=0.00166, global_step=3502.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 549/5971 [08:21<1:22:21,  1.10it/s, loss=0.0888, v_num=0, train/loss_simple_step=0.00166, train/loss_vlb_step=9.47e-6, train/loss_step=0.00166, global_step=3502.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 549/5971 [08:21<1:22:21,  1.10it/s, loss=0.0879, v_num=0, train/loss_simple_step=0.0053, train/loss_vlb_step=2.75e-5, train/loss_step=0.0053, global_step=3503.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:   9%|▉         | 550/5971 [08:22<1:22:23,  1.10it/s, loss=0.0879, v_num=0, train/loss_simple_step=0.0053, train/loss_vlb_step=2.75e-5, train/loss_step=0.0053, global_step=3503.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 550/5971 [08:22<1:22:23,  1.10it/s, loss=0.0886, v_num=0, train/loss_simple_step=0.0204, train/loss_vlb_step=8.14e-5, train/loss_step=0.0204, global_step=3503.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 551/5971 [08:23<1:22:24,  1.10it/s, loss=0.0886, v_num=0, train/loss_simple_step=0.0204, train/loss_vlb_step=8.14e-5, train/loss_step=0.0204, global_step=3503.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 551/5971 [08:23<1:22:24,  1.10it/s, loss=0.0905, v_num=0, train/loss_simple_step=0.0407, train/loss_vlb_step=0.000149, train/loss_step=0.0407, global_step=3503.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 552/5971 [08:26<1:22:40,  1.09it/s, loss=0.0905, v_num=0, train/loss_simple_step=0.0407, train/loss_vlb_step=0.000149, train/loss_step=0.0407, global_step=3503.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 552/5971 [08:26<1:22:40,  1.09it/s, loss=0.0872, v_num=0, train/loss_simple_step=0.332, train/loss_vlb_step=0.00145, train/loss_step=0.332, global_step=3503.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:   9%|▉         | 553/5971 [08:27<1:22:43,  1.09it/s, loss=0.0872, v_num=0, train/loss_simple_step=0.332, train/loss_vlb_step=0.00145, train/loss_step=0.332, global_step=3503.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 553/5971 [08:27<1:22:43,  1.09it/s, loss=0.0837, v_num=0, train/loss_simple_step=0.0232, train/loss_vlb_step=9.08e-5, train/loss_step=0.0232, global_step=3504.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 554/5971 [08:29<1:22:50,  1.09it/s, loss=0.0837, v_num=0, train/loss_simple_step=0.0232, train/loss_vlb_step=9.08e-5, train/loss_step=0.0232, global_step=3504.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 554/5971 [08:29<1:22:50,  1.09it/s, loss=0.0934, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.00066, train/loss_step=0.197, global_step=3504.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:   9%|▉         | 555/5971 [08:30<1:22:50,  1.09it/s, loss=0.0934, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.00066, train/loss_step=0.197, global_step=3504.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 555/5971 [08:30<1:22:50,  1.09it/s, loss=0.0918, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.0007, train/loss_step=0.189, global_step=3504.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:   9%|▉         | 556/5971 [08:33<1:23:09,  1.09it/s, loss=0.0918, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.0007, train/loss_step=0.189, global_step=3504.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 556/5971 [08:33<1:23:09,  1.09it/s, loss=0.104, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.00135, train/loss_step=0.268, global_step=3504.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 557/5971 [08:34<1:23:09,  1.08it/s, loss=0.104, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.00135, train/loss_step=0.268, global_step=3504.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 557/5971 [08:34<1:23:09,  1.08it/s, loss=0.0999, v_num=0, train/loss_simple_step=0.0634, train/loss_vlb_step=0.000223, train/loss_step=0.0634, global_step=3505.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 558/5971 [08:35<1:23:09,  1.08it/s, loss=0.0999, v_num=0, train/loss_simple_step=0.0634, train/loss_vlb_step=0.000223, train/loss_step=0.0634, global_step=3505.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 558/5971 [08:35<1:23:09,  1.08it/s, loss=0.105, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000371, train/loss_step=0.112, global_step=3505.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:   9%|▉         | 559/5971 [08:36<1:23:10,  1.08it/s, loss=0.105, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000371, train/loss_step=0.112, global_step=3505.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 559/5971 [08:36<1:23:10,  1.08it/s, loss=0.0974, v_num=0, train/loss_simple_step=0.0628, train/loss_vlb_step=0.000212, train/loss_step=0.0628, global_step=3505.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 560/5971 [08:39<1:23:27,  1.08it/s, loss=0.0974, v_num=0, train/loss_simple_step=0.0628, train/loss_vlb_step=0.000212, train/loss_step=0.0628, global_step=3505.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 560/5971 [08:39<1:23:27,  1.08it/s, loss=0.0977, v_num=0, train/loss_simple_step=0.00643, train/loss_vlb_step=3.15e-5, train/loss_step=0.00643, global_step=3505.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 561/5971 [08:40<1:23:28,  1.08it/s, loss=0.0977, v_num=0, train/loss_simple_step=0.00643, train/loss_vlb_step=3.15e-5, train/loss_step=0.00643, global_step=3505.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 561/5971 [08:40<1:23:28,  1.08it/s, loss=0.0902, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.64e-5, train/loss_step=0.018, global_step=3506.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:   9%|▉         | 562/5971 [08:41<1:23:32,  1.08it/s, loss=0.0902, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.64e-5, train/loss_step=0.018, global_step=3506.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 562/5971 [08:41<1:23:32,  1.08it/s, loss=0.0975, v_num=0, train/loss_simple_step=0.338, train/loss_vlb_step=0.00174, train/loss_step=0.338, global_step=3506.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 563/5971 [08:42<1:23:32,  1.08it/s, loss=0.0975, v_num=0, train/loss_simple_step=0.338, train/loss_vlb_step=0.00174, train/loss_step=0.338, global_step=3506.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 563/5971 [08:42<1:23:32,  1.08it/s, loss=0.0975, v_num=0, train/loss_simple_step=0.00831, train/loss_vlb_step=3.63e-5, train/loss_step=0.00831, global_step=3506.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 564/5971 [08:45<1:23:45,  1.08it/s, loss=0.0975, v_num=0, train/loss_simple_step=0.00831, train/loss_vlb_step=3.63e-5, train/loss_step=0.00831, global_step=3506.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 564/5971 [08:45<1:23:45,  1.08it/s, loss=0.101, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000447, train/loss_step=0.136, global_step=3506.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:   9%|▉         | 565/5971 [08:46<1:23:45,  1.08it/s, loss=0.101, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000447, train/loss_step=0.136, global_step=3506.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 565/5971 [08:46<1:23:45,  1.08it/s, loss=0.129, v_num=0, train/loss_simple_step=0.721, train/loss_vlb_step=0.0213, train/loss_step=0.721, global_step=3507.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:   9%|▉         | 566/5971 [08:47<1:23:44,  1.08it/s, loss=0.129, v_num=0, train/loss_simple_step=0.721, train/loss_vlb_step=0.0213, train/loss_step=0.721, global_step=3507.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 566/5971 [08:47<1:23:44,  1.08it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.98e-5, train/loss_step=0.0135, global_step=3507.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 567/5971 [08:47<1:23:43,  1.08it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.98e-5, train/loss_step=0.0135, global_step=3507.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:   9%|▉         | 567/5971 [08:48<1:23:43,  1.08it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0025, train/loss_vlb_step=1.37e-5, train/loss_step=0.0025, global_step=3507.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 568/5971 [08:51<1:24:02,  1.07it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0025, train/loss_vlb_step=1.37e-5, train/loss_step=0.0025, global_step=3507.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 568/5971 [08:51<1:24:02,  1.07it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0165, train/loss_vlb_step=7.46e-5, train/loss_step=0.0165, global_step=3507.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 569/5971 [08:52<1:24:04,  1.07it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0165, train/loss_vlb_step=7.46e-5, train/loss_step=0.0165, global_step=3507.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 569/5971 [08:52<1:24:04,  1.07it/s, loss=0.159, v_num=0, train/loss_simple_step=0.609, train/loss_vlb_step=0.00971, train/loss_step=0.609, global_step=3508.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  10%|▉         | 570/5971 [08:53<1:24:03,  1.07it/s, loss=0.159, v_num=0, train/loss_simple_step=0.609, train/loss_vlb_step=0.00971, train/loss_step=0.609, global_step=3508.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 570/5971 [08:53<1:24:03,  1.07it/s, loss=0.158, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=3508.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 571/5971 [08:54<1:24:02,  1.07it/s, loss=0.158, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.09e-5, train/loss_step=0.004, global_step=3508.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 571/5971 [08:54<1:24:02,  1.07it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00489, train/loss_vlb_step=2.6e-5, train/loss_step=0.00489, global_step=3508.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 572/5971 [08:57<1:24:21,  1.07it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00489, train/loss_vlb_step=2.6e-5, train/loss_step=0.00489, global_step=3508.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 572/5971 [08:57<1:24:21,  1.07it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00283, train/loss_vlb_step=1.6e-5, train/loss_step=0.00283, global_step=3508.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  10%|▉         | 573/5971 [08:58<1:24:22,  1.07it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00283, train/loss_vlb_step=1.6e-5, train/loss_step=0.00283, global_step=3508.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 573/5971 [08:58<1:24:22,  1.07it/s, loss=0.153, v_num=0, train/loss_simple_step=0.293, train/loss_vlb_step=0.0018, train/loss_step=0.293, global_step=3509.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  10%|▉         | 574/5971 [08:59<1:24:23,  1.07it/s, loss=0.153, v_num=0, train/loss_simple_step=0.293, train/loss_vlb_step=0.0018, train/loss_step=0.293, global_step=3509.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 574/5971 [08:59<1:24:23,  1.07it/s, loss=0.166, v_num=0, train/loss_simple_step=0.451, train/loss_vlb_step=0.00311, train/loss_step=0.451, global_step=3509.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 575/5971 [09:00<1:24:26,  1.07it/s, loss=0.166, v_num=0, train/loss_simple_step=0.451, train/loss_vlb_step=0.00311, train/loss_step=0.451, global_step=3509.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 575/5971 [09:00<1:24:26,  1.07it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00204, train/loss_vlb_step=1.22e-5, train/loss_step=0.00204, global_step=3509.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 576/5971 [09:03<1:24:43,  1.06it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00204, train/loss_vlb_step=1.22e-5, train/loss_step=0.00204, global_step=3509.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 576/5971 [09:03<1:24:43,  1.06it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00339, train/loss_vlb_step=1.92e-5, train/loss_step=0.00339, global_step=3509.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 577/5971 [09:04<1:24:45,  1.06it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00339, train/loss_vlb_step=1.92e-5, train/loss_step=0.00339, global_step=3509.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 577/5971 [09:04<1:24:45,  1.06it/s, loss=0.171, v_num=0, train/loss_simple_step=0.616, train/loss_vlb_step=0.0113, train/loss_step=0.616, global_step=3510.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:  10%|▉         | 578/5971 [09:05<1:24:45,  1.06it/s, loss=0.171, v_num=0, train/loss_simple_step=0.616, train/loss_vlb_step=0.0113, train/loss_step=0.616, global_step=3510.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 578/5971 [09:05<1:24:45,  1.06it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0734, train/loss_vlb_step=0.000244, train/loss_step=0.0734, global_step=3510.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 579/5971 [09:07<1:24:45,  1.06it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0734, train/loss_vlb_step=0.000244, train/loss_step=0.0734, global_step=3510.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 579/5971 [09:07<1:24:45,  1.06it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0515, train/loss_vlb_step=0.000187, train/loss_step=0.0515, global_step=3510.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 580/5971 [09:09<1:25:00,  1.06it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0515, train/loss_vlb_step=0.000187, train/loss_step=0.0515, global_step=3510.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 580/5971 [09:09<1:25:00,  1.06it/s, loss=0.195, v_num=0, train/loss_simple_step=0.534, train/loss_vlb_step=0.00573, train/loss_step=0.534, global_step=3510.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  10%|▉         | 581/5971 [09:10<1:25:01,  1.06it/s, loss=0.195, v_num=0, train/loss_simple_step=0.534, train/loss_vlb_step=0.00573, train/loss_step=0.534, global_step=3510.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 581/5971 [09:10<1:25:02,  1.06it/s, loss=0.207, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.00103, train/loss_step=0.249, global_step=3511.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 582/5971 [09:12<1:25:02,  1.06it/s, loss=0.207, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.00103, train/loss_step=0.249, global_step=3511.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 582/5971 [09:12<1:25:02,  1.06it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0293, train/loss_vlb_step=0.000118, train/loss_step=0.0293, global_step=3511.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 583/5971 [09:13<1:25:02,  1.06it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0293, train/loss_vlb_step=0.000118, train/loss_step=0.0293, global_step=3511.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 583/5971 [09:13<1:25:02,  1.06it/s, loss=0.196, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000342, train/loss_step=0.104, global_step=3511.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  10%|▉         | 584/5971 [09:16<1:25:26,  1.05it/s, loss=0.196, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000342, train/loss_step=0.104, global_step=3511.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 584/5971 [09:16<1:25:26,  1.05it/s, loss=0.197, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.00053, train/loss_step=0.155, global_step=3511.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  10%|▉         | 585/5971 [09:17<1:25:28,  1.05it/s, loss=0.197, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.00053, train/loss_step=0.155, global_step=3511.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 585/5971 [09:17<1:25:28,  1.05it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0367, train/loss_vlb_step=0.000132, train/loss_step=0.0367, global_step=3512.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 586/5971 [09:19<1:25:29,  1.05it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0367, train/loss_vlb_step=0.000132, train/loss_step=0.0367, global_step=3512.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 586/5971 [09:19<1:25:29,  1.05it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0229, train/loss_vlb_step=9.36e-5, train/loss_step=0.0229, global_step=3512.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  10%|▉         | 587/5971 [09:20<1:25:29,  1.05it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0229, train/loss_vlb_step=9.36e-5, train/loss_step=0.0229, global_step=3512.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 587/5971 [09:20<1:25:29,  1.05it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0085, train/loss_vlb_step=4.1e-5, train/loss_step=0.0085, global_step=3512.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  10%|▉         | 588/5971 [09:22<1:25:44,  1.05it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0085, train/loss_vlb_step=4.1e-5, train/loss_step=0.0085, global_step=3512.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 588/5971 [09:22<1:25:44,  1.05it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0291, train/loss_vlb_step=0.000105, train/loss_step=0.0291, global_step=3512.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 589/5971 [09:23<1:25:43,  1.05it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0291, train/loss_vlb_step=0.000105, train/loss_step=0.0291, global_step=3512.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 589/5971 [09:23<1:25:43,  1.05it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0683, train/loss_vlb_step=0.000226, train/loss_step=0.0683, global_step=3513.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 590/5971 [09:24<1:25:43,  1.05it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0683, train/loss_vlb_step=0.000226, train/loss_step=0.0683, global_step=3513.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 590/5971 [09:24<1:25:43,  1.05it/s, loss=0.158, v_num=0, train/loss_simple_step=0.429, train/loss_vlb_step=0.00319, train/loss_step=0.429, global_step=3513.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  10%|▉         | 591/5971 [09:26<1:25:45,  1.05it/s, loss=0.158, v_num=0, train/loss_simple_step=0.429, train/loss_vlb_step=0.00319, train/loss_step=0.429, global_step=3513.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 591/5971 [09:26<1:25:45,  1.05it/s, loss=0.17, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.000957, train/loss_step=0.249, global_step=3513.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 592/5971 [09:28<1:25:56,  1.04it/s, loss=0.17, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.000957, train/loss_step=0.249, global_step=3513.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 592/5971 [09:28<1:25:56,  1.04it/s, loss=0.188, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00176, train/loss_step=0.348, global_step=3513.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 593/5971 [09:29<1:25:55,  1.04it/s, loss=0.188, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00176, train/loss_step=0.348, global_step=3513.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 593/5971 [09:29<1:25:55,  1.04it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0266, train/loss_vlb_step=9.99e-5, train/loss_step=0.0266, global_step=3514.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 594/5971 [09:30<1:25:54,  1.04it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0266, train/loss_vlb_step=9.99e-5, train/loss_step=0.0266, global_step=3514.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 594/5971 [09:30<1:25:54,  1.04it/s, loss=0.154, v_num=0, train/loss_simple_step=0.053, train/loss_vlb_step=0.000187, train/loss_step=0.053, global_step=3514.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  10%|▉         | 595/5971 [09:31<1:25:53,  1.04it/s, loss=0.154, v_num=0, train/loss_simple_step=0.053, train/loss_vlb_step=0.000187, train/loss_step=0.053, global_step=3514.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 595/5971 [09:31<1:25:53,  1.04it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0622, train/loss_vlb_step=0.000212, train/loss_step=0.0622, global_step=3514.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 596/5971 [09:34<1:26:11,  1.04it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0622, train/loss_vlb_step=0.000212, train/loss_step=0.0622, global_step=3514.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 596/5971 [09:34<1:26:11,  1.04it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0705, train/loss_vlb_step=0.000233, train/loss_step=0.0705, global_step=3514.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 597/5971 [09:35<1:26:10,  1.04it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0705, train/loss_vlb_step=0.000233, train/loss_step=0.0705, global_step=3514.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|▉         | 597/5971 [09:35<1:26:10,  1.04it/s, loss=0.152, v_num=0, train/loss_simple_step=0.442, train/loss_vlb_step=0.00302, train/loss_step=0.442, global_step=3515.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  10%|█         | 598/5971 [09:36<1:26:10,  1.04it/s, loss=0.152, v_num=0, train/loss_simple_step=0.442, train/loss_vlb_step=0.00302, train/loss_step=0.442, global_step=3515.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 598/5971 [09:36<1:26:10,  1.04it/s, loss=0.161, v_num=0, train/loss_simple_step=0.252, train/loss_vlb_step=0.00107, train/loss_step=0.252, global_step=3515.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 599/5971 [09:37<1:26:09,  1.04it/s, loss=0.161, v_num=0, train/loss_simple_step=0.252, train/loss_vlb_step=0.00107, train/loss_step=0.252, global_step=3515.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 599/5971 [09:37<1:26:09,  1.04it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0719, train/loss_vlb_step=0.000244, train/loss_step=0.0719, global_step=3515.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 600/5971 [09:39<1:26:22,  1.04it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0719, train/loss_vlb_step=0.000244, train/loss_step=0.0719, global_step=3515.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 600/5971 [09:39<1:26:22,  1.04it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0474, train/loss_vlb_step=0.000166, train/loss_step=0.0474, global_step=3515.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 601/5971 [09:40<1:26:22,  1.04it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0474, train/loss_vlb_step=0.000166, train/loss_step=0.0474, global_step=3515.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 601/5971 [09:40<1:26:22,  1.04it/s, loss=0.156, v_num=0, train/loss_simple_step=0.622, train/loss_vlb_step=0.00667, train/loss_step=0.622, global_step=3516.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  10%|█         | 602/5971 [09:41<1:26:21,  1.04it/s, loss=0.156, v_num=0, train/loss_simple_step=0.622, train/loss_vlb_step=0.00667, train/loss_step=0.622, global_step=3516.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 602/5971 [09:41<1:26:21,  1.04it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00354, train/loss_vlb_step=1.86e-5, train/loss_step=0.00354, global_step=3516.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 603/5971 [09:42<1:26:20,  1.04it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00354, train/loss_vlb_step=1.86e-5, train/loss_step=0.00354, global_step=3516.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 603/5971 [09:42<1:26:20,  1.04it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0229, train/loss_vlb_step=9.04e-5, train/loss_step=0.0229, global_step=3516.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  10%|█         | 604/5971 [09:45<1:26:32,  1.03it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0229, train/loss_vlb_step=9.04e-5, train/loss_step=0.0229, global_step=3516.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 604/5971 [09:45<1:26:32,  1.03it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0614, train/loss_vlb_step=0.000208, train/loss_step=0.0614, global_step=3516.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 605/5971 [09:46<1:26:33,  1.03it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0614, train/loss_vlb_step=0.000208, train/loss_step=0.0614, global_step=3516.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 605/5971 [09:46<1:26:33,  1.03it/s, loss=0.155, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000763, train/loss_step=0.216, global_step=3517.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  10%|█         | 606/5971 [09:47<1:26:32,  1.03it/s, loss=0.155, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000763, train/loss_step=0.216, global_step=3517.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 606/5971 [09:47<1:26:32,  1.03it/s, loss=0.182, v_num=0, train/loss_simple_step=0.557, train/loss_vlb_step=0.00866, train/loss_step=0.557, global_step=3517.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  10%|█         | 607/5971 [09:48<1:26:32,  1.03it/s, loss=0.182, v_num=0, train/loss_simple_step=0.557, train/loss_vlb_step=0.00866, train/loss_step=0.557, global_step=3517.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 607/5971 [09:48<1:26:32,  1.03it/s, loss=0.182, v_num=0, train/loss_simple_step=0.00236, train/loss_vlb_step=1.28e-5, train/loss_step=0.00236, global_step=3517.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 608/5971 [09:51<1:26:45,  1.03it/s, loss=0.182, v_num=0, train/loss_simple_step=0.00236, train/loss_vlb_step=1.28e-5, train/loss_step=0.00236, global_step=3517.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 608/5971 [09:51<1:26:45,  1.03it/s, loss=0.18, v_num=0, train/loss_simple_step=0.00423, train/loss_vlb_step=2.14e-5, train/loss_step=0.00423, global_step=3517.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  10%|█         | 609/5971 [09:52<1:26:44,  1.03it/s, loss=0.18, v_num=0, train/loss_simple_step=0.00423, train/loss_vlb_step=2.14e-5, train/loss_step=0.00423, global_step=3517.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 609/5971 [09:52<1:26:44,  1.03it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0807, train/loss_vlb_step=0.000266, train/loss_step=0.0807, global_step=3518.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 610/5971 [09:53<1:26:43,  1.03it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0807, train/loss_vlb_step=0.000266, train/loss_step=0.0807, global_step=3518.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 610/5971 [09:53<1:26:43,  1.03it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0327, train/loss_vlb_step=0.000121, train/loss_step=0.0327, global_step=3518.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 611/5971 [09:53<1:26:42,  1.03it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0327, train/loss_vlb_step=0.000121, train/loss_step=0.0327, global_step=3518.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 611/5971 [09:54<1:26:42,  1.03it/s, loss=0.16, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000818, train/loss_step=0.233, global_step=3518.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  10%|█         | 612/5971 [09:56<1:26:54,  1.03it/s, loss=0.16, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000818, train/loss_step=0.233, global_step=3518.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 612/5971 [09:56<1:26:54,  1.03it/s, loss=0.151, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000519, train/loss_step=0.153, global_step=3518.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 613/5971 [09:57<1:26:56,  1.03it/s, loss=0.151, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000519, train/loss_step=0.153, global_step=3518.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 613/5971 [09:57<1:26:56,  1.03it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0529, train/loss_vlb_step=0.000187, train/loss_step=0.0529, global_step=3519.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 614/5971 [09:58<1:26:55,  1.03it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0529, train/loss_vlb_step=0.000187, train/loss_step=0.0529, global_step=3519.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 614/5971 [09:58<1:26:55,  1.03it/s, loss=0.166, v_num=0, train/loss_simple_step=0.331, train/loss_vlb_step=0.00159, train/loss_step=0.331, global_step=3519.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  10%|█         | 615/5971 [09:59<1:26:54,  1.03it/s, loss=0.166, v_num=0, train/loss_simple_step=0.331, train/loss_vlb_step=0.00159, train/loss_step=0.331, global_step=3519.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 615/5971 [09:59<1:26:54,  1.03it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0465, train/loss_vlb_step=0.000167, train/loss_step=0.0465, global_step=3519.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 616/5971 [10:02<1:27:06,  1.02it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0465, train/loss_vlb_step=0.000167, train/loss_step=0.0465, global_step=3519.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 616/5971 [10:02<1:27:06,  1.02it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00555, train/loss_vlb_step=2.72e-5, train/loss_step=0.00555, global_step=3519.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 617/5971 [10:03<1:27:06,  1.02it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00555, train/loss_vlb_step=2.72e-5, train/loss_step=0.00555, global_step=3519.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 617/5971 [10:03<1:27:06,  1.02it/s, loss=0.169, v_num=0, train/loss_simple_step=0.582, train/loss_vlb_step=0.00701, train/loss_step=0.582, global_step=3520.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  10%|█         | 618/5971 [10:04<1:27:06,  1.02it/s, loss=0.169, v_num=0, train/loss_simple_step=0.582, train/loss_vlb_step=0.00701, train/loss_step=0.582, global_step=3520.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 618/5971 [10:04<1:27:06,  1.02it/s, loss=0.171, v_num=0, train/loss_simple_step=0.293, train/loss_vlb_step=0.0011, train/loss_step=0.293, global_step=3520.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  10%|█         | 619/5971 [10:05<1:27:05,  1.02it/s, loss=0.171, v_num=0, train/loss_simple_step=0.293, train/loss_vlb_step=0.0011, train/loss_step=0.293, global_step=3520.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 619/5971 [10:05<1:27:05,  1.02it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0303, train/loss_vlb_step=0.000116, train/loss_step=0.0303, global_step=3520.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 620/5971 [10:07<1:27:14,  1.02it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0303, train/loss_vlb_step=0.000116, train/loss_step=0.0303, global_step=3520.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 620/5971 [10:07<1:27:14,  1.02it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0788, train/loss_vlb_step=0.000261, train/loss_step=0.0788, global_step=3520.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  10%|█         | 621/5971 [10:08<1:27:13,  1.02it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0788, train/loss_vlb_step=0.000261, train/loss_step=0.0788, global_step=3520.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 621/5971 [10:08<1:27:13,  1.02it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0639, train/loss_vlb_step=0.000228, train/loss_step=0.0639, global_step=3521.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 622/5971 [10:09<1:27:12,  1.02it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0639, train/loss_vlb_step=0.000228, train/loss_step=0.0639, global_step=3521.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 622/5971 [10:09<1:27:12,  1.02it/s, loss=0.156, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00105, train/loss_step=0.267, global_step=3521.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  10%|█         | 623/5971 [10:10<1:27:10,  1.02it/s, loss=0.156, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00105, train/loss_step=0.267, global_step=3521.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 623/5971 [10:10<1:27:10,  1.02it/s, loss=0.17, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00173, train/loss_step=0.308, global_step=3521.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  10%|█         | 624/5971 [10:12<1:27:24,  1.02it/s, loss=0.17, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00173, train/loss_step=0.308, global_step=3521.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 624/5971 [10:12<1:27:24,  1.02it/s, loss=0.174, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.00046, train/loss_step=0.137, global_step=3521.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 625/5971 [10:14<1:27:23,  1.02it/s, loss=0.174, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.00046, train/loss_step=0.137, global_step=3521.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 625/5971 [10:14<1:27:23,  1.02it/s, loss=0.174, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000789, train/loss_step=0.222, global_step=3522.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 626/5971 [10:14<1:27:22,  1.02it/s, loss=0.174, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000789, train/loss_step=0.222, global_step=3522.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  10%|█         | 626/5971 [10:14<1:27:22,  1.02it/s, loss=0.165, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00261, train/loss_step=0.376, global_step=3522.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  11%|█         | 627/5971 [10:15<1:27:20,  1.02it/s, loss=0.165, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00261, train/loss_step=0.376, global_step=3522.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  11%|█         | 627/5971 [10:15<1:27:20,  1.02it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0484, train/loss_vlb_step=0.000164, train/loss_step=0.0484, global_step=3522.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  11%|█         | 628/5971 [10:18<1:27:36,  1.02it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0484, train/loss_vlb_step=0.000164, train/loss_step=0.0484, global_step=3522.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  11%|█         | 628/5971 [10:18<1:27:36,  1.02it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00401, train/loss_vlb_step=2.09e-5, train/loss_step=0.00401, global_step=3522.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  11%|█         | 629/5971 [10:19<1:27:36,  1.02it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00401, train/loss_vlb_step=2.09e-5, train/loss_step=0.00401, global_step=3522.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  11%|█         | 629/5971 [10:19<1:27:36,  1.02it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0086, train/loss_vlb_step=4.12e-5, train/loss_step=0.0086, global_step=3523.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  11%|█         | 630/5971 [10:20<1:27:36,  1.02it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0086, train/loss_vlb_step=4.12e-5, train/loss_step=0.0086, global_step=3523.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  11%|█         | 630/5971 [10:20<1:27:36,  1.02it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0322, train/loss_vlb_step=0.000116, train/loss_step=0.0322, global_step=3523.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  11%|█         | 631/5971 [10:21<1:27:34,  1.02it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0322, train/loss_vlb_step=0.000116, train/loss_step=0.0322, global_step=3523.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  11%|█         | 631/5971 [10:21<1:27:34,  1.02it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0378, train/loss_vlb_step=0.000145, train/loss_step=0.0378, global_step=3523.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  11%|█         | 632/5971 [10:25<1:27:51,  1.01it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0378, train/loss_vlb_step=0.000145, train/loss_step=0.0378, global_step=3523.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  11%|█         | 632/5971 [10:25<1:27:51,  1.01it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0263, train/loss_vlb_step=9.62e-5, train/loss_step=0.0263, global_step=3523.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  11%|█         | 633/5971 [10:26<1:27:50,  1.01it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0263, train/loss_vlb_step=9.62e-5, train/loss_step=0.0263, global_step=3523.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  11%|█         | 633/5971 [10:26<1:27:50,  1.01it/s, loss=0.159, v_num=0, train/loss_simple_step=0.281, train/loss_vlb_step=0.00162, train/loss_step=0.281, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  11%|█         | 634/5971 [10:26<1:27:49,  1.01it/s, loss=0.159, v_num=0, train/loss_simple_step=0.281, train/loss_vlb_step=0.00162, train/loss_step=0.281, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  11%|█         | 634/5971 [10:26<1:27:49,  1.01it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0279, train/loss_vlb_step=9.97e-5, train/loss_step=0.0279, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  11%|█         | 635/5971 [10:27<1:27:47,  1.01it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0279, train/loss_vlb_step=9.97e-5, train/loss_step=0.0279, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  11%|█         | 635/5971 [10:27<1:27:47,  1.01it/s, loss=0.16, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00229, train/loss_step=0.375, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  11%|█         | 636/5971 [10:31<1:28:06,  1.01it/s, loss=0.16, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00229, train/loss_step=0.375, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  11%|█         | 636/5971 [10:31<1:28:06,  1.01it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating: 0it [00:00, ?it/s]
Validating:   0%|          | 0/167 [00:00<?, ?it/s]
Validating:   1%|          | 1/167 [00:00<01:24,  1.96it/s]
Epoch 6:  11%|█         | 638/5971 [10:31<1:27:52,  1.01it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   1%|          | 2/167 [00:00<00:45,  3.65it/s]
Epoch 6:  11%|█         | 640/5971 [10:31<1:27:35,  1.01it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.38it/s]
Epoch 6:  11%|█         | 643/5971 [10:32<1:27:10,  1.02it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   4%|▍         | 7/167 [00:01<00:21,  7.29it/s]
Validating:   5%|▌         | 9/167 [00:01<00:17,  9.17it/s]
Epoch 6:  11%|█         | 646/5971 [10:32<1:26:45,  1.02it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   7%|▋         | 12/167 [00:01<00:12, 12.54it/s]
Epoch 6:  11%|█         | 649/5971 [10:32<1:26:19,  1.03it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   9%|▉         | 15/167 [00:01<00:09, 15.82it/s]
Epoch 6:  11%|█         | 652/5971 [10:32<1:25:53,  1.03it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  11%|█         | 18/167 [00:01<00:08, 18.02it/s]
Epoch 6:  11%|█         | 655/5971 [10:32<1:25:28,  1.04it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  13%|█▎        | 21/167 [00:01<00:07, 20.18it/s][A
Epoch 6:  11%|█         | 658/5971 [10:32<1:25:02,  1.04it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  14%|█▍        | 24/167 [00:01<00:06, 20.77it/s][A
Epoch 6:  11%|█         | 661/5971 [10:33<1:24:37,  1.05it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  16%|█▌        | 27/167 [00:01<00:06, 20.58it/s][A
Epoch 6:  11%|█         | 664/5971 [10:33<1:24:13,  1.05it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  18%|█▊        | 30/167 [00:02<00:06, 22.15it/s][A
Epoch 6:  11%|█         | 667/5971 [10:33<1:23:48,  1.05it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  20%|█▉        | 33/167 [00:02<00:05, 22.65it/s][A
Epoch 6:  11%|█         | 670/5971 [10:33<1:23:24,  1.06it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  22%|██▏       | 36/167 [00:02<00:05, 22.86it/s][A
Epoch 6:  11%|█▏        | 673/5971 [10:33<1:23:00,  1.06it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  23%|██▎       | 39/167 [00:02<00:05, 24.06it/s][A
Epoch 6:  11%|█▏        | 676/5971 [10:33<1:22:36,  1.07it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  25%|██▌       | 42/167 [00:02<00:05, 24.34it/s][A
Epoch 6:  11%|█▏        | 679/5971 [10:33<1:22:12,  1.07it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 25.41it/s][A
Epoch 6:  11%|█▏        | 682/5971 [10:33<1:21:48,  1.08it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 24.72it/s][A
Epoch 6:  11%|█▏        | 685/5971 [10:34<1:21:25,  1.08it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  31%|███       | 51/167 [00:02<00:04, 25.02it/s][A
Epoch 6:  12%|█▏        | 688/5971 [10:34<1:21:02,  1.09it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  32%|███▏      | 54/167 [00:03<00:04, 26.12it/s][A
Epoch 6:  12%|█▏        | 691/5971 [10:34<1:20:39,  1.09it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  34%|███▍      | 57/167 [00:03<00:04, 26.43it/s][A
Epoch 6:  12%|█▏        | 694/5971 [10:34<1:20:16,  1.10it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  36%|███▌      | 60/167 [00:03<00:04, 24.18it/s][A
Epoch 6:  12%|█▏        | 697/5971 [10:34<1:19:54,  1.10it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  38%|███▊      | 63/167 [00:03<00:04, 23.64it/s][A
Epoch 6:  12%|█▏        | 700/5971 [10:34<1:19:32,  1.10it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  40%|███▉      | 66/167 [00:03<00:04, 23.76it/s][A
Epoch 6:  12%|█▏        | 703/5971 [10:34<1:19:09,  1.11it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  41%|████▏     | 69/167 [00:03<00:04, 24.05it/s][A
Epoch 6:  12%|█▏        | 706/5971 [10:34<1:18:48,  1.11it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 24.56it/s][A
Epoch 6:  12%|█▏        | 709/5971 [10:35<1:18:26,  1.12it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 23.48it/s][A
Epoch 6:  12%|█▏        | 712/5971 [10:35<1:18:04,  1.12it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  47%|████▋     | 78/167 [00:04<00:03, 23.74it/s][A
Epoch 6:  12%|█▏        | 715/5971 [10:35<1:17:43,  1.13it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  49%|████▊     | 81/167 [00:04<00:03, 24.83it/s][A
Epoch 6:  12%|█▏        | 718/5971 [10:35<1:17:22,  1.13it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  50%|█████     | 84/167 [00:04<00:03, 26.13it/s][A
Epoch 6:  12%|█▏        | 721/5971 [10:35<1:17:00,  1.14it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  52%|█████▏    | 87/167 [00:04<00:03, 25.89it/s][A
Epoch 6:  12%|█▏        | 724/5971 [10:35<1:16:39,  1.14it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  54%|█████▍    | 90/167 [00:04<00:02, 25.90it/s][A
Epoch 6:  12%|█▏        | 727/5971 [10:35<1:16:19,  1.15it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 26.08it/s][A
Epoch 6:  12%|█▏        | 730/5971 [10:35<1:15:58,  1.15it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 25.83it/s][A
Epoch 6:  12%|█▏        | 733/5971 [10:35<1:15:38,  1.15it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 25.58it/s][A
Epoch 6:  12%|█▏        | 736/5971 [10:36<1:15:18,  1.16it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  61%|██████    | 102/167 [00:04<00:02, 25.23it/s][A
Epoch 6:  12%|█▏        | 739/5971 [10:36<1:14:58,  1.16it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  63%|██████▎   | 105/167 [00:05<00:02, 25.69it/s][A
Epoch 6:  12%|█▏        | 742/5971 [10:36<1:14:38,  1.17it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  65%|██████▍   | 108/167 [00:05<00:02, 23.81it/s][A
Epoch 6:  12%|█▏        | 745/5971 [10:36<1:14:18,  1.17it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  66%|██████▋   | 111/167 [00:05<00:02, 24.21it/s][A
Epoch 6:  13%|█▎        | 748/5971 [10:36<1:13:58,  1.18it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  68%|██████▊   | 114/167 [00:05<00:02, 23.87it/s][A
Epoch 6:  13%|█▎        | 751/5971 [10:36<1:13:39,  1.18it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  70%|███████   | 117/167 [00:05<00:02, 24.32it/s][A
Epoch 6:  13%|█▎        | 754/5971 [10:36<1:13:20,  1.19it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 23.69it/s][A
Epoch 6:  13%|█▎        | 757/5971 [10:36<1:13:01,  1.19it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 22.89it/s][A
Epoch 6:  13%|█▎        | 760/5971 [10:37<1:12:42,  1.19it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 23.61it/s][A
Epoch 6:  13%|█▎        | 763/5971 [10:37<1:12:23,  1.20it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  77%|███████▋  | 129/167 [00:06<00:01, 24.65it/s][A
Epoch 6:  13%|█▎        | 766/5971 [10:37<1:12:04,  1.20it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  79%|███████▉  | 132/167 [00:06<00:01, 23.09it/s][A
Epoch 6:  13%|█▎        | 769/5971 [10:37<1:11:46,  1.21it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  81%|████████  | 135/167 [00:06<00:01, 24.18it/s][A
Epoch 6:  13%|█▎        | 772/5971 [10:37<1:11:28,  1.21it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  83%|████████▎ | 138/167 [00:06<00:01, 24.28it/s][A
Epoch 6:  13%|█▎        | 775/5971 [10:37<1:11:09,  1.22it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  84%|████████▍ | 141/167 [00:06<00:01, 25.36it/s][A
Epoch 6:  13%|█▎        | 778/5971 [10:37<1:10:51,  1.22it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 24.69it/s][A
Epoch 6:  13%|█▎        | 781/5971 [10:37<1:10:33,  1.23it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 24.31it/s][A
Epoch 6:  13%|█▎        | 784/5971 [10:38<1:10:16,  1.23it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 24.80it/s][A
Epoch 6:  13%|█▎        | 787/5971 [10:38<1:09:58,  1.23it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  92%|█████████▏| 153/167 [00:07<00:00, 24.77it/s][A
Epoch 6:  13%|█▎        | 790/5971 [10:38<1:09:40,  1.24it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  93%|█████████▎| 156/167 [00:07<00:00, 25.71it/s][A
Epoch 6:  13%|█▎        | 793/5971 [10:38<1:09:23,  1.24it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  95%|█████████▌| 159/167 [00:07<00:00, 26.25it/s][A
Epoch 6:  13%|█▎        | 796/5971 [10:38<1:09:05,  1.25it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  97%|█████████▋| 162/167 [00:07<00:00, 25.59it/s][A
Epoch 6:  13%|█▎        | 799/5971 [10:38<1:08:48,  1.25it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  99%|█████████▉| 165/167 [00:07<00:00, 23.85it/s][A
Epoch 6:  13%|█▎        | 802/5971 [10:38<1:08:31,  1.26it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  13%|█▎        | 804/5971 [10:39<1:08:22,  1.26it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Epoch 6:  13%|█▎        | 805/5971 [10:40<1:08:22,  1.26it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.00012, train/loss_step=0.0325, global_step=3524.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  13%|█▎        | 805/5971 [10:40<1:08:22,  1.26it/s, loss=0.146, v_num=0, train/loss_simple_step=0.273, train/loss_vlb_step=0.00108, train/loss_step=0.273, global_step=3525.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  13%|█▎        | 806/5971 [10:40<1:08:22,  1.26it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0183, train/loss_vlb_step=7.74e-5, train/loss_step=0.0183, global_step=3525.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▎        | 807/5971 [10:41<1:08:22,  1.26it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0698, train/loss_vlb_step=0.000238, train/loss_step=0.0698, global_step=3525.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▎        | 808/5971 [10:46<1:08:47,  1.25it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0698, train/loss_vlb_step=0.000238, train/loss_step=0.0698, global_step=3525.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▎        | 808/5971 [10:46<1:08:47,  1.25it/s, loss=0.152, v_num=0, train/loss_simple_step=0.428, train/loss_vlb_step=0.00268, train/loss_step=0.428, global_step=3525.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  14%|█▎        | 809/5971 [10:47<1:08:48,  1.25it/s, loss=0.15, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.53e-5, train/loss_step=0.016, global_step=3526.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  14%|█▎        | 810/5971 [10:48<1:08:48,  1.25it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00924, train/loss_vlb_step=4.15e-5, train/loss_step=0.00924, global_step=3526.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▎        | 811/5971 [10:49<1:08:48,  1.25it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00924, train/loss_vlb_step=4.15e-5, train/loss_step=0.00924, global_step=3526.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▎        | 811/5971 [10:49<1:08:48,  1.25it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.31e-5, train/loss_step=0.00665, global_step=3526.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▎        | 812/5971 [10:52<1:09:00,  1.25it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0609, train/loss_vlb_step=0.000201, train/loss_step=0.0609, global_step=3526.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  14%|█▎        | 813/5971 [10:53<1:09:00,  1.25it/s, loss=0.107, v_num=0, train/loss_simple_step=0.00891, train/loss_vlb_step=4.12e-5, train/loss_step=0.00891, global_step=3527.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▎        | 814/5971 [10:54<1:09:00,  1.25it/s, loss=0.107, v_num=0, train/loss_simple_step=0.00891, train/loss_vlb_step=4.12e-5, train/loss_step=0.00891, global_step=3527.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▎        | 814/5971 [10:54<1:09:00,  1.25it/s, loss=0.0889, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.19e-5, train/loss_step=0.0121, global_step=3527.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  14%|█▎        | 815/5971 [10:55<1:09:01,  1.25it/s, loss=0.102, v_num=0, train/loss_simple_step=0.301, train/loss_vlb_step=0.00142, train/loss_step=0.301, global_step=3527.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  14%|█▎        | 816/5971 [10:57<1:09:10,  1.24it/s, loss=0.107, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000357, train/loss_step=0.109, global_step=3527.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▎        | 817/5971 [10:58<1:09:10,  1.24it/s, loss=0.107, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000357, train/loss_step=0.109, global_step=3527.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▎        | 817/5971 [10:58<1:09:10,  1.24it/s, loss=0.119, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.000959, train/loss_step=0.249, global_step=3528.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▎        | 818/5971 [10:59<1:09:10,  1.24it/s, loss=0.123, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000355, train/loss_step=0.108, global_step=3528.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▎        | 819/5971 [11:00<1:09:10,  1.24it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00501, train/loss_vlb_step=2.57e-5, train/loss_step=0.00501, global_step=3528.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▎        | 820/5971 [11:02<1:09:19,  1.24it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00501, train/loss_vlb_step=2.57e-5, train/loss_step=0.00501, global_step=3528.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▎        | 820/5971 [11:02<1:09:19,  1.24it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00668, train/loss_vlb_step=3.07e-5, train/loss_step=0.00668, global_step=3528.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  14%|█▎        | 821/5971 [11:03<1:09:19,  1.24it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00135, train/loss_vlb_step=8.15e-6, train/loss_step=0.00135, global_step=3529.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 822/5971 [11:04<1:09:19,  1.24it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0201, train/loss_vlb_step=8.32e-5, train/loss_step=0.0201, global_step=3529.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  14%|█▍        | 823/5971 [11:05<1:09:19,  1.24it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0201, train/loss_vlb_step=8.32e-5, train/loss_step=0.0201, global_step=3529.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 823/5971 [11:05<1:09:19,  1.24it/s, loss=0.0914, v_num=0, train/loss_simple_step=0.092, train/loss_vlb_step=0.000307, train/loss_step=0.092, global_step=3529.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 824/5971 [11:08<1:09:29,  1.23it/s, loss=0.102, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.000991, train/loss_step=0.247, global_step=3529.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  14%|█▍        | 825/5971 [11:09<1:09:29,  1.23it/s, loss=0.0923, v_num=0, train/loss_simple_step=0.0767, train/loss_vlb_step=0.000255, train/loss_step=0.0767, global_step=3530.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 826/5971 [11:10<1:09:28,  1.23it/s, loss=0.0923, v_num=0, train/loss_simple_step=0.0767, train/loss_vlb_step=0.000255, train/loss_step=0.0767, global_step=3530.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 826/5971 [11:10<1:09:28,  1.23it/s, loss=0.112, v_num=0, train/loss_simple_step=0.415, train/loss_vlb_step=0.0022, train/loss_step=0.415, global_step=3530.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:  14%|█▍        | 827/5971 [11:11<1:09:29,  1.23it/s, loss=0.117, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000555, train/loss_step=0.161, global_step=3530.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 828/5971 [11:13<1:09:38,  1.23it/s, loss=0.118, v_num=0, train/loss_simple_step=0.445, train/loss_vlb_step=0.0034, train/loss_step=0.445, global_step=3530.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  14%|█▍        | 829/5971 [11:14<1:09:38,  1.23it/s, loss=0.118, v_num=0, train/loss_simple_step=0.445, train/loss_vlb_step=0.0034, train/loss_step=0.445, global_step=3530.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 829/5971 [11:14<1:09:38,  1.23it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0207, train/loss_vlb_step=8.41e-5, train/loss_step=0.0207, global_step=3531.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 830/5971 [11:15<1:09:38,  1.23it/s, loss=0.125, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000515, train/loss_step=0.149, global_step=3531.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  14%|█▍        | 831/5971 [11:16<1:09:38,  1.23it/s, loss=0.145, v_num=0, train/loss_simple_step=0.418, train/loss_vlb_step=0.00226, train/loss_step=0.418, global_step=3531.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  14%|█▍        | 832/5971 [11:18<1:09:48,  1.23it/s, loss=0.145, v_num=0, train/loss_simple_step=0.418, train/loss_vlb_step=0.00226, train/loss_step=0.418, global_step=3531.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 832/5971 [11:18<1:09:48,  1.23it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0475, train/loss_vlb_step=0.000169, train/loss_step=0.0475, global_step=3531.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 833/5971 [11:19<1:09:48,  1.23it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0427, train/loss_vlb_step=0.000153, train/loss_step=0.0427, global_step=3532.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 834/5971 [11:20<1:09:48,  1.23it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00455, train/loss_vlb_step=2.33e-5, train/loss_step=0.00455, global_step=3532.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 835/5971 [11:21<1:09:47,  1.23it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00455, train/loss_vlb_step=2.33e-5, train/loss_step=0.00455, global_step=3532.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 835/5971 [11:21<1:09:47,  1.23it/s, loss=0.14, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000659, train/loss_step=0.185, global_step=3532.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  14%|█▍        | 836/5971 [11:24<1:09:56,  1.22it/s, loss=0.141, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000436, train/loss_step=0.133, global_step=3532.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 837/5971 [11:25<1:09:56,  1.22it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0339, train/loss_vlb_step=0.000114, train/loss_step=0.0339, global_step=3533.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 838/5971 [11:26<1:09:58,  1.22it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0339, train/loss_vlb_step=0.000114, train/loss_step=0.0339, global_step=3533.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 838/5971 [11:26<1:09:58,  1.22it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00891, train/loss_vlb_step=3.96e-5, train/loss_step=0.00891, global_step=3533.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 839/5971 [11:27<1:09:58,  1.22it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0281, train/loss_vlb_step=0.000112, train/loss_step=0.0281, global_step=3533.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  14%|█▍        | 840/5971 [11:29<1:10:06,  1.22it/s, loss=0.128, v_num=0, train/loss_simple_step=0.023, train/loss_vlb_step=9.36e-5, train/loss_step=0.023, global_step=3533.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  14%|█▍        | 841/5971 [11:30<1:10:06,  1.22it/s, loss=0.128, v_num=0, train/loss_simple_step=0.023, train/loss_vlb_step=9.36e-5, train/loss_step=0.023, global_step=3533.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 841/5971 [11:30<1:10:06,  1.22it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0251, train/loss_vlb_step=9.88e-5, train/loss_step=0.0251, global_step=3534.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 842/5971 [11:31<1:10:07,  1.22it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0477, train/loss_vlb_step=0.00017, train/loss_step=0.0477, global_step=3534.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  14%|█▍        | 843/5971 [11:32<1:10:07,  1.22it/s, loss=0.146, v_num=0, train/loss_simple_step=0.409, train/loss_vlb_step=0.00211, train/loss_step=0.409, global_step=3534.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  14%|█▍        | 844/5971 [11:35<1:10:17,  1.22it/s, loss=0.146, v_num=0, train/loss_simple_step=0.409, train/loss_vlb_step=0.00211, train/loss_step=0.409, global_step=3534.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 844/5971 [11:35<1:10:17,  1.22it/s, loss=0.145, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.00091, train/loss_step=0.224, global_step=3534.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 845/5971 [11:36<1:10:19,  1.21it/s, loss=0.153, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.000863, train/loss_step=0.231, global_step=3535.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 846/5971 [11:37<1:10:18,  1.21it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0501, train/loss_vlb_step=0.000183, train/loss_step=0.0501, global_step=3535.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 847/5971 [11:38<1:10:19,  1.21it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0501, train/loss_vlb_step=0.000183, train/loss_step=0.0501, global_step=3535.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 847/5971 [11:38<1:10:19,  1.21it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00234, train/loss_vlb_step=1.33e-5, train/loss_step=0.00234, global_step=3535.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 848/5971 [11:41<1:10:30,  1.21it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0193, train/loss_vlb_step=8.15e-5, train/loss_step=0.0193, global_step=3535.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  14%|█▍        | 849/5971 [11:42<1:10:31,  1.21it/s, loss=0.123, v_num=0, train/loss_simple_step=0.384, train/loss_vlb_step=0.00244, train/loss_step=0.384, global_step=3536.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  14%|█▍        | 850/5971 [11:43<1:10:30,  1.21it/s, loss=0.123, v_num=0, train/loss_simple_step=0.384, train/loss_vlb_step=0.00244, train/loss_step=0.384, global_step=3536.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 850/5971 [11:43<1:10:30,  1.21it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0018, train/loss_vlb_step=1.09e-5, train/loss_step=0.0018, global_step=3536.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 851/5971 [11:44<1:10:30,  1.21it/s, loss=0.111, v_num=0, train/loss_simple_step=0.325, train/loss_vlb_step=0.00144, train/loss_step=0.325, global_step=3536.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  14%|█▍        | 852/5971 [11:46<1:10:40,  1.21it/s, loss=0.114, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000347, train/loss_step=0.105, global_step=3536.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 853/5971 [11:47<1:10:40,  1.21it/s, loss=0.114, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000347, train/loss_step=0.105, global_step=3536.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 853/5971 [11:47<1:10:40,  1.21it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0256, train/loss_vlb_step=9.34e-5, train/loss_step=0.0256, global_step=3537.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 854/5971 [11:48<1:10:40,  1.21it/s, loss=0.125, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.00102, train/loss_step=0.247, global_step=3537.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  14%|█▍        | 855/5971 [11:49<1:10:41,  1.21it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0368, train/loss_vlb_step=0.000136, train/loss_step=0.0368, global_step=3537.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 856/5971 [11:52<1:10:51,  1.20it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0368, train/loss_vlb_step=0.000136, train/loss_step=0.0368, global_step=3537.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 856/5971 [11:52<1:10:51,  1.20it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0448, train/loss_vlb_step=0.000169, train/loss_step=0.0448, global_step=3537.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 857/5971 [11:53<1:10:51,  1.20it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0219, train/loss_vlb_step=8.52e-5, train/loss_step=0.0219, global_step=3538.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  14%|█▍        | 858/5971 [11:54<1:10:52,  1.20it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0472, train/loss_vlb_step=0.000171, train/loss_step=0.0472, global_step=3538.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 859/5971 [11:55<1:10:52,  1.20it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0472, train/loss_vlb_step=0.000171, train/loss_step=0.0472, global_step=3538.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 859/5971 [11:55<1:10:52,  1.20it/s, loss=0.127, v_num=0, train/loss_simple_step=0.263, train/loss_vlb_step=0.00114, train/loss_step=0.263, global_step=3538.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  14%|█▍        | 860/5971 [11:57<1:11:00,  1.20it/s, loss=0.136, v_num=0, train/loss_simple_step=0.212, train/loss_vlb_step=0.000765, train/loss_step=0.212, global_step=3538.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 861/5971 [11:58<1:11:00,  1.20it/s, loss=0.155, v_num=0, train/loss_simple_step=0.394, train/loss_vlb_step=0.00179, train/loss_step=0.394, global_step=3539.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  14%|█▍        | 862/5971 [11:59<1:11:00,  1.20it/s, loss=0.155, v_num=0, train/loss_simple_step=0.394, train/loss_vlb_step=0.00179, train/loss_step=0.394, global_step=3539.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 862/5971 [11:59<1:11:00,  1.20it/s, loss=0.171, v_num=0, train/loss_simple_step=0.379, train/loss_vlb_step=0.00274, train/loss_step=0.379, global_step=3539.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 863/5971 [12:00<1:11:00,  1.20it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0507, train/loss_vlb_step=0.000175, train/loss_step=0.0507, global_step=3539.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 864/5971 [12:02<1:11:08,  1.20it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0266, train/loss_vlb_step=0.000105, train/loss_step=0.0266, global_step=3539.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 865/5971 [12:03<1:11:08,  1.20it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0266, train/loss_vlb_step=0.000105, train/loss_step=0.0266, global_step=3539.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  14%|█▍        | 865/5971 [12:03<1:11:08,  1.20it/s, loss=0.137, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000363, train/loss_step=0.110, global_step=3540.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  15%|█▍        | 866/5971 [12:04<1:11:08,  1.20it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0712, train/loss_vlb_step=0.000238, train/loss_step=0.0712, global_step=3540.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▍        | 867/5971 [12:05<1:11:08,  1.20it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0735, train/loss_vlb_step=0.000243, train/loss_step=0.0735, global_step=3540.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▍        | 868/5971 [12:08<1:11:18,  1.19it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0735, train/loss_vlb_step=0.000243, train/loss_step=0.0735, global_step=3540.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▍        | 868/5971 [12:08<1:11:18,  1.19it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.5e-5, train/loss_step=0.00966, global_step=3540.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▍        | 869/5971 [12:09<1:11:18,  1.19it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00486, train/loss_vlb_step=2.46e-5, train/loss_step=0.00486, global_step=3541.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▍        | 870/5971 [12:10<1:11:17,  1.19it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00715, train/loss_vlb_step=3.48e-5, train/loss_step=0.00715, global_step=3541.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▍        | 871/5971 [12:11<1:11:17,  1.19it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00715, train/loss_vlb_step=3.48e-5, train/loss_step=0.00715, global_step=3541.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▍        | 871/5971 [12:11<1:11:17,  1.19it/s, loss=0.13, v_num=0, train/loss_simple_step=0.468, train/loss_vlb_step=0.00283, train/loss_step=0.468, global_step=3541.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:  15%|█▍        | 872/5971 [12:13<1:11:24,  1.19it/s, loss=0.136, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000843, train/loss_step=0.227, global_step=3541.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▍        | 873/5971 [12:14<1:11:25,  1.19it/s, loss=0.161, v_num=0, train/loss_simple_step=0.517, train/loss_vlb_step=0.00429, train/loss_step=0.517, global_step=3542.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  15%|█▍        | 874/5971 [12:15<1:11:25,  1.19it/s, loss=0.161, v_num=0, train/loss_simple_step=0.517, train/loss_vlb_step=0.00429, train/loss_step=0.517, global_step=3542.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▍        | 874/5971 [12:15<1:11:25,  1.19it/s, loss=0.155, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000432, train/loss_step=0.131, global_step=3542.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▍        | 875/5971 [12:16<1:11:25,  1.19it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0129, train/loss_vlb_step=5.58e-5, train/loss_step=0.0129, global_step=3542.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▍        | 876/5971 [12:18<1:11:33,  1.19it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00515, train/loss_vlb_step=2.65e-5, train/loss_step=0.00515, global_step=3542.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▍        | 877/5971 [12:19<1:11:33,  1.19it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00515, train/loss_vlb_step=2.65e-5, train/loss_step=0.00515, global_step=3542.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▍        | 877/5971 [12:19<1:11:33,  1.19it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0573, train/loss_vlb_step=0.000196, train/loss_step=0.0573, global_step=3543.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  15%|█▍        | 878/5971 [12:20<1:11:33,  1.19it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0841, train/loss_vlb_step=0.000278, train/loss_step=0.0841, global_step=3543.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▍        | 879/5971 [12:21<1:11:33,  1.19it/s, loss=0.147, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000357, train/loss_step=0.106, global_step=3543.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  15%|█▍        | 880/5971 [12:24<1:11:42,  1.18it/s, loss=0.147, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000357, train/loss_step=0.106, global_step=3543.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▍        | 880/5971 [12:24<1:11:42,  1.18it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0337, train/loss_vlb_step=0.000127, train/loss_step=0.0337, global_step=3543.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▍        | 881/5971 [12:25<1:11:43,  1.18it/s, loss=0.128, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000686, train/loss_step=0.194, global_step=3544.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  15%|█▍        | 882/5971 [12:26<1:11:42,  1.18it/s, loss=0.125, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00128, train/loss_step=0.318, global_step=3544.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  15%|█▍        | 883/5971 [12:27<1:11:42,  1.18it/s, loss=0.125, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00128, train/loss_step=0.318, global_step=3544.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▍        | 883/5971 [12:27<1:11:42,  1.18it/s, loss=0.137, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00109, train/loss_step=0.287, global_step=3544.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▍        | 884/5971 [12:29<1:11:50,  1.18it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00209, train/loss_vlb_step=1.19e-5, train/loss_step=0.00209, global_step=3544.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▍        | 885/5971 [12:30<1:11:49,  1.18it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0538, train/loss_vlb_step=0.000192, train/loss_step=0.0538, global_step=3545.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  15%|█▍        | 886/5971 [12:31<1:11:49,  1.18it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0538, train/loss_vlb_step=0.000192, train/loss_step=0.0538, global_step=3545.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▍        | 886/5971 [12:31<1:11:49,  1.18it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00552, train/loss_vlb_step=2.71e-5, train/loss_step=0.00552, global_step=3545.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▍        | 887/5971 [12:32<1:11:49,  1.18it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0414, train/loss_vlb_step=0.00015, train/loss_step=0.0414, global_step=3545.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  15%|█▍        | 888/5971 [12:35<1:11:56,  1.18it/s, loss=0.146, v_num=0, train/loss_simple_step=0.357, train/loss_vlb_step=0.00201, train/loss_step=0.357, global_step=3545.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  15%|█▍        | 889/5971 [12:35<1:11:56,  1.18it/s, loss=0.146, v_num=0, train/loss_simple_step=0.357, train/loss_vlb_step=0.00201, train/loss_step=0.357, global_step=3545.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▍        | 889/5971 [12:35<1:11:56,  1.18it/s, loss=0.159, v_num=0, train/loss_simple_step=0.275, train/loss_vlb_step=0.00108, train/loss_step=0.275, global_step=3546.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▍        | 890/5971 [12:36<1:11:56,  1.18it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0705, train/loss_vlb_step=0.000236, train/loss_step=0.0705, global_step=3546.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▍        | 891/5971 [12:37<1:11:56,  1.18it/s, loss=0.147, v_num=0, train/loss_simple_step=0.169, train/loss_vlb_step=0.000573, train/loss_step=0.169, global_step=3546.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  15%|█▍        | 892/5971 [12:40<1:12:06,  1.17it/s, loss=0.147, v_num=0, train/loss_simple_step=0.169, train/loss_vlb_step=0.000573, train/loss_step=0.169, global_step=3546.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▍        | 892/5971 [12:40<1:12:06,  1.17it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00369, train/loss_vlb_step=2.02e-5, train/loss_step=0.00369, global_step=3546.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▍        | 893/5971 [12:41<1:12:05,  1.17it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0592, train/loss_vlb_step=0.00021, train/loss_step=0.0592, global_step=3547.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  15%|█▍        | 894/5971 [12:42<1:12:05,  1.17it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0297, train/loss_vlb_step=0.000119, train/loss_step=0.0297, global_step=3547.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▍        | 895/5971 [12:43<1:12:05,  1.17it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0297, train/loss_vlb_step=0.000119, train/loss_step=0.0297, global_step=3547.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▍        | 895/5971 [12:43<1:12:05,  1.17it/s, loss=0.125, v_num=0, train/loss_simple_step=0.343, train/loss_vlb_step=0.002, train/loss_step=0.343, global_step=3547.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:  15%|█▌        | 896/5971 [12:45<1:12:12,  1.17it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.32e-5, train/loss_step=0.0102, global_step=3547.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▌        | 897/5971 [12:46<1:12:13,  1.17it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0436, train/loss_vlb_step=0.00016, train/loss_step=0.0436, global_step=3548.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▌        | 898/5971 [12:47<1:12:12,  1.17it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0436, train/loss_vlb_step=0.00016, train/loss_step=0.0436, global_step=3548.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▌        | 898/5971 [12:47<1:12:12,  1.17it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00206, train/loss_vlb_step=1.24e-5, train/loss_step=0.00206, global_step=3548.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▌        | 899/5971 [12:48<1:12:12,  1.17it/s, loss=0.122, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000471, train/loss_step=0.142, global_step=3548.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  15%|█▌        | 900/5971 [12:51<1:12:20,  1.17it/s, loss=0.126, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000364, train/loss_step=0.110, global_step=3548.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▌        | 901/5971 [12:52<1:12:20,  1.17it/s, loss=0.126, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000364, train/loss_step=0.110, global_step=3548.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▌        | 901/5971 [12:52<1:12:20,  1.17it/s, loss=0.13, v_num=0, train/loss_simple_step=0.285, train/loss_vlb_step=0.00108, train/loss_step=0.285, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  15%|█▌        | 902/5971 [12:53<1:12:19,  1.17it/s, loss=0.126, v_num=0, train/loss_simple_step=0.238, train/loss_vlb_step=0.000864, train/loss_step=0.238, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▌        | 903/5971 [12:54<1:12:19,  1.17it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0937, train/loss_vlb_step=0.000308, train/loss_step=0.0937, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▌        | 904/5971 [12:56<1:12:27,  1.17it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0937, train/loss_vlb_step=0.000308, train/loss_step=0.0937, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  15%|█▌        | 904/5971 [12:56<1:12:27,  1.17it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:13,  2.25it/s]

Validating:   1%|          | 2/167 [00:00<00:47,  3.50it/s]
Epoch 6:  15%|█▌        | 907/5971 [12:57<1:12:14,  1.17it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   3%|▎         | 5/167 [00:00<00:18,  8.78it/s]
Epoch 6:  15%|█▌        | 910/5971 [12:57<1:11:58,  1.17it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   4%|▍         | 7/167 [00:00<00:14, 11.32it/s]
Epoch 6:  15%|█▌        | 913/5971 [12:57<1:11:42,  1.18it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   6%|▌         | 10/167 [00:00<00:10, 14.74it/s]
Epoch 6:  15%|█▌        | 916/5971 [12:57<1:11:26,  1.18it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   8%|▊         | 13/167 [00:01<00:08, 17.29it/s]
Epoch 6:  15%|█▌        | 919/5971 [12:57<1:11:10,  1.18it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  10%|▉         | 16/167 [00:01<00:08, 18.84it/s]
Epoch 6:  15%|█▌        | 922/5971 [12:57<1:10:55,  1.19it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  11%|█▏        | 19/167 [00:01<00:07, 20.38it/s]
Epoch 6:  15%|█▌        | 925/5971 [12:57<1:10:39,  1.19it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  13%|█▎        | 22/167 [00:01<00:06, 20.90it/s]
Epoch 6:  16%|█▌        | 928/5971 [12:58<1:10:24,  1.19it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  15%|█▍        | 25/167 [00:01<00:06, 21.24it/s]
Epoch 6:  16%|█▌        | 931/5971 [12:58<1:10:08,  1.20it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  17%|█▋        | 28/167 [00:01<00:06, 21.47it/s]
Epoch 6:  16%|█▌        | 934/5971 [12:58<1:09:53,  1.20it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  19%|█▊        | 31/167 [00:01<00:06, 21.69it/s]
Epoch 6:  16%|█▌        | 937/5971 [12:58<1:09:38,  1.20it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  20%|██        | 34/167 [00:02<00:06, 21.74it/s]
Epoch 6:  16%|█▌        | 940/5971 [12:58<1:09:23,  1.21it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  22%|██▏       | 37/167 [00:02<00:05, 22.72it/s]
Epoch 6:  16%|█▌        | 943/5971 [12:58<1:09:08,  1.21it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  24%|██▍       | 40/167 [00:02<00:05, 22.49it/s]
Epoch 6:  16%|█▌        | 946/5971 [12:58<1:08:53,  1.22it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  26%|██▌       | 43/167 [00:02<00:05, 22.29it/s]
Epoch 6:  16%|█▌        | 949/5971 [12:59<1:08:38,  1.22it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  28%|██▊       | 46/167 [00:02<00:05, 23.28it/s]
Epoch 6:  16%|█▌        | 952/5971 [12:59<1:08:23,  1.22it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  29%|██▉       | 49/167 [00:02<00:04, 24.50it/s]
Epoch 6:  16%|█▌        | 955/5971 [12:59<1:08:08,  1.23it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  31%|███       | 52/167 [00:02<00:04, 24.43it/s]
Epoch 6:  16%|█▌        | 958/5971 [12:59<1:07:54,  1.23it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  33%|███▎      | 55/167 [00:02<00:05, 21.94it/s]
Epoch 6:  16%|█▌        | 961/5971 [12:59<1:07:39,  1.23it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  35%|███▍      | 58/167 [00:03<00:04, 22.26it/s]
Epoch 6:  16%|█▌        | 964/5971 [12:59<1:07:25,  1.24it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  37%|███▋      | 61/167 [00:03<00:04, 22.60it/s]
Epoch 6:  16%|█▌        | 967/5971 [12:59<1:07:11,  1.24it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  38%|███▊      | 64/167 [00:03<00:04, 23.48it/s]
Epoch 6:  16%|█▌        | 970/5971 [12:59<1:06:57,  1.24it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  40%|████      | 67/167 [00:03<00:04, 23.13it/s]
Epoch 6:  16%|█▋        | 973/5971 [13:00<1:06:42,  1.25it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  42%|████▏     | 70/167 [00:03<00:04, 24.12it/s]
Epoch 6:  16%|█▋        | 976/5971 [13:00<1:06:28,  1.25it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 23.91it/s]
Epoch 6:  16%|█▋        | 979/5971 [13:00<1:06:14,  1.26it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 24.94it/s]
Epoch 6:  16%|█▋        | 982/5971 [13:00<1:06:00,  1.26it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 24.69it/s]
Epoch 6:  16%|█▋        | 985/5971 [13:00<1:05:47,  1.26it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  49%|████▉     | 82/167 [00:04<00:03, 25.31it/s]
Epoch 6:  17%|█▋        | 988/5971 [13:00<1:05:33,  1.27it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  51%|█████     | 85/167 [00:04<00:03, 26.37it/s]
Epoch 6:  17%|█▋        | 991/5971 [13:00<1:05:19,  1.27it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  53%|█████▎    | 88/167 [00:04<00:03, 25.83it/s]
Epoch 6:  17%|█▋        | 994/5971 [13:00<1:05:06,  1.27it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  54%|█████▍    | 91/167 [00:04<00:02, 25.80it/s]
Epoch 6:  17%|█▋        | 997/5971 [13:01<1:04:52,  1.28it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 26.61it/s]
Epoch 6:  17%|█▋        | 1000/5971 [13:01<1:04:39,  1.28it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 26.30it/s][A
Epoch 6:  17%|█▋        | 1003/5971 [13:01<1:04:25,  1.29it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 26.49it/s][A
Epoch 6:  17%|█▋        | 1006/5971 [13:01<1:04:12,  1.29it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 25.57it/s][A
Epoch 6:  17%|█▋        | 1009/5971 [13:01<1:03:59,  1.29it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 25.71it/s][A
Epoch 6:  17%|█▋        | 1012/5971 [13:01<1:03:46,  1.30it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  65%|██████▌   | 109/167 [00:05<00:02, 25.66it/s][A
Epoch 6:  17%|█▋        | 1015/5971 [13:01<1:03:33,  1.30it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  67%|██████▋   | 112/167 [00:05<00:02, 26.69it/s][A
Epoch 6:  17%|█▋        | 1018/5971 [13:01<1:03:20,  1.30it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  69%|██████▉   | 115/167 [00:05<00:01, 26.62it/s][A
Epoch 6:  17%|█▋        | 1021/5971 [13:01<1:03:07,  1.31it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  71%|███████   | 118/167 [00:05<00:01, 25.38it/s][A
Epoch 6:  17%|█▋        | 1024/5971 [13:02<1:02:54,  1.31it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 24.61it/s][A
Epoch 6:  17%|█▋        | 1027/5971 [13:02<1:02:41,  1.31it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 25.73it/s][A
Epoch 6:  17%|█▋        | 1030/5971 [13:02<1:02:29,  1.32it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 26.01it/s][A
Epoch 6:  17%|█▋        | 1033/5971 [13:02<1:02:16,  1.32it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 24.84it/s][A
Epoch 6:  17%|█▋        | 1036/5971 [13:02<1:02:04,  1.33it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  80%|████████  | 134/167 [00:06<00:01, 24.61it/s][A
Epoch 6:  17%|█▋        | 1039/5971 [13:02<1:01:51,  1.33it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  82%|████████▏ | 137/167 [00:06<00:01, 24.59it/s][A
Epoch 6:  17%|█▋        | 1042/5971 [13:02<1:01:39,  1.33it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  84%|████████▍ | 140/167 [00:06<00:01, 24.31it/s][A
Epoch 6:  18%|█▊        | 1045/5971 [13:02<1:01:27,  1.34it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 24.59it/s][A
Epoch 6:  18%|█▊        | 1048/5971 [13:03<1:01:14,  1.34it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 24.52it/s][A
Epoch 6:  18%|█▊        | 1051/5971 [13:03<1:01:02,  1.34it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 24.88it/s][A
Epoch 6:  18%|█▊        | 1054/5971 [13:03<1:00:50,  1.35it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 24.81it/s][A
Epoch 6:  18%|█▊        | 1057/5971 [13:03<1:00:38,  1.35it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 25.46it/s][A
Epoch 6:  18%|█▊        | 1060/5971 [13:03<1:00:26,  1.35it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  95%|█████████▍| 158/167 [00:07<00:00, 24.46it/s][A
Epoch 6:  18%|█▊        | 1063/5971 [13:03<1:00:14,  1.36it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  96%|█████████▋| 161/167 [00:07<00:00, 25.64it/s][A
Epoch 6:  18%|█▊        | 1066/5971 [13:03<1:00:02,  1.36it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  98%|█████████▊| 164/167 [00:07<00:00, 25.27it/s][A
Epoch 6:  18%|█▊        | 1069/5971 [13:03<59:51,  1.37it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  18%|█▊        | 1072/5971 [13:04<59:40,  1.37it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Epoch 6:  18%|█▊        | 1073/5971 [13:05<59:41,  1.37it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=3549.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  18%|█▊        | 1073/5971 [13:05<59:41,  1.37it/s, loss=0.123, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.000567, train/loss_step=0.171, global_step=3550.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  18%|█▊        | 1074/5971 [13:06<59:41,  1.37it/s, loss=0.137, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.00122, train/loss_step=0.288, global_step=3550.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  18%|█▊        | 1075/5971 [13:07<59:42,  1.37it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.27e-5, train/loss_step=0.0124, global_step=3550.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  18%|█▊        | 1076/5971 [13:09<59:48,  1.36it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0258, train/loss_vlb_step=0.000102, train/loss_step=0.0258, global_step=3550.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  18%|█▊        | 1077/5971 [13:10<59:48,  1.36it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0258, train/loss_vlb_step=0.000102, train/loss_step=0.0258, global_step=3550.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  18%|█▊        | 1077/5971 [13:10<59:48,  1.36it/s, loss=0.105, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=8.94e-6, train/loss_step=0.00156, global_step=3551.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  18%|█▊        | 1078/5971 [13:11<59:48,  1.36it/s, loss=0.133, v_num=0, train/loss_simple_step=0.622, train/loss_vlb_step=0.00778, train/loss_step=0.622, global_step=3551.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  18%|█▊        | 1079/5971 [13:12<59:49,  1.36it/s, loss=0.128, v_num=0, train/loss_simple_step=0.071, train/loss_vlb_step=0.000238, train/loss_step=0.071, global_step=3551.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  18%|█▊        | 1080/5971 [13:15<59:57,  1.36it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00614, train/loss_vlb_step=2.93e-5, train/loss_step=0.00614, global_step=3551.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  18%|█▊        | 1081/5971 [13:16<59:58,  1.36it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00614, train/loss_vlb_step=2.93e-5, train/loss_step=0.00614, global_step=3551.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  18%|█▊        | 1081/5971 [13:16<59:58,  1.36it/s, loss=0.142, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00162, train/loss_step=0.348, global_step=3552.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  18%|█▊        | 1082/5971 [13:17<59:58,  1.36it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0578, train/loss_vlb_step=0.0002, train/loss_step=0.0578, global_step=3552.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  18%|█▊        | 1083/5971 [13:18<59:58,  1.36it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00959, train/loss_vlb_step=4.41e-5, train/loss_step=0.00959, global_step=3552.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  18%|█▊        | 1084/5971 [13:20<1:00:04,  1.36it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0358, train/loss_vlb_step=0.000129, train/loss_step=0.0358, global_step=3552.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  18%|█▊        | 1085/5971 [13:21<1:00:05,  1.36it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0358, train/loss_vlb_step=0.000129, train/loss_step=0.0358, global_step=3552.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  18%|█▊        | 1085/5971 [13:21<1:00:05,  1.36it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0723, train/loss_vlb_step=0.000247, train/loss_step=0.0723, global_step=3553.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  18%|█▊        | 1086/5971 [13:22<1:00:05,  1.35it/s, loss=0.137, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000467, train/loss_step=0.140, global_step=3553.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  18%|█▊        | 1087/5971 [13:23<1:00:05,  1.35it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00248, train/loss_vlb_step=1.4e-5, train/loss_step=0.00248, global_step=3553.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  18%|█▊        | 1088/5971 [13:25<1:00:11,  1.35it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0268, train/loss_vlb_step=0.000103, train/loss_step=0.0268, global_step=3553.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  18%|█▊        | 1089/5971 [13:26<1:00:12,  1.35it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0268, train/loss_vlb_step=0.000103, train/loss_step=0.0268, global_step=3553.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  18%|█▊        | 1089/5971 [13:26<1:00:12,  1.35it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.84e-5, train/loss_step=0.0108, global_step=3554.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  18%|█▊        | 1090/5971 [13:27<1:00:12,  1.35it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0554, train/loss_vlb_step=0.00019, train/loss_step=0.0554, global_step=3554.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  18%|█▊        | 1091/5971 [13:28<1:00:12,  1.35it/s, loss=0.122, v_num=0, train/loss_simple_step=0.484, train/loss_vlb_step=0.0042, train/loss_step=0.484, global_step=3554.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  18%|█▊        | 1092/5971 [13:30<1:00:18,  1.35it/s, loss=0.124, v_num=0, train/loss_simple_step=0.046, train/loss_vlb_step=0.000163, train/loss_step=0.046, global_step=3554.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  18%|█▊        | 1093/5971 [13:31<1:00:18,  1.35it/s, loss=0.124, v_num=0, train/loss_simple_step=0.046, train/loss_vlb_step=0.000163, train/loss_step=0.046, global_step=3554.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  18%|█▊        | 1093/5971 [13:31<1:00:18,  1.35it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0184, train/loss_vlb_step=7.43e-5, train/loss_step=0.0184, global_step=3555.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  18%|█▊        | 1094/5971 [13:32<1:00:19,  1.35it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00482, train/loss_vlb_step=2.54e-5, train/loss_step=0.00482, global_step=3555.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  18%|█▊        | 1095/5971 [13:33<1:00:19,  1.35it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0573, train/loss_vlb_step=0.000199, train/loss_step=0.0573, global_step=3555.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  18%|█▊        | 1096/5971 [13:35<1:00:25,  1.34it/s, loss=0.121, v_num=0, train/loss_simple_step=0.352, train/loss_vlb_step=0.00189, train/loss_step=0.352, global_step=3555.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  18%|█▊        | 1097/5971 [13:36<1:00:25,  1.34it/s, loss=0.121, v_num=0, train/loss_simple_step=0.352, train/loss_vlb_step=0.00189, train/loss_step=0.352, global_step=3555.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  18%|█▊        | 1097/5971 [13:36<1:00:25,  1.34it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0297, train/loss_vlb_step=0.000109, train/loss_step=0.0297, global_step=3556.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  18%|█▊        | 1098/5971 [13:37<1:00:25,  1.34it/s, loss=0.103, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000877, train/loss_step=0.228, global_step=3556.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  18%|█▊        | 1099/5971 [13:38<1:00:25,  1.34it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0287, train/loss_vlb_step=0.000109, train/loss_step=0.0287, global_step=3556.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  18%|█▊        | 1100/5971 [13:41<1:00:32,  1.34it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00968, train/loss_vlb_step=4.51e-5, train/loss_step=0.00968, global_step=3556.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  18%|█▊        | 1101/5971 [13:42<1:00:33,  1.34it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00968, train/loss_vlb_step=4.51e-5, train/loss_step=0.00968, global_step=3556.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  18%|█▊        | 1101/5971 [13:42<1:00:33,  1.34it/s, loss=0.103, v_num=0, train/loss_simple_step=0.400, train/loss_vlb_step=0.00191, train/loss_step=0.400, global_step=3557.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  18%|█▊        | 1102/5971 [13:43<1:00:33,  1.34it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=5.28e-5, train/loss_step=0.0113, global_step=3557.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  18%|█▊        | 1103/5971 [13:44<1:00:33,  1.34it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0494, train/loss_vlb_step=0.000176, train/loss_step=0.0494, global_step=3557.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  18%|█▊        | 1104/5971 [13:46<1:00:39,  1.34it/s, loss=0.121, v_num=0, train/loss_simple_step=0.389, train/loss_vlb_step=0.00158, train/loss_step=0.389, global_step=3557.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  19%|█▊        | 1105/5971 [13:47<1:00:39,  1.34it/s, loss=0.121, v_num=0, train/loss_simple_step=0.389, train/loss_vlb_step=0.00158, train/loss_step=0.389, global_step=3557.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▊        | 1105/5971 [13:47<1:00:39,  1.34it/s, loss=0.125, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000494, train/loss_step=0.149, global_step=3558.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▊        | 1106/5971 [13:48<1:00:39,  1.34it/s, loss=0.127, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000652, train/loss_step=0.192, global_step=3558.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▊        | 1107/5971 [13:49<1:00:39,  1.34it/s, loss=0.156, v_num=0, train/loss_simple_step=0.571, train/loss_vlb_step=0.00642, train/loss_step=0.571, global_step=3558.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  19%|█▊        | 1108/5971 [13:51<1:00:45,  1.33it/s, loss=0.174, v_num=0, train/loss_simple_step=0.386, train/loss_vlb_step=0.00249, train/loss_step=0.386, global_step=3558.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▊        | 1109/5971 [13:52<1:00:46,  1.33it/s, loss=0.174, v_num=0, train/loss_simple_step=0.386, train/loss_vlb_step=0.00249, train/loss_step=0.386, global_step=3558.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▊        | 1109/5971 [13:52<1:00:46,  1.33it/s, loss=0.174, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.24e-5, train/loss_step=0.017, global_step=3559.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▊        | 1110/5971 [13:53<1:00:46,  1.33it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0521, train/loss_vlb_step=0.000186, train/loss_step=0.0521, global_step=3559.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▊        | 1111/5971 [13:54<1:00:46,  1.33it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.69e-5, train/loss_step=0.0125, global_step=3559.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  19%|█▊        | 1112/5971 [13:56<1:00:53,  1.33it/s, loss=0.149, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.52e-5, train/loss_step=0.016, global_step=3559.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  19%|█▊        | 1113/5971 [13:57<1:00:53,  1.33it/s, loss=0.149, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.52e-5, train/loss_step=0.016, global_step=3559.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▊        | 1113/5971 [13:57<1:00:53,  1.33it/s, loss=0.153, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000346, train/loss_step=0.105, global_step=3560.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▊        | 1114/5971 [13:58<1:00:53,  1.33it/s, loss=0.167, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.00137, train/loss_step=0.288, global_step=3560.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  19%|█▊        | 1115/5971 [13:59<1:00:54,  1.33it/s, loss=0.183, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00216, train/loss_step=0.376, global_step=3560.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▊        | 1116/5971 [14:02<1:00:59,  1.33it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0869, train/loss_vlb_step=0.000293, train/loss_step=0.0869, global_step=3560.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▊        | 1117/5971 [14:03<1:01:00,  1.33it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0869, train/loss_vlb_step=0.000293, train/loss_step=0.0869, global_step=3560.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▊        | 1117/5971 [14:03<1:01:00,  1.33it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0357, train/loss_vlb_step=0.000129, train/loss_step=0.0357, global_step=3561.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▊        | 1118/5971 [14:03<1:01:00,  1.33it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.08e-5, train/loss_step=0.00184, global_step=3561.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▊        | 1119/5971 [14:04<1:01:00,  1.33it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0283, train/loss_vlb_step=0.000101, train/loss_step=0.0283, global_step=3561.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  19%|█▉        | 1120/5971 [14:07<1:01:06,  1.32it/s, loss=0.174, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00126, train/loss_step=0.320, global_step=3561.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  19%|█▉        | 1121/5971 [14:08<1:01:06,  1.32it/s, loss=0.174, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00126, train/loss_step=0.320, global_step=3561.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1121/5971 [14:08<1:01:06,  1.32it/s, loss=0.165, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000993, train/loss_step=0.217, global_step=3562.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1122/5971 [14:09<1:01:06,  1.32it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0153, train/loss_vlb_step=6.66e-5, train/loss_step=0.0153, global_step=3562.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1123/5971 [14:10<1:01:06,  1.32it/s, loss=0.172, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.000591, train/loss_step=0.171, global_step=3562.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  19%|█▉        | 1124/5971 [14:12<1:01:12,  1.32it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0275, train/loss_vlb_step=0.00011, train/loss_step=0.0275, global_step=3562.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1125/5971 [14:13<1:01:12,  1.32it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0275, train/loss_vlb_step=0.00011, train/loss_step=0.0275, global_step=3562.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1125/5971 [14:13<1:01:12,  1.32it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0055, train/loss_vlb_step=2.76e-5, train/loss_step=0.0055, global_step=3563.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1126/5971 [14:14<1:01:12,  1.32it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.000112, train/loss_step=0.0325, global_step=3563.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1127/5971 [14:15<1:01:12,  1.32it/s, loss=0.13, v_num=0, train/loss_simple_step=0.402, train/loss_vlb_step=0.00193, train/loss_step=0.402, global_step=3563.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  19%|█▉        | 1128/5971 [14:17<1:01:17,  1.32it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0054, train/loss_vlb_step=2.64e-5, train/loss_step=0.0054, global_step=3563.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1129/5971 [14:18<1:01:17,  1.32it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0054, train/loss_vlb_step=2.64e-5, train/loss_step=0.0054, global_step=3563.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1129/5971 [14:18<1:01:17,  1.32it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0123, train/loss_vlb_step=5.39e-5, train/loss_step=0.0123, global_step=3564.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1130/5971 [14:19<1:01:17,  1.32it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5e-5, train/loss_step=0.0112, global_step=3564.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  19%|█▉        | 1131/5971 [14:19<1:01:17,  1.32it/s, loss=0.113, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000334, train/loss_step=0.102, global_step=3564.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1132/5971 [14:22<1:01:24,  1.31it/s, loss=0.124, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00125, train/loss_step=0.234, global_step=3564.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  19%|█▉        | 1133/5971 [14:23<1:01:24,  1.31it/s, loss=0.124, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00125, train/loss_step=0.234, global_step=3564.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1133/5971 [14:23<1:01:24,  1.31it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00153, train/loss_vlb_step=9.04e-6, train/loss_step=0.00153, global_step=3565.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1134/5971 [14:24<1:01:24,  1.31it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00269, train/loss_vlb_step=1.51e-5, train/loss_step=0.00269, global_step=3565.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1135/5971 [14:25<1:01:24,  1.31it/s, loss=0.0864, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.88e-5, train/loss_step=0.016, global_step=3565.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  19%|█▉        | 1136/5971 [14:27<1:01:29,  1.31it/s, loss=0.0908, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000615, train/loss_step=0.173, global_step=3565.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1137/5971 [14:28<1:01:29,  1.31it/s, loss=0.0908, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000615, train/loss_step=0.173, global_step=3565.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1137/5971 [14:28<1:01:29,  1.31it/s, loss=0.0894, v_num=0, train/loss_simple_step=0.00753, train/loss_vlb_step=3.64e-5, train/loss_step=0.00753, global_step=3566.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1138/5971 [14:29<1:01:29,  1.31it/s, loss=0.101, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.000964, train/loss_step=0.231, global_step=3566.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  19%|█▉        | 1139/5971 [14:30<1:01:29,  1.31it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0195, train/loss_vlb_step=7.78e-5, train/loss_step=0.0195, global_step=3566.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  19%|█▉        | 1140/5971 [14:32<1:01:34,  1.31it/s, loss=0.0863, v_num=0, train/loss_simple_step=0.0389, train/loss_vlb_step=0.000141, train/loss_step=0.0389, global_step=3566.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1141/5971 [14:33<1:01:34,  1.31it/s, loss=0.0863, v_num=0, train/loss_simple_step=0.0389, train/loss_vlb_step=0.000141, train/loss_step=0.0389, global_step=3566.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1141/5971 [14:33<1:01:34,  1.31it/s, loss=0.0795, v_num=0, train/loss_simple_step=0.0816, train/loss_vlb_step=0.000278, train/loss_step=0.0816, global_step=3567.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1142/5971 [14:34<1:01:34,  1.31it/s, loss=0.0953, v_num=0, train/loss_simple_step=0.331, train/loss_vlb_step=0.00178, train/loss_step=0.331, global_step=3567.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  19%|█▉        | 1143/5971 [14:35<1:01:34,  1.31it/s, loss=0.0871, v_num=0, train/loss_simple_step=0.00677, train/loss_vlb_step=3.25e-5, train/loss_step=0.00677, global_step=3567.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1144/5971 [14:37<1:01:39,  1.30it/s, loss=0.086, v_num=0, train/loss_simple_step=0.00556, train/loss_vlb_step=2.83e-5, train/loss_step=0.00556, global_step=3567.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  19%|█▉        | 1145/5971 [14:38<1:01:39,  1.30it/s, loss=0.086, v_num=0, train/loss_simple_step=0.00556, train/loss_vlb_step=2.83e-5, train/loss_step=0.00556, global_step=3567.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1145/5971 [14:38<1:01:39,  1.30it/s, loss=0.089, v_num=0, train/loss_simple_step=0.0645, train/loss_vlb_step=0.000217, train/loss_step=0.0645, global_step=3568.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  19%|█▉        | 1146/5971 [14:39<1:01:39,  1.30it/s, loss=0.0916, v_num=0, train/loss_simple_step=0.0853, train/loss_vlb_step=0.000298, train/loss_step=0.0853, global_step=3568.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1147/5971 [14:40<1:01:39,  1.30it/s, loss=0.0749, v_num=0, train/loss_simple_step=0.0686, train/loss_vlb_step=0.000234, train/loss_step=0.0686, global_step=3568.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1148/5971 [14:42<1:01:44,  1.30it/s, loss=0.076, v_num=0, train/loss_simple_step=0.0264, train/loss_vlb_step=9.82e-5, train/loss_step=0.0264, global_step=3568.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  19%|█▉        | 1149/5971 [14:43<1:01:45,  1.30it/s, loss=0.076, v_num=0, train/loss_simple_step=0.0264, train/loss_vlb_step=9.82e-5, train/loss_step=0.0264, global_step=3568.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1149/5971 [14:43<1:01:45,  1.30it/s, loss=0.0775, v_num=0, train/loss_simple_step=0.0435, train/loss_vlb_step=0.00015, train/loss_step=0.0435, global_step=3569.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1150/5971 [14:44<1:01:45,  1.30it/s, loss=0.079, v_num=0, train/loss_simple_step=0.0408, train/loss_vlb_step=0.000144, train/loss_step=0.0408, global_step=3569.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1151/5971 [14:45<1:01:45,  1.30it/s, loss=0.0756, v_num=0, train/loss_simple_step=0.0342, train/loss_vlb_step=0.000126, train/loss_step=0.0342, global_step=3569.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1152/5971 [14:47<1:01:50,  1.30it/s, loss=0.0643, v_num=0, train/loss_simple_step=0.00742, train/loss_vlb_step=3.5e-5, train/loss_step=0.00742, global_step=3569.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1153/5971 [14:48<1:01:50,  1.30it/s, loss=0.0643, v_num=0, train/loss_simple_step=0.00742, train/loss_vlb_step=3.5e-5, train/loss_step=0.00742, global_step=3569.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1153/5971 [14:48<1:01:50,  1.30it/s, loss=0.074, v_num=0, train/loss_simple_step=0.195, train/loss_vlb_step=0.000672, train/loss_step=0.195, global_step=3570.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  19%|█▉        | 1154/5971 [14:49<1:01:50,  1.30it/s, loss=0.0768, v_num=0, train/loss_simple_step=0.0597, train/loss_vlb_step=0.000205, train/loss_step=0.0597, global_step=3570.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1155/5971 [14:50<1:01:50,  1.30it/s, loss=0.08, v_num=0, train/loss_simple_step=0.080, train/loss_vlb_step=0.000269, train/loss_step=0.080, global_step=3570.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  19%|█▉        | 1156/5971 [14:52<1:01:55,  1.30it/s, loss=0.0879, v_num=0, train/loss_simple_step=0.330, train/loss_vlb_step=0.00151, train/loss_step=0.330, global_step=3570.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1157/5971 [14:53<1:01:55,  1.30it/s, loss=0.0879, v_num=0, train/loss_simple_step=0.330, train/loss_vlb_step=0.00151, train/loss_step=0.330, global_step=3570.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1157/5971 [14:53<1:01:55,  1.30it/s, loss=0.108, v_num=0, train/loss_simple_step=0.404, train/loss_vlb_step=0.00298, train/loss_step=0.404, global_step=3571.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  19%|█▉        | 1158/5971 [14:54<1:01:55,  1.30it/s, loss=0.105, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000625, train/loss_step=0.181, global_step=3571.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1159/5971 [14:55<1:01:55,  1.30it/s, loss=0.118, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00103, train/loss_step=0.267, global_step=3571.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  19%|█▉        | 1160/5971 [14:57<1:02:00,  1.29it/s, loss=0.127, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000845, train/loss_step=0.228, global_step=3571.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1161/5971 [14:58<1:02:00,  1.29it/s, loss=0.127, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000845, train/loss_step=0.228, global_step=3571.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1161/5971 [14:58<1:02:00,  1.29it/s, loss=0.133, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.000762, train/loss_step=0.210, global_step=3572.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1162/5971 [14:59<1:02:00,  1.29it/s, loss=0.123, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000403, train/loss_step=0.122, global_step=3572.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  19%|█▉        | 1163/5971 [15:00<1:01:59,  1.29it/s, loss=0.14, v_num=0, train/loss_simple_step=0.355, train/loss_vlb_step=0.00225, train/loss_step=0.355, global_step=3572.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  19%|█▉        | 1164/5971 [15:02<1:02:05,  1.29it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00166, train/loss_vlb_step=1.01e-5, train/loss_step=0.00166, global_step=3572.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  20%|█▉        | 1165/5971 [15:03<1:02:05,  1.29it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00166, train/loss_vlb_step=1.01e-5, train/loss_step=0.00166, global_step=3572.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  20%|█▉        | 1165/5971 [15:03<1:02:05,  1.29it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00206, train/loss_vlb_step=1.11e-5, train/loss_step=0.00206, global_step=3573.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  20%|█▉        | 1166/5971 [15:04<1:02:05,  1.29it/s, loss=0.152, v_num=0, train/loss_simple_step=0.385, train/loss_vlb_step=0.00225, train/loss_step=0.385, global_step=3573.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  20%|█▉        | 1167/5971 [15:05<1:02:04,  1.29it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0333, train/loss_vlb_step=0.000125, train/loss_step=0.0333, global_step=3573.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  20%|█▉        | 1168/5971 [15:07<1:02:09,  1.29it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00247, train/loss_vlb_step=1.29e-5, train/loss_step=0.00247, global_step=3573.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  20%|█▉        | 1169/5971 [15:08<1:02:09,  1.29it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00247, train/loss_vlb_step=1.29e-5, train/loss_step=0.00247, global_step=3573.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  20%|█▉        | 1169/5971 [15:08<1:02:09,  1.29it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00471, train/loss_vlb_step=2.36e-5, train/loss_step=0.00471, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  20%|█▉        | 1170/5971 [15:09<1:02:09,  1.29it/s, loss=0.164, v_num=0, train/loss_simple_step=0.384, train/loss_vlb_step=0.00228, train/loss_step=0.384, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  20%|█▉        | 1171/5971 [15:10<1:02:08,  1.29it/s, loss=0.192, v_num=0, train/loss_simple_step=0.580, train/loss_vlb_step=0.010, train/loss_step=0.580, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  20%|█▉        | 1172/5971 [15:13<1:02:16,  1.28it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  20%|█▉        | 1173/5971 [15:13<1:02:12,  1.29it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:13,  2.27it/s][A

Validating:   1%|          | 2/167 [00:00<00:48,  3.38it/s][A
Epoch 6:  20%|█▉        | 1177/5971 [15:13<1:01:59,  1.29it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   3%|▎         | 5/167 [00:00<00:18,  8.83it/s][A

Validating:   5%|▍         | 8/167 [00:00<00:12, 13.18it/s][A
Epoch 6:  20%|█▉        | 1181/5971 [15:14<1:01:44,  1.29it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.66it/s][A
Epoch 6:  20%|█▉        | 1185/5971 [15:14<1:01:29,  1.30it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   9%|▉         | 15/167 [00:01<00:07, 21.10it/s][A
Epoch 6:  20%|█▉        | 1189/5971 [15:14<1:01:14,  1.30it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  11%|█         | 18/167 [00:01<00:06, 22.64it/s][A
Epoch 6:  20%|█▉        | 1193/5971 [15:14<1:00:59,  1.31it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 24.09it/s][A

Validating:  14%|█▍        | 24/167 [00:01<00:05, 24.95it/s][A
Epoch 6:  20%|██        | 1197/5971 [15:14<1:00:45,  1.31it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 25.20it/s][A
Epoch 6:  20%|██        | 1201/5971 [15:14<1:00:30,  1.31it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 25.77it/s][A
Epoch 6:  20%|██        | 1205/5971 [15:15<1:00:16,  1.32it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 26.62it/s][A

Validating:  22%|██▏       | 36/167 [00:01<00:04, 26.66it/s][A
Epoch 6:  20%|██        | 1209/5971 [15:15<1:00:01,  1.32it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  23%|██▎       | 39/167 [00:02<00:04, 26.58it/s][A
Epoch 6:  20%|██        | 1213/5971 [15:15<59:47,  1.33it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  

Validating:  25%|██▌       | 42/167 [00:02<00:04, 27.19it/s][A
Epoch 6:  20%|██        | 1217/5971 [15:15<59:33,  1.33it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 25.28it/s][A

Validating:  29%|██▊       | 48/167 [00:02<00:04, 24.74it/s][A
Epoch 6:  20%|██        | 1221/5971 [15:15<59:19,  1.33it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  31%|███       | 51/167 [00:02<00:04, 25.28it/s][A
Epoch 6:  21%|██        | 1225/5971 [15:15<59:05,  1.34it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 25.26it/s][A
Epoch 6:  21%|██        | 1229/5971 [15:15<58:51,  1.34it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 26.17it/s][A

Validating:  36%|███▌      | 60/167 [00:02<00:04, 25.75it/s][A
Epoch 6:  21%|██        | 1233/5971 [15:16<58:37,  1.35it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  38%|███▊      | 63/167 [00:02<00:03, 26.20it/s][A
Epoch 6:  21%|██        | 1237/5971 [15:16<58:23,  1.35it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  40%|███▉      | 66/167 [00:03<00:03, 25.71it/s][A
Epoch 6:  21%|██        | 1241/5971 [15:16<58:10,  1.36it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 26.12it/s][A

Validating:  43%|████▎     | 72/167 [00:03<00:03, 24.71it/s][A
Epoch 6:  21%|██        | 1245/5971 [15:16<57:56,  1.36it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 24.47it/s][A
Epoch 6:  21%|██        | 1249/5971 [15:16<57:43,  1.36it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 25.35it/s][A
Epoch 6:  21%|██        | 1253/5971 [15:16<57:29,  1.37it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 25.68it/s][A

Validating:  50%|█████     | 84/167 [00:03<00:03, 26.29it/s][A
Epoch 6:  21%|██        | 1257/5971 [15:17<57:16,  1.37it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  52%|█████▏    | 87/167 [00:03<00:03, 25.77it/s][A
Epoch 6:  21%|██        | 1261/5971 [15:17<57:03,  1.38it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  54%|█████▍    | 90/167 [00:03<00:02, 26.90it/s][A
Epoch 6:  21%|██        | 1265/5971 [15:17<56:49,  1.38it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 26.40it/s][A

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 26.82it/s][A
Epoch 6:  21%|██▏       | 1269/5971 [15:17<56:36,  1.38it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 27.16it/s][A
Epoch 6:  21%|██▏       | 1273/5971 [15:17<56:23,  1.39it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  61%|██████    | 102/167 [00:04<00:02, 27.11it/s][A
Epoch 6:  21%|██▏       | 1277/5971 [15:17<56:10,  1.39it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 27.69it/s][A

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 28.17it/s][A
Epoch 6:  21%|██▏       | 1281/5971 [15:17<55:58,  1.40it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 27.24it/s][A
Epoch 6:  22%|██▏       | 1285/5971 [15:18<55:45,  1.40it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  68%|██████▊   | 114/167 [00:04<00:01, 26.77it/s][A
Epoch 6:  22%|██▏       | 1289/5971 [15:18<55:32,  1.40it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  70%|███████   | 117/167 [00:04<00:01, 27.59it/s][A

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 26.97it/s][A
Epoch 6:  22%|██▏       | 1293/5971 [15:18<55:20,  1.41it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 25.22it/s][A
Epoch 6:  22%|██▏       | 1297/5971 [15:18<55:07,  1.41it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 26.19it/s][A
Epoch 6:  22%|██▏       | 1301/5971 [15:18<54:55,  1.42it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 25.76it/s][A

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 25.37it/s][A
Epoch 6:  22%|██▏       | 1305/5971 [15:18<54:42,  1.42it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  81%|████████  | 135/167 [00:05<00:01, 26.29it/s][A
Epoch 6:  22%|██▏       | 1309/5971 [15:18<54:30,  1.43it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 26.69it/s][A
Epoch 6:  22%|██▏       | 1313/5971 [15:19<54:18,  1.43it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  84%|████████▍ | 141/167 [00:05<00:00, 27.59it/s][A

Validating:  86%|████████▌ | 144/167 [00:05<00:00, 27.66it/s][A
Epoch 6:  22%|██▏       | 1317/5971 [15:19<54:06,  1.43it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 27.69it/s][A
Epoch 6:  22%|██▏       | 1321/5971 [15:19<53:53,  1.44it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 27.91it/s][A
Epoch 6:  22%|██▏       | 1325/5971 [15:19<53:41,  1.44it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 27.00it/s][A

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 27.76it/s][A
Epoch 6:  22%|██▏       | 1329/5971 [15:19<53:29,  1.45it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 26.46it/s][A
Epoch 6:  22%|██▏       | 1333/5971 [15:19<53:18,  1.45it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 25.96it/s][A
Epoch 6:  22%|██▏       | 1337/5971 [15:20<53:06,  1.45it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 28.06it/s][A
Epoch 6:  22%|██▏       | 1340/5971 [15:20<52:58,  1.46it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Epoch 6:  22%|██▏       | 1341/5971 [15:21<52:58,  1.46it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.56e-5, train/loss_step=0.0122, global_step=3574.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  22%|██▏       | 1341/5971 [15:21<52:58,  1.46it/s, loss=0.188, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000362, train/loss_step=0.110, global_step=3575.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  22%|██▏       | 1342/5971 [15:22<52:58,  1.46it/s, loss=0.185, v_num=0, train/loss_simple_step=0.00265, train/loss_vlb_step=1.53e-5, train/loss_step=0.00265, global_step=3575.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  22%|██▏       | 1343/5971 [15:23<52:58,  1.46it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0943, train/loss_vlb_step=0.00031, train/loss_step=0.0943, global_step=3575.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  23%|██▎       | 1344/5971 [15:25<53:04,  1.45it/s, loss=0.185, v_num=0, train/loss_simple_step=0.317, train/loss_vlb_step=0.0015, train/loss_step=0.317, global_step=3575.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  23%|██▎       | 1345/5971 [15:26<53:04,  1.45it/s, loss=0.185, v_num=0, train/loss_simple_step=0.317, train/loss_vlb_step=0.0015, train/loss_step=0.317, global_step=3575.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1345/5971 [15:26<53:04,  1.45it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00226, train/loss_vlb_step=1.23e-5, train/loss_step=0.00226, global_step=3576.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1346/5971 [15:27<53:04,  1.45it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00484, train/loss_vlb_step=2.47e-5, train/loss_step=0.00484, global_step=3576.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1347/5971 [15:28<53:04,  1.45it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0129, train/loss_vlb_step=5.74e-5, train/loss_step=0.0129, global_step=3576.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  23%|██▎       | 1348/5971 [15:30<53:10,  1.45it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00715, train/loss_vlb_step=3.5e-5, train/loss_step=0.00715, global_step=3576.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1349/5971 [15:31<53:10,  1.45it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00715, train/loss_vlb_step=3.5e-5, train/loss_step=0.00715, global_step=3576.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1349/5971 [15:31<53:10,  1.45it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0316, train/loss_vlb_step=0.000122, train/loss_step=0.0316, global_step=3577.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1350/5971 [15:32<53:10,  1.45it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=7.92e-5, train/loss_step=0.0198, global_step=3577.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  23%|██▎       | 1351/5971 [15:33<53:10,  1.45it/s, loss=0.107, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000502, train/loss_step=0.142, global_step=3577.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  23%|██▎       | 1352/5971 [15:35<53:14,  1.45it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00631, train/loss_vlb_step=3.09e-5, train/loss_step=0.00631, global_step=3577.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1353/5971 [15:36<53:14,  1.45it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00631, train/loss_vlb_step=3.09e-5, train/loss_step=0.00631, global_step=3577.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1353/5971 [15:36<53:14,  1.45it/s, loss=0.151, v_num=0, train/loss_simple_step=0.878, train/loss_vlb_step=0.148, train/loss_step=0.878, global_step=3578.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]      
Epoch 6:  23%|██▎       | 1354/5971 [15:37<53:14,  1.45it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0394, train/loss_vlb_step=0.000144, train/loss_step=0.0394, global_step=3578.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1355/5971 [15:38<53:14,  1.44it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0916, train/loss_vlb_step=0.000301, train/loss_step=0.0916, global_step=3578.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1356/5971 [15:40<53:19,  1.44it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0326, train/loss_vlb_step=0.000117, train/loss_step=0.0326, global_step=3578.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1357/5971 [15:41<53:19,  1.44it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0326, train/loss_vlb_step=0.000117, train/loss_step=0.0326, global_step=3578.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1357/5971 [15:41<53:19,  1.44it/s, loss=0.158, v_num=0, train/loss_simple_step=0.388, train/loss_vlb_step=0.00181, train/loss_step=0.388, global_step=3579.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  23%|██▎       | 1358/5971 [15:42<53:19,  1.44it/s, loss=0.145, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000425, train/loss_step=0.124, global_step=3579.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1359/5971 [15:43<53:19,  1.44it/s, loss=0.126, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.000783, train/loss_step=0.197, global_step=3579.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1360/5971 [15:45<53:24,  1.44it/s, loss=0.127, v_num=0, train/loss_simple_step=0.036, train/loss_vlb_step=0.000133, train/loss_step=0.036, global_step=3579.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1361/5971 [15:46<53:24,  1.44it/s, loss=0.127, v_num=0, train/loss_simple_step=0.036, train/loss_vlb_step=0.000133, train/loss_step=0.036, global_step=3579.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1361/5971 [15:46<53:24,  1.44it/s, loss=0.164, v_num=0, train/loss_simple_step=0.852, train/loss_vlb_step=0.0488, train/loss_step=0.852, global_step=3580.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  23%|██▎       | 1362/5971 [15:47<53:24,  1.44it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00522, train/loss_vlb_step=2.63e-5, train/loss_step=0.00522, global_step=3580.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1363/5971 [15:48<53:24,  1.44it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00749, train/loss_vlb_step=3.69e-5, train/loss_step=0.00749, global_step=3580.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  23%|██▎       | 1364/5971 [15:51<53:30,  1.44it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0214, train/loss_vlb_step=8.16e-5, train/loss_step=0.0214, global_step=3580.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  23%|██▎       | 1365/5971 [15:51<53:30,  1.43it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0214, train/loss_vlb_step=8.16e-5, train/loss_step=0.0214, global_step=3580.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1365/5971 [15:51<53:30,  1.43it/s, loss=0.161, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00172, train/loss_step=0.318, global_step=3581.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  23%|██▎       | 1366/5971 [15:52<53:29,  1.43it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00314, train/loss_vlb_step=1.71e-5, train/loss_step=0.00314, global_step=3581.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1367/5971 [15:53<53:29,  1.43it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0145, train/loss_vlb_step=6.2e-5, train/loss_step=0.0145, global_step=3581.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  23%|██▎       | 1368/5971 [15:56<53:35,  1.43it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0196, train/loss_vlb_step=7.91e-5, train/loss_step=0.0196, global_step=3581.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1369/5971 [15:57<53:35,  1.43it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0196, train/loss_vlb_step=7.91e-5, train/loss_step=0.0196, global_step=3581.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1369/5971 [15:57<53:35,  1.43it/s, loss=0.171, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000782, train/loss_step=0.216, global_step=3582.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  23%|██▎       | 1370/5971 [15:58<53:35,  1.43it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0347, train/loss_vlb_step=0.000127, train/loss_step=0.0347, global_step=3582.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1371/5971 [15:58<53:34,  1.43it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00348, train/loss_vlb_step=1.87e-5, train/loss_step=0.00348, global_step=3582.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1372/5971 [16:01<53:39,  1.43it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00159, train/loss_vlb_step=9.53e-6, train/loss_step=0.00159, global_step=3582.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1373/5971 [16:02<53:39,  1.43it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00159, train/loss_vlb_step=9.53e-6, train/loss_step=0.00159, global_step=3582.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1373/5971 [16:02<53:39,  1.43it/s, loss=0.16, v_num=0, train/loss_simple_step=0.800, train/loss_vlb_step=0.0515, train/loss_step=0.800, global_step=3583.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]      
Epoch 6:  23%|██▎       | 1374/5971 [16:02<53:39,  1.43it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0354, train/loss_vlb_step=0.000131, train/loss_step=0.0354, global_step=3583.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1375/5971 [16:03<53:39,  1.43it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0165, train/loss_vlb_step=6.83e-5, train/loss_step=0.0165, global_step=3583.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1376/5971 [16:06<53:43,  1.43it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00182, train/loss_vlb_step=1.09e-5, train/loss_step=0.00182, global_step=3583.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1377/5971 [16:07<53:43,  1.43it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00182, train/loss_vlb_step=1.09e-5, train/loss_step=0.00182, global_step=3583.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1377/5971 [16:07<53:43,  1.43it/s, loss=0.173, v_num=0, train/loss_simple_step=0.759, train/loss_vlb_step=0.0305, train/loss_step=0.759, global_step=3584.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:  23%|██▎       | 1378/5971 [16:07<53:43,  1.42it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00257, train/loss_vlb_step=1.44e-5, train/loss_step=0.00257, global_step=3584.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1379/5971 [16:08<53:43,  1.42it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00272, train/loss_vlb_step=1.49e-5, train/loss_step=0.00272, global_step=3584.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1380/5971 [16:10<53:47,  1.42it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0912, train/loss_vlb_step=0.000302, train/loss_step=0.0912, global_step=3584.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  23%|██▎       | 1381/5971 [16:11<53:47,  1.42it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0912, train/loss_vlb_step=0.000302, train/loss_step=0.0912, global_step=3584.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1381/5971 [16:11<53:47,  1.42it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0291, train/loss_vlb_step=0.000115, train/loss_step=0.0291, global_step=3585.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1382/5971 [16:12<53:47,  1.42it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0321, train/loss_vlb_step=0.000118, train/loss_step=0.0321, global_step=3585.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1383/5971 [16:13<53:47,  1.42it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00283, train/loss_vlb_step=1.56e-5, train/loss_step=0.00283, global_step=3585.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1384/5971 [16:15<53:52,  1.42it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.48e-5, train/loss_step=0.0124, global_step=3585.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  23%|██▎       | 1385/5971 [16:16<53:51,  1.42it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.48e-5, train/loss_step=0.0124, global_step=3585.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1385/5971 [16:16<53:51,  1.42it/s, loss=0.119, v_num=0, train/loss_simple_step=0.293, train/loss_vlb_step=0.00135, train/loss_step=0.293, global_step=3586.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  23%|██▎       | 1386/5971 [16:17<53:51,  1.42it/s, loss=0.127, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000559, train/loss_step=0.165, global_step=3586.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1387/5971 [16:18<53:51,  1.42it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0914, train/loss_vlb_step=0.0003, train/loss_step=0.0914, global_step=3586.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1388/5971 [16:20<53:56,  1.42it/s, loss=0.138, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000555, train/loss_step=0.160, global_step=3586.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1389/5971 [16:21<53:56,  1.42it/s, loss=0.138, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000555, train/loss_step=0.160, global_step=3586.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1389/5971 [16:21<53:56,  1.42it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.7e-5, train/loss_step=0.0125, global_step=3587.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1390/5971 [16:22<53:56,  1.42it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00176, train/loss_vlb_step=1.05e-5, train/loss_step=0.00176, global_step=3587.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1391/5971 [16:23<53:56,  1.42it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0428, train/loss_vlb_step=0.000159, train/loss_step=0.0428, global_step=3587.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  23%|██▎       | 1392/5971 [16:25<54:00,  1.41it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=5.27e-5, train/loss_step=0.0116, global_step=3587.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  23%|██▎       | 1393/5971 [16:26<54:00,  1.41it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=5.27e-5, train/loss_step=0.0116, global_step=3587.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1393/5971 [16:26<54:00,  1.41it/s, loss=0.094, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000379, train/loss_step=0.115, global_step=3588.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  23%|██▎       | 1394/5971 [16:27<54:00,  1.41it/s, loss=0.124, v_num=0, train/loss_simple_step=0.641, train/loss_vlb_step=0.0278, train/loss_step=0.641, global_step=3588.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  23%|██▎       | 1395/5971 [16:28<54:00,  1.41it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0237, train/loss_vlb_step=9.62e-5, train/loss_step=0.0237, global_step=3588.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1396/5971 [16:30<54:04,  1.41it/s, loss=0.145, v_num=0, train/loss_simple_step=0.419, train/loss_vlb_step=0.00236, train/loss_step=0.419, global_step=3588.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  23%|██▎       | 1397/5971 [16:31<54:04,  1.41it/s, loss=0.145, v_num=0, train/loss_simple_step=0.419, train/loss_vlb_step=0.00236, train/loss_step=0.419, global_step=3588.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1397/5971 [16:31<54:04,  1.41it/s, loss=0.123, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00119, train/loss_step=0.305, global_step=3589.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1398/5971 [16:32<54:04,  1.41it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0651, train/loss_vlb_step=0.000216, train/loss_step=0.0651, global_step=3589.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1399/5971 [16:33<54:04,  1.41it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0371, train/loss_vlb_step=0.000144, train/loss_step=0.0371, global_step=3589.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1400/5971 [16:35<54:08,  1.41it/s, loss=0.138, v_num=0, train/loss_simple_step=0.298, train/loss_vlb_step=0.00117, train/loss_step=0.298, global_step=3589.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  23%|██▎       | 1401/5971 [16:36<54:08,  1.41it/s, loss=0.138, v_num=0, train/loss_simple_step=0.298, train/loss_vlb_step=0.00117, train/loss_step=0.298, global_step=3589.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1401/5971 [16:36<54:08,  1.41it/s, loss=0.139, v_num=0, train/loss_simple_step=0.042, train/loss_vlb_step=0.000155, train/loss_step=0.042, global_step=3590.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1402/5971 [16:37<54:07,  1.41it/s, loss=0.143, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000365, train/loss_step=0.111, global_step=3590.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  23%|██▎       | 1403/5971 [16:38<54:07,  1.41it/s, loss=0.154, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.000842, train/loss_step=0.234, global_step=3590.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  24%|██▎       | 1404/5971 [16:40<54:12,  1.40it/s, loss=0.178, v_num=0, train/loss_simple_step=0.481, train/loss_vlb_step=0.00337, train/loss_step=0.481, global_step=3590.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  24%|██▎       | 1405/5971 [16:41<54:11,  1.40it/s, loss=0.178, v_num=0, train/loss_simple_step=0.481, train/loss_vlb_step=0.00337, train/loss_step=0.481, global_step=3590.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  24%|██▎       | 1405/5971 [16:41<54:12,  1.40it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00332, train/loss_vlb_step=1.82e-5, train/loss_step=0.00332, global_step=3591.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  24%|██▎       | 1406/5971 [16:42<54:11,  1.40it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00298, train/loss_vlb_step=1.68e-5, train/loss_step=0.00298, global_step=3591.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  24%|██▎       | 1407/5971 [16:43<54:11,  1.40it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0729, train/loss_vlb_step=0.000246, train/loss_step=0.0729, global_step=3591.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  24%|██▎       | 1408/5971 [16:45<54:15,  1.40it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000156, train/loss_step=0.0423, global_step=3591.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  24%|██▎       | 1409/5971 [16:46<54:15,  1.40it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000156, train/loss_step=0.0423, global_step=3591.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  24%|██▎       | 1409/5971 [16:46<54:15,  1.40it/s, loss=0.176, v_num=0, train/loss_simple_step=0.562, train/loss_vlb_step=0.00481, train/loss_step=0.562, global_step=3592.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  24%|██▎       | 1410/5971 [16:47<54:15,  1.40it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.44e-5, train/loss_step=0.0126, global_step=3592.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  24%|██▎       | 1411/5971 [16:48<54:15,  1.40it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0474, train/loss_vlb_step=0.000165, train/loss_step=0.0474, global_step=3592.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  24%|██▎       | 1412/5971 [16:50<54:19,  1.40it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0785, train/loss_vlb_step=0.000264, train/loss_step=0.0785, global_step=3592.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  24%|██▎       | 1413/5971 [16:51<54:19,  1.40it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0785, train/loss_vlb_step=0.000264, train/loss_step=0.0785, global_step=3592.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  24%|██▎       | 1413/5971 [16:51<54:19,  1.40it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0436, train/loss_vlb_step=0.000165, train/loss_step=0.0436, global_step=3593.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  24%|██▎       | 1414/5971 [16:51<54:18,  1.40it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0953, train/loss_vlb_step=0.000313, train/loss_step=0.0953, global_step=3593.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  24%|██▎       | 1415/5971 [16:52<54:18,  1.40it/s, loss=0.154, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000411, train/loss_step=0.124, global_step=3593.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  24%|██▎       | 1416/5971 [16:55<54:22,  1.40it/s, loss=0.139, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000394, train/loss_step=0.120, global_step=3593.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  24%|██▎       | 1417/5971 [16:56<54:23,  1.40it/s, loss=0.139, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000394, train/loss_step=0.120, global_step=3593.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  24%|██▎       | 1417/5971 [16:56<54:23,  1.40it/s, loss=0.14, v_num=0, train/loss_simple_step=0.323, train/loss_vlb_step=0.00124, train/loss_step=0.323, global_step=3594.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  24%|██▎       | 1418/5971 [16:56<54:23,  1.40it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0435, train/loss_vlb_step=0.000144, train/loss_step=0.0435, global_step=3594.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  24%|██▍       | 1419/5971 [16:57<54:22,  1.40it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.74e-5, train/loss_step=0.0105, global_step=3594.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  24%|██▍       | 1420/5971 [17:00<54:27,  1.39it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0654, train/loss_vlb_step=0.000224, train/loss_step=0.0654, global_step=3594.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  24%|██▍       | 1421/5971 [17:01<54:27,  1.39it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0654, train/loss_vlb_step=0.000224, train/loss_step=0.0654, global_step=3594.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  24%|██▍       | 1421/5971 [17:01<54:27,  1.39it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00329, train/loss_vlb_step=1.78e-5, train/loss_step=0.00329, global_step=3595.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  24%|██▍       | 1422/5971 [17:02<54:27,  1.39it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0338, train/loss_vlb_step=0.000126, train/loss_step=0.0338, global_step=3595.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  24%|██▍       | 1423/5971 [17:02<54:26,  1.39it/s, loss=0.117, v_num=0, train/loss_simple_step=0.167, train/loss_vlb_step=0.000778, train/loss_step=0.167, global_step=3595.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  24%|██▍       | 1424/5971 [17:04<54:30,  1.39it/s, loss=0.093, v_num=0, train/loss_simple_step=0.0077, train/loss_vlb_step=3.65e-5, train/loss_step=0.0077, global_step=3595.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  24%|██▍       | 1425/5971 [17:05<54:30,  1.39it/s, loss=0.093, v_num=0, train/loss_simple_step=0.0077, train/loss_vlb_step=3.65e-5, train/loss_step=0.0077, global_step=3595.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  24%|██▍       | 1425/5971 [17:05<54:30,  1.39it/s, loss=0.0929, v_num=0, train/loss_simple_step=0.00251, train/loss_vlb_step=1.37e-5, train/loss_step=0.00251, global_step=3596.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  24%|██▍       | 1426/5971 [17:06<54:30,  1.39it/s, loss=0.111, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00174, train/loss_step=0.365, global_step=3596.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:  24%|██▍       | 1427/5971 [17:07<54:29,  1.39it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00475, train/loss_vlb_step=2.52e-5, train/loss_step=0.00475, global_step=3596.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  24%|██▍       | 1428/5971 [17:09<54:33,  1.39it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0604, train/loss_vlb_step=0.000203, train/loss_step=0.0604, global_step=3596.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  24%|██▍       | 1429/5971 [17:10<54:33,  1.39it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0604, train/loss_vlb_step=0.000203, train/loss_step=0.0604, global_step=3596.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  24%|██▍       | 1429/5971 [17:10<54:33,  1.39it/s, loss=0.0811, v_num=0, train/loss_simple_step=0.0133, train/loss_vlb_step=5.63e-5, train/loss_step=0.0133, global_step=3597.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  24%|██▍       | 1430/5971 [17:11<54:33,  1.39it/s, loss=0.0827, v_num=0, train/loss_simple_step=0.0448, train/loss_vlb_step=0.000162, train/loss_step=0.0448, global_step=3597.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  24%|██▍       | 1431/5971 [17:12<54:33,  1.39it/s, loss=0.0953, v_num=0, train/loss_simple_step=0.299, train/loss_vlb_step=0.00122, train/loss_step=0.299, global_step=3597.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  24%|██▍       | 1432/5971 [17:14<54:37,  1.39it/s, loss=0.0921, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=6.11e-5, train/loss_step=0.0143, global_step=3597.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  24%|██▍       | 1433/5971 [17:15<54:36,  1.38it/s, loss=0.0921, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=6.11e-5, train/loss_step=0.0143, global_step=3597.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  24%|██▍       | 1433/5971 [17:15<54:36,  1.38it/s, loss=0.09, v_num=0, train/loss_simple_step=0.00207, train/loss_vlb_step=1.17e-5, train/loss_step=0.00207, global_step=3598.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  24%|██▍       | 1434/5971 [17:16<54:36,  1.38it/s, loss=0.102, v_num=0, train/loss_simple_step=0.337, train/loss_vlb_step=0.00169, train/loss_step=0.337, global_step=3598.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  24%|██▍       | 1435/5971 [17:17<54:36,  1.38it/s, loss=0.129, v_num=0, train/loss_simple_step=0.654, train/loss_vlb_step=0.0128, train/loss_step=0.654, global_step=3598.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  24%|██▍       | 1436/5971 [17:19<54:40,  1.38it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0684, train/loss_vlb_step=0.000227, train/loss_step=0.0684, global_step=3598.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  24%|██▍       | 1437/5971 [17:20<54:40,  1.38it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0684, train/loss_vlb_step=0.000227, train/loss_step=0.0684, global_step=3598.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  24%|██▍       | 1437/5971 [17:20<54:40,  1.38it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0185, train/loss_vlb_step=7.55e-5, train/loss_step=0.0185, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  24%|██▍       | 1438/5971 [17:21<54:40,  1.38it/s, loss=0.142, v_num=0, train/loss_simple_step=0.677, train/loss_vlb_step=0.00937, train/loss_step=0.677, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  24%|██▍       | 1439/5971 [17:22<54:39,  1.38it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00304, train/loss_vlb_step=1.71e-5, train/loss_step=0.00304, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  24%|██▍       | 1440/5971 [17:25<54:46,  1.38it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  24%|██▍       | 1441/5971 [17:25<54:43,  1.38it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:06,  2.49it/s]

Validating:   1%|          | 2/167 [00:00<00:43,  3.76it/s]
Epoch 6:  24%|██▍       | 1445/5971 [17:25<54:33,  1.38it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.34it/s]
Epoch 6:  24%|██▍       | 1449/5971 [17:26<54:22,  1.39it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   5%|▌         | 9/167 [00:00<00:10, 15.22it/s]

Validating:   7%|▋         | 12/167 [00:00<00:08, 17.88it/s]
Epoch 6:  24%|██▍       | 1453/5971 [17:26<54:11,  1.39it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   9%|▉         | 15/167 [00:01<00:07, 20.33it/s]
Epoch 6:  24%|██▍       | 1457/5971 [17:26<53:59,  1.39it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  11%|█         | 18/167 [00:01<00:06, 21.45it/s]
Epoch 6:  24%|██▍       | 1461/5971 [17:26<53:48,  1.40it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 23.03it/s]

Validating:  14%|█▍        | 24/167 [00:01<00:05, 23.96it/s]
Epoch 6:  25%|██▍       | 1465/5971 [17:26<53:37,  1.40it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 24.58it/s]
Epoch 6:  25%|██▍       | 1469/5971 [17:26<53:26,  1.40it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 25.29it/s]
Epoch 6:  25%|██▍       | 1473/5971 [17:27<53:15,  1.41it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 26.25it/s]

Validating:  22%|██▏       | 36/167 [00:01<00:05, 26.16it/s]
Epoch 6:  25%|██▍       | 1477/5971 [17:27<53:04,  1.41it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  24%|██▍       | 40/167 [00:02<00:04, 27.65it/s]
Epoch 6:  25%|██▍       | 1481/5971 [17:27<52:53,  1.42it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  26%|██▌       | 43/167 [00:02<00:04, 26.11it/s]
Epoch 6:  25%|██▍       | 1485/5971 [17:27<52:42,  1.42it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  28%|██▊       | 46/167 [00:02<00:04, 26.22it/s]
Epoch 6:  25%|██▍       | 1489/5971 [17:27<52:31,  1.42it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  29%|██▉       | 49/167 [00:02<00:04, 26.83it/s]

Validating:  31%|███       | 52/167 [00:02<00:04, 27.26it/s]
Epoch 6:  25%|██▌       | 1493/5971 [17:27<52:20,  1.43it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  33%|███▎      | 55/167 [00:02<00:04, 27.71it/s][A
Epoch 6:  25%|██▌       | 1497/5971 [17:27<52:09,  1.43it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  35%|███▍      | 58/167 [00:02<00:04, 26.79it/s][A
Epoch 6:  25%|██▌       | 1501/5971 [17:28<51:59,  1.43it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  37%|███▋      | 61/167 [00:02<00:04, 26.39it/s][A

Validating:  38%|███▊      | 64/167 [00:02<00:03, 26.31it/s][A
Epoch 6:  25%|██▌       | 1505/5971 [17:28<51:48,  1.44it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  40%|████      | 67/167 [00:03<00:03, 25.79it/s][A
Epoch 6:  25%|██▌       | 1509/5971 [17:28<51:37,  1.44it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 26.58it/s][A
Epoch 6:  25%|██▌       | 1513/5971 [17:28<51:27,  1.44it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 26.83it/s][A

Validating:  46%|████▌     | 76/167 [00:03<00:03, 27.35it/s][A
Epoch 6:  25%|██▌       | 1517/5971 [17:28<51:16,  1.45it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 27.88it/s][A
Epoch 6:  25%|██▌       | 1521/5971 [17:28<51:06,  1.45it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  49%|████▉     | 82/167 [00:03<00:03, 27.10it/s][A
Epoch 6:  26%|██▌       | 1525/5971 [17:28<50:56,  1.45it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  51%|█████     | 85/167 [00:03<00:03, 27.17it/s][A

Validating:  53%|█████▎    | 88/167 [00:03<00:02, 27.32it/s][A
Epoch 6:  26%|██▌       | 1529/5971 [17:29<50:45,  1.46it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  54%|█████▍    | 91/167 [00:03<00:02, 26.45it/s][A
Epoch 6:  26%|██▌       | 1533/5971 [17:29<50:35,  1.46it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 26.49it/s][A
Epoch 6:  26%|██▌       | 1537/5971 [17:29<50:25,  1.47it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 26.61it/s][A

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 27.29it/s][A
Epoch 6:  26%|██▌       | 1541/5971 [17:29<50:15,  1.47it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 27.08it/s][A
Epoch 6:  26%|██▌       | 1545/5971 [17:29<50:05,  1.47it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 26.47it/s][A
Epoch 6:  26%|██▌       | 1549/5971 [17:29<49:55,  1.48it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 26.53it/s][A
Epoch 6:  26%|██▌       | 1553/5971 [17:30<49:45,  1.48it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  68%|██████▊   | 113/167 [00:04<00:01, 27.08it/s][A

Validating:  69%|██████▉   | 116/167 [00:04<00:01, 27.74it/s][A
Epoch 6:  26%|██▌       | 1557/5971 [17:30<49:35,  1.48it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  72%|███████▏  | 120/167 [00:04<00:01, 27.71it/s][A
Epoch 6:  26%|██▌       | 1561/5971 [17:30<49:25,  1.49it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 27.24it/s][A
Epoch 6:  26%|██▌       | 1565/5971 [17:30<49:15,  1.49it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 27.68it/s][A
Epoch 6:  26%|██▋       | 1569/5971 [17:30<49:05,  1.49it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 26.59it/s][A

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 27.15it/s][A
Epoch 6:  26%|██▋       | 1573/5971 [17:30<48:55,  1.50it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  81%|████████  | 135/167 [00:05<00:01, 26.62it/s][A
Epoch 6:  26%|██▋       | 1577/5971 [17:30<48:46,  1.50it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 26.94it/s][A
Epoch 6:  26%|██▋       | 1581/5971 [17:31<48:36,  1.51it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  85%|████████▌ | 142/167 [00:05<00:00, 26.54it/s][A
Epoch 6:  27%|██▋       | 1585/5971 [17:31<48:27,  1.51it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  87%|████████▋ | 145/167 [00:05<00:00, 26.59it/s][A

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 26.68it/s][A
Epoch 6:  27%|██▋       | 1589/5971 [17:31<48:17,  1.51it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 26.67it/s][A
Epoch 6:  27%|██▋       | 1593/5971 [17:31<48:08,  1.52it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 26.05it/s][A
Epoch 6:  27%|██▋       | 1597/5971 [17:31<47:58,  1.52it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 27.10it/s][A
Epoch 6:  27%|██▋       | 1601/5971 [17:31<47:49,  1.52it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 28.06it/s][A
Epoch 6:  27%|██▋       | 1605/5971 [17:31<47:39,  1.53it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  99%|█████████▉| 165/167 [00:06<00:00, 28.13it/s][A
Epoch 6:  27%|██▋       | 1608/5971 [17:32<47:33,  1.53it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.43it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.27it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.91it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.38it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.73it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.95it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.16it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.30it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.38it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.45it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.51it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.54it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.56it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.50it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.46it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.41it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.44it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.49it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.56it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.61it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.64it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.67it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.68it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.69it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.69it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.69it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.57it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.60it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.62it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.63it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.49it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.51it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.52it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.50it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.52it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.57it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.60it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.63it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.66it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.64it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.66it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.63it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.61it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.61it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.64it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:08<00:00,  5.66it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.65it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.64it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.64it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.26it/s]

Epoch 6:  27%|██▋       | 1609/5971 [17:44<48:03,  1.51it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00664, train/loss_vlb_step=3.2e-5, train/loss_step=0.00664, global_step=3599.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1609/5971 [17:44<48:03,  1.51it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0401, train/loss_vlb_step=0.000148, train/loss_step=0.0401, global_step=3600.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.41it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.25it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.90it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.39it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.75it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.94it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.03it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.10it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.14it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.24it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.26it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.39it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.48it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.54it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.56it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.59it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.61it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.62it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.64it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.55it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.51it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.35it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.37it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.37it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.39it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.41it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.42it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.39it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.36it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.27it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.30it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.32it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.43it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.51it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.56it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.59it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.61it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.56it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.54it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.52it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.53it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.50it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.51it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.56it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.48it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.49it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.50it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.49it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.45it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.16it/s]

Epoch 6:  27%|██▋       | 1610/5971 [17:56<48:33,  1.50it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0401, train/loss_vlb_step=0.000148, train/loss_step=0.0401, global_step=3600.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1610/5971 [17:56<48:33,  1.50it/s, loss=0.17, v_num=0, train/loss_simple_step=0.609, train/loss_vlb_step=0.00674, train/loss_step=0.609, global_step=3600.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:40,  1.21it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:01<00:26,  1.84it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:21,  2.18it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:19,  2.41it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:02<00:18,  2.37it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:02<00:18,  2.33it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:03<00:19,  2.17it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:03<00:18,  2.33it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:04<00:16,  2.44it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:04<00:15,  2.54it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:04<00:15,  2.54it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:05<00:14,  2.55it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:05<00:14,  2.63it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:05<00:13,  2.70it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:06<00:12,  2.76it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:06<00:12,  2.76it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:06<00:12,  2.69it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:07<00:12,  2.62it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:07<00:12,  2.56it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:08<00:12,  2.48it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:08<00:11,  2.46it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:08<00:10,  2.55it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:09<00:11,  2.44it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:09<00:10,  2.48it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:10<00:10,  2.41it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:10<00:09,  2.44it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:11<00:09,  2.44it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:11<00:08,  2.55it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:11<00:08,  2.49it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:12<00:08,  2.49it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:12<00:07,  2.44it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:13<00:07,  2.39it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:13<00:07,  2.37it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:14<00:07,  2.27it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:14<00:06,  2.23it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:14<00:06,  2.30it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:15<00:05,  2.25it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:15<00:05,  2.23it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:16<00:04,  2.29it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:16<00:04,  2.33it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:17<00:03,  2.38it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:17<00:03,  2.43it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:17<00:02,  2.49it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:18<00:02,  2.52it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:18<00:01,  2.58it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:18<00:01,  2.55it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:19<00:01,  2.46it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:19<00:00,  2.50it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:20<00:00,  2.49it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:20<00:00,  2.55it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:20<00:00,  2.43it/s]

Epoch 6:  27%|██▋       | 1611/5971 [18:19<49:34,  1.47it/s, loss=0.17, v_num=0, train/loss_simple_step=0.609, train/loss_vlb_step=0.00674, train/loss_step=0.609, global_step=3600.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1611/5971 [18:19<49:34,  1.47it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00231, train/loss_vlb_step=1.37e-5, train/loss_step=0.00231, global_step=3600.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:48,  1.02it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:01<00:34,  1.40it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:28,  1.64it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:02<00:23,  1.93it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:02<00:20,  2.23it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:02<00:17,  2.54it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:03<00:16,  2.66it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:03<00:15,  2.72it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:04<00:14,  2.81it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:04<00:14,  2.84it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:04<00:13,  2.85it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:05<00:13,  2.84it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:05<00:13,  2.82it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:05<00:12,  2.80it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:06<00:12,  2.88it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:06<00:11,  2.91it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:06<00:11,  2.96it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:07<00:10,  2.96it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:07<00:10,  2.89it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:07<00:10,  2.94it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:08<00:09,  3.05it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:08<00:08,  3.24it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:08<00:08,  3.35it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:08<00:07,  3.46it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:09<00:07,  3.54it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:09<00:06,  3.51it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:09<00:06,  3.54it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:10<00:06,  3.57it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:10<00:06,  3.34it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:10<00:06,  3.12it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:11<00:06,  3.05it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:11<00:05,  3.18it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:11<00:05,  3.12it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:11<00:04,  3.22it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:12<00:04,  3.43it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:12<00:03,  3.54it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:12<00:03,  3.57it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:13<00:03,  3.64it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:13<00:02,  3.71it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:13<00:02,  3.58it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:13<00:02,  3.54it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:14<00:02,  3.40it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:14<00:02,  3.31it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:14<00:01,  3.42it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:15<00:01,  3.53it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:15<00:01,  3.41it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:15<00:00,  3.61it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:15<00:00,  3.69it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:16<00:00,  3.80it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:16<00:00,  3.77it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:16<00:00,  3.06it/s]

Epoch 6:  27%|██▋       | 1612/5971 [18:41<50:31,  1.44it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00231, train/loss_vlb_step=1.37e-5, train/loss_step=0.00231, global_step=3600.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1612/5971 [18:41<50:31,  1.44it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00275, train/loss_vlb_step=1.58e-5, train/loss_step=0.00275, global_step=3600.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1613/5971 [18:43<50:32,  1.44it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00275, train/loss_vlb_step=1.58e-5, train/loss_step=0.00275, global_step=3600.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1613/5971 [18:43<50:32,  1.44it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0781, train/loss_vlb_step=0.000259, train/loss_step=0.0781, global_step=3601.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  27%|██▋       | 1614/5971 [18:44<50:33,  1.44it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0781, train/loss_vlb_step=0.000259, train/loss_step=0.0781, global_step=3601.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1614/5971 [18:44<50:33,  1.44it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0658, train/loss_vlb_step=0.000223, train/loss_step=0.0658, global_step=3601.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  27%|██▋       | 1615/5971 [18:45<50:33,  1.44it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0658, train/loss_vlb_step=0.000223, train/loss_step=0.0658, global_step=3601.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1615/5971 [18:45<50:33,  1.44it/s, loss=0.16, v_num=0, train/loss_simple_step=0.195, train/loss_vlb_step=0.000675, train/loss_step=0.195, global_step=3601.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  27%|██▋       | 1616/5971 [18:48<50:38,  1.43it/s, loss=0.16, v_num=0, train/loss_simple_step=0.195, train/loss_vlb_step=0.000675, train/loss_step=0.195, global_step=3601.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1616/5971 [18:48<50:38,  1.43it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.49e-6, train/loss_step=0.00156, global_step=3601.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1617/5971 [18:49<50:38,  1.43it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.49e-6, train/loss_step=0.00156, global_step=3601.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1617/5971 [18:49<50:38,  1.43it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00362, train/loss_vlb_step=2.01e-5, train/loss_step=0.00362, global_step=3602.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1618/5971 [18:50<50:38,  1.43it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00362, train/loss_vlb_step=2.01e-5, train/loss_step=0.00362, global_step=3602.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1618/5971 [18:50<50:38,  1.43it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0161, train/loss_vlb_step=6.76e-5, train/loss_step=0.0161, global_step=3602.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  27%|██▋       | 1619/5971 [18:51<50:39,  1.43it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0161, train/loss_vlb_step=6.76e-5, train/loss_step=0.0161, global_step=3602.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1619/5971 [18:51<50:39,  1.43it/s, loss=0.15, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000803, train/loss_step=0.208, global_step=3602.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  27%|██▋       | 1620/5971 [18:53<50:43,  1.43it/s, loss=0.15, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000803, train/loss_step=0.208, global_step=3602.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1620/5971 [18:53<50:43,  1.43it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0413, train/loss_vlb_step=0.000142, train/loss_step=0.0413, global_step=3602.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1621/5971 [18:54<50:43,  1.43it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0413, train/loss_vlb_step=0.000142, train/loss_step=0.0413, global_step=3602.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1621/5971 [18:54<50:43,  1.43it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0337, train/loss_vlb_step=0.000131, train/loss_step=0.0337, global_step=3603.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1622/5971 [18:55<50:43,  1.43it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0337, train/loss_vlb_step=0.000131, train/loss_step=0.0337, global_step=3603.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1622/5971 [18:55<50:43,  1.43it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0249, train/loss_vlb_step=9.22e-5, train/loss_step=0.0249, global_step=3603.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  27%|██▋       | 1623/5971 [18:56<50:43,  1.43it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0249, train/loss_vlb_step=9.22e-5, train/loss_step=0.0249, global_step=3603.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1623/5971 [18:56<50:43,  1.43it/s, loss=0.105, v_num=0, train/loss_simple_step=0.00292, train/loss_vlb_step=1.67e-5, train/loss_step=0.00292, global_step=3603.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1624/5971 [18:59<50:47,  1.43it/s, loss=0.105, v_num=0, train/loss_simple_step=0.00292, train/loss_vlb_step=1.67e-5, train/loss_step=0.00292, global_step=3603.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1624/5971 [18:59<50:47,  1.43it/s, loss=0.134, v_num=0, train/loss_simple_step=0.645, train/loss_vlb_step=0.0157, train/loss_step=0.645, global_step=3603.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:  27%|██▋       | 1625/5971 [19:00<50:47,  1.43it/s, loss=0.134, v_num=0, train/loss_simple_step=0.645, train/loss_vlb_step=0.0157, train/loss_step=0.645, global_step=3603.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1625/5971 [19:00<50:47,  1.43it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.71e-5, train/loss_step=0.0105, global_step=3604.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1626/5971 [19:01<50:47,  1.43it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.71e-5, train/loss_step=0.0105, global_step=3604.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1626/5971 [19:01<50:47,  1.43it/s, loss=0.107, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000496, train/loss_step=0.149, global_step=3604.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  27%|██▋       | 1627/5971 [19:02<50:47,  1.43it/s, loss=0.107, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000496, train/loss_step=0.149, global_step=3604.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1627/5971 [19:02<50:47,  1.43it/s, loss=0.11, v_num=0, train/loss_simple_step=0.073, train/loss_vlb_step=0.000242, train/loss_step=0.073, global_step=3604.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  27%|██▋       | 1628/5971 [19:04<50:52,  1.42it/s, loss=0.11, v_num=0, train/loss_simple_step=0.073, train/loss_vlb_step=0.000242, train/loss_step=0.073, global_step=3604.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1628/5971 [19:04<50:52,  1.42it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0898, train/loss_vlb_step=0.000297, train/loss_step=0.0898, global_step=3604.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1629/5971 [19:05<50:52,  1.42it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0898, train/loss_vlb_step=0.000297, train/loss_step=0.0898, global_step=3604.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1629/5971 [19:05<50:52,  1.42it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0755, train/loss_vlb_step=0.00025, train/loss_step=0.0755, global_step=3605.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  27%|██▋       | 1630/5971 [19:06<50:52,  1.42it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0755, train/loss_vlb_step=0.00025, train/loss_step=0.0755, global_step=3605.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1630/5971 [19:06<50:52,  1.42it/s, loss=0.1, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00148, train/loss_step=0.282, global_step=3605.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  27%|██▋       | 1631/5971 [19:07<50:51,  1.42it/s, loss=0.1, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00148, train/loss_step=0.282, global_step=3605.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1631/5971 [19:07<50:51,  1.42it/s, loss=0.1, v_num=0, train/loss_simple_step=0.00436, train/loss_vlb_step=2.28e-5, train/loss_step=0.00436, global_step=3605.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1632/5971 [19:10<50:56,  1.42it/s, loss=0.1, v_num=0, train/loss_simple_step=0.00436, train/loss_vlb_step=2.28e-5, train/loss_step=0.00436, global_step=3605.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1632/5971 [19:10<50:56,  1.42it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0174, train/loss_vlb_step=7.31e-5, train/loss_step=0.0174, global_step=3605.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1633/5971 [19:11<50:56,  1.42it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0174, train/loss_vlb_step=7.31e-5, train/loss_step=0.0174, global_step=3605.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1633/5971 [19:11<50:56,  1.42it/s, loss=0.0991, v_num=0, train/loss_simple_step=0.0425, train/loss_vlb_step=0.000155, train/loss_step=0.0425, global_step=3606.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1634/5971 [19:12<50:56,  1.42it/s, loss=0.0991, v_num=0, train/loss_simple_step=0.0425, train/loss_vlb_step=0.000155, train/loss_step=0.0425, global_step=3606.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1634/5971 [19:12<50:56,  1.42it/s, loss=0.0965, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.81e-5, train/loss_step=0.0132, global_step=3606.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  27%|██▋       | 1635/5971 [19:13<50:56,  1.42it/s, loss=0.0965, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.81e-5, train/loss_step=0.0132, global_step=3606.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1635/5971 [19:13<50:56,  1.42it/s, loss=0.0869, v_num=0, train/loss_simple_step=0.00226, train/loss_vlb_step=1.29e-5, train/loss_step=0.00226, global_step=3606.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1636/5971 [19:15<51:01,  1.42it/s, loss=0.0869, v_num=0, train/loss_simple_step=0.00226, train/loss_vlb_step=1.29e-5, train/loss_step=0.00226, global_step=3606.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1636/5971 [19:15<51:01,  1.42it/s, loss=0.0875, v_num=0, train/loss_simple_step=0.014, train/loss_vlb_step=5.91e-5, train/loss_step=0.014, global_step=3606.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  27%|██▋       | 1637/5971 [19:16<51:01,  1.42it/s, loss=0.0875, v_num=0, train/loss_simple_step=0.014, train/loss_vlb_step=5.91e-5, train/loss_step=0.014, global_step=3606.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1637/5971 [19:16<51:01,  1.42it/s, loss=0.0882, v_num=0, train/loss_simple_step=0.0174, train/loss_vlb_step=7.09e-5, train/loss_step=0.0174, global_step=3607.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1638/5971 [19:17<51:00,  1.42it/s, loss=0.0882, v_num=0, train/loss_simple_step=0.0174, train/loss_vlb_step=7.09e-5, train/loss_step=0.0174, global_step=3607.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1638/5971 [19:17<51:00,  1.42it/s, loss=0.0877, v_num=0, train/loss_simple_step=0.00638, train/loss_vlb_step=3.1e-5, train/loss_step=0.00638, global_step=3607.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1639/5971 [19:18<51:00,  1.42it/s, loss=0.0877, v_num=0, train/loss_simple_step=0.00638, train/loss_vlb_step=3.1e-5, train/loss_step=0.00638, global_step=3607.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1639/5971 [19:18<51:00,  1.42it/s, loss=0.078, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.44e-5, train/loss_step=0.0132, global_step=3607.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  27%|██▋       | 1640/5971 [19:21<51:05,  1.41it/s, loss=0.078, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.44e-5, train/loss_step=0.0132, global_step=3607.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1640/5971 [19:21<51:05,  1.41it/s, loss=0.0786, v_num=0, train/loss_simple_step=0.0551, train/loss_vlb_step=0.000193, train/loss_step=0.0551, global_step=3607.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1641/5971 [19:22<51:05,  1.41it/s, loss=0.0786, v_num=0, train/loss_simple_step=0.0551, train/loss_vlb_step=0.000193, train/loss_step=0.0551, global_step=3607.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1641/5971 [19:22<51:05,  1.41it/s, loss=0.104, v_num=0, train/loss_simple_step=0.533, train/loss_vlb_step=0.00478, train/loss_step=0.533, global_step=3608.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  27%|██▋       | 1642/5971 [19:23<51:05,  1.41it/s, loss=0.104, v_num=0, train/loss_simple_step=0.533, train/loss_vlb_step=0.00478, train/loss_step=0.533, global_step=3608.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  27%|██▋       | 1642/5971 [19:23<51:05,  1.41it/s, loss=0.103, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=4.72e-5, train/loss_step=0.011, global_step=3608.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1643/5971 [19:24<51:06,  1.41it/s, loss=0.103, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=4.72e-5, train/loss_step=0.011, global_step=3608.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1643/5971 [19:24<51:06,  1.41it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0669, train/loss_vlb_step=0.000223, train/loss_step=0.0669, global_step=3608.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1644/5971 [19:27<51:10,  1.41it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0669, train/loss_vlb_step=0.000223, train/loss_step=0.0669, global_step=3608.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1644/5971 [19:27<51:10,  1.41it/s, loss=0.0754, v_num=0, train/loss_simple_step=0.0316, train/loss_vlb_step=0.000121, train/loss_step=0.0316, global_step=3608.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1645/5971 [19:28<51:10,  1.41it/s, loss=0.0754, v_num=0, train/loss_simple_step=0.0316, train/loss_vlb_step=0.000121, train/loss_step=0.0316, global_step=3608.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1645/5971 [19:28<51:10,  1.41it/s, loss=0.0787, v_num=0, train/loss_simple_step=0.0759, train/loss_vlb_step=0.000258, train/loss_step=0.0759, global_step=3609.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1646/5971 [19:29<51:10,  1.41it/s, loss=0.0787, v_num=0, train/loss_simple_step=0.0759, train/loss_vlb_step=0.000258, train/loss_step=0.0759, global_step=3609.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1646/5971 [19:29<51:10,  1.41it/s, loss=0.0933, v_num=0, train/loss_simple_step=0.441, train/loss_vlb_step=0.00344, train/loss_step=0.441, global_step=3609.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  28%|██▊       | 1647/5971 [19:30<51:10,  1.41it/s, loss=0.0933, v_num=0, train/loss_simple_step=0.441, train/loss_vlb_step=0.00344, train/loss_step=0.441, global_step=3609.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1647/5971 [19:30<51:10,  1.41it/s, loss=0.09, v_num=0, train/loss_simple_step=0.00747, train/loss_vlb_step=3.59e-5, train/loss_step=0.00747, global_step=3609.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1648/5971 [19:32<51:14,  1.41it/s, loss=0.09, v_num=0, train/loss_simple_step=0.00747, train/loss_vlb_step=3.59e-5, train/loss_step=0.00747, global_step=3609.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1648/5971 [19:32<51:14,  1.41it/s, loss=0.0921, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000437, train/loss_step=0.133, global_step=3609.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  28%|██▊       | 1649/5971 [19:33<51:14,  1.41it/s, loss=0.0921, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000437, train/loss_step=0.133, global_step=3609.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1649/5971 [19:33<51:14,  1.41it/s, loss=0.0933, v_num=0, train/loss_simple_step=0.0986, train/loss_vlb_step=0.000324, train/loss_step=0.0986, global_step=3610.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1650/5971 [19:34<51:14,  1.41it/s, loss=0.0933, v_num=0, train/loss_simple_step=0.0986, train/loss_vlb_step=0.000324, train/loss_step=0.0986, global_step=3610.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1650/5971 [19:34<51:14,  1.41it/s, loss=0.0814, v_num=0, train/loss_simple_step=0.0443, train/loss_vlb_step=0.000156, train/loss_step=0.0443, global_step=3610.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1651/5971 [19:35<51:14,  1.41it/s, loss=0.0814, v_num=0, train/loss_simple_step=0.0443, train/loss_vlb_step=0.000156, train/loss_step=0.0443, global_step=3610.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1651/5971 [19:35<51:14,  1.41it/s, loss=0.0819, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.12e-5, train/loss_step=0.0139, global_step=3610.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  28%|██▊       | 1652/5971 [19:38<51:18,  1.40it/s, loss=0.0819, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.12e-5, train/loss_step=0.0139, global_step=3610.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1652/5971 [19:38<51:18,  1.40it/s, loss=0.0883, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000507, train/loss_step=0.147, global_step=3610.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  28%|██▊       | 1653/5971 [19:38<51:17,  1.40it/s, loss=0.0883, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000507, train/loss_step=0.147, global_step=3610.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1653/5971 [19:38<51:17,  1.40it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.00035, train/loss_step=0.107, global_step=3611.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  28%|██▊       | 1654/5971 [19:39<51:17,  1.40it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.00035, train/loss_step=0.107, global_step=3611.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1654/5971 [19:39<51:17,  1.40it/s, loss=0.125, v_num=0, train/loss_simple_step=0.679, train/loss_vlb_step=0.0124, train/loss_step=0.679, global_step=3611.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  28%|██▊       | 1655/5971 [19:41<51:18,  1.40it/s, loss=0.125, v_num=0, train/loss_simple_step=0.679, train/loss_vlb_step=0.0124, train/loss_step=0.679, global_step=3611.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1655/5971 [19:41<51:18,  1.40it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0481, train/loss_vlb_step=0.000167, train/loss_step=0.0481, global_step=3611.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1656/5971 [19:43<51:21,  1.40it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0481, train/loss_vlb_step=0.000167, train/loss_step=0.0481, global_step=3611.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1656/5971 [19:43<51:21,  1.40it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00294, train/loss_vlb_step=1.63e-5, train/loss_step=0.00294, global_step=3611.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1657/5971 [19:44<51:22,  1.40it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00294, train/loss_vlb_step=1.63e-5, train/loss_step=0.00294, global_step=3611.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1657/5971 [19:44<51:22,  1.40it/s, loss=0.132, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000451, train/loss_step=0.136, global_step=3612.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  28%|██▊       | 1658/5971 [19:45<51:21,  1.40it/s, loss=0.132, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000451, train/loss_step=0.136, global_step=3612.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1658/5971 [19:45<51:21,  1.40it/s, loss=0.139, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.000482, train/loss_step=0.146, global_step=3612.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1659/5971 [19:46<51:21,  1.40it/s, loss=0.139, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.000482, train/loss_step=0.146, global_step=3612.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1659/5971 [19:46<51:21,  1.40it/s, loss=0.141, v_num=0, train/loss_simple_step=0.038, train/loss_vlb_step=0.00015, train/loss_step=0.038, global_step=3612.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  28%|██▊       | 1660/5971 [19:48<51:24,  1.40it/s, loss=0.141, v_num=0, train/loss_simple_step=0.038, train/loss_vlb_step=0.00015, train/loss_step=0.038, global_step=3612.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1660/5971 [19:48<51:24,  1.40it/s, loss=0.164, v_num=0, train/loss_simple_step=0.526, train/loss_vlb_step=0.00385, train/loss_step=0.526, global_step=3612.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1661/5971 [19:49<51:24,  1.40it/s, loss=0.164, v_num=0, train/loss_simple_step=0.526, train/loss_vlb_step=0.00385, train/loss_step=0.526, global_step=3612.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1661/5971 [19:49<51:24,  1.40it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00153, train/loss_vlb_step=9.27e-6, train/loss_step=0.00153, global_step=3613.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1662/5971 [19:50<51:24,  1.40it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00153, train/loss_vlb_step=9.27e-6, train/loss_step=0.00153, global_step=3613.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1662/5971 [19:50<51:24,  1.40it/s, loss=0.186, v_num=0, train/loss_simple_step=0.970, train/loss_vlb_step=0.488, train/loss_step=0.970, global_step=3613.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]      
Epoch 6:  28%|██▊       | 1663/5971 [19:51<51:24,  1.40it/s, loss=0.186, v_num=0, train/loss_simple_step=0.970, train/loss_vlb_step=0.488, train/loss_step=0.970, global_step=3613.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1663/5971 [19:51<51:24,  1.40it/s, loss=0.211, v_num=0, train/loss_simple_step=0.578, train/loss_vlb_step=0.00856, train/loss_step=0.578, global_step=3613.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1664/5971 [19:53<51:28,  1.39it/s, loss=0.211, v_num=0, train/loss_simple_step=0.578, train/loss_vlb_step=0.00856, train/loss_step=0.578, global_step=3613.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1664/5971 [19:53<51:28,  1.39it/s, loss=0.212, v_num=0, train/loss_simple_step=0.0427, train/loss_vlb_step=0.000159, train/loss_step=0.0427, global_step=3613.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1665/5971 [19:54<51:27,  1.39it/s, loss=0.212, v_num=0, train/loss_simple_step=0.0427, train/loss_vlb_step=0.000159, train/loss_step=0.0427, global_step=3613.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1665/5971 [19:54<51:27,  1.39it/s, loss=0.208, v_num=0, train/loss_simple_step=0.0095, train/loss_vlb_step=4.48e-5, train/loss_step=0.0095, global_step=3614.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  28%|██▊       | 1666/5971 [19:55<51:27,  1.39it/s, loss=0.208, v_num=0, train/loss_simple_step=0.0095, train/loss_vlb_step=4.48e-5, train/loss_step=0.0095, global_step=3614.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1666/5971 [19:55<51:27,  1.39it/s, loss=0.187, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=4.64e-5, train/loss_step=0.011, global_step=3614.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  28%|██▊       | 1667/5971 [19:56<51:27,  1.39it/s, loss=0.187, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=4.64e-5, train/loss_step=0.011, global_step=3614.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1667/5971 [19:56<51:27,  1.39it/s, loss=0.193, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.0004, train/loss_step=0.121, global_step=3614.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  28%|██▊       | 1668/5971 [19:58<51:30,  1.39it/s, loss=0.193, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.0004, train/loss_step=0.121, global_step=3614.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1668/5971 [19:58<51:30,  1.39it/s, loss=0.186, v_num=0, train/loss_simple_step=0.000998, train/loss_vlb_step=5.96e-6, train/loss_step=0.000998, global_step=3614.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1669/5971 [19:59<51:30,  1.39it/s, loss=0.186, v_num=0, train/loss_simple_step=0.000998, train/loss_vlb_step=5.96e-6, train/loss_step=0.000998, global_step=3614.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1669/5971 [19:59<51:30,  1.39it/s, loss=0.197, v_num=0, train/loss_simple_step=0.324, train/loss_vlb_step=0.00182, train/loss_step=0.324, global_step=3615.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]      
Epoch 6:  28%|██▊       | 1670/5971 [20:00<51:30,  1.39it/s, loss=0.197, v_num=0, train/loss_simple_step=0.324, train/loss_vlb_step=0.00182, train/loss_step=0.324, global_step=3615.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1670/5971 [20:00<51:30,  1.39it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.96e-5, train/loss_step=0.0037, global_step=3615.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1671/5971 [20:01<51:30,  1.39it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.96e-5, train/loss_step=0.0037, global_step=3615.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1671/5971 [20:01<51:30,  1.39it/s, loss=0.195, v_num=0, train/loss_simple_step=0.00296, train/loss_vlb_step=1.56e-5, train/loss_step=0.00296, global_step=3615.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1672/5971 [20:03<51:33,  1.39it/s, loss=0.195, v_num=0, train/loss_simple_step=0.00296, train/loss_vlb_step=1.56e-5, train/loss_step=0.00296, global_step=3615.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1672/5971 [20:03<51:33,  1.39it/s, loss=0.213, v_num=0, train/loss_simple_step=0.504, train/loss_vlb_step=0.00476, train/loss_step=0.504, global_step=3615.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  28%|██▊       | 1673/5971 [20:04<51:33,  1.39it/s, loss=0.213, v_num=0, train/loss_simple_step=0.504, train/loss_vlb_step=0.00476, train/loss_step=0.504, global_step=3615.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1673/5971 [20:04<51:33,  1.39it/s, loss=0.224, v_num=0, train/loss_simple_step=0.341, train/loss_vlb_step=0.00136, train/loss_step=0.341, global_step=3616.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1674/5971 [20:05<51:33,  1.39it/s, loss=0.224, v_num=0, train/loss_simple_step=0.341, train/loss_vlb_step=0.00136, train/loss_step=0.341, global_step=3616.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1674/5971 [20:05<51:33,  1.39it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00629, train/loss_vlb_step=3.23e-5, train/loss_step=0.00629, global_step=3616.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1675/5971 [20:06<51:33,  1.39it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00629, train/loss_vlb_step=3.23e-5, train/loss_step=0.00629, global_step=3616.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1675/5971 [20:06<51:33,  1.39it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0229, train/loss_vlb_step=9.62e-5, train/loss_step=0.0229, global_step=3616.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  28%|██▊       | 1676/5971 [20:08<51:36,  1.39it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0229, train/loss_vlb_step=9.62e-5, train/loss_step=0.0229, global_step=3616.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1676/5971 [20:08<51:36,  1.39it/s, loss=0.198, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000571, train/loss_step=0.165, global_step=3616.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  28%|██▊       | 1677/5971 [20:09<51:36,  1.39it/s, loss=0.198, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000571, train/loss_step=0.165, global_step=3616.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1677/5971 [20:09<51:36,  1.39it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00548, train/loss_vlb_step=2.77e-5, train/loss_step=0.00548, global_step=3617.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1678/5971 [20:10<51:35,  1.39it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00548, train/loss_vlb_step=2.77e-5, train/loss_step=0.00548, global_step=3617.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1678/5971 [20:10<51:35,  1.39it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0483, train/loss_vlb_step=0.000173, train/loss_step=0.0483, global_step=3617.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  28%|██▊       | 1679/5971 [20:11<51:35,  1.39it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0483, train/loss_vlb_step=0.000173, train/loss_step=0.0483, global_step=3617.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1679/5971 [20:11<51:35,  1.39it/s, loss=0.219, v_num=0, train/loss_simple_step=0.703, train/loss_vlb_step=0.0305, train/loss_step=0.703, global_step=3617.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  28%|██▊       | 1680/5971 [20:14<51:39,  1.38it/s, loss=0.219, v_num=0, train/loss_simple_step=0.703, train/loss_vlb_step=0.0305, train/loss_step=0.703, global_step=3617.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1680/5971 [20:14<51:39,  1.38it/s, loss=0.211, v_num=0, train/loss_simple_step=0.355, train/loss_vlb_step=0.00224, train/loss_step=0.355, global_step=3617.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1681/5971 [20:15<51:39,  1.38it/s, loss=0.211, v_num=0, train/loss_simple_step=0.355, train/loss_vlb_step=0.00224, train/loss_step=0.355, global_step=3617.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1681/5971 [20:15<51:39,  1.38it/s, loss=0.233, v_num=0, train/loss_simple_step=0.446, train/loss_vlb_step=0.0027, train/loss_step=0.446, global_step=3618.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  28%|██▊       | 1682/5971 [20:15<51:38,  1.38it/s, loss=0.233, v_num=0, train/loss_simple_step=0.446, train/loss_vlb_step=0.0027, train/loss_step=0.446, global_step=3618.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1682/5971 [20:15<51:38,  1.38it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0749, train/loss_vlb_step=0.000246, train/loss_step=0.0749, global_step=3618.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1683/5971 [20:16<51:38,  1.38it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0749, train/loss_vlb_step=0.000246, train/loss_step=0.0749, global_step=3618.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1683/5971 [20:16<51:38,  1.38it/s, loss=0.173, v_num=0, train/loss_simple_step=0.271, train/loss_vlb_step=0.000992, train/loss_step=0.271, global_step=3618.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  28%|██▊       | 1684/5971 [20:19<51:41,  1.38it/s, loss=0.173, v_num=0, train/loss_simple_step=0.271, train/loss_vlb_step=0.000992, train/loss_step=0.271, global_step=3618.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1684/5971 [20:19<51:41,  1.38it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0153, train/loss_vlb_step=6.57e-5, train/loss_step=0.0153, global_step=3618.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1685/5971 [20:20<51:41,  1.38it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0153, train/loss_vlb_step=6.57e-5, train/loss_step=0.0153, global_step=3618.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1685/5971 [20:20<51:41,  1.38it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0769, train/loss_vlb_step=0.000268, train/loss_step=0.0769, global_step=3619.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1686/5971 [20:20<51:41,  1.38it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0769, train/loss_vlb_step=0.000268, train/loss_step=0.0769, global_step=3619.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1686/5971 [20:20<51:41,  1.38it/s, loss=0.18, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000372, train/loss_step=0.113, global_step=3619.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  28%|██▊       | 1687/5971 [20:21<51:41,  1.38it/s, loss=0.18, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000372, train/loss_step=0.113, global_step=3619.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1687/5971 [20:21<51:41,  1.38it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0658, train/loss_vlb_step=0.000221, train/loss_step=0.0658, global_step=3619.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1688/5971 [20:24<51:44,  1.38it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0658, train/loss_vlb_step=0.000221, train/loss_step=0.0658, global_step=3619.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1688/5971 [20:24<51:44,  1.38it/s, loss=0.201, v_num=0, train/loss_simple_step=0.473, train/loss_vlb_step=0.00362, train/loss_step=0.473, global_step=3619.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  28%|██▊       | 1689/5971 [20:25<51:44,  1.38it/s, loss=0.201, v_num=0, train/loss_simple_step=0.473, train/loss_vlb_step=0.00362, train/loss_step=0.473, global_step=3619.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1689/5971 [20:25<51:44,  1.38it/s, loss=0.186, v_num=0, train/loss_simple_step=0.033, train/loss_vlb_step=0.000123, train/loss_step=0.033, global_step=3620.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1690/5971 [20:26<51:44,  1.38it/s, loss=0.186, v_num=0, train/loss_simple_step=0.033, train/loss_vlb_step=0.000123, train/loss_step=0.033, global_step=3620.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1690/5971 [20:26<51:44,  1.38it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00419, train/loss_vlb_step=2.28e-5, train/loss_step=0.00419, global_step=3620.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1691/5971 [20:27<51:43,  1.38it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00419, train/loss_vlb_step=2.28e-5, train/loss_step=0.00419, global_step=3620.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1691/5971 [20:27<51:43,  1.38it/s, loss=0.206, v_num=0, train/loss_simple_step=0.394, train/loss_vlb_step=0.00182, train/loss_step=0.394, global_step=3620.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  28%|██▊       | 1692/5971 [20:29<51:47,  1.38it/s, loss=0.206, v_num=0, train/loss_simple_step=0.394, train/loss_vlb_step=0.00182, train/loss_step=0.394, global_step=3620.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1692/5971 [20:29<51:47,  1.38it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0212, train/loss_vlb_step=8.59e-5, train/loss_step=0.0212, global_step=3620.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1693/5971 [20:30<51:47,  1.38it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0212, train/loss_vlb_step=8.59e-5, train/loss_step=0.0212, global_step=3620.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1693/5971 [20:30<51:47,  1.38it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0652, train/loss_vlb_step=0.000222, train/loss_step=0.0652, global_step=3621.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1694/5971 [20:31<51:46,  1.38it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0652, train/loss_vlb_step=0.000222, train/loss_step=0.0652, global_step=3621.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1694/5971 [20:31<51:46,  1.38it/s, loss=0.178, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.000686, train/loss_step=0.197, global_step=3621.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  28%|██▊       | 1695/5971 [20:32<51:46,  1.38it/s, loss=0.178, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.000686, train/loss_step=0.197, global_step=3621.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1695/5971 [20:32<51:46,  1.38it/s, loss=0.183, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000417, train/loss_step=0.123, global_step=3621.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1696/5971 [20:34<51:49,  1.37it/s, loss=0.183, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000417, train/loss_step=0.123, global_step=3621.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1696/5971 [20:34<51:49,  1.37it/s, loss=0.184, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000778, train/loss_step=0.196, global_step=3621.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1697/5971 [20:35<51:49,  1.37it/s, loss=0.184, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000778, train/loss_step=0.196, global_step=3621.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1697/5971 [20:35<51:49,  1.37it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0206, train/loss_vlb_step=7.89e-5, train/loss_step=0.0206, global_step=3622.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1698/5971 [20:36<51:49,  1.37it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0206, train/loss_vlb_step=7.89e-5, train/loss_step=0.0206, global_step=3622.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1698/5971 [20:36<51:49,  1.37it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0261, train/loss_vlb_step=9.89e-5, train/loss_step=0.0261, global_step=3622.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1699/5971 [20:37<51:48,  1.37it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0261, train/loss_vlb_step=9.89e-5, train/loss_step=0.0261, global_step=3622.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1699/5971 [20:37<51:48,  1.37it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0669, train/loss_vlb_step=0.000227, train/loss_step=0.0669, global_step=3622.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1700/5971 [20:39<51:51,  1.37it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0669, train/loss_vlb_step=0.000227, train/loss_step=0.0669, global_step=3622.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1700/5971 [20:39<51:51,  1.37it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00901, train/loss_vlb_step=4.08e-5, train/loss_step=0.00901, global_step=3622.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1701/5971 [20:40<51:51,  1.37it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00901, train/loss_vlb_step=4.08e-5, train/loss_step=0.00901, global_step=3622.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  28%|██▊       | 1701/5971 [20:40<51:51,  1.37it/s, loss=0.113, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=4.58e-5, train/loss_step=0.011, global_step=3623.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  29%|██▊       | 1702/5971 [20:41<51:51,  1.37it/s, loss=0.113, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=4.58e-5, train/loss_step=0.011, global_step=3623.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  29%|██▊       | 1702/5971 [20:41<51:51,  1.37it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0489, train/loss_vlb_step=0.000175, train/loss_step=0.0489, global_step=3623.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  29%|██▊       | 1703/5971 [20:42<51:50,  1.37it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0489, train/loss_vlb_step=0.000175, train/loss_step=0.0489, global_step=3623.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  29%|██▊       | 1703/5971 [20:42<51:50,  1.37it/s, loss=0.126, v_num=0, train/loss_simple_step=0.565, train/loss_vlb_step=0.00468, train/loss_step=0.565, global_step=3623.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  29%|██▊       | 1704/5971 [20:44<51:54,  1.37it/s, loss=0.126, v_num=0, train/loss_simple_step=0.565, train/loss_vlb_step=0.00468, train/loss_step=0.565, global_step=3623.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  29%|██▊       | 1704/5971 [20:44<51:54,  1.37it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00221, train/loss_vlb_step=1.28e-5, train/loss_step=0.00221, global_step=3623.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  29%|██▊       | 1705/5971 [20:45<51:53,  1.37it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00221, train/loss_vlb_step=1.28e-5, train/loss_step=0.00221, global_step=3623.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  29%|██▊       | 1705/5971 [20:45<51:53,  1.37it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00488, train/loss_vlb_step=2.5e-5, train/loss_step=0.00488, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  29%|██▊       | 1706/5971 [20:46<51:53,  1.37it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00488, train/loss_vlb_step=2.5e-5, train/loss_step=0.00488, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  29%|██▊       | 1706/5971 [20:46<51:53,  1.37it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0298, train/loss_vlb_step=0.000112, train/loss_step=0.0298, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  29%|██▊       | 1707/5971 [20:47<51:53,  1.37it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0298, train/loss_vlb_step=0.000112, train/loss_step=0.0298, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  29%|██▊       | 1707/5971 [20:47<51:53,  1.37it/s, loss=0.124, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000728, train/loss_step=0.194, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  29%|██▊       | 1708/5971 [20:49<51:56,  1.37it/s, loss=0.124, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000728, train/loss_step=0.194, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  29%|██▊       | 1708/5971 [20:49<51:56,  1.37it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:07,  2.44it/s][A
Epoch 6:  29%|██▊       | 1710/5971 [20:49<51:52,  1.37it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   1%|          | 2/167 [00:00<00:45,  3.66it/s][A
Epoch 6:  29%|██▊       | 1712/5971 [20:50<51:47,  1.37it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.30it/s][A
Epoch 6:  29%|██▊       | 1715/5971 [20:50<51:40,  1.37it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   5%|▍         | 8/167 [00:00<00:11, 14.08it/s][A
Epoch 6:  29%|██▉       | 1718/5971 [20:50<51:33,  1.37it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   7%|▋         | 11/167 [00:00<00:09, 17.12it/s][A
Epoch 6:  29%|██▉       | 1721/5971 [20:50<51:26,  1.38it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   8%|▊         | 14/167 [00:01<00:08, 18.83it/s][A
Epoch 6:  29%|██▉       | 1724/5971 [20:50<51:18,  1.38it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  10%|█         | 17/167 [00:01<00:07, 20.31it/s][A
Epoch 6:  29%|██▉       | 1727/5971 [20:50<51:11,  1.38it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 21.07it/s][A
Epoch 6:  29%|██▉       | 1730/5971 [20:50<51:04,  1.38it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 21.52it/s][A
Epoch 6:  29%|██▉       | 1733/5971 [20:50<50:57,  1.39it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  16%|█▌        | 26/167 [00:01<00:06, 23.36it/s][A
Epoch 6:  29%|██▉       | 1736/5971 [20:51<50:50,  1.39it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  17%|█▋        | 29/167 [00:01<00:06, 22.46it/s][A
Epoch 6:  29%|██▉       | 1739/5971 [20:51<50:43,  1.39it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 22.94it/s][A
Epoch 6:  29%|██▉       | 1742/5971 [20:51<50:35,  1.39it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  21%|██        | 35/167 [00:01<00:05, 23.14it/s][A
Epoch 6:  29%|██▉       | 1745/5971 [20:51<50:28,  1.40it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  23%|██▎       | 38/167 [00:02<00:05, 23.45it/s][A
Epoch 6:  29%|██▉       | 1748/5971 [20:51<50:21,  1.40it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  25%|██▍       | 41/167 [00:02<00:05, 24.22it/s][A
Epoch 6:  29%|██▉       | 1751/5971 [20:51<50:14,  1.40it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  26%|██▋       | 44/167 [00:02<00:05, 24.22it/s][A
Epoch 6:  29%|██▉       | 1754/5971 [20:51<50:07,  1.40it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 24.63it/s][A
Epoch 6:  29%|██▉       | 1757/5971 [20:51<50:00,  1.40it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 24.54it/s][A
Epoch 6:  29%|██▉       | 1760/5971 [20:52<49:53,  1.41it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 22.81it/s][A
Epoch 6:  30%|██▉       | 1763/5971 [20:52<49:47,  1.41it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 23.73it/s][A
Epoch 6:  30%|██▉       | 1766/5971 [20:52<49:40,  1.41it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  35%|███▌      | 59/167 [00:02<00:04, 23.82it/s][A
Epoch 6:  30%|██▉       | 1769/5971 [20:52<49:33,  1.41it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  37%|███▋      | 62/167 [00:03<00:04, 23.20it/s][A
Epoch 6:  30%|██▉       | 1772/5971 [20:52<49:26,  1.42it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  39%|███▉      | 65/167 [00:03<00:04, 24.03it/s][A
Epoch 6:  30%|██▉       | 1775/5971 [20:52<49:19,  1.42it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  41%|████      | 68/167 [00:03<00:04, 24.57it/s][A
Epoch 6:  30%|██▉       | 1778/5971 [20:52<49:12,  1.42it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 24.87it/s][A
Epoch 6:  30%|██▉       | 1781/5971 [20:52<49:05,  1.42it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  44%|████▍     | 74/167 [00:03<00:04, 23.03it/s][A
Epoch 6:  30%|██▉       | 1784/5971 [20:53<48:59,  1.42it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 23.01it/s][A
Epoch 6:  30%|██▉       | 1787/5971 [20:53<48:52,  1.43it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 23.37it/s][A
Epoch 6:  30%|██▉       | 1790/5971 [20:53<48:45,  1.43it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 23.29it/s][A
Epoch 6:  30%|███       | 1793/5971 [20:53<48:39,  1.43it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  51%|█████▏    | 86/167 [00:04<00:03, 23.36it/s][A
Epoch 6:  30%|███       | 1796/5971 [20:53<48:32,  1.43it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  53%|█████▎    | 89/167 [00:04<00:03, 24.79it/s][A
Epoch 6:  30%|███       | 1799/5971 [20:53<48:25,  1.44it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 25.27it/s][A
Epoch 6:  30%|███       | 1802/5971 [20:53<48:19,  1.44it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 25.33it/s][A
Epoch 6:  30%|███       | 1805/5971 [20:53<48:12,  1.44it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 25.49it/s][A
Epoch 6:  30%|███       | 1808/5971 [20:54<48:05,  1.44it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  60%|██████    | 101/167 [00:04<00:02, 25.43it/s][A
Epoch 6:  30%|███       | 1811/5971 [20:54<47:59,  1.44it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 26.16it/s][A
Epoch 6:  30%|███       | 1814/5971 [20:54<47:52,  1.45it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 24.65it/s][A
Epoch 6:  30%|███       | 1817/5971 [20:54<47:46,  1.45it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  66%|██████▌   | 110/167 [00:05<00:02, 25.30it/s][A
Epoch 6:  30%|███       | 1820/5971 [20:54<47:39,  1.45it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  68%|██████▊   | 113/167 [00:05<00:02, 26.52it/s][A
Epoch 6:  31%|███       | 1824/5971 [20:54<47:30,  1.45it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  69%|██████▉   | 116/167 [00:05<00:01, 27.30it/s][A

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 25.73it/s][A
Epoch 6:  31%|███       | 1828/5971 [20:54<47:22,  1.46it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 25.35it/s][A
Epoch 6:  31%|███       | 1832/5971 [20:54<47:13,  1.46it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 22.85it/s][A
Epoch 6:  31%|███       | 1836/5971 [20:55<47:05,  1.46it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 23.70it/s][A

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 24.35it/s][A
Epoch 6:  31%|███       | 1840/5971 [20:55<46:56,  1.47it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  80%|████████  | 134/167 [00:05<00:01, 25.40it/s][A
Epoch 6:  31%|███       | 1844/5971 [20:55<46:48,  1.47it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  82%|████████▏ | 137/167 [00:06<00:01, 25.45it/s][A
Epoch 6:  31%|███       | 1848/5971 [20:55<46:39,  1.47it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  84%|████████▍ | 140/167 [00:06<00:01, 25.47it/s][A

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 25.61it/s][A
Epoch 6:  31%|███       | 1852/5971 [20:55<46:31,  1.48it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 24.68it/s][A
Epoch 6:  31%|███       | 1856/5971 [20:55<46:23,  1.48it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 22.58it/s][A
Epoch 6:  31%|███       | 1860/5971 [20:56<46:14,  1.48it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 22.37it/s][A

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 22.72it/s][A
Epoch 6:  31%|███       | 1864/5971 [20:56<46:06,  1.48it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  95%|█████████▍| 158/167 [00:07<00:00, 23.49it/s]
Epoch 6:  31%|███▏      | 1868/5971 [20:56<45:58,  1.49it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  96%|█████████▋| 161/167 [00:07<00:00, 23.67it/s]
Epoch 6:  31%|███▏      | 1872/5971 [20:56<45:50,  1.49it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  98%|█████████▊| 164/167 [00:07<00:00, 23.19it/s]

Validating: 100%|██████████| 167/167 [00:07<00:00, 24.21it/s]
Epoch 6:  31%|███▏      | 1876/5971 [20:56<45:41,  1.49it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  31%|███▏      | 1876/5971 [20:57<45:42,  1.49it/s, loss=0.144, v_num=0, train/loss_simple_step=0.873, train/loss_vlb_step=0.220, train/loss_step=0.873, global_step=3624.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Epoch 6:  31%|███▏      | 1877/5971 [20:58<45:42,  1.49it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0636, train/loss_vlb_step=0.000212, train/loss_step=0.0636, global_step=3625.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  31%|███▏      | 1878/5971 [20:59<45:42,  1.49it/s, loss=0.148, v_num=0, train/loss_simple_step=0.044, train/loss_vlb_step=0.00016, train/loss_step=0.044, global_step=3625.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  31%|███▏      | 1879/5971 [20:59<45:42,  1.49it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0915, train/loss_vlb_step=0.000307, train/loss_step=0.0915, global_step=3625.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  31%|███▏      | 1880/5971 [21:02<45:45,  1.49it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0915, train/loss_vlb_step=0.000307, train/loss_step=0.0915, global_step=3625.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  31%|███▏      | 1880/5971 [21:02<45:45,  1.49it/s, loss=0.149, v_num=0, train/loss_simple_step=0.357, train/loss_vlb_step=0.00199, train/loss_step=0.357, global_step=3625.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  32%|███▏      | 1881/5971 [21:03<45:45,  1.49it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00746, train/loss_vlb_step=3.48e-5, train/loss_step=0.00746, global_step=3626.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1882/5971 [21:04<45:45,  1.49it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00931, train/loss_vlb_step=4.3e-5, train/loss_step=0.00931, global_step=3626.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  32%|███▏      | 1883/5971 [21:05<45:45,  1.49it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0256, train/loss_vlb_step=9.72e-5, train/loss_step=0.0256, global_step=3626.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  32%|███▏      | 1884/5971 [21:07<45:48,  1.49it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0256, train/loss_vlb_step=9.72e-5, train/loss_step=0.0256, global_step=3626.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1884/5971 [21:07<45:48,  1.49it/s, loss=0.123, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=4.9e-5, train/loss_step=0.011, global_step=3626.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  32%|███▏      | 1885/5971 [21:08<45:48,  1.49it/s, loss=0.133, v_num=0, train/loss_simple_step=0.221, train/loss_vlb_step=0.000756, train/loss_step=0.221, global_step=3627.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1886/5971 [21:09<45:48,  1.49it/s, loss=0.168, v_num=0, train/loss_simple_step=0.729, train/loss_vlb_step=0.0316, train/loss_step=0.729, global_step=3627.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  32%|███▏      | 1887/5971 [21:10<45:48,  1.49it/s, loss=0.189, v_num=0, train/loss_simple_step=0.478, train/loss_vlb_step=0.00458, train/loss_step=0.478, global_step=3627.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1888/5971 [21:12<45:51,  1.48it/s, loss=0.189, v_num=0, train/loss_simple_step=0.478, train/loss_vlb_step=0.00458, train/loss_step=0.478, global_step=3627.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1888/5971 [21:12<45:51,  1.48it/s, loss=0.189, v_num=0, train/loss_simple_step=0.00932, train/loss_vlb_step=3.99e-5, train/loss_step=0.00932, global_step=3627.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1889/5971 [21:13<45:51,  1.48it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0772, train/loss_vlb_step=0.000265, train/loss_step=0.0772, global_step=3628.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  32%|███▏      | 1890/5971 [21:14<45:50,  1.48it/s, loss=0.19, v_num=0, train/loss_simple_step=0.00527, train/loss_vlb_step=2.69e-5, train/loss_step=0.00527, global_step=3628.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1891/5971 [21:15<45:50,  1.48it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=4.95e-5, train/loss_step=0.0112, global_step=3628.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  32%|███▏      | 1892/5971 [21:17<45:53,  1.48it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=4.95e-5, train/loss_step=0.0112, global_step=3628.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1892/5971 [21:17<45:53,  1.48it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0719, train/loss_vlb_step=0.000241, train/loss_step=0.0719, global_step=3628.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1893/5971 [21:18<45:53,  1.48it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0018, train/loss_vlb_step=1.09e-5, train/loss_step=0.0018, global_step=3629.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  32%|███▏      | 1894/5971 [21:19<45:52,  1.48it/s, loss=0.174, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000641, train/loss_step=0.190, global_step=3629.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  32%|███▏      | 1895/5971 [21:20<45:52,  1.48it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0666, train/loss_vlb_step=0.000228, train/loss_step=0.0666, global_step=3629.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1896/5971 [21:22<45:55,  1.48it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0666, train/loss_vlb_step=0.000228, train/loss_step=0.0666, global_step=3629.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1896/5971 [21:22<45:55,  1.48it/s, loss=0.131, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.000503, train/loss_step=0.146, global_step=3629.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  32%|███▏      | 1897/5971 [21:23<45:55,  1.48it/s, loss=0.144, v_num=0, train/loss_simple_step=0.323, train/loss_vlb_step=0.00149, train/loss_step=0.323, global_step=3630.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  32%|███▏      | 1898/5971 [21:24<45:54,  1.48it/s, loss=0.16, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.0016, train/loss_step=0.363, global_step=3630.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  32%|███▏      | 1899/5971 [21:25<45:54,  1.48it/s, loss=0.174, v_num=0, train/loss_simple_step=0.384, train/loss_vlb_step=0.00309, train/loss_step=0.384, global_step=3630.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1900/5971 [21:27<45:57,  1.48it/s, loss=0.174, v_num=0, train/loss_simple_step=0.384, train/loss_vlb_step=0.00309, train/loss_step=0.384, global_step=3630.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1900/5971 [21:27<45:57,  1.48it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=7.28e-5, train/loss_step=0.0173, global_step=3630.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1901/5971 [21:28<45:57,  1.48it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0484, train/loss_vlb_step=0.000166, train/loss_step=0.0484, global_step=3631.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1902/5971 [21:29<45:56,  1.48it/s, loss=0.174, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00225, train/loss_step=0.292, global_step=3631.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  32%|███▏      | 1903/5971 [21:30<45:56,  1.48it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0372, train/loss_vlb_step=0.000138, train/loss_step=0.0372, global_step=3631.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1904/5971 [21:32<45:59,  1.47it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0372, train/loss_vlb_step=0.000138, train/loss_step=0.0372, global_step=3631.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1904/5971 [21:32<45:59,  1.47it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0137, train/loss_vlb_step=5.92e-5, train/loss_step=0.0137, global_step=3631.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  32%|███▏      | 1905/5971 [21:33<45:59,  1.47it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00191, train/loss_vlb_step=1.11e-5, train/loss_step=0.00191, global_step=3632.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1906/5971 [21:34<45:58,  1.47it/s, loss=0.139, v_num=0, train/loss_simple_step=0.238, train/loss_vlb_step=0.00086, train/loss_step=0.238, global_step=3632.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  32%|███▏      | 1907/5971 [21:35<45:58,  1.47it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0388, train/loss_vlb_step=0.000145, train/loss_step=0.0388, global_step=3632.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1908/5971 [21:37<46:01,  1.47it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0388, train/loss_vlb_step=0.000145, train/loss_step=0.0388, global_step=3632.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1908/5971 [21:37<46:01,  1.47it/s, loss=0.132, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00135, train/loss_step=0.308, global_step=3632.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  32%|███▏      | 1909/5971 [21:38<46:01,  1.47it/s, loss=0.158, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0065, train/loss_step=0.606, global_step=3633.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  32%|███▏      | 1910/5971 [21:39<46:01,  1.47it/s, loss=0.178, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00254, train/loss_step=0.406, global_step=3633.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1911/5971 [21:40<46:00,  1.47it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00394, train/loss_vlb_step=2.1e-5, train/loss_step=0.00394, global_step=3633.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1912/5971 [21:42<46:03,  1.47it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00394, train/loss_vlb_step=2.1e-5, train/loss_step=0.00394, global_step=3633.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1912/5971 [21:42<46:03,  1.47it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0469, train/loss_vlb_step=0.000163, train/loss_step=0.0469, global_step=3633.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1913/5971 [21:43<46:03,  1.47it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0327, train/loss_vlb_step=0.00012, train/loss_step=0.0327, global_step=3634.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  32%|███▏      | 1914/5971 [21:44<46:03,  1.47it/s, loss=0.182, v_num=0, train/loss_simple_step=0.257, train/loss_vlb_step=0.00106, train/loss_step=0.257, global_step=3634.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  32%|███▏      | 1915/5971 [21:45<46:02,  1.47it/s, loss=0.183, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000345, train/loss_step=0.104, global_step=3634.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1916/5971 [21:47<46:05,  1.47it/s, loss=0.183, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000345, train/loss_step=0.104, global_step=3634.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1916/5971 [21:47<46:05,  1.47it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00724, train/loss_vlb_step=3.4e-5, train/loss_step=0.00724, global_step=3634.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1917/5971 [21:48<46:05,  1.47it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00834, train/loss_vlb_step=4.02e-5, train/loss_step=0.00834, global_step=3635.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1918/5971 [21:49<46:05,  1.47it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00431, train/loss_vlb_step=2.28e-5, train/loss_step=0.00431, global_step=3635.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1919/5971 [21:50<46:05,  1.47it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0059, train/loss_vlb_step=3.02e-5, train/loss_step=0.0059, global_step=3635.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  32%|███▏      | 1920/5971 [21:52<46:07,  1.46it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0059, train/loss_vlb_step=3.02e-5, train/loss_step=0.0059, global_step=3635.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1920/5971 [21:52<46:07,  1.46it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0313, train/loss_vlb_step=0.000121, train/loss_step=0.0313, global_step=3635.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1921/5971 [21:53<46:07,  1.46it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0212, train/loss_vlb_step=8.63e-5, train/loss_step=0.0212, global_step=3636.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  32%|███▏      | 1922/5971 [21:54<46:07,  1.46it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0503, train/loss_vlb_step=0.000182, train/loss_step=0.0503, global_step=3636.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1923/5971 [21:55<46:07,  1.46it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00428, train/loss_vlb_step=2.3e-5, train/loss_step=0.00428, global_step=3636.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  32%|███▏      | 1924/5971 [21:57<46:09,  1.46it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00428, train/loss_vlb_step=2.3e-5, train/loss_step=0.00428, global_step=3636.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1924/5971 [21:57<46:09,  1.46it/s, loss=0.116, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000489, train/loss_step=0.141, global_step=3636.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  32%|███▏      | 1925/5971 [21:58<46:09,  1.46it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0287, train/loss_vlb_step=0.000108, train/loss_step=0.0287, global_step=3637.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1926/5971 [21:59<46:09,  1.46it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0263, train/loss_vlb_step=9.63e-5, train/loss_step=0.0263, global_step=3637.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  32%|███▏      | 1927/5971 [22:00<46:09,  1.46it/s, loss=0.114, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000616, train/loss_step=0.181, global_step=3637.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  32%|███▏      | 1928/5971 [22:02<46:11,  1.46it/s, loss=0.114, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000616, train/loss_step=0.181, global_step=3637.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1928/5971 [22:02<46:11,  1.46it/s, loss=0.121, v_num=0, train/loss_simple_step=0.461, train/loss_vlb_step=0.00299, train/loss_step=0.461, global_step=3637.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  32%|███▏      | 1929/5971 [22:03<46:11,  1.46it/s, loss=0.0971, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000397, train/loss_step=0.120, global_step=3638.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1930/5971 [22:04<46:11,  1.46it/s, loss=0.105, v_num=0, train/loss_simple_step=0.568, train/loss_vlb_step=0.00673, train/loss_step=0.568, global_step=3638.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  32%|███▏      | 1931/5971 [22:05<46:10,  1.46it/s, loss=0.105, v_num=0, train/loss_simple_step=0.00172, train/loss_vlb_step=1.01e-5, train/loss_step=0.00172, global_step=3638.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1932/5971 [22:07<46:13,  1.46it/s, loss=0.105, v_num=0, train/loss_simple_step=0.00172, train/loss_vlb_step=1.01e-5, train/loss_step=0.00172, global_step=3638.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1932/5971 [22:07<46:13,  1.46it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0134, train/loss_vlb_step=5.7e-5, train/loss_step=0.0134, global_step=3638.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  32%|███▏      | 1933/5971 [22:08<46:13,  1.46it/s, loss=0.11, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.00057, train/loss_step=0.161, global_step=3639.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  32%|███▏      | 1934/5971 [22:09<46:13,  1.46it/s, loss=0.0982, v_num=0, train/loss_simple_step=0.0231, train/loss_vlb_step=9.07e-5, train/loss_step=0.0231, global_step=3639.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1935/5971 [22:10<46:12,  1.46it/s, loss=0.0936, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.45e-5, train/loss_step=0.0128, global_step=3639.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1936/5971 [22:12<46:15,  1.45it/s, loss=0.0936, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.45e-5, train/loss_step=0.0128, global_step=3639.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1936/5971 [22:12<46:15,  1.45it/s, loss=0.0941, v_num=0, train/loss_simple_step=0.0168, train/loss_vlb_step=7.17e-5, train/loss_step=0.0168, global_step=3639.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1937/5971 [22:13<46:15,  1.45it/s, loss=0.0999, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000409, train/loss_step=0.124, global_step=3640.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  32%|███▏      | 1938/5971 [22:14<46:15,  1.45it/s, loss=0.1, v_num=0, train/loss_simple_step=0.00854, train/loss_vlb_step=3.84e-5, train/loss_step=0.00854, global_step=3640.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1939/5971 [22:15<46:14,  1.45it/s, loss=0.0999, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.24e-5, train/loss_step=0.00216, global_step=3640.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1940/5971 [22:17<46:17,  1.45it/s, loss=0.0999, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.24e-5, train/loss_step=0.00216, global_step=3640.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  32%|███▏      | 1940/5971 [22:17<46:17,  1.45it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0394, train/loss_vlb_step=0.000141, train/loss_step=0.0394, global_step=3640.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  33%|███▎      | 1941/5971 [22:18<46:17,  1.45it/s, loss=0.0993, v_num=0, train/loss_simple_step=0.00226, train/loss_vlb_step=1.34e-5, train/loss_step=0.00226, global_step=3641.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  33%|███▎      | 1942/5971 [22:19<46:17,  1.45it/s, loss=0.0999, v_num=0, train/loss_simple_step=0.0604, train/loss_vlb_step=0.000203, train/loss_step=0.0604, global_step=3641.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  33%|███▎      | 1943/5971 [22:20<46:17,  1.45it/s, loss=0.0998, v_num=0, train/loss_simple_step=0.00393, train/loss_vlb_step=2.09e-5, train/loss_step=0.00393, global_step=3641.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  33%|███▎      | 1944/5971 [22:22<46:19,  1.45it/s, loss=0.0998, v_num=0, train/loss_simple_step=0.00393, train/loss_vlb_step=2.09e-5, train/loss_step=0.00393, global_step=3641.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  33%|███▎      | 1944/5971 [22:22<46:19,  1.45it/s, loss=0.101, v_num=0, train/loss_simple_step=0.167, train/loss_vlb_step=0.000574, train/loss_step=0.167, global_step=3641.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  33%|███▎      | 1945/5971 [22:23<46:19,  1.45it/s, loss=0.112, v_num=0, train/loss_simple_step=0.254, train/loss_vlb_step=0.00098, train/loss_step=0.254, global_step=3642.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  33%|███▎      | 1946/5971 [22:24<46:19,  1.45it/s, loss=0.117, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000383, train/loss_step=0.117, global_step=3642.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  33%|███▎      | 1947/5971 [22:25<46:19,  1.45it/s, loss=0.11, v_num=0, train/loss_simple_step=0.047, train/loss_vlb_step=0.000164, train/loss_step=0.047, global_step=3642.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  33%|███▎      | 1948/5971 [22:27<46:21,  1.45it/s, loss=0.11, v_num=0, train/loss_simple_step=0.047, train/loss_vlb_step=0.000164, train/loss_step=0.047, global_step=3642.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  33%|███▎      | 1948/5971 [22:27<46:21,  1.45it/s, loss=0.0876, v_num=0, train/loss_simple_step=0.00829, train/loss_vlb_step=4e-5, train/loss_step=0.00829, global_step=3642.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  33%|███▎      | 1949/5971 [22:28<46:21,  1.45it/s, loss=0.0832, v_num=0, train/loss_simple_step=0.0333, train/loss_vlb_step=0.000124, train/loss_step=0.0333, global_step=3643.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  33%|███▎      | 1950/5971 [22:29<46:21,  1.45it/s, loss=0.0648, v_num=0, train/loss_simple_step=0.199, train/loss_vlb_step=0.000797, train/loss_step=0.199, global_step=3643.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  33%|███▎      | 1951/5971 [22:30<46:21,  1.45it/s, loss=0.0663, v_num=0, train/loss_simple_step=0.033, train/loss_vlb_step=0.000117, train/loss_step=0.033, global_step=3643.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  33%|███▎      | 1952/5971 [22:32<46:24,  1.44it/s, loss=0.0663, v_num=0, train/loss_simple_step=0.033, train/loss_vlb_step=0.000117, train/loss_step=0.033, global_step=3643.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  33%|███▎      | 1952/5971 [22:32<46:24,  1.44it/s, loss=0.0677, v_num=0, train/loss_simple_step=0.0403, train/loss_vlb_step=0.000151, train/loss_step=0.0403, global_step=3643.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  33%|███▎      | 1953/5971 [22:33<46:23,  1.44it/s, loss=0.0693, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000707, train/loss_step=0.194, global_step=3644.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  33%|███▎      | 1954/5971 [22:34<46:23,  1.44it/s, loss=0.0683, v_num=0, train/loss_simple_step=0.00358, train/loss_vlb_step=1.95e-5, train/loss_step=0.00358, global_step=3644.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  33%|███▎      | 1955/5971 [22:35<46:23,  1.44it/s, loss=0.0681, v_num=0, train/loss_simple_step=0.00768, train/loss_vlb_step=3.67e-5, train/loss_step=0.00768, global_step=3644.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  33%|███▎      | 1956/5971 [22:37<46:25,  1.44it/s, loss=0.0681, v_num=0, train/loss_simple_step=0.00768, train/loss_vlb_step=3.67e-5, train/loss_step=0.00768, global_step=3644.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  33%|███▎      | 1956/5971 [22:37<46:25,  1.44it/s, loss=0.0749, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000537, train/loss_step=0.154, global_step=3644.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  33%|███▎      | 1957/5971 [22:38<46:25,  1.44it/s, loss=0.0887, v_num=0, train/loss_simple_step=0.399, train/loss_vlb_step=0.00195, train/loss_step=0.399, global_step=3645.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  33%|███▎      | 1958/5971 [22:39<46:25,  1.44it/s, loss=0.102, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.00102, train/loss_step=0.283, global_step=3645.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  33%|███▎      | 1959/5971 [22:40<46:24,  1.44it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=4.95e-5, train/loss_step=0.0116, global_step=3645.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  33%|███▎      | 1960/5971 [22:42<46:27,  1.44it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=4.95e-5, train/loss_step=0.0116, global_step=3645.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  33%|███▎      | 1960/5971 [22:42<46:27,  1.44it/s, loss=0.109, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000517, train/loss_step=0.154, global_step=3645.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  33%|███▎      | 1961/5971 [22:43<46:26,  1.44it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0171, train/loss_vlb_step=7.23e-5, train/loss_step=0.0171, global_step=3646.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  33%|███▎      | 1962/5971 [22:44<46:26,  1.44it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00189, train/loss_vlb_step=1.07e-5, train/loss_step=0.00189, global_step=3646.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  33%|███▎      | 1963/5971 [22:45<46:26,  1.44it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0438, train/loss_vlb_step=0.000156, train/loss_step=0.0438, global_step=3646.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  33%|███▎      | 1964/5971 [22:47<46:29,  1.44it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0438, train/loss_vlb_step=0.000156, train/loss_step=0.0438, global_step=3646.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  33%|███▎      | 1964/5971 [22:47<46:29,  1.44it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000318, train/loss_step=0.0967, global_step=3646.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  33%|███▎      | 1965/5971 [22:48<46:28,  1.44it/s, loss=0.105, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.000881, train/loss_step=0.247, global_step=3647.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  33%|███▎      | 1966/5971 [22:49<46:28,  1.44it/s, loss=0.115, v_num=0, train/loss_simple_step=0.323, train/loss_vlb_step=0.00182, train/loss_step=0.323, global_step=3647.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  33%|███▎      | 1967/5971 [22:50<46:28,  1.44it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000278, train/loss_step=0.0844, global_step=3647.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  33%|███▎      | 1968/5971 [22:52<46:30,  1.43it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000278, train/loss_step=0.0844, global_step=3647.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  33%|███▎      | 1968/5971 [22:52<46:30,  1.43it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00937, train/loss_vlb_step=4.17e-5, train/loss_step=0.00937, global_step=3647.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  33%|███▎      | 1969/5971 [22:53<46:30,  1.43it/s, loss=0.129, v_num=0, train/loss_simple_step=0.277, train/loss_vlb_step=0.00146, train/loss_step=0.277, global_step=3648.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  33%|███▎      | 1970/5971 [22:54<46:30,  1.43it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0123, train/loss_vlb_step=5.19e-5, train/loss_step=0.0123, global_step=3648.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  33%|███▎      | 1971/5971 [22:55<46:29,  1.43it/s, loss=0.133, v_num=0, train/loss_simple_step=0.303, train/loss_vlb_step=0.00126, train/loss_step=0.303, global_step=3648.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  33%|███▎      | 1972/5971 [22:57<46:32,  1.43it/s, loss=0.133, v_num=0, train/loss_simple_step=0.303, train/loss_vlb_step=0.00126, train/loss_step=0.303, global_step=3648.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  33%|███▎      | 1972/5971 [22:57<46:32,  1.43it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00374, train/loss_vlb_step=2.02e-5, train/loss_step=0.00374, global_step=3648.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  33%|███▎      | 1973/5971 [22:58<46:32,  1.43it/s, loss=0.134, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.000918, train/loss_step=0.250, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  33%|███▎      | 1974/5971 [22:59<46:32,  1.43it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00321, train/loss_vlb_step=1.69e-5, train/loss_step=0.00321, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  33%|███▎      | 1975/5971 [23:00<46:31,  1.43it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00949, train/loss_vlb_step=4.36e-5, train/loss_step=0.00949, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  33%|███▎      | 1976/5971 [23:02<46:34,  1.43it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00949, train/loss_vlb_step=4.36e-5, train/loss_step=0.00949, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  33%|███▎      | 1976/5971 [23:02<46:34,  1.43it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:08,  2.42it/s][A

Validating:   1%|          | 2/167 [00:00<00:43,  3.80it/s][A
Epoch 6:  33%|███▎      | 1980/5971 [23:03<46:27,  1.43it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   3%|▎         | 5/167 [00:00<00:16, 10.12it/s][A
Epoch 6:  33%|███▎      | 1984/5971 [23:03<46:18,  1.43it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   5%|▍         | 8/167 [00:00<00:10, 15.04it/s][A

Validating:   7%|▋         | 11/167 [00:00<00:08, 18.95it/s][A
Epoch 6:  33%|███▎      | 1988/5971 [23:03<46:10,  1.44it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   8%|▊         | 14/167 [00:01<00:07, 20.22it/s][A
Epoch 6:  33%|███▎      | 1992/5971 [23:03<46:02,  1.44it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  10%|█         | 17/167 [00:01<00:06, 22.27it/s][A
Epoch 6:  33%|███▎      | 1996/5971 [23:03<45:54,  1.44it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 23.85it/s][A

Validating:  14%|█▍        | 23/167 [00:01<00:05, 24.37it/s][A
Epoch 6:  33%|███▎      | 2000/5971 [23:04<45:46,  1.45it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.36it/s][A
Epoch 6:  34%|███▎      | 2004/5971 [23:04<45:38,  1.45it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 25.50it/s][A
Epoch 6:  34%|███▎      | 2008/5971 [23:04<45:30,  1.45it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 25.22it/s][A

Validating:  21%|██        | 35/167 [00:01<00:05, 25.65it/s][A
Epoch 6:  34%|███▎      | 2012/5971 [23:04<45:23,  1.45it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  23%|██▎       | 38/167 [00:01<00:04, 25.97it/s][A
Epoch 6:  34%|███▍      | 2016/5971 [23:04<45:15,  1.46it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 26.34it/s][A
Epoch 6:  34%|███▍      | 2020/5971 [23:04<45:07,  1.46it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 26.85it/s][A

Validating:  28%|██▊       | 47/167 [00:02<00:04, 26.25it/s][A
Epoch 6:  34%|███▍      | 2024/5971 [23:05<44:59,  1.46it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 25.86it/s][A
Epoch 6:  34%|███▍      | 2028/5971 [23:05<44:51,  1.46it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 24.96it/s][A
Epoch 6:  34%|███▍      | 2032/5971 [23:05<44:44,  1.47it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 25.08it/s][A

Validating:  35%|███▌      | 59/167 [00:02<00:04, 24.90it/s][A
Epoch 6:  34%|███▍      | 2036/5971 [23:05<44:36,  1.47it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  37%|███▋      | 62/167 [00:02<00:04, 25.91it/s][A
Epoch 6:  34%|███▍      | 2040/5971 [23:05<44:28,  1.47it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  39%|███▉      | 65/167 [00:02<00:04, 25.45it/s][A
Epoch 6:  34%|███▍      | 2044/5971 [23:05<44:21,  1.48it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  41%|████      | 68/167 [00:03<00:04, 24.52it/s][A

Validating:  43%|████▎     | 71/167 [00:03<00:03, 25.58it/s][A
Epoch 6:  34%|███▍      | 2048/5971 [23:05<44:13,  1.48it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 24.90it/s][A
Epoch 6:  34%|███▍      | 2052/5971 [23:06<44:06,  1.48it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 25.82it/s][A
Epoch 6:  34%|███▍      | 2056/5971 [23:06<43:58,  1.48it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 25.69it/s][A

Validating:  50%|████▉     | 83/167 [00:03<00:03, 25.80it/s][A
Epoch 6:  35%|███▍      | 2060/5971 [23:06<43:50,  1.49it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  51%|█████▏    | 86/167 [00:03<00:03, 23.84it/s][A
Epoch 6:  35%|███▍      | 2064/5971 [23:06<43:43,  1.49it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  53%|█████▎    | 89/167 [00:03<00:03, 23.90it/s][A
Epoch 6:  35%|███▍      | 2068/5971 [23:06<43:36,  1.49it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  55%|█████▌    | 92/167 [00:04<00:03, 24.62it/s][A

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 24.17it/s][A
Epoch 6:  35%|███▍      | 2072/5971 [23:06<43:28,  1.49it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 24.91it/s][A
Epoch 6:  35%|███▍      | 2076/5971 [23:07<43:21,  1.50it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  60%|██████    | 101/167 [00:04<00:02, 25.44it/s][A
Epoch 6:  35%|███▍      | 2080/5971 [23:07<43:13,  1.50it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 25.17it/s][A

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 24.75it/s][A
Epoch 6:  35%|███▍      | 2084/5971 [23:07<43:06,  1.50it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 25.50it/s][A
Epoch 6:  35%|███▍      | 2088/5971 [23:07<42:59,  1.51it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  68%|██████▊   | 113/167 [00:04<00:02, 23.88it/s][A
Epoch 6:  35%|███▌      | 2092/5971 [23:07<42:52,  1.51it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  69%|██████▉   | 116/167 [00:05<00:02, 24.00it/s][A

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 24.78it/s][A
Epoch 6:  35%|███▌      | 2096/5971 [23:07<42:44,  1.51it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 25.12it/s][A
Epoch 6:  35%|███▌      | 2100/5971 [23:08<42:37,  1.51it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 26.10it/s][A
Epoch 6:  35%|███▌      | 2104/5971 [23:08<42:30,  1.52it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 25.05it/s][A

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 25.40it/s][A
Epoch 6:  35%|███▌      | 2108/5971 [23:08<42:23,  1.52it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  80%|████████  | 134/167 [00:05<00:01, 25.47it/s][A
Epoch 6:  35%|███▌      | 2112/5971 [23:08<42:15,  1.52it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 23.78it/s][A
Epoch 6:  35%|███▌      | 2116/5971 [23:08<42:08,  1.52it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  84%|████████▍ | 140/167 [00:06<00:01, 24.38it/s][A

Validating:  86%|████████▌ | 143/167 [00:06<00:01, 23.91it/s][A
Epoch 6:  36%|███▌      | 2120/5971 [23:08<42:01,  1.53it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 22.90it/s][A
Epoch 6:  36%|███▌      | 2124/5971 [23:09<41:54,  1.53it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 21.46it/s][A
Epoch 6:  36%|███▌      | 2128/5971 [23:09<41:47,  1.53it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 20.59it/s][A

Validating:  93%|█████████▎| 155/167 [00:07<00:02,  5.92it/s][A
Epoch 6:  36%|███▌      | 2132/5971 [23:10<41:43,  1.53it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  95%|█████████▍| 158/167 [00:08<00:01,  7.56it/s][A
Epoch 6:  36%|███▌      | 2136/5971 [23:10<41:36,  1.54it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  96%|█████████▋| 161/167 [00:08<00:00,  9.55it/s][A
Epoch 6:  36%|███▌      | 2140/5971 [23:11<41:29,  1.54it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  98%|█████████▊| 164/167 [00:08<00:00, 11.84it/s][A

Validating: 100%|██████████| 167/167 [00:08<00:00, 14.20it/s][A
Epoch 6:  36%|███▌      | 2144/5971 [23:11<41:22,  1.54it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  36%|███▌      | 2144/5971 [23:11<41:23,  1.54it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3649.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Epoch 6:  36%|███▌      | 2145/5971 [23:12<41:23,  1.54it/s, loss=0.107, v_num=0, train/loss_simple_step=0.00327, train/loss_vlb_step=1.81e-5, train/loss_step=0.00327, global_step=3650.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  36%|███▌      | 2146/5971 [23:13<41:23,  1.54it/s, loss=0.0956, v_num=0, train/loss_simple_step=0.0587, train/loss_vlb_step=0.0002, train/loss_step=0.0587, global_step=3650.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  36%|███▌      | 2147/5971 [23:14<41:22,  1.54it/s, loss=0.097, v_num=0, train/loss_simple_step=0.0396, train/loss_vlb_step=0.00014, train/loss_step=0.0396, global_step=3650.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  36%|███▌      | 2148/5971 [23:16<41:25,  1.54it/s, loss=0.097, v_num=0, train/loss_simple_step=0.0396, train/loss_vlb_step=0.00014, train/loss_step=0.0396, global_step=3650.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  36%|███▌      | 2148/5971 [23:16<41:25,  1.54it/s, loss=0.112, v_num=0, train/loss_simple_step=0.463, train/loss_vlb_step=0.00301, train/loss_step=0.463, global_step=3650.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  36%|███▌      | 2149/5971 [23:17<41:25,  1.54it/s, loss=0.146, v_num=0, train/loss_simple_step=0.685, train/loss_vlb_step=0.00841, train/loss_step=0.685, global_step=3651.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  36%|███▌      | 2150/5971 [23:18<41:24,  1.54it/s, loss=0.154, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000536, train/loss_step=0.157, global_step=3651.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  36%|███▌      | 2151/5971 [23:19<41:24,  1.54it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0616, train/loss_vlb_step=0.000203, train/loss_step=0.0616, global_step=3651.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  36%|███▌      | 2152/5971 [23:23<41:28,  1.53it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0616, train/loss_vlb_step=0.000203, train/loss_step=0.0616, global_step=3651.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  36%|███▌      | 2152/5971 [23:23<41:28,  1.53it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0286, train/loss_vlb_step=0.00011, train/loss_step=0.0286, global_step=3651.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  36%|███▌      | 2153/5971 [23:24<41:28,  1.53it/s, loss=0.154, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00122, train/loss_step=0.296, global_step=3652.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  36%|███▌      | 2154/5971 [23:24<41:28,  1.53it/s, loss=0.138, v_num=0, train/loss_simple_step=0.010, train/loss_vlb_step=4.41e-5, train/loss_step=0.010, global_step=3652.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  36%|███▌      | 2155/5971 [23:25<41:28,  1.53it/s, loss=0.144, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000731, train/loss_step=0.201, global_step=3652.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  36%|███▌      | 2156/5971 [23:28<41:30,  1.53it/s, loss=0.144, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000731, train/loss_step=0.201, global_step=3652.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  36%|███▌      | 2156/5971 [23:28<41:30,  1.53it/s, loss=0.176, v_num=0, train/loss_simple_step=0.653, train/loss_vlb_step=0.00801, train/loss_step=0.653, global_step=3652.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  36%|███▌      | 2157/5971 [23:29<41:30,  1.53it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0249, train/loss_vlb_step=0.000101, train/loss_step=0.0249, global_step=3653.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  36%|███▌      | 2158/5971 [23:29<41:30,  1.53it/s, loss=0.182, v_num=0, train/loss_simple_step=0.385, train/loss_vlb_step=0.00199, train/loss_step=0.385, global_step=3653.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  36%|███▌      | 2159/5971 [23:30<41:29,  1.53it/s, loss=0.174, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000463, train/loss_step=0.140, global_step=3653.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  36%|███▌      | 2160/5971 [23:33<41:32,  1.53it/s, loss=0.174, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000463, train/loss_step=0.140, global_step=3653.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  36%|███▌      | 2160/5971 [23:33<41:32,  1.53it/s, loss=0.174, v_num=0, train/loss_simple_step=0.00608, train/loss_vlb_step=3.12e-5, train/loss_step=0.00608, global_step=3653.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  36%|███▌      | 2161/5971 [23:34<41:31,  1.53it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0508, train/loss_vlb_step=0.000186, train/loss_step=0.0508, global_step=3654.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  36%|███▌      | 2162/5971 [23:35<41:31,  1.53it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0215, train/loss_vlb_step=8.54e-5, train/loss_step=0.0215, global_step=3654.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  36%|███▌      | 2163/5971 [23:35<41:31,  1.53it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00419, train/loss_vlb_step=2.25e-5, train/loss_step=0.00419, global_step=3654.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  36%|███▌      | 2164/5971 [23:38<41:33,  1.53it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00419, train/loss_vlb_step=2.25e-5, train/loss_step=0.00419, global_step=3654.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  36%|███▌      | 2164/5971 [23:38<41:33,  1.53it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0286, train/loss_vlb_step=0.000105, train/loss_step=0.0286, global_step=3654.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  36%|███▋      | 2165/5971 [23:39<41:33,  1.53it/s, loss=0.171, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=3655.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  36%|███▋      | 2166/5971 [23:40<41:33,  1.53it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00278, train/loss_vlb_step=1.5e-5, train/loss_step=0.00278, global_step=3655.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  36%|███▋      | 2167/5971 [23:40<41:33,  1.53it/s, loss=0.188, v_num=0, train/loss_simple_step=0.440, train/loss_vlb_step=0.0027, train/loss_step=0.440, global_step=3655.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  36%|███▋      | 2168/5971 [23:43<41:35,  1.52it/s, loss=0.188, v_num=0, train/loss_simple_step=0.440, train/loss_vlb_step=0.0027, train/loss_step=0.440, global_step=3655.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  36%|███▋      | 2168/5971 [23:43<41:35,  1.52it/s, loss=0.19, v_num=0, train/loss_simple_step=0.508, train/loss_vlb_step=0.00562, train/loss_step=0.508, global_step=3655.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  36%|███▋      | 2169/5971 [23:44<41:35,  1.52it/s, loss=0.177, v_num=0, train/loss_simple_step=0.418, train/loss_vlb_step=0.00226, train/loss_step=0.418, global_step=3656.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  36%|███▋      | 2170/5971 [23:45<41:34,  1.52it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0364, train/loss_vlb_step=0.00013, train/loss_step=0.0364, global_step=3656.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  36%|███▋      | 2171/5971 [23:45<41:34,  1.52it/s, loss=0.175, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000479, train/loss_step=0.142, global_step=3656.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  36%|███▋      | 2172/5971 [23:48<41:36,  1.52it/s, loss=0.175, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000479, train/loss_step=0.142, global_step=3656.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  36%|███▋      | 2172/5971 [23:48<41:36,  1.52it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0508, train/loss_vlb_step=0.000177, train/loss_step=0.0508, global_step=3656.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  36%|███▋      | 2173/5971 [23:48<41:36,  1.52it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000243, train/loss_step=0.0699, global_step=3657.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  36%|███▋      | 2174/5971 [23:49<41:36,  1.52it/s, loss=0.167, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000211, train/loss_step=0.060, global_step=3657.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  36%|███▋      | 2175/5971 [23:50<41:36,  1.52it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00727, train/loss_vlb_step=3.56e-5, train/loss_step=0.00727, global_step=3657.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  36%|███▋      | 2176/5971 [23:53<41:38,  1.52it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00727, train/loss_vlb_step=3.56e-5, train/loss_step=0.00727, global_step=3657.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  36%|███▋      | 2176/5971 [23:53<41:38,  1.52it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0186, train/loss_vlb_step=7.85e-5, train/loss_step=0.0186, global_step=3657.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  36%|███▋      | 2177/5971 [23:54<41:38,  1.52it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0221, train/loss_vlb_step=8.86e-5, train/loss_step=0.0221, global_step=3658.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  36%|███▋      | 2178/5971 [23:55<41:38,  1.52it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000212, train/loss_step=0.0642, global_step=3658.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  36%|███▋      | 2179/5971 [23:56<41:38,  1.52it/s, loss=0.105, v_num=0, train/loss_simple_step=0.038, train/loss_vlb_step=0.000138, train/loss_step=0.038, global_step=3658.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  37%|███▋      | 2180/5971 [23:58<41:39,  1.52it/s, loss=0.105, v_num=0, train/loss_simple_step=0.038, train/loss_vlb_step=0.000138, train/loss_step=0.038, global_step=3658.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2180/5971 [23:58<41:39,  1.52it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0397, train/loss_vlb_step=0.00014, train/loss_step=0.0397, global_step=3658.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2181/5971 [23:59<41:39,  1.52it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0169, train/loss_vlb_step=6.91e-5, train/loss_step=0.0169, global_step=3659.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2182/5971 [24:00<41:39,  1.52it/s, loss=0.117, v_num=0, train/loss_simple_step=0.263, train/loss_vlb_step=0.00107, train/loss_step=0.263, global_step=3659.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  37%|███▋      | 2183/5971 [24:00<41:39,  1.52it/s, loss=0.147, v_num=0, train/loss_simple_step=0.604, train/loss_vlb_step=0.00506, train/loss_step=0.604, global_step=3659.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2184/5971 [24:03<41:41,  1.51it/s, loss=0.147, v_num=0, train/loss_simple_step=0.604, train/loss_vlb_step=0.00506, train/loss_step=0.604, global_step=3659.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2184/5971 [24:03<41:41,  1.51it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0341, train/loss_vlb_step=0.00013, train/loss_step=0.0341, global_step=3659.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2185/5971 [24:04<41:41,  1.51it/s, loss=0.149, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000508, train/loss_step=0.153, global_step=3660.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  37%|███▋      | 2186/5971 [24:05<41:41,  1.51it/s, loss=0.169, v_num=0, train/loss_simple_step=0.390, train/loss_vlb_step=0.00228, train/loss_step=0.390, global_step=3660.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  37%|███▋      | 2187/5971 [24:06<41:41,  1.51it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.78e-6, train/loss_step=0.00145, global_step=3660.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2188/5971 [24:08<41:43,  1.51it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.78e-6, train/loss_step=0.00145, global_step=3660.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2188/5971 [24:08<41:43,  1.51it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00261, train/loss_vlb_step=1.46e-5, train/loss_step=0.00261, global_step=3660.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2189/5971 [24:09<41:42,  1.51it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0617, train/loss_vlb_step=0.000206, train/loss_step=0.0617, global_step=3661.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  37%|███▋      | 2190/5971 [24:10<41:42,  1.51it/s, loss=0.11, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000515, train/loss_step=0.154, global_step=3661.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  37%|███▋      | 2191/5971 [24:11<41:42,  1.51it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0097, train/loss_vlb_step=4.55e-5, train/loss_step=0.0097, global_step=3661.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2192/5971 [24:13<41:44,  1.51it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0097, train/loss_vlb_step=4.55e-5, train/loss_step=0.0097, global_step=3661.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2192/5971 [24:13<41:44,  1.51it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0221, train/loss_vlb_step=9.01e-5, train/loss_step=0.0221, global_step=3661.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2193/5971 [24:14<41:44,  1.51it/s, loss=0.0988, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.4e-5, train/loss_step=0.0126, global_step=3662.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2194/5971 [24:15<41:43,  1.51it/s, loss=0.0963, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.65e-5, train/loss_step=0.0105, global_step=3662.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2195/5971 [24:16<41:43,  1.51it/s, loss=0.103, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000444, train/loss_step=0.132, global_step=3662.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  37%|███▋      | 2196/5971 [24:18<41:45,  1.51it/s, loss=0.103, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000444, train/loss_step=0.132, global_step=3662.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2196/5971 [24:18<41:45,  1.51it/s, loss=0.11, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000569, train/loss_step=0.162, global_step=3662.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  37%|███▋      | 2197/5971 [24:19<41:45,  1.51it/s, loss=0.132, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.0026, train/loss_step=0.477, global_step=3663.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  37%|███▋      | 2198/5971 [24:20<41:45,  1.51it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0593, train/loss_vlb_step=0.00021, train/loss_step=0.0593, global_step=3663.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2199/5971 [24:21<41:45,  1.51it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0412, train/loss_vlb_step=0.000149, train/loss_step=0.0412, global_step=3663.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2200/5971 [24:23<41:47,  1.50it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0412, train/loss_vlb_step=0.000149, train/loss_step=0.0412, global_step=3663.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2200/5971 [24:23<41:47,  1.50it/s, loss=0.133, v_num=0, train/loss_simple_step=0.048, train/loss_vlb_step=0.000171, train/loss_step=0.048, global_step=3663.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  37%|███▋      | 2201/5971 [24:24<41:46,  1.50it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00205, train/loss_vlb_step=1.18e-5, train/loss_step=0.00205, global_step=3664.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2202/5971 [24:25<41:46,  1.50it/s, loss=0.14, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00311, train/loss_step=0.426, global_step=3664.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:  37%|███▋      | 2203/5971 [24:26<41:46,  1.50it/s, loss=0.116, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000401, train/loss_step=0.122, global_step=3664.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2204/5971 [24:28<41:48,  1.50it/s, loss=0.116, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000401, train/loss_step=0.122, global_step=3664.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2204/5971 [24:28<41:48,  1.50it/s, loss=0.122, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000522, train/loss_step=0.156, global_step=3664.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2205/5971 [24:29<41:48,  1.50it/s, loss=0.13, v_num=0, train/loss_simple_step=0.301, train/loss_vlb_step=0.00123, train/loss_step=0.301, global_step=3665.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  37%|███▋      | 2206/5971 [24:30<41:48,  1.50it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=8.92e-6, train/loss_step=0.00156, global_step=3665.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2207/5971 [24:31<41:48,  1.50it/s, loss=0.123, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00147, train/loss_step=0.253, global_step=3665.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  37%|███▋      | 2208/5971 [24:33<41:49,  1.50it/s, loss=0.123, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00147, train/loss_step=0.253, global_step=3665.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2208/5971 [24:33<41:49,  1.50it/s, loss=0.145, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.00385, train/loss_step=0.454, global_step=3665.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2209/5971 [24:34<41:49,  1.50it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0913, train/loss_vlb_step=0.000303, train/loss_step=0.0913, global_step=3666.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2210/5971 [24:35<41:49,  1.50it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0123, train/loss_vlb_step=5.05e-5, train/loss_step=0.0123, global_step=3666.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  37%|███▋      | 2211/5971 [24:36<41:49,  1.50it/s, loss=0.145, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000399, train/loss_step=0.119, global_step=3666.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2212/5971 [24:38<41:50,  1.50it/s, loss=0.145, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000399, train/loss_step=0.119, global_step=3666.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2212/5971 [24:38<41:50,  1.50it/s, loss=0.157, v_num=0, train/loss_simple_step=0.254, train/loss_vlb_step=0.00107, train/loss_step=0.254, global_step=3666.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  37%|███▋      | 2213/5971 [24:39<41:50,  1.50it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0495, train/loss_vlb_step=0.00018, train/loss_step=0.0495, global_step=3667.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2214/5971 [24:39<41:50,  1.50it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0106, train/loss_vlb_step=4.73e-5, train/loss_step=0.0106, global_step=3667.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2215/5971 [24:40<41:49,  1.50it/s, loss=0.158, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000452, train/loss_step=0.131, global_step=3667.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  37%|███▋      | 2216/5971 [24:43<41:52,  1.49it/s, loss=0.158, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000452, train/loss_step=0.131, global_step=3667.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2216/5971 [24:43<41:52,  1.49it/s, loss=0.161, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000867, train/loss_step=0.213, global_step=3667.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2217/5971 [24:44<41:52,  1.49it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0438, train/loss_vlb_step=0.000157, train/loss_step=0.0438, global_step=3668.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2218/5971 [24:45<41:51,  1.49it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0982, train/loss_vlb_step=0.000323, train/loss_step=0.0982, global_step=3668.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2219/5971 [24:45<41:51,  1.49it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0943, train/loss_vlb_step=0.000312, train/loss_step=0.0943, global_step=3668.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2220/5971 [24:48<41:53,  1.49it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0943, train/loss_vlb_step=0.000312, train/loss_step=0.0943, global_step=3668.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2220/5971 [24:48<41:53,  1.49it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0847, train/loss_vlb_step=0.00029, train/loss_step=0.0847, global_step=3668.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  37%|███▋      | 2221/5971 [24:48<41:52,  1.49it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0244, train/loss_vlb_step=9.39e-5, train/loss_step=0.0244, global_step=3669.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2222/5971 [24:49<41:52,  1.49it/s, loss=0.144, v_num=0, train/loss_simple_step=0.369, train/loss_vlb_step=0.00207, train/loss_step=0.369, global_step=3669.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  37%|███▋      | 2223/5971 [24:50<41:52,  1.49it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.05e-5, train/loss_step=0.0139, global_step=3669.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2224/5971 [24:53<41:54,  1.49it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.05e-5, train/loss_step=0.0139, global_step=3669.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2224/5971 [24:53<41:54,  1.49it/s, loss=0.139, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000646, train/loss_step=0.172, global_step=3669.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  37%|███▋      | 2225/5971 [24:54<41:54,  1.49it/s, loss=0.135, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000745, train/loss_step=0.203, global_step=3670.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2226/5971 [24:54<41:53,  1.49it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0344, train/loss_vlb_step=0.000131, train/loss_step=0.0344, global_step=3670.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2227/5971 [24:55<41:53,  1.49it/s, loss=0.135, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.000776, train/loss_step=0.220, global_step=3670.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  37%|███▋      | 2228/5971 [24:57<41:55,  1.49it/s, loss=0.135, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.000776, train/loss_step=0.220, global_step=3670.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2228/5971 [24:57<41:55,  1.49it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0152, train/loss_vlb_step=6.38e-5, train/loss_step=0.0152, global_step=3670.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2229/5971 [24:58<41:55,  1.49it/s, loss=0.132, v_num=0, train/loss_simple_step=0.488, train/loss_vlb_step=0.00347, train/loss_step=0.488, global_step=3671.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  37%|███▋      | 2230/5971 [24:59<41:54,  1.49it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0256, train/loss_vlb_step=9.75e-5, train/loss_step=0.0256, global_step=3671.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2231/5971 [25:00<41:54,  1.49it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0153, train/loss_vlb_step=6.58e-5, train/loss_step=0.0153, global_step=3671.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2232/5971 [25:02<41:56,  1.49it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0153, train/loss_vlb_step=6.58e-5, train/loss_step=0.0153, global_step=3671.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2232/5971 [25:02<41:56,  1.49it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00401, train/loss_vlb_step=2.04e-5, train/loss_step=0.00401, global_step=3671.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2233/5971 [25:03<41:55,  1.49it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=4.58e-5, train/loss_step=0.0109, global_step=3672.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  37%|███▋      | 2234/5971 [25:04<41:55,  1.49it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0764, train/loss_vlb_step=0.000251, train/loss_step=0.0764, global_step=3672.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2235/5971 [25:05<41:55,  1.49it/s, loss=0.117, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.00046, train/loss_step=0.140, global_step=3672.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  37%|███▋      | 2236/5971 [25:07<41:57,  1.48it/s, loss=0.117, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.00046, train/loss_step=0.140, global_step=3672.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2236/5971 [25:07<41:57,  1.48it/s, loss=0.137, v_num=0, train/loss_simple_step=0.611, train/loss_vlb_step=0.0132, train/loss_step=0.611, global_step=3672.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  37%|███▋      | 2237/5971 [25:08<41:56,  1.48it/s, loss=0.141, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000368, train/loss_step=0.112, global_step=3673.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2238/5971 [25:09<41:56,  1.48it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00327, train/loss_vlb_step=1.8e-5, train/loss_step=0.00327, global_step=3673.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  37%|███▋      | 2239/5971 [25:10<41:56,  1.48it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00749, train/loss_vlb_step=3.56e-5, train/loss_step=0.00749, global_step=3673.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  38%|███▊      | 2240/5971 [25:12<41:58,  1.48it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00749, train/loss_vlb_step=3.56e-5, train/loss_step=0.00749, global_step=3673.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  38%|███▊      | 2240/5971 [25:12<41:58,  1.48it/s, loss=0.15, v_num=0, train/loss_simple_step=0.456, train/loss_vlb_step=0.00352, train/loss_step=0.456, global_step=3673.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:  38%|███▊      | 2241/5971 [25:13<41:57,  1.48it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00745, train/loss_vlb_step=3.23e-5, train/loss_step=0.00745, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  38%|███▊      | 2242/5971 [25:14<41:57,  1.48it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0604, train/loss_vlb_step=0.000205, train/loss_step=0.0604, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  38%|███▊      | 2243/5971 [25:15<41:57,  1.48it/s, loss=0.14, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000452, train/loss_step=0.138, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  38%|███▊      | 2244/5971 [25:17<41:59,  1.48it/s, loss=0.14, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000452, train/loss_step=0.138, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  38%|███▊      | 2244/5971 [25:17<41:59,  1.48it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<02:02,  1.36it/s][A

Validating:   1%|          | 2/167 [00:00<01:15,  2.18it/s][A
Epoch 6:  38%|███▊      | 2248/5971 [25:18<41:53,  1.48it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   3%|▎         | 5/167 [00:01<00:25,  6.38it/s][A
Epoch 6:  38%|███▊      | 2252/5971 [25:18<41:46,  1.48it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   5%|▍         | 8/167 [00:01<00:15, 10.55it/s][A

Validating:   7%|▋         | 11/167 [00:01<00:11, 14.18it/s][A
Epoch 6:  38%|███▊      | 2256/5971 [25:18<41:40,  1.49it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   8%|▊         | 14/167 [00:01<00:09, 15.93it/s][A
Epoch 6:  38%|███▊      | 2260/5971 [25:19<41:33,  1.49it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  11%|█         | 18/167 [00:01<00:07, 19.42it/s][A
Epoch 6:  38%|███▊      | 2264/5971 [25:19<41:26,  1.49it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  13%|█▎        | 21/167 [00:01<00:07, 20.63it/s][A
Epoch 6:  38%|███▊      | 2268/5971 [25:19<41:19,  1.49it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  14%|█▍        | 24/167 [00:01<00:06, 22.22it/s][A

Validating:  16%|█▌        | 27/167 [00:01<00:05, 23.55it/s][A
Epoch 6:  38%|███▊      | 2272/5971 [25:19<41:12,  1.50it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  18%|█▊        | 30/167 [00:02<00:05, 23.80it/s][A
Epoch 6:  38%|███▊      | 2276/5971 [25:19<41:06,  1.50it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  20%|█▉        | 33/167 [00:02<00:05, 24.92it/s][A
Epoch 6:  38%|███▊      | 2280/5971 [25:19<40:59,  1.50it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  22%|██▏       | 36/167 [00:02<00:05, 24.57it/s][A

Validating:  23%|██▎       | 39/167 [00:02<00:05, 25.11it/s][A
Epoch 6:  38%|███▊      | 2284/5971 [25:19<40:52,  1.50it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  25%|██▌       | 42/167 [00:02<00:05, 24.60it/s][A
Epoch 6:  38%|███▊      | 2288/5971 [25:20<40:45,  1.51it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 24.96it/s][A
Epoch 6:  38%|███▊      | 2292/5971 [25:20<40:39,  1.51it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 25.52it/s][A
Epoch 6:  38%|███▊      | 2296/5971 [25:20<40:32,  1.51it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  31%|███       | 52/167 [00:02<00:04, 26.97it/s][A

Validating:  33%|███▎      | 55/167 [00:03<00:04, 27.44it/s][A
Epoch 6:  39%|███▊      | 2300/5971 [25:20<40:25,  1.51it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  35%|███▍      | 58/167 [00:03<00:04, 26.06it/s][A
Epoch 6:  39%|███▊      | 2304/5971 [25:20<40:19,  1.52it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  37%|███▋      | 61/167 [00:03<00:04, 24.75it/s][A
Epoch 6:  39%|███▊      | 2308/5971 [25:20<40:12,  1.52it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  38%|███▊      | 64/167 [00:03<00:04, 24.26it/s][A

Validating:  40%|████      | 67/167 [00:03<00:04, 24.40it/s][A
Epoch 6:  39%|███▊      | 2312/5971 [25:21<40:06,  1.52it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 25.67it/s][A
Epoch 6:  39%|███▉      | 2316/5971 [25:21<39:59,  1.52it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 26.38it/s][A
Epoch 6:  39%|███▉      | 2320/5971 [25:21<39:53,  1.53it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 25.69it/s][A

Validating:  47%|████▋     | 79/167 [00:03<00:03, 26.32it/s][A
Epoch 6:  39%|███▉      | 2324/5971 [25:21<39:46,  1.53it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  49%|████▉     | 82/167 [00:04<00:03, 26.38it/s][A
Epoch 6:  39%|███▉      | 2328/5971 [25:21<39:40,  1.53it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  51%|█████     | 85/167 [00:04<00:03, 26.18it/s][A
Epoch 6:  39%|███▉      | 2332/5971 [25:21<39:33,  1.53it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  53%|█████▎    | 88/167 [00:04<00:03, 25.36it/s][A

Validating:  54%|█████▍    | 91/167 [00:04<00:03, 24.96it/s][A
Epoch 6:  39%|███▉      | 2336/5971 [25:22<39:27,  1.54it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 25.16it/s][A
Epoch 6:  39%|███▉      | 2340/5971 [25:22<39:20,  1.54it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 26.00it/s][A
Epoch 6:  39%|███▉      | 2344/5971 [25:22<39:14,  1.54it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 26.68it/s][A

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 26.98it/s][A
Epoch 6:  39%|███▉      | 2348/5971 [25:22<39:08,  1.54it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  63%|██████▎   | 106/167 [00:05<00:02, 25.99it/s][A
Epoch 6:  39%|███▉      | 2352/5971 [25:22<39:01,  1.55it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  65%|██████▌   | 109/167 [00:05<00:02, 22.42it/s][A
Epoch 6:  39%|███▉      | 2356/5971 [25:22<38:55,  1.55it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  67%|██████▋   | 112/167 [00:05<00:02, 23.84it/s][A

Validating:  69%|██████▉   | 115/167 [00:05<00:02, 23.67it/s][A
Epoch 6:  40%|███▉      | 2360/5971 [25:22<38:49,  1.55it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  71%|███████   | 118/167 [00:05<00:02, 22.20it/s][A
Epoch 6:  40%|███▉      | 2364/5971 [25:23<38:43,  1.55it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  72%|███████▏  | 121/167 [00:05<00:02, 22.89it/s][A
Epoch 6:  40%|███▉      | 2368/5971 [25:23<38:36,  1.56it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 22.94it/s][A

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 23.87it/s][A
Epoch 6:  40%|███▉      | 2372/5971 [25:23<38:30,  1.56it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  78%|███████▊  | 130/167 [00:06<00:01, 25.10it/s][A
Epoch 6:  40%|███▉      | 2376/5971 [25:23<38:24,  1.56it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  80%|███████▉  | 133/167 [00:06<00:01, 25.47it/s][A
Epoch 6:  40%|███▉      | 2380/5971 [25:23<38:18,  1.56it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  81%|████████▏ | 136/167 [00:06<00:01, 25.66it/s][A

Validating:  83%|████████▎ | 139/167 [00:06<00:01, 25.85it/s][A
Epoch 6:  40%|███▉      | 2384/5971 [25:23<38:11,  1.57it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  85%|████████▌ | 142/167 [00:06<00:00, 26.29it/s][A
Epoch 6:  40%|███▉      | 2388/5971 [25:24<38:05,  1.57it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 26.23it/s][A
Epoch 6:  40%|████      | 2392/5971 [25:24<37:59,  1.57it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 26.90it/s][A

Validating:  90%|█████████ | 151/167 [00:06<00:00, 24.33it/s][A
Epoch 6:  40%|████      | 2396/5971 [25:24<37:53,  1.57it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  93%|█████████▎| 155/167 [00:07<00:00, 24.94it/s][A
Epoch 6:  40%|████      | 2400/5971 [25:24<37:47,  1.57it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  95%|█████████▍| 158/167 [00:07<00:00, 26.09it/s][A
Epoch 6:  40%|████      | 2404/5971 [25:24<37:41,  1.58it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  96%|█████████▋| 161/167 [00:07<00:00, 26.96it/s][A
Epoch 6:  40%|████      | 2408/5971 [25:24<37:35,  1.58it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  99%|█████████▉| 165/167 [00:07<00:00, 28.56it/s][A
Epoch 6:  40%|████      | 2412/5971 [25:24<37:29,  1.58it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  40%|████      | 2412/5971 [25:25<37:29,  1.58it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.44e-5, train/loss_step=0.00468, global_step=3674.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Epoch 6:  40%|████      | 2413/5971 [25:26<37:29,  1.58it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0129, train/loss_vlb_step=5.5e-5, train/loss_step=0.0129, global_step=3675.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  40%|████      | 2414/5971 [25:27<37:29,  1.58it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0189, train/loss_vlb_step=7.74e-5, train/loss_step=0.0189, global_step=3675.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  40%|████      | 2415/5971 [25:27<37:28,  1.58it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0382, train/loss_vlb_step=0.000145, train/loss_step=0.0382, global_step=3675.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  40%|████      | 2416/5971 [25:30<37:31,  1.58it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0382, train/loss_vlb_step=0.000145, train/loss_step=0.0382, global_step=3675.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  40%|████      | 2416/5971 [25:30<37:31,  1.58it/s, loss=0.128, v_num=0, train/loss_simple_step=0.326, train/loss_vlb_step=0.00173, train/loss_step=0.326, global_step=3675.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  40%|████      | 2417/5971 [25:31<37:30,  1.58it/s, loss=0.109, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.00036, train/loss_step=0.109, global_step=3676.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  40%|████      | 2418/5971 [25:32<37:30,  1.58it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0373, train/loss_vlb_step=0.000138, train/loss_step=0.0373, global_step=3676.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2419/5971 [25:33<37:30,  1.58it/s, loss=0.143, v_num=0, train/loss_simple_step=0.684, train/loss_vlb_step=0.0323, train/loss_step=0.684, global_step=3676.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  41%|████      | 2420/5971 [25:35<37:32,  1.58it/s, loss=0.143, v_num=0, train/loss_simple_step=0.684, train/loss_vlb_step=0.0323, train/loss_step=0.684, global_step=3676.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2420/5971 [25:35<37:32,  1.58it/s, loss=0.173, v_num=0, train/loss_simple_step=0.611, train/loss_vlb_step=0.00884, train/loss_step=0.611, global_step=3676.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2421/5971 [25:36<37:32,  1.58it/s, loss=0.184, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.000834, train/loss_step=0.235, global_step=3677.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2422/5971 [25:37<37:31,  1.58it/s, loss=0.199, v_num=0, train/loss_simple_step=0.377, train/loss_vlb_step=0.00234, train/loss_step=0.377, global_step=3677.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  41%|████      | 2423/5971 [25:38<37:31,  1.58it/s, loss=0.194, v_num=0, train/loss_simple_step=0.025, train/loss_vlb_step=9.09e-5, train/loss_step=0.025, global_step=3677.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2424/5971 [25:40<37:33,  1.57it/s, loss=0.194, v_num=0, train/loss_simple_step=0.025, train/loss_vlb_step=9.09e-5, train/loss_step=0.025, global_step=3677.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2424/5971 [25:40<37:33,  1.57it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0657, train/loss_vlb_step=0.000221, train/loss_step=0.0657, global_step=3677.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2425/5971 [25:41<37:32,  1.57it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0078, train/loss_vlb_step=3.55e-5, train/loss_step=0.0078, global_step=3678.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  41%|████      | 2426/5971 [25:42<37:32,  1.57it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0175, train/loss_vlb_step=7.26e-5, train/loss_step=0.0175, global_step=3678.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2427/5971 [25:43<37:32,  1.57it/s, loss=0.172, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.0007, train/loss_step=0.201, global_step=3678.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  41%|████      | 2428/5971 [25:45<37:33,  1.57it/s, loss=0.172, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.0007, train/loss_step=0.201, global_step=3678.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2428/5971 [25:45<37:33,  1.57it/s, loss=0.177, v_num=0, train/loss_simple_step=0.565, train/loss_vlb_step=0.00597, train/loss_step=0.565, global_step=3678.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2429/5971 [25:46<37:33,  1.57it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0267, train/loss_vlb_step=0.000105, train/loss_step=0.0267, global_step=3679.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2430/5971 [25:47<37:33,  1.57it/s, loss=0.188, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.00131, train/loss_step=0.266, global_step=3679.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  41%|████      | 2431/5971 [25:47<37:33,  1.57it/s, loss=0.207, v_num=0, train/loss_simple_step=0.506, train/loss_vlb_step=0.00543, train/loss_step=0.506, global_step=3679.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2432/5971 [25:50<37:34,  1.57it/s, loss=0.207, v_num=0, train/loss_simple_step=0.506, train/loss_vlb_step=0.00543, train/loss_step=0.506, global_step=3679.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2432/5971 [25:50<37:34,  1.57it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0909, train/loss_vlb_step=0.000304, train/loss_step=0.0909, global_step=3679.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2433/5971 [25:51<37:34,  1.57it/s, loss=0.216, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000369, train/loss_step=0.112, global_step=3680.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  41%|████      | 2434/5971 [25:51<37:34,  1.57it/s, loss=0.215, v_num=0, train/loss_simple_step=0.00179, train/loss_vlb_step=1.08e-5, train/loss_step=0.00179, global_step=3680.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2435/5971 [25:52<37:33,  1.57it/s, loss=0.225, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.000813, train/loss_step=0.231, global_step=3680.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  41%|████      | 2436/5971 [25:54<37:35,  1.57it/s, loss=0.225, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.000813, train/loss_step=0.231, global_step=3680.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2436/5971 [25:54<37:35,  1.57it/s, loss=0.21, v_num=0, train/loss_simple_step=0.0243, train/loss_vlb_step=9.2e-5, train/loss_step=0.0243, global_step=3680.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  41%|████      | 2437/5971 [25:55<37:35,  1.57it/s, loss=0.204, v_num=0, train/loss_simple_step=0.00161, train/loss_vlb_step=9.62e-6, train/loss_step=0.00161, global_step=3681.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2438/5971 [25:56<37:34,  1.57it/s, loss=0.209, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000406, train/loss_step=0.123, global_step=3681.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  41%|████      | 2439/5971 [25:57<37:34,  1.57it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0645, train/loss_vlb_step=0.000219, train/loss_step=0.0645, global_step=3681.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2440/5971 [26:00<37:36,  1.56it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0645, train/loss_vlb_step=0.000219, train/loss_step=0.0645, global_step=3681.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2440/5971 [26:00<37:36,  1.56it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0349, train/loss_vlb_step=0.000132, train/loss_step=0.0349, global_step=3681.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2441/5971 [26:00<37:36,  1.56it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00245, train/loss_vlb_step=1.32e-5, train/loss_step=0.00245, global_step=3682.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2442/5971 [26:01<37:36,  1.56it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0314, train/loss_vlb_step=0.00012, train/loss_step=0.0314, global_step=3682.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  41%|████      | 2443/5971 [26:02<37:35,  1.56it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00129, train/loss_vlb_step=7.82e-6, train/loss_step=0.00129, global_step=3682.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2444/5971 [26:04<37:37,  1.56it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00129, train/loss_vlb_step=7.82e-6, train/loss_step=0.00129, global_step=3682.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2444/5971 [26:04<37:37,  1.56it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0266, train/loss_vlb_step=0.000107, train/loss_step=0.0266, global_step=3682.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  41%|████      | 2445/5971 [26:05<37:37,  1.56it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00255, train/loss_vlb_step=1.43e-5, train/loss_step=0.00255, global_step=3683.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2446/5971 [26:06<37:36,  1.56it/s, loss=0.131, v_num=0, train/loss_simple_step=0.304, train/loss_vlb_step=0.00132, train/loss_step=0.304, global_step=3683.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  41%|████      | 2447/5971 [26:07<37:36,  1.56it/s, loss=0.123, v_num=0, train/loss_simple_step=0.043, train/loss_vlb_step=0.000162, train/loss_step=0.043, global_step=3683.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2448/5971 [26:09<37:38,  1.56it/s, loss=0.123, v_num=0, train/loss_simple_step=0.043, train/loss_vlb_step=0.000162, train/loss_step=0.043, global_step=3683.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2448/5971 [26:09<37:38,  1.56it/s, loss=0.11, v_num=0, train/loss_simple_step=0.299, train/loss_vlb_step=0.00154, train/loss_step=0.299, global_step=3683.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  41%|████      | 2449/5971 [26:10<37:38,  1.56it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0204, train/loss_vlb_step=7.99e-5, train/loss_step=0.0204, global_step=3684.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2450/5971 [26:11<37:37,  1.56it/s, loss=0.11, v_num=0, train/loss_simple_step=0.276, train/loss_vlb_step=0.00103, train/loss_step=0.276, global_step=3684.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  41%|████      | 2451/5971 [26:12<37:37,  1.56it/s, loss=0.108, v_num=0, train/loss_simple_step=0.468, train/loss_vlb_step=0.00342, train/loss_step=0.468, global_step=3684.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2452/5971 [26:14<37:39,  1.56it/s, loss=0.108, v_num=0, train/loss_simple_step=0.468, train/loss_vlb_step=0.00342, train/loss_step=0.468, global_step=3684.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2452/5971 [26:14<37:39,  1.56it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0918, train/loss_vlb_step=0.000302, train/loss_step=0.0918, global_step=3684.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2453/5971 [26:15<37:38,  1.56it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0761, train/loss_vlb_step=0.00025, train/loss_step=0.0761, global_step=3685.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  41%|████      | 2454/5971 [26:16<37:38,  1.56it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0186, train/loss_vlb_step=7.12e-5, train/loss_step=0.0186, global_step=3685.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2455/5971 [26:17<37:38,  1.56it/s, loss=0.0981, v_num=0, train/loss_simple_step=0.0516, train/loss_vlb_step=0.000189, train/loss_step=0.0516, global_step=3685.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2456/5971 [26:19<37:39,  1.56it/s, loss=0.0981, v_num=0, train/loss_simple_step=0.0516, train/loss_vlb_step=0.000189, train/loss_step=0.0516, global_step=3685.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2456/5971 [26:19<37:39,  1.56it/s, loss=0.0974, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=5.44e-5, train/loss_step=0.0118, global_step=3685.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  41%|████      | 2457/5971 [26:20<37:39,  1.56it/s, loss=0.0975, v_num=0, train/loss_simple_step=0.00353, train/loss_vlb_step=1.87e-5, train/loss_step=0.00353, global_step=3686.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2458/5971 [26:21<37:39,  1.55it/s, loss=0.0916, v_num=0, train/loss_simple_step=0.0042, train/loss_vlb_step=2.15e-5, train/loss_step=0.0042, global_step=3686.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  41%|████      | 2459/5971 [26:22<37:38,  1.55it/s, loss=0.0954, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=3686.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  41%|████      | 2460/5971 [26:24<37:40,  1.55it/s, loss=0.0954, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=3686.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2460/5971 [26:24<37:40,  1.55it/s, loss=0.105, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.00106, train/loss_step=0.229, global_step=3686.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  41%|████      | 2461/5971 [26:25<37:40,  1.55it/s, loss=0.123, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00182, train/loss_step=0.364, global_step=3687.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2462/5971 [26:26<37:40,  1.55it/s, loss=0.133, v_num=0, train/loss_simple_step=0.223, train/loss_vlb_step=0.000762, train/loss_step=0.223, global_step=3687.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████      | 2463/5971 [26:27<37:39,  1.55it/s, loss=0.143, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000746, train/loss_step=0.209, global_step=3687.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████▏     | 2464/5971 [26:29<37:41,  1.55it/s, loss=0.143, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000746, train/loss_step=0.209, global_step=3687.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████▏     | 2464/5971 [26:29<37:41,  1.55it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.71e-5, train/loss_step=0.0154, global_step=3687.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████▏     | 2465/5971 [26:30<37:41,  1.55it/s, loss=0.158, v_num=0, train/loss_simple_step=0.315, train/loss_vlb_step=0.00139, train/loss_step=0.315, global_step=3688.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  41%|████▏     | 2466/5971 [26:31<37:40,  1.55it/s, loss=0.153, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000689, train/loss_step=0.208, global_step=3688.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████▏     | 2467/5971 [26:32<37:40,  1.55it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0215, train/loss_vlb_step=8.8e-5, train/loss_step=0.0215, global_step=3688.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████▏     | 2468/5971 [26:34<37:42,  1.55it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0215, train/loss_vlb_step=8.8e-5, train/loss_step=0.0215, global_step=3688.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████▏     | 2468/5971 [26:34<37:42,  1.55it/s, loss=0.141, v_num=0, train/loss_simple_step=0.075, train/loss_vlb_step=0.000253, train/loss_step=0.075, global_step=3688.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████▏     | 2469/5971 [26:35<37:41,  1.55it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00198, train/loss_vlb_step=1.15e-5, train/loss_step=0.00198, global_step=3689.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████▏     | 2470/5971 [26:36<37:41,  1.55it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0425, train/loss_vlb_step=0.000149, train/loss_step=0.0425, global_step=3689.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████▏     | 2471/5971 [26:37<37:41,  1.55it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=5.12e-5, train/loss_step=0.0109, global_step=3689.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  41%|████▏     | 2472/5971 [26:39<37:42,  1.55it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=5.12e-5, train/loss_step=0.0109, global_step=3689.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████▏     | 2472/5971 [26:39<37:42,  1.55it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0586, train/loss_vlb_step=0.000198, train/loss_step=0.0586, global_step=3689.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████▏     | 2473/5971 [26:40<37:42,  1.55it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0277, train/loss_vlb_step=0.0001, train/loss_step=0.0277, global_step=3690.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  41%|████▏     | 2474/5971 [26:41<37:42,  1.55it/s, loss=0.117, v_num=0, train/loss_simple_step=0.332, train/loss_vlb_step=0.00165, train/loss_step=0.332, global_step=3690.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  41%|████▏     | 2475/5971 [26:41<37:41,  1.55it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.69e-5, train/loss_step=0.00293, global_step=3690.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████▏     | 2476/5971 [26:44<37:43,  1.54it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.69e-5, train/loss_step=0.00293, global_step=3690.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  41%|████▏     | 2476/5971 [26:44<37:43,  1.54it/s, loss=0.14, v_num=0, train/loss_simple_step=0.509, train/loss_vlb_step=0.00427, train/loss_step=0.509, global_step=3690.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:  41%|████▏     | 2477/5971 [26:44<37:43,  1.54it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0729, train/loss_vlb_step=0.000242, train/loss_step=0.0729, global_step=3691.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  42%|████▏     | 2478/5971 [26:45<37:42,  1.54it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0482, train/loss_vlb_step=0.000176, train/loss_step=0.0482, global_step=3691.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  42%|████▏     | 2479/5971 [26:46<37:42,  1.54it/s, loss=0.158, v_num=0, train/loss_simple_step=0.388, train/loss_vlb_step=0.00218, train/loss_step=0.388, global_step=3691.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  42%|████▏     | 2480/5971 [26:49<37:44,  1.54it/s, loss=0.158, v_num=0, train/loss_simple_step=0.388, train/loss_vlb_step=0.00218, train/loss_step=0.388, global_step=3691.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  42%|████▏     | 2480/5971 [26:49<37:44,  1.54it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0459, train/loss_vlb_step=0.000164, train/loss_step=0.0459, global_step=3691.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  42%|████▏     | 2481/5971 [26:49<37:43,  1.54it/s, loss=0.139, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000589, train/loss_step=0.177, global_step=3692.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  42%|████▏     | 2482/5971 [26:50<37:43,  1.54it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0365, train/loss_vlb_step=0.000136, train/loss_step=0.0365, global_step=3692.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  42%|████▏     | 2483/5971 [26:51<37:43,  1.54it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00572, train/loss_vlb_step=2.8e-5, train/loss_step=0.00572, global_step=3692.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  42%|████▏     | 2484/5971 [26:53<37:44,  1.54it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00572, train/loss_vlb_step=2.8e-5, train/loss_step=0.00572, global_step=3692.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  42%|████▏     | 2484/5971 [26:53<37:44,  1.54it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0875, train/loss_vlb_step=0.000298, train/loss_step=0.0875, global_step=3692.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  42%|████▏     | 2485/5971 [26:54<37:44,  1.54it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00334, train/loss_vlb_step=1.81e-5, train/loss_step=0.00334, global_step=3693.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  42%|████▏     | 2486/5971 [26:55<37:43,  1.54it/s, loss=0.107, v_num=0, train/loss_simple_step=0.199, train/loss_vlb_step=0.000713, train/loss_step=0.199, global_step=3693.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  42%|████▏     | 2487/5971 [26:56<37:43,  1.54it/s, loss=0.118, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.00101, train/loss_step=0.243, global_step=3693.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  42%|████▏     | 2488/5971 [26:58<37:45,  1.54it/s, loss=0.118, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.00101, train/loss_step=0.243, global_step=3693.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  42%|████▏     | 2488/5971 [26:58<37:45,  1.54it/s, loss=0.122, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000488, train/loss_step=0.148, global_step=3693.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  42%|████▏     | 2489/5971 [26:59<37:45,  1.54it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0481, train/loss_vlb_step=0.000166, train/loss_step=0.0481, global_step=3694.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  42%|████▏     | 2490/5971 [27:00<37:44,  1.54it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00287, train/loss_vlb_step=1.63e-5, train/loss_step=0.00287, global_step=3694.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  42%|████▏     | 2491/5971 [27:01<37:44,  1.54it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0439, train/loss_vlb_step=0.000163, train/loss_step=0.0439, global_step=3694.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  42%|████▏     | 2492/5971 [27:03<37:45,  1.54it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0439, train/loss_vlb_step=0.000163, train/loss_step=0.0439, global_step=3694.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  42%|████▏     | 2492/5971 [27:03<37:45,  1.54it/s, loss=0.132, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000747, train/loss_step=0.214, global_step=3694.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  42%|████▏     | 2493/5971 [27:04<37:45,  1.54it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0318, train/loss_vlb_step=0.000121, train/loss_step=0.0318, global_step=3695.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  42%|████▏     | 2494/5971 [27:05<37:45,  1.54it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0786, train/loss_vlb_step=0.00026, train/loss_step=0.0786, global_step=3695.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  42%|████▏     | 2495/5971 [27:06<37:44,  1.53it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0222, train/loss_vlb_step=9.24e-5, train/loss_step=0.0222, global_step=3695.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  42%|████▏     | 2496/5971 [27:08<37:46,  1.53it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0222, train/loss_vlb_step=9.24e-5, train/loss_step=0.0222, global_step=3695.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  42%|████▏     | 2496/5971 [27:08<37:46,  1.53it/s, loss=0.0971, v_num=0, train/loss_simple_step=0.0455, train/loss_vlb_step=0.000155, train/loss_step=0.0455, global_step=3695.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  42%|████▏     | 2497/5971 [27:09<37:45,  1.53it/s, loss=0.0948, v_num=0, train/loss_simple_step=0.0259, train/loss_vlb_step=0.000107, train/loss_step=0.0259, global_step=3696.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  42%|████▏     | 2498/5971 [27:10<37:45,  1.53it/s, loss=0.0925, v_num=0, train/loss_simple_step=0.00218, train/loss_vlb_step=1.22e-5, train/loss_step=0.00218, global_step=3696.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  42%|████▏     | 2499/5971 [27:11<37:45,  1.53it/s, loss=0.108, v_num=0, train/loss_simple_step=0.698, train/loss_vlb_step=0.00925, train/loss_step=0.698, global_step=3696.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:  42%|████▏     | 2500/5971 [27:13<37:46,  1.53it/s, loss=0.108, v_num=0, train/loss_simple_step=0.698, train/loss_vlb_step=0.00925, train/loss_step=0.698, global_step=3696.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  42%|████▏     | 2500/5971 [27:13<37:46,  1.53it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=5.16e-5, train/loss_step=0.0116, global_step=3696.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  42%|████▏     | 2501/5971 [27:14<37:46,  1.53it/s, loss=0.129, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.314, train/loss_step=0.624, global_step=3697.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  42%|████▏     | 2502/5971 [27:15<37:46,  1.53it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00135, train/loss_vlb_step=8.07e-6, train/loss_step=0.00135, global_step=3697.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  42%|████▏     | 2503/5971 [27:15<37:45,  1.53it/s, loss=0.149, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.00248, train/loss_step=0.454, global_step=3697.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  42%|████▏     | 2504/5971 [27:18<37:47,  1.53it/s, loss=0.149, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.00248, train/loss_step=0.454, global_step=3697.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  42%|████▏     | 2504/5971 [27:18<37:47,  1.53it/s, loss=0.153, v_num=0, train/loss_simple_step=0.167, train/loss_vlb_step=0.000594, train/loss_step=0.167, global_step=3697.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  42%|████▏     | 2505/5971 [27:18<37:46,  1.53it/s, loss=0.159, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.00039, train/loss_step=0.112, global_step=3698.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  42%|████▏     | 2506/5971 [27:19<37:46,  1.53it/s, loss=0.149, v_num=0, train/loss_simple_step=0.002, train/loss_vlb_step=1.19e-5, train/loss_step=0.002, global_step=3698.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  42%|████▏     | 2507/5971 [27:20<37:46,  1.53it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0417, train/loss_vlb_step=0.00015, train/loss_step=0.0417, global_step=3698.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  42%|████▏     | 2508/5971 [27:23<37:47,  1.53it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0417, train/loss_vlb_step=0.00015, train/loss_step=0.0417, global_step=3698.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  42%|████▏     | 2508/5971 [27:23<37:47,  1.53it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00201, train/loss_vlb_step=1.2e-5, train/loss_step=0.00201, global_step=3698.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  42%|████▏     | 2509/5971 [27:24<37:47,  1.53it/s, loss=0.135, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000396, train/loss_step=0.120, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  42%|████▏     | 2510/5971 [27:24<37:47,  1.53it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00749, train/loss_vlb_step=3.43e-5, train/loss_step=0.00749, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  42%|████▏     | 2511/5971 [27:25<37:46,  1.53it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0668, train/loss_vlb_step=0.000231, train/loss_step=0.0668, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  42%|████▏     | 2512/5971 [27:27<37:48,  1.52it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0668, train/loss_vlb_step=0.000231, train/loss_step=0.0668, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  42%|████▏     | 2512/5971 [27:27<37:48,  1.52it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:05,  2.55it/s][A

Validating:   1%|          | 2/167 [00:00<00:51,  3.21it/s][A
Epoch 6:  42%|████▏     | 2516/5971 [27:28<37:43,  1.53it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   3%|▎         | 5/167 [00:00<00:18,  8.84it/s][A
Epoch 6:  42%|████▏     | 2520/5971 [27:28<37:37,  1.53it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   5%|▍         | 8/167 [00:00<00:12, 12.91it/s][A

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.36it/s][A
Epoch 6:  42%|████▏     | 2524/5971 [27:29<37:31,  1.53it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   8%|▊         | 14/167 [00:01<00:08, 19.00it/s][A
Epoch 6:  42%|████▏     | 2528/5971 [27:29<37:25,  1.53it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  10%|█         | 17/167 [00:01<00:07, 20.83it/s][A
Epoch 6:  42%|████▏     | 2532/5971 [27:29<37:19,  1.54it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 21.76it/s][A

Validating:  14%|█▍        | 23/167 [00:01<00:06, 23.67it/s][A
Epoch 6:  42%|████▏     | 2536/5971 [27:29<37:13,  1.54it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.75it/s][A
Epoch 6:  43%|████▎     | 2540/5971 [27:29<37:07,  1.54it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 25.54it/s][A
Epoch 6:  43%|████▎     | 2544/5971 [27:29<37:01,  1.54it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 26.50it/s][A

Validating:  21%|██        | 35/167 [00:01<00:04, 26.77it/s][A
Epoch 6:  43%|████▎     | 2548/5971 [27:29<36:55,  1.54it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  23%|██▎       | 38/167 [00:02<00:04, 26.68it/s][A
Epoch 6:  43%|████▎     | 2552/5971 [27:30<36:49,  1.55it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 26.13it/s][A
Epoch 6:  43%|████▎     | 2556/5971 [27:30<36:43,  1.55it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 25.92it/s][A

Validating:  28%|██▊       | 47/167 [00:02<00:04, 26.03it/s][A
Epoch 6:  43%|████▎     | 2560/5971 [27:30<36:38,  1.55it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 25.56it/s][A
Epoch 6:  43%|████▎     | 2564/5971 [27:30<36:32,  1.55it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 26.67it/s][A
Epoch 6:  43%|████▎     | 2568/5971 [27:30<36:26,  1.56it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 25.34it/s][A
Epoch 6:  43%|████▎     | 2572/5971 [27:30<36:20,  1.56it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  36%|███▌      | 60/167 [00:02<00:04, 25.60it/s][A
Epoch 6:  43%|████▎     | 2576/5971 [27:30<36:15,  1.56it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  38%|███▊      | 64/167 [00:03<00:03, 26.43it/s][A

Validating:  40%|████      | 67/167 [00:03<00:03, 25.82it/s][A
Epoch 6:  43%|████▎     | 2580/5971 [27:31<36:09,  1.56it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 25.26it/s][A
Epoch 6:  43%|████▎     | 2584/5971 [27:31<36:03,  1.57it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 25.79it/s][A
Epoch 6:  43%|████▎     | 2588/5971 [27:31<35:57,  1.57it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 26.13it/s][A

Validating:  47%|████▋     | 79/167 [00:03<00:03, 26.29it/s][A
Epoch 6:  43%|████▎     | 2592/5971 [27:31<35:52,  1.57it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 27.04it/s][A
Epoch 6:  43%|████▎     | 2596/5971 [27:31<35:46,  1.57it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  51%|█████▏    | 86/167 [00:03<00:03, 26.61it/s][A
Epoch 6:  44%|████▎     | 2600/5971 [27:31<35:40,  1.57it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  53%|█████▎    | 89/167 [00:03<00:02, 26.58it/s][A
Epoch 6:  44%|████▎     | 2604/5971 [27:32<35:35,  1.58it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 26.26it/s][A

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 27.24it/s][A
Epoch 6:  44%|████▎     | 2608/5971 [27:32<35:29,  1.58it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 27.76it/s][A
Epoch 6:  44%|████▎     | 2612/5971 [27:32<35:24,  1.58it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  60%|██████    | 101/167 [00:04<00:02, 26.72it/s][A
Epoch 6:  44%|████▍     | 2616/5971 [27:32<35:18,  1.58it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 26.33it/s][A

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 24.43it/s][A
Epoch 6:  44%|████▍     | 2620/5971 [27:32<35:13,  1.59it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 25.51it/s][A
Epoch 6:  44%|████▍     | 2624/5971 [27:32<35:07,  1.59it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  68%|██████▊   | 113/167 [00:04<00:02, 25.17it/s][A
Epoch 6:  44%|████▍     | 2628/5971 [27:32<35:01,  1.59it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  69%|██████▉   | 116/167 [00:04<00:01, 25.93it/s][A

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 26.89it/s][A
Epoch 6:  44%|████▍     | 2632/5971 [27:33<34:56,  1.59it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 28.38it/s][A
Epoch 6:  44%|████▍     | 2636/5971 [27:33<34:50,  1.60it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 28.76it/s][A
Epoch 6:  44%|████▍     | 2640/5971 [27:33<34:45,  1.60it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 27.18it/s][A
Epoch 6:  44%|████▍     | 2644/5971 [27:33<34:39,  1.60it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 27.39it/s][A

Validating:  81%|████████  | 135/167 [00:05<00:01, 26.92it/s][A
Epoch 6:  44%|████▍     | 2648/5971 [27:33<34:34,  1.60it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 25.50it/s][A
Epoch 6:  44%|████▍     | 2652/5971 [27:33<34:29,  1.60it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  84%|████████▍ | 141/167 [00:05<00:01, 25.69it/s][A
Epoch 6:  44%|████▍     | 2656/5971 [27:34<34:23,  1.61it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 25.37it/s][A

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 25.14it/s][A
Epoch 6:  45%|████▍     | 2660/5971 [27:34<34:18,  1.61it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 25.71it/s][A
Epoch 6:  45%|████▍     | 2664/5971 [27:34<34:12,  1.61it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 24.40it/s][A
Epoch 6:  45%|████▍     | 2668/5971 [27:34<34:07,  1.61it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 25.64it/s][A

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 26.69it/s][A
Epoch 6:  45%|████▍     | 2672/5971 [27:34<34:02,  1.62it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 26.37it/s][A
Epoch 6:  45%|████▍     | 2676/5971 [27:34<33:56,  1.62it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  99%|█████████▉| 165/167 [00:06<00:00, 24.56it/s][A
Epoch 6:  45%|████▍     | 2680/5971 [27:34<33:51,  1.62it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▍     | 2680/5971 [27:35<33:51,  1.62it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.32it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.37it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.15it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.71it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.12it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.46it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.72it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.89it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.05it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.16it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.23it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.36it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.44it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.38it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.30it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.41it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.48it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.54it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.57it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.62it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.62it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.64it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.67it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.68it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.69it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.58it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.56it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.56it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.55it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.58it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.61it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.63it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.60it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.58it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.50it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.50it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.50it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.50it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.51it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.55it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.56it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.53it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.54it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.55it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.54it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.56it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.60it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.62it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.52it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.45it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.17it/s]

Epoch 6:  45%|████▍     | 2680/5971 [27:47<34:06,  1.61it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▍     | 2681/5971 [27:47<34:05,  1.61it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.49e-5, train/loss_step=0.0154, global_step=3699.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▍     | 2681/5971 [27:47<34:05,  1.61it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00203, train/loss_vlb_step=1.13e-5, train/loss_step=0.00203, global_step=3700.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.35it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.43it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.26it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.88it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.27it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.61it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.86it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.01it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.08it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.19it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.29it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.25it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:07,  5.24it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.25it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.31it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.38it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.40it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.44it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.38it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.42it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.49it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.54it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.45it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.42it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.46it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.47it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.49it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.55it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.59it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.59it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.52it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.55it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.57it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.58it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.56it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.57it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.60it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.54it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.49it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.52it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.56it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.57it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.60it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.62it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.54it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.52it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.53it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.53it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.52it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.57it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.17it/s]

Epoch 6:  45%|████▍     | 2682/5971 [27:59<34:18,  1.60it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00203, train/loss_vlb_step=1.13e-5, train/loss_step=0.00203, global_step=3700.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▍     | 2682/5971 [27:59<34:18,  1.60it/s, loss=0.133, v_num=0, train/loss_simple_step=0.232, train/loss_vlb_step=0.00103, train/loss_step=0.232, global_step=3700.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.35it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.42it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.25it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.87it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.33it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.67it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.91it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.09it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.20it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.27it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.34it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.31it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.38it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.39it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.41it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.40it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.38it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.40it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.37it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.39it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.41it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.45it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.51it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.55it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.54it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.56it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.57it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.60it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.62it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.61it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.59it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.62it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.59it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.60it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.60it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.57it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.56it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.48it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.49it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.41it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.46it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.47it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.50it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.54it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.54it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.48it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.45it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.34it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.31it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.23it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.16it/s]

Epoch 6:  45%|████▍     | 2683/5971 [28:11<34:31,  1.59it/s, loss=0.133, v_num=0, train/loss_simple_step=0.232, train/loss_vlb_step=0.00103, train/loss_step=0.232, global_step=3700.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▍     | 2683/5971 [28:11<34:31,  1.59it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00399, train/loss_vlb_step=2.07e-5, train/loss_step=0.00399, global_step=3700.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.32it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.38it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.17it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.76it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.19it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.52it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.79it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.96it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.10it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.19it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.26it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.25it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.33it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.32it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.34it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.36it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.37it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.40it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.40it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.43it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.45it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.44it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.48it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.54it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.58it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.61it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.61it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.61it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.58it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.38it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.30it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.22it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.17it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:03,  5.28it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.37it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.35it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.37it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.40it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.41it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.43it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.50it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.55it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.47it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.48it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.50it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.50it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.53it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.56it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.58it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.48it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.11it/s]

Epoch 6:  45%|████▍     | 2684/5971 [28:24<34:47,  1.57it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00399, train/loss_vlb_step=2.07e-5, train/loss_step=0.00399, global_step=3700.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▍     | 2684/5971 [28:24<34:47,  1.57it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0204, train/loss_vlb_step=8.38e-5, train/loss_step=0.0204, global_step=3700.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  45%|████▍     | 2685/5971 [28:25<34:46,  1.57it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0204, train/loss_vlb_step=8.38e-5, train/loss_step=0.0204, global_step=3700.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▍     | 2685/5971 [28:25<34:46,  1.57it/s, loss=0.141, v_num=0, train/loss_simple_step=0.239, train/loss_vlb_step=0.00106, train/loss_step=0.239, global_step=3701.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  45%|████▍     | 2686/5971 [28:26<34:46,  1.57it/s, loss=0.141, v_num=0, train/loss_simple_step=0.239, train/loss_vlb_step=0.00106, train/loss_step=0.239, global_step=3701.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▍     | 2686/5971 [28:26<34:46,  1.57it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0193, train/loss_vlb_step=7.6e-5, train/loss_step=0.0193, global_step=3701.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2687/5971 [28:27<34:46,  1.57it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0193, train/loss_vlb_step=7.6e-5, train/loss_step=0.0193, global_step=3701.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2687/5971 [28:27<34:46,  1.57it/s, loss=0.122, v_num=0, train/loss_simple_step=0.295, train/loss_vlb_step=0.00115, train/loss_step=0.295, global_step=3701.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  45%|████▌     | 2688/5971 [28:30<34:47,  1.57it/s, loss=0.122, v_num=0, train/loss_simple_step=0.295, train/loss_vlb_step=0.00115, train/loss_step=0.295, global_step=3701.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2688/5971 [28:30<34:47,  1.57it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00418, train/loss_vlb_step=2.11e-5, train/loss_step=0.00418, global_step=3701.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2689/5971 [28:30<34:47,  1.57it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00418, train/loss_vlb_step=2.11e-5, train/loss_step=0.00418, global_step=3701.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2689/5971 [28:30<34:47,  1.57it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.0247, train/loss_vlb_step=9.95e-5, train/loss_step=0.0247, global_step=3702.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  45%|████▌     | 2690/5971 [28:31<34:47,  1.57it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.0247, train/loss_vlb_step=9.95e-5, train/loss_step=0.0247, global_step=3702.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2690/5971 [28:31<34:47,  1.57it/s, loss=0.0918, v_num=0, train/loss_simple_step=0.00726, train/loss_vlb_step=3.54e-5, train/loss_step=0.00726, global_step=3702.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2691/5971 [28:32<34:46,  1.57it/s, loss=0.0918, v_num=0, train/loss_simple_step=0.00726, train/loss_vlb_step=3.54e-5, train/loss_step=0.00726, global_step=3702.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2691/5971 [28:32<34:46,  1.57it/s, loss=0.0699, v_num=0, train/loss_simple_step=0.0151, train/loss_vlb_step=6.7e-5, train/loss_step=0.0151, global_step=3702.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  45%|████▌     | 2692/5971 [28:34<34:47,  1.57it/s, loss=0.0699, v_num=0, train/loss_simple_step=0.0151, train/loss_vlb_step=6.7e-5, train/loss_step=0.0151, global_step=3702.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2692/5971 [28:34<34:47,  1.57it/s, loss=0.0892, v_num=0, train/loss_simple_step=0.554, train/loss_vlb_step=0.00523, train/loss_step=0.554, global_step=3702.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  45%|████▌     | 2693/5971 [28:35<34:47,  1.57it/s, loss=0.0892, v_num=0, train/loss_simple_step=0.554, train/loss_vlb_step=0.00523, train/loss_step=0.554, global_step=3702.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2693/5971 [28:35<34:47,  1.57it/s, loss=0.0843, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.01e-5, train/loss_step=0.0139, global_step=3703.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2694/5971 [28:36<34:47,  1.57it/s, loss=0.0843, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.01e-5, train/loss_step=0.0139, global_step=3703.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2694/5971 [28:36<34:47,  1.57it/s, loss=0.0956, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000881, train/loss_step=0.228, global_step=3703.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  45%|████▌     | 2695/5971 [28:37<34:46,  1.57it/s, loss=0.0956, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000881, train/loss_step=0.228, global_step=3703.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2695/5971 [28:37<34:46,  1.57it/s, loss=0.103, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.00062, train/loss_step=0.185, global_step=3703.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  45%|████▌     | 2696/5971 [28:39<34:48,  1.57it/s, loss=0.103, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.00062, train/loss_step=0.185, global_step=3703.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2696/5971 [28:39<34:48,  1.57it/s, loss=0.12, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.0016, train/loss_step=0.348, global_step=3703.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  45%|████▌     | 2697/5971 [28:40<34:47,  1.57it/s, loss=0.12, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.0016, train/loss_step=0.348, global_step=3703.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2697/5971 [28:40<34:47,  1.57it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00309, train/loss_vlb_step=1.74e-5, train/loss_step=0.00309, global_step=3704.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2698/5971 [28:41<34:47,  1.57it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00309, train/loss_vlb_step=1.74e-5, train/loss_step=0.00309, global_step=3704.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2698/5971 [28:41<34:47,  1.57it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0696, train/loss_vlb_step=0.000232, train/loss_step=0.0696, global_step=3704.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  45%|████▌     | 2699/5971 [28:42<34:47,  1.57it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0696, train/loss_vlb_step=0.000232, train/loss_step=0.0696, global_step=3704.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2699/5971 [28:42<34:47,  1.57it/s, loss=0.137, v_num=0, train/loss_simple_step=0.468, train/loss_vlb_step=0.00633, train/loss_step=0.468, global_step=3704.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  45%|████▌     | 2700/5971 [28:44<34:48,  1.57it/s, loss=0.137, v_num=0, train/loss_simple_step=0.468, train/loss_vlb_step=0.00633, train/loss_step=0.468, global_step=3704.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2700/5971 [28:44<34:48,  1.57it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.41e-5, train/loss_step=0.00487, global_step=3704.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2701/5971 [28:45<34:48,  1.57it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.41e-5, train/loss_step=0.00487, global_step=3704.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2701/5971 [28:45<34:48,  1.57it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0843, train/loss_vlb_step=0.000281, train/loss_step=0.0843, global_step=3705.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  45%|████▌     | 2702/5971 [28:46<34:47,  1.57it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0843, train/loss_vlb_step=0.000281, train/loss_step=0.0843, global_step=3705.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2702/5971 [28:46<34:47,  1.57it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.46e-5, train/loss_step=0.0149, global_step=3705.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  45%|████▌     | 2703/5971 [28:47<34:47,  1.57it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.46e-5, train/loss_step=0.0149, global_step=3705.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2703/5971 [28:47<34:47,  1.57it/s, loss=0.145, v_num=0, train/loss_simple_step=0.302, train/loss_vlb_step=0.00131, train/loss_step=0.302, global_step=3705.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  45%|████▌     | 2704/5971 [28:49<34:48,  1.56it/s, loss=0.145, v_num=0, train/loss_simple_step=0.302, train/loss_vlb_step=0.00131, train/loss_step=0.302, global_step=3705.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2704/5971 [28:49<34:48,  1.56it/s, loss=0.169, v_num=0, train/loss_simple_step=0.499, train/loss_vlb_step=0.00721, train/loss_step=0.499, global_step=3705.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2705/5971 [28:50<34:48,  1.56it/s, loss=0.169, v_num=0, train/loss_simple_step=0.499, train/loss_vlb_step=0.00721, train/loss_step=0.499, global_step=3705.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2705/5971 [28:50<34:48,  1.56it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00445, train/loss_vlb_step=2.35e-5, train/loss_step=0.00445, global_step=3706.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2706/5971 [28:51<34:48,  1.56it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00445, train/loss_vlb_step=2.35e-5, train/loss_step=0.00445, global_step=3706.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2706/5971 [28:51<34:48,  1.56it/s, loss=0.16, v_num=0, train/loss_simple_step=0.082, train/loss_vlb_step=0.00027, train/loss_step=0.082, global_step=3706.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:  45%|████▌     | 2707/5971 [28:52<34:47,  1.56it/s, loss=0.16, v_num=0, train/loss_simple_step=0.082, train/loss_vlb_step=0.00027, train/loss_step=0.082, global_step=3706.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2707/5971 [28:52<34:47,  1.56it/s, loss=0.149, v_num=0, train/loss_simple_step=0.077, train/loss_vlb_step=0.00026, train/loss_step=0.077, global_step=3706.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2708/5971 [28:54<34:49,  1.56it/s, loss=0.149, v_num=0, train/loss_simple_step=0.077, train/loss_vlb_step=0.00026, train/loss_step=0.077, global_step=3706.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2708/5971 [28:54<34:49,  1.56it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0787, train/loss_vlb_step=0.000276, train/loss_step=0.0787, global_step=3706.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2709/5971 [28:55<34:49,  1.56it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0787, train/loss_vlb_step=0.000276, train/loss_step=0.0787, global_step=3706.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2709/5971 [28:55<34:49,  1.56it/s, loss=0.176, v_num=0, train/loss_simple_step=0.490, train/loss_vlb_step=0.00435, train/loss_step=0.490, global_step=3707.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  45%|████▌     | 2710/5971 [28:56<34:48,  1.56it/s, loss=0.176, v_num=0, train/loss_simple_step=0.490, train/loss_vlb_step=0.00435, train/loss_step=0.490, global_step=3707.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2710/5971 [28:56<34:48,  1.56it/s, loss=0.203, v_num=0, train/loss_simple_step=0.548, train/loss_vlb_step=0.0057, train/loss_step=0.548, global_step=3707.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  45%|████▌     | 2711/5971 [28:57<34:48,  1.56it/s, loss=0.203, v_num=0, train/loss_simple_step=0.548, train/loss_vlb_step=0.0057, train/loss_step=0.548, global_step=3707.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2711/5971 [28:57<34:48,  1.56it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0335, train/loss_vlb_step=0.000129, train/loss_step=0.0335, global_step=3707.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2712/5971 [28:59<34:49,  1.56it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0335, train/loss_vlb_step=0.000129, train/loss_step=0.0335, global_step=3707.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2712/5971 [28:59<34:49,  1.56it/s, loss=0.192, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00112, train/loss_step=0.305, global_step=3707.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  45%|████▌     | 2713/5971 [29:00<34:49,  1.56it/s, loss=0.192, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00112, train/loss_step=0.305, global_step=3707.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2713/5971 [29:00<34:49,  1.56it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0898, train/loss_vlb_step=0.000297, train/loss_step=0.0898, global_step=3708.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2714/5971 [29:01<34:48,  1.56it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0898, train/loss_vlb_step=0.000297, train/loss_step=0.0898, global_step=3708.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2714/5971 [29:01<34:48,  1.56it/s, loss=0.184, v_num=0, train/loss_simple_step=0.00206, train/loss_vlb_step=1.24e-5, train/loss_step=0.00206, global_step=3708.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2715/5971 [29:02<34:48,  1.56it/s, loss=0.184, v_num=0, train/loss_simple_step=0.00206, train/loss_vlb_step=1.24e-5, train/loss_step=0.00206, global_step=3708.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2715/5971 [29:02<34:48,  1.56it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0053, train/loss_vlb_step=2.46e-5, train/loss_step=0.0053, global_step=3708.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  45%|████▌     | 2716/5971 [29:04<34:49,  1.56it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0053, train/loss_vlb_step=2.46e-5, train/loss_step=0.0053, global_step=3708.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  45%|████▌     | 2716/5971 [29:04<34:49,  1.56it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00661, train/loss_vlb_step=3.31e-5, train/loss_step=0.00661, global_step=3708.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2717/5971 [29:05<34:49,  1.56it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00661, train/loss_vlb_step=3.31e-5, train/loss_step=0.00661, global_step=3708.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2717/5971 [29:05<34:49,  1.56it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0178, train/loss_vlb_step=7.29e-5, train/loss_step=0.0178, global_step=3709.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  46%|████▌     | 2718/5971 [29:06<34:49,  1.56it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0178, train/loss_vlb_step=7.29e-5, train/loss_step=0.0178, global_step=3709.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2718/5971 [29:06<34:49,  1.56it/s, loss=0.188, v_num=0, train/loss_simple_step=0.642, train/loss_vlb_step=0.00667, train/loss_step=0.642, global_step=3709.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  46%|████▌     | 2719/5971 [29:07<34:48,  1.56it/s, loss=0.188, v_num=0, train/loss_simple_step=0.642, train/loss_vlb_step=0.00667, train/loss_step=0.642, global_step=3709.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2719/5971 [29:07<34:48,  1.56it/s, loss=0.18, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.00138, train/loss_step=0.322, global_step=3709.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  46%|████▌     | 2720/5971 [29:09<34:50,  1.56it/s, loss=0.18, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.00138, train/loss_step=0.322, global_step=3709.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2720/5971 [29:09<34:50,  1.56it/s, loss=0.188, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000523, train/loss_step=0.156, global_step=3709.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2721/5971 [29:10<34:49,  1.56it/s, loss=0.188, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000523, train/loss_step=0.156, global_step=3709.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2721/5971 [29:10<34:49,  1.56it/s, loss=0.184, v_num=0, train/loss_simple_step=0.00621, train/loss_vlb_step=3.18e-5, train/loss_step=0.00621, global_step=3710.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2722/5971 [29:11<34:49,  1.55it/s, loss=0.184, v_num=0, train/loss_simple_step=0.00621, train/loss_vlb_step=3.18e-5, train/loss_step=0.00621, global_step=3710.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2722/5971 [29:11<34:49,  1.55it/s, loss=0.201, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00162, train/loss_step=0.353, global_step=3710.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  46%|████▌     | 2723/5971 [29:12<34:49,  1.55it/s, loss=0.201, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00162, train/loss_step=0.353, global_step=3710.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2723/5971 [29:12<34:49,  1.55it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00671, train/loss_vlb_step=3.19e-5, train/loss_step=0.00671, global_step=3710.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2724/5971 [29:14<34:50,  1.55it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00671, train/loss_vlb_step=3.19e-5, train/loss_step=0.00671, global_step=3710.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2724/5971 [29:14<34:50,  1.55it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00315, train/loss_vlb_step=1.8e-5, train/loss_step=0.00315, global_step=3710.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  46%|████▌     | 2725/5971 [29:15<34:50,  1.55it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00315, train/loss_vlb_step=1.8e-5, train/loss_step=0.00315, global_step=3710.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2725/5971 [29:15<34:50,  1.55it/s, loss=0.176, v_num=0, train/loss_simple_step=0.298, train/loss_vlb_step=0.00136, train/loss_step=0.298, global_step=3711.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  46%|████▌     | 2726/5971 [29:16<34:49,  1.55it/s, loss=0.176, v_num=0, train/loss_simple_step=0.298, train/loss_vlb_step=0.00136, train/loss_step=0.298, global_step=3711.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2726/5971 [29:16<34:49,  1.55it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00521, train/loss_vlb_step=2.64e-5, train/loss_step=0.00521, global_step=3711.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2727/5971 [29:17<34:49,  1.55it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00521, train/loss_vlb_step=2.64e-5, train/loss_step=0.00521, global_step=3711.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2727/5971 [29:17<34:49,  1.55it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00521, train/loss_vlb_step=2.57e-5, train/loss_step=0.00521, global_step=3711.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2728/5971 [29:19<34:50,  1.55it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00521, train/loss_vlb_step=2.57e-5, train/loss_step=0.00521, global_step=3711.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2728/5971 [29:19<34:50,  1.55it/s, loss=0.187, v_num=0, train/loss_simple_step=0.441, train/loss_vlb_step=0.00325, train/loss_step=0.441, global_step=3711.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  46%|████▌     | 2729/5971 [29:20<34:50,  1.55it/s, loss=0.187, v_num=0, train/loss_simple_step=0.441, train/loss_vlb_step=0.00325, train/loss_step=0.441, global_step=3711.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2729/5971 [29:20<34:50,  1.55it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0383, train/loss_vlb_step=0.000134, train/loss_step=0.0383, global_step=3712.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2730/5971 [29:21<34:50,  1.55it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0383, train/loss_vlb_step=0.000134, train/loss_step=0.0383, global_step=3712.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2730/5971 [29:21<34:50,  1.55it/s, loss=0.146, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000674, train/loss_step=0.190, global_step=3712.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  46%|████▌     | 2731/5971 [29:22<34:49,  1.55it/s, loss=0.146, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000674, train/loss_step=0.190, global_step=3712.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2731/5971 [29:22<34:49,  1.55it/s, loss=0.152, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000476, train/loss_step=0.142, global_step=3712.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2732/5971 [29:24<34:50,  1.55it/s, loss=0.152, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000476, train/loss_step=0.142, global_step=3712.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2732/5971 [29:24<34:50,  1.55it/s, loss=0.149, v_num=0, train/loss_simple_step=0.246, train/loss_vlb_step=0.00093, train/loss_step=0.246, global_step=3712.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  46%|████▌     | 2733/5971 [29:25<34:50,  1.55it/s, loss=0.149, v_num=0, train/loss_simple_step=0.246, train/loss_vlb_step=0.00093, train/loss_step=0.246, global_step=3712.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2733/5971 [29:25<34:50,  1.55it/s, loss=0.145, v_num=0, train/loss_simple_step=0.013, train/loss_vlb_step=5.48e-5, train/loss_step=0.013, global_step=3713.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2734/5971 [29:26<34:50,  1.55it/s, loss=0.145, v_num=0, train/loss_simple_step=0.013, train/loss_vlb_step=5.48e-5, train/loss_step=0.013, global_step=3713.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2734/5971 [29:26<34:50,  1.55it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0206, train/loss_vlb_step=7.56e-5, train/loss_step=0.0206, global_step=3713.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2735/5971 [29:27<34:49,  1.55it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0206, train/loss_vlb_step=7.56e-5, train/loss_step=0.0206, global_step=3713.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2735/5971 [29:27<34:49,  1.55it/s, loss=0.165, v_num=0, train/loss_simple_step=0.384, train/loss_vlb_step=0.00228, train/loss_step=0.384, global_step=3713.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  46%|████▌     | 2736/5971 [29:29<34:51,  1.55it/s, loss=0.165, v_num=0, train/loss_simple_step=0.384, train/loss_vlb_step=0.00228, train/loss_step=0.384, global_step=3713.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2736/5971 [29:29<34:51,  1.55it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0362, train/loss_vlb_step=0.000137, train/loss_step=0.0362, global_step=3713.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2737/5971 [29:30<34:50,  1.55it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0362, train/loss_vlb_step=0.000137, train/loss_step=0.0362, global_step=3713.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2737/5971 [29:30<34:50,  1.55it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00339, train/loss_vlb_step=1.79e-5, train/loss_step=0.00339, global_step=3714.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2738/5971 [29:30<34:50,  1.55it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00339, train/loss_vlb_step=1.79e-5, train/loss_step=0.00339, global_step=3714.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2738/5971 [29:30<34:50,  1.55it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0461, train/loss_vlb_step=0.000168, train/loss_step=0.0461, global_step=3714.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  46%|████▌     | 2739/5971 [29:31<34:49,  1.55it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0461, train/loss_vlb_step=0.000168, train/loss_step=0.0461, global_step=3714.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2739/5971 [29:31<34:49,  1.55it/s, loss=0.129, v_num=0, train/loss_simple_step=0.191, train/loss_vlb_step=0.000688, train/loss_step=0.191, global_step=3714.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  46%|████▌     | 2740/5971 [29:33<34:51,  1.55it/s, loss=0.129, v_num=0, train/loss_simple_step=0.191, train/loss_vlb_step=0.000688, train/loss_step=0.191, global_step=3714.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2740/5971 [29:33<34:51,  1.55it/s, loss=0.145, v_num=0, train/loss_simple_step=0.475, train/loss_vlb_step=0.00449, train/loss_step=0.475, global_step=3714.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  46%|████▌     | 2741/5971 [29:34<34:50,  1.54it/s, loss=0.145, v_num=0, train/loss_simple_step=0.475, train/loss_vlb_step=0.00449, train/loss_step=0.475, global_step=3714.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2741/5971 [29:34<34:50,  1.54it/s, loss=0.147, v_num=0, train/loss_simple_step=0.048, train/loss_vlb_step=0.000169, train/loss_step=0.048, global_step=3715.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2742/5971 [29:35<34:50,  1.54it/s, loss=0.147, v_num=0, train/loss_simple_step=0.048, train/loss_vlb_step=0.000169, train/loss_step=0.048, global_step=3715.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2742/5971 [29:35<34:50,  1.54it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0371, train/loss_vlb_step=0.000131, train/loss_step=0.0371, global_step=3715.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2743/5971 [29:36<34:49,  1.54it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0371, train/loss_vlb_step=0.000131, train/loss_step=0.0371, global_step=3715.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2743/5971 [29:36<34:49,  1.54it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00435, train/loss_vlb_step=2.16e-5, train/loss_step=0.00435, global_step=3715.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2744/5971 [29:38<34:51,  1.54it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00435, train/loss_vlb_step=2.16e-5, train/loss_step=0.00435, global_step=3715.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2744/5971 [29:38<34:51,  1.54it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0307, train/loss_vlb_step=0.000114, train/loss_step=0.0307, global_step=3715.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  46%|████▌     | 2745/5971 [29:39<34:50,  1.54it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0307, train/loss_vlb_step=0.000114, train/loss_step=0.0307, global_step=3715.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2745/5971 [29:39<34:50,  1.54it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0525, train/loss_vlb_step=0.000187, train/loss_step=0.0525, global_step=3716.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2746/5971 [29:40<34:50,  1.54it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0525, train/loss_vlb_step=0.000187, train/loss_step=0.0525, global_step=3716.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2746/5971 [29:40<34:50,  1.54it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0555, train/loss_vlb_step=0.000194, train/loss_step=0.0555, global_step=3716.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2747/5971 [29:41<34:49,  1.54it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0555, train/loss_vlb_step=0.000194, train/loss_step=0.0555, global_step=3716.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2747/5971 [29:41<34:49,  1.54it/s, loss=0.137, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00145, train/loss_step=0.287, global_step=3716.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  46%|████▌     | 2748/5971 [29:43<34:51,  1.54it/s, loss=0.137, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00145, train/loss_step=0.287, global_step=3716.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2748/5971 [29:43<34:51,  1.54it/s, loss=0.125, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000664, train/loss_step=0.193, global_step=3716.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2749/5971 [29:44<34:50,  1.54it/s, loss=0.125, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000664, train/loss_step=0.193, global_step=3716.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2749/5971 [29:44<34:50,  1.54it/s, loss=0.138, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00165, train/loss_step=0.305, global_step=3717.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  46%|████▌     | 2750/5971 [29:45<34:50,  1.54it/s, loss=0.138, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00165, train/loss_step=0.305, global_step=3717.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2750/5971 [29:45<34:50,  1.54it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0257, train/loss_vlb_step=9.97e-5, train/loss_step=0.0257, global_step=3717.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2751/5971 [29:46<34:50,  1.54it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0257, train/loss_vlb_step=9.97e-5, train/loss_step=0.0257, global_step=3717.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2751/5971 [29:46<34:50,  1.54it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0451, train/loss_vlb_step=0.000155, train/loss_step=0.0451, global_step=3717.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2752/5971 [29:48<34:51,  1.54it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0451, train/loss_vlb_step=0.000155, train/loss_step=0.0451, global_step=3717.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2752/5971 [29:48<34:51,  1.54it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0243, train/loss_vlb_step=9.88e-5, train/loss_step=0.0243, global_step=3717.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  46%|████▌     | 2753/5971 [29:49<34:50,  1.54it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0243, train/loss_vlb_step=9.88e-5, train/loss_step=0.0243, global_step=3717.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2753/5971 [29:49<34:50,  1.54it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0302, train/loss_vlb_step=0.000107, train/loss_step=0.0302, global_step=3718.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2754/5971 [29:50<34:50,  1.54it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0302, train/loss_vlb_step=0.000107, train/loss_step=0.0302, global_step=3718.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2754/5971 [29:50<34:50,  1.54it/s, loss=0.146, v_num=0, train/loss_simple_step=0.645, train/loss_vlb_step=0.0115, train/loss_step=0.645, global_step=3718.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  46%|████▌     | 2755/5971 [29:51<34:50,  1.54it/s, loss=0.146, v_num=0, train/loss_simple_step=0.645, train/loss_vlb_step=0.0115, train/loss_step=0.645, global_step=3718.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2755/5971 [29:51<34:50,  1.54it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=5.12e-5, train/loss_step=0.0113, global_step=3718.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2756/5971 [29:53<34:51,  1.54it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=5.12e-5, train/loss_step=0.0113, global_step=3718.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2756/5971 [29:53<34:51,  1.54it/s, loss=0.136, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000717, train/loss_step=0.208, global_step=3718.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  46%|████▌     | 2757/5971 [29:54<34:51,  1.54it/s, loss=0.136, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000717, train/loss_step=0.208, global_step=3718.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2757/5971 [29:54<34:51,  1.54it/s, loss=0.147, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.0008, train/loss_step=0.230, global_step=3719.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  46%|████▌     | 2758/5971 [29:55<34:50,  1.54it/s, loss=0.147, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.0008, train/loss_step=0.230, global_step=3719.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2758/5971 [29:55<34:50,  1.54it/s, loss=0.154, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000578, train/loss_step=0.174, global_step=3719.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2759/5971 [29:56<34:50,  1.54it/s, loss=0.154, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000578, train/loss_step=0.174, global_step=3719.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2759/5971 [29:56<34:50,  1.54it/s, loss=0.15, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000397, train/loss_step=0.121, global_step=3719.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  46%|████▌     | 2760/5971 [29:58<34:51,  1.54it/s, loss=0.15, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000397, train/loss_step=0.121, global_step=3719.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2760/5971 [29:58<34:51,  1.54it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00913, train/loss_vlb_step=4.23e-5, train/loss_step=0.00913, global_step=3719.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2761/5971 [29:59<34:51,  1.53it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00913, train/loss_vlb_step=4.23e-5, train/loss_step=0.00913, global_step=3719.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▌     | 2761/5971 [29:59<34:51,  1.53it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0613, train/loss_vlb_step=0.00021, train/loss_step=0.0613, global_step=3720.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  46%|████▋     | 2762/5971 [30:00<34:51,  1.53it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0613, train/loss_vlb_step=0.00021, train/loss_step=0.0613, global_step=3720.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▋     | 2762/5971 [30:00<34:51,  1.53it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0043, train/loss_vlb_step=2.32e-5, train/loss_step=0.0043, global_step=3720.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▋     | 2763/5971 [30:01<34:50,  1.53it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0043, train/loss_vlb_step=2.32e-5, train/loss_step=0.0043, global_step=3720.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▋     | 2763/5971 [30:01<34:50,  1.53it/s, loss=0.133, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.00048, train/loss_step=0.146, global_step=3720.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  46%|████▋     | 2764/5971 [30:03<34:51,  1.53it/s, loss=0.133, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.00048, train/loss_step=0.146, global_step=3720.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▋     | 2764/5971 [30:03<34:51,  1.53it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0581, train/loss_vlb_step=0.000193, train/loss_step=0.0581, global_step=3720.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▋     | 2765/5971 [30:04<34:51,  1.53it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0581, train/loss_vlb_step=0.000193, train/loss_step=0.0581, global_step=3720.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▋     | 2765/5971 [30:04<34:51,  1.53it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00422, train/loss_vlb_step=2.18e-5, train/loss_step=0.00422, global_step=3721.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▋     | 2766/5971 [30:05<34:50,  1.53it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00422, train/loss_vlb_step=2.18e-5, train/loss_step=0.00422, global_step=3721.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▋     | 2766/5971 [30:05<34:50,  1.53it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0862, train/loss_vlb_step=0.000287, train/loss_step=0.0862, global_step=3721.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  46%|████▋     | 2767/5971 [30:06<34:50,  1.53it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0862, train/loss_vlb_step=0.000287, train/loss_step=0.0862, global_step=3721.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▋     | 2767/5971 [30:06<34:50,  1.53it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0247, train/loss_vlb_step=9.77e-5, train/loss_step=0.0247, global_step=3721.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  46%|████▋     | 2768/5971 [30:08<34:51,  1.53it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0247, train/loss_vlb_step=9.77e-5, train/loss_step=0.0247, global_step=3721.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▋     | 2768/5971 [30:08<34:51,  1.53it/s, loss=0.152, v_num=0, train/loss_simple_step=0.824, train/loss_vlb_step=0.0389, train/loss_step=0.824, global_step=3721.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  46%|████▋     | 2769/5971 [30:09<34:51,  1.53it/s, loss=0.152, v_num=0, train/loss_simple_step=0.824, train/loss_vlb_step=0.0389, train/loss_step=0.824, global_step=3721.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▋     | 2769/5971 [30:09<34:51,  1.53it/s, loss=0.16, v_num=0, train/loss_simple_step=0.467, train/loss_vlb_step=0.00303, train/loss_step=0.467, global_step=3722.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▋     | 2770/5971 [30:10<34:51,  1.53it/s, loss=0.16, v_num=0, train/loss_simple_step=0.467, train/loss_vlb_step=0.00303, train/loss_step=0.467, global_step=3722.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▋     | 2770/5971 [30:10<34:51,  1.53it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0363, train/loss_vlb_step=0.000131, train/loss_step=0.0363, global_step=3722.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▋     | 2771/5971 [30:11<34:50,  1.53it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0363, train/loss_vlb_step=0.000131, train/loss_step=0.0363, global_step=3722.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▋     | 2771/5971 [30:11<34:50,  1.53it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0396, train/loss_vlb_step=0.00014, train/loss_step=0.0396, global_step=3722.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  46%|████▋     | 2772/5971 [30:13<34:51,  1.53it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0396, train/loss_vlb_step=0.00014, train/loss_step=0.0396, global_step=3722.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▋     | 2772/5971 [30:13<34:51,  1.53it/s, loss=0.167, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000559, train/loss_step=0.165, global_step=3722.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▋     | 2773/5971 [30:14<34:51,  1.53it/s, loss=0.167, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000559, train/loss_step=0.165, global_step=3722.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▋     | 2773/5971 [30:14<34:51,  1.53it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00141, train/loss_vlb_step=8.25e-6, train/loss_step=0.00141, global_step=3723.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▋     | 2774/5971 [30:15<34:51,  1.53it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00141, train/loss_vlb_step=8.25e-6, train/loss_step=0.00141, global_step=3723.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▋     | 2774/5971 [30:15<34:51,  1.53it/s, loss=0.151, v_num=0, train/loss_simple_step=0.342, train/loss_vlb_step=0.00208, train/loss_step=0.342, global_step=3723.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  46%|████▋     | 2775/5971 [30:16<34:50,  1.53it/s, loss=0.151, v_num=0, train/loss_simple_step=0.342, train/loss_vlb_step=0.00208, train/loss_step=0.342, global_step=3723.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▋     | 2775/5971 [30:16<34:50,  1.53it/s, loss=0.169, v_num=0, train/loss_simple_step=0.381, train/loss_vlb_step=0.00212, train/loss_step=0.381, global_step=3723.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▋     | 2776/5971 [30:18<34:52,  1.53it/s, loss=0.169, v_num=0, train/loss_simple_step=0.381, train/loss_vlb_step=0.00212, train/loss_step=0.381, global_step=3723.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  46%|████▋     | 2776/5971 [30:18<34:52,  1.53it/s, loss=0.174, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.00139, train/loss_step=0.313, global_step=3723.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  47%|████▋     | 2777/5971 [30:19<34:51,  1.53it/s, loss=0.174, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.00139, train/loss_step=0.313, global_step=3723.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  47%|████▋     | 2777/5971 [30:19<34:51,  1.53it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.3e-5, train/loss_step=0.0139, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  47%|████▋     | 2778/5971 [30:20<34:51,  1.53it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.3e-5, train/loss_step=0.0139, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  47%|████▋     | 2778/5971 [30:20<34:51,  1.53it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0255, train/loss_vlb_step=0.000101, train/loss_step=0.0255, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  47%|████▋     | 2779/5971 [30:21<34:51,  1.53it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0255, train/loss_vlb_step=0.000101, train/loss_step=0.0255, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  47%|████▋     | 2779/5971 [30:21<34:51,  1.53it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=1.99e-5, train/loss_step=0.00382, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  47%|████▋     | 2780/5971 [30:23<34:52,  1.53it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=1.99e-5, train/loss_step=0.00382, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  47%|████▋     | 2780/5971 [30:23<34:52,  1.53it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:00,  2.75it/s]
Epoch 6:  47%|████▋     | 2782/5971 [30:23<34:49,  1.53it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   1%|          | 2/167 [00:00<00:47,  3.47it/s]
Epoch 6:  47%|████▋     | 2784/5971 [30:23<34:47,  1.53it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.08it/s]
Epoch 6:  47%|████▋     | 2787/5971 [30:24<34:43,  1.53it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   5%|▍         | 8/167 [00:00<00:12, 13.15it/s]
Epoch 6:  47%|████▋     | 2790/5971 [30:24<34:39,  1.53it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.43it/s]
Epoch 6:  47%|████▋     | 2793/5971 [30:24<34:35,  1.53it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   8%|▊         | 14/167 [00:01<00:08, 18.23it/s]
Epoch 6:  47%|████▋     | 2796/5971 [30:24<34:31,  1.53it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  10%|█         | 17/167 [00:01<00:07, 20.43it/s]
Epoch 6:  47%|████▋     | 2799/5971 [30:24<34:26,  1.53it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 21.97it/s]
Epoch 6:  47%|████▋     | 2802/5971 [30:24<34:22,  1.54it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 22.80it/s]
Epoch 6:  47%|████▋     | 2805/5971 [30:24<34:18,  1.54it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.09it/s]
Epoch 6:  47%|████▋     | 2808/5971 [30:24<34:14,  1.54it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 24.63it/s]
Epoch 6:  47%|████▋     | 2811/5971 [30:25<34:10,  1.54it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 25.72it/s]
Epoch 6:  47%|████▋     | 2814/5971 [30:25<34:06,  1.54it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  21%|██        | 35/167 [00:01<00:04, 26.82it/s]
Epoch 6:  47%|████▋     | 2817/5971 [30:25<34:02,  1.54it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  23%|██▎       | 38/167 [00:01<00:04, 27.44it/s]
Epoch 6:  47%|████▋     | 2821/5971 [30:25<33:57,  1.55it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 28.12it/s]
Epoch 6:  47%|████▋     | 2825/5971 [30:25<33:52,  1.55it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 27.27it/s]
Epoch 6:  47%|████▋     | 2829/5971 [30:25<33:46,  1.55it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  29%|██▉       | 49/167 [00:02<00:04, 28.26it/s]

Validating:  31%|███       | 52/167 [00:02<00:04, 27.75it/s]
Epoch 6:  47%|████▋     | 2833/5971 [30:25<33:41,  1.55it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  34%|███▎      | 56/167 [00:02<00:03, 28.90it/s]
Epoch 6:  48%|████▊     | 2837/5971 [30:25<33:36,  1.55it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  35%|███▌      | 59/167 [00:02<00:03, 29.07it/s][A
Epoch 6:  48%|████▊     | 2841/5971 [30:26<33:31,  1.56it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  37%|███▋      | 62/167 [00:02<00:03, 27.02it/s][A
Epoch 6:  48%|████▊     | 2845/5971 [30:26<33:25,  1.56it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  39%|███▉      | 65/167 [00:02<00:03, 26.56it/s][A

Validating:  41%|████      | 68/167 [00:03<00:03, 26.53it/s][A
Epoch 6:  48%|████▊     | 2849/5971 [30:26<33:20,  1.56it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 26.93it/s][A
Epoch 6:  48%|████▊     | 2853/5971 [30:26<33:15,  1.56it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 27.41it/s][A
Epoch 6:  48%|████▊     | 2857/5971 [30:26<33:10,  1.56it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 28.33it/s][A
Epoch 6:  48%|████▊     | 2861/5971 [30:26<33:05,  1.57it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  49%|████▉     | 82/167 [00:03<00:03, 27.92it/s][A
Epoch 6:  48%|████▊     | 2865/5971 [30:26<32:59,  1.57it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  51%|█████     | 85/167 [00:03<00:03, 26.81it/s][A

Validating:  53%|█████▎    | 88/167 [00:03<00:02, 27.45it/s][A
Epoch 6:  48%|████▊     | 2869/5971 [30:27<32:54,  1.57it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  54%|█████▍    | 91/167 [00:03<00:02, 27.66it/s][A
Epoch 6:  48%|████▊     | 2873/5971 [30:27<32:49,  1.57it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 27.53it/s][A
Epoch 6:  48%|████▊     | 2877/5971 [30:27<32:44,  1.57it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 27.13it/s][A

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 25.93it/s][A
Epoch 6:  48%|████▊     | 2881/5971 [30:27<32:39,  1.58it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 27.81it/s][A
Epoch 6:  48%|████▊     | 2885/5971 [30:27<32:34,  1.58it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 28.56it/s][A
Epoch 6:  48%|████▊     | 2889/5971 [30:27<32:29,  1.58it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 27.49it/s][A
Epoch 6:  48%|████▊     | 2893/5971 [30:27<32:24,  1.58it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  68%|██████▊   | 114/167 [00:04<00:01, 26.66it/s][A
Epoch 6:  49%|████▊     | 2897/5971 [30:28<32:19,  1.59it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  70%|███████   | 117/167 [00:04<00:01, 27.25it/s][A

Validating:  72%|███████▏  | 120/167 [00:04<00:01, 25.83it/s][A
Epoch 6:  49%|████▊     | 2901/5971 [30:28<32:14,  1.59it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 26.20it/s][A
Epoch 6:  49%|████▊     | 2905/5971 [30:28<32:09,  1.59it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 26.68it/s][A
Epoch 6:  49%|████▊     | 2909/5971 [30:28<32:04,  1.59it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 26.93it/s][A

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 27.62it/s][A
Epoch 6:  49%|████▉     | 2913/5971 [30:28<31:59,  1.59it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 28.61it/s][A
Epoch 6:  49%|████▉     | 2917/5971 [30:28<31:54,  1.60it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  83%|████████▎ | 139/167 [00:05<00:00, 28.55it/s][A
Epoch 6:  49%|████▉     | 2921/5971 [30:28<31:49,  1.60it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  85%|████████▌ | 142/167 [00:05<00:00, 27.93it/s][A
Epoch 6:  49%|████▉     | 2925/5971 [30:29<31:44,  1.60it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  87%|████████▋ | 145/167 [00:05<00:00, 27.32it/s][A

Validating:  89%|████████▊ | 148/167 [00:05<00:00, 27.35it/s][A
Epoch 6:  49%|████▉     | 2929/5971 [30:29<31:39,  1.60it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 26.67it/s][A
Epoch 6:  49%|████▉     | 2933/5971 [30:29<31:34,  1.60it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 26.66it/s][A
Epoch 6:  49%|████▉     | 2937/5971 [30:29<31:29,  1.61it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 26.32it/s][A

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 26.93it/s][A
Epoch 6:  49%|████▉     | 2941/5971 [30:29<31:24,  1.61it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 26.55it/s][A
Epoch 6:  49%|████▉     | 2945/5971 [30:29<31:19,  1.61it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 27.10it/s][A
Epoch 6:  49%|████▉     | 2948/5971 [30:30<31:16,  1.61it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Epoch 6:  49%|████▉     | 2949/5971 [30:31<31:15,  1.61it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00487, train/loss_vlb_step=2.48e-5, train/loss_step=0.00487, global_step=3724.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  49%|████▉     | 2949/5971 [30:31<31:15,  1.61it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0665, train/loss_vlb_step=0.000224, train/loss_step=0.0665, global_step=3725.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  49%|████▉     | 2950/5971 [30:32<31:15,  1.61it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0045, train/loss_vlb_step=2.29e-5, train/loss_step=0.0045, global_step=3725.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  49%|████▉     | 2951/5971 [30:33<31:15,  1.61it/s, loss=0.176, v_num=0, train/loss_simple_step=0.655, train/loss_vlb_step=0.0142, train/loss_step=0.655, global_step=3725.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  49%|████▉     | 2952/5971 [30:35<31:16,  1.61it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00294, train/loss_vlb_step=1.65e-5, train/loss_step=0.00294, global_step=3725.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  49%|████▉     | 2953/5971 [30:36<31:16,  1.61it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00294, train/loss_vlb_step=1.65e-5, train/loss_step=0.00294, global_step=3725.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  49%|████▉     | 2953/5971 [30:36<31:16,  1.61it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0486, train/loss_vlb_step=0.000168, train/loss_step=0.0486, global_step=3726.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  49%|████▉     | 2954/5971 [30:37<31:15,  1.61it/s, loss=0.171, v_num=0, train/loss_simple_step=0.00413, train/loss_vlb_step=2.13e-5, train/loss_step=0.00413, global_step=3726.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  49%|████▉     | 2955/5971 [30:38<31:15,  1.61it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0703, train/loss_vlb_step=0.000246, train/loss_step=0.0703, global_step=3726.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  50%|████▉     | 2956/5971 [30:40<31:16,  1.61it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0054, train/loss_vlb_step=2.73e-5, train/loss_step=0.0054, global_step=3726.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  50%|████▉     | 2957/5971 [30:41<31:16,  1.61it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0054, train/loss_vlb_step=2.73e-5, train/loss_step=0.0054, global_step=3726.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|████▉     | 2957/5971 [30:41<31:16,  1.61it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0361, train/loss_vlb_step=0.000135, train/loss_step=0.0361, global_step=3727.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|████▉     | 2958/5971 [30:42<31:15,  1.61it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0711, train/loss_vlb_step=0.00024, train/loss_step=0.0711, global_step=3727.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  50%|████▉     | 2959/5971 [30:42<31:15,  1.61it/s, loss=0.117, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000445, train/loss_step=0.125, global_step=3727.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  50%|████▉     | 2960/5971 [30:45<31:16,  1.60it/s, loss=0.118, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000634, train/loss_step=0.185, global_step=3727.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|████▉     | 2961/5971 [30:46<31:15,  1.60it/s, loss=0.118, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000634, train/loss_step=0.185, global_step=3727.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|████▉     | 2961/5971 [30:46<31:15,  1.60it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00855, train/loss_vlb_step=4.11e-5, train/loss_step=0.00855, global_step=3728.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|████▉     | 2962/5971 [30:46<31:15,  1.60it/s, loss=0.124, v_num=0, train/loss_simple_step=0.456, train/loss_vlb_step=0.00367, train/loss_step=0.456, global_step=3728.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  50%|████▉     | 2963/5971 [30:47<31:15,  1.60it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0334, train/loss_vlb_step=0.00012, train/loss_step=0.0334, global_step=3728.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|████▉     | 2964/5971 [30:50<31:16,  1.60it/s, loss=0.0913, v_num=0, train/loss_simple_step=0.00384, train/loss_vlb_step=2.05e-5, train/loss_step=0.00384, global_step=3728.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|████▉     | 2965/5971 [30:51<31:16,  1.60it/s, loss=0.0913, v_num=0, train/loss_simple_step=0.00384, train/loss_vlb_step=2.05e-5, train/loss_step=0.00384, global_step=3728.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|████▉     | 2965/5971 [30:51<31:16,  1.60it/s, loss=0.0909, v_num=0, train/loss_simple_step=0.0052, train/loss_vlb_step=2.71e-5, train/loss_step=0.0052, global_step=3729.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  50%|████▉     | 2966/5971 [30:52<31:16,  1.60it/s, loss=0.0914, v_num=0, train/loss_simple_step=0.0361, train/loss_vlb_step=0.000133, train/loss_step=0.0361, global_step=3729.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|████▉     | 2967/5971 [30:53<31:15,  1.60it/s, loss=0.0914, v_num=0, train/loss_simple_step=0.00407, train/loss_vlb_step=2e-5, train/loss_step=0.00407, global_step=3729.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  50%|████▉     | 2968/5971 [30:55<31:16,  1.60it/s, loss=0.0989, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.00052, train/loss_step=0.156, global_step=3729.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  50%|████▉     | 2969/5971 [30:56<31:16,  1.60it/s, loss=0.0989, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.00052, train/loss_step=0.156, global_step=3729.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|████▉     | 2969/5971 [30:56<31:16,  1.60it/s, loss=0.11, v_num=0, train/loss_simple_step=0.290, train/loss_vlb_step=0.00157, train/loss_step=0.290, global_step=3730.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  50%|████▉     | 2970/5971 [30:57<31:15,  1.60it/s, loss=0.145, v_num=0, train/loss_simple_step=0.705, train/loss_vlb_step=0.0172, train/loss_step=0.705, global_step=3730.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|████▉     | 2971/5971 [30:58<31:15,  1.60it/s, loss=0.116, v_num=0, train/loss_simple_step=0.073, train/loss_vlb_step=0.00024, train/loss_step=0.073, global_step=3730.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|████▉     | 2972/5971 [31:00<31:16,  1.60it/s, loss=0.159, v_num=0, train/loss_simple_step=0.866, train/loss_vlb_step=0.146, train/loss_step=0.866, global_step=3730.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  50%|████▉     | 2973/5971 [31:01<31:16,  1.60it/s, loss=0.159, v_num=0, train/loss_simple_step=0.866, train/loss_vlb_step=0.146, train/loss_step=0.866, global_step=3730.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|████▉     | 2973/5971 [31:01<31:16,  1.60it/s, loss=0.174, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00194, train/loss_step=0.349, global_step=3731.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|████▉     | 2974/5971 [31:02<31:15,  1.60it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0217, train/loss_vlb_step=8.47e-5, train/loss_step=0.0217, global_step=3731.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|████▉     | 2975/5971 [31:03<31:15,  1.60it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00478, train/loss_vlb_step=2.51e-5, train/loss_step=0.00478, global_step=3731.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|████▉     | 2976/5971 [31:05<31:16,  1.60it/s, loss=0.177, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000388, train/loss_step=0.118, global_step=3731.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  50%|████▉     | 2977/5971 [31:06<31:16,  1.60it/s, loss=0.177, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000388, train/loss_step=0.118, global_step=3731.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|████▉     | 2977/5971 [31:06<31:16,  1.60it/s, loss=0.177, v_num=0, train/loss_simple_step=0.028, train/loss_vlb_step=0.000105, train/loss_step=0.028, global_step=3732.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|████▉     | 2978/5971 [31:06<31:15,  1.60it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0532, train/loss_vlb_step=0.000191, train/loss_step=0.0532, global_step=3732.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|████▉     | 2979/5971 [31:07<31:15,  1.60it/s, loss=0.178, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.00053, train/loss_step=0.158, global_step=3732.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  50%|████▉     | 2980/5971 [31:09<31:16,  1.59it/s, loss=0.189, v_num=0, train/loss_simple_step=0.414, train/loss_vlb_step=0.00208, train/loss_step=0.414, global_step=3732.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|████▉     | 2981/5971 [31:10<31:15,  1.59it/s, loss=0.189, v_num=0, train/loss_simple_step=0.414, train/loss_vlb_step=0.00208, train/loss_step=0.414, global_step=3732.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|████▉     | 2981/5971 [31:10<31:15,  1.59it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0547, train/loss_vlb_step=0.000181, train/loss_step=0.0547, global_step=3733.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|████▉     | 2982/5971 [31:11<31:15,  1.59it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0205, train/loss_vlb_step=8.18e-5, train/loss_step=0.0205, global_step=3733.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  50%|████▉     | 2983/5971 [31:12<31:15,  1.59it/s, loss=0.183, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00164, train/loss_step=0.305, global_step=3733.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  50%|████▉     | 2984/5971 [31:14<31:16,  1.59it/s, loss=0.192, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.000626, train/loss_step=0.176, global_step=3733.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|████▉     | 2985/5971 [31:15<31:15,  1.59it/s, loss=0.192, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.000626, train/loss_step=0.176, global_step=3733.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|████▉     | 2985/5971 [31:15<31:15,  1.59it/s, loss=0.223, v_num=0, train/loss_simple_step=0.629, train/loss_vlb_step=0.00745, train/loss_step=0.629, global_step=3734.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  50%|█████     | 2986/5971 [31:16<31:15,  1.59it/s, loss=0.232, v_num=0, train/loss_simple_step=0.223, train/loss_vlb_step=0.000751, train/loss_step=0.223, global_step=3734.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|█████     | 2987/5971 [31:17<31:15,  1.59it/s, loss=0.251, v_num=0, train/loss_simple_step=0.377, train/loss_vlb_step=0.00216, train/loss_step=0.377, global_step=3734.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  50%|█████     | 2988/5971 [31:20<31:16,  1.59it/s, loss=0.247, v_num=0, train/loss_simple_step=0.0738, train/loss_vlb_step=0.000252, train/loss_step=0.0738, global_step=3734.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|█████     | 2989/5971 [31:21<31:16,  1.59it/s, loss=0.247, v_num=0, train/loss_simple_step=0.0738, train/loss_vlb_step=0.000252, train/loss_step=0.0738, global_step=3734.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|█████     | 2989/5971 [31:21<31:16,  1.59it/s, loss=0.238, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.00033, train/loss_step=0.100, global_step=3735.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  50%|█████     | 2990/5971 [31:21<31:15,  1.59it/s, loss=0.214, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.000844, train/loss_step=0.229, global_step=3735.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|█████     | 2991/5971 [31:22<31:15,  1.59it/s, loss=0.215, v_num=0, train/loss_simple_step=0.0944, train/loss_vlb_step=0.00031, train/loss_step=0.0944, global_step=3735.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|█████     | 2992/5971 [31:25<31:16,  1.59it/s, loss=0.173, v_num=0, train/loss_simple_step=0.029, train/loss_vlb_step=0.000111, train/loss_step=0.029, global_step=3735.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  50%|█████     | 2993/5971 [31:26<31:16,  1.59it/s, loss=0.173, v_num=0, train/loss_simple_step=0.029, train/loss_vlb_step=0.000111, train/loss_step=0.029, global_step=3735.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|█████     | 2993/5971 [31:26<31:16,  1.59it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0697, train/loss_vlb_step=0.000233, train/loss_step=0.0697, global_step=3736.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|█████     | 2994/5971 [31:26<31:15,  1.59it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00235, train/loss_vlb_step=1.38e-5, train/loss_step=0.00235, global_step=3736.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|█████     | 2995/5971 [31:27<31:15,  1.59it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00159, train/loss_vlb_step=9.61e-6, train/loss_step=0.00159, global_step=3736.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|█████     | 2996/5971 [31:30<31:16,  1.59it/s, loss=0.154, v_num=0, train/loss_simple_step=0.036, train/loss_vlb_step=0.000133, train/loss_step=0.036, global_step=3736.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  50%|█████     | 2997/5971 [31:30<31:15,  1.59it/s, loss=0.154, v_num=0, train/loss_simple_step=0.036, train/loss_vlb_step=0.000133, train/loss_step=0.036, global_step=3736.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|█████     | 2997/5971 [31:30<31:15,  1.59it/s, loss=0.172, v_num=0, train/loss_simple_step=0.400, train/loss_vlb_step=0.00293, train/loss_step=0.400, global_step=3737.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  50%|█████     | 2998/5971 [31:31<31:15,  1.59it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0239, train/loss_vlb_step=9.57e-5, train/loss_step=0.0239, global_step=3737.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|█████     | 2999/5971 [31:32<31:15,  1.59it/s, loss=0.179, v_num=0, train/loss_simple_step=0.315, train/loss_vlb_step=0.00183, train/loss_step=0.315, global_step=3737.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  50%|█████     | 3000/5971 [31:34<31:15,  1.58it/s, loss=0.168, v_num=0, train/loss_simple_step=0.198, train/loss_vlb_step=0.000732, train/loss_step=0.198, global_step=3737.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|█████     | 3001/5971 [31:35<31:15,  1.58it/s, loss=0.168, v_num=0, train/loss_simple_step=0.198, train/loss_vlb_step=0.000732, train/loss_step=0.198, global_step=3737.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|█████     | 3001/5971 [31:35<31:15,  1.58it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0618, train/loss_vlb_step=0.00021, train/loss_step=0.0618, global_step=3738.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|█████     | 3002/5971 [31:36<31:15,  1.58it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0842, train/loss_vlb_step=0.000278, train/loss_step=0.0842, global_step=3738.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|█████     | 3003/5971 [31:37<31:14,  1.58it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0162, train/loss_vlb_step=6.73e-5, train/loss_step=0.0162, global_step=3738.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  50%|█████     | 3004/5971 [31:39<31:15,  1.58it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00167, train/loss_vlb_step=1.01e-5, train/loss_step=0.00167, global_step=3738.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|█████     | 3005/5971 [31:40<31:15,  1.58it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00167, train/loss_vlb_step=1.01e-5, train/loss_step=0.00167, global_step=3738.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|█████     | 3005/5971 [31:40<31:15,  1.58it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00361, train/loss_vlb_step=1.88e-5, train/loss_step=0.00361, global_step=3739.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|█████     | 3006/5971 [31:41<31:14,  1.58it/s, loss=0.126, v_num=0, train/loss_simple_step=0.401, train/loss_vlb_step=0.00193, train/loss_step=0.401, global_step=3739.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  50%|█████     | 3007/5971 [31:42<31:14,  1.58it/s, loss=0.128, v_num=0, train/loss_simple_step=0.412, train/loss_vlb_step=0.00277, train/loss_step=0.412, global_step=3739.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|█████     | 3008/5971 [31:44<31:15,  1.58it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0876, train/loss_vlb_step=0.000288, train/loss_step=0.0876, global_step=3739.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|█████     | 3009/5971 [31:45<31:15,  1.58it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0876, train/loss_vlb_step=0.000288, train/loss_step=0.0876, global_step=3739.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|█████     | 3009/5971 [31:45<31:15,  1.58it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00207, train/loss_vlb_step=1.14e-5, train/loss_step=0.00207, global_step=3740.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|█████     | 3010/5971 [31:46<31:14,  1.58it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0426, train/loss_vlb_step=0.000153, train/loss_step=0.0426, global_step=3740.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  50%|█████     | 3011/5971 [31:47<31:14,  1.58it/s, loss=0.116, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000405, train/loss_step=0.123, global_step=3740.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  50%|█████     | 3012/5971 [31:49<31:15,  1.58it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0177, train/loss_vlb_step=7.62e-5, train/loss_step=0.0177, global_step=3740.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|█████     | 3013/5971 [31:50<31:15,  1.58it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0177, train/loss_vlb_step=7.62e-5, train/loss_step=0.0177, global_step=3740.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|█████     | 3013/5971 [31:50<31:15,  1.58it/s, loss=0.115, v_num=0, train/loss_simple_step=0.070, train/loss_vlb_step=0.000236, train/loss_step=0.070, global_step=3741.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  50%|█████     | 3014/5971 [31:51<31:14,  1.58it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0763, train/loss_vlb_step=0.000255, train/loss_step=0.0763, global_step=3741.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  50%|█████     | 3015/5971 [31:52<31:14,  1.58it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0177, train/loss_vlb_step=7.49e-5, train/loss_step=0.0177, global_step=3741.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  51%|█████     | 3016/5971 [31:54<31:15,  1.58it/s, loss=0.127, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000649, train/loss_step=0.185, global_step=3741.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  51%|█████     | 3017/5971 [31:55<31:14,  1.58it/s, loss=0.127, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000649, train/loss_step=0.185, global_step=3741.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  51%|█████     | 3017/5971 [31:55<31:14,  1.58it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.00028, train/loss_step=0.0851, global_step=3742.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  51%|█████     | 3018/5971 [31:56<31:14,  1.58it/s, loss=0.131, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00244, train/loss_step=0.424, global_step=3742.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  51%|█████     | 3019/5971 [31:57<31:13,  1.58it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0435, train/loss_vlb_step=0.000151, train/loss_step=0.0435, global_step=3742.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  51%|█████     | 3020/5971 [31:59<31:14,  1.57it/s, loss=0.11, v_num=0, train/loss_simple_step=0.052, train/loss_vlb_step=0.000176, train/loss_step=0.052, global_step=3742.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  51%|█████     | 3021/5971 [32:00<31:14,  1.57it/s, loss=0.11, v_num=0, train/loss_simple_step=0.052, train/loss_vlb_step=0.000176, train/loss_step=0.052, global_step=3742.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  51%|█████     | 3021/5971 [32:00<31:14,  1.57it/s, loss=0.108, v_num=0, train/loss_simple_step=0.025, train/loss_vlb_step=9.91e-5, train/loss_step=0.025, global_step=3743.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  51%|█████     | 3022/5971 [32:01<31:14,  1.57it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00266, train/loss_vlb_step=1.5e-5, train/loss_step=0.00266, global_step=3743.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  51%|█████     | 3023/5971 [32:02<31:13,  1.57it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0315, train/loss_vlb_step=0.000116, train/loss_step=0.0315, global_step=3743.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  51%|█████     | 3024/5971 [32:04<31:14,  1.57it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0899, train/loss_vlb_step=0.000309, train/loss_step=0.0899, global_step=3743.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  51%|█████     | 3025/5971 [32:05<31:14,  1.57it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0899, train/loss_vlb_step=0.000309, train/loss_step=0.0899, global_step=3743.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  51%|█████     | 3025/5971 [32:05<31:14,  1.57it/s, loss=0.118, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000577, train/loss_step=0.172, global_step=3744.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  51%|█████     | 3026/5971 [32:05<31:13,  1.57it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0916, train/loss_vlb_step=0.000305, train/loss_step=0.0916, global_step=3744.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  51%|█████     | 3027/5971 [32:06<31:13,  1.57it/s, loss=0.0829, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=7.93e-5, train/loss_step=0.0198, global_step=3744.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  51%|█████     | 3028/5971 [32:09<31:14,  1.57it/s, loss=0.0966, v_num=0, train/loss_simple_step=0.360, train/loss_vlb_step=0.00153, train/loss_step=0.360, global_step=3744.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  51%|█████     | 3029/5971 [32:10<31:13,  1.57it/s, loss=0.0966, v_num=0, train/loss_simple_step=0.360, train/loss_vlb_step=0.00153, train/loss_step=0.360, global_step=3744.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  51%|█████     | 3029/5971 [32:10<31:13,  1.57it/s, loss=0.0981, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.000117, train/loss_step=0.0325, global_step=3745.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  51%|█████     | 3030/5971 [32:10<31:13,  1.57it/s, loss=0.0979, v_num=0, train/loss_simple_step=0.039, train/loss_vlb_step=0.000145, train/loss_step=0.039, global_step=3745.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  51%|█████     | 3031/5971 [32:11<31:13,  1.57it/s, loss=0.0987, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000463, train/loss_step=0.140, global_step=3745.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  51%|█████     | 3032/5971 [32:14<31:14,  1.57it/s, loss=0.104, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000401, train/loss_step=0.122, global_step=3745.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  51%|█████     | 3033/5971 [32:15<31:14,  1.57it/s, loss=0.104, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000401, train/loss_step=0.122, global_step=3745.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  51%|█████     | 3033/5971 [32:15<31:14,  1.57it/s, loss=0.103, v_num=0, train/loss_simple_step=0.044, train/loss_vlb_step=0.000151, train/loss_step=0.044, global_step=3746.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  51%|█████     | 3034/5971 [32:16<31:13,  1.57it/s, loss=0.099, v_num=0, train/loss_simple_step=0.00279, train/loss_vlb_step=1.52e-5, train/loss_step=0.00279, global_step=3746.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  51%|█████     | 3035/5971 [32:17<31:13,  1.57it/s, loss=0.0989, v_num=0, train/loss_simple_step=0.0158, train/loss_vlb_step=6.15e-5, train/loss_step=0.0158, global_step=3746.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  51%|█████     | 3036/5971 [32:19<31:14,  1.57it/s, loss=0.101, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.000868, train/loss_step=0.229, global_step=3746.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  51%|█████     | 3037/5971 [32:20<31:13,  1.57it/s, loss=0.101, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.000868, train/loss_step=0.229, global_step=3746.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  51%|█████     | 3037/5971 [32:20<31:13,  1.57it/s, loss=0.0969, v_num=0, train/loss_simple_step=0.00161, train/loss_vlb_step=9.66e-6, train/loss_step=0.00161, global_step=3747.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  51%|█████     | 3038/5971 [32:21<31:13,  1.57it/s, loss=0.0762, v_num=0, train/loss_simple_step=0.00935, train/loss_vlb_step=4.09e-5, train/loss_step=0.00935, global_step=3747.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  51%|█████     | 3039/5971 [32:21<31:12,  1.57it/s, loss=0.0757, v_num=0, train/loss_simple_step=0.0338, train/loss_vlb_step=0.000128, train/loss_step=0.0338, global_step=3747.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  51%|█████     | 3040/5971 [32:24<31:13,  1.56it/s, loss=0.0853, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.000981, train/loss_step=0.243, global_step=3747.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  51%|█████     | 3041/5971 [32:25<31:13,  1.56it/s, loss=0.0853, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.000981, train/loss_step=0.243, global_step=3747.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  51%|█████     | 3041/5971 [32:25<31:13,  1.56it/s, loss=0.0872, v_num=0, train/loss_simple_step=0.0645, train/loss_vlb_step=0.000231, train/loss_step=0.0645, global_step=3748.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  51%|█████     | 3042/5971 [32:25<31:13,  1.56it/s, loss=0.0885, v_num=0, train/loss_simple_step=0.0272, train/loss_vlb_step=9.99e-5, train/loss_step=0.0272, global_step=3748.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  51%|█████     | 3043/5971 [32:26<31:12,  1.56it/s, loss=0.0883, v_num=0, train/loss_simple_step=0.0284, train/loss_vlb_step=0.000114, train/loss_step=0.0284, global_step=3748.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  51%|█████     | 3044/5971 [32:29<31:13,  1.56it/s, loss=0.0858, v_num=0, train/loss_simple_step=0.0395, train/loss_vlb_step=0.000147, train/loss_step=0.0395, global_step=3748.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  51%|█████     | 3045/5971 [32:30<31:13,  1.56it/s, loss=0.0858, v_num=0, train/loss_simple_step=0.0395, train/loss_vlb_step=0.000147, train/loss_step=0.0395, global_step=3748.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  51%|█████     | 3045/5971 [32:30<31:13,  1.56it/s, loss=0.0818, v_num=0, train/loss_simple_step=0.0916, train/loss_vlb_step=0.000303, train/loss_step=0.0916, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  51%|█████     | 3046/5971 [32:30<31:12,  1.56it/s, loss=0.079, v_num=0, train/loss_simple_step=0.0363, train/loss_vlb_step=0.000128, train/loss_step=0.0363, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  51%|█████     | 3047/5971 [32:31<31:12,  1.56it/s, loss=0.0856, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000523, train/loss_step=0.151, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  51%|█████     | 3048/5971 [32:33<31:13,  1.56it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  51%|█████     | 3049/5971 [32:33<31:11,  1.56it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:06,  2.50it/s][A

Validating:   1%|          | 2/167 [00:00<00:42,  3.89it/s][A
Epoch 6:  51%|█████     | 3053/5971 [32:34<31:07,  1.56it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   3%|▎         | 5/167 [00:00<00:16, 10.11it/s][A

Validating:   5%|▍         | 8/167 [00:00<00:11, 14.00it/s][A
Epoch 6:  51%|█████     | 3057/5971 [32:34<31:02,  1.56it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   7%|▋         | 11/167 [00:00<00:09, 17.33it/s][A
Epoch 6:  51%|█████▏    | 3061/5971 [32:34<30:57,  1.57it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.79it/s][A
Epoch 6:  51%|█████▏    | 3065/5971 [32:35<30:53,  1.57it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  10%|█         | 17/167 [00:01<00:07, 21.28it/s][A

Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.83it/s][A
Epoch 6:  51%|█████▏    | 3069/5971 [32:35<30:48,  1.57it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  14%|█▍        | 23/167 [00:01<00:05, 24.48it/s][A
Epoch 6:  51%|█████▏    | 3073/5971 [32:35<30:43,  1.57it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 26.32it/s][A
Epoch 6:  52%|█████▏    | 3077/5971 [32:35<30:38,  1.57it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 26.60it/s][A
Epoch 6:  52%|█████▏    | 3081/5971 [32:35<30:33,  1.58it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 24.26it/s][A

Validating:  22%|██▏       | 36/167 [00:01<00:05, 24.81it/s][A
Epoch 6:  52%|█████▏    | 3085/5971 [32:35<30:29,  1.58it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  23%|██▎       | 39/167 [00:01<00:04, 25.70it/s][A
Epoch 6:  52%|█████▏    | 3089/5971 [32:36<30:24,  1.58it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 25.40it/s][A
Epoch 6:  52%|█████▏    | 3093/5971 [32:36<30:19,  1.58it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 25.97it/s][A

Validating:  29%|██▊       | 48/167 [00:02<00:04, 26.05it/s][A
Epoch 6:  52%|█████▏    | 3097/5971 [32:36<30:14,  1.58it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  31%|███       | 51/167 [00:02<00:04, 25.01it/s][A
Epoch 6:  52%|█████▏    | 3101/5971 [32:36<30:10,  1.59it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 24.97it/s][A
Epoch 6:  52%|█████▏    | 3105/5971 [32:36<30:05,  1.59it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 25.58it/s][A

Validating:  36%|███▌      | 60/167 [00:02<00:04, 24.30it/s][A
Epoch 6:  52%|█████▏    | 3109/5971 [32:36<30:00,  1.59it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  38%|███▊      | 63/167 [00:02<00:04, 25.21it/s][A
Epoch 6:  52%|█████▏    | 3113/5971 [32:36<29:56,  1.59it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  40%|███▉      | 66/167 [00:03<00:03, 26.10it/s][A
Epoch 6:  52%|█████▏    | 3117/5971 [32:37<29:51,  1.59it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 26.03it/s][A

Validating:  43%|████▎     | 72/167 [00:03<00:03, 26.84it/s][A
Epoch 6:  52%|█████▏    | 3121/5971 [32:37<29:46,  1.60it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 27.22it/s][A
Epoch 6:  52%|█████▏    | 3125/5971 [32:37<29:42,  1.60it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 26.95it/s][A
Epoch 6:  52%|█████▏    | 3129/5971 [32:37<29:37,  1.60it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 26.85it/s][A

Validating:  50%|█████     | 84/167 [00:03<00:03, 26.65it/s][A
Epoch 6:  52%|█████▏    | 3133/5971 [32:37<29:32,  1.60it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  52%|█████▏    | 87/167 [00:03<00:02, 27.17it/s][A
Epoch 6:  53%|█████▎    | 3137/5971 [32:37<29:28,  1.60it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  54%|█████▍    | 91/167 [00:03<00:02, 28.91it/s][A
Epoch 6:  53%|█████▎    | 3141/5971 [32:37<29:23,  1.60it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 28.80it/s][A
Epoch 6:  53%|█████▎    | 3145/5971 [32:38<29:18,  1.61it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 28.18it/s][A

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 27.35it/s][A
Epoch 6:  53%|█████▎    | 3149/5971 [32:38<29:14,  1.61it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 27.39it/s][A
Epoch 6:  53%|█████▎    | 3153/5971 [32:38<29:09,  1.61it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 27.13it/s][A
Epoch 6:  53%|█████▎    | 3157/5971 [32:38<29:05,  1.61it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 26.51it/s][A
Epoch 6:  53%|█████▎    | 3161/5971 [32:38<29:00,  1.61it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  68%|██████▊   | 113/167 [00:04<00:01, 27.50it/s][A

Validating:  69%|██████▉   | 116/167 [00:04<00:01, 27.28it/s][A
Epoch 6:  53%|█████▎    | 3165/5971 [32:38<28:56,  1.62it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  71%|███████▏  | 119/167 [00:04<00:01, 26.67it/s][A
Epoch 6:  53%|█████▎    | 3169/5971 [32:39<28:51,  1.62it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 26.87it/s][A
Epoch 6:  53%|█████▎    | 3173/5971 [32:39<28:47,  1.62it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 26.98it/s][A

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 26.16it/s][A
Epoch 6:  53%|█████▎    | 3177/5971 [32:39<28:42,  1.62it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 26.69it/s][A
Epoch 6:  53%|█████▎    | 3181/5971 [32:39<28:38,  1.62it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  81%|████████  | 135/167 [00:05<00:01, 27.66it/s][A
Epoch 6:  53%|█████▎    | 3185/5971 [32:39<28:33,  1.63it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  83%|████████▎ | 139/167 [00:05<00:00, 28.67it/s][A
Epoch 6:  53%|█████▎    | 3189/5971 [32:39<28:29,  1.63it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  85%|████████▌ | 142/167 [00:05<00:00, 28.24it/s][A
Epoch 6:  53%|█████▎    | 3193/5971 [32:39<28:24,  1.63it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  87%|████████▋ | 145/167 [00:05<00:00, 26.49it/s][A

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 27.12it/s][A
Epoch 6:  54%|█████▎    | 3197/5971 [32:40<28:20,  1.63it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 27.40it/s][A
Epoch 6:  54%|█████▎    | 3201/5971 [32:40<28:15,  1.63it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 28.28it/s][A
Epoch 6:  54%|█████▎    | 3205/5971 [32:40<28:11,  1.64it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 26.75it/s][A
Epoch 6:  54%|█████▎    | 3209/5971 [32:40<28:06,  1.64it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 26.21it/s][A

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 25.83it/s][A
Epoch 6:  54%|█████▍    | 3213/5971 [32:40<28:02,  1.64it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating: 100%|██████████| 167/167 [00:06<00:00, 25.80it/s][A
Epoch 6:  54%|█████▍    | 3216/5971 [32:40<27:59,  1.64it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Epoch 6:  54%|█████▍    | 3217/5971 [32:41<27:59,  1.64it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000962, train/loss_step=0.255, global_step=3749.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3217/5971 [32:41<27:59,  1.64it/s, loss=0.0795, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.93e-5, train/loss_step=0.016, global_step=3750.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  54%|█████▍    | 3218/5971 [32:42<27:58,  1.64it/s, loss=0.0903, v_num=0, train/loss_simple_step=0.256, train/loss_vlb_step=0.000952, train/loss_step=0.256, global_step=3750.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3219/5971 [32:43<27:58,  1.64it/s, loss=0.0963, v_num=0, train/loss_simple_step=0.259, train/loss_vlb_step=0.0012, train/loss_step=0.259, global_step=3750.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  54%|█████▍    | 3220/5971 [32:46<27:59,  1.64it/s, loss=0.0923, v_num=0, train/loss_simple_step=0.0422, train/loss_vlb_step=0.000151, train/loss_step=0.0422, global_step=3750.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3221/5971 [32:46<27:58,  1.64it/s, loss=0.0923, v_num=0, train/loss_simple_step=0.0422, train/loss_vlb_step=0.000151, train/loss_step=0.0422, global_step=3750.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3221/5971 [32:46<27:58,  1.64it/s, loss=0.0917, v_num=0, train/loss_simple_step=0.0323, train/loss_vlb_step=0.000115, train/loss_step=0.0323, global_step=3751.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3222/5971 [32:47<27:58,  1.64it/s, loss=0.0923, v_num=0, train/loss_simple_step=0.0133, train/loss_vlb_step=6.21e-5, train/loss_step=0.0133, global_step=3751.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  54%|█████▍    | 3223/5971 [32:48<27:58,  1.64it/s, loss=0.109, v_num=0, train/loss_simple_step=0.342, train/loss_vlb_step=0.00154, train/loss_step=0.342, global_step=3751.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  54%|█████▍    | 3224/5971 [32:50<27:58,  1.64it/s, loss=0.0992, v_num=0, train/loss_simple_step=0.0413, train/loss_vlb_step=0.000152, train/loss_step=0.0413, global_step=3751.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3225/5971 [32:51<27:58,  1.64it/s, loss=0.0992, v_num=0, train/loss_simple_step=0.0413, train/loss_vlb_step=0.000152, train/loss_step=0.0413, global_step=3751.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3225/5971 [32:51<27:58,  1.64it/s, loss=0.0993, v_num=0, train/loss_simple_step=0.00335, train/loss_vlb_step=1.76e-5, train/loss_step=0.00335, global_step=3752.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3226/5971 [32:52<27:58,  1.64it/s, loss=0.0992, v_num=0, train/loss_simple_step=0.0082, train/loss_vlb_step=3.72e-5, train/loss_step=0.0082, global_step=3752.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  54%|█████▍    | 3227/5971 [32:53<27:57,  1.64it/s, loss=0.114, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00148, train/loss_step=0.329, global_step=3752.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  54%|█████▍    | 3228/5971 [32:56<27:58,  1.63it/s, loss=0.111, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000706, train/loss_step=0.192, global_step=3752.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3229/5971 [32:56<27:58,  1.63it/s, loss=0.111, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000706, train/loss_step=0.192, global_step=3752.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3229/5971 [32:56<27:58,  1.63it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3753.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3230/5971 [32:57<27:57,  1.63it/s, loss=0.107, v_num=0, train/loss_simple_step=0.00499, train/loss_vlb_step=2.64e-5, train/loss_step=0.00499, global_step=3753.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3231/5971 [32:58<27:57,  1.63it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0193, train/loss_vlb_step=8.28e-5, train/loss_step=0.0193, global_step=3753.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  54%|█████▍    | 3232/5971 [33:00<27:58,  1.63it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0429, train/loss_vlb_step=0.000152, train/loss_step=0.0429, global_step=3753.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3233/5971 [33:01<27:57,  1.63it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0429, train/loss_vlb_step=0.000152, train/loss_step=0.0429, global_step=3753.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3233/5971 [33:01<27:57,  1.63it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0978, train/loss_vlb_step=0.000321, train/loss_step=0.0978, global_step=3754.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3234/5971 [33:02<27:57,  1.63it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00254, train/loss_vlb_step=1.39e-5, train/loss_step=0.00254, global_step=3754.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3235/5971 [33:03<27:57,  1.63it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0878, train/loss_vlb_step=0.000297, train/loss_step=0.0878, global_step=3754.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  54%|█████▍    | 3236/5971 [33:05<27:57,  1.63it/s, loss=0.091, v_num=0, train/loss_simple_step=0.0276, train/loss_vlb_step=0.00011, train/loss_step=0.0276, global_step=3754.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  54%|█████▍    | 3237/5971 [33:06<27:57,  1.63it/s, loss=0.091, v_num=0, train/loss_simple_step=0.0276, train/loss_vlb_step=0.00011, train/loss_step=0.0276, global_step=3754.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3237/5971 [33:06<27:57,  1.63it/s, loss=0.0946, v_num=0, train/loss_simple_step=0.0881, train/loss_vlb_step=0.000293, train/loss_step=0.0881, global_step=3755.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3238/5971 [33:07<27:57,  1.63it/s, loss=0.112, v_num=0, train/loss_simple_step=0.599, train/loss_vlb_step=0.00763, train/loss_step=0.599, global_step=3755.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  54%|█████▍    | 3239/5971 [33:08<27:56,  1.63it/s, loss=0.123, v_num=0, train/loss_simple_step=0.487, train/loss_vlb_step=0.0038, train/loss_step=0.487, global_step=3755.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  54%|█████▍    | 3240/5971 [33:10<27:57,  1.63it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.32e-5, train/loss_step=0.00481, global_step=3755.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3241/5971 [33:11<27:57,  1.63it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.32e-5, train/loss_step=0.00481, global_step=3755.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3241/5971 [33:11<27:57,  1.63it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.86e-5, train/loss_step=0.0101, global_step=3756.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  54%|█████▍    | 3242/5971 [33:12<27:56,  1.63it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00273, train/loss_vlb_step=1.51e-5, train/loss_step=0.00273, global_step=3756.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3243/5971 [33:13<27:56,  1.63it/s, loss=0.115, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000996, train/loss_step=0.240, global_step=3756.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  54%|█████▍    | 3244/5971 [33:15<27:57,  1.63it/s, loss=0.119, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000437, train/loss_step=0.131, global_step=3756.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3245/5971 [33:16<27:56,  1.63it/s, loss=0.119, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000437, train/loss_step=0.131, global_step=3756.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3245/5971 [33:16<27:56,  1.63it/s, loss=0.146, v_num=0, train/loss_simple_step=0.548, train/loss_vlb_step=0.00793, train/loss_step=0.548, global_step=3757.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  54%|█████▍    | 3246/5971 [33:17<27:56,  1.63it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0165, train/loss_vlb_step=6.98e-5, train/loss_step=0.0165, global_step=3757.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3247/5971 [33:18<27:55,  1.63it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.17e-5, train/loss_step=0.0115, global_step=3757.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3248/5971 [33:20<27:56,  1.62it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00337, train/loss_vlb_step=1.84e-5, train/loss_step=0.00337, global_step=3757.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3249/5971 [33:21<27:56,  1.62it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00337, train/loss_vlb_step=1.84e-5, train/loss_step=0.00337, global_step=3757.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3249/5971 [33:21<27:56,  1.62it/s, loss=0.144, v_num=0, train/loss_simple_step=0.445, train/loss_vlb_step=0.00351, train/loss_step=0.445, global_step=3758.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  54%|█████▍    | 3250/5971 [33:22<27:55,  1.62it/s, loss=0.164, v_num=0, train/loss_simple_step=0.412, train/loss_vlb_step=0.00273, train/loss_step=0.412, global_step=3758.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3251/5971 [33:23<27:55,  1.62it/s, loss=0.17, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000505, train/loss_step=0.151, global_step=3758.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3252/5971 [33:25<27:56,  1.62it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0989, train/loss_vlb_step=0.000325, train/loss_step=0.0989, global_step=3758.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3253/5971 [33:26<27:55,  1.62it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0989, train/loss_vlb_step=0.000325, train/loss_step=0.0989, global_step=3758.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  54%|█████▍    | 3253/5971 [33:26<27:55,  1.62it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0165, train/loss_vlb_step=6.85e-5, train/loss_step=0.0165, global_step=3759.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  54%|█████▍    | 3254/5971 [33:27<27:55,  1.62it/s, loss=0.204, v_num=0, train/loss_simple_step=0.707, train/loss_vlb_step=0.0129, train/loss_step=0.707, global_step=3759.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  55%|█████▍    | 3255/5971 [33:28<27:55,  1.62it/s, loss=0.207, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000446, train/loss_step=0.131, global_step=3759.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▍    | 3256/5971 [33:30<27:55,  1.62it/s, loss=0.208, v_num=0, train/loss_simple_step=0.0614, train/loss_vlb_step=0.000216, train/loss_step=0.0614, global_step=3759.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▍    | 3257/5971 [33:31<27:55,  1.62it/s, loss=0.208, v_num=0, train/loss_simple_step=0.0614, train/loss_vlb_step=0.000216, train/loss_step=0.0614, global_step=3759.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▍    | 3257/5971 [33:31<27:55,  1.62it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0229, train/loss_vlb_step=8.83e-5, train/loss_step=0.0229, global_step=3760.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  55%|█████▍    | 3258/5971 [33:32<27:55,  1.62it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0302, train/loss_vlb_step=0.00011, train/loss_step=0.0302, global_step=3760.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▍    | 3259/5971 [33:33<27:54,  1.62it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00236, train/loss_vlb_step=1.35e-5, train/loss_step=0.00236, global_step=3760.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▍    | 3260/5971 [33:35<27:55,  1.62it/s, loss=0.161, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000744, train/loss_step=0.185, global_step=3760.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  55%|█████▍    | 3261/5971 [33:36<27:55,  1.62it/s, loss=0.161, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000744, train/loss_step=0.185, global_step=3760.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▍    | 3261/5971 [33:36<27:55,  1.62it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0554, train/loss_vlb_step=0.000195, train/loss_step=0.0554, global_step=3761.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▍    | 3262/5971 [33:37<27:54,  1.62it/s, loss=0.181, v_num=0, train/loss_simple_step=0.348, train/loss_vlb_step=0.00164, train/loss_step=0.348, global_step=3761.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  55%|█████▍    | 3263/5971 [33:38<27:54,  1.62it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0723, train/loss_vlb_step=0.000243, train/loss_step=0.0723, global_step=3761.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▍    | 3264/5971 [33:40<27:54,  1.62it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00229, train/loss_vlb_step=1.3e-5, train/loss_step=0.00229, global_step=3761.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▍    | 3265/5971 [33:41<27:54,  1.62it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00229, train/loss_vlb_step=1.3e-5, train/loss_step=0.00229, global_step=3761.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▍    | 3265/5971 [33:41<27:54,  1.62it/s, loss=0.149, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000724, train/loss_step=0.214, global_step=3762.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  55%|█████▍    | 3266/5971 [33:41<27:54,  1.62it/s, loss=0.154, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3762.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  55%|█████▍    | 3267/5971 [33:42<27:53,  1.62it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0994, train/loss_vlb_step=0.000326, train/loss_step=0.0994, global_step=3762.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▍    | 3268/5971 [33:45<27:54,  1.61it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00434, train/loss_vlb_step=2.24e-5, train/loss_step=0.00434, global_step=3762.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▍    | 3269/5971 [33:46<27:54,  1.61it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00434, train/loss_vlb_step=2.24e-5, train/loss_step=0.00434, global_step=3762.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▍    | 3269/5971 [33:46<27:54,  1.61it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00348, train/loss_vlb_step=1.91e-5, train/loss_step=0.00348, global_step=3763.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▍    | 3270/5971 [33:46<27:53,  1.61it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0218, train/loss_vlb_step=8.92e-5, train/loss_step=0.0218, global_step=3763.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  55%|█████▍    | 3271/5971 [33:47<27:53,  1.61it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00423, train/loss_vlb_step=2.24e-5, train/loss_step=0.00423, global_step=3763.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▍    | 3272/5971 [33:50<27:54,  1.61it/s, loss=0.105, v_num=0, train/loss_simple_step=0.00311, train/loss_vlb_step=1.78e-5, train/loss_step=0.00311, global_step=3763.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▍    | 3273/5971 [33:51<27:53,  1.61it/s, loss=0.105, v_num=0, train/loss_simple_step=0.00311, train/loss_vlb_step=1.78e-5, train/loss_step=0.00311, global_step=3763.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▍    | 3273/5971 [33:51<27:53,  1.61it/s, loss=0.112, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.00049, train/loss_step=0.147, global_step=3764.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  55%|█████▍    | 3274/5971 [33:51<27:53,  1.61it/s, loss=0.0798, v_num=0, train/loss_simple_step=0.0692, train/loss_vlb_step=0.000236, train/loss_step=0.0692, global_step=3764.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▍    | 3275/5971 [33:52<27:52,  1.61it/s, loss=0.113, v_num=0, train/loss_simple_step=0.787, train/loss_vlb_step=0.022, train/loss_step=0.787, global_step=3764.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]      
Epoch 6:  55%|█████▍    | 3276/5971 [33:55<27:53,  1.61it/s, loss=0.129, v_num=0, train/loss_simple_step=0.385, train/loss_vlb_step=0.00172, train/loss_step=0.385, global_step=3764.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▍    | 3277/5971 [33:56<27:53,  1.61it/s, loss=0.129, v_num=0, train/loss_simple_step=0.385, train/loss_vlb_step=0.00172, train/loss_step=0.385, global_step=3764.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▍    | 3277/5971 [33:56<27:53,  1.61it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0573, train/loss_vlb_step=0.000192, train/loss_step=0.0573, global_step=3765.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▍    | 3278/5971 [33:57<27:53,  1.61it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00455, train/loss_vlb_step=2.21e-5, train/loss_step=0.00455, global_step=3765.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▍    | 3279/5971 [33:57<27:52,  1.61it/s, loss=0.143, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00113, train/loss_step=0.287, global_step=3765.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  55%|█████▍    | 3280/5971 [34:00<27:53,  1.61it/s, loss=0.152, v_num=0, train/loss_simple_step=0.347, train/loss_vlb_step=0.00196, train/loss_step=0.347, global_step=3765.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▍    | 3281/5971 [34:01<27:52,  1.61it/s, loss=0.152, v_num=0, train/loss_simple_step=0.347, train/loss_vlb_step=0.00196, train/loss_step=0.347, global_step=3765.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▍    | 3281/5971 [34:01<27:52,  1.61it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00167, train/loss_vlb_step=9.68e-6, train/loss_step=0.00167, global_step=3766.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▍    | 3282/5971 [34:01<27:52,  1.61it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0751, train/loss_vlb_step=0.000248, train/loss_step=0.0751, global_step=3766.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  55%|█████▍    | 3283/5971 [34:02<27:52,  1.61it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00558, train/loss_vlb_step=2.92e-5, train/loss_step=0.00558, global_step=3766.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▍    | 3284/5971 [34:05<27:52,  1.61it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00214, train/loss_vlb_step=1.23e-5, train/loss_step=0.00214, global_step=3766.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▌    | 3285/5971 [34:06<27:52,  1.61it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00214, train/loss_vlb_step=1.23e-5, train/loss_step=0.00214, global_step=3766.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▌    | 3285/5971 [34:06<27:52,  1.61it/s, loss=0.141, v_num=0, train/loss_simple_step=0.400, train/loss_vlb_step=0.00192, train/loss_step=0.400, global_step=3767.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  55%|█████▌    | 3286/5971 [34:07<27:52,  1.61it/s, loss=0.142, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000425, train/loss_step=0.126, global_step=3767.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▌    | 3287/5971 [34:07<27:51,  1.61it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.7e-5, train/loss_step=0.0101, global_step=3767.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▌    | 3288/5971 [34:10<27:52,  1.60it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0907, train/loss_vlb_step=0.000298, train/loss_step=0.0907, global_step=3767.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▌    | 3289/5971 [34:11<27:51,  1.60it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0907, train/loss_vlb_step=0.000298, train/loss_step=0.0907, global_step=3767.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▌    | 3289/5971 [34:11<27:51,  1.60it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00232, train/loss_vlb_step=1.34e-5, train/loss_step=0.00232, global_step=3768.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▌    | 3290/5971 [34:11<27:51,  1.60it/s, loss=0.15, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000642, train/loss_step=0.194, global_step=3768.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  55%|█████▌    | 3291/5971 [34:12<27:51,  1.60it/s, loss=0.156, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000396, train/loss_step=0.120, global_step=3768.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▌    | 3292/5971 [34:14<27:51,  1.60it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00957, train/loss_vlb_step=4.5e-5, train/loss_step=0.00957, global_step=3768.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▌    | 3293/5971 [34:15<27:51,  1.60it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00957, train/loss_vlb_step=4.5e-5, train/loss_step=0.00957, global_step=3768.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▌    | 3293/5971 [34:15<27:51,  1.60it/s, loss=0.182, v_num=0, train/loss_simple_step=0.670, train/loss_vlb_step=0.0131, train/loss_step=0.670, global_step=3769.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  55%|█████▌    | 3294/5971 [34:16<27:50,  1.60it/s, loss=0.179, v_num=0, train/loss_simple_step=0.00329, train/loss_vlb_step=1.83e-5, train/loss_step=0.00329, global_step=3769.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▌    | 3295/5971 [34:17<27:50,  1.60it/s, loss=0.141, v_num=0, train/loss_simple_step=0.020, train/loss_vlb_step=7.78e-5, train/loss_step=0.020, global_step=3769.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  55%|█████▌    | 3296/5971 [34:19<27:51,  1.60it/s, loss=0.134, v_num=0, train/loss_simple_step=0.258, train/loss_vlb_step=0.00105, train/loss_step=0.258, global_step=3769.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▌    | 3297/5971 [34:20<27:50,  1.60it/s, loss=0.134, v_num=0, train/loss_simple_step=0.258, train/loss_vlb_step=0.00105, train/loss_step=0.258, global_step=3769.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▌    | 3297/5971 [34:20<27:50,  1.60it/s, loss=0.144, v_num=0, train/loss_simple_step=0.263, train/loss_vlb_step=0.00111, train/loss_step=0.263, global_step=3770.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▌    | 3298/5971 [34:21<27:50,  1.60it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0859, train/loss_vlb_step=0.000285, train/loss_step=0.0859, global_step=3770.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▌    | 3299/5971 [34:22<27:49,  1.60it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.99e-5, train/loss_step=0.0105, global_step=3770.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  55%|█████▌    | 3300/5971 [34:24<27:50,  1.60it/s, loss=0.142, v_num=0, train/loss_simple_step=0.494, train/loss_vlb_step=0.00584, train/loss_step=0.494, global_step=3770.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  55%|█████▌    | 3301/5971 [34:25<27:50,  1.60it/s, loss=0.142, v_num=0, train/loss_simple_step=0.494, train/loss_vlb_step=0.00584, train/loss_step=0.494, global_step=3770.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▌    | 3301/5971 [34:25<27:50,  1.60it/s, loss=0.155, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.00123, train/loss_step=0.268, global_step=3771.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▌    | 3302/5971 [34:26<27:49,  1.60it/s, loss=0.163, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000846, train/loss_step=0.224, global_step=3771.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▌    | 3303/5971 [34:27<27:49,  1.60it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00312, train/loss_vlb_step=1.74e-5, train/loss_step=0.00312, global_step=3771.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▌    | 3304/5971 [34:29<27:50,  1.60it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00747, train/loss_vlb_step=3.83e-5, train/loss_step=0.00747, global_step=3771.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▌    | 3305/5971 [34:30<27:49,  1.60it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00747, train/loss_vlb_step=3.83e-5, train/loss_step=0.00747, global_step=3771.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▌    | 3305/5971 [34:30<27:49,  1.60it/s, loss=0.158, v_num=0, train/loss_simple_step=0.303, train/loss_vlb_step=0.00205, train/loss_step=0.303, global_step=3772.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  55%|█████▌    | 3306/5971 [34:31<27:49,  1.60it/s, loss=0.176, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00606, train/loss_step=0.476, global_step=3772.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▌    | 3307/5971 [34:32<27:48,  1.60it/s, loss=0.181, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.00037, train/loss_step=0.112, global_step=3772.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▌    | 3308/5971 [34:34<27:49,  1.60it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0725, train/loss_vlb_step=0.000248, train/loss_step=0.0725, global_step=3772.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▌    | 3309/5971 [34:35<27:49,  1.59it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0725, train/loss_vlb_step=0.000248, train/loss_step=0.0725, global_step=3772.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▌    | 3309/5971 [34:35<27:49,  1.59it/s, loss=0.19, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.00079, train/loss_step=0.216, global_step=3773.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  55%|█████▌    | 3310/5971 [34:36<27:48,  1.59it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0296, train/loss_vlb_step=0.000117, train/loss_step=0.0296, global_step=3773.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▌    | 3311/5971 [34:37<27:48,  1.59it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.00023, train/loss_step=0.0699, global_step=3773.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  55%|█████▌    | 3312/5971 [34:39<27:48,  1.59it/s, loss=0.187, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.00056, train/loss_step=0.162, global_step=3773.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  55%|█████▌    | 3313/5971 [34:40<27:48,  1.59it/s, loss=0.187, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.00056, train/loss_step=0.162, global_step=3773.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  55%|█████▌    | 3313/5971 [34:40<27:48,  1.59it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.32e-6, train/loss_step=0.00156, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  56%|█████▌    | 3314/5971 [34:40<27:47,  1.59it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0514, train/loss_vlb_step=0.000183, train/loss_step=0.0514, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  56%|█████▌    | 3315/5971 [34:41<27:47,  1.59it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0863, train/loss_vlb_step=0.000287, train/loss_step=0.0863, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  56%|█████▌    | 3316/5971 [34:44<27:48,  1.59it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  56%|█████▌    | 3317/5971 [34:44<27:47,  1.59it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:01,  2.68it/s]

Validating:   1%|          | 2/167 [00:00<00:52,  3.16it/s]
Epoch 6:  56%|█████▌    | 3321/5971 [34:44<27:43,  1.59it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   3%|▎         | 5/167 [00:00<00:19,  8.32it/s]

Validating:   5%|▍         | 8/167 [00:00<00:12, 13.04it/s]
Epoch 6:  56%|█████▌    | 3325/5971 [34:45<27:38,  1.60it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.41it/s]
Epoch 6:  56%|█████▌    | 3329/5971 [34:45<27:34,  1.60it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.53it/s]
Epoch 6:  56%|█████▌    | 3333/5971 [34:45<27:30,  1.60it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  10%|█         | 17/167 [00:01<00:06, 21.62it/s]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 23.43it/s]
Epoch 6:  56%|█████▌    | 3337/5971 [34:45<27:25,  1.60it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  14%|█▍        | 24/167 [00:01<00:05, 26.37it/s]
Epoch 6:  56%|█████▌    | 3341/5971 [34:45<27:21,  1.60it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 25.16it/s]
Epoch 6:  56%|█████▌    | 3345/5971 [34:45<27:17,  1.60it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 26.38it/s]
Epoch 6:  56%|█████▌    | 3349/5971 [34:45<27:12,  1.61it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 26.64it/s]

Validating:  22%|██▏       | 36/167 [00:01<00:04, 27.06it/s]
Epoch 6:  56%|█████▌    | 3353/5971 [34:46<27:08,  1.61it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  23%|██▎       | 39/167 [00:02<00:04, 27.15it/s][A
Epoch 6:  56%|█████▌    | 3357/5971 [34:46<27:04,  1.61it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 26.05it/s][A
Epoch 6:  56%|█████▋    | 3361/5971 [34:46<26:59,  1.61it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 26.82it/s][A

Validating:  29%|██▊       | 48/167 [00:02<00:04, 26.24it/s][A
Epoch 6:  56%|█████▋    | 3365/5971 [34:46<26:55,  1.61it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  31%|███       | 51/167 [00:02<00:04, 25.75it/s][A
Epoch 6:  56%|█████▋    | 3369/5971 [34:46<26:51,  1.61it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 24.45it/s][A
Epoch 6:  56%|█████▋    | 3373/5971 [34:46<26:46,  1.62it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 25.46it/s][A

Validating:  36%|███▌      | 60/167 [00:02<00:04, 24.59it/s][A
Epoch 6:  57%|█████▋    | 3377/5971 [34:47<26:42,  1.62it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  38%|███▊      | 63/167 [00:02<00:04, 25.24it/s][A
Epoch 6:  57%|█████▋    | 3381/5971 [34:47<26:38,  1.62it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  40%|███▉      | 66/167 [00:03<00:03, 26.14it/s][A
Epoch 6:  57%|█████▋    | 3385/5971 [34:47<26:34,  1.62it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 25.13it/s][A

Validating:  43%|████▎     | 72/167 [00:03<00:03, 25.87it/s][A
Epoch 6:  57%|█████▋    | 3389/5971 [34:47<26:29,  1.62it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 25.99it/s][A
Epoch 6:  57%|█████▋    | 3393/5971 [34:47<26:25,  1.63it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 25.12it/s][A
Epoch 6:  57%|█████▋    | 3397/5971 [34:47<26:21,  1.63it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 25.90it/s][A

Validating:  50%|█████     | 84/167 [00:03<00:03, 25.47it/s][A
Epoch 6:  57%|█████▋    | 3401/5971 [34:48<26:17,  1.63it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  52%|█████▏    | 87/167 [00:03<00:03, 26.37it/s][A
Epoch 6:  57%|█████▋    | 3405/5971 [34:48<26:13,  1.63it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  54%|█████▍    | 90/167 [00:04<00:02, 25.69it/s][A
Epoch 6:  57%|█████▋    | 3409/5971 [34:48<26:08,  1.63it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 25.20it/s][A

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 25.95it/s][A
Epoch 6:  57%|█████▋    | 3413/5971 [34:48<26:04,  1.63it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 25.56it/s][A
Epoch 6:  57%|█████▋    | 3417/5971 [34:48<26:00,  1.64it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  61%|██████    | 102/167 [00:04<00:02, 25.18it/s][A
Epoch 6:  57%|█████▋    | 3421/5971 [34:48<25:56,  1.64it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 25.02it/s][A

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 25.05it/s][A
Epoch 6:  57%|█████▋    | 3425/5971 [34:48<25:52,  1.64it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  67%|██████▋   | 112/167 [00:04<00:02, 26.43it/s][A
Epoch 6:  57%|█████▋    | 3429/5971 [34:49<25:48,  1.64it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  69%|██████▉   | 115/167 [00:04<00:01, 26.20it/s][A
Epoch 6:  57%|█████▋    | 3433/5971 [34:49<25:44,  1.64it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  71%|███████   | 118/167 [00:05<00:01, 25.18it/s][A
Epoch 6:  58%|█████▊    | 3437/5971 [34:49<25:40,  1.65it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 25.36it/s][A

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 26.33it/s][A
Epoch 6:  58%|█████▊    | 3441/5971 [34:49<25:35,  1.65it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 27.71it/s][A
Epoch 6:  58%|█████▊    | 3445/5971 [34:49<25:31,  1.65it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 26.85it/s][A
Epoch 6:  58%|█████▊    | 3449/5971 [34:49<25:27,  1.65it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  81%|████████  | 135/167 [00:05<00:01, 28.20it/s][A
Epoch 6:  58%|█████▊    | 3453/5971 [34:49<25:23,  1.65it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 28.08it/s][A
Epoch 6:  58%|█████▊    | 3457/5971 [34:50<25:19,  1.65it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  84%|████████▍ | 141/167 [00:05<00:00, 28.53it/s][A

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 24.82it/s][A
Epoch 6:  58%|█████▊    | 3461/5971 [34:50<25:15,  1.66it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 26.04it/s][A
Epoch 6:  58%|█████▊    | 3465/5971 [34:50<25:11,  1.66it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 26.16it/s][A
Epoch 6:  58%|█████▊    | 3469/5971 [34:50<25:07,  1.66it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 25.69it/s][A

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 26.57it/s][A
Epoch 6:  58%|█████▊    | 3473/5971 [34:50<25:03,  1.66it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 26.49it/s][A
Epoch 6:  58%|█████▊    | 3477/5971 [34:50<24:59,  1.66it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 26.03it/s][A
Epoch 6:  58%|█████▊    | 3481/5971 [34:51<24:55,  1.67it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  99%|█████████▉| 165/167 [00:06<00:00, 25.52it/s][A
Epoch 6:  58%|█████▊    | 3484/5971 [34:51<24:52,  1.67it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Epoch 6:  58%|█████▊    | 3485/5971 [34:52<24:52,  1.67it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.45e-5, train/loss_step=0.00481, global_step=3774.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  58%|█████▊    | 3485/5971 [34:52<24:52,  1.67it/s, loss=0.145, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000846, train/loss_step=0.214, global_step=3775.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  58%|█████▊    | 3486/5971 [34:53<24:51,  1.67it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00629, train/loss_vlb_step=3.11e-5, train/loss_step=0.00629, global_step=3775.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  58%|█████▊    | 3487/5971 [34:54<24:51,  1.67it/s, loss=0.149, v_num=0, train/loss_simple_step=0.169, train/loss_vlb_step=0.000584, train/loss_step=0.169, global_step=3775.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  58%|█████▊    | 3488/5971 [34:56<24:52,  1.66it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0269, train/loss_vlb_step=0.000101, train/loss_step=0.0269, global_step=3775.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  58%|█████▊    | 3489/5971 [34:57<24:51,  1.66it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0269, train/loss_vlb_step=0.000101, train/loss_step=0.0269, global_step=3775.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  58%|█████▊    | 3489/5971 [34:57<24:51,  1.66it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0646, train/loss_vlb_step=0.000214, train/loss_step=0.0646, global_step=3776.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  58%|█████▊    | 3490/5971 [34:58<24:51,  1.66it/s, loss=0.11, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000397, train/loss_step=0.120, global_step=3776.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  58%|█████▊    | 3491/5971 [34:59<24:50,  1.66it/s, loss=0.143, v_num=0, train/loss_simple_step=0.661, train/loss_vlb_step=0.0176, train/loss_step=0.661, global_step=3776.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  58%|█████▊    | 3492/5971 [35:01<24:51,  1.66it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0686, train/loss_vlb_step=0.000226, train/loss_step=0.0686, global_step=3776.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  58%|█████▊    | 3493/5971 [35:02<24:51,  1.66it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0686, train/loss_vlb_step=0.000226, train/loss_step=0.0686, global_step=3776.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  58%|█████▊    | 3493/5971 [35:02<24:51,  1.66it/s, loss=0.15, v_num=0, train/loss_simple_step=0.378, train/loss_vlb_step=0.00197, train/loss_step=0.378, global_step=3777.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  59%|█████▊    | 3494/5971 [35:03<24:50,  1.66it/s, loss=0.135, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000644, train/loss_step=0.190, global_step=3777.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▊    | 3495/5971 [35:04<24:50,  1.66it/s, loss=0.144, v_num=0, train/loss_simple_step=0.284, train/loss_vlb_step=0.00104, train/loss_step=0.284, global_step=3777.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  59%|█████▊    | 3496/5971 [35:06<24:50,  1.66it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00238, train/loss_vlb_step=1.34e-5, train/loss_step=0.00238, global_step=3777.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▊    | 3497/5971 [35:07<24:50,  1.66it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00238, train/loss_vlb_step=1.34e-5, train/loss_step=0.00238, global_step=3777.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▊    | 3497/5971 [35:07<24:50,  1.66it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.36e-5, train/loss_step=0.0102, global_step=3778.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  59%|█████▊    | 3498/5971 [35:08<24:50,  1.66it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00544, train/loss_vlb_step=2.73e-5, train/loss_step=0.00544, global_step=3778.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▊    | 3499/5971 [35:09<24:49,  1.66it/s, loss=0.142, v_num=0, train/loss_simple_step=0.324, train/loss_vlb_step=0.0014, train/loss_step=0.324, global_step=3778.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:  59%|█████▊    | 3500/5971 [35:11<24:50,  1.66it/s, loss=0.144, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000705, train/loss_step=0.207, global_step=3778.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▊    | 3501/5971 [35:12<24:49,  1.66it/s, loss=0.144, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000705, train/loss_step=0.207, global_step=3778.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▊    | 3501/5971 [35:12<24:49,  1.66it/s, loss=0.153, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000602, train/loss_step=0.181, global_step=3779.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▊    | 3502/5971 [35:13<24:49,  1.66it/s, loss=0.159, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000657, train/loss_step=0.184, global_step=3779.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▊    | 3503/5971 [35:13<24:48,  1.66it/s, loss=0.167, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.000922, train/loss_step=0.243, global_step=3779.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▊    | 3504/5971 [35:16<24:49,  1.66it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0244, train/loss_vlb_step=9.33e-5, train/loss_step=0.0244, global_step=3779.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▊    | 3505/5971 [35:16<24:48,  1.66it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0244, train/loss_vlb_step=9.33e-5, train/loss_step=0.0244, global_step=3779.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▊    | 3505/5971 [35:16<24:48,  1.66it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0422, train/loss_vlb_step=0.000158, train/loss_step=0.0422, global_step=3780.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▊    | 3506/5971 [35:17<24:48,  1.66it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0736, train/loss_vlb_step=0.000244, train/loss_step=0.0736, global_step=3780.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▊    | 3507/5971 [35:18<24:48,  1.66it/s, loss=0.161, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000457, train/loss_step=0.139, global_step=3780.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  59%|█████▉    | 3508/5971 [35:20<24:48,  1.65it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00845, train/loss_vlb_step=3.92e-5, train/loss_step=0.00845, global_step=3780.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3509/5971 [35:21<24:48,  1.65it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00845, train/loss_vlb_step=3.92e-5, train/loss_step=0.00845, global_step=3780.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3509/5971 [35:21<24:48,  1.65it/s, loss=0.172, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00117, train/loss_step=0.287, global_step=3781.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  59%|█████▉    | 3510/5971 [35:22<24:47,  1.65it/s, loss=0.185, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00236, train/loss_step=0.382, global_step=3781.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3511/5971 [35:23<24:47,  1.65it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0448, train/loss_vlb_step=0.00017, train/loss_step=0.0448, global_step=3781.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3512/5971 [35:25<24:47,  1.65it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0643, train/loss_vlb_step=0.000227, train/loss_step=0.0643, global_step=3781.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3513/5971 [35:26<24:47,  1.65it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0643, train/loss_vlb_step=0.000227, train/loss_step=0.0643, global_step=3781.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3513/5971 [35:26<24:47,  1.65it/s, loss=0.146, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.000855, train/loss_step=0.234, global_step=3782.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  59%|█████▉    | 3514/5971 [35:27<24:47,  1.65it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00138, train/loss_vlb_step=8.23e-6, train/loss_step=0.00138, global_step=3782.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3515/5971 [35:28<24:46,  1.65it/s, loss=0.132, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000811, train/loss_step=0.188, global_step=3782.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  59%|█████▉    | 3516/5971 [35:30<24:47,  1.65it/s, loss=0.145, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.00104, train/loss_step=0.255, global_step=3782.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  59%|█████▉    | 3517/5971 [35:31<24:46,  1.65it/s, loss=0.145, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.00104, train/loss_step=0.255, global_step=3782.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3517/5971 [35:31<24:46,  1.65it/s, loss=0.156, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.000916, train/loss_step=0.230, global_step=3783.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3518/5971 [35:32<24:46,  1.65it/s, loss=0.17, v_num=0, train/loss_simple_step=0.298, train/loss_vlb_step=0.00163, train/loss_step=0.298, global_step=3783.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  59%|█████▉    | 3519/5971 [35:33<24:45,  1.65it/s, loss=0.165, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000828, train/loss_step=0.222, global_step=3783.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3520/5971 [35:35<24:46,  1.65it/s, loss=0.161, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000431, train/loss_step=0.129, global_step=3783.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3521/5971 [35:36<24:46,  1.65it/s, loss=0.161, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000431, train/loss_step=0.129, global_step=3783.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3521/5971 [35:36<24:46,  1.65it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0323, train/loss_vlb_step=0.000123, train/loss_step=0.0323, global_step=3784.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3522/5971 [35:37<24:45,  1.65it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00173, train/loss_vlb_step=1.02e-5, train/loss_step=0.00173, global_step=3784.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3523/5971 [35:38<24:45,  1.65it/s, loss=0.15, v_num=0, train/loss_simple_step=0.335, train/loss_vlb_step=0.0019, train/loss_step=0.335, global_step=3784.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]      
Epoch 6:  59%|█████▉    | 3524/5971 [35:40<24:45,  1.65it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00232, train/loss_vlb_step=1.28e-5, train/loss_step=0.00232, global_step=3784.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3525/5971 [35:41<24:45,  1.65it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00232, train/loss_vlb_step=1.28e-5, train/loss_step=0.00232, global_step=3784.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3525/5971 [35:41<24:45,  1.65it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0053, train/loss_vlb_step=2.74e-5, train/loss_step=0.0053, global_step=3785.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  59%|█████▉    | 3526/5971 [35:42<24:45,  1.65it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00909, train/loss_vlb_step=4.35e-5, train/loss_step=0.00909, global_step=3785.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3527/5971 [35:43<24:44,  1.65it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0227, train/loss_vlb_step=9.13e-5, train/loss_step=0.0227, global_step=3785.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  59%|█████▉    | 3528/5971 [35:45<24:45,  1.64it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0413, train/loss_vlb_step=0.000151, train/loss_step=0.0413, global_step=3785.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3529/5971 [35:46<24:44,  1.64it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0413, train/loss_vlb_step=0.000151, train/loss_step=0.0413, global_step=3785.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3529/5971 [35:46<24:44,  1.64it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0196, train/loss_vlb_step=7.95e-5, train/loss_step=0.0196, global_step=3786.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  59%|█████▉    | 3530/5971 [35:47<24:44,  1.64it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0451, train/loss_vlb_step=0.000157, train/loss_step=0.0451, global_step=3786.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3531/5971 [35:48<24:43,  1.64it/s, loss=0.123, v_num=0, train/loss_simple_step=0.321, train/loss_vlb_step=0.00138, train/loss_step=0.321, global_step=3786.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  59%|█████▉    | 3532/5971 [35:50<24:44,  1.64it/s, loss=0.128, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000571, train/loss_step=0.168, global_step=3786.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3533/5971 [35:51<24:44,  1.64it/s, loss=0.128, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000571, train/loss_step=0.168, global_step=3786.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3533/5971 [35:51<24:44,  1.64it/s, loss=0.147, v_num=0, train/loss_simple_step=0.608, train/loss_vlb_step=0.0123, train/loss_step=0.608, global_step=3787.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  59%|█████▉    | 3534/5971 [35:52<24:43,  1.64it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0872, train/loss_vlb_step=0.000287, train/loss_step=0.0872, global_step=3787.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3535/5971 [35:52<24:43,  1.64it/s, loss=0.151, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000703, train/loss_step=0.196, global_step=3787.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  59%|█████▉    | 3536/5971 [35:55<24:43,  1.64it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0471, train/loss_vlb_step=0.000171, train/loss_step=0.0471, global_step=3787.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3537/5971 [35:56<24:43,  1.64it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0471, train/loss_vlb_step=0.000171, train/loss_step=0.0471, global_step=3787.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3537/5971 [35:56<24:43,  1.64it/s, loss=0.135, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000336, train/loss_step=0.102, global_step=3788.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  59%|█████▉    | 3538/5971 [35:56<24:42,  1.64it/s, loss=0.165, v_num=0, train/loss_simple_step=0.900, train/loss_vlb_step=0.152, train/loss_step=0.900, global_step=3788.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  59%|█████▉    | 3539/5971 [35:57<24:42,  1.64it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0348, train/loss_vlb_step=0.00013, train/loss_step=0.0348, global_step=3788.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3540/5971 [36:00<24:43,  1.64it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.45e-5, train/loss_step=0.0119, global_step=3788.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  59%|█████▉    | 3541/5971 [36:01<24:42,  1.64it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.45e-5, train/loss_step=0.0119, global_step=3788.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3541/5971 [36:01<24:42,  1.64it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000111, train/loss_step=0.0285, global_step=3789.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3542/5971 [36:02<24:42,  1.64it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0199, train/loss_vlb_step=7.89e-5, train/loss_step=0.0199, global_step=3789.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  59%|█████▉    | 3543/5971 [36:02<24:41,  1.64it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0035, train/loss_vlb_step=1.95e-5, train/loss_step=0.0035, global_step=3789.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3544/5971 [36:05<24:42,  1.64it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00544, train/loss_vlb_step=2.67e-5, train/loss_step=0.00544, global_step=3789.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3545/5971 [36:05<24:41,  1.64it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00544, train/loss_vlb_step=2.67e-5, train/loss_step=0.00544, global_step=3789.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3545/5971 [36:05<24:41,  1.64it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00502, train/loss_vlb_step=2.53e-5, train/loss_step=0.00502, global_step=3790.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3546/5971 [36:06<24:41,  1.64it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00509, train/loss_vlb_step=2.56e-5, train/loss_step=0.00509, global_step=3790.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3547/5971 [36:07<24:40,  1.64it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0662, train/loss_vlb_step=0.000222, train/loss_step=0.0662, global_step=3790.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  59%|█████▉    | 3548/5971 [36:10<24:41,  1.64it/s, loss=0.147, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.00115, train/loss_step=0.266, global_step=3790.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  59%|█████▉    | 3549/5971 [36:10<24:41,  1.64it/s, loss=0.147, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.00115, train/loss_step=0.266, global_step=3790.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3549/5971 [36:10<24:41,  1.64it/s, loss=0.155, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=3791.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3550/5971 [36:11<24:40,  1.64it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0692, train/loss_vlb_step=0.000237, train/loss_step=0.0692, global_step=3791.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  59%|█████▉    | 3551/5971 [36:12<24:40,  1.63it/s, loss=0.15, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.000655, train/loss_step=0.197, global_step=3791.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  59%|█████▉    | 3552/5971 [36:14<24:40,  1.63it/s, loss=0.161, v_num=0, train/loss_simple_step=0.389, train/loss_vlb_step=0.00289, train/loss_step=0.389, global_step=3791.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  60%|█████▉    | 3553/5971 [36:15<24:40,  1.63it/s, loss=0.161, v_num=0, train/loss_simple_step=0.389, train/loss_vlb_step=0.00289, train/loss_step=0.389, global_step=3791.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  60%|█████▉    | 3553/5971 [36:15<24:40,  1.63it/s, loss=0.167, v_num=0, train/loss_simple_step=0.738, train/loss_vlb_step=0.032, train/loss_step=0.738, global_step=3792.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  60%|█████▉    | 3554/5971 [36:16<24:39,  1.63it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00181, train/loss_vlb_step=1.06e-5, train/loss_step=0.00181, global_step=3792.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  60%|█████▉    | 3555/5971 [36:17<24:39,  1.63it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.17e-6, train/loss_step=0.00151, global_step=3792.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  60%|█████▉    | 3556/5971 [36:19<24:39,  1.63it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0395, train/loss_vlb_step=0.000146, train/loss_step=0.0395, global_step=3792.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  60%|█████▉    | 3557/5971 [36:20<24:39,  1.63it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0395, train/loss_vlb_step=0.000146, train/loss_step=0.0395, global_step=3792.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  60%|█████▉    | 3557/5971 [36:20<24:39,  1.63it/s, loss=0.157, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000667, train/loss_step=0.190, global_step=3793.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  60%|█████▉    | 3558/5971 [36:21<24:39,  1.63it/s, loss=0.123, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000784, train/loss_step=0.222, global_step=3793.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  60%|█████▉    | 3559/5971 [36:22<24:38,  1.63it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.19e-5, train/loss_step=0.00199, global_step=3793.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  60%|█████▉    | 3560/5971 [36:24<24:39,  1.63it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00299, train/loss_vlb_step=1.59e-5, train/loss_step=0.00299, global_step=3793.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  60%|█████▉    | 3561/5971 [36:25<24:38,  1.63it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00299, train/loss_vlb_step=1.59e-5, train/loss_step=0.00299, global_step=3793.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  60%|█████▉    | 3561/5971 [36:25<24:38,  1.63it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0504, train/loss_vlb_step=0.000177, train/loss_step=0.0504, global_step=3794.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  60%|█████▉    | 3562/5971 [36:26<24:38,  1.63it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0213, train/loss_vlb_step=8.45e-5, train/loss_step=0.0213, global_step=3794.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  60%|█████▉    | 3563/5971 [36:27<24:37,  1.63it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0629, train/loss_vlb_step=0.000218, train/loss_step=0.0629, global_step=3794.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  60%|█████▉    | 3564/5971 [36:29<24:38,  1.63it/s, loss=0.143, v_num=0, train/loss_simple_step=0.352, train/loss_vlb_step=0.00192, train/loss_step=0.352, global_step=3794.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  60%|█████▉    | 3565/5971 [36:30<24:37,  1.63it/s, loss=0.143, v_num=0, train/loss_simple_step=0.352, train/loss_vlb_step=0.00192, train/loss_step=0.352, global_step=3794.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  60%|█████▉    | 3565/5971 [36:30<24:37,  1.63it/s, loss=0.172, v_num=0, train/loss_simple_step=0.582, train/loss_vlb_step=0.0131, train/loss_step=0.582, global_step=3795.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  60%|█████▉    | 3566/5971 [36:31<24:37,  1.63it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00666, train/loss_vlb_step=3.18e-5, train/loss_step=0.00666, global_step=3795.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  60%|█████▉    | 3567/5971 [36:32<24:37,  1.63it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00302, train/loss_vlb_step=1.64e-5, train/loss_step=0.00302, global_step=3795.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  60%|█████▉    | 3568/5971 [36:34<24:37,  1.63it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0362, train/loss_vlb_step=0.000135, train/loss_step=0.0362, global_step=3795.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  60%|█████▉    | 3569/5971 [36:35<24:37,  1.63it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0362, train/loss_vlb_step=0.000135, train/loss_step=0.0362, global_step=3795.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  60%|█████▉    | 3569/5971 [36:35<24:37,  1.63it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0107, train/loss_vlb_step=4.59e-5, train/loss_step=0.0107, global_step=3796.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  60%|█████▉    | 3570/5971 [36:36<24:36,  1.63it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.33e-5, train/loss_step=0.0128, global_step=3796.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  60%|█████▉    | 3571/5971 [36:37<24:36,  1.63it/s, loss=0.164, v_num=0, train/loss_simple_step=0.546, train/loss_vlb_step=0.00426, train/loss_step=0.546, global_step=3796.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  60%|█████▉    | 3572/5971 [36:39<24:36,  1.62it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0824, train/loss_vlb_step=0.000272, train/loss_step=0.0824, global_step=3796.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  60%|█████▉    | 3573/5971 [36:40<24:36,  1.62it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0824, train/loss_vlb_step=0.000272, train/loss_step=0.0824, global_step=3796.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  60%|█████▉    | 3573/5971 [36:40<24:36,  1.62it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0022, train/loss_vlb_step=1.28e-5, train/loss_step=0.0022, global_step=3797.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  60%|█████▉    | 3574/5971 [36:41<24:35,  1.62it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00962, train/loss_vlb_step=3.85e-5, train/loss_step=0.00962, global_step=3797.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  60%|█████▉    | 3575/5971 [36:42<24:35,  1.62it/s, loss=0.118, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000441, train/loss_step=0.134, global_step=3797.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  60%|█████▉    | 3576/5971 [36:44<24:35,  1.62it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0569, train/loss_vlb_step=0.000196, train/loss_step=0.0569, global_step=3797.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  60%|█████▉    | 3577/5971 [36:45<24:35,  1.62it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0569, train/loss_vlb_step=0.000196, train/loss_step=0.0569, global_step=3797.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  60%|█████▉    | 3577/5971 [36:45<24:35,  1.62it/s, loss=0.111, v_num=0, train/loss_simple_step=0.023, train/loss_vlb_step=9e-5, train/loss_step=0.023, global_step=3798.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]      
Epoch 6:  60%|█████▉    | 3578/5971 [36:45<24:34,  1.62it/s, loss=0.106, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000418, train/loss_step=0.127, global_step=3798.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  60%|█████▉    | 3579/5971 [36:46<24:34,  1.62it/s, loss=0.112, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000417, train/loss_step=0.125, global_step=3798.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  60%|█████▉    | 3580/5971 [36:48<24:34,  1.62it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0205, train/loss_vlb_step=8.27e-5, train/loss_step=0.0205, global_step=3798.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  60%|█████▉    | 3581/5971 [36:49<24:34,  1.62it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0205, train/loss_vlb_step=8.27e-5, train/loss_step=0.0205, global_step=3798.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  60%|█████▉    | 3581/5971 [36:49<24:34,  1.62it/s, loss=0.125, v_num=0, train/loss_simple_step=0.285, train/loss_vlb_step=0.00118, train/loss_step=0.285, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  60%|█████▉    | 3582/5971 [36:50<24:34,  1.62it/s, loss=0.132, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000523, train/loss_step=0.156, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  60%|██████    | 3583/5971 [36:51<24:33,  1.62it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00207, train/loss_vlb_step=1.21e-5, train/loss_step=0.00207, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  60%|██████    | 3584/5971 [36:53<24:33,  1.62it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  60%|██████    | 3585/5971 [36:53<24:32,  1.62it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:05,  2.53it/s][A

Validating:   1%|          | 2/167 [00:00<00:47,  3.49it/s][A

Validating:   2%|▏         | 4/167 [00:00<00:22,  7.24it/s][A
Epoch 6:  60%|██████    | 3589/5971 [36:54<24:29,  1.62it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   4%|▍         | 7/167 [00:00<00:13, 11.48it/s][A
Epoch 6:  60%|██████    | 3593/5971 [36:54<24:25,  1.62it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   6%|▌         | 10/167 [00:00<00:09, 15.75it/s][A
Epoch 6:  60%|██████    | 3597/5971 [36:54<24:21,  1.62it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   8%|▊         | 13/167 [00:01<00:08, 17.46it/s][A

Validating:  10%|▉         | 16/167 [00:01<00:07, 19.37it/s][A
Epoch 6:  60%|██████    | 3601/5971 [36:55<24:17,  1.63it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  11%|█▏        | 19/167 [00:01<00:07, 19.24it/s][A
Epoch 6:  60%|██████    | 3605/5971 [36:55<24:13,  1.63it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  13%|█▎        | 22/167 [00:01<00:07, 18.72it/s][A
Epoch 6:  60%|██████    | 3609/5971 [36:55<24:09,  1.63it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  15%|█▍        | 25/167 [00:01<00:06, 20.62it/s][A

Validating:  17%|█▋        | 28/167 [00:01<00:06, 22.32it/s][A
Epoch 6:  61%|██████    | 3613/5971 [36:55<24:05,  1.63it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  19%|█▊        | 31/167 [00:01<00:05, 23.85it/s][A
Epoch 6:  61%|██████    | 3617/5971 [36:55<24:01,  1.63it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  20%|██        | 34/167 [00:01<00:05, 25.13it/s][A
Epoch 6:  61%|██████    | 3621/5971 [36:55<23:57,  1.63it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  22%|██▏       | 37/167 [00:02<00:04, 26.37it/s][A

Validating:  24%|██▍       | 40/167 [00:02<00:05, 25.12it/s][A
Epoch 6:  61%|██████    | 3625/5971 [36:56<23:53,  1.64it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  26%|██▌       | 43/167 [00:02<00:04, 25.17it/s][A
Epoch 6:  61%|██████    | 3629/5971 [36:56<23:49,  1.64it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  28%|██▊       | 46/167 [00:02<00:04, 26.43it/s][A
Epoch 6:  61%|██████    | 3633/5971 [36:56<23:45,  1.64it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  29%|██▉       | 49/167 [00:02<00:04, 26.62it/s][A

Validating:  31%|███       | 52/167 [00:02<00:04, 26.65it/s][A
Epoch 6:  61%|██████    | 3637/5971 [36:56<23:41,  1.64it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  33%|███▎      | 55/167 [00:02<00:04, 25.77it/s][A
Epoch 6:  61%|██████    | 3641/5971 [36:56<23:38,  1.64it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  35%|███▍      | 58/167 [00:02<00:04, 26.74it/s][A
Epoch 6:  61%|██████    | 3645/5971 [36:56<23:34,  1.64it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  37%|███▋      | 61/167 [00:03<00:04, 25.50it/s][A
Epoch 6:  61%|██████    | 3649/5971 [36:56<23:30,  1.65it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  39%|███▉      | 65/167 [00:03<00:03, 26.94it/s][A

Validating:  41%|████      | 68/167 [00:03<00:03, 26.35it/s][A
Epoch 6:  61%|██████    | 3653/5971 [36:57<23:26,  1.65it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 26.00it/s][A
Epoch 6:  61%|██████    | 3657/5971 [36:57<23:22,  1.65it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 26.46it/s][A
Epoch 6:  61%|██████▏   | 3661/5971 [36:57<23:18,  1.65it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 26.70it/s][A

Validating:  48%|████▊     | 80/167 [00:03<00:03, 26.25it/s][A
Epoch 6:  61%|██████▏   | 3665/5971 [36:57<23:14,  1.65it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 26.57it/s][A
Epoch 6:  61%|██████▏   | 3669/5971 [36:57<23:11,  1.65it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  51%|█████▏    | 86/167 [00:03<00:03, 26.71it/s][A
Epoch 6:  62%|██████▏   | 3673/5971 [36:57<23:07,  1.66it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  53%|█████▎    | 89/167 [00:04<00:03, 25.39it/s][A

Validating:  55%|█████▌    | 92/167 [00:04<00:03, 23.39it/s][A
Epoch 6:  62%|██████▏   | 3677/5971 [36:58<23:03,  1.66it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  57%|█████▋    | 95/167 [00:04<00:03, 23.91it/s][A
Epoch 6:  62%|██████▏   | 3681/5971 [36:58<22:59,  1.66it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 24.60it/s][A
Epoch 6:  62%|██████▏   | 3685/5971 [36:58<22:55,  1.66it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  60%|██████    | 101/167 [00:04<00:02, 24.63it/s][A

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 24.50it/s][A
Epoch 6:  62%|██████▏   | 3689/5971 [36:58<22:51,  1.66it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 24.15it/s][A
Epoch 6:  62%|██████▏   | 3693/5971 [36:58<22:48,  1.66it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 24.58it/s][A
Epoch 6:  62%|██████▏   | 3697/5971 [36:58<22:44,  1.67it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  68%|██████▊   | 113/167 [00:05<00:02, 24.70it/s][A

Validating:  69%|██████▉   | 116/167 [00:05<00:02, 25.27it/s][A
Epoch 6:  62%|██████▏   | 3701/5971 [36:58<22:40,  1.67it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 25.99it/s][A
Epoch 6:  62%|██████▏   | 3705/5971 [36:59<22:36,  1.67it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 26.60it/s][A
Epoch 6:  62%|██████▏   | 3709/5971 [36:59<22:33,  1.67it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 26.79it/s][A

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 27.10it/s][A
Epoch 6:  62%|██████▏   | 3713/5971 [36:59<22:29,  1.67it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 26.38it/s][A
Epoch 6:  62%|██████▏   | 3717/5971 [36:59<22:25,  1.68it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  80%|████████  | 134/167 [00:05<00:01, 27.09it/s][A
Epoch 6:  62%|██████▏   | 3721/5971 [36:59<22:21,  1.68it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 26.94it/s][A

Validating:  84%|████████▍ | 140/167 [00:06<00:01, 26.99it/s][A
Epoch 6:  62%|██████▏   | 3725/5971 [36:59<22:18,  1.68it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 26.21it/s][A
Epoch 6:  62%|██████▏   | 3729/5971 [37:00<22:14,  1.68it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 25.78it/s][A
Epoch 6:  63%|██████▎   | 3733/5971 [37:00<22:10,  1.68it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 25.85it/s][A

Validating:  91%|█████████ | 152/167 [00:06<00:00, 26.50it/s][A
Epoch 6:  63%|██████▎   | 3737/5971 [37:00<22:06,  1.68it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 25.56it/s][A
Epoch 6:  63%|██████▎   | 3741/5971 [37:00<22:03,  1.69it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 23.85it/s][A
Epoch 6:  63%|██████▎   | 3745/5971 [37:00<21:59,  1.69it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 25.32it/s][A

Validating:  98%|█████████▊| 164/167 [00:07<00:00, 26.47it/s][A
Epoch 6:  63%|██████▎   | 3749/5971 [37:00<21:55,  1.69it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating: 100%|██████████| 167/167 [00:07<00:00, 27.06it/s][A
Epoch 6:  63%|██████▎   | 3752/5971 [37:01<21:53,  1.69it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.33it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.41it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.26it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.92it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.35it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.71it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.97it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.14it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.28it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.38it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.47it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.54it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.60it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.56it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.55it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.55it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.55it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.55it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.59it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.63it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.64it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.66it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.69it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.69it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.70it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.68it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.58it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.57it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.57it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.57it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.53it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.36it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.27it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:03,  5.21it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.17it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.20it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.22it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.20it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.19it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.17it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.15it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.19it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.18it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.25it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.27it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.39it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.43it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.44it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.43it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.44it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.14it/s]

Epoch 6:  63%|██████▎   | 3753/5971 [37:13<21:59,  1.68it/s, loss=0.117, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=3799.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3753/5971 [37:13<21:59,  1.68it/s, loss=0.0886, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.97e-5, train/loss_step=0.0135, global_step=3800.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.30it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.33it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:15,  3.10it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.72it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.21it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.51it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.78it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.99it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.13it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.22it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.25it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.33it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.38it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.41it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.39it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.42it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.46it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.49it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.51it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.47it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.45it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.47it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.50it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.52it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.51it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.48it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.49it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.50it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.53it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.45it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.39it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.35it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.32it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.38it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.41it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.29it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.36it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.43it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.49it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.54it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.57it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.58it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.52it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.48it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.49it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.49it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.46it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.40it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.39it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.41it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.11it/s]

Epoch 6:  63%|██████▎   | 3754/5971 [37:25<22:05,  1.67it/s, loss=0.0886, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.97e-5, train/loss_step=0.0135, global_step=3800.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3754/5971 [37:25<22:05,  1.67it/s, loss=0.0948, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000437, train/loss_step=0.130, global_step=3800.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.31it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.33it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:15,  3.10it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.65it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:11,  4.06it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:10,  4.29it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.50it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.77it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  4.97it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.06it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.15it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.25it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:06,  5.33it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.35it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.35it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.38it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.41it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.42it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.40it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.42it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.41it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.43it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.42it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.44it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.46it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.46it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.47it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.48it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:06<00:03,  5.48it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.49it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.49it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.49it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.47it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.47it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.49it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.44it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.27it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.24it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.30it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.30it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.29it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.33it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.36it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.38it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.38it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.46it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.52it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.56it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.60it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.61it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.07it/s]

Epoch 6:  63%|██████▎   | 3755/5971 [37:37<22:11,  1.66it/s, loss=0.0948, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000437, train/loss_step=0.130, global_step=3800.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3755/5971 [37:37<22:11,  1.66it/s, loss=0.0952, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.1e-5, train/loss_step=0.0112, global_step=3800.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.33it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.40it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.23it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.84it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.30it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.65it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.86it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.93it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.00it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.04it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.08it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.04it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:07,  5.16it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.26it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.34it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.32it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.38it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.42it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.45it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.49it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.55it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.59it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.58it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.59it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.62it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.58it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.58it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.61it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.38it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.30it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.28it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.29it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.31it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.37it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.42it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.44it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.28it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.22it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.17it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.15it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.17it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.11it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.13it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.22it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.28it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.32it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.36it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.41it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.45it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.39it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.06it/s]

Epoch 6:  63%|██████▎   | 3756/5971 [37:51<22:18,  1.65it/s, loss=0.0952, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.1e-5, train/loss_step=0.0112, global_step=3800.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3756/5971 [37:51<22:18,  1.65it/s, loss=0.0944, v_num=0, train/loss_simple_step=0.0209, train/loss_vlb_step=8.17e-5, train/loss_step=0.0209, global_step=3800.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3757/5971 [37:51<22:18,  1.65it/s, loss=0.0944, v_num=0, train/loss_simple_step=0.0209, train/loss_vlb_step=8.17e-5, train/loss_step=0.0209, global_step=3800.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3757/5971 [37:51<22:18,  1.65it/s, loss=0.105, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000811, train/loss_step=0.218, global_step=3801.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  63%|██████▎   | 3758/5971 [37:52<22:18,  1.65it/s, loss=0.105, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000811, train/loss_step=0.218, global_step=3801.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3758/5971 [37:52<22:18,  1.65it/s, loss=0.144, v_num=0, train/loss_simple_step=0.790, train/loss_vlb_step=0.0343, train/loss_step=0.790, global_step=3801.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  63%|██████▎   | 3759/5971 [37:53<22:17,  1.65it/s, loss=0.144, v_num=0, train/loss_simple_step=0.790, train/loss_vlb_step=0.0343, train/loss_step=0.790, global_step=3801.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3759/5971 [37:53<22:17,  1.65it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0621, train/loss_vlb_step=0.000214, train/loss_step=0.0621, global_step=3801.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3760/5971 [37:55<22:17,  1.65it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0621, train/loss_vlb_step=0.000214, train/loss_step=0.0621, global_step=3801.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3760/5971 [37:55<22:17,  1.65it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0921, train/loss_vlb_step=0.000313, train/loss_step=0.0921, global_step=3801.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  63%|██████▎   | 3761/5971 [37:56<22:17,  1.65it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0921, train/loss_vlb_step=0.000313, train/loss_step=0.0921, global_step=3801.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3761/5971 [37:56<22:17,  1.65it/s, loss=0.121, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.49e-5, train/loss_step=0.018, global_step=3802.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  63%|██████▎   | 3762/5971 [37:57<22:17,  1.65it/s, loss=0.121, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.49e-5, train/loss_step=0.018, global_step=3802.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3762/5971 [37:57<22:17,  1.65it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00681, train/loss_vlb_step=3.38e-5, train/loss_step=0.00681, global_step=3802.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3763/5971 [37:58<22:16,  1.65it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00681, train/loss_vlb_step=3.38e-5, train/loss_step=0.00681, global_step=3802.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3763/5971 [37:58<22:16,  1.65it/s, loss=0.145, v_num=0, train/loss_simple_step=0.627, train/loss_vlb_step=0.00906, train/loss_step=0.627, global_step=3802.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  63%|██████▎   | 3764/5971 [38:00<22:16,  1.65it/s, loss=0.145, v_num=0, train/loss_simple_step=0.627, train/loss_vlb_step=0.00906, train/loss_step=0.627, global_step=3802.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3764/5971 [38:00<22:16,  1.65it/s, loss=0.169, v_num=0, train/loss_simple_step=0.539, train/loss_vlb_step=0.0126, train/loss_step=0.539, global_step=3802.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  63%|██████▎   | 3765/5971 [38:01<22:16,  1.65it/s, loss=0.169, v_num=0, train/loss_simple_step=0.539, train/loss_vlb_step=0.0126, train/loss_step=0.539, global_step=3802.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3765/5971 [38:01<22:16,  1.65it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00864, train/loss_vlb_step=3.94e-5, train/loss_step=0.00864, global_step=3803.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3766/5971 [38:02<22:15,  1.65it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00864, train/loss_vlb_step=3.94e-5, train/loss_step=0.00864, global_step=3803.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3766/5971 [38:02<22:15,  1.65it/s, loss=0.173, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000832, train/loss_step=0.217, global_step=3803.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  63%|██████▎   | 3767/5971 [38:03<22:15,  1.65it/s, loss=0.173, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000832, train/loss_step=0.217, global_step=3803.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3767/5971 [38:03<22:15,  1.65it/s, loss=0.18, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.00108, train/loss_step=0.266, global_step=3803.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  63%|██████▎   | 3768/5971 [38:05<22:16,  1.65it/s, loss=0.18, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.00108, train/loss_step=0.266, global_step=3803.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3768/5971 [38:05<22:16,  1.65it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0807, train/loss_vlb_step=0.000276, train/loss_step=0.0807, global_step=3803.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3769/5971 [38:06<22:15,  1.65it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0807, train/loss_vlb_step=0.000276, train/loss_step=0.0807, global_step=3803.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3769/5971 [38:06<22:15,  1.65it/s, loss=0.176, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000498, train/loss_step=0.147, global_step=3804.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  63%|██████▎   | 3770/5971 [38:07<22:15,  1.65it/s, loss=0.176, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000498, train/loss_step=0.147, global_step=3804.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3770/5971 [38:07<22:15,  1.65it/s, loss=0.171, v_num=0, train/loss_simple_step=0.050, train/loss_vlb_step=0.000176, train/loss_step=0.050, global_step=3804.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3771/5971 [38:08<22:14,  1.65it/s, loss=0.171, v_num=0, train/loss_simple_step=0.050, train/loss_vlb_step=0.000176, train/loss_step=0.050, global_step=3804.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3771/5971 [38:08<22:14,  1.65it/s, loss=0.187, v_num=0, train/loss_simple_step=0.316, train/loss_vlb_step=0.00138, train/loss_step=0.316, global_step=3804.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  63%|██████▎   | 3772/5971 [38:10<22:15,  1.65it/s, loss=0.187, v_num=0, train/loss_simple_step=0.316, train/loss_vlb_step=0.00138, train/loss_step=0.316, global_step=3804.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3772/5971 [38:10<22:15,  1.65it/s, loss=0.192, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.000826, train/loss_step=0.229, global_step=3804.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3773/5971 [38:11<22:14,  1.65it/s, loss=0.192, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.000826, train/loss_step=0.229, global_step=3804.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3773/5971 [38:11<22:14,  1.65it/s, loss=0.208, v_num=0, train/loss_simple_step=0.338, train/loss_vlb_step=0.00205, train/loss_step=0.338, global_step=3805.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  63%|██████▎   | 3774/5971 [38:12<22:14,  1.65it/s, loss=0.208, v_num=0, train/loss_simple_step=0.338, train/loss_vlb_step=0.00205, train/loss_step=0.338, global_step=3805.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3774/5971 [38:12<22:14,  1.65it/s, loss=0.243, v_num=0, train/loss_simple_step=0.820, train/loss_vlb_step=0.0601, train/loss_step=0.820, global_step=3805.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  63%|██████▎   | 3775/5971 [38:13<22:13,  1.65it/s, loss=0.243, v_num=0, train/loss_simple_step=0.820, train/loss_vlb_step=0.0601, train/loss_step=0.820, global_step=3805.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3775/5971 [38:13<22:13,  1.65it/s, loss=0.25, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000595, train/loss_step=0.154, global_step=3805.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3776/5971 [38:15<22:13,  1.65it/s, loss=0.25, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000595, train/loss_step=0.154, global_step=3805.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3776/5971 [38:15<22:13,  1.65it/s, loss=0.264, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00131, train/loss_step=0.308, global_step=3805.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3777/5971 [38:16<22:13,  1.65it/s, loss=0.264, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00131, train/loss_step=0.308, global_step=3805.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3777/5971 [38:16<22:13,  1.65it/s, loss=0.261, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000493, train/loss_step=0.149, global_step=3806.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3778/5971 [38:17<22:13,  1.65it/s, loss=0.261, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000493, train/loss_step=0.149, global_step=3806.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3778/5971 [38:17<22:13,  1.65it/s, loss=0.244, v_num=0, train/loss_simple_step=0.456, train/loss_vlb_step=0.00312, train/loss_step=0.456, global_step=3806.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  63%|██████▎   | 3779/5971 [38:18<22:12,  1.64it/s, loss=0.244, v_num=0, train/loss_simple_step=0.456, train/loss_vlb_step=0.00312, train/loss_step=0.456, global_step=3806.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3779/5971 [38:18<22:12,  1.64it/s, loss=0.248, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000482, train/loss_step=0.143, global_step=3806.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3780/5971 [38:20<22:12,  1.64it/s, loss=0.248, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000482, train/loss_step=0.143, global_step=3806.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3780/5971 [38:20<22:12,  1.64it/s, loss=0.246, v_num=0, train/loss_simple_step=0.0405, train/loss_vlb_step=0.000143, train/loss_step=0.0405, global_step=3806.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3781/5971 [38:21<22:12,  1.64it/s, loss=0.246, v_num=0, train/loss_simple_step=0.0405, train/loss_vlb_step=0.000143, train/loss_step=0.0405, global_step=3806.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3781/5971 [38:21<22:12,  1.64it/s, loss=0.246, v_num=0, train/loss_simple_step=0.0314, train/loss_vlb_step=0.000111, train/loss_step=0.0314, global_step=3807.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3782/5971 [38:21<22:12,  1.64it/s, loss=0.246, v_num=0, train/loss_simple_step=0.0314, train/loss_vlb_step=0.000111, train/loss_step=0.0314, global_step=3807.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3782/5971 [38:21<22:12,  1.64it/s, loss=0.247, v_num=0, train/loss_simple_step=0.0134, train/loss_vlb_step=5.78e-5, train/loss_step=0.0134, global_step=3807.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  63%|██████▎   | 3783/5971 [38:22<22:11,  1.64it/s, loss=0.247, v_num=0, train/loss_simple_step=0.0134, train/loss_vlb_step=5.78e-5, train/loss_step=0.0134, global_step=3807.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3783/5971 [38:22<22:11,  1.64it/s, loss=0.222, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000468, train/loss_step=0.135, global_step=3807.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  63%|██████▎   | 3784/5971 [38:24<22:11,  1.64it/s, loss=0.222, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000468, train/loss_step=0.135, global_step=3807.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3784/5971 [38:24<22:11,  1.64it/s, loss=0.195, v_num=0, train/loss_simple_step=0.00221, train/loss_vlb_step=1.25e-5, train/loss_step=0.00221, global_step=3807.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3785/5971 [38:25<22:11,  1.64it/s, loss=0.195, v_num=0, train/loss_simple_step=0.00221, train/loss_vlb_step=1.25e-5, train/loss_step=0.00221, global_step=3807.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3785/5971 [38:25<22:11,  1.64it/s, loss=0.208, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.000987, train/loss_step=0.274, global_step=3808.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  63%|██████▎   | 3786/5971 [38:26<22:10,  1.64it/s, loss=0.208, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.000987, train/loss_step=0.274, global_step=3808.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3786/5971 [38:26<22:10,  1.64it/s, loss=0.206, v_num=0, train/loss_simple_step=0.178, train/loss_vlb_step=0.000618, train/loss_step=0.178, global_step=3808.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3787/5971 [38:27<22:10,  1.64it/s, loss=0.206, v_num=0, train/loss_simple_step=0.178, train/loss_vlb_step=0.000618, train/loss_step=0.178, global_step=3808.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3787/5971 [38:27<22:10,  1.64it/s, loss=0.195, v_num=0, train/loss_simple_step=0.037, train/loss_vlb_step=0.000138, train/loss_step=0.037, global_step=3808.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3788/5971 [38:29<22:10,  1.64it/s, loss=0.195, v_num=0, train/loss_simple_step=0.037, train/loss_vlb_step=0.000138, train/loss_step=0.037, global_step=3808.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3788/5971 [38:29<22:10,  1.64it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00298, train/loss_vlb_step=1.64e-5, train/loss_step=0.00298, global_step=3808.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3789/5971 [38:30<22:10,  1.64it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00298, train/loss_vlb_step=1.64e-5, train/loss_step=0.00298, global_step=3808.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3789/5971 [38:30<22:10,  1.64it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.53e-5, train/loss_step=0.0154, global_step=3809.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  63%|██████▎   | 3790/5971 [38:31<22:09,  1.64it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.53e-5, train/loss_step=0.0154, global_step=3809.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3790/5971 [38:31<22:09,  1.64it/s, loss=0.221, v_num=0, train/loss_simple_step=0.782, train/loss_vlb_step=0.0503, train/loss_step=0.782, global_step=3809.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  63%|██████▎   | 3791/5971 [38:32<22:09,  1.64it/s, loss=0.221, v_num=0, train/loss_simple_step=0.782, train/loss_vlb_step=0.0503, train/loss_step=0.782, global_step=3809.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  63%|██████▎   | 3791/5971 [38:32<22:09,  1.64it/s, loss=0.208, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=3809.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▎   | 3792/5971 [38:34<22:09,  1.64it/s, loss=0.208, v_num=0, train/loss_simple_step=0.0559, train/loss_vlb_step=0.000194, train/loss_step=0.0559, global_step=3809.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▎   | 3792/5971 [38:34<22:09,  1.64it/s, loss=0.202, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000335, train/loss_step=0.102, global_step=3809.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  64%|██████▎   | 3793/5971 [38:35<22:09,  1.64it/s, loss=0.202, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000335, train/loss_step=0.102, global_step=3809.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▎   | 3793/5971 [38:35<22:09,  1.64it/s, loss=0.207, v_num=0, train/loss_simple_step=0.439, train/loss_vlb_step=0.00327, train/loss_step=0.439, global_step=3810.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  64%|██████▎   | 3794/5971 [38:36<22:08,  1.64it/s, loss=0.207, v_num=0, train/loss_simple_step=0.439, train/loss_vlb_step=0.00327, train/loss_step=0.439, global_step=3810.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▎   | 3794/5971 [38:36<22:08,  1.64it/s, loss=0.179, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00135, train/loss_step=0.264, global_step=3810.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▎   | 3795/5971 [38:37<22:08,  1.64it/s, loss=0.179, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00135, train/loss_step=0.264, global_step=3810.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▎   | 3795/5971 [38:37<22:08,  1.64it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=3810.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▎   | 3796/5971 [38:39<22:08,  1.64it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=3810.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▎   | 3796/5971 [38:39<22:08,  1.64it/s, loss=0.199, v_num=0, train/loss_simple_step=0.779, train/loss_vlb_step=0.0273, train/loss_step=0.779, global_step=3810.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  64%|██████▎   | 3797/5971 [38:40<22:08,  1.64it/s, loss=0.199, v_num=0, train/loss_simple_step=0.779, train/loss_vlb_step=0.0273, train/loss_step=0.779, global_step=3810.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▎   | 3797/5971 [38:40<22:08,  1.64it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0117, train/loss_vlb_step=5.3e-5, train/loss_step=0.0117, global_step=3811.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▎   | 3798/5971 [38:41<22:07,  1.64it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0117, train/loss_vlb_step=5.3e-5, train/loss_step=0.0117, global_step=3811.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▎   | 3798/5971 [38:41<22:07,  1.64it/s, loss=0.198, v_num=0, train/loss_simple_step=0.562, train/loss_vlb_step=0.00456, train/loss_step=0.562, global_step=3811.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  64%|██████▎   | 3799/5971 [38:42<22:07,  1.64it/s, loss=0.198, v_num=0, train/loss_simple_step=0.562, train/loss_vlb_step=0.00456, train/loss_step=0.562, global_step=3811.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▎   | 3799/5971 [38:42<22:07,  1.64it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0539, train/loss_vlb_step=0.000188, train/loss_step=0.0539, global_step=3811.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▎   | 3800/5971 [38:44<22:07,  1.64it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0539, train/loss_vlb_step=0.000188, train/loss_step=0.0539, global_step=3811.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▎   | 3800/5971 [38:44<22:07,  1.64it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0431, train/loss_vlb_step=0.000151, train/loss_step=0.0431, global_step=3811.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▎   | 3801/5971 [38:45<22:07,  1.64it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0431, train/loss_vlb_step=0.000151, train/loss_step=0.0431, global_step=3811.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▎   | 3801/5971 [38:45<22:07,  1.64it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00206, train/loss_vlb_step=1.19e-5, train/loss_step=0.00206, global_step=3812.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▎   | 3802/5971 [38:46<22:06,  1.63it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00206, train/loss_vlb_step=1.19e-5, train/loss_step=0.00206, global_step=3812.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▎   | 3802/5971 [38:46<22:06,  1.63it/s, loss=0.192, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.68e-5, train/loss_step=0.021, global_step=3812.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  64%|██████▎   | 3803/5971 [38:47<22:06,  1.63it/s, loss=0.192, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.68e-5, train/loss_step=0.021, global_step=3812.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▎   | 3803/5971 [38:47<22:06,  1.63it/s, loss=0.199, v_num=0, train/loss_simple_step=0.265, train/loss_vlb_step=0.00108, train/loss_step=0.265, global_step=3812.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▎   | 3804/5971 [38:49<22:06,  1.63it/s, loss=0.199, v_num=0, train/loss_simple_step=0.265, train/loss_vlb_step=0.00108, train/loss_step=0.265, global_step=3812.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▎   | 3804/5971 [38:49<22:06,  1.63it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0372, train/loss_vlb_step=0.00013, train/loss_step=0.0372, global_step=3812.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▎   | 3805/5971 [38:50<22:06,  1.63it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0372, train/loss_vlb_step=0.00013, train/loss_step=0.0372, global_step=3812.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▎   | 3805/5971 [38:50<22:06,  1.63it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0789, train/loss_vlb_step=0.000273, train/loss_step=0.0789, global_step=3813.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▎   | 3806/5971 [38:50<22:05,  1.63it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0789, train/loss_vlb_step=0.000273, train/loss_step=0.0789, global_step=3813.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▎   | 3806/5971 [38:50<22:05,  1.63it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0438, train/loss_vlb_step=0.000157, train/loss_step=0.0438, global_step=3813.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3807/5971 [38:51<22:05,  1.63it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0438, train/loss_vlb_step=0.000157, train/loss_step=0.0438, global_step=3813.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3807/5971 [38:51<22:05,  1.63it/s, loss=0.187, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000342, train/loss_step=0.103, global_step=3813.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  64%|██████▍   | 3808/5971 [38:54<22:05,  1.63it/s, loss=0.187, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000342, train/loss_step=0.103, global_step=3813.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3808/5971 [38:54<22:05,  1.63it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0456, train/loss_vlb_step=0.000168, train/loss_step=0.0456, global_step=3813.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3809/5971 [38:55<22:05,  1.63it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0456, train/loss_vlb_step=0.000168, train/loss_step=0.0456, global_step=3813.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3809/5971 [38:55<22:05,  1.63it/s, loss=0.202, v_num=0, train/loss_simple_step=0.265, train/loss_vlb_step=0.00102, train/loss_step=0.265, global_step=3814.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  64%|██████▍   | 3810/5971 [38:56<22:04,  1.63it/s, loss=0.202, v_num=0, train/loss_simple_step=0.265, train/loss_vlb_step=0.00102, train/loss_step=0.265, global_step=3814.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3810/5971 [38:56<22:04,  1.63it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00661, train/loss_vlb_step=2.98e-5, train/loss_step=0.00661, global_step=3814.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3811/5971 [38:56<22:04,  1.63it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00661, train/loss_vlb_step=2.98e-5, train/loss_step=0.00661, global_step=3814.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3811/5971 [38:56<22:04,  1.63it/s, loss=0.176, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.00147, train/loss_step=0.313, global_step=3814.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  64%|██████▍   | 3812/5971 [38:59<22:04,  1.63it/s, loss=0.176, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.00147, train/loss_step=0.313, global_step=3814.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3812/5971 [38:59<22:04,  1.63it/s, loss=0.178, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000444, train/loss_step=0.134, global_step=3814.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3813/5971 [38:59<22:03,  1.63it/s, loss=0.178, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000444, train/loss_step=0.134, global_step=3814.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3813/5971 [38:59<22:03,  1.63it/s, loss=0.172, v_num=0, train/loss_simple_step=0.315, train/loss_vlb_step=0.00169, train/loss_step=0.315, global_step=3815.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  64%|██████▍   | 3814/5971 [39:00<22:03,  1.63it/s, loss=0.172, v_num=0, train/loss_simple_step=0.315, train/loss_vlb_step=0.00169, train/loss_step=0.315, global_step=3815.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3814/5971 [39:00<22:03,  1.63it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00292, train/loss_vlb_step=1.56e-5, train/loss_step=0.00292, global_step=3815.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3815/5971 [39:01<22:03,  1.63it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00292, train/loss_vlb_step=1.56e-5, train/loss_step=0.00292, global_step=3815.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3815/5971 [39:01<22:03,  1.63it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0163, train/loss_vlb_step=7.13e-5, train/loss_step=0.0163, global_step=3815.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  64%|██████▍   | 3816/5971 [39:04<22:03,  1.63it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0163, train/loss_vlb_step=7.13e-5, train/loss_step=0.0163, global_step=3815.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3816/5971 [39:04<22:03,  1.63it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00274, train/loss_vlb_step=1.52e-5, train/loss_step=0.00274, global_step=3815.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3817/5971 [39:05<22:02,  1.63it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00274, train/loss_vlb_step=1.52e-5, train/loss_step=0.00274, global_step=3815.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3817/5971 [39:05<22:02,  1.63it/s, loss=0.135, v_num=0, train/loss_simple_step=0.393, train/loss_vlb_step=0.00207, train/loss_step=0.393, global_step=3816.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  64%|██████▍   | 3818/5971 [39:05<22:02,  1.63it/s, loss=0.135, v_num=0, train/loss_simple_step=0.393, train/loss_vlb_step=0.00207, train/loss_step=0.393, global_step=3816.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3818/5971 [39:05<22:02,  1.63it/s, loss=0.123, v_num=0, train/loss_simple_step=0.321, train/loss_vlb_step=0.00124, train/loss_step=0.321, global_step=3816.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3819/5971 [39:06<22:02,  1.63it/s, loss=0.123, v_num=0, train/loss_simple_step=0.321, train/loss_vlb_step=0.00124, train/loss_step=0.321, global_step=3816.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3819/5971 [39:06<22:02,  1.63it/s, loss=0.144, v_num=0, train/loss_simple_step=0.470, train/loss_vlb_step=0.00374, train/loss_step=0.470, global_step=3816.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3820/5971 [39:08<22:02,  1.63it/s, loss=0.144, v_num=0, train/loss_simple_step=0.470, train/loss_vlb_step=0.00374, train/loss_step=0.470, global_step=3816.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3820/5971 [39:08<22:02,  1.63it/s, loss=0.185, v_num=0, train/loss_simple_step=0.866, train/loss_vlb_step=0.0557, train/loss_step=0.866, global_step=3816.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  64%|██████▍   | 3821/5971 [39:09<22:01,  1.63it/s, loss=0.185, v_num=0, train/loss_simple_step=0.866, train/loss_vlb_step=0.0557, train/loss_step=0.866, global_step=3816.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3821/5971 [39:09<22:01,  1.63it/s, loss=0.211, v_num=0, train/loss_simple_step=0.524, train/loss_vlb_step=0.00315, train/loss_step=0.524, global_step=3817.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3822/5971 [39:10<22:01,  1.63it/s, loss=0.211, v_num=0, train/loss_simple_step=0.524, train/loss_vlb_step=0.00315, train/loss_step=0.524, global_step=3817.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3822/5971 [39:10<22:01,  1.63it/s, loss=0.214, v_num=0, train/loss_simple_step=0.0758, train/loss_vlb_step=0.000249, train/loss_step=0.0758, global_step=3817.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3823/5971 [39:11<22:00,  1.63it/s, loss=0.214, v_num=0, train/loss_simple_step=0.0758, train/loss_vlb_step=0.000249, train/loss_step=0.0758, global_step=3817.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3823/5971 [39:11<22:00,  1.63it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0671, train/loss_vlb_step=0.000233, train/loss_step=0.0671, global_step=3817.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3824/5971 [39:13<22:01,  1.63it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0671, train/loss_vlb_step=0.000233, train/loss_step=0.0671, global_step=3817.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3824/5971 [39:13<22:01,  1.63it/s, loss=0.21, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000508, train/loss_step=0.150, global_step=3817.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  64%|██████▍   | 3825/5971 [39:14<22:00,  1.62it/s, loss=0.21, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000508, train/loss_step=0.150, global_step=3817.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3825/5971 [39:14<22:00,  1.62it/s, loss=0.209, v_num=0, train/loss_simple_step=0.0564, train/loss_vlb_step=0.000197, train/loss_step=0.0564, global_step=3818.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3826/5971 [39:15<22:00,  1.62it/s, loss=0.209, v_num=0, train/loss_simple_step=0.0564, train/loss_vlb_step=0.000197, train/loss_step=0.0564, global_step=3818.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3826/5971 [39:15<22:00,  1.62it/s, loss=0.209, v_num=0, train/loss_simple_step=0.0503, train/loss_vlb_step=0.000179, train/loss_step=0.0503, global_step=3818.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3827/5971 [39:16<21:59,  1.62it/s, loss=0.209, v_num=0, train/loss_simple_step=0.0503, train/loss_vlb_step=0.000179, train/loss_step=0.0503, global_step=3818.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3827/5971 [39:16<21:59,  1.62it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0538, train/loss_vlb_step=0.000187, train/loss_step=0.0538, global_step=3818.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3828/5971 [39:18<21:59,  1.62it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0538, train/loss_vlb_step=0.000187, train/loss_step=0.0538, global_step=3818.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3828/5971 [39:18<21:59,  1.62it/s, loss=0.22, v_num=0, train/loss_simple_step=0.326, train/loss_vlb_step=0.00179, train/loss_step=0.326, global_step=3818.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  64%|██████▍   | 3829/5971 [39:19<21:59,  1.62it/s, loss=0.22, v_num=0, train/loss_simple_step=0.326, train/loss_vlb_step=0.00179, train/loss_step=0.326, global_step=3818.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3829/5971 [39:19<21:59,  1.62it/s, loss=0.212, v_num=0, train/loss_simple_step=0.0928, train/loss_vlb_step=0.000305, train/loss_step=0.0928, global_step=3819.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3830/5971 [39:20<21:58,  1.62it/s, loss=0.212, v_num=0, train/loss_simple_step=0.0928, train/loss_vlb_step=0.000305, train/loss_step=0.0928, global_step=3819.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3830/5971 [39:20<21:58,  1.62it/s, loss=0.218, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.00043, train/loss_step=0.131, global_step=3819.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  64%|██████▍   | 3831/5971 [39:20<21:58,  1.62it/s, loss=0.218, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.00043, train/loss_step=0.131, global_step=3819.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3831/5971 [39:20<21:58,  1.62it/s, loss=0.203, v_num=0, train/loss_simple_step=0.00358, train/loss_vlb_step=1.78e-5, train/loss_step=0.00358, global_step=3819.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3832/5971 [39:23<21:58,  1.62it/s, loss=0.203, v_num=0, train/loss_simple_step=0.00358, train/loss_vlb_step=1.78e-5, train/loss_step=0.00358, global_step=3819.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3832/5971 [39:23<21:58,  1.62it/s, loss=0.227, v_num=0, train/loss_simple_step=0.614, train/loss_vlb_step=0.00888, train/loss_step=0.614, global_step=3819.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  64%|██████▍   | 3833/5971 [39:23<21:58,  1.62it/s, loss=0.227, v_num=0, train/loss_simple_step=0.614, train/loss_vlb_step=0.00888, train/loss_step=0.614, global_step=3819.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3833/5971 [39:23<21:58,  1.62it/s, loss=0.221, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000743, train/loss_step=0.208, global_step=3820.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3834/5971 [39:24<21:57,  1.62it/s, loss=0.221, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000743, train/loss_step=0.208, global_step=3820.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3834/5971 [39:24<21:57,  1.62it/s, loss=0.222, v_num=0, train/loss_simple_step=0.0195, train/loss_vlb_step=8.18e-5, train/loss_step=0.0195, global_step=3820.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3835/5971 [39:25<21:57,  1.62it/s, loss=0.222, v_num=0, train/loss_simple_step=0.0195, train/loss_vlb_step=8.18e-5, train/loss_step=0.0195, global_step=3820.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3835/5971 [39:25<21:57,  1.62it/s, loss=0.234, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00109, train/loss_step=0.264, global_step=3820.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  64%|██████▍   | 3836/5971 [39:28<21:57,  1.62it/s, loss=0.234, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00109, train/loss_step=0.264, global_step=3820.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3836/5971 [39:28<21:57,  1.62it/s, loss=0.239, v_num=0, train/loss_simple_step=0.0857, train/loss_vlb_step=0.000285, train/loss_step=0.0857, global_step=3820.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3837/5971 [39:29<21:57,  1.62it/s, loss=0.239, v_num=0, train/loss_simple_step=0.0857, train/loss_vlb_step=0.000285, train/loss_step=0.0857, global_step=3820.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3837/5971 [39:29<21:57,  1.62it/s, loss=0.228, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000632, train/loss_step=0.173, global_step=3821.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  64%|██████▍   | 3838/5971 [39:29<21:56,  1.62it/s, loss=0.228, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000632, train/loss_step=0.173, global_step=3821.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3838/5971 [39:29<21:56,  1.62it/s, loss=0.219, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.00049, train/loss_step=0.140, global_step=3821.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  64%|██████▍   | 3839/5971 [39:30<21:56,  1.62it/s, loss=0.219, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.00049, train/loss_step=0.140, global_step=3821.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3839/5971 [39:30<21:56,  1.62it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0623, train/loss_vlb_step=0.000215, train/loss_step=0.0623, global_step=3821.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3840/5971 [39:32<21:56,  1.62it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0623, train/loss_vlb_step=0.000215, train/loss_step=0.0623, global_step=3821.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3840/5971 [39:32<21:56,  1.62it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00237, train/loss_vlb_step=1.36e-5, train/loss_step=0.00237, global_step=3821.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3841/5971 [39:33<21:56,  1.62it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00237, train/loss_vlb_step=1.36e-5, train/loss_step=0.00237, global_step=3821.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3841/5971 [39:33<21:56,  1.62it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00278, train/loss_vlb_step=1.5e-5, train/loss_step=0.00278, global_step=3822.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  64%|██████▍   | 3842/5971 [39:34<21:55,  1.62it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00278, train/loss_vlb_step=1.5e-5, train/loss_step=0.00278, global_step=3822.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3842/5971 [39:34<21:55,  1.62it/s, loss=0.166, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.103, train/loss_step=0.812, global_step=3822.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:  64%|██████▍   | 3843/5971 [39:35<21:55,  1.62it/s, loss=0.166, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.103, train/loss_step=0.812, global_step=3822.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3843/5971 [39:35<21:55,  1.62it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00804, train/loss_vlb_step=3.75e-5, train/loss_step=0.00804, global_step=3822.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3844/5971 [39:37<21:55,  1.62it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00804, train/loss_vlb_step=3.75e-5, train/loss_step=0.00804, global_step=3822.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3844/5971 [39:37<21:55,  1.62it/s, loss=0.175, v_num=0, train/loss_simple_step=0.387, train/loss_vlb_step=0.00197, train/loss_step=0.387, global_step=3822.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  64%|██████▍   | 3845/5971 [39:38<21:54,  1.62it/s, loss=0.175, v_num=0, train/loss_simple_step=0.387, train/loss_vlb_step=0.00197, train/loss_step=0.387, global_step=3822.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3845/5971 [39:38<21:54,  1.62it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0257, train/loss_vlb_step=9.62e-5, train/loss_step=0.0257, global_step=3823.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3846/5971 [39:39<21:54,  1.62it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0257, train/loss_vlb_step=9.62e-5, train/loss_step=0.0257, global_step=3823.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3846/5971 [39:39<21:54,  1.62it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0832, train/loss_vlb_step=0.000274, train/loss_step=0.0832, global_step=3823.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3847/5971 [39:40<21:53,  1.62it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0832, train/loss_vlb_step=0.000274, train/loss_step=0.0832, global_step=3823.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3847/5971 [39:40<21:53,  1.62it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00963, train/loss_vlb_step=4.01e-5, train/loss_step=0.00963, global_step=3823.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3848/5971 [39:42<21:54,  1.62it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00963, train/loss_vlb_step=4.01e-5, train/loss_step=0.00963, global_step=3823.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3848/5971 [39:42<21:54,  1.62it/s, loss=0.186, v_num=0, train/loss_simple_step=0.599, train/loss_vlb_step=0.0224, train/loss_step=0.599, global_step=3823.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:  64%|██████▍   | 3849/5971 [39:43<21:53,  1.62it/s, loss=0.186, v_num=0, train/loss_simple_step=0.599, train/loss_vlb_step=0.0224, train/loss_step=0.599, global_step=3823.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3849/5971 [39:43<21:53,  1.62it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0117, train/loss_vlb_step=4.92e-5, train/loss_step=0.0117, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3850/5971 [39:44<21:53,  1.62it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0117, train/loss_vlb_step=4.92e-5, train/loss_step=0.0117, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3850/5971 [39:44<21:53,  1.62it/s, loss=0.184, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000575, train/loss_step=0.174, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  64%|██████▍   | 3851/5971 [39:45<21:52,  1.61it/s, loss=0.184, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000575, train/loss_step=0.174, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  64%|██████▍   | 3851/5971 [39:45<21:52,  1.61it/s, loss=0.184, v_num=0, train/loss_simple_step=0.00584, train/loss_vlb_step=2.9e-5, train/loss_step=0.00584, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  65%|██████▍   | 3852/5971 [39:47<21:52,  1.61it/s, loss=0.184, v_num=0, train/loss_simple_step=0.00584, train/loss_vlb_step=2.9e-5, train/loss_step=0.00584, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  65%|██████▍   | 3852/5971 [39:47<21:52,  1.61it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<00:55,  3.00it/s][A
Epoch 6:  65%|██████▍   | 3854/5971 [39:47<21:51,  1.61it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   1%|          | 2/167 [00:00<00:41,  3.94it/s][A
Epoch 6:  65%|██████▍   | 3856/5971 [39:47<21:49,  1.62it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   3%|▎         | 5/167 [00:00<00:16,  9.61it/s][A
Epoch 6:  65%|██████▍   | 3859/5971 [39:47<21:46,  1.62it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   5%|▍         | 8/167 [00:00<00:10, 14.53it/s][A
Epoch 6:  65%|██████▍   | 3862/5971 [39:48<21:43,  1.62it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   7%|▋         | 11/167 [00:00<00:08, 18.08it/s][A
Epoch 6:  65%|██████▍   | 3865/5971 [39:48<21:40,  1.62it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   8%|▊         | 14/167 [00:00<00:07, 20.38it/s][A
Epoch 6:  65%|██████▍   | 3868/5971 [39:48<21:38,  1.62it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  10%|█         | 17/167 [00:01<00:06, 22.26it/s][A
Epoch 6:  65%|██████▍   | 3871/5971 [39:48<21:35,  1.62it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 23.27it/s][A
Epoch 6:  65%|██████▍   | 3874/5971 [39:48<21:32,  1.62it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  14%|█▍        | 23/167 [00:01<00:05, 24.51it/s][A
Epoch 6:  65%|██████▍   | 3877/5971 [39:48<21:29,  1.62it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.78it/s][A
Epoch 6:  65%|██████▍   | 3880/5971 [39:48<21:27,  1.62it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 26.69it/s][A
Epoch 6:  65%|██████▌   | 3883/5971 [39:48<21:24,  1.63it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  20%|█▉        | 33/167 [00:01<00:04, 27.05it/s][A
Epoch 6:  65%|██████▌   | 3886/5971 [39:48<21:21,  1.63it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  22%|██▏       | 36/167 [00:01<00:04, 27.68it/s][A
Epoch 6:  65%|██████▌   | 3889/5971 [39:49<21:18,  1.63it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  23%|██▎       | 39/167 [00:01<00:04, 27.43it/s][A
Epoch 6:  65%|██████▌   | 3892/5971 [39:49<21:15,  1.63it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 27.75it/s][A
Epoch 6:  65%|██████▌   | 3895/5971 [39:49<21:13,  1.63it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 27.21it/s][A
Epoch 6:  65%|██████▌   | 3898/5971 [39:49<21:10,  1.63it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 27.84it/s][A
Epoch 6:  65%|██████▌   | 3901/5971 [39:49<21:07,  1.63it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  31%|███       | 51/167 [00:02<00:04, 26.86it/s][A
Epoch 6:  65%|██████▌   | 3904/5971 [39:49<21:04,  1.63it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 26.24it/s][A
Epoch 6:  65%|██████▌   | 3907/5971 [39:49<21:02,  1.64it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 26.57it/s][A
Epoch 6:  65%|██████▌   | 3910/5971 [39:49<20:59,  1.64it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  37%|███▋      | 61/167 [00:02<00:03, 27.98it/s][A
Epoch 6:  66%|██████▌   | 3914/5971 [39:49<20:55,  1.64it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  38%|███▊      | 64/167 [00:02<00:03, 27.22it/s][A
Epoch 6:  66%|██████▌   | 3918/5971 [39:50<20:52,  1.64it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  40%|████      | 67/167 [00:02<00:03, 27.48it/s][A
Epoch 6:  66%|██████▌   | 3922/5971 [39:50<20:48,  1.64it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 27.76it/s][A
Epoch 6:  66%|██████▌   | 3926/5971 [39:50<20:44,  1.64it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 28.76it/s][A

Validating:  46%|████▌     | 77/167 [00:03<00:03, 27.93it/s][A
Epoch 6:  66%|██████▌   | 3930/5971 [39:50<20:41,  1.64it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 28.37it/s][A
Epoch 6:  66%|██████▌   | 3934/5971 [39:50<20:37,  1.65it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  51%|█████     | 85/167 [00:03<00:02, 28.30it/s][A
Epoch 6:  66%|██████▌   | 3938/5971 [39:50<20:33,  1.65it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  53%|█████▎    | 88/167 [00:03<00:02, 28.54it/s][A
Epoch 6:  66%|██████▌   | 3942/5971 [39:50<20:30,  1.65it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  54%|█████▍    | 91/167 [00:03<00:02, 28.49it/s][A
Epoch 6:  66%|██████▌   | 3946/5971 [39:51<20:26,  1.65it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  57%|█████▋    | 95/167 [00:03<00:02, 29.03it/s][A
Epoch 6:  66%|██████▌   | 3950/5971 [39:51<20:23,  1.65it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 28.85it/s][A

Validating:  60%|██████    | 101/167 [00:04<00:02, 27.89it/s][A
Epoch 6:  66%|██████▌   | 3954/5971 [39:51<20:19,  1.65it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 28.24it/s][A
Epoch 6:  66%|██████▋   | 3958/5971 [39:51<20:16,  1.66it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 26.35it/s][A
Epoch 6:  66%|██████▋   | 3962/5971 [39:51<20:12,  1.66it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 26.06it/s][A
Epoch 6:  66%|██████▋   | 3966/5971 [39:51<20:08,  1.66it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  68%|██████▊   | 114/167 [00:04<00:02, 25.92it/s][A

Validating:  70%|███████   | 117/167 [00:04<00:01, 26.01it/s][A
Epoch 6:  66%|██████▋   | 3970/5971 [39:52<20:05,  1.66it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  72%|███████▏  | 120/167 [00:04<00:01, 26.96it/s][A
Epoch 6:  67%|██████▋   | 3974/5971 [39:52<20:01,  1.66it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  74%|███████▎  | 123/167 [00:04<00:01, 27.75it/s][A
Epoch 6:  67%|██████▋   | 3978/5971 [39:52<19:58,  1.66it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 26.62it/s][A

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 26.91it/s][A
Epoch 6:  67%|██████▋   | 3982/5971 [39:52<19:54,  1.66it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 25.99it/s][A
Epoch 6:  67%|██████▋   | 3986/5971 [39:52<19:51,  1.67it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  81%|████████  | 135/167 [00:05<00:01, 26.68it/s][A
Epoch 6:  67%|██████▋   | 3990/5971 [39:52<19:47,  1.67it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 26.90it/s][A

Validating:  84%|████████▍ | 141/167 [00:05<00:00, 26.44it/s][A
Epoch 6:  67%|██████▋   | 3994/5971 [39:52<19:44,  1.67it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  86%|████████▌ | 144/167 [00:05<00:00, 27.18it/s][A
Epoch 6:  67%|██████▋   | 3998/5971 [39:53<19:40,  1.67it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  88%|████████▊ | 147/167 [00:05<00:00, 27.59it/s][A
Epoch 6:  67%|██████▋   | 4002/5971 [39:53<19:37,  1.67it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  90%|████████▉ | 150/167 [00:05<00:00, 26.52it/s][A

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 25.47it/s][A
Epoch 6:  67%|██████▋   | 4006/5971 [39:53<19:33,  1.67it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 25.84it/s][A
Epoch 6:  67%|██████▋   | 4010/5971 [39:53<19:30,  1.68it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 26.14it/s][A
Epoch 6:  67%|██████▋   | 4014/5971 [39:53<19:26,  1.68it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 26.17it/s][A
Epoch 6:  67%|██████▋   | 4018/5971 [39:53<19:23,  1.68it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 27.20it/s][A
Epoch 6:  67%|██████▋   | 4020/5971 [39:54<19:21,  1.68it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=3824.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Epoch 6:  67%|██████▋   | 4021/5971 [39:55<19:21,  1.68it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=3825.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  67%|██████▋   | 4022/5971 [39:56<19:20,  1.68it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00225, train/loss_vlb_step=1.27e-5, train/loss_step=0.00225, global_step=3825.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  67%|██████▋   | 4022/5971 [39:56<19:20,  1.68it/s, loss=0.161, v_num=0, train/loss_simple_step=0.362, train/loss_vlb_step=0.00157, train/loss_step=0.362, global_step=3825.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  67%|██████▋   | 4023/5971 [39:56<19:20,  1.68it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=4.79e-5, train/loss_step=0.0109, global_step=3825.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  67%|██████▋   | 4024/5971 [39:59<19:20,  1.68it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00147, train/loss_vlb_step=8.81e-6, train/loss_step=0.00147, global_step=3825.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  67%|██████▋   | 4025/5971 [40:00<19:20,  1.68it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0402, train/loss_vlb_step=0.000139, train/loss_step=0.0402, global_step=3826.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  67%|██████▋   | 4026/5971 [40:00<19:19,  1.68it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0402, train/loss_vlb_step=0.000139, train/loss_step=0.0402, global_step=3826.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  67%|██████▋   | 4026/5971 [40:00<19:19,  1.68it/s, loss=0.146, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00122, train/loss_step=0.308, global_step=3826.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  67%|██████▋   | 4027/5971 [40:01<19:19,  1.68it/s, loss=0.154, v_num=0, train/loss_simple_step=0.223, train/loss_vlb_step=0.000855, train/loss_step=0.223, global_step=3826.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  67%|██████▋   | 4028/5971 [40:04<19:19,  1.68it/s, loss=0.176, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00219, train/loss_step=0.448, global_step=3826.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  67%|██████▋   | 4029/5971 [40:04<19:18,  1.68it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=5.29e-5, train/loss_step=0.0116, global_step=3827.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  67%|██████▋   | 4030/5971 [40:05<19:18,  1.68it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=5.29e-5, train/loss_step=0.0116, global_step=3827.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  67%|██████▋   | 4030/5971 [40:05<19:18,  1.68it/s, loss=0.143, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000459, train/loss_step=0.139, global_step=3827.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  68%|██████▊   | 4031/5971 [40:06<19:17,  1.68it/s, loss=0.154, v_num=0, train/loss_simple_step=0.232, train/loss_vlb_step=0.000895, train/loss_step=0.232, global_step=3827.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4032/5971 [40:09<19:18,  1.67it/s, loss=0.145, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000703, train/loss_step=0.207, global_step=3827.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4033/5971 [40:09<19:17,  1.67it/s, loss=0.147, v_num=0, train/loss_simple_step=0.063, train/loss_vlb_step=0.000213, train/loss_step=0.063, global_step=3828.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4034/5971 [40:10<19:17,  1.67it/s, loss=0.147, v_num=0, train/loss_simple_step=0.063, train/loss_vlb_step=0.000213, train/loss_step=0.063, global_step=3828.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4034/5971 [40:10<19:17,  1.67it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00191, train/loss_vlb_step=1.12e-5, train/loss_step=0.00191, global_step=3828.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4035/5971 [40:11<19:16,  1.67it/s, loss=0.181, v_num=0, train/loss_simple_step=0.776, train/loss_vlb_step=0.131, train/loss_step=0.776, global_step=3828.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]      
Epoch 6:  68%|██████▊   | 4036/5971 [40:13<19:17,  1.67it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0043, train/loss_vlb_step=2.25e-5, train/loss_step=0.0043, global_step=3828.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4037/5971 [40:14<19:16,  1.67it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0716, train/loss_vlb_step=0.00024, train/loss_step=0.0716, global_step=3829.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4038/5971 [40:15<19:16,  1.67it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0716, train/loss_vlb_step=0.00024, train/loss_step=0.0716, global_step=3829.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4038/5971 [40:15<19:16,  1.67it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0328, train/loss_vlb_step=0.000128, train/loss_step=0.0328, global_step=3829.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4039/5971 [40:16<19:15,  1.67it/s, loss=0.183, v_num=0, train/loss_simple_step=0.720, train/loss_vlb_step=0.0156, train/loss_step=0.720, global_step=3829.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  68%|██████▊   | 4040/5971 [40:18<19:15,  1.67it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0145, train/loss_vlb_step=6.49e-5, train/loss_step=0.0145, global_step=3829.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4041/5971 [40:19<19:15,  1.67it/s, loss=0.204, v_num=0, train/loss_simple_step=0.415, train/loss_vlb_step=0.00372, train/loss_step=0.415, global_step=3830.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  68%|██████▊   | 4042/5971 [40:20<19:14,  1.67it/s, loss=0.204, v_num=0, train/loss_simple_step=0.415, train/loss_vlb_step=0.00372, train/loss_step=0.415, global_step=3830.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4042/5971 [40:20<19:14,  1.67it/s, loss=0.195, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.000609, train/loss_step=0.176, global_step=3830.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4043/5971 [40:21<19:14,  1.67it/s, loss=0.202, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000555, train/loss_step=0.162, global_step=3830.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4044/5971 [40:23<19:14,  1.67it/s, loss=0.211, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000582, train/loss_step=0.168, global_step=3830.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4045/5971 [40:24<19:14,  1.67it/s, loss=0.22, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.000804, train/loss_step=0.219, global_step=3831.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  68%|██████▊   | 4046/5971 [40:25<19:13,  1.67it/s, loss=0.22, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.000804, train/loss_step=0.219, global_step=3831.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4046/5971 [40:25<19:13,  1.67it/s, loss=0.204, v_num=0, train/loss_simple_step=0.00221, train/loss_vlb_step=1.29e-5, train/loss_step=0.00221, global_step=3831.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4047/5971 [40:26<19:13,  1.67it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0893, train/loss_vlb_step=0.000295, train/loss_step=0.0893, global_step=3831.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  68%|██████▊   | 4048/5971 [40:28<19:13,  1.67it/s, loss=0.179, v_num=0, train/loss_simple_step=0.066, train/loss_vlb_step=0.000221, train/loss_step=0.066, global_step=3831.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  68%|██████▊   | 4049/5971 [40:29<19:12,  1.67it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0506, train/loss_vlb_step=0.00018, train/loss_step=0.0506, global_step=3832.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4050/5971 [40:30<19:12,  1.67it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0506, train/loss_vlb_step=0.00018, train/loss_step=0.0506, global_step=3832.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4050/5971 [40:30<19:12,  1.67it/s, loss=0.174, v_num=0, train/loss_simple_step=0.00268, train/loss_vlb_step=1.52e-5, train/loss_step=0.00268, global_step=3832.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4051/5971 [40:31<19:12,  1.67it/s, loss=0.183, v_num=0, train/loss_simple_step=0.413, train/loss_vlb_step=0.00186, train/loss_step=0.413, global_step=3832.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  68%|██████▊   | 4052/5971 [40:33<19:12,  1.67it/s, loss=0.196, v_num=0, train/loss_simple_step=0.479, train/loss_vlb_step=0.00623, train/loss_step=0.479, global_step=3832.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4053/5971 [40:34<19:11,  1.67it/s, loss=0.203, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.00071, train/loss_step=0.204, global_step=3833.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4054/5971 [40:35<19:11,  1.67it/s, loss=0.203, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.00071, train/loss_step=0.204, global_step=3833.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4054/5971 [40:35<19:11,  1.67it/s, loss=0.227, v_num=0, train/loss_simple_step=0.470, train/loss_vlb_step=0.00314, train/loss_step=0.470, global_step=3833.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4055/5971 [40:36<19:10,  1.66it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00484, train/loss_vlb_step=2.53e-5, train/loss_step=0.00484, global_step=3833.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4056/5971 [40:38<19:11,  1.66it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00336, train/loss_vlb_step=1.85e-5, train/loss_step=0.00336, global_step=3833.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4057/5971 [40:39<19:10,  1.66it/s, loss=0.185, v_num=0, train/loss_simple_step=0.00546, train/loss_vlb_step=2.71e-5, train/loss_step=0.00546, global_step=3834.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4058/5971 [40:40<19:10,  1.66it/s, loss=0.185, v_num=0, train/loss_simple_step=0.00546, train/loss_vlb_step=2.71e-5, train/loss_step=0.00546, global_step=3834.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4058/5971 [40:40<19:10,  1.66it/s, loss=0.184, v_num=0, train/loss_simple_step=0.00507, train/loss_vlb_step=2.57e-5, train/loss_step=0.00507, global_step=3834.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4059/5971 [40:41<19:09,  1.66it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.05e-5, train/loss_step=0.00183, global_step=3834.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4060/5971 [40:43<19:09,  1.66it/s, loss=0.156, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.000644, train/loss_step=0.186, global_step=3834.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  68%|██████▊   | 4061/5971 [40:44<19:09,  1.66it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00568, train/loss_vlb_step=2.71e-5, train/loss_step=0.00568, global_step=3835.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4062/5971 [40:45<19:09,  1.66it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00568, train/loss_vlb_step=2.71e-5, train/loss_step=0.00568, global_step=3835.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4062/5971 [40:45<19:09,  1.66it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00215, train/loss_vlb_step=1.21e-5, train/loss_step=0.00215, global_step=3835.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4063/5971 [40:46<19:08,  1.66it/s, loss=0.127, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000639, train/loss_step=0.164, global_step=3835.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  68%|██████▊   | 4064/5971 [40:48<19:08,  1.66it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0311, train/loss_vlb_step=0.000123, train/loss_step=0.0311, global_step=3835.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4065/5971 [40:49<19:08,  1.66it/s, loss=0.125, v_num=0, train/loss_simple_step=0.303, train/loss_vlb_step=0.00127, train/loss_step=0.303, global_step=3836.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  68%|██████▊   | 4066/5971 [40:50<19:07,  1.66it/s, loss=0.125, v_num=0, train/loss_simple_step=0.303, train/loss_vlb_step=0.00127, train/loss_step=0.303, global_step=3836.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4066/5971 [40:50<19:07,  1.66it/s, loss=0.135, v_num=0, train/loss_simple_step=0.215, train/loss_vlb_step=0.000778, train/loss_step=0.215, global_step=3836.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4067/5971 [40:51<19:07,  1.66it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0089, train/loss_vlb_step=3.95e-5, train/loss_step=0.0089, global_step=3836.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4068/5971 [40:53<19:07,  1.66it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0161, train/loss_vlb_step=6.77e-5, train/loss_step=0.0161, global_step=3836.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4069/5971 [40:54<19:06,  1.66it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000289, train/loss_step=0.0868, global_step=3837.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4070/5971 [40:55<19:06,  1.66it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000289, train/loss_step=0.0868, global_step=3837.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4070/5971 [40:55<19:06,  1.66it/s, loss=0.141, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.000813, train/loss_step=0.220, global_step=3837.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  68%|██████▊   | 4071/5971 [40:56<19:05,  1.66it/s, loss=0.129, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000594, train/loss_step=0.177, global_step=3837.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4072/5971 [40:58<19:06,  1.66it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=6.15e-5, train/loss_step=0.0143, global_step=3837.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4073/5971 [40:59<19:05,  1.66it/s, loss=0.106, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000671, train/loss_step=0.194, global_step=3838.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  68%|██████▊   | 4074/5971 [40:59<19:05,  1.66it/s, loss=0.106, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000671, train/loss_step=0.194, global_step=3838.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4074/5971 [40:59<19:05,  1.66it/s, loss=0.0828, v_num=0, train/loss_simple_step=0.0117, train/loss_vlb_step=5.02e-5, train/loss_step=0.0117, global_step=3838.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4075/5971 [41:00<19:04,  1.66it/s, loss=0.0832, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=5.98e-5, train/loss_step=0.0143, global_step=3838.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4076/5971 [41:02<19:04,  1.66it/s, loss=0.0882, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000335, train/loss_step=0.102, global_step=3838.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  68%|██████▊   | 4077/5971 [41:03<19:04,  1.66it/s, loss=0.0883, v_num=0, train/loss_simple_step=0.00836, train/loss_vlb_step=3.86e-5, train/loss_step=0.00836, global_step=3839.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4078/5971 [41:04<19:03,  1.65it/s, loss=0.0883, v_num=0, train/loss_simple_step=0.00836, train/loss_vlb_step=3.86e-5, train/loss_step=0.00836, global_step=3839.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4078/5971 [41:04<19:03,  1.65it/s, loss=0.105, v_num=0, train/loss_simple_step=0.343, train/loss_vlb_step=0.00183, train/loss_step=0.343, global_step=3839.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:  68%|██████▊   | 4079/5971 [41:05<19:03,  1.65it/s, loss=0.115, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000659, train/loss_step=0.192, global_step=3839.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4080/5971 [41:07<19:03,  1.65it/s, loss=0.12, v_num=0, train/loss_simple_step=0.294, train/loss_vlb_step=0.00149, train/loss_step=0.294, global_step=3839.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  68%|██████▊   | 4081/5971 [41:08<19:02,  1.65it/s, loss=0.127, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000494, train/loss_step=0.148, global_step=3840.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4082/5971 [41:09<19:02,  1.65it/s, loss=0.127, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000494, train/loss_step=0.148, global_step=3840.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4082/5971 [41:09<19:02,  1.65it/s, loss=0.16, v_num=0, train/loss_simple_step=0.648, train/loss_vlb_step=0.0281, train/loss_step=0.648, global_step=3840.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  68%|██████▊   | 4083/5971 [41:10<19:02,  1.65it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0622, train/loss_vlb_step=0.000215, train/loss_step=0.0622, global_step=3840.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4084/5971 [41:12<19:02,  1.65it/s, loss=0.181, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.00718, train/loss_step=0.563, global_step=3840.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  68%|██████▊   | 4085/5971 [41:13<19:01,  1.65it/s, loss=0.173, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3841.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4086/5971 [41:14<19:01,  1.65it/s, loss=0.173, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000496, train/loss_step=0.148, global_step=3841.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4086/5971 [41:14<19:01,  1.65it/s, loss=0.174, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.000822, train/loss_step=0.237, global_step=3841.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4087/5971 [41:15<19:00,  1.65it/s, loss=0.194, v_num=0, train/loss_simple_step=0.395, train/loss_vlb_step=0.00299, train/loss_step=0.395, global_step=3841.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  68%|██████▊   | 4088/5971 [41:17<19:00,  1.65it/s, loss=0.193, v_num=0, train/loss_simple_step=0.000924, train/loss_vlb_step=5.64e-6, train/loss_step=0.000924, global_step=3841.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4089/5971 [41:18<19:00,  1.65it/s, loss=0.189, v_num=0, train/loss_simple_step=0.00147, train/loss_vlb_step=8.81e-6, train/loss_step=0.00147, global_step=3842.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  68%|██████▊   | 4090/5971 [41:19<18:59,  1.65it/s, loss=0.189, v_num=0, train/loss_simple_step=0.00147, train/loss_vlb_step=8.81e-6, train/loss_step=0.00147, global_step=3842.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  68%|██████▊   | 4090/5971 [41:19<18:59,  1.65it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0197, train/loss_vlb_step=8.34e-5, train/loss_step=0.0197, global_step=3842.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  69%|██████▊   | 4091/5971 [41:20<18:59,  1.65it/s, loss=0.208, v_num=0, train/loss_simple_step=0.761, train/loss_vlb_step=0.0251, train/loss_step=0.761, global_step=3842.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  69%|██████▊   | 4092/5971 [41:22<18:59,  1.65it/s, loss=0.208, v_num=0, train/loss_simple_step=0.022, train/loss_vlb_step=9e-5, train/loss_step=0.022, global_step=3842.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  69%|██████▊   | 4093/5971 [41:23<18:59,  1.65it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0254, train/loss_vlb_step=9.7e-5, train/loss_step=0.0254, global_step=3843.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  69%|██████▊   | 4094/5971 [41:24<18:58,  1.65it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0254, train/loss_vlb_step=9.7e-5, train/loss_step=0.0254, global_step=3843.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  69%|██████▊   | 4094/5971 [41:24<18:58,  1.65it/s, loss=0.207, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000511, train/loss_step=0.154, global_step=3843.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  69%|██████▊   | 4095/5971 [41:24<18:58,  1.65it/s, loss=0.207, v_num=0, train/loss_simple_step=0.0098, train/loss_vlb_step=4.57e-5, train/loss_step=0.0098, global_step=3843.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  69%|██████▊   | 4096/5971 [41:27<18:58,  1.65it/s, loss=0.202, v_num=0, train/loss_simple_step=0.00845, train/loss_vlb_step=3.85e-5, train/loss_step=0.00845, global_step=3843.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  69%|██████▊   | 4097/5971 [41:28<18:57,  1.65it/s, loss=0.216, v_num=0, train/loss_simple_step=0.284, train/loss_vlb_step=0.00143, train/loss_step=0.284, global_step=3844.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  69%|██████▊   | 4098/5971 [41:28<18:57,  1.65it/s, loss=0.216, v_num=0, train/loss_simple_step=0.284, train/loss_vlb_step=0.00143, train/loss_step=0.284, global_step=3844.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  69%|██████▊   | 4098/5971 [41:28<18:57,  1.65it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0236, train/loss_vlb_step=8.93e-5, train/loss_step=0.0236, global_step=3844.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  69%|██████▊   | 4099/5971 [41:29<18:56,  1.65it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0265, train/loss_vlb_step=0.000101, train/loss_step=0.0265, global_step=3844.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  69%|██████▊   | 4100/5971 [41:31<18:56,  1.65it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0427, train/loss_vlb_step=0.000153, train/loss_step=0.0427, global_step=3844.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  69%|██████▊   | 4101/5971 [41:32<18:56,  1.65it/s, loss=0.179, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.00049, train/loss_step=0.149, global_step=3845.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  69%|██████▊   | 4102/5971 [41:33<18:55,  1.65it/s, loss=0.179, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.00049, train/loss_step=0.149, global_step=3845.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  69%|██████▊   | 4102/5971 [41:33<18:55,  1.65it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00513, train/loss_vlb_step=2.75e-5, train/loss_step=0.00513, global_step=3845.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  69%|██████▊   | 4103/5971 [41:34<18:55,  1.65it/s, loss=0.155, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000777, train/loss_step=0.217, global_step=3845.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  69%|██████▊   | 4104/5971 [41:36<18:55,  1.64it/s, loss=0.155, v_num=0, train/loss_simple_step=0.564, train/loss_vlb_step=0.00473, train/loss_step=0.564, global_step=3845.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  69%|██████▊   | 4105/5971 [41:37<18:55,  1.64it/s, loss=0.168, v_num=0, train/loss_simple_step=0.407, train/loss_vlb_step=0.00231, train/loss_step=0.407, global_step=3846.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  69%|██████▉   | 4106/5971 [41:38<18:54,  1.64it/s, loss=0.168, v_num=0, train/loss_simple_step=0.407, train/loss_vlb_step=0.00231, train/loss_step=0.407, global_step=3846.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  69%|██████▉   | 4106/5971 [41:38<18:54,  1.64it/s, loss=0.163, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000459, train/loss_step=0.139, global_step=3846.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  69%|██████▉   | 4107/5971 [41:39<18:54,  1.64it/s, loss=0.158, v_num=0, train/loss_simple_step=0.310, train/loss_vlb_step=0.00157, train/loss_step=0.310, global_step=3846.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  69%|██████▉   | 4108/5971 [41:41<18:54,  1.64it/s, loss=0.164, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000481, train/loss_step=0.121, global_step=3846.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  69%|██████▉   | 4109/5971 [41:42<18:53,  1.64it/s, loss=0.171, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000405, train/loss_step=0.122, global_step=3847.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  69%|██████▉   | 4110/5971 [41:43<18:53,  1.64it/s, loss=0.171, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000405, train/loss_step=0.122, global_step=3847.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  69%|██████▉   | 4110/5971 [41:43<18:53,  1.64it/s, loss=0.18, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000743, train/loss_step=0.205, global_step=3847.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  69%|██████▉   | 4111/5971 [41:44<18:52,  1.64it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0409, train/loss_vlb_step=0.000146, train/loss_step=0.0409, global_step=3847.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  69%|██████▉   | 4112/5971 [41:46<18:52,  1.64it/s, loss=0.159, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.00162, train/loss_step=0.322, global_step=3847.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  69%|██████▉   | 4113/5971 [41:47<18:52,  1.64it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.51e-5, train/loss_step=0.0126, global_step=3848.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  69%|██████▉   | 4114/5971 [41:48<18:51,  1.64it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.51e-5, train/loss_step=0.0126, global_step=3848.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  69%|██████▉   | 4114/5971 [41:48<18:51,  1.64it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00747, train/loss_vlb_step=3.47e-5, train/loss_step=0.00747, global_step=3848.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  69%|██████▉   | 4115/5971 [41:49<18:51,  1.64it/s, loss=0.161, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.000941, train/loss_step=0.220, global_step=3848.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  69%|██████▉   | 4116/5971 [41:51<18:51,  1.64it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0163, train/loss_vlb_step=6.81e-5, train/loss_step=0.0163, global_step=3848.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  69%|██████▉   | 4117/5971 [41:52<18:51,  1.64it/s, loss=0.191, v_num=0, train/loss_simple_step=0.879, train/loss_vlb_step=0.0455, train/loss_step=0.879, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  69%|██████▉   | 4118/5971 [41:53<18:50,  1.64it/s, loss=0.191, v_num=0, train/loss_simple_step=0.879, train/loss_vlb_step=0.0455, train/loss_step=0.879, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  69%|██████▉   | 4118/5971 [41:53<18:50,  1.64it/s, loss=0.211, v_num=0, train/loss_simple_step=0.407, train/loss_vlb_step=0.00276, train/loss_step=0.407, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  69%|██████▉   | 4119/5971 [41:54<18:50,  1.64it/s, loss=0.212, v_num=0, train/loss_simple_step=0.0579, train/loss_vlb_step=0.000202, train/loss_step=0.0579, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  69%|██████▉   | 4120/5971 [41:56<18:50,  1.64it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:10,  2.36it/s]
Epoch 6:  69%|██████▉   | 4122/5971 [41:56<18:48,  1.64it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   1%|          | 2/167 [00:00<00:50,  3.28it/s]

Validating:   3%|▎         | 5/167 [00:00<00:18,  8.62it/s]
Epoch 6:  69%|██████▉   | 4126/5971 [41:57<18:45,  1.64it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.43it/s]
Epoch 6:  69%|██████▉   | 4130/5971 [41:57<18:41,  1.64it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   7%|▋         | 11/167 [00:00<00:09, 17.18it/s]
Epoch 6:  69%|██████▉   | 4134/5971 [41:57<18:38,  1.64it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.40it/s]

Validating:  10%|█         | 17/167 [00:01<00:07, 21.32it/s]
Epoch 6:  69%|██████▉   | 4138/5971 [41:57<18:35,  1.64it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 21.62it/s]
Epoch 6:  69%|██████▉   | 4142/5971 [41:57<18:31,  1.65it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 23.71it/s]
Epoch 6:  69%|██████▉   | 4146/5971 [41:58<18:28,  1.65it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.07it/s]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 24.96it/s]
Epoch 6:  70%|██████▉   | 4150/5971 [41:58<18:24,  1.65it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 25.86it/s]
Epoch 6:  70%|██████▉   | 4154/5971 [41:58<18:21,  1.65it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  21%|██        | 35/167 [00:01<00:05, 25.62it/s]
Epoch 6:  70%|██████▉   | 4158/5971 [41:58<18:17,  1.65it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  23%|██▎       | 38/167 [00:02<00:04, 26.22it/s]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 26.52it/s]
Epoch 6:  70%|██████▉   | 4162/5971 [41:58<18:14,  1.65it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 26.04it/s]
Epoch 6:  70%|██████▉   | 4166/5971 [41:58<18:11,  1.65it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 26.37it/s]
Epoch 6:  70%|██████▉   | 4170/5971 [41:58<18:07,  1.66it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 25.60it/s]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 25.27it/s]
Epoch 6:  70%|██████▉   | 4174/5971 [41:59<18:04,  1.66it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 25.49it/s]
Epoch 6:  70%|██████▉   | 4178/5971 [41:59<18:00,  1.66it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  35%|███▌      | 59/167 [00:02<00:04, 25.40it/s]
Epoch 6:  70%|███████   | 4182/5971 [41:59<17:57,  1.66it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  37%|███▋      | 62/167 [00:02<00:04, 24.47it/s]

Validating:  39%|███▉      | 65/167 [00:03<00:03, 25.56it/s]
Epoch 6:  70%|███████   | 4186/5971 [41:59<17:54,  1.66it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  41%|████      | 68/167 [00:03<00:03, 26.34it/s]
Epoch 6:  70%|███████   | 4190/5971 [41:59<17:50,  1.66it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 27.19it/s]
Epoch 6:  70%|███████   | 4194/5971 [41:59<17:47,  1.66it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 27.36it/s]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 27.01it/s]
Epoch 6:  70%|███████   | 4198/5971 [42:00<17:44,  1.67it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 26.53it/s]
Epoch 6:  70%|███████   | 4202/5971 [42:00<17:40,  1.67it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 27.16it/s]
Epoch 6:  70%|███████   | 4206/5971 [42:00<17:37,  1.67it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  51%|█████▏    | 86/167 [00:03<00:02, 27.44it/s]

Validating:  53%|█████▎    | 89/167 [00:03<00:02, 28.06it/s]
Epoch 6:  71%|███████   | 4210/5971 [42:00<17:34,  1.67it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 26.43it/s]
Epoch 6:  71%|███████   | 4214/5971 [42:00<17:30,  1.67it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 26.59it/s]
Epoch 6:  71%|███████   | 4218/5971 [42:00<17:27,  1.67it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 25.62it/s]

Validating:  60%|██████    | 101/167 [00:04<00:02, 25.47it/s]
Epoch 6:  71%|███████   | 4222/5971 [42:00<17:24,  1.68it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 24.58it/s]
Epoch 6:  71%|███████   | 4226/5971 [42:01<17:20,  1.68it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 25.20it/s]
Epoch 6:  71%|███████   | 4230/5971 [42:01<17:17,  1.68it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 25.38it/s]

Validating:  68%|██████▊   | 113/167 [00:04<00:02, 24.82it/s]
Epoch 6:  71%|███████   | 4234/5971 [42:01<17:14,  1.68it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  69%|██████▉   | 116/167 [00:05<00:01, 25.58it/s]
Epoch 6:  71%|███████   | 4238/5971 [42:01<17:10,  1.68it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 26.51it/s]
Epoch 6:  71%|███████   | 4242/5971 [42:01<17:07,  1.68it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 25.78it/s]
Epoch 6:  71%|███████   | 4246/5971 [42:01<17:04,  1.68it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 26.28it/s]
Epoch 6:  71%|███████   | 4250/5971 [42:02<17:01,  1.69it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 27.59it/s][A

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 27.02it/s][A
Epoch 6:  71%|███████   | 4254/5971 [42:02<16:57,  1.69it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 27.40it/s][A
Epoch 6:  71%|███████▏  | 4258/5971 [42:02<16:54,  1.69it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  84%|████████▍ | 140/167 [00:05<00:00, 27.92it/s][A
Epoch 6:  71%|███████▏  | 4262/5971 [42:02<16:51,  1.69it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 28.97it/s][A
Epoch 6:  71%|███████▏  | 4266/5971 [42:02<16:47,  1.69it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 28.05it/s][A
Epoch 6:  72%|███████▏  | 4270/5971 [42:02<16:44,  1.69it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 26.48it/s][A

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 26.40it/s][A
Epoch 6:  72%|███████▏  | 4274/5971 [42:02<16:41,  1.69it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 27.58it/s][A
Epoch 6:  72%|███████▏  | 4278/5971 [42:03<16:38,  1.70it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 27.23it/s][A
Epoch 6:  72%|███████▏  | 4282/5971 [42:03<16:35,  1.70it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 27.38it/s][A
Epoch 6:  72%|███████▏  | 4286/5971 [42:03<16:31,  1.70it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating: 100%|██████████| 167/167 [00:06<00:00, 28.66it/s][A
Epoch 6:  72%|███████▏  | 4288/5971 [42:03<16:30,  1.70it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.1e-5, train/loss_step=0.0115, global_step=3849.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Epoch 6:  72%|███████▏  | 4289/5971 [42:04<16:29,  1.70it/s, loss=0.204, v_num=0, train/loss_simple_step=0.00745, train/loss_vlb_step=3.45e-5, train/loss_step=0.00745, global_step=3850.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4290/5971 [42:05<16:29,  1.70it/s, loss=0.204, v_num=0, train/loss_simple_step=0.00745, train/loss_vlb_step=3.45e-5, train/loss_step=0.00745, global_step=3850.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4290/5971 [42:05<16:29,  1.70it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0044, train/loss_vlb_step=2.29e-5, train/loss_step=0.0044, global_step=3850.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  72%|███████▏  | 4291/5971 [42:06<16:28,  1.70it/s, loss=0.211, v_num=0, train/loss_simple_step=0.368, train/loss_vlb_step=0.00169, train/loss_step=0.368, global_step=3850.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  72%|███████▏  | 4292/5971 [42:08<16:29,  1.70it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00836, train/loss_vlb_step=4.02e-5, train/loss_step=0.00836, global_step=3850.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4293/5971 [42:09<16:28,  1.70it/s, loss=0.171, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.00052, train/loss_step=0.157, global_step=3851.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  72%|███████▏  | 4294/5971 [42:10<16:28,  1.70it/s, loss=0.171, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.00052, train/loss_step=0.157, global_step=3851.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4294/5971 [42:10<16:28,  1.70it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0846, train/loss_vlb_step=0.000287, train/loss_step=0.0846, global_step=3851.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4295/5971 [42:11<16:27,  1.70it/s, loss=0.158, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000333, train/loss_step=0.101, global_step=3851.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  72%|███████▏  | 4296/5971 [42:13<16:27,  1.70it/s, loss=0.161, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.000649, train/loss_step=0.186, global_step=3851.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4297/5971 [42:14<16:27,  1.70it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0685, train/loss_vlb_step=0.00023, train/loss_step=0.0685, global_step=3852.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4298/5971 [42:15<16:26,  1.70it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0685, train/loss_vlb_step=0.00023, train/loss_step=0.0685, global_step=3852.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4298/5971 [42:15<16:26,  1.70it/s, loss=0.166, v_num=0, train/loss_simple_step=0.361, train/loss_vlb_step=0.00226, train/loss_step=0.361, global_step=3852.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  72%|███████▏  | 4299/5971 [42:16<16:26,  1.70it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0993, train/loss_vlb_step=0.000326, train/loss_step=0.0993, global_step=3852.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4300/5971 [42:19<16:26,  1.69it/s, loss=0.162, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.000618, train/loss_step=0.176, global_step=3852.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  72%|███████▏  | 4301/5971 [42:20<16:26,  1.69it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00922, train/loss_vlb_step=4.14e-5, train/loss_step=0.00922, global_step=3853.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4302/5971 [42:20<16:25,  1.69it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00922, train/loss_vlb_step=4.14e-5, train/loss_step=0.00922, global_step=3853.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4302/5971 [42:20<16:25,  1.69it/s, loss=0.183, v_num=0, train/loss_simple_step=0.430, train/loss_vlb_step=0.00257, train/loss_step=0.430, global_step=3853.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  72%|███████▏  | 4303/5971 [42:21<16:25,  1.69it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0778, train/loss_vlb_step=0.000258, train/loss_step=0.0778, global_step=3853.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4304/5971 [42:24<16:25,  1.69it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0558, train/loss_vlb_step=0.000188, train/loss_step=0.0558, global_step=3853.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4305/5971 [42:25<16:24,  1.69it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0278, train/loss_vlb_step=0.000109, train/loss_step=0.0278, global_step=3854.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4306/5971 [42:26<16:24,  1.69it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0278, train/loss_vlb_step=0.000109, train/loss_step=0.0278, global_step=3854.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4306/5971 [42:26<16:24,  1.69it/s, loss=0.124, v_num=0, train/loss_simple_step=0.182, train/loss_vlb_step=0.000613, train/loss_step=0.182, global_step=3854.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  72%|███████▏  | 4307/5971 [42:26<16:23,  1.69it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0238, train/loss_vlb_step=0.0001, train/loss_step=0.0238, global_step=3854.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4308/5971 [42:29<16:23,  1.69it/s, loss=0.133, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.000915, train/loss_step=0.234, global_step=3854.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4309/5971 [42:30<16:23,  1.69it/s, loss=0.134, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.63e-5, train/loss_step=0.016, global_step=3855.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  72%|███████▏  | 4310/5971 [42:31<16:22,  1.69it/s, loss=0.134, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.63e-5, train/loss_step=0.016, global_step=3855.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4310/5971 [42:31<16:22,  1.69it/s, loss=0.143, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000668, train/loss_step=0.203, global_step=3855.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4311/5971 [42:31<16:22,  1.69it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000169, train/loss_step=0.0497, global_step=3855.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4312/5971 [42:34<16:22,  1.69it/s, loss=0.128, v_num=0, train/loss_simple_step=0.015, train/loss_vlb_step=6.37e-5, train/loss_step=0.015, global_step=3855.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  72%|███████▏  | 4313/5971 [42:35<16:22,  1.69it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0393, train/loss_vlb_step=0.000135, train/loss_step=0.0393, global_step=3856.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4314/5971 [42:36<16:21,  1.69it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0393, train/loss_vlb_step=0.000135, train/loss_step=0.0393, global_step=3856.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4314/5971 [42:36<16:21,  1.69it/s, loss=0.135, v_num=0, train/loss_simple_step=0.343, train/loss_vlb_step=0.00147, train/loss_step=0.343, global_step=3856.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  72%|███████▏  | 4315/5971 [42:36<16:21,  1.69it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.72e-5, train/loss_step=0.0105, global_step=3856.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4316/5971 [42:39<16:21,  1.69it/s, loss=0.158, v_num=0, train/loss_simple_step=0.733, train/loss_vlb_step=0.0318, train/loss_step=0.733, global_step=3856.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  72%|███████▏  | 4317/5971 [42:40<16:20,  1.69it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.66e-5, train/loss_step=0.00303, global_step=3857.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4318/5971 [42:40<16:20,  1.69it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00303, train/loss_vlb_step=1.66e-5, train/loss_step=0.00303, global_step=3857.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4318/5971 [42:40<16:20,  1.69it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0022, train/loss_vlb_step=1.25e-5, train/loss_step=0.0022, global_step=3857.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  72%|███████▏  | 4319/5971 [42:41<16:19,  1.69it/s, loss=0.14, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000551, train/loss_step=0.165, global_step=3857.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  72%|███████▏  | 4320/5971 [42:43<16:19,  1.69it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0391, train/loss_vlb_step=0.000143, train/loss_step=0.0391, global_step=3857.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4321/5971 [42:44<16:19,  1.69it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00595, train/loss_vlb_step=2.81e-5, train/loss_step=0.00595, global_step=3858.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4322/5971 [42:45<16:18,  1.68it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00595, train/loss_vlb_step=2.81e-5, train/loss_step=0.00595, global_step=3858.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4322/5971 [42:45<16:18,  1.68it/s, loss=0.117, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000374, train/loss_step=0.113, global_step=3858.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  72%|███████▏  | 4323/5971 [42:46<16:18,  1.68it/s, loss=0.138, v_num=0, train/loss_simple_step=0.506, train/loss_vlb_step=0.00518, train/loss_step=0.506, global_step=3858.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  72%|███████▏  | 4324/5971 [42:48<16:18,  1.68it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0675, train/loss_vlb_step=0.000228, train/loss_step=0.0675, global_step=3858.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4325/5971 [42:49<16:17,  1.68it/s, loss=0.155, v_num=0, train/loss_simple_step=0.351, train/loss_vlb_step=0.00183, train/loss_step=0.351, global_step=3859.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  72%|███████▏  | 4326/5971 [42:50<16:17,  1.68it/s, loss=0.155, v_num=0, train/loss_simple_step=0.351, train/loss_vlb_step=0.00183, train/loss_step=0.351, global_step=3859.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4326/5971 [42:50<16:17,  1.68it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0355, train/loss_vlb_step=0.000139, train/loss_step=0.0355, global_step=3859.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  72%|███████▏  | 4327/5971 [42:51<16:16,  1.68it/s, loss=0.153, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000461, train/loss_step=0.139, global_step=3859.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  72%|███████▏  | 4328/5971 [42:53<16:16,  1.68it/s, loss=0.167, v_num=0, train/loss_simple_step=0.504, train/loss_vlb_step=0.00356, train/loss_step=0.504, global_step=3859.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  73%|███████▎  | 4329/5971 [42:54<16:16,  1.68it/s, loss=0.174, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000538, train/loss_step=0.155, global_step=3860.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4330/5971 [42:55<16:15,  1.68it/s, loss=0.174, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000538, train/loss_step=0.155, global_step=3860.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4330/5971 [42:55<16:15,  1.68it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0319, train/loss_vlb_step=0.000118, train/loss_step=0.0319, global_step=3860.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4331/5971 [42:56<16:15,  1.68it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0393, train/loss_vlb_step=0.00014, train/loss_step=0.0393, global_step=3860.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  73%|███████▎  | 4332/5971 [42:58<16:15,  1.68it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0346, train/loss_vlb_step=0.000127, train/loss_step=0.0346, global_step=3860.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4333/5971 [42:59<16:14,  1.68it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0372, train/loss_vlb_step=0.000141, train/loss_step=0.0372, global_step=3861.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4334/5971 [43:00<16:14,  1.68it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0372, train/loss_vlb_step=0.000141, train/loss_step=0.0372, global_step=3861.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4334/5971 [43:00<16:14,  1.68it/s, loss=0.165, v_num=0, train/loss_simple_step=0.330, train/loss_vlb_step=0.00128, train/loss_step=0.330, global_step=3861.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  73%|███████▎  | 4335/5971 [43:01<16:13,  1.68it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0374, train/loss_vlb_step=0.000132, train/loss_step=0.0374, global_step=3861.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4336/5971 [43:03<16:13,  1.68it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00286, train/loss_vlb_step=1.61e-5, train/loss_step=0.00286, global_step=3861.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4337/5971 [43:04<16:13,  1.68it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00277, train/loss_vlb_step=1.53e-5, train/loss_step=0.00277, global_step=3862.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4338/5971 [43:05<16:13,  1.68it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00277, train/loss_vlb_step=1.53e-5, train/loss_step=0.00277, global_step=3862.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4338/5971 [43:05<16:13,  1.68it/s, loss=0.138, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000587, train/loss_step=0.168, global_step=3862.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  73%|███████▎  | 4339/5971 [43:06<16:12,  1.68it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0509, train/loss_vlb_step=0.000184, train/loss_step=0.0509, global_step=3862.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4340/5971 [43:08<16:12,  1.68it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0254, train/loss_vlb_step=9.5e-5, train/loss_step=0.0254, global_step=3862.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  73%|███████▎  | 4341/5971 [43:09<16:12,  1.68it/s, loss=0.137, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000355, train/loss_step=0.107, global_step=3863.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4342/5971 [43:10<16:11,  1.68it/s, loss=0.137, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000355, train/loss_step=0.107, global_step=3863.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4342/5971 [43:10<16:11,  1.68it/s, loss=0.139, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000513, train/loss_step=0.156, global_step=3863.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4343/5971 [43:11<16:11,  1.68it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00414, train/loss_vlb_step=2.22e-5, train/loss_step=0.00414, global_step=3863.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4344/5971 [43:13<16:11,  1.68it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00131, train/loss_vlb_step=7.75e-6, train/loss_step=0.00131, global_step=3863.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4345/5971 [43:14<16:10,  1.68it/s, loss=0.0945, v_num=0, train/loss_simple_step=0.0264, train/loss_vlb_step=9.61e-5, train/loss_step=0.0264, global_step=3864.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  73%|███████▎  | 4346/5971 [43:14<16:10,  1.68it/s, loss=0.0945, v_num=0, train/loss_simple_step=0.0264, train/loss_vlb_step=9.61e-5, train/loss_step=0.0264, global_step=3864.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4346/5971 [43:14<16:10,  1.68it/s, loss=0.093, v_num=0, train/loss_simple_step=0.00591, train/loss_vlb_step=3.13e-5, train/loss_step=0.00591, global_step=3864.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4347/5971 [43:15<16:09,  1.67it/s, loss=0.0896, v_num=0, train/loss_simple_step=0.0716, train/loss_vlb_step=0.000235, train/loss_step=0.0716, global_step=3864.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4348/5971 [43:18<16:09,  1.67it/s, loss=0.0692, v_num=0, train/loss_simple_step=0.0967, train/loss_vlb_step=0.000322, train/loss_step=0.0967, global_step=3864.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4349/5971 [43:18<16:09,  1.67it/s, loss=0.0642, v_num=0, train/loss_simple_step=0.0533, train/loss_vlb_step=0.000184, train/loss_step=0.0533, global_step=3865.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4350/5971 [43:19<16:08,  1.67it/s, loss=0.0642, v_num=0, train/loss_simple_step=0.0533, train/loss_vlb_step=0.000184, train/loss_step=0.0533, global_step=3865.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4350/5971 [43:19<16:08,  1.67it/s, loss=0.0641, v_num=0, train/loss_simple_step=0.0307, train/loss_vlb_step=0.000108, train/loss_step=0.0307, global_step=3865.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4351/5971 [43:20<16:08,  1.67it/s, loss=0.0715, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.000656, train/loss_step=0.187, global_step=3865.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  73%|███████▎  | 4352/5971 [43:23<16:08,  1.67it/s, loss=0.0703, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=5.15e-5, train/loss_step=0.0111, global_step=3865.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4353/5971 [43:24<16:07,  1.67it/s, loss=0.0721, v_num=0, train/loss_simple_step=0.0742, train/loss_vlb_step=0.000248, train/loss_step=0.0742, global_step=3866.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4354/5971 [43:24<16:07,  1.67it/s, loss=0.0721, v_num=0, train/loss_simple_step=0.0742, train/loss_vlb_step=0.000248, train/loss_step=0.0742, global_step=3866.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4354/5971 [43:24<16:07,  1.67it/s, loss=0.0698, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.00107, train/loss_step=0.283, global_step=3866.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  73%|███████▎  | 4355/5971 [43:25<16:06,  1.67it/s, loss=0.0805, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00104, train/loss_step=0.253, global_step=3866.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4356/5971 [43:27<16:06,  1.67it/s, loss=0.0815, v_num=0, train/loss_simple_step=0.0222, train/loss_vlb_step=8.65e-5, train/loss_step=0.0222, global_step=3866.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4357/5971 [43:28<16:06,  1.67it/s, loss=0.0814, v_num=0, train/loss_simple_step=0.00141, train/loss_vlb_step=8.51e-6, train/loss_step=0.00141, global_step=3867.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4358/5971 [43:29<16:05,  1.67it/s, loss=0.0814, v_num=0, train/loss_simple_step=0.00141, train/loss_vlb_step=8.51e-6, train/loss_step=0.00141, global_step=3867.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4358/5971 [43:29<16:05,  1.67it/s, loss=0.0996, v_num=0, train/loss_simple_step=0.531, train/loss_vlb_step=0.00617, train/loss_step=0.531, global_step=3867.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  73%|███████▎  | 4359/5971 [43:30<16:05,  1.67it/s, loss=0.108, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000753, train/loss_step=0.213, global_step=3867.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4360/5971 [43:32<16:05,  1.67it/s, loss=0.107, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=4.67e-5, train/loss_step=0.011, global_step=3867.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  73%|███████▎  | 4361/5971 [43:33<16:04,  1.67it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0777, train/loss_vlb_step=0.000263, train/loss_step=0.0777, global_step=3868.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4362/5971 [43:34<16:04,  1.67it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0777, train/loss_vlb_step=0.000263, train/loss_step=0.0777, global_step=3868.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4362/5971 [43:34<16:04,  1.67it/s, loss=0.0981, v_num=0, train/loss_simple_step=0.00909, train/loss_vlb_step=4.33e-5, train/loss_step=0.00909, global_step=3868.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4363/5971 [43:35<16:03,  1.67it/s, loss=0.0998, v_num=0, train/loss_simple_step=0.0381, train/loss_vlb_step=0.000142, train/loss_step=0.0381, global_step=3868.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  73%|███████▎  | 4364/5971 [43:37<16:03,  1.67it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0682, train/loss_vlb_step=0.000224, train/loss_step=0.0682, global_step=3868.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  73%|███████▎  | 4365/5971 [43:38<16:03,  1.67it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00346, train/loss_vlb_step=1.89e-5, train/loss_step=0.00346, global_step=3869.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4366/5971 [43:39<16:02,  1.67it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00346, train/loss_vlb_step=1.89e-5, train/loss_step=0.00346, global_step=3869.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4366/5971 [43:39<16:02,  1.67it/s, loss=0.117, v_num=0, train/loss_simple_step=0.298, train/loss_vlb_step=0.00151, train/loss_step=0.298, global_step=3869.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  73%|███████▎  | 4367/5971 [43:40<16:02,  1.67it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=5.59e-5, train/loss_step=0.0136, global_step=3869.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4368/5971 [43:42<16:02,  1.67it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0973, train/loss_vlb_step=0.000321, train/loss_step=0.0973, global_step=3869.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4369/5971 [43:43<16:01,  1.67it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00261, train/loss_vlb_step=1.41e-5, train/loss_step=0.00261, global_step=3870.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4370/5971 [43:44<16:01,  1.67it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00261, train/loss_vlb_step=1.41e-5, train/loss_step=0.00261, global_step=3870.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4370/5971 [43:44<16:01,  1.67it/s, loss=0.119, v_num=0, train/loss_simple_step=0.178, train/loss_vlb_step=0.000616, train/loss_step=0.178, global_step=3870.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  73%|███████▎  | 4371/5971 [43:44<16:00,  1.67it/s, loss=0.116, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000544, train/loss_step=0.131, global_step=3870.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4372/5971 [43:47<16:00,  1.66it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0892, train/loss_vlb_step=0.000293, train/loss_step=0.0892, global_step=3870.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4373/5971 [43:48<16:00,  1.66it/s, loss=0.134, v_num=0, train/loss_simple_step=0.358, train/loss_vlb_step=0.00196, train/loss_step=0.358, global_step=3871.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  73%|███████▎  | 4374/5971 [43:49<15:59,  1.66it/s, loss=0.134, v_num=0, train/loss_simple_step=0.358, train/loss_vlb_step=0.00196, train/loss_step=0.358, global_step=3871.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4374/5971 [43:49<15:59,  1.66it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0286, train/loss_vlb_step=0.000106, train/loss_step=0.0286, global_step=3871.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4375/5971 [43:49<15:59,  1.66it/s, loss=0.127, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00175, train/loss_step=0.363, global_step=3871.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  73%|███████▎  | 4376/5971 [43:52<15:59,  1.66it/s, loss=0.131, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000388, train/loss_step=0.117, global_step=3871.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4377/5971 [43:52<15:58,  1.66it/s, loss=0.145, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00147, train/loss_step=0.282, global_step=3872.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  73%|███████▎  | 4378/5971 [43:53<15:58,  1.66it/s, loss=0.145, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00147, train/loss_step=0.282, global_step=3872.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4378/5971 [43:53<15:58,  1.66it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0048, train/loss_vlb_step=2.49e-5, train/loss_step=0.0048, global_step=3872.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4379/5971 [43:54<15:57,  1.66it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00132, train/loss_vlb_step=7.97e-6, train/loss_step=0.00132, global_step=3872.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4380/5971 [43:57<15:57,  1.66it/s, loss=0.115, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000456, train/loss_step=0.138, global_step=3872.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  73%|███████▎  | 4381/5971 [43:57<15:57,  1.66it/s, loss=0.116, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=3873.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4382/5971 [43:58<15:56,  1.66it/s, loss=0.116, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000338, train/loss_step=0.103, global_step=3873.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4382/5971 [43:58<15:56,  1.66it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00978, train/loss_vlb_step=4.27e-5, train/loss_step=0.00978, global_step=3873.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4383/5971 [43:59<15:56,  1.66it/s, loss=0.132, v_num=0, train/loss_simple_step=0.351, train/loss_vlb_step=0.00183, train/loss_step=0.351, global_step=3873.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  73%|███████▎  | 4384/5971 [44:01<15:56,  1.66it/s, loss=0.153, v_num=0, train/loss_simple_step=0.498, train/loss_vlb_step=0.00333, train/loss_step=0.498, global_step=3873.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4385/5971 [44:02<15:55,  1.66it/s, loss=0.163, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000815, train/loss_step=0.202, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4386/5971 [44:03<15:55,  1.66it/s, loss=0.163, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000815, train/loss_step=0.202, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4386/5971 [44:03<15:55,  1.66it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.27e-5, train/loss_step=0.0125, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4387/5971 [44:04<15:54,  1.66it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0525, train/loss_vlb_step=0.00018, train/loss_step=0.0525, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  73%|███████▎  | 4388/5971 [44:06<15:54,  1.66it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:09,  2.38it/s][A
Epoch 6:  74%|███████▎  | 4390/5971 [44:07<15:53,  1.66it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   1%|          | 2/167 [00:00<00:48,  3.43it/s][A

Validating:   3%|▎         | 5/167 [00:00<00:18,  8.89it/s][A
Epoch 6:  74%|███████▎  | 4394/5971 [44:07<15:49,  1.66it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   5%|▍         | 8/167 [00:00<00:12, 13.13it/s][A
Epoch 6:  74%|███████▎  | 4398/5971 [44:07<15:46,  1.66it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.75it/s][A
Epoch 6:  74%|███████▎  | 4402/5971 [44:07<15:43,  1.66it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   8%|▊         | 14/167 [00:01<00:08, 18.96it/s][A

Validating:  10%|█         | 17/167 [00:01<00:07, 20.80it/s][A
Epoch 6:  74%|███████▍  | 4406/5971 [44:07<15:40,  1.66it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.01it/s][A
Epoch 6:  74%|███████▍  | 4410/5971 [44:08<15:37,  1.67it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 22.76it/s][A
Epoch 6:  74%|███████▍  | 4414/5971 [44:08<15:33,  1.67it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  16%|█▌        | 26/167 [00:01<00:06, 23.13it/s][A

Validating:  17%|█▋        | 29/167 [00:01<00:05, 24.37it/s][A
Epoch 6:  74%|███████▍  | 4418/5971 [44:08<15:30,  1.67it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 25.38it/s][A
Epoch 6:  74%|███████▍  | 4422/5971 [44:08<15:27,  1.67it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  21%|██        | 35/167 [00:01<00:05, 25.00it/s][A
Epoch 6:  74%|███████▍  | 4426/5971 [44:08<15:24,  1.67it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  23%|██▎       | 38/167 [00:02<00:05, 25.56it/s][A

Validating:  25%|██▍       | 41/167 [00:02<00:05, 24.97it/s][A
Epoch 6:  74%|███████▍  | 4430/5971 [44:08<15:21,  1.67it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 25.16it/s][A
Epoch 6:  74%|███████▍  | 4434/5971 [44:08<15:18,  1.67it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 25.97it/s][A
Epoch 6:  74%|███████▍  | 4438/5971 [44:09<15:14,  1.68it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 26.73it/s][A

Validating:  32%|███▏      | 53/167 [00:02<00:04, 27.03it/s][A
Epoch 6:  74%|███████▍  | 4442/5971 [44:09<15:11,  1.68it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 26.72it/s][A
Epoch 6:  74%|███████▍  | 4446/5971 [44:09<15:08,  1.68it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  35%|███▌      | 59/167 [00:02<00:03, 27.15it/s][A
Epoch 6:  75%|███████▍  | 4450/5971 [44:09<15:05,  1.68it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  37%|███▋      | 62/167 [00:02<00:03, 27.30it/s][A

Validating:  39%|███▉      | 65/167 [00:03<00:03, 27.87it/s][A
Epoch 6:  75%|███████▍  | 4454/5971 [44:09<15:02,  1.68it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 28.93it/s][A
Epoch 6:  75%|███████▍  | 4458/5971 [44:09<14:59,  1.68it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 29.45it/s][A
Epoch 6:  75%|███████▍  | 4462/5971 [44:09<14:55,  1.68it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 28.70it/s][A
Epoch 6:  75%|███████▍  | 4466/5971 [44:10<14:52,  1.69it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 28.54it/s][A
Epoch 6:  75%|███████▍  | 4470/5971 [44:10<14:49,  1.69it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  50%|████▉     | 83/167 [00:03<00:02, 29.94it/s][A
Epoch 6:  75%|███████▍  | 4474/5971 [44:10<14:46,  1.69it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  51%|█████▏    | 86/167 [00:03<00:02, 28.95it/s][A
Epoch 6:  75%|███████▍  | 4478/5971 [44:10<14:43,  1.69it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  54%|█████▍    | 90/167 [00:03<00:02, 29.30it/s][A

Validating:  56%|█████▌    | 93/167 [00:03<00:02, 28.17it/s][A
Epoch 6:  75%|███████▌  | 4482/5971 [44:10<14:40,  1.69it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 28.98it/s][A
Epoch 6:  75%|███████▌  | 4486/5971 [44:10<14:37,  1.69it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 28.16it/s][A
Epoch 6:  75%|███████▌  | 4490/5971 [44:10<14:34,  1.69it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 28.34it/s][A
Epoch 6:  75%|███████▌  | 4494/5971 [44:11<14:31,  1.70it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 27.49it/s][A

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 27.51it/s][A
Epoch 6:  75%|███████▌  | 4498/5971 [44:11<14:28,  1.70it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  67%|██████▋   | 112/167 [00:04<00:02, 26.82it/s][A
Epoch 6:  75%|███████▌  | 4502/5971 [44:11<14:24,  1.70it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  69%|██████▉   | 115/167 [00:04<00:01, 26.68it/s][A
Epoch 6:  75%|███████▌  | 4506/5971 [44:11<14:21,  1.70it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  71%|███████   | 118/167 [00:04<00:01, 27.17it/s][A

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 27.61it/s][A
Epoch 6:  76%|███████▌  | 4510/5971 [44:11<14:18,  1.70it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 27.01it/s][A
Epoch 6:  76%|███████▌  | 4514/5971 [44:11<14:15,  1.70it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 27.45it/s][A
Epoch 6:  76%|███████▌  | 4518/5971 [44:11<14:12,  1.70it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 27.06it/s][A

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 26.14it/s][A
Epoch 6:  76%|███████▌  | 4522/5971 [44:12<14:09,  1.71it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 27.18it/s][A
Epoch 6:  76%|███████▌  | 4526/5971 [44:12<14:06,  1.71it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 25.81it/s][A
Epoch 6:  76%|███████▌  | 4530/5971 [44:12<14:03,  1.71it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  85%|████████▌ | 142/167 [00:05<00:00, 26.53it/s][A

Validating:  87%|████████▋ | 145/167 [00:05<00:00, 25.56it/s][A
Epoch 6:  76%|███████▌  | 4534/5971 [44:12<14:00,  1.71it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 26.39it/s][A
Epoch 6:  76%|███████▌  | 4538/5971 [44:12<13:57,  1.71it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 25.95it/s][A
Epoch 6:  76%|███████▌  | 4542/5971 [44:12<13:54,  1.71it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 26.25it/s][A
Epoch 6:  76%|███████▌  | 4546/5971 [44:13<13:51,  1.71it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 26.38it/s][A

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 26.46it/s][A
Epoch 6:  76%|███████▌  | 4550/5971 [44:13<13:48,  1.72it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 26.76it/s][A
Epoch 6:  76%|███████▋  | 4554/5971 [44:13<13:45,  1.72it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating: 100%|██████████| 167/167 [00:06<00:00, 26.00it/s][A
Epoch 6:  76%|███████▋  | 4556/5971 [44:13<13:44,  1.72it/s, loss=0.169, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00325, train/loss_step=0.452, global_step=3874.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Epoch 6:  76%|███████▋  | 4557/5971 [44:14<13:43,  1.72it/s, loss=0.179, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000822, train/loss_step=0.217, global_step=3875.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  76%|███████▋  | 4558/5971 [44:15<13:43,  1.72it/s, loss=0.179, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000822, train/loss_step=0.217, global_step=3875.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  76%|███████▋  | 4558/5971 [44:15<13:43,  1.72it/s, loss=0.19, v_num=0, train/loss_simple_step=0.381, train/loss_vlb_step=0.00267, train/loss_step=0.381, global_step=3875.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  76%|███████▋  | 4559/5971 [44:16<13:42,  1.72it/s, loss=0.216, v_num=0, train/loss_simple_step=0.651, train/loss_vlb_step=0.0192, train/loss_step=0.651, global_step=3875.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  76%|███████▋  | 4560/5971 [44:18<13:42,  1.72it/s, loss=0.223, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.000879, train/loss_step=0.243, global_step=3875.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  76%|███████▋  | 4561/5971 [44:19<13:42,  1.72it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0018, train/loss_vlb_step=1.05e-5, train/loss_step=0.0018, global_step=3876.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  76%|███████▋  | 4562/5971 [44:20<13:41,  1.72it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0018, train/loss_vlb_step=1.05e-5, train/loss_step=0.0018, global_step=3876.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  76%|███████▋  | 4562/5971 [44:20<13:41,  1.72it/s, loss=0.208, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000238, train/loss_step=0.0699, global_step=3876.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  76%|███████▋  | 4563/5971 [44:21<13:41,  1.71it/s, loss=0.2, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.00074, train/loss_step=0.210, global_step=3876.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:  76%|███████▋  | 4564/5971 [44:23<13:40,  1.71it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0437, train/loss_vlb_step=0.00015, train/loss_step=0.0437, global_step=3876.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  76%|███████▋  | 4565/5971 [44:24<13:40,  1.71it/s, loss=0.182, v_num=0, train/loss_simple_step=0.00133, train/loss_vlb_step=8.11e-6, train/loss_step=0.00133, global_step=3877.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  76%|███████▋  | 4566/5971 [44:25<13:39,  1.71it/s, loss=0.182, v_num=0, train/loss_simple_step=0.00133, train/loss_vlb_step=8.11e-6, train/loss_step=0.00133, global_step=3877.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  76%|███████▋  | 4566/5971 [44:25<13:39,  1.71it/s, loss=0.188, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000381, train/loss_step=0.115, global_step=3877.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  76%|███████▋  | 4567/5971 [44:26<13:39,  1.71it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0158, train/loss_vlb_step=6.53e-5, train/loss_step=0.0158, global_step=3877.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4568/5971 [44:28<13:39,  1.71it/s, loss=0.191, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000693, train/loss_step=0.190, global_step=3877.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  77%|███████▋  | 4569/5971 [44:29<13:38,  1.71it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0237, train/loss_vlb_step=8.54e-5, train/loss_step=0.0237, global_step=3878.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4570/5971 [44:30<13:38,  1.71it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0237, train/loss_vlb_step=8.54e-5, train/loss_step=0.0237, global_step=3878.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4570/5971 [44:30<13:38,  1.71it/s, loss=0.194, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000473, train/loss_step=0.138, global_step=3878.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  77%|███████▋  | 4571/5971 [44:30<13:37,  1.71it/s, loss=0.179, v_num=0, train/loss_simple_step=0.057, train/loss_vlb_step=0.000205, train/loss_step=0.057, global_step=3878.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4572/5971 [44:33<13:37,  1.71it/s, loss=0.16, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000422, train/loss_step=0.128, global_step=3878.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  77%|███████▋  | 4573/5971 [44:34<13:37,  1.71it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0631, train/loss_vlb_step=0.000217, train/loss_step=0.0631, global_step=3879.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4574/5971 [44:34<13:36,  1.71it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0631, train/loss_vlb_step=0.000217, train/loss_step=0.0631, global_step=3879.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4574/5971 [44:34<13:36,  1.71it/s, loss=0.175, v_num=0, train/loss_simple_step=0.443, train/loss_vlb_step=0.00315, train/loss_step=0.443, global_step=3879.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  77%|███████▋  | 4575/5971 [44:35<13:36,  1.71it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00915, train/loss_vlb_step=4.08e-5, train/loss_step=0.00915, global_step=3879.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4576/5971 [44:38<13:36,  1.71it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00318, train/loss_vlb_step=1.7e-5, train/loss_step=0.00318, global_step=3879.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  77%|███████▋  | 4577/5971 [44:39<13:35,  1.71it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00737, train/loss_vlb_step=3.55e-5, train/loss_step=0.00737, global_step=3880.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4578/5971 [44:40<13:35,  1.71it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00737, train/loss_vlb_step=3.55e-5, train/loss_step=0.00737, global_step=3880.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4578/5971 [44:40<13:35,  1.71it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0699, train/loss_vlb_step=0.000232, train/loss_step=0.0699, global_step=3880.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4579/5971 [44:41<13:34,  1.71it/s, loss=0.0928, v_num=0, train/loss_simple_step=0.0231, train/loss_vlb_step=9.04e-5, train/loss_step=0.0231, global_step=3880.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4580/5971 [44:43<13:34,  1.71it/s, loss=0.0876, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000495, train/loss_step=0.139, global_step=3880.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  77%|███████▋  | 4581/5971 [44:44<13:34,  1.71it/s, loss=0.0925, v_num=0, train/loss_simple_step=0.0993, train/loss_vlb_step=0.000328, train/loss_step=0.0993, global_step=3881.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4582/5971 [44:44<13:33,  1.71it/s, loss=0.0925, v_num=0, train/loss_simple_step=0.0993, train/loss_vlb_step=0.000328, train/loss_step=0.0993, global_step=3881.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4582/5971 [44:44<13:33,  1.71it/s, loss=0.0891, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.65e-5, train/loss_step=0.00293, global_step=3881.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4583/5971 [44:45<13:33,  1.71it/s, loss=0.0802, v_num=0, train/loss_simple_step=0.0307, train/loss_vlb_step=0.000124, train/loss_step=0.0307, global_step=3881.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  77%|███████▋  | 4584/5971 [44:48<13:33,  1.71it/s, loss=0.0795, v_num=0, train/loss_simple_step=0.0303, train/loss_vlb_step=0.000112, train/loss_step=0.0303, global_step=3881.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4585/5971 [44:48<13:32,  1.71it/s, loss=0.0928, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00118, train/loss_step=0.267, global_step=3882.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  77%|███████▋  | 4586/5971 [44:49<13:32,  1.71it/s, loss=0.0928, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00118, train/loss_step=0.267, global_step=3882.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4586/5971 [44:49<13:32,  1.71it/s, loss=0.0873, v_num=0, train/loss_simple_step=0.0054, train/loss_vlb_step=2.86e-5, train/loss_step=0.0054, global_step=3882.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4587/5971 [44:50<13:31,  1.71it/s, loss=0.0879, v_num=0, train/loss_simple_step=0.0284, train/loss_vlb_step=0.000104, train/loss_step=0.0284, global_step=3882.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4588/5971 [44:52<13:31,  1.70it/s, loss=0.0789, v_num=0, train/loss_simple_step=0.00956, train/loss_vlb_step=4.15e-5, train/loss_step=0.00956, global_step=3882.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4589/5971 [44:53<13:31,  1.70it/s, loss=0.0863, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000604, train/loss_step=0.172, global_step=3883.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  77%|███████▋  | 4590/5971 [44:54<13:30,  1.70it/s, loss=0.0863, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000604, train/loss_step=0.172, global_step=3883.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4590/5971 [44:54<13:30,  1.70it/s, loss=0.0823, v_num=0, train/loss_simple_step=0.0574, train/loss_vlb_step=0.000206, train/loss_step=0.0574, global_step=3883.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4591/5971 [44:55<13:30,  1.70it/s, loss=0.0797, v_num=0, train/loss_simple_step=0.00559, train/loss_vlb_step=2.67e-5, train/loss_step=0.00559, global_step=3883.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4592/5971 [44:57<13:29,  1.70it/s, loss=0.0734, v_num=0, train/loss_simple_step=0.00162, train/loss_vlb_step=9.75e-6, train/loss_step=0.00162, global_step=3883.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4593/5971 [44:58<13:29,  1.70it/s, loss=0.102, v_num=0, train/loss_simple_step=0.636, train/loss_vlb_step=0.0149, train/loss_step=0.636, global_step=3884.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]      
Epoch 6:  77%|███████▋  | 4594/5971 [44:59<13:28,  1.70it/s, loss=0.102, v_num=0, train/loss_simple_step=0.636, train/loss_vlb_step=0.0149, train/loss_step=0.636, global_step=3884.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4594/5971 [44:59<13:28,  1.70it/s, loss=0.081, v_num=0, train/loss_simple_step=0.0225, train/loss_vlb_step=9.3e-5, train/loss_step=0.0225, global_step=3884.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4595/5971 [45:00<13:28,  1.70it/s, loss=0.113, v_num=0, train/loss_simple_step=0.646, train/loss_vlb_step=0.0118, train/loss_step=0.646, global_step=3884.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  77%|███████▋  | 4596/5971 [45:02<13:28,  1.70it/s, loss=0.142, v_num=0, train/loss_simple_step=0.586, train/loss_vlb_step=0.00628, train/loss_step=0.586, global_step=3884.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4597/5971 [45:03<13:27,  1.70it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00181, train/loss_vlb_step=1.01e-5, train/loss_step=0.00181, global_step=3885.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4598/5971 [45:04<13:27,  1.70it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00181, train/loss_vlb_step=1.01e-5, train/loss_step=0.00181, global_step=3885.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4598/5971 [45:04<13:27,  1.70it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00182, train/loss_vlb_step=1.08e-5, train/loss_step=0.00182, global_step=3885.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4599/5971 [45:05<13:26,  1.70it/s, loss=0.15, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.00107, train/loss_step=0.251, global_step=3885.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:  77%|███████▋  | 4600/5971 [45:07<13:26,  1.70it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0214, train/loss_vlb_step=8.58e-5, train/loss_step=0.0214, global_step=3885.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4601/5971 [45:08<13:26,  1.70it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=6.76e-5, train/loss_step=0.0166, global_step=3886.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  77%|███████▋  | 4602/5971 [45:09<13:25,  1.70it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=6.76e-5, train/loss_step=0.0166, global_step=3886.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4602/5971 [45:09<13:25,  1.70it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0589, train/loss_vlb_step=0.000206, train/loss_step=0.0589, global_step=3886.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4603/5971 [45:10<13:25,  1.70it/s, loss=0.148, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000473, train/loss_step=0.141, global_step=3886.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  77%|███████▋  | 4604/5971 [45:12<13:25,  1.70it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0201, train/loss_vlb_step=8.16e-5, train/loss_step=0.0201, global_step=3886.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4605/5971 [45:13<13:24,  1.70it/s, loss=0.152, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00188, train/loss_step=0.353, global_step=3887.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  77%|███████▋  | 4606/5971 [45:14<13:24,  1.70it/s, loss=0.152, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00188, train/loss_step=0.353, global_step=3887.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4606/5971 [45:14<13:24,  1.70it/s, loss=0.161, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000676, train/loss_step=0.194, global_step=3887.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4607/5971 [45:15<13:23,  1.70it/s, loss=0.167, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000468, train/loss_step=0.141, global_step=3887.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4608/5971 [45:17<13:23,  1.70it/s, loss=0.179, v_num=0, train/loss_simple_step=0.246, train/loss_vlb_step=0.000961, train/loss_step=0.246, global_step=3887.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4609/5971 [45:18<13:23,  1.70it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0583, train/loss_vlb_step=0.000198, train/loss_step=0.0583, global_step=3888.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4610/5971 [45:19<13:22,  1.70it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0583, train/loss_vlb_step=0.000198, train/loss_step=0.0583, global_step=3888.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4610/5971 [45:19<13:22,  1.70it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00237, train/loss_vlb_step=1.24e-5, train/loss_step=0.00237, global_step=3888.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4611/5971 [45:19<13:22,  1.70it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0546, train/loss_vlb_step=0.000188, train/loss_step=0.0546, global_step=3888.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4612/5971 [45:22<13:21,  1.69it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0184, train/loss_vlb_step=7.39e-5, train/loss_step=0.0184, global_step=3888.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  77%|███████▋  | 4613/5971 [45:22<13:21,  1.69it/s, loss=0.158, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00129, train/loss_step=0.319, global_step=3889.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  77%|███████▋  | 4614/5971 [45:23<13:20,  1.69it/s, loss=0.158, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00129, train/loss_step=0.319, global_step=3889.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4614/5971 [45:23<13:20,  1.69it/s, loss=0.164, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.0005, train/loss_step=0.152, global_step=3889.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  77%|███████▋  | 4615/5971 [45:24<13:20,  1.69it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00617, train/loss_vlb_step=3.01e-5, train/loss_step=0.00617, global_step=3889.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4616/5971 [45:26<13:20,  1.69it/s, loss=0.123, v_num=0, train/loss_simple_step=0.405, train/loss_vlb_step=0.00259, train/loss_step=0.405, global_step=3889.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  77%|███████▋  | 4617/5971 [45:27<13:19,  1.69it/s, loss=0.158, v_num=0, train/loss_simple_step=0.703, train/loss_vlb_step=0.0305, train/loss_step=0.703, global_step=3890.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  77%|███████▋  | 4618/5971 [45:28<13:19,  1.69it/s, loss=0.158, v_num=0, train/loss_simple_step=0.703, train/loss_vlb_step=0.0305, train/loss_step=0.703, global_step=3890.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4618/5971 [45:28<13:19,  1.69it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00146, train/loss_vlb_step=8.47e-6, train/loss_step=0.00146, global_step=3890.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4619/5971 [45:29<13:18,  1.69it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00279, train/loss_vlb_step=1.61e-5, train/loss_step=0.00279, global_step=3890.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4620/5971 [45:31<13:18,  1.69it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0614, train/loss_vlb_step=0.000206, train/loss_step=0.0614, global_step=3890.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  77%|███████▋  | 4621/5971 [45:32<13:18,  1.69it/s, loss=0.152, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=3891.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  77%|███████▋  | 4622/5971 [45:33<13:17,  1.69it/s, loss=0.152, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000312, train/loss_step=0.095, global_step=3891.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4622/5971 [45:33<13:17,  1.69it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00239, train/loss_vlb_step=1.38e-5, train/loss_step=0.00239, global_step=3891.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4623/5971 [45:34<13:17,  1.69it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00945, train/loss_vlb_step=4.56e-5, train/loss_step=0.00945, global_step=3891.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4624/5971 [45:36<13:17,  1.69it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00271, train/loss_vlb_step=1.51e-5, train/loss_step=0.00271, global_step=3891.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4625/5971 [45:37<13:16,  1.69it/s, loss=0.134, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.0007, train/loss_step=0.197, global_step=3892.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:  77%|███████▋  | 4626/5971 [45:38<13:16,  1.69it/s, loss=0.134, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.0007, train/loss_step=0.197, global_step=3892.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4626/5971 [45:38<13:16,  1.69it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0226, train/loss_vlb_step=9.32e-5, train/loss_step=0.0226, global_step=3892.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  77%|███████▋  | 4627/5971 [45:39<13:15,  1.69it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0428, train/loss_vlb_step=0.00016, train/loss_step=0.0428, global_step=3892.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  78%|███████▊  | 4628/5971 [45:41<13:15,  1.69it/s, loss=0.118, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000757, train/loss_step=0.208, global_step=3892.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  78%|███████▊  | 4629/5971 [45:42<13:14,  1.69it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000466, train/loss_step=0.141, global_step=3893.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  78%|███████▊  | 4630/5971 [45:43<13:14,  1.69it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000466, train/loss_step=0.141, global_step=3893.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  78%|███████▊  | 4630/5971 [45:43<13:14,  1.69it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00785, train/loss_vlb_step=3.7e-5, train/loss_step=0.00785, global_step=3893.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  78%|███████▊  | 4631/5971 [45:44<13:13,  1.69it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0221, train/loss_vlb_step=8.54e-5, train/loss_step=0.0221, global_step=3893.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  78%|███████▊  | 4632/5971 [45:46<13:13,  1.69it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0906, train/loss_vlb_step=0.0003, train/loss_step=0.0906, global_step=3893.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  78%|███████▊  | 4633/5971 [45:47<13:13,  1.69it/s, loss=0.127, v_num=0, train/loss_simple_step=0.357, train/loss_vlb_step=0.0016, train/loss_step=0.357, global_step=3894.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  78%|███████▊  | 4634/5971 [45:48<13:12,  1.69it/s, loss=0.127, v_num=0, train/loss_simple_step=0.357, train/loss_vlb_step=0.0016, train/loss_step=0.357, global_step=3894.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  78%|███████▊  | 4634/5971 [45:48<13:12,  1.69it/s, loss=0.131, v_num=0, train/loss_simple_step=0.248, train/loss_vlb_step=0.000941, train/loss_step=0.248, global_step=3894.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  78%|███████▊  | 4635/5971 [45:48<13:12,  1.69it/s, loss=0.151, v_num=0, train/loss_simple_step=0.394, train/loss_vlb_step=0.00197, train/loss_step=0.394, global_step=3894.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  78%|███████▊  | 4636/5971 [45:51<13:12,  1.69it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0184, train/loss_vlb_step=7.66e-5, train/loss_step=0.0184, global_step=3894.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  78%|███████▊  | 4637/5971 [45:52<13:11,  1.69it/s, loss=0.104, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000594, train/loss_step=0.162, global_step=3895.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  78%|███████▊  | 4638/5971 [45:52<13:11,  1.69it/s, loss=0.104, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000594, train/loss_step=0.162, global_step=3895.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  78%|███████▊  | 4638/5971 [45:52<13:11,  1.69it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0172, train/loss_vlb_step=7e-5, train/loss_step=0.0172, global_step=3895.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  78%|███████▊  | 4639/5971 [45:53<13:10,  1.68it/s, loss=0.117, v_num=0, train/loss_simple_step=0.241, train/loss_vlb_step=0.000869, train/loss_step=0.241, global_step=3895.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  78%|███████▊  | 4640/5971 [45:55<13:10,  1.68it/s, loss=0.126, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.00107, train/loss_step=0.247, global_step=3895.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  78%|███████▊  | 4641/5971 [45:56<13:09,  1.68it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0123, train/loss_vlb_step=4.96e-5, train/loss_step=0.0123, global_step=3896.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  78%|███████▊  | 4642/5971 [45:57<13:09,  1.68it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0123, train/loss_vlb_step=4.96e-5, train/loss_step=0.0123, global_step=3896.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  78%|███████▊  | 4642/5971 [45:57<13:09,  1.68it/s, loss=0.144, v_num=0, train/loss_simple_step=0.440, train/loss_vlb_step=0.00377, train/loss_step=0.440, global_step=3896.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  78%|███████▊  | 4643/5971 [45:58<13:08,  1.68it/s, loss=0.151, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000462, train/loss_step=0.140, global_step=3896.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  78%|███████▊  | 4644/5971 [46:01<13:08,  1.68it/s, loss=0.159, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000636, train/loss_step=0.179, global_step=3896.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  78%|███████▊  | 4645/5971 [46:02<13:08,  1.68it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0643, train/loss_vlb_step=0.000219, train/loss_step=0.0643, global_step=3897.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  78%|███████▊  | 4646/5971 [46:03<13:07,  1.68it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0643, train/loss_vlb_step=0.000219, train/loss_step=0.0643, global_step=3897.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  78%|███████▊  | 4646/5971 [46:03<13:07,  1.68it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00451, train/loss_vlb_step=2.37e-5, train/loss_step=0.00451, global_step=3897.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  78%|███████▊  | 4647/5971 [46:03<13:07,  1.68it/s, loss=0.198, v_num=0, train/loss_simple_step=0.958, train/loss_vlb_step=0.482, train/loss_step=0.958, global_step=3897.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]      
Epoch 6:  78%|███████▊  | 4648/5971 [46:06<13:07,  1.68it/s, loss=0.187, v_num=0, train/loss_simple_step=0.00305, train/loss_vlb_step=1.65e-5, train/loss_step=0.00305, global_step=3897.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  78%|███████▊  | 4649/5971 [46:06<13:06,  1.68it/s, loss=0.194, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.00109, train/loss_step=0.283, global_step=3898.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  78%|███████▊  | 4650/5971 [46:07<13:06,  1.68it/s, loss=0.194, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.00109, train/loss_step=0.283, global_step=3898.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  78%|███████▊  | 4650/5971 [46:07<13:06,  1.68it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0492, train/loss_vlb_step=0.000177, train/loss_step=0.0492, global_step=3898.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  78%|███████▊  | 4651/5971 [46:08<13:05,  1.68it/s, loss=0.212, v_num=0, train/loss_simple_step=0.323, train/loss_vlb_step=0.00185, train/loss_step=0.323, global_step=3898.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  78%|███████▊  | 4652/5971 [46:11<13:05,  1.68it/s, loss=0.218, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000854, train/loss_step=0.224, global_step=3898.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  78%|███████▊  | 4653/5971 [46:11<13:05,  1.68it/s, loss=0.21, v_num=0, train/loss_simple_step=0.191, train/loss_vlb_step=0.000664, train/loss_step=0.191, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  78%|███████▊  | 4654/5971 [46:12<13:04,  1.68it/s, loss=0.21, v_num=0, train/loss_simple_step=0.191, train/loss_vlb_step=0.000664, train/loss_step=0.191, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  78%|███████▊  | 4654/5971 [46:12<13:04,  1.68it/s, loss=0.208, v_num=0, train/loss_simple_step=0.215, train/loss_vlb_step=0.000773, train/loss_step=0.215, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  78%|███████▊  | 4655/5971 [46:13<13:03,  1.68it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.73e-5, train/loss_step=0.0127, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  78%|███████▊  | 4656/5971 [46:15<13:03,  1.68it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:09,  2.38it/s]
Epoch 6:  78%|███████▊  | 4658/5971 [46:16<13:02,  1.68it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   1%|          | 2/167 [00:00<00:50,  3.24it/s]

Validating:   3%|▎         | 5/167 [00:00<00:18,  8.75it/s]
Epoch 6:  78%|███████▊  | 4662/5971 [46:16<12:59,  1.68it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.38it/s]
Epoch 6:  78%|███████▊  | 4666/5971 [46:16<12:56,  1.68it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.38it/s]
Epoch 6:  78%|███████▊  | 4670/5971 [46:16<12:53,  1.68it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   9%|▉         | 15/167 [00:01<00:07, 19.40it/s]
Epoch 6:  78%|███████▊  | 4674/5971 [46:17<12:50,  1.68it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  11%|█         | 18/167 [00:01<00:06, 21.48it/s]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 22.86it/s]
Epoch 6:  78%|███████▊  | 4678/5971 [46:17<12:47,  1.68it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  14%|█▍        | 24/167 [00:01<00:06, 22.81it/s]
Epoch 6:  78%|███████▊  | 4682/5971 [46:17<12:44,  1.69it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  16%|█▌        | 27/167 [00:01<00:06, 23.29it/s]
Epoch 6:  78%|███████▊  | 4686/5971 [46:17<12:41,  1.69it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 23.10it/s]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 23.53it/s]
Epoch 6:  79%|███████▊  | 4690/5971 [46:17<12:38,  1.69it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  22%|██▏       | 36/167 [00:02<00:05, 22.98it/s]
Epoch 6:  79%|███████▊  | 4694/5971 [46:17<12:35,  1.69it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  23%|██▎       | 39/167 [00:02<00:05, 23.96it/s]
Epoch 6:  79%|███████▊  | 4698/5971 [46:18<12:32,  1.69it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  25%|██▌       | 42/167 [00:02<00:05, 23.64it/s]

Validating:  27%|██▋       | 45/167 [00:02<00:05, 23.99it/s]
Epoch 6:  79%|███████▊  | 4702/5971 [46:18<12:29,  1.69it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  29%|██▊       | 48/167 [00:02<00:05, 23.11it/s]
Epoch 6:  79%|███████▉  | 4706/5971 [46:18<12:26,  1.69it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  31%|███       | 51/167 [00:02<00:04, 23.75it/s]
Epoch 6:  79%|███████▉  | 4710/5971 [46:18<12:23,  1.70it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 24.63it/s]

Validating:  34%|███▍      | 57/167 [00:03<00:05, 18.84it/s]
Epoch 6:  79%|███████▉  | 4714/5971 [46:18<12:20,  1.70it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  36%|███▌      | 60/167 [00:03<00:05, 20.53it/s]
Epoch 6:  79%|███████▉  | 4718/5971 [46:19<12:17,  1.70it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  38%|███▊      | 63/167 [00:03<00:04, 20.97it/s]
Epoch 6:  79%|███████▉  | 4722/5971 [46:19<12:14,  1.70it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  40%|███▉      | 66/167 [00:03<00:04, 21.98it/s]

Validating:  41%|████▏     | 69/167 [00:03<00:04, 23.45it/s]
Epoch 6:  79%|███████▉  | 4726/5971 [46:19<12:12,  1.70it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 23.76it/s]
Epoch 6:  79%|███████▉  | 4730/5971 [46:19<12:09,  1.70it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 24.45it/s]
Epoch 6:  79%|███████▉  | 4734/5971 [46:19<12:06,  1.70it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 25.24it/s]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 24.21it/s]
Epoch 6:  79%|███████▉  | 4738/5971 [46:19<12:03,  1.70it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  50%|█████     | 84/167 [00:04<00:03, 25.04it/s]
Epoch 6:  79%|███████▉  | 4742/5971 [46:20<12:00,  1.71it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  52%|█████▏    | 87/167 [00:04<00:03, 25.98it/s]
Epoch 6:  79%|███████▉  | 4746/5971 [46:20<11:57,  1.71it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  54%|█████▍    | 90/167 [00:04<00:02, 26.21it/s]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 25.08it/s]
Epoch 6:  80%|███████▉  | 4750/5971 [46:20<11:54,  1.71it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 25.76it/s]
Epoch 6:  80%|███████▉  | 4754/5971 [46:20<11:51,  1.71it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 25.81it/s]
Epoch 6:  80%|███████▉  | 4758/5971 [46:20<11:48,  1.71it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  61%|██████    | 102/167 [00:04<00:02, 25.97it/s]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 25.38it/s]
Epoch 6:  80%|███████▉  | 4762/5971 [46:20<11:45,  1.71it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 25.97it/s][A
Epoch 6:  80%|███████▉  | 4766/5971 [46:20<11:42,  1.71it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  66%|██████▋   | 111/167 [00:05<00:02, 26.44it/s][A
Epoch 6:  80%|███████▉  | 4770/5971 [46:21<11:40,  1.72it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  68%|██████▊   | 114/167 [00:05<00:01, 27.36it/s][A

Validating:  70%|███████   | 117/167 [00:05<00:01, 27.99it/s][A
Epoch 6:  80%|███████▉  | 4774/5971 [46:21<11:37,  1.72it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 24.42it/s][A
Epoch 6:  80%|████████  | 4778/5971 [46:21<11:34,  1.72it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 25.42it/s][A
Epoch 6:  80%|████████  | 4782/5971 [46:21<11:31,  1.72it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 26.26it/s][A

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 26.03it/s][A
Epoch 6:  80%|████████  | 4786/5971 [46:21<11:28,  1.72it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 25.08it/s][A
Epoch 6:  80%|████████  | 4790/5971 [46:21<11:25,  1.72it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  81%|████████  | 135/167 [00:06<00:01, 25.15it/s][A
Epoch 6:  80%|████████  | 4794/5971 [46:22<11:22,  1.72it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  83%|████████▎ | 138/167 [00:06<00:01, 26.19it/s][A

Validating:  84%|████████▍ | 141/167 [00:06<00:01, 25.31it/s][A
Epoch 6:  80%|████████  | 4798/5971 [46:22<11:20,  1.72it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 25.82it/s][A
Epoch 6:  80%|████████  | 4802/5971 [46:22<11:17,  1.73it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 25.45it/s][A
Epoch 6:  80%|████████  | 4806/5971 [46:22<11:14,  1.73it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 25.40it/s][A

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 25.51it/s][A
Epoch 6:  81%|████████  | 4810/5971 [46:22<11:11,  1.73it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 25.60it/s][A
Epoch 6:  81%|████████  | 4814/5971 [46:22<11:08,  1.73it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 25.85it/s][A
Epoch 6:  81%|████████  | 4818/5971 [46:22<11:05,  1.73it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  97%|█████████▋| 162/167 [00:07<00:00, 24.70it/s][A

Validating:  99%|█████████▉| 165/167 [00:07<00:00, 24.92it/s][A
Epoch 6:  81%|████████  | 4822/5971 [46:23<11:03,  1.73it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4824/5971 [46:23<11:01,  1.73it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.05e-5, train/loss_step=0.0141, global_step=3899.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:34,  1.41it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.51it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:13,  3.36it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  4.01it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.47it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.82it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.06it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.22it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.36it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.47it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.54it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.58it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.62it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.64it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.66it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:05,  5.68it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.69it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.69it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.69it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.69it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.70it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.70it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.70it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.70it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:04<00:04,  5.69it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.70it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.61it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.61it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.61it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.59it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.56it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.60it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.64it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.67it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.68it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.70it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.70it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.69it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.71it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.71it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.71it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:07<00:01,  5.71it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.71it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.72it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.72it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.71it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:08<00:00,  5.71it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:08<00:00,  5.69it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.69it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.70it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.35it/s]

Epoch 6:  81%|████████  | 4825/5971 [46:35<11:03,  1.73it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0385, train/loss_vlb_step=0.000149, train/loss_step=0.0385, global_step=3900.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A
Epoch 6:  81%|████████  | 4825/5971 [46:37<11:04,  1.73it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0385, train/loss_vlb_step=0.000149, train/loss_step=0.0385, global_step=3900.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.43it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.28it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.94it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.42it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.79it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.05it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.23it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.36it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.45it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.51it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.57it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.60it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.63it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.51it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.42it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.42it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.50it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.55it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.55it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.57it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.61it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.63it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.64it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.64it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.64it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.64it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.63it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.62it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.63it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.64it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.65it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.61it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.50it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.49it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.51it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.50it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.48it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.49it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.54it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.55it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.49it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.45it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.46it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.47it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.49it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.43it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.49it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.54it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.59it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.24it/s]

Epoch 6:  81%|████████  | 4826/5971 [46:47<11:05,  1.72it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0385, train/loss_vlb_step=0.000149, train/loss_step=0.0385, global_step=3900.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4826/5971 [46:47<11:05,  1.72it/s, loss=0.188, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000391, train/loss_step=0.119, global_step=3900.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.40it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:17,  2.67it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:13,  3.36it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:11,  3.90it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:10,  4.28it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.59it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.82it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  4.99it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.07it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.18it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.27it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:06,  5.32it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.34it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.33it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.40it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.45it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:04<00:05,  5.49it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.44it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.45it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.48it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.56it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.61it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.62it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.65it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.67it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.63it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.59it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.52it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.57it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.61it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.64it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.62it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.58it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.58it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.56it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.57it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.60it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.64it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.67it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.68it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.64it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.60it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.59it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.58it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.56it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.49it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.51it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.52it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.54it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.13it/s]

Epoch 6:  81%|████████  | 4827/5971 [46:59<11:08,  1.71it/s, loss=0.188, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000391, train/loss_step=0.119, global_step=3900.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4827/5971 [46:59<11:08,  1.71it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00696, train/loss_vlb_step=3.32e-5, train/loss_step=0.00696, global_step=3900.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.33it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.41it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.24it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.87it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.36it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.73it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.01it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.19it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.33it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.43it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.52it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.57it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.61it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.55it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.53it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.55it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.54it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.53it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.55it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.59it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.63it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.65it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.67it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.60it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.58it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.58it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.58it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.58it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.50it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.51it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.52it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.53it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.47it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.47it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.48it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.48it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.46it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.46it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.46it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.47it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.51it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.49it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.50it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.50it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.54it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.60it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.63it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.64it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.66it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.68it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.24it/s]

Epoch 6:  81%|████████  | 4828/5971 [47:12<11:10,  1.70it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00696, train/loss_vlb_step=3.32e-5, train/loss_step=0.00696, global_step=3900.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4828/5971 [47:12<11:10,  1.70it/s, loss=0.186, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00327, train/loss_step=0.448, global_step=3900.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  81%|████████  | 4829/5971 [47:13<11:09,  1.70it/s, loss=0.186, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00327, train/loss_step=0.448, global_step=3900.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4829/5971 [47:13<11:09,  1.70it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00397, train/loss_vlb_step=2.13e-5, train/loss_step=0.00397, global_step=3901.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4830/5971 [47:14<11:09,  1.70it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00397, train/loss_vlb_step=2.13e-5, train/loss_step=0.00397, global_step=3901.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4830/5971 [47:14<11:09,  1.70it/s, loss=0.173, v_num=0, train/loss_simple_step=0.182, train/loss_vlb_step=0.000619, train/loss_step=0.182, global_step=3901.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  81%|████████  | 4831/5971 [47:15<11:08,  1.70it/s, loss=0.173, v_num=0, train/loss_simple_step=0.182, train/loss_vlb_step=0.000619, train/loss_step=0.182, global_step=3901.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4831/5971 [47:15<11:08,  1.70it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=4.96e-5, train/loss_step=0.0112, global_step=3901.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4832/5971 [47:17<11:08,  1.70it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=4.96e-5, train/loss_step=0.0112, global_step=3901.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4832/5971 [47:17<11:08,  1.70it/s, loss=0.163, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000384, train/loss_step=0.117, global_step=3901.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  81%|████████  | 4833/5971 [47:18<11:08,  1.70it/s, loss=0.163, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000384, train/loss_step=0.117, global_step=3901.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4833/5971 [47:18<11:08,  1.70it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00722, train/loss_vlb_step=3.28e-5, train/loss_step=0.00722, global_step=3902.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4834/5971 [47:19<11:07,  1.70it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00722, train/loss_vlb_step=3.28e-5, train/loss_step=0.00722, global_step=3902.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4834/5971 [47:19<11:07,  1.70it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00765, train/loss_vlb_step=3.71e-5, train/loss_step=0.00765, global_step=3902.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4835/5971 [47:20<11:07,  1.70it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00765, train/loss_vlb_step=3.71e-5, train/loss_step=0.00765, global_step=3902.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4835/5971 [47:20<11:07,  1.70it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00603, train/loss_vlb_step=2.97e-5, train/loss_step=0.00603, global_step=3902.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4836/5971 [47:22<11:06,  1.70it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00603, train/loss_vlb_step=2.97e-5, train/loss_step=0.00603, global_step=3902.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4836/5971 [47:22<11:06,  1.70it/s, loss=0.122, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000585, train/loss_step=0.173, global_step=3902.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  81%|████████  | 4837/5971 [47:23<11:06,  1.70it/s, loss=0.122, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000585, train/loss_step=0.173, global_step=3902.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4837/5971 [47:23<11:06,  1.70it/s, loss=0.108, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.09e-5, train/loss_step=0.018, global_step=3903.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  81%|████████  | 4838/5971 [47:24<11:05,  1.70it/s, loss=0.108, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.09e-5, train/loss_step=0.018, global_step=3903.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4838/5971 [47:24<11:05,  1.70it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0162, train/loss_vlb_step=7.17e-5, train/loss_step=0.0162, global_step=3903.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4839/5971 [47:25<11:05,  1.70it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0162, train/loss_vlb_step=7.17e-5, train/loss_step=0.0162, global_step=3903.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4839/5971 [47:25<11:05,  1.70it/s, loss=0.0918, v_num=0, train/loss_simple_step=0.0254, train/loss_vlb_step=9.85e-5, train/loss_step=0.0254, global_step=3903.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4840/5971 [47:27<11:05,  1.70it/s, loss=0.0918, v_num=0, train/loss_simple_step=0.0254, train/loss_vlb_step=9.85e-5, train/loss_step=0.0254, global_step=3903.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4840/5971 [47:27<11:05,  1.70it/s, loss=0.0967, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.00133, train/loss_step=0.322, global_step=3903.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  81%|████████  | 4841/5971 [47:28<11:04,  1.70it/s, loss=0.0967, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.00133, train/loss_step=0.322, global_step=3903.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4841/5971 [47:28<11:04,  1.70it/s, loss=0.0873, v_num=0, train/loss_simple_step=0.00203, train/loss_vlb_step=1.19e-5, train/loss_step=0.00203, global_step=3904.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4842/5971 [47:28<11:04,  1.70it/s, loss=0.0873, v_num=0, train/loss_simple_step=0.00203, train/loss_vlb_step=1.19e-5, train/loss_step=0.00203, global_step=3904.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4842/5971 [47:28<11:04,  1.70it/s, loss=0.0769, v_num=0, train/loss_simple_step=0.00692, train/loss_vlb_step=3.29e-5, train/loss_step=0.00692, global_step=3904.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4843/5971 [47:29<11:03,  1.70it/s, loss=0.0769, v_num=0, train/loss_simple_step=0.00692, train/loss_vlb_step=3.29e-5, train/loss_step=0.00692, global_step=3904.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4843/5971 [47:29<11:03,  1.70it/s, loss=0.102, v_num=0, train/loss_simple_step=0.519, train/loss_vlb_step=0.00565, train/loss_step=0.519, global_step=3904.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:  81%|████████  | 4844/5971 [47:32<11:03,  1.70it/s, loss=0.102, v_num=0, train/loss_simple_step=0.519, train/loss_vlb_step=0.00565, train/loss_step=0.519, global_step=3904.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4844/5971 [47:32<11:03,  1.70it/s, loss=0.108, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000417, train/loss_step=0.127, global_step=3904.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4845/5971 [47:32<11:02,  1.70it/s, loss=0.108, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000417, train/loss_step=0.127, global_step=3904.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4845/5971 [47:32<11:02,  1.70it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00181, train/loss_vlb_step=1.08e-5, train/loss_step=0.00181, global_step=3905.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4846/5971 [47:33<11:02,  1.70it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00181, train/loss_vlb_step=1.08e-5, train/loss_step=0.00181, global_step=3905.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4846/5971 [47:33<11:02,  1.70it/s, loss=0.121, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00346, train/loss_step=0.426, global_step=3905.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  81%|████████  | 4847/5971 [47:34<11:01,  1.70it/s, loss=0.121, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00346, train/loss_step=0.426, global_step=3905.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4847/5971 [47:34<11:01,  1.70it/s, loss=0.131, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000694, train/loss_step=0.194, global_step=3905.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4848/5971 [47:36<11:01,  1.70it/s, loss=0.131, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000694, train/loss_step=0.194, global_step=3905.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4848/5971 [47:36<11:01,  1.70it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0262, train/loss_vlb_step=0.000103, train/loss_step=0.0262, global_step=3905.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4849/5971 [47:37<11:01,  1.70it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0262, train/loss_vlb_step=0.000103, train/loss_step=0.0262, global_step=3905.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4849/5971 [47:37<11:01,  1.70it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=6.72e-5, train/loss_step=0.0166, global_step=3906.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  81%|████████  | 4850/5971 [47:38<11:00,  1.70it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=6.72e-5, train/loss_step=0.0166, global_step=3906.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4850/5971 [47:38<11:00,  1.70it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0862, train/loss_vlb_step=0.000292, train/loss_step=0.0862, global_step=3906.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4851/5971 [47:39<11:00,  1.70it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0862, train/loss_vlb_step=0.000292, train/loss_step=0.0862, global_step=3906.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████  | 4851/5971 [47:39<11:00,  1.70it/s, loss=0.105, v_num=0, train/loss_simple_step=0.00491, train/loss_vlb_step=2.45e-5, train/loss_step=0.00491, global_step=3906.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████▏ | 4852/5971 [47:42<10:59,  1.70it/s, loss=0.105, v_num=0, train/loss_simple_step=0.00491, train/loss_vlb_step=2.45e-5, train/loss_step=0.00491, global_step=3906.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████▏ | 4852/5971 [47:42<10:59,  1.70it/s, loss=0.108, v_num=0, train/loss_simple_step=0.178, train/loss_vlb_step=0.000649, train/loss_step=0.178, global_step=3906.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  81%|████████▏ | 4853/5971 [47:43<10:59,  1.70it/s, loss=0.108, v_num=0, train/loss_simple_step=0.178, train/loss_vlb_step=0.000649, train/loss_step=0.178, global_step=3906.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████▏ | 4853/5971 [47:43<10:59,  1.70it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00175, train/loss_vlb_step=1.01e-5, train/loss_step=0.00175, global_step=3907.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████▏ | 4854/5971 [47:43<10:58,  1.70it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00175, train/loss_vlb_step=1.01e-5, train/loss_step=0.00175, global_step=3907.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████▏ | 4854/5971 [47:43<10:58,  1.70it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=6.54e-5, train/loss_step=0.0156, global_step=3907.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  81%|████████▏ | 4855/5971 [47:44<10:58,  1.70it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=6.54e-5, train/loss_step=0.0156, global_step=3907.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████▏ | 4855/5971 [47:44<10:58,  1.70it/s, loss=0.114, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000426, train/loss_step=0.128, global_step=3907.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  81%|████████▏ | 4856/5971 [47:46<10:58,  1.69it/s, loss=0.114, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000426, train/loss_step=0.128, global_step=3907.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████▏ | 4856/5971 [47:46<10:58,  1.69it/s, loss=0.118, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.000862, train/loss_step=0.247, global_step=3907.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████▏ | 4857/5971 [47:47<10:57,  1.69it/s, loss=0.118, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.000862, train/loss_step=0.247, global_step=3907.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████▏ | 4857/5971 [47:47<10:57,  1.69it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00405, train/loss_vlb_step=2.11e-5, train/loss_step=0.00405, global_step=3908.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████▏ | 4858/5971 [47:48<10:57,  1.69it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00405, train/loss_vlb_step=2.11e-5, train/loss_step=0.00405, global_step=3908.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████▏ | 4858/5971 [47:48<10:57,  1.69it/s, loss=0.126, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000585, train/loss_step=0.177, global_step=3908.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  81%|████████▏ | 4859/5971 [47:49<10:56,  1.69it/s, loss=0.126, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000585, train/loss_step=0.177, global_step=3908.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████▏ | 4859/5971 [47:49<10:56,  1.69it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000137, train/loss_step=0.0386, global_step=3908.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████▏ | 4860/5971 [47:51<10:56,  1.69it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000137, train/loss_step=0.0386, global_step=3908.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████▏ | 4860/5971 [47:51<10:56,  1.69it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00394, train/loss_vlb_step=2.1e-5, train/loss_step=0.00394, global_step=3908.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  81%|████████▏ | 4861/5971 [47:52<10:55,  1.69it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00394, train/loss_vlb_step=2.1e-5, train/loss_step=0.00394, global_step=3908.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████▏ | 4861/5971 [47:52<10:55,  1.69it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00524, train/loss_vlb_step=2.64e-5, train/loss_step=0.00524, global_step=3909.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████▏ | 4862/5971 [47:53<10:55,  1.69it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00524, train/loss_vlb_step=2.64e-5, train/loss_step=0.00524, global_step=3909.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████▏ | 4862/5971 [47:53<10:55,  1.69it/s, loss=0.118, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000604, train/loss_step=0.163, global_step=3909.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  81%|████████▏ | 4863/5971 [47:54<10:54,  1.69it/s, loss=0.118, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000604, train/loss_step=0.163, global_step=3909.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████▏ | 4863/5971 [47:54<10:54,  1.69it/s, loss=0.0939, v_num=0, train/loss_simple_step=0.0315, train/loss_vlb_step=0.00012, train/loss_step=0.0315, global_step=3909.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████▏ | 4864/5971 [47:56<10:54,  1.69it/s, loss=0.0939, v_num=0, train/loss_simple_step=0.0315, train/loss_vlb_step=0.00012, train/loss_step=0.0315, global_step=3909.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████▏ | 4864/5971 [47:56<10:54,  1.69it/s, loss=0.0885, v_num=0, train/loss_simple_step=0.0195, train/loss_vlb_step=8e-5, train/loss_step=0.0195, global_step=3909.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  81%|████████▏ | 4865/5971 [47:57<10:54,  1.69it/s, loss=0.0885, v_num=0, train/loss_simple_step=0.0195, train/loss_vlb_step=8e-5, train/loss_step=0.0195, global_step=3909.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████▏ | 4865/5971 [47:57<10:54,  1.69it/s, loss=0.0899, v_num=0, train/loss_simple_step=0.0289, train/loss_vlb_step=0.000109, train/loss_step=0.0289, global_step=3910.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████▏ | 4866/5971 [47:58<10:53,  1.69it/s, loss=0.0899, v_num=0, train/loss_simple_step=0.0289, train/loss_vlb_step=0.000109, train/loss_step=0.0289, global_step=3910.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  81%|████████▏ | 4866/5971 [47:58<10:53,  1.69it/s, loss=0.0707, v_num=0, train/loss_simple_step=0.042, train/loss_vlb_step=0.000157, train/loss_step=0.042, global_step=3910.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  82%|████████▏ | 4867/5971 [47:59<10:52,  1.69it/s, loss=0.0707, v_num=0, train/loss_simple_step=0.042, train/loss_vlb_step=0.000157, train/loss_step=0.042, global_step=3910.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4867/5971 [47:59<10:52,  1.69it/s, loss=0.0695, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.000595, train/loss_step=0.171, global_step=3910.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4868/5971 [48:01<10:52,  1.69it/s, loss=0.0695, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.000595, train/loss_step=0.171, global_step=3910.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4868/5971 [48:01<10:52,  1.69it/s, loss=0.0922, v_num=0, train/loss_simple_step=0.481, train/loss_vlb_step=0.00348, train/loss_step=0.481, global_step=3910.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  82%|████████▏ | 4869/5971 [48:02<10:52,  1.69it/s, loss=0.0922, v_num=0, train/loss_simple_step=0.481, train/loss_vlb_step=0.00348, train/loss_step=0.481, global_step=3910.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4869/5971 [48:02<10:52,  1.69it/s, loss=0.0995, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000544, train/loss_step=0.161, global_step=3911.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4870/5971 [48:03<10:51,  1.69it/s, loss=0.0995, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000544, train/loss_step=0.161, global_step=3911.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4870/5971 [48:03<10:51,  1.69it/s, loss=0.0954, v_num=0, train/loss_simple_step=0.00479, train/loss_vlb_step=2.55e-5, train/loss_step=0.00479, global_step=3911.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4871/5971 [48:04<10:51,  1.69it/s, loss=0.0954, v_num=0, train/loss_simple_step=0.00479, train/loss_vlb_step=2.55e-5, train/loss_step=0.00479, global_step=3911.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4871/5971 [48:04<10:51,  1.69it/s, loss=0.097, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000139, train/loss_step=0.0359, global_step=3911.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  82%|████████▏ | 4872/5971 [48:06<10:51,  1.69it/s, loss=0.097, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000139, train/loss_step=0.0359, global_step=3911.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4872/5971 [48:06<10:51,  1.69it/s, loss=0.118, v_num=0, train/loss_simple_step=0.594, train/loss_vlb_step=0.006, train/loss_step=0.594, global_step=3911.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:  82%|████████▏ | 4873/5971 [48:07<10:50,  1.69it/s, loss=0.118, v_num=0, train/loss_simple_step=0.594, train/loss_vlb_step=0.006, train/loss_step=0.594, global_step=3911.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4873/5971 [48:07<10:50,  1.69it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3912.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4874/5971 [48:08<10:49,  1.69it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000319, train/loss_step=0.0971, global_step=3912.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4874/5971 [48:08<10:49,  1.69it/s, loss=0.132, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000735, train/loss_step=0.204, global_step=3912.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  82%|████████▏ | 4875/5971 [48:09<10:49,  1.69it/s, loss=0.132, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000735, train/loss_step=0.204, global_step=3912.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4875/5971 [48:09<10:49,  1.69it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00278, train/loss_vlb_step=1.6e-5, train/loss_step=0.00278, global_step=3912.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4876/5971 [48:11<10:49,  1.69it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00278, train/loss_vlb_step=1.6e-5, train/loss_step=0.00278, global_step=3912.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4876/5971 [48:11<10:49,  1.69it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00206, train/loss_vlb_step=1.21e-5, train/loss_step=0.00206, global_step=3912.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4877/5971 [48:12<10:48,  1.69it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00206, train/loss_vlb_step=1.21e-5, train/loss_step=0.00206, global_step=3912.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4877/5971 [48:12<10:48,  1.69it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0726, train/loss_vlb_step=0.000239, train/loss_step=0.0726, global_step=3913.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  82%|████████▏ | 4878/5971 [48:13<10:48,  1.69it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0726, train/loss_vlb_step=0.000239, train/loss_step=0.0726, global_step=3913.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4878/5971 [48:13<10:48,  1.69it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0582, train/loss_vlb_step=0.00021, train/loss_step=0.0582, global_step=3913.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  82%|████████▏ | 4879/5971 [48:14<10:47,  1.69it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0582, train/loss_vlb_step=0.00021, train/loss_step=0.0582, global_step=3913.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4879/5971 [48:14<10:47,  1.69it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0496, train/loss_vlb_step=0.000181, train/loss_step=0.0496, global_step=3913.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4880/5971 [48:16<10:47,  1.69it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0496, train/loss_vlb_step=0.000181, train/loss_step=0.0496, global_step=3913.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4880/5971 [48:16<10:47,  1.69it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00194, train/loss_vlb_step=1.12e-5, train/loss_step=0.00194, global_step=3913.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4881/5971 [48:17<10:46,  1.69it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00194, train/loss_vlb_step=1.12e-5, train/loss_step=0.00194, global_step=3913.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4881/5971 [48:17<10:46,  1.69it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00637, train/loss_vlb_step=3.15e-5, train/loss_step=0.00637, global_step=3914.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4882/5971 [48:18<10:46,  1.68it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00637, train/loss_vlb_step=3.15e-5, train/loss_step=0.00637, global_step=3914.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4882/5971 [48:18<10:46,  1.68it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0536, train/loss_vlb_step=0.000191, train/loss_step=0.0536, global_step=3914.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  82%|████████▏ | 4883/5971 [48:18<10:45,  1.68it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0536, train/loss_vlb_step=0.000191, train/loss_step=0.0536, global_step=3914.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4883/5971 [48:18<10:45,  1.68it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0314, train/loss_vlb_step=0.000111, train/loss_step=0.0314, global_step=3914.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4884/5971 [48:21<10:45,  1.68it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0314, train/loss_vlb_step=0.000111, train/loss_step=0.0314, global_step=3914.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4884/5971 [48:21<10:45,  1.68it/s, loss=0.106, v_num=0, train/loss_simple_step=0.014, train/loss_vlb_step=6.05e-5, train/loss_step=0.014, global_step=3914.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  82%|████████▏ | 4885/5971 [48:21<10:45,  1.68it/s, loss=0.106, v_num=0, train/loss_simple_step=0.014, train/loss_vlb_step=6.05e-5, train/loss_step=0.014, global_step=3914.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4885/5971 [48:21<10:45,  1.68it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00194, train/loss_vlb_step=1.13e-5, train/loss_step=0.00194, global_step=3915.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4886/5971 [48:22<10:44,  1.68it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00194, train/loss_vlb_step=1.13e-5, train/loss_step=0.00194, global_step=3915.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4886/5971 [48:22<10:44,  1.68it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.68e-5, train/loss_step=0.0135, global_step=3915.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  82%|████████▏ | 4887/5971 [48:23<10:43,  1.68it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.68e-5, train/loss_step=0.0135, global_step=3915.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4887/5971 [48:23<10:43,  1.68it/s, loss=0.0948, v_num=0, train/loss_simple_step=0.00924, train/loss_vlb_step=4.29e-5, train/loss_step=0.00924, global_step=3915.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4888/5971 [48:25<10:43,  1.68it/s, loss=0.0948, v_num=0, train/loss_simple_step=0.00924, train/loss_vlb_step=4.29e-5, train/loss_step=0.00924, global_step=3915.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4888/5971 [48:25<10:43,  1.68it/s, loss=0.0714, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=6.45e-5, train/loss_step=0.0144, global_step=3915.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  82%|████████▏ | 4889/5971 [48:26<10:43,  1.68it/s, loss=0.0714, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=6.45e-5, train/loss_step=0.0144, global_step=3915.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4889/5971 [48:26<10:43,  1.68it/s, loss=0.0638, v_num=0, train/loss_simple_step=0.00788, train/loss_vlb_step=3.63e-5, train/loss_step=0.00788, global_step=3916.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4890/5971 [48:27<10:42,  1.68it/s, loss=0.0638, v_num=0, train/loss_simple_step=0.00788, train/loss_vlb_step=3.63e-5, train/loss_step=0.00788, global_step=3916.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4890/5971 [48:27<10:42,  1.68it/s, loss=0.0636, v_num=0, train/loss_simple_step=0.00161, train/loss_vlb_step=9.65e-6, train/loss_step=0.00161, global_step=3916.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4891/5971 [48:28<10:42,  1.68it/s, loss=0.0636, v_num=0, train/loss_simple_step=0.00161, train/loss_vlb_step=9.65e-6, train/loss_step=0.00161, global_step=3916.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4891/5971 [48:28<10:42,  1.68it/s, loss=0.0807, v_num=0, train/loss_simple_step=0.377, train/loss_vlb_step=0.003, train/loss_step=0.377, global_step=3916.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]      
Epoch 6:  82%|████████▏ | 4892/5971 [48:30<10:41,  1.68it/s, loss=0.0807, v_num=0, train/loss_simple_step=0.377, train/loss_vlb_step=0.003, train/loss_step=0.377, global_step=3916.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4892/5971 [48:30<10:41,  1.68it/s, loss=0.0647, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.00107, train/loss_step=0.274, global_step=3916.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4893/5971 [48:31<10:41,  1.68it/s, loss=0.0647, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.00107, train/loss_step=0.274, global_step=3916.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4893/5971 [48:31<10:41,  1.68it/s, loss=0.0603, v_num=0, train/loss_simple_step=0.00882, train/loss_vlb_step=3.92e-5, train/loss_step=0.00882, global_step=3917.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4894/5971 [48:32<10:40,  1.68it/s, loss=0.0603, v_num=0, train/loss_simple_step=0.00882, train/loss_vlb_step=3.92e-5, train/loss_step=0.00882, global_step=3917.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4894/5971 [48:32<10:40,  1.68it/s, loss=0.0505, v_num=0, train/loss_simple_step=0.00828, train/loss_vlb_step=3.8e-5, train/loss_step=0.00828, global_step=3917.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  82%|████████▏ | 4895/5971 [48:33<10:40,  1.68it/s, loss=0.0505, v_num=0, train/loss_simple_step=0.00828, train/loss_vlb_step=3.8e-5, train/loss_step=0.00828, global_step=3917.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4895/5971 [48:33<10:40,  1.68it/s, loss=0.0579, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000519, train/loss_step=0.151, global_step=3917.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  82%|████████▏ | 4896/5971 [48:35<10:40,  1.68it/s, loss=0.0579, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000519, train/loss_step=0.151, global_step=3917.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4896/5971 [48:35<10:40,  1.68it/s, loss=0.0635, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000371, train/loss_step=0.113, global_step=3917.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4897/5971 [48:36<10:39,  1.68it/s, loss=0.0635, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000371, train/loss_step=0.113, global_step=3917.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4897/5971 [48:36<10:39,  1.68it/s, loss=0.0707, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000766, train/loss_step=0.217, global_step=3918.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4898/5971 [48:37<10:39,  1.68it/s, loss=0.0707, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000766, train/loss_step=0.217, global_step=3918.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4898/5971 [48:37<10:39,  1.68it/s, loss=0.0792, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000955, train/loss_step=0.228, global_step=3918.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4899/5971 [48:38<10:38,  1.68it/s, loss=0.0792, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000955, train/loss_step=0.228, global_step=3918.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4899/5971 [48:38<10:38,  1.68it/s, loss=0.0979, v_num=0, train/loss_simple_step=0.425, train/loss_vlb_step=0.00316, train/loss_step=0.425, global_step=3918.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  82%|████████▏ | 4900/5971 [48:40<10:38,  1.68it/s, loss=0.0979, v_num=0, train/loss_simple_step=0.425, train/loss_vlb_step=0.00316, train/loss_step=0.425, global_step=3918.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4900/5971 [48:40<10:38,  1.68it/s, loss=0.0984, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.05e-5, train/loss_step=0.0112, global_step=3918.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4901/5971 [48:41<10:37,  1.68it/s, loss=0.0984, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.05e-5, train/loss_step=0.0112, global_step=3918.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4901/5971 [48:41<10:37,  1.68it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0464, train/loss_vlb_step=0.000163, train/loss_step=0.0464, global_step=3919.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  82%|████████▏ | 4902/5971 [48:42<10:37,  1.68it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0464, train/loss_vlb_step=0.000163, train/loss_step=0.0464, global_step=3919.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4902/5971 [48:42<10:37,  1.68it/s, loss=0.0995, v_num=0, train/loss_simple_step=0.0357, train/loss_vlb_step=0.000133, train/loss_step=0.0357, global_step=3919.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4903/5971 [48:43<10:36,  1.68it/s, loss=0.0995, v_num=0, train/loss_simple_step=0.0357, train/loss_vlb_step=0.000133, train/loss_step=0.0357, global_step=3919.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4903/5971 [48:43<10:36,  1.68it/s, loss=0.111, v_num=0, train/loss_simple_step=0.265, train/loss_vlb_step=0.00104, train/loss_step=0.265, global_step=3919.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  82%|████████▏ | 4904/5971 [48:45<10:36,  1.68it/s, loss=0.111, v_num=0, train/loss_simple_step=0.265, train/loss_vlb_step=0.00104, train/loss_step=0.265, global_step=3919.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4904/5971 [48:45<10:36,  1.68it/s, loss=0.135, v_num=0, train/loss_simple_step=0.484, train/loss_vlb_step=0.00504, train/loss_step=0.484, global_step=3919.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4905/5971 [48:46<10:35,  1.68it/s, loss=0.135, v_num=0, train/loss_simple_step=0.484, train/loss_vlb_step=0.00504, train/loss_step=0.484, global_step=3919.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4905/5971 [48:46<10:35,  1.68it/s, loss=0.142, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000506, train/loss_step=0.152, global_step=3920.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4906/5971 [48:47<10:35,  1.68it/s, loss=0.142, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000506, train/loss_step=0.152, global_step=3920.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4906/5971 [48:47<10:35,  1.68it/s, loss=0.165, v_num=0, train/loss_simple_step=0.463, train/loss_vlb_step=0.00421, train/loss_step=0.463, global_step=3920.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  82%|████████▏ | 4907/5971 [48:48<10:34,  1.68it/s, loss=0.165, v_num=0, train/loss_simple_step=0.463, train/loss_vlb_step=0.00421, train/loss_step=0.463, global_step=3920.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4907/5971 [48:48<10:34,  1.68it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00484, train/loss_vlb_step=2.33e-5, train/loss_step=0.00484, global_step=3920.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4908/5971 [48:50<10:34,  1.68it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00484, train/loss_vlb_step=2.33e-5, train/loss_step=0.00484, global_step=3920.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4908/5971 [48:50<10:34,  1.68it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0196, train/loss_vlb_step=8.01e-5, train/loss_step=0.0196, global_step=3920.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  82%|████████▏ | 4909/5971 [48:51<10:34,  1.68it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0196, train/loss_vlb_step=8.01e-5, train/loss_step=0.0196, global_step=3920.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4909/5971 [48:51<10:34,  1.68it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0432, train/loss_vlb_step=0.000155, train/loss_step=0.0432, global_step=3921.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4910/5971 [48:52<10:33,  1.67it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0432, train/loss_vlb_step=0.000155, train/loss_step=0.0432, global_step=3921.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4910/5971 [48:52<10:33,  1.67it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0383, train/loss_vlb_step=0.000141, train/loss_step=0.0383, global_step=3921.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4911/5971 [48:52<10:32,  1.67it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0383, train/loss_vlb_step=0.000141, train/loss_step=0.0383, global_step=3921.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4911/5971 [48:52<10:32,  1.67it/s, loss=0.152, v_num=0, train/loss_simple_step=0.042, train/loss_vlb_step=0.000148, train/loss_step=0.042, global_step=3921.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  82%|████████▏ | 4912/5971 [48:55<10:32,  1.67it/s, loss=0.152, v_num=0, train/loss_simple_step=0.042, train/loss_vlb_step=0.000148, train/loss_step=0.042, global_step=3921.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4912/5971 [48:55<10:32,  1.67it/s, loss=0.139, v_num=0, train/loss_simple_step=0.019, train/loss_vlb_step=7.96e-5, train/loss_step=0.019, global_step=3921.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  82%|████████▏ | 4913/5971 [48:56<10:32,  1.67it/s, loss=0.139, v_num=0, train/loss_simple_step=0.019, train/loss_vlb_step=7.96e-5, train/loss_step=0.019, global_step=3921.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4913/5971 [48:56<10:32,  1.67it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00947, train/loss_vlb_step=4.5e-5, train/loss_step=0.00947, global_step=3922.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4914/5971 [48:56<10:31,  1.67it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00947, train/loss_vlb_step=4.5e-5, train/loss_step=0.00947, global_step=3922.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4914/5971 [48:56<10:31,  1.67it/s, loss=0.147, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000611, train/loss_step=0.173, global_step=3922.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  82%|████████▏ | 4915/5971 [48:57<10:31,  1.67it/s, loss=0.147, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000611, train/loss_step=0.173, global_step=3922.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4915/5971 [48:57<10:31,  1.67it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.41e-5, train/loss_step=0.00481, global_step=3922.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4916/5971 [48:59<10:30,  1.67it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.41e-5, train/loss_step=0.00481, global_step=3922.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4916/5971 [48:59<10:30,  1.67it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00165, train/loss_vlb_step=9.79e-6, train/loss_step=0.00165, global_step=3922.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4917/5971 [49:00<10:30,  1.67it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00165, train/loss_vlb_step=9.79e-6, train/loss_step=0.00165, global_step=3922.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4917/5971 [49:00<10:30,  1.67it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00242, train/loss_vlb_step=1.4e-5, train/loss_step=0.00242, global_step=3923.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  82%|████████▏ | 4918/5971 [49:01<10:29,  1.67it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00242, train/loss_vlb_step=1.4e-5, train/loss_step=0.00242, global_step=3923.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4918/5971 [49:01<10:29,  1.67it/s, loss=0.153, v_num=0, train/loss_simple_step=0.809, train/loss_vlb_step=0.0592, train/loss_step=0.809, global_step=3923.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  82%|████████▏ | 4919/5971 [49:02<10:29,  1.67it/s, loss=0.153, v_num=0, train/loss_simple_step=0.809, train/loss_vlb_step=0.0592, train/loss_step=0.809, global_step=3923.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4919/5971 [49:02<10:29,  1.67it/s, loss=0.154, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00238, train/loss_step=0.452, global_step=3923.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4920/5971 [49:05<10:29,  1.67it/s, loss=0.154, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00238, train/loss_step=0.452, global_step=3923.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4920/5971 [49:05<10:29,  1.67it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00352, train/loss_vlb_step=1.99e-5, train/loss_step=0.00352, global_step=3923.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4921/5971 [49:06<10:28,  1.67it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00352, train/loss_vlb_step=1.99e-5, train/loss_step=0.00352, global_step=3923.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4921/5971 [49:06<10:28,  1.67it/s, loss=0.183, v_num=0, train/loss_simple_step=0.642, train/loss_vlb_step=0.0279, train/loss_step=0.642, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:  82%|████████▏ | 4922/5971 [49:06<10:27,  1.67it/s, loss=0.183, v_num=0, train/loss_simple_step=0.642, train/loss_vlb_step=0.0279, train/loss_step=0.642, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4922/5971 [49:06<10:27,  1.67it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4923/5971 [49:07<10:27,  1.67it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4923/5971 [49:07<10:27,  1.67it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00667, train/loss_vlb_step=3.24e-5, train/loss_step=0.00667, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4924/5971 [49:09<10:27,  1.67it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00667, train/loss_vlb_step=3.24e-5, train/loss_step=0.00667, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  82%|████████▏ | 4924/5971 [49:09<10:27,  1.67it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<00:55,  2.97it/s][A
Epoch 6:  82%|████████▏ | 4926/5971 [49:10<10:25,  1.67it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   1%|          | 2/167 [00:00<00:53,  3.08it/s][A
Epoch 6:  83%|████████▎ | 4928/5971 [49:10<10:24,  1.67it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   3%|▎         | 5/167 [00:00<00:18,  8.61it/s][A
Epoch 6:  83%|████████▎ | 4931/5971 [49:10<10:22,  1.67it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   5%|▍         | 8/167 [00:00<00:12, 12.70it/s][A
Epoch 6:  83%|████████▎ | 4934/5971 [49:10<10:20,  1.67it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.63it/s][A
Epoch 6:  83%|████████▎ | 4937/5971 [49:11<10:17,  1.67it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.34it/s][A
Epoch 6:  83%|████████▎ | 4940/5971 [49:11<10:15,  1.67it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  10%|█         | 17/167 [00:01<00:06, 21.93it/s][A
Epoch 6:  83%|████████▎ | 4944/5971 [49:11<10:12,  1.68it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 23.79it/s][A

Validating:  14%|█▍        | 23/167 [00:01<00:05, 24.79it/s][A
Epoch 6:  83%|████████▎ | 4948/5971 [49:11<10:10,  1.68it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 25.35it/s][A
Epoch 6:  83%|████████▎ | 4952/5971 [49:11<10:07,  1.68it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 25.26it/s][A
Epoch 6:  83%|████████▎ | 4956/5971 [49:11<10:04,  1.68it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 25.51it/s][A

Validating:  21%|██        | 35/167 [00:01<00:04, 26.67it/s][A
Epoch 6:  83%|████████▎ | 4960/5971 [49:11<10:01,  1.68it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  23%|██▎       | 38/167 [00:01<00:04, 27.42it/s][A
Epoch 6:  83%|████████▎ | 4964/5971 [49:11<09:58,  1.68it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 27.81it/s][A
Epoch 6:  83%|████████▎ | 4968/5971 [49:12<09:55,  1.68it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 28.06it/s][A
Epoch 6:  83%|████████▎ | 4972/5971 [49:12<09:53,  1.68it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 28.48it/s][A

Validating:  31%|███       | 51/167 [00:02<00:04, 27.70it/s][A
Epoch 6:  83%|████████▎ | 4976/5971 [49:12<09:50,  1.69it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 27.63it/s][A
Epoch 6:  83%|████████▎ | 4980/5971 [49:12<09:47,  1.69it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  34%|███▍      | 57/167 [00:02<00:03, 27.66it/s][A
Epoch 6:  83%|████████▎ | 4984/5971 [49:12<09:44,  1.69it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  37%|███▋      | 61/167 [00:02<00:03, 28.56it/s][A
Epoch 6:  84%|████████▎ | 4988/5971 [49:12<09:41,  1.69it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  38%|███▊      | 64/167 [00:02<00:03, 27.31it/s][A

Validating:  40%|████      | 67/167 [00:03<00:03, 27.24it/s][A
Epoch 6:  84%|████████▎ | 4992/5971 [49:13<09:39,  1.69it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 27.32it/s][A
Epoch 6:  84%|████████▎ | 4996/5971 [49:13<09:36,  1.69it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 27.92it/s][A
Epoch 6:  84%|████████▎ | 5000/5971 [49:13<09:33,  1.69it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 27.23it/s][A
Epoch 6:  84%|████████▍ | 5004/5971 [49:13<09:30,  1.69it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 26.81it/s][A

Validating:  50%|████▉     | 83/167 [00:03<00:03, 25.94it/s][A
Epoch 6:  84%|████████▍ | 5008/5971 [49:13<09:27,  1.70it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  51%|█████▏    | 86/167 [00:03<00:03, 26.13it/s][A
Epoch 6:  84%|████████▍ | 5012/5971 [49:13<09:25,  1.70it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  53%|█████▎    | 89/167 [00:03<00:03, 25.27it/s][A
Epoch 6:  84%|████████▍ | 5016/5971 [49:13<09:22,  1.70it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  56%|█████▌    | 93/167 [00:03<00:02, 26.94it/s][A
Epoch 6:  84%|████████▍ | 5020/5971 [49:14<09:19,  1.70it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 26.47it/s][A

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 26.97it/s][A
Epoch 6:  84%|████████▍ | 5024/5971 [49:14<09:16,  1.70it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  61%|██████    | 102/167 [00:04<00:02, 27.52it/s][A
Epoch 6:  84%|████████▍ | 5028/5971 [49:14<09:13,  1.70it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 27.43it/s][A
Epoch 6:  84%|████████▍ | 5032/5971 [49:14<09:11,  1.70it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 27.19it/s][A

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 27.00it/s][A
Epoch 6:  84%|████████▍ | 5036/5971 [49:14<09:08,  1.70it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  68%|██████▊   | 114/167 [00:04<00:01, 27.42it/s][A
Epoch 6:  84%|████████▍ | 5040/5971 [49:14<09:05,  1.71it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  70%|███████   | 117/167 [00:04<00:01, 26.88it/s][A
Epoch 6:  84%|████████▍ | 5044/5971 [49:14<09:02,  1.71it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  72%|███████▏  | 120/167 [00:04<00:01, 27.07it/s][A
Epoch 6:  85%|████████▍ | 5048/5971 [49:15<09:00,  1.71it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 28.58it/s][A

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 27.93it/s][A
Epoch 6:  85%|████████▍ | 5052/5971 [49:15<08:57,  1.71it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 27.42it/s][A
Epoch 6:  85%|████████▍ | 5056/5971 [49:15<08:54,  1.71it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 26.19it/s][A
Epoch 6:  85%|████████▍ | 5060/5971 [49:15<08:52,  1.71it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 26.52it/s][A
Epoch 6:  85%|████████▍ | 5064/5971 [49:15<08:49,  1.71it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  84%|████████▍ | 140/167 [00:05<00:01, 26.33it/s][A
Epoch 6:  85%|████████▍ | 5068/5971 [49:15<08:46,  1.71it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  86%|████████▌ | 144/167 [00:05<00:00, 27.53it/s][A
Epoch 6:  85%|████████▍ | 5072/5971 [49:15<08:43,  1.72it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  89%|████████▊ | 148/167 [00:05<00:00, 28.29it/s][A

Validating:  90%|█████████ | 151/167 [00:06<00:00, 26.69it/s][A
Epoch 6:  85%|████████▌ | 5076/5971 [49:16<08:41,  1.72it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 26.91it/s][A
Epoch 6:  85%|████████▌ | 5080/5971 [49:16<08:38,  1.72it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 27.37it/s][A
Epoch 6:  85%|████████▌ | 5084/5971 [49:16<08:35,  1.72it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 27.46it/s][A

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 27.99it/s][A
Epoch 6:  85%|████████▌ | 5088/5971 [49:16<08:32,  1.72it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 28.42it/s][A
Epoch 6:  85%|████████▌ | 5092/5971 [49:16<08:30,  1.72it/s, loss=0.164, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00163, train/loss_step=0.319, global_step=3924.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Epoch 6:  85%|████████▌ | 5093/5971 [49:17<08:29,  1.72it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0275, train/loss_vlb_step=0.000108, train/loss_step=0.0275, global_step=3925.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  85%|████████▌ | 5094/5971 [49:18<08:29,  1.72it/s, loss=0.147, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00102, train/loss_step=0.253, global_step=3925.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  85%|████████▌ | 5095/5971 [49:19<08:28,  1.72it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0271, train/loss_vlb_step=0.000103, train/loss_step=0.0271, global_step=3925.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  85%|████████▌ | 5096/5971 [49:22<08:28,  1.72it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0271, train/loss_vlb_step=0.000103, train/loss_step=0.0271, global_step=3925.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  85%|████████▌ | 5096/5971 [49:22<08:28,  1.72it/s, loss=0.156, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.000623, train/loss_step=0.176, global_step=3925.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  85%|████████▌ | 5097/5971 [49:23<08:28,  1.72it/s, loss=0.173, v_num=0, train/loss_simple_step=0.372, train/loss_vlb_step=0.00261, train/loss_step=0.372, global_step=3926.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  85%|████████▌ | 5098/5971 [49:24<08:27,  1.72it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0393, train/loss_vlb_step=0.000146, train/loss_step=0.0393, global_step=3926.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  85%|████████▌ | 5099/5971 [49:24<08:26,  1.72it/s, loss=0.19, v_num=0, train/loss_simple_step=0.377, train/loss_vlb_step=0.00179, train/loss_step=0.377, global_step=3926.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  85%|████████▌ | 5100/5971 [49:27<08:26,  1.72it/s, loss=0.19, v_num=0, train/loss_simple_step=0.377, train/loss_vlb_step=0.00179, train/loss_step=0.377, global_step=3926.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  85%|████████▌ | 5100/5971 [49:27<08:26,  1.72it/s, loss=0.199, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.0018, train/loss_step=0.214, global_step=3926.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  85%|████████▌ | 5101/5971 [49:27<08:26,  1.72it/s, loss=0.212, v_num=0, train/loss_simple_step=0.256, train/loss_vlb_step=0.000886, train/loss_step=0.256, global_step=3927.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  85%|████████▌ | 5102/5971 [49:28<08:25,  1.72it/s, loss=0.214, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000856, train/loss_step=0.224, global_step=3927.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  85%|████████▌ | 5103/5971 [49:29<08:25,  1.72it/s, loss=0.218, v_num=0, train/loss_simple_step=0.0748, train/loss_vlb_step=0.000257, train/loss_step=0.0748, global_step=3927.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  85%|████████▌ | 5104/5971 [49:31<08:24,  1.72it/s, loss=0.218, v_num=0, train/loss_simple_step=0.0748, train/loss_vlb_step=0.000257, train/loss_step=0.0748, global_step=3927.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  85%|████████▌ | 5104/5971 [49:31<08:24,  1.72it/s, loss=0.22, v_num=0, train/loss_simple_step=0.0458, train/loss_vlb_step=0.000163, train/loss_step=0.0458, global_step=3927.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  85%|████████▌ | 5105/5971 [49:32<08:24,  1.72it/s, loss=0.22, v_num=0, train/loss_simple_step=0.00519, train/loss_vlb_step=2.74e-5, train/loss_step=0.00519, global_step=3928.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5106/5971 [49:33<08:23,  1.72it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0185, train/loss_vlb_step=7.42e-5, train/loss_step=0.0185, global_step=3928.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  86%|████████▌ | 5107/5971 [49:34<08:23,  1.72it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000109, train/loss_step=0.0285, global_step=3928.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5108/5971 [49:36<08:22,  1.72it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000109, train/loss_step=0.0285, global_step=3928.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5108/5971 [49:36<08:22,  1.72it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00431, train/loss_vlb_step=2.31e-5, train/loss_step=0.00431, global_step=3928.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5109/5971 [49:37<08:22,  1.72it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0123, train/loss_vlb_step=5.44e-5, train/loss_step=0.0123, global_step=3929.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  86%|████████▌ | 5110/5971 [49:38<08:21,  1.72it/s, loss=0.126, v_num=0, train/loss_simple_step=0.040, train/loss_vlb_step=0.000136, train/loss_step=0.040, global_step=3929.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  86%|████████▌ | 5111/5971 [49:39<08:21,  1.72it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0238, train/loss_vlb_step=9.69e-5, train/loss_step=0.0238, global_step=3929.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5112/5971 [49:41<08:20,  1.71it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0238, train/loss_vlb_step=9.69e-5, train/loss_step=0.0238, global_step=3929.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5112/5971 [49:41<08:20,  1.71it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00371, train/loss_vlb_step=1.87e-5, train/loss_step=0.00371, global_step=3929.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5113/5971 [49:42<08:20,  1.71it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0845, train/loss_vlb_step=0.00028, train/loss_step=0.0845, global_step=3930.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  86%|████████▌ | 5114/5971 [49:43<08:19,  1.71it/s, loss=0.124, v_num=0, train/loss_simple_step=0.457, train/loss_vlb_step=0.00367, train/loss_step=0.457, global_step=3930.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  86%|████████▌ | 5115/5971 [49:44<08:19,  1.71it/s, loss=0.131, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000509, train/loss_step=0.154, global_step=3930.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5116/5971 [49:46<08:19,  1.71it/s, loss=0.131, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000509, train/loss_step=0.154, global_step=3930.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5116/5971 [49:46<08:19,  1.71it/s, loss=0.132, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000745, train/loss_step=0.209, global_step=3930.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5117/5971 [49:47<08:18,  1.71it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0577, train/loss_vlb_step=0.0002, train/loss_step=0.0577, global_step=3931.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5118/5971 [49:48<08:17,  1.71it/s, loss=0.125, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000922, train/loss_step=0.218, global_step=3931.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5119/5971 [49:49<08:17,  1.71it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=6.19e-5, train/loss_step=0.0144, global_step=3931.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5120/5971 [49:51<08:17,  1.71it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=6.19e-5, train/loss_step=0.0144, global_step=3931.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5120/5971 [49:51<08:17,  1.71it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0722, train/loss_vlb_step=0.000243, train/loss_step=0.0722, global_step=3931.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  86%|████████▌ | 5121/5971 [49:52<08:16,  1.71it/s, loss=0.0986, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000824, train/loss_step=0.225, global_step=3932.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5122/5971 [49:53<08:16,  1.71it/s, loss=0.0909, v_num=0, train/loss_simple_step=0.0703, train/loss_vlb_step=0.000247, train/loss_step=0.0703, global_step=3932.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5123/5971 [49:53<08:15,  1.71it/s, loss=0.0887, v_num=0, train/loss_simple_step=0.031, train/loss_vlb_step=0.000118, train/loss_step=0.031, global_step=3932.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  86%|████████▌ | 5124/5971 [49:56<08:15,  1.71it/s, loss=0.0887, v_num=0, train/loss_simple_step=0.031, train/loss_vlb_step=0.000118, train/loss_step=0.031, global_step=3932.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5124/5971 [49:56<08:15,  1.71it/s, loss=0.0871, v_num=0, train/loss_simple_step=0.0129, train/loss_vlb_step=5.68e-5, train/loss_step=0.0129, global_step=3932.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5125/5971 [49:57<08:14,  1.71it/s, loss=0.0885, v_num=0, train/loss_simple_step=0.0335, train/loss_vlb_step=0.000132, train/loss_step=0.0335, global_step=3933.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5126/5971 [49:57<08:14,  1.71it/s, loss=0.0877, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.92e-5, train/loss_step=0.0036, global_step=3933.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  86%|████████▌ | 5127/5971 [49:58<08:13,  1.71it/s, loss=0.0988, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.00103, train/loss_step=0.249, global_step=3933.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  86%|████████▌ | 5128/5971 [50:00<08:13,  1.71it/s, loss=0.0988, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.00103, train/loss_step=0.249, global_step=3933.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5128/5971 [50:00<08:13,  1.71it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0391, train/loss_vlb_step=0.00015, train/loss_step=0.0391, global_step=3933.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5129/5971 [50:01<08:12,  1.71it/s, loss=0.1, v_num=0, train/loss_simple_step=0.00997, train/loss_vlb_step=4.57e-5, train/loss_step=0.00997, global_step=3934.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5130/5971 [50:02<08:12,  1.71it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0449, train/loss_vlb_step=0.000157, train/loss_step=0.0449, global_step=3934.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5131/5971 [50:03<08:11,  1.71it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0213, train/loss_vlb_step=8.48e-5, train/loss_step=0.0213, global_step=3934.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  86%|████████▌ | 5132/5971 [50:05<08:11,  1.71it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0213, train/loss_vlb_step=8.48e-5, train/loss_step=0.0213, global_step=3934.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5132/5971 [50:05<08:11,  1.71it/s, loss=0.105, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000333, train/loss_step=0.101, global_step=3934.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  86%|████████▌ | 5133/5971 [50:06<08:10,  1.71it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0841, train/loss_vlb_step=0.000278, train/loss_step=0.0841, global_step=3935.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5134/5971 [50:07<08:10,  1.71it/s, loss=0.109, v_num=0, train/loss_simple_step=0.539, train/loss_vlb_step=0.00473, train/loss_step=0.539, global_step=3935.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  86%|████████▌ | 5135/5971 [50:08<08:09,  1.71it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00298, train/loss_vlb_step=1.69e-5, train/loss_step=0.00298, global_step=3935.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5136/5971 [50:11<08:09,  1.71it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00298, train/loss_vlb_step=1.69e-5, train/loss_step=0.00298, global_step=3935.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5136/5971 [50:11<08:09,  1.71it/s, loss=0.0925, v_num=0, train/loss_simple_step=0.0211, train/loss_vlb_step=8.32e-5, train/loss_step=0.0211, global_step=3935.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  86%|████████▌ | 5137/5971 [50:12<08:08,  1.71it/s, loss=0.0903, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.08e-5, train/loss_step=0.0121, global_step=3936.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5138/5971 [50:13<08:08,  1.71it/s, loss=0.0833, v_num=0, train/loss_simple_step=0.0775, train/loss_vlb_step=0.000258, train/loss_step=0.0775, global_step=3936.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5139/5971 [50:14<08:07,  1.71it/s, loss=0.0835, v_num=0, train/loss_simple_step=0.0185, train/loss_vlb_step=7.52e-5, train/loss_step=0.0185, global_step=3936.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  86%|████████▌ | 5140/5971 [50:16<08:07,  1.70it/s, loss=0.0835, v_num=0, train/loss_simple_step=0.0185, train/loss_vlb_step=7.52e-5, train/loss_step=0.0185, global_step=3936.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5140/5971 [50:16<08:07,  1.70it/s, loss=0.0855, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000372, train/loss_step=0.113, global_step=3936.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  86%|████████▌ | 5141/5971 [50:17<08:07,  1.70it/s, loss=0.0849, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000852, train/loss_step=0.213, global_step=3937.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5142/5971 [50:18<08:06,  1.70it/s, loss=0.0836, v_num=0, train/loss_simple_step=0.0449, train/loss_vlb_step=0.000155, train/loss_step=0.0449, global_step=3937.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5143/5971 [50:18<08:05,  1.70it/s, loss=0.0832, v_num=0, train/loss_simple_step=0.0229, train/loss_vlb_step=9.49e-5, train/loss_step=0.0229, global_step=3937.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  86%|████████▌ | 5144/5971 [50:21<08:05,  1.70it/s, loss=0.0832, v_num=0, train/loss_simple_step=0.0229, train/loss_vlb_step=9.49e-5, train/loss_step=0.0229, global_step=3937.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5144/5971 [50:21<08:05,  1.70it/s, loss=0.0929, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000843, train/loss_step=0.207, global_step=3937.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  86%|████████▌ | 5145/5971 [50:22<08:05,  1.70it/s, loss=0.117, v_num=0, train/loss_simple_step=0.509, train/loss_vlb_step=0.00625, train/loss_step=0.509, global_step=3938.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  86%|████████▌ | 5146/5971 [50:23<08:04,  1.70it/s, loss=0.143, v_num=0, train/loss_simple_step=0.537, train/loss_vlb_step=0.0126, train/loss_step=0.537, global_step=3938.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  86%|████████▌ | 5147/5971 [50:23<08:04,  1.70it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=4.98e-5, train/loss_step=0.0109, global_step=3938.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5148/5971 [50:26<08:03,  1.70it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=4.98e-5, train/loss_step=0.0109, global_step=3938.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▌ | 5148/5971 [50:26<08:03,  1.70it/s, loss=0.166, v_num=0, train/loss_simple_step=0.738, train/loss_vlb_step=0.0207, train/loss_step=0.738, global_step=3938.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  86%|████████▌ | 5149/5971 [50:27<08:03,  1.70it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0323, train/loss_vlb_step=0.000123, train/loss_step=0.0323, global_step=3939.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▋ | 5150/5971 [50:27<08:02,  1.70it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0529, train/loss_vlb_step=0.000184, train/loss_step=0.0529, global_step=3939.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▋ | 5151/5971 [50:28<08:02,  1.70it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00976, train/loss_vlb_step=4.47e-5, train/loss_step=0.00976, global_step=3939.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▋ | 5152/5971 [50:30<08:01,  1.70it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00976, train/loss_vlb_step=4.47e-5, train/loss_step=0.00976, global_step=3939.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▋ | 5152/5971 [50:30<08:01,  1.70it/s, loss=0.173, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000735, train/loss_step=0.205, global_step=3939.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  86%|████████▋ | 5153/5971 [50:31<08:01,  1.70it/s, loss=0.182, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00111, train/loss_step=0.264, global_step=3940.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  86%|████████▋ | 5154/5971 [50:32<08:00,  1.70it/s, loss=0.17, v_num=0, train/loss_simple_step=0.309, train/loss_vlb_step=0.00142, train/loss_step=0.309, global_step=3940.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  86%|████████▋ | 5155/5971 [50:33<08:00,  1.70it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0364, train/loss_vlb_step=0.00013, train/loss_step=0.0364, global_step=3940.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▋ | 5156/5971 [50:36<07:59,  1.70it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0364, train/loss_vlb_step=0.00013, train/loss_step=0.0364, global_step=3940.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▋ | 5156/5971 [50:36<07:59,  1.70it/s, loss=0.172, v_num=0, train/loss_simple_step=0.034, train/loss_vlb_step=0.000131, train/loss_step=0.034, global_step=3940.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  86%|████████▋ | 5157/5971 [50:37<07:59,  1.70it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0824, train/loss_vlb_step=0.000272, train/loss_step=0.0824, global_step=3941.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▋ | 5158/5971 [50:38<07:58,  1.70it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0181, train/loss_vlb_step=7.33e-5, train/loss_step=0.0181, global_step=3941.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  86%|████████▋ | 5159/5971 [50:38<07:58,  1.70it/s, loss=0.183, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.000881, train/loss_step=0.229, global_step=3941.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  86%|████████▋ | 5160/5971 [50:41<07:57,  1.70it/s, loss=0.183, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.000881, train/loss_step=0.229, global_step=3941.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▋ | 5160/5971 [50:41<07:57,  1.70it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.55e-5, train/loss_step=0.0128, global_step=3941.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▋ | 5161/5971 [50:42<07:57,  1.70it/s, loss=0.208, v_num=0, train/loss_simple_step=0.808, train/loss_vlb_step=0.0824, train/loss_step=0.808, global_step=3942.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  86%|████████▋ | 5162/5971 [50:43<07:56,  1.70it/s, loss=0.217, v_num=0, train/loss_simple_step=0.215, train/loss_vlb_step=0.000878, train/loss_step=0.215, global_step=3942.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▋ | 5163/5971 [50:43<07:56,  1.70it/s, loss=0.216, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.04e-5, train/loss_step=0.0138, global_step=3942.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▋ | 5164/5971 [50:46<07:56,  1.70it/s, loss=0.216, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.04e-5, train/loss_step=0.0138, global_step=3942.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  86%|████████▋ | 5164/5971 [50:46<07:56,  1.70it/s, loss=0.213, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000436, train/loss_step=0.132, global_step=3942.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  87%|████████▋ | 5165/5971 [50:47<07:55,  1.70it/s, loss=0.187, v_num=0, train/loss_simple_step=0.00522, train/loss_vlb_step=2.67e-5, train/loss_step=0.00522, global_step=3943.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  87%|████████▋ | 5166/5971 [50:48<07:54,  1.70it/s, loss=0.166, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000377, train/loss_step=0.112, global_step=3943.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  87%|████████▋ | 5167/5971 [50:49<07:54,  1.69it/s, loss=0.175, v_num=0, train/loss_simple_step=0.182, train/loss_vlb_step=0.000616, train/loss_step=0.182, global_step=3943.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  87%|████████▋ | 5168/5971 [50:51<07:54,  1.69it/s, loss=0.175, v_num=0, train/loss_simple_step=0.182, train/loss_vlb_step=0.000616, train/loss_step=0.182, global_step=3943.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  87%|████████▋ | 5168/5971 [50:51<07:54,  1.69it/s, loss=0.164, v_num=0, train/loss_simple_step=0.525, train/loss_vlb_step=0.0038, train/loss_step=0.525, global_step=3943.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  87%|████████▋ | 5169/5971 [50:52<07:53,  1.69it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.44e-5, train/loss_step=0.0125, global_step=3944.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  87%|████████▋ | 5170/5971 [50:53<07:52,  1.69it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00721, train/loss_vlb_step=3.53e-5, train/loss_step=0.00721, global_step=3944.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  87%|████████▋ | 5171/5971 [50:53<07:52,  1.69it/s, loss=0.169, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000575, train/loss_step=0.168, global_step=3944.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  87%|████████▋ | 5172/5971 [50:56<07:52,  1.69it/s, loss=0.169, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000575, train/loss_step=0.168, global_step=3944.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  87%|████████▋ | 5172/5971 [50:56<07:52,  1.69it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0944, train/loss_vlb_step=0.000311, train/loss_step=0.0944, global_step=3944.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  87%|████████▋ | 5173/5971 [50:56<07:51,  1.69it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0252, train/loss_vlb_step=9.86e-5, train/loss_step=0.0252, global_step=3945.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  87%|████████▋ | 5174/5971 [50:57<07:50,  1.69it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00262, train/loss_vlb_step=1.43e-5, train/loss_step=0.00262, global_step=3945.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  87%|████████▋ | 5175/5971 [50:58<07:50,  1.69it/s, loss=0.14, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000369, train/loss_step=0.112, global_step=3945.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  87%|████████▋ | 5176/5971 [51:01<07:50,  1.69it/s, loss=0.14, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000369, train/loss_step=0.112, global_step=3945.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  87%|████████▋ | 5176/5971 [51:01<07:50,  1.69it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00537, train/loss_vlb_step=2.57e-5, train/loss_step=0.00537, global_step=3945.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  87%|████████▋ | 5177/5971 [51:01<07:49,  1.69it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00292, train/loss_vlb_step=1.59e-5, train/loss_step=0.00292, global_step=3946.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  87%|████████▋ | 5178/5971 [51:02<07:48,  1.69it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0174, train/loss_vlb_step=7.29e-5, train/loss_step=0.0174, global_step=3946.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  87%|████████▋ | 5179/5971 [51:03<07:48,  1.69it/s, loss=0.153, v_num=0, train/loss_simple_step=0.599, train/loss_vlb_step=0.0125, train/loss_step=0.599, global_step=3946.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  87%|████████▋ | 5180/5971 [51:06<07:48,  1.69it/s, loss=0.153, v_num=0, train/loss_simple_step=0.599, train/loss_vlb_step=0.0125, train/loss_step=0.599, global_step=3946.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  87%|████████▋ | 5180/5971 [51:06<07:48,  1.69it/s, loss=0.158, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000391, train/loss_step=0.119, global_step=3946.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  87%|████████▋ | 5181/5971 [51:06<07:47,  1.69it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0597, train/loss_vlb_step=0.000201, train/loss_step=0.0597, global_step=3947.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  87%|████████▋ | 5182/5971 [51:07<07:47,  1.69it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0576, train/loss_vlb_step=0.0002, train/loss_step=0.0576, global_step=3947.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  87%|████████▋ | 5183/5971 [51:08<07:46,  1.69it/s, loss=0.122, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000745, train/loss_step=0.208, global_step=3947.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  87%|████████▋ | 5184/5971 [51:10<07:46,  1.69it/s, loss=0.122, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000745, train/loss_step=0.208, global_step=3947.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  87%|████████▋ | 5184/5971 [51:10<07:46,  1.69it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0803, train/loss_vlb_step=0.000269, train/loss_step=0.0803, global_step=3947.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  87%|████████▋ | 5185/5971 [51:11<07:45,  1.69it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00477, train/loss_vlb_step=2.48e-5, train/loss_step=0.00477, global_step=3948.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  87%|████████▋ | 5186/5971 [51:12<07:45,  1.69it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0683, train/loss_vlb_step=0.000227, train/loss_step=0.0683, global_step=3948.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  87%|████████▋ | 5187/5971 [51:13<07:44,  1.69it/s, loss=0.11, v_num=0, train/loss_simple_step=0.035, train/loss_vlb_step=0.000126, train/loss_step=0.035, global_step=3948.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  87%|████████▋ | 5188/5971 [51:15<07:44,  1.69it/s, loss=0.11, v_num=0, train/loss_simple_step=0.035, train/loss_vlb_step=0.000126, train/loss_step=0.035, global_step=3948.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  87%|████████▋ | 5188/5971 [51:15<07:44,  1.69it/s, loss=0.106, v_num=0, train/loss_simple_step=0.431, train/loss_vlb_step=0.00225, train/loss_step=0.431, global_step=3948.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  87%|████████▋ | 5189/5971 [51:16<07:43,  1.69it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.82e-5, train/loss_step=0.0127, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  87%|████████▋ | 5190/5971 [51:17<07:43,  1.69it/s, loss=0.113, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000601, train/loss_step=0.165, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  87%|████████▋ | 5191/5971 [51:18<07:42,  1.69it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0609, train/loss_vlb_step=0.000206, train/loss_step=0.0609, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  87%|████████▋ | 5192/5971 [51:20<07:42,  1.69it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0609, train/loss_vlb_step=0.000206, train/loss_step=0.0609, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  87%|████████▋ | 5192/5971 [51:20<07:42,  1.69it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<00:55,  2.99it/s]

Validating:   1%|          | 2/167 [00:00<00:45,  3.62it/s]
Epoch 6:  87%|████████▋ | 5196/5971 [51:21<07:39,  1.69it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.30it/s]
Epoch 6:  87%|████████▋ | 5200/5971 [51:21<07:36,  1.69it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   5%|▍         | 8/167 [00:00<00:11, 14.18it/s]

Validating:   7%|▋         | 11/167 [00:00<00:08, 17.73it/s]
Epoch 6:  87%|████████▋ | 5204/5971 [51:21<07:34,  1.69it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.97it/s]
Epoch 6:  87%|████████▋ | 5208/5971 [51:21<07:31,  1.69it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  10%|█         | 17/167 [00:01<00:06, 21.79it/s]
Epoch 6:  87%|████████▋ | 5212/5971 [51:21<07:28,  1.69it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.70it/s]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 23.12it/s]
Epoch 6:  87%|████████▋ | 5216/5971 [51:22<07:26,  1.69it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  16%|█▌        | 26/167 [00:01<00:06, 23.21it/s]
Epoch 6:  87%|████████▋ | 5220/5971 [51:22<07:23,  1.69it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 23.96it/s]
Epoch 6:  87%|████████▋ | 5224/5971 [51:22<07:20,  1.70it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 24.49it/s]

Validating:  21%|██        | 35/167 [00:01<00:05, 25.54it/s]
Epoch 6:  88%|████████▊ | 5228/5971 [51:22<07:18,  1.70it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  23%|██▎       | 38/167 [00:01<00:04, 25.88it/s]
Epoch 6:  88%|████████▊ | 5232/5971 [51:22<07:15,  1.70it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 25.93it/s]
Epoch 6:  88%|████████▊ | 5236/5971 [51:22<07:12,  1.70it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 26.33it/s]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 25.91it/s]
Epoch 6:  88%|████████▊ | 5240/5971 [51:22<07:10,  1.70it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 26.73it/s][A
Epoch 6:  88%|████████▊ | 5244/5971 [51:23<07:07,  1.70it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 24.83it/s][A
Epoch 6:  88%|████████▊ | 5248/5971 [51:23<07:04,  1.70it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 24.34it/s][A

Validating:  35%|███▌      | 59/167 [00:02<00:04, 24.62it/s][A
Epoch 6:  88%|████████▊ | 5252/5971 [51:23<07:02,  1.70it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  37%|███▋      | 62/167 [00:02<00:04, 24.88it/s][A
Epoch 6:  88%|████████▊ | 5256/5971 [51:23<06:59,  1.70it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  39%|███▉      | 65/167 [00:03<00:04, 24.48it/s][A
Epoch 6:  88%|████████▊ | 5260/5971 [51:23<06:56,  1.71it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  41%|████      | 68/167 [00:03<00:03, 25.81it/s][A

Validating:  43%|████▎     | 71/167 [00:03<00:03, 25.88it/s][A
Epoch 6:  88%|████████▊ | 5264/5971 [51:23<06:54,  1.71it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 26.48it/s][A
Epoch 6:  88%|████████▊ | 5268/5971 [51:24<06:51,  1.71it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 26.07it/s][A
Epoch 6:  88%|████████▊ | 5272/5971 [51:24<06:48,  1.71it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 26.11it/s][A

Validating:  50%|████▉     | 83/167 [00:03<00:03, 26.81it/s][A
Epoch 6:  88%|████████▊ | 5276/5971 [51:24<06:46,  1.71it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  51%|█████▏    | 86/167 [00:03<00:03, 26.77it/s][A
Epoch 6:  88%|████████▊ | 5280/5971 [51:24<06:43,  1.71it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  53%|█████▎    | 89/167 [00:03<00:02, 27.27it/s][A
Epoch 6:  88%|████████▊ | 5284/5971 [51:24<06:40,  1.71it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 26.73it/s][A

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 25.44it/s][A
Epoch 6:  89%|████████▊ | 5288/5971 [51:24<06:38,  1.71it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 25.85it/s][A
Epoch 6:  89%|████████▊ | 5292/5971 [51:24<06:35,  1.72it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  60%|██████    | 101/167 [00:04<00:02, 25.50it/s][A
Epoch 6:  89%|████████▊ | 5296/5971 [51:25<06:33,  1.72it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 24.47it/s][A

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 24.95it/s][A
Epoch 6:  89%|████████▉ | 5300/5971 [51:25<06:30,  1.72it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 25.16it/s][A
Epoch 6:  89%|████████▉ | 5304/5971 [51:25<06:27,  1.72it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  68%|██████▊   | 113/167 [00:04<00:02, 26.29it/s][A
Epoch 6:  89%|████████▉ | 5308/5971 [51:25<06:25,  1.72it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  69%|██████▉   | 116/167 [00:04<00:01, 25.78it/s][A

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 26.50it/s][A
Epoch 6:  89%|████████▉ | 5312/5971 [51:25<06:22,  1.72it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 25.93it/s][A
Epoch 6:  89%|████████▉ | 5316/5971 [51:25<06:20,  1.72it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 27.19it/s][A
Epoch 6:  89%|████████▉ | 5320/5971 [51:26<06:17,  1.72it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 26.32it/s][A
Epoch 6:  89%|████████▉ | 5324/5971 [51:26<06:14,  1.73it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 26.91it/s][A

Validating:  81%|████████  | 135/167 [00:05<00:01, 25.76it/s][A
Epoch 6:  89%|████████▉ | 5328/5971 [51:26<06:12,  1.73it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 26.40it/s][A
Epoch 6:  89%|████████▉ | 5332/5971 [51:26<06:09,  1.73it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  84%|████████▍ | 141/167 [00:05<00:00, 26.92it/s][A
Epoch 6:  89%|████████▉ | 5336/5971 [51:26<06:07,  1.73it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 27.49it/s][A

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 27.39it/s][A
Epoch 6:  89%|████████▉ | 5340/5971 [51:26<06:04,  1.73it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 26.67it/s][A
Epoch 6:  89%|████████▉ | 5344/5971 [51:26<06:02,  1.73it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 25.85it/s][A
Epoch 6:  90%|████████▉ | 5348/5971 [51:27<05:59,  1.73it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 26.60it/s][A

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 25.05it/s][A
Epoch 6:  90%|████████▉ | 5352/5971 [51:27<05:57,  1.73it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 25.88it/s][A
Epoch 6:  90%|████████▉ | 5356/5971 [51:27<05:54,  1.74it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 27.75it/s][A
Epoch 6:  90%|████████▉ | 5360/5971 [51:27<05:51,  1.74it/s, loss=0.127, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00316, train/loss_step=0.477, global_step=3949.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|████████▉ | 5361/5971 [51:28<05:51,  1.74it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0694, train/loss_vlb_step=0.000233, train/loss_step=0.0694, global_step=3950.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|████████▉ | 5362/5971 [51:29<05:50,  1.74it/s, loss=0.15, v_num=0, train/loss_simple_step=0.423, train/loss_vlb_step=0.00301, train/loss_step=0.423, global_step=3950.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  90%|████████▉ | 5363/5971 [51:30<05:50,  1.74it/s, loss=0.159, v_num=0, train/loss_simple_step=0.284, train/loss_vlb_step=0.00114, train/loss_step=0.284, global_step=3950.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|████████▉ | 5364/5971 [51:33<05:49,  1.73it/s, loss=0.159, v_num=0, train/loss_simple_step=0.284, train/loss_vlb_step=0.00114, train/loss_step=0.284, global_step=3950.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|████████▉ | 5364/5971 [51:33<05:49,  1.73it/s, loss=0.166, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000465, train/loss_step=0.140, global_step=3950.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|████████▉ | 5365/5971 [51:34<05:49,  1.73it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0226, train/loss_vlb_step=9.2e-5, train/loss_step=0.0226, global_step=3951.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|████████▉ | 5366/5971 [51:34<05:48,  1.73it/s, loss=0.171, v_num=0, train/loss_simple_step=0.098, train/loss_vlb_step=0.000322, train/loss_step=0.098, global_step=3951.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|████████▉ | 5367/5971 [51:35<05:48,  1.73it/s, loss=0.166, v_num=0, train/loss_simple_step=0.498, train/loss_vlb_step=0.0032, train/loss_step=0.498, global_step=3951.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  90%|████████▉ | 5368/5971 [51:37<05:47,  1.73it/s, loss=0.166, v_num=0, train/loss_simple_step=0.498, train/loss_vlb_step=0.0032, train/loss_step=0.498, global_step=3951.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|████████▉ | 5368/5971 [51:37<05:47,  1.73it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0255, train/loss_vlb_step=0.000101, train/loss_step=0.0255, global_step=3951.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|████████▉ | 5369/5971 [51:38<05:47,  1.73it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00419, train/loss_vlb_step=2.3e-5, train/loss_step=0.00419, global_step=3952.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|████████▉ | 5370/5971 [51:39<05:46,  1.73it/s, loss=0.161, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000364, train/loss_step=0.111, global_step=3952.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  90%|████████▉ | 5371/5971 [51:40<05:46,  1.73it/s, loss=0.169, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00298, train/loss_step=0.363, global_step=3952.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  90%|████████▉ | 5372/5971 [51:42<05:45,  1.73it/s, loss=0.169, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00298, train/loss_step=0.363, global_step=3952.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|████████▉ | 5372/5971 [51:42<05:45,  1.73it/s, loss=0.186, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00335, train/loss_step=0.438, global_step=3952.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|████████▉ | 5373/5971 [51:43<05:45,  1.73it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0562, train/loss_vlb_step=0.000193, train/loss_step=0.0562, global_step=3953.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|█████████ | 5374/5971 [51:44<05:44,  1.73it/s, loss=0.217, v_num=0, train/loss_simple_step=0.626, train/loss_vlb_step=0.00972, train/loss_step=0.626, global_step=3953.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  90%|█████████ | 5375/5971 [51:45<05:44,  1.73it/s, loss=0.218, v_num=0, train/loss_simple_step=0.061, train/loss_vlb_step=0.000211, train/loss_step=0.061, global_step=3953.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|█████████ | 5376/5971 [51:47<05:43,  1.73it/s, loss=0.218, v_num=0, train/loss_simple_step=0.061, train/loss_vlb_step=0.000211, train/loss_step=0.061, global_step=3953.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|█████████ | 5376/5971 [51:47<05:43,  1.73it/s, loss=0.214, v_num=0, train/loss_simple_step=0.340, train/loss_vlb_step=0.0017, train/loss_step=0.340, global_step=3953.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  90%|█████████ | 5377/5971 [51:48<05:43,  1.73it/s, loss=0.213, v_num=0, train/loss_simple_step=0.00245, train/loss_vlb_step=1.41e-5, train/loss_step=0.00245, global_step=3954.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|█████████ | 5378/5971 [51:49<05:42,  1.73it/s, loss=0.205, v_num=0, train/loss_simple_step=0.00196, train/loss_vlb_step=1.13e-5, train/loss_step=0.00196, global_step=3954.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|█████████ | 5379/5971 [51:50<05:42,  1.73it/s, loss=0.203, v_num=0, train/loss_simple_step=0.0302, train/loss_vlb_step=0.000115, train/loss_step=0.0302, global_step=3954.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  90%|█████████ | 5380/5971 [51:52<05:41,  1.73it/s, loss=0.203, v_num=0, train/loss_simple_step=0.0302, train/loss_vlb_step=0.000115, train/loss_step=0.0302, global_step=3954.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|█████████ | 5380/5971 [51:52<05:41,  1.73it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0795, train/loss_vlb_step=0.000263, train/loss_step=0.0795, global_step=3954.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|█████████ | 5381/5971 [51:53<05:41,  1.73it/s, loss=0.187, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000444, train/loss_step=0.132, global_step=3955.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  90%|█████████ | 5382/5971 [51:54<05:40,  1.73it/s, loss=0.19, v_num=0, train/loss_simple_step=0.490, train/loss_vlb_step=0.00341, train/loss_step=0.490, global_step=3955.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  90%|█████████ | 5383/5971 [51:55<05:40,  1.73it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0255, train/loss_vlb_step=0.000101, train/loss_step=0.0255, global_step=3955.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|█████████ | 5384/5971 [51:57<05:39,  1.73it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0255, train/loss_vlb_step=0.000101, train/loss_step=0.0255, global_step=3955.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|█████████ | 5384/5971 [51:57<05:39,  1.73it/s, loss=0.201, v_num=0, train/loss_simple_step=0.623, train/loss_vlb_step=0.009, train/loss_step=0.623, global_step=3955.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:  90%|█████████ | 5385/5971 [51:58<05:39,  1.73it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0803, train/loss_vlb_step=0.000271, train/loss_step=0.0803, global_step=3956.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|█████████ | 5386/5971 [51:59<05:38,  1.73it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0171, train/loss_vlb_step=6.93e-5, train/loss_step=0.0171, global_step=3956.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  90%|█████████ | 5387/5971 [52:00<05:38,  1.73it/s, loss=0.177, v_num=0, train/loss_simple_step=0.035, train/loss_vlb_step=0.000136, train/loss_step=0.035, global_step=3956.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|█████████ | 5388/5971 [52:02<05:37,  1.73it/s, loss=0.177, v_num=0, train/loss_simple_step=0.035, train/loss_vlb_step=0.000136, train/loss_step=0.035, global_step=3956.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|█████████ | 5388/5971 [52:02<05:37,  1.73it/s, loss=0.191, v_num=0, train/loss_simple_step=0.304, train/loss_vlb_step=0.00136, train/loss_step=0.304, global_step=3956.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  90%|█████████ | 5389/5971 [52:03<05:37,  1.73it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0533, train/loss_vlb_step=0.000184, train/loss_step=0.0533, global_step=3957.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|█████████ | 5390/5971 [52:03<05:36,  1.73it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0305, train/loss_vlb_step=0.00012, train/loss_step=0.0305, global_step=3957.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  90%|█████████ | 5391/5971 [52:04<05:36,  1.73it/s, loss=0.197, v_num=0, train/loss_simple_step=0.511, train/loss_vlb_step=0.00465, train/loss_step=0.511, global_step=3957.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  90%|█████████ | 5392/5971 [52:07<05:35,  1.72it/s, loss=0.197, v_num=0, train/loss_simple_step=0.511, train/loss_vlb_step=0.00465, train/loss_step=0.511, global_step=3957.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|█████████ | 5392/5971 [52:07<05:35,  1.72it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=5.21e-5, train/loss_step=0.0114, global_step=3957.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|█████████ | 5393/5971 [52:08<05:35,  1.72it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00134, train/loss_vlb_step=8e-6, train/loss_step=0.00134, global_step=3958.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  90%|█████████ | 5394/5971 [52:08<05:34,  1.72it/s, loss=0.143, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000102, train/loss_step=0.026, global_step=3958.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|█████████ | 5395/5971 [52:09<05:34,  1.72it/s, loss=0.149, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000652, train/loss_step=0.188, global_step=3958.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|█████████ | 5396/5971 [52:11<05:33,  1.72it/s, loss=0.149, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000652, train/loss_step=0.188, global_step=3958.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|█████████ | 5396/5971 [52:11<05:33,  1.72it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0741, train/loss_vlb_step=0.000249, train/loss_step=0.0741, global_step=3958.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|█████████ | 5397/5971 [52:12<05:33,  1.72it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0273, train/loss_vlb_step=0.000109, train/loss_step=0.0273, global_step=3959.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|█████████ | 5398/5971 [52:13<05:32,  1.72it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=6.49e-5, train/loss_step=0.0143, global_step=3959.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  90%|█████████ | 5399/5971 [52:14<05:32,  1.72it/s, loss=0.146, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000904, train/loss_step=0.203, global_step=3959.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  90%|█████████ | 5400/5971 [52:17<05:31,  1.72it/s, loss=0.146, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000904, train/loss_step=0.203, global_step=3959.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|█████████ | 5400/5971 [52:17<05:31,  1.72it/s, loss=0.16, v_num=0, train/loss_simple_step=0.351, train/loss_vlb_step=0.00182, train/loss_step=0.351, global_step=3959.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  90%|█████████ | 5401/5971 [52:17<05:31,  1.72it/s, loss=0.161, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000472, train/loss_step=0.143, global_step=3960.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|█████████ | 5402/5971 [52:18<05:30,  1.72it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00525, train/loss_vlb_step=2.61e-5, train/loss_step=0.00525, global_step=3960.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  90%|█████████ | 5403/5971 [52:19<05:30,  1.72it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0756, train/loss_vlb_step=0.000251, train/loss_step=0.0756, global_step=3960.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  91%|█████████ | 5404/5971 [52:21<05:29,  1.72it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0756, train/loss_vlb_step=0.000251, train/loss_step=0.0756, global_step=3960.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5404/5971 [52:21<05:29,  1.72it/s, loss=0.143, v_num=0, train/loss_simple_step=0.702, train/loss_vlb_step=0.0197, train/loss_step=0.702, global_step=3960.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  91%|█████████ | 5405/5971 [52:22<05:29,  1.72it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.00021, train/loss_step=0.0619, global_step=3961.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5406/5971 [52:23<05:28,  1.72it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0934, train/loss_vlb_step=0.000311, train/loss_step=0.0934, global_step=3961.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5407/5971 [52:24<05:27,  1.72it/s, loss=0.163, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.002, train/loss_step=0.391, global_step=3961.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:  91%|█████████ | 5408/5971 [52:26<05:27,  1.72it/s, loss=0.163, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.002, train/loss_step=0.391, global_step=3961.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5408/5971 [52:26<05:27,  1.72it/s, loss=0.159, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000802, train/loss_step=0.209, global_step=3961.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5409/5971 [52:27<05:26,  1.72it/s, loss=0.181, v_num=0, train/loss_simple_step=0.501, train/loss_vlb_step=0.00546, train/loss_step=0.501, global_step=3962.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  91%|█████████ | 5410/5971 [52:28<05:26,  1.72it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0058, train/loss_vlb_step=3.04e-5, train/loss_step=0.0058, global_step=3962.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5411/5971 [52:29<05:25,  1.72it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.34e-5, train/loss_step=0.0126, global_step=3962.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5412/5971 [52:31<05:25,  1.72it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.34e-5, train/loss_step=0.0126, global_step=3962.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5412/5971 [52:31<05:25,  1.72it/s, loss=0.168, v_num=0, train/loss_simple_step=0.276, train/loss_vlb_step=0.00154, train/loss_step=0.276, global_step=3962.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  91%|█████████ | 5413/5971 [52:32<05:24,  1.72it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00365, train/loss_vlb_step=2.02e-5, train/loss_step=0.00365, global_step=3963.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5414/5971 [52:33<05:24,  1.72it/s, loss=0.179, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.000894, train/loss_step=0.249, global_step=3963.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  91%|█████████ | 5415/5971 [52:34<05:23,  1.72it/s, loss=0.184, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.00116, train/loss_step=0.278, global_step=3963.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  91%|█████████ | 5416/5971 [52:36<05:23,  1.72it/s, loss=0.184, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.00116, train/loss_step=0.278, global_step=3963.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5416/5971 [52:36<05:23,  1.72it/s, loss=0.184, v_num=0, train/loss_simple_step=0.067, train/loss_vlb_step=0.000235, train/loss_step=0.067, global_step=3963.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5417/5971 [52:37<05:22,  1.72it/s, loss=0.182, v_num=0, train/loss_simple_step=0.00293, train/loss_vlb_step=1.59e-5, train/loss_step=0.00293, global_step=3964.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5418/5971 [52:38<05:22,  1.72it/s, loss=0.188, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000448, train/loss_step=0.133, global_step=3964.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  91%|█████████ | 5419/5971 [52:39<05:21,  1.72it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0653, train/loss_vlb_step=0.000226, train/loss_step=0.0653, global_step=3964.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5420/5971 [52:41<05:21,  1.71it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0653, train/loss_vlb_step=0.000226, train/loss_step=0.0653, global_step=3964.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5420/5971 [52:41<05:21,  1.71it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0323, train/loss_vlb_step=0.000117, train/loss_step=0.0323, global_step=3964.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5421/5971 [52:42<05:20,  1.71it/s, loss=0.17, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.000828, train/loss_step=0.237, global_step=3965.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  91%|█████████ | 5422/5971 [52:43<05:20,  1.71it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0264, train/loss_vlb_step=9.67e-5, train/loss_step=0.0264, global_step=3965.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5423/5971 [52:44<05:19,  1.71it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0734, train/loss_vlb_step=0.000249, train/loss_step=0.0734, global_step=3965.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5424/5971 [52:46<05:19,  1.71it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0734, train/loss_vlb_step=0.000249, train/loss_step=0.0734, global_step=3965.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5424/5971 [52:46<05:19,  1.71it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0301, train/loss_vlb_step=0.000112, train/loss_step=0.0301, global_step=3965.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5425/5971 [52:47<05:18,  1.71it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0627, train/loss_vlb_step=0.000221, train/loss_step=0.0627, global_step=3966.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5426/5971 [52:48<05:18,  1.71it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00271, train/loss_vlb_step=1.53e-5, train/loss_step=0.00271, global_step=3966.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5427/5971 [52:48<05:17,  1.71it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00965, train/loss_vlb_step=4.24e-5, train/loss_step=0.00965, global_step=3966.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5428/5971 [52:51<05:17,  1.71it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00965, train/loss_vlb_step=4.24e-5, train/loss_step=0.00965, global_step=3966.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5428/5971 [52:51<05:17,  1.71it/s, loss=0.113, v_num=0, train/loss_simple_step=0.195, train/loss_vlb_step=0.000652, train/loss_step=0.195, global_step=3966.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  91%|█████████ | 5429/5971 [52:52<05:16,  1.71it/s, loss=0.097, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.00064, train/loss_step=0.179, global_step=3967.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  91%|█████████ | 5430/5971 [52:53<05:16,  1.71it/s, loss=0.106, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.000615, train/loss_step=0.176, global_step=3967.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5431/5971 [52:53<05:15,  1.71it/s, loss=0.105, v_num=0, train/loss_simple_step=0.00135, train/loss_vlb_step=8.21e-6, train/loss_step=0.00135, global_step=3967.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5432/5971 [52:56<05:15,  1.71it/s, loss=0.105, v_num=0, train/loss_simple_step=0.00135, train/loss_vlb_step=8.21e-6, train/loss_step=0.00135, global_step=3967.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5432/5971 [52:56<05:15,  1.71it/s, loss=0.0913, v_num=0, train/loss_simple_step=0.00177, train/loss_vlb_step=1.07e-5, train/loss_step=0.00177, global_step=3967.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5433/5971 [52:56<05:14,  1.71it/s, loss=0.0923, v_num=0, train/loss_simple_step=0.0239, train/loss_vlb_step=9.34e-5, train/loss_step=0.0239, global_step=3968.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  91%|█████████ | 5434/5971 [52:57<05:13,  1.71it/s, loss=0.0814, v_num=0, train/loss_simple_step=0.0308, train/loss_vlb_step=0.000114, train/loss_step=0.0308, global_step=3968.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5435/5971 [52:58<05:13,  1.71it/s, loss=0.0829, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00143, train/loss_step=0.308, global_step=3968.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  91%|█████████ | 5436/5971 [53:00<05:12,  1.71it/s, loss=0.0829, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00143, train/loss_step=0.308, global_step=3968.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5436/5971 [53:00<05:12,  1.71it/s, loss=0.0956, v_num=0, train/loss_simple_step=0.321, train/loss_vlb_step=0.00142, train/loss_step=0.321, global_step=3968.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5437/5971 [53:01<05:12,  1.71it/s, loss=0.097, v_num=0, train/loss_simple_step=0.0316, train/loss_vlb_step=0.000114, train/loss_step=0.0316, global_step=3969.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5438/5971 [53:02<05:11,  1.71it/s, loss=0.0909, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.55e-5, train/loss_step=0.0108, global_step=3969.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5439/5971 [53:03<05:11,  1.71it/s, loss=0.0935, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000393, train/loss_step=0.118, global_step=3969.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  91%|█████████ | 5440/5971 [53:05<05:10,  1.71it/s, loss=0.0935, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000393, train/loss_step=0.118, global_step=3969.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5440/5971 [53:05<05:10,  1.71it/s, loss=0.107, v_num=0, train/loss_simple_step=0.293, train/loss_vlb_step=0.00117, train/loss_step=0.293, global_step=3969.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  91%|█████████ | 5441/5971 [53:06<05:10,  1.71it/s, loss=0.0958, v_num=0, train/loss_simple_step=0.020, train/loss_vlb_step=7.88e-5, train/loss_step=0.020, global_step=3970.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5442/5971 [53:07<05:09,  1.71it/s, loss=0.0947, v_num=0, train/loss_simple_step=0.00524, train/loss_vlb_step=2.6e-5, train/loss_step=0.00524, global_step=3970.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5443/5971 [53:08<05:09,  1.71it/s, loss=0.118, v_num=0, train/loss_simple_step=0.548, train/loss_vlb_step=0.00511, train/loss_step=0.548, global_step=3970.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  91%|█████████ | 5444/5971 [53:10<05:08,  1.71it/s, loss=0.118, v_num=0, train/loss_simple_step=0.548, train/loss_vlb_step=0.00511, train/loss_step=0.548, global_step=3970.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5444/5971 [53:10<05:08,  1.71it/s, loss=0.127, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000663, train/loss_step=0.193, global_step=3970.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5445/5971 [53:11<05:08,  1.71it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0084, train/loss_vlb_step=3.91e-5, train/loss_step=0.0084, global_step=3971.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5446/5971 [53:12<05:07,  1.71it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0847, train/loss_vlb_step=0.000281, train/loss_step=0.0847, global_step=3971.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5447/5971 [53:13<05:07,  1.71it/s, loss=0.136, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000552, train/loss_step=0.164, global_step=3971.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  91%|█████████ | 5448/5971 [53:15<05:06,  1.71it/s, loss=0.136, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000552, train/loss_step=0.164, global_step=3971.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████ | 5448/5971 [53:15<05:06,  1.71it/s, loss=0.135, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.00062, train/loss_step=0.180, global_step=3971.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  91%|█████████▏| 5449/5971 [53:16<05:06,  1.70it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00713, train/loss_vlb_step=3.5e-5, train/loss_step=0.00713, global_step=3972.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████▏| 5450/5971 [53:17<05:05,  1.70it/s, loss=0.123, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.00034, train/loss_step=0.101, global_step=3972.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  91%|█████████▏| 5451/5971 [53:18<05:05,  1.70it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0807, train/loss_vlb_step=0.000266, train/loss_step=0.0807, global_step=3972.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████▏| 5452/5971 [53:20<05:04,  1.70it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0807, train/loss_vlb_step=0.000266, train/loss_step=0.0807, global_step=3972.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████▏| 5452/5971 [53:20<05:04,  1.70it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00213, train/loss_vlb_step=1.25e-5, train/loss_step=0.00213, global_step=3972.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████▏| 5453/5971 [53:21<05:04,  1.70it/s, loss=0.157, v_num=0, train/loss_simple_step=0.633, train/loss_vlb_step=0.0124, train/loss_step=0.633, global_step=3973.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:  91%|█████████▏| 5454/5971 [53:22<05:03,  1.70it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00295, train/loss_vlb_step=1.66e-5, train/loss_step=0.00295, global_step=3973.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████▏| 5455/5971 [53:22<05:02,  1.70it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0387, train/loss_vlb_step=0.000141, train/loss_step=0.0387, global_step=3973.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  91%|█████████▏| 5456/5971 [53:25<05:02,  1.70it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0387, train/loss_vlb_step=0.000141, train/loss_step=0.0387, global_step=3973.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████▏| 5456/5971 [53:25<05:02,  1.70it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0462, train/loss_vlb_step=0.000161, train/loss_step=0.0462, global_step=3973.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████▏| 5457/5971 [53:26<05:01,  1.70it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.00014, train/loss_step=0.0406, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  91%|█████████▏| 5458/5971 [53:27<05:01,  1.70it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00794, train/loss_vlb_step=3.73e-5, train/loss_step=0.00794, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████▏| 5459/5971 [53:27<05:00,  1.70it/s, loss=0.126, v_num=0, train/loss_simple_step=0.073, train/loss_vlb_step=0.000245, train/loss_step=0.073, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  91%|█████████▏| 5460/5971 [53:29<05:00,  1.70it/s, loss=0.126, v_num=0, train/loss_simple_step=0.073, train/loss_vlb_step=0.000245, train/loss_step=0.073, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  91%|█████████▏| 5460/5971 [53:29<05:00,  1.70it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:35,  1.74it/s][A

Validating:   2%|▏         | 3/167 [00:00<00:31,  5.19it/s][A
Epoch 6:  92%|█████████▏| 5464/5971 [53:30<04:57,  1.70it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   4%|▎         | 6/167 [00:00<00:16,  9.81it/s][A
Epoch 6:  92%|█████████▏| 5468/5971 [53:30<04:55,  1.70it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   5%|▌         | 9/167 [00:00<00:11, 13.91it/s][A
Epoch 6:  92%|█████████▏| 5472/5971 [53:31<04:52,  1.70it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   7%|▋         | 12/167 [00:01<00:08, 17.57it/s][A

Validating:   9%|▉         | 15/167 [00:01<00:07, 20.07it/s][A
Epoch 6:  92%|█████████▏| 5476/5971 [53:31<04:50,  1.71it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  11%|█         | 18/167 [00:01<00:06, 21.70it/s][A
Epoch 6:  92%|█████████▏| 5480/5971 [53:31<04:47,  1.71it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 22.77it/s][A
Epoch 6:  92%|█████████▏| 5484/5971 [53:31<04:45,  1.71it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  14%|█▍        | 24/167 [00:01<00:06, 23.22it/s][A

Validating:  16%|█▌        | 27/167 [00:01<00:06, 23.26it/s][A
Epoch 6:  92%|█████████▏| 5488/5971 [53:31<04:42,  1.71it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 24.28it/s][A
Epoch 6:  92%|█████████▏| 5492/5971 [53:31<04:40,  1.71it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 25.54it/s][A
Epoch 6:  92%|█████████▏| 5496/5971 [53:31<04:37,  1.71it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  22%|██▏       | 36/167 [00:01<00:05, 25.27it/s][A

Validating:  23%|██▎       | 39/167 [00:02<00:05, 25.24it/s][A
Epoch 6:  92%|█████████▏| 5500/5971 [53:32<04:35,  1.71it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 25.29it/s][A
Epoch 6:  92%|█████████▏| 5504/5971 [53:32<04:32,  1.71it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 24.96it/s][A
Epoch 6:  92%|█████████▏| 5508/5971 [53:32<04:29,  1.71it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  29%|██▉       | 49/167 [00:02<00:04, 26.11it/s][A
Epoch 6:  92%|█████████▏| 5512/5971 [53:32<04:27,  1.72it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  31%|███       | 52/167 [00:02<00:04, 25.34it/s][A

Validating:  33%|███▎      | 55/167 [00:02<00:04, 25.30it/s][A
Epoch 6:  92%|█████████▏| 5516/5971 [53:32<04:24,  1.72it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  35%|███▍      | 58/167 [00:02<00:04, 25.62it/s][A
Epoch 6:  92%|█████████▏| 5520/5971 [53:32<04:22,  1.72it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  37%|███▋      | 61/167 [00:02<00:04, 25.63it/s][A
Epoch 6:  93%|█████████▎| 5524/5971 [53:33<04:19,  1.72it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  38%|███▊      | 64/167 [00:03<00:03, 26.02it/s][A
Epoch 6:  93%|█████████▎| 5528/5971 [53:33<04:17,  1.72it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  41%|████      | 68/167 [00:03<00:03, 27.55it/s][A

Validating:  43%|████▎     | 71/167 [00:03<00:03, 25.33it/s][A
Epoch 6:  93%|█████████▎| 5532/5971 [53:33<04:14,  1.72it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 25.69it/s][A
Epoch 6:  93%|█████████▎| 5536/5971 [53:33<04:12,  1.72it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 26.72it/s][A
Epoch 6:  93%|█████████▎| 5540/5971 [53:33<04:09,  1.72it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 26.74it/s][A

Validating:  50%|████▉     | 83/167 [00:03<00:03, 27.51it/s][A
Epoch 6:  93%|█████████▎| 5544/5971 [53:33<04:07,  1.73it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  51%|█████▏    | 86/167 [00:03<00:02, 27.21it/s][A
Epoch 6:  93%|█████████▎| 5548/5971 [53:33<04:04,  1.73it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  53%|█████▎    | 89/167 [00:03<00:02, 27.46it/s][A
Epoch 6:  93%|█████████▎| 5552/5971 [53:34<04:02,  1.73it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 26.41it/s][A

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 26.56it/s][A
Epoch 6:  93%|█████████▎| 5556/5971 [53:34<04:00,  1.73it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 26.17it/s][A
Epoch 6:  93%|█████████▎| 5560/5971 [53:34<03:57,  1.73it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  60%|██████    | 101/167 [00:04<00:02, 25.97it/s][A
Epoch 6:  93%|█████████▎| 5564/5971 [53:34<03:55,  1.73it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 25.78it/s][A

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 23.23it/s][A
Epoch 6:  93%|█████████▎| 5568/5971 [53:34<03:52,  1.73it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 23.99it/s][A
Epoch 6:  93%|█████████▎| 5572/5971 [53:34<03:50,  1.73it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  68%|██████▊   | 113/167 [00:04<00:02, 24.94it/s][A
Epoch 6:  93%|█████████▎| 5576/5971 [53:35<03:47,  1.73it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  69%|██████▉   | 116/167 [00:05<00:02, 25.12it/s][A

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 26.13it/s][A
Epoch 6:  93%|█████████▎| 5580/5971 [53:35<03:45,  1.74it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 26.78it/s][A
Epoch 6:  94%|█████████▎| 5584/5971 [53:35<03:42,  1.74it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 28.18it/s][A
Epoch 6:  94%|█████████▎| 5588/5971 [53:35<03:40,  1.74it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 27.63it/s][A
Epoch 6:  94%|█████████▎| 5592/5971 [53:35<03:37,  1.74it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 27.66it/s][A

Validating:  81%|████████  | 135/167 [00:05<00:01, 27.42it/s][A
Epoch 6:  94%|█████████▎| 5596/5971 [53:35<03:35,  1.74it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 27.23it/s][A
Epoch 6:  94%|█████████▍| 5600/5971 [53:35<03:33,  1.74it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  84%|████████▍ | 141/167 [00:05<00:00, 27.42it/s][A
Epoch 6:  94%|█████████▍| 5604/5971 [53:36<03:30,  1.74it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 27.54it/s][A

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 27.53it/s][A
Epoch 6:  94%|█████████▍| 5608/5971 [53:36<03:28,  1.74it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 27.37it/s][A
Epoch 6:  94%|█████████▍| 5612/5971 [53:36<03:25,  1.75it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 27.78it/s][A
Epoch 6:  94%|█████████▍| 5616/5971 [53:36<03:23,  1.75it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 28.02it/s][A
Epoch 6:  94%|█████████▍| 5620/5971 [53:36<03:20,  1.75it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 29.29it/s][A

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 28.55it/s][A
Epoch 6:  94%|█████████▍| 5624/5971 [53:36<03:18,  1.75it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 28.40it/s][A
Epoch 6:  94%|█████████▍| 5628/5971 [53:36<03:16,  1.75it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  94%|█████████▍| 5628/5971 [53:37<03:16,  1.75it/s, loss=0.123, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00107, train/loss_step=0.234, global_step=3974.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Epoch 6:  94%|█████████▍| 5629/5971 [53:38<03:15,  1.75it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0117, train/loss_vlb_step=5.17e-5, train/loss_step=0.0117, global_step=3975.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  94%|█████████▍| 5630/5971 [53:39<03:14,  1.75it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0171, train/loss_vlb_step=6.94e-5, train/loss_step=0.0171, global_step=3975.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  94%|█████████▍| 5631/5971 [53:39<03:14,  1.75it/s, loss=0.121, v_num=0, train/loss_simple_step=0.499, train/loss_vlb_step=0.00484, train/loss_step=0.499, global_step=3975.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  94%|█████████▍| 5632/5971 [53:42<03:13,  1.75it/s, loss=0.121, v_num=0, train/loss_simple_step=0.499, train/loss_vlb_step=0.00484, train/loss_step=0.499, global_step=3975.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  94%|█████████▍| 5632/5971 [53:42<03:13,  1.75it/s, loss=0.124, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.00114, train/loss_step=0.255, global_step=3975.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  94%|█████████▍| 5633/5971 [53:43<03:13,  1.75it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0342, train/loss_vlb_step=0.000122, train/loss_step=0.0342, global_step=3976.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  94%|█████████▍| 5634/5971 [53:43<03:12,  1.75it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0058, train/loss_vlb_step=2.88e-5, train/loss_step=0.0058, global_step=3976.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  94%|█████████▍| 5635/5971 [53:44<03:12,  1.75it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0408, train/loss_vlb_step=0.000143, train/loss_step=0.0408, global_step=3976.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  94%|█████████▍| 5636/5971 [53:46<03:11,  1.75it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0408, train/loss_vlb_step=0.000143, train/loss_step=0.0408, global_step=3976.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  94%|█████████▍| 5636/5971 [53:46<03:11,  1.75it/s, loss=0.132, v_num=0, train/loss_simple_step=0.515, train/loss_vlb_step=0.00462, train/loss_step=0.515, global_step=3976.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  94%|█████████▍| 5637/5971 [53:47<03:11,  1.75it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00458, train/loss_vlb_step=2.33e-5, train/loss_step=0.00458, global_step=3977.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  94%|█████████▍| 5638/5971 [53:48<03:10,  1.75it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0646, train/loss_vlb_step=0.000219, train/loss_step=0.0646, global_step=3977.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  94%|█████████▍| 5639/5971 [53:49<03:10,  1.75it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=5.3e-5, train/loss_step=0.0116, global_step=3977.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  94%|█████████▍| 5640/5971 [53:51<03:09,  1.75it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=5.3e-5, train/loss_step=0.0116, global_step=3977.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  94%|█████████▍| 5640/5971 [53:51<03:09,  1.75it/s, loss=0.133, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000401, train/loss_step=0.121, global_step=3977.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  94%|█████████▍| 5641/5971 [53:52<03:09,  1.75it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0547, train/loss_vlb_step=0.000193, train/loss_step=0.0547, global_step=3978.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  94%|█████████▍| 5642/5971 [53:53<03:08,  1.75it/s, loss=0.113, v_num=0, train/loss_simple_step=0.191, train/loss_vlb_step=0.000637, train/loss_step=0.191, global_step=3978.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  95%|█████████▍| 5643/5971 [53:54<03:07,  1.75it/s, loss=0.113, v_num=0, train/loss_simple_step=0.025, train/loss_vlb_step=0.000103, train/loss_step=0.025, global_step=3978.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▍| 5644/5971 [53:56<03:07,  1.74it/s, loss=0.113, v_num=0, train/loss_simple_step=0.025, train/loss_vlb_step=0.000103, train/loss_step=0.025, global_step=3978.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▍| 5644/5971 [53:56<03:07,  1.74it/s, loss=0.122, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.00084, train/loss_step=0.228, global_step=3978.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  95%|█████████▍| 5645/5971 [53:57<03:06,  1.74it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00394, train/loss_vlb_step=2.13e-5, train/loss_step=0.00394, global_step=3979.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▍| 5646/5971 [53:58<03:06,  1.74it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0815, train/loss_vlb_step=0.000278, train/loss_step=0.0815, global_step=3979.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▍| 5647/5971 [53:59<03:05,  1.74it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0373, train/loss_vlb_step=0.000134, train/loss_step=0.0373, global_step=3979.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▍| 5648/5971 [54:01<03:05,  1.74it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0373, train/loss_vlb_step=0.000134, train/loss_step=0.0373, global_step=3979.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▍| 5648/5971 [54:01<03:05,  1.74it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0804, train/loss_vlb_step=0.000265, train/loss_step=0.0804, global_step=3979.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▍| 5649/5971 [54:02<03:04,  1.74it/s, loss=0.12, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000444, train/loss_step=0.134, global_step=3980.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  95%|█████████▍| 5650/5971 [54:03<03:04,  1.74it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0164, train/loss_vlb_step=6.91e-5, train/loss_step=0.0164, global_step=3980.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▍| 5651/5971 [54:04<03:03,  1.74it/s, loss=0.113, v_num=0, train/loss_simple_step=0.354, train/loss_vlb_step=0.00204, train/loss_step=0.354, global_step=3980.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  95%|█████████▍| 5652/5971 [54:06<03:03,  1.74it/s, loss=0.113, v_num=0, train/loss_simple_step=0.354, train/loss_vlb_step=0.00204, train/loss_step=0.354, global_step=3980.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▍| 5652/5971 [54:06<03:03,  1.74it/s, loss=0.115, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00115, train/loss_step=0.287, global_step=3980.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▍| 5653/5971 [54:07<03:02,  1.74it/s, loss=0.121, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000563, train/loss_step=0.161, global_step=3981.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▍| 5654/5971 [54:08<03:02,  1.74it/s, loss=0.123, v_num=0, train/loss_simple_step=0.039, train/loss_vlb_step=0.000152, train/loss_step=0.039, global_step=3981.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▍| 5655/5971 [54:08<03:01,  1.74it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0923, train/loss_vlb_step=0.000304, train/loss_step=0.0923, global_step=3981.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▍| 5656/5971 [54:11<03:01,  1.74it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0923, train/loss_vlb_step=0.000304, train/loss_step=0.0923, global_step=3981.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▍| 5656/5971 [54:11<03:01,  1.74it/s, loss=0.0995, v_num=0, train/loss_simple_step=0.00251, train/loss_vlb_step=1.48e-5, train/loss_step=0.00251, global_step=3981.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▍| 5657/5971 [54:11<03:00,  1.74it/s, loss=0.0996, v_num=0, train/loss_simple_step=0.00669, train/loss_vlb_step=3.09e-5, train/loss_step=0.00669, global_step=3982.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▍| 5658/5971 [54:12<02:59,  1.74it/s, loss=0.119, v_num=0, train/loss_simple_step=0.458, train/loss_vlb_step=0.00368, train/loss_step=0.458, global_step=3982.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:  95%|█████████▍| 5659/5971 [54:13<02:59,  1.74it/s, loss=0.126, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000474, train/loss_step=0.143, global_step=3982.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▍| 5660/5971 [54:15<02:58,  1.74it/s, loss=0.126, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000474, train/loss_step=0.143, global_step=3982.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▍| 5660/5971 [54:15<02:58,  1.74it/s, loss=0.129, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000596, train/loss_step=0.174, global_step=3982.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▍| 5661/5971 [54:16<02:58,  1.74it/s, loss=0.133, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000468, train/loss_step=0.138, global_step=3983.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▍| 5662/5971 [54:17<02:57,  1.74it/s, loss=0.147, v_num=0, train/loss_simple_step=0.472, train/loss_vlb_step=0.00354, train/loss_step=0.472, global_step=3983.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  95%|█████████▍| 5663/5971 [54:18<02:57,  1.74it/s, loss=0.156, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000982, train/loss_step=0.217, global_step=3983.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▍| 5664/5971 [54:20<02:56,  1.74it/s, loss=0.156, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000982, train/loss_step=0.217, global_step=3983.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▍| 5664/5971 [54:20<02:56,  1.74it/s, loss=0.15, v_num=0, train/loss_simple_step=0.092, train/loss_vlb_step=0.000304, train/loss_step=0.092, global_step=3983.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  95%|█████████▍| 5665/5971 [54:21<02:56,  1.74it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.24e-5, train/loss_step=0.0122, global_step=3984.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▍| 5666/5971 [54:22<02:55,  1.74it/s, loss=0.161, v_num=0, train/loss_simple_step=0.309, train/loss_vlb_step=0.00126, train/loss_step=0.309, global_step=3984.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  95%|█████████▍| 5667/5971 [54:23<02:55,  1.74it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0755, train/loss_vlb_step=0.000257, train/loss_step=0.0755, global_step=3984.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▍| 5668/5971 [54:25<02:54,  1.74it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0755, train/loss_vlb_step=0.000257, train/loss_step=0.0755, global_step=3984.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▍| 5668/5971 [54:25<02:54,  1.74it/s, loss=0.16, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=6.44e-5, train/loss_step=0.017, global_step=3984.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  95%|█████████▍| 5669/5971 [54:26<02:53,  1.74it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00377, train/loss_vlb_step=1.97e-5, train/loss_step=0.00377, global_step=3985.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▍| 5670/5971 [54:27<02:53,  1.74it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0336, train/loss_vlb_step=0.000129, train/loss_step=0.0336, global_step=3985.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  95%|█████████▍| 5671/5971 [54:28<02:52,  1.74it/s, loss=0.147, v_num=0, train/loss_simple_step=0.199, train/loss_vlb_step=0.00069, train/loss_step=0.199, global_step=3985.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  95%|█████████▍| 5672/5971 [54:30<02:52,  1.73it/s, loss=0.147, v_num=0, train/loss_simple_step=0.199, train/loss_vlb_step=0.00069, train/loss_step=0.199, global_step=3985.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▍| 5672/5971 [54:30<02:52,  1.73it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0363, train/loss_vlb_step=0.00013, train/loss_step=0.0363, global_step=3985.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▌| 5673/5971 [54:31<02:51,  1.73it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00563, train/loss_vlb_step=2.79e-5, train/loss_step=0.00563, global_step=3986.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▌| 5674/5971 [54:32<02:51,  1.73it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0052, train/loss_vlb_step=2.68e-5, train/loss_step=0.0052, global_step=3986.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  95%|█████████▌| 5675/5971 [54:32<02:50,  1.73it/s, loss=0.13, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.00075, train/loss_step=0.197, global_step=3986.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  95%|█████████▌| 5676/5971 [54:35<02:50,  1.73it/s, loss=0.13, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.00075, train/loss_step=0.197, global_step=3986.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▌| 5676/5971 [54:35<02:50,  1.73it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0562, train/loss_vlb_step=0.000201, train/loss_step=0.0562, global_step=3986.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▌| 5677/5971 [54:36<02:49,  1.73it/s, loss=0.141, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.000579, train/loss_step=0.171, global_step=3987.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  95%|█████████▌| 5678/5971 [54:37<02:49,  1.73it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00547, train/loss_vlb_step=2.78e-5, train/loss_step=0.00547, global_step=3987.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▌| 5679/5971 [54:37<02:48,  1.73it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.64e-5, train/loss_step=0.0126, global_step=3987.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  95%|█████████▌| 5680/5971 [54:40<02:48,  1.73it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.64e-5, train/loss_step=0.0126, global_step=3987.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▌| 5680/5971 [54:40<02:48,  1.73it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.000124, train/loss_step=0.0325, global_step=3987.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▌| 5681/5971 [54:41<02:47,  1.73it/s, loss=0.107, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000603, train/loss_step=0.177, global_step=3988.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  95%|█████████▌| 5682/5971 [54:42<02:46,  1.73it/s, loss=0.0833, v_num=0, train/loss_simple_step=0.00644, train/loss_vlb_step=3.08e-5, train/loss_step=0.00644, global_step=3988.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▌| 5683/5971 [54:42<02:46,  1.73it/s, loss=0.0726, v_num=0, train/loss_simple_step=0.00374, train/loss_vlb_step=2.07e-5, train/loss_step=0.00374, global_step=3988.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▌| 5684/5971 [54:44<02:45,  1.73it/s, loss=0.0726, v_num=0, train/loss_simple_step=0.00374, train/loss_vlb_step=2.07e-5, train/loss_step=0.00374, global_step=3988.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▌| 5684/5971 [54:45<02:45,  1.73it/s, loss=0.0709, v_num=0, train/loss_simple_step=0.0582, train/loss_vlb_step=0.000204, train/loss_step=0.0582, global_step=3988.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  95%|█████████▌| 5685/5971 [54:45<02:45,  1.73it/s, loss=0.0707, v_num=0, train/loss_simple_step=0.00706, train/loss_vlb_step=3.37e-5, train/loss_step=0.00706, global_step=3989.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▌| 5686/5971 [54:46<02:44,  1.73it/s, loss=0.0748, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.00185, train/loss_step=0.391, global_step=3989.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  95%|█████████▌| 5687/5971 [54:47<02:44,  1.73it/s, loss=0.0833, v_num=0, train/loss_simple_step=0.245, train/loss_vlb_step=0.000895, train/loss_step=0.245, global_step=3989.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▌| 5688/5971 [54:49<02:43,  1.73it/s, loss=0.0833, v_num=0, train/loss_simple_step=0.245, train/loss_vlb_step=0.000895, train/loss_step=0.245, global_step=3989.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▌| 5688/5971 [54:49<02:43,  1.73it/s, loss=0.0829, v_num=0, train/loss_simple_step=0.00971, train/loss_vlb_step=4.14e-5, train/loss_step=0.00971, global_step=3989.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▌| 5689/5971 [54:50<02:43,  1.73it/s, loss=0.0831, v_num=0, train/loss_simple_step=0.00673, train/loss_vlb_step=3.3e-5, train/loss_step=0.00673, global_step=3990.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  95%|█████████▌| 5690/5971 [54:51<02:42,  1.73it/s, loss=0.0816, v_num=0, train/loss_simple_step=0.00514, train/loss_vlb_step=2.57e-5, train/loss_step=0.00514, global_step=3990.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▌| 5691/5971 [54:52<02:41,  1.73it/s, loss=0.083, v_num=0, train/loss_simple_step=0.226, train/loss_vlb_step=0.000799, train/loss_step=0.226, global_step=3990.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  95%|█████████▌| 5692/5971 [54:54<02:41,  1.73it/s, loss=0.083, v_num=0, train/loss_simple_step=0.226, train/loss_vlb_step=0.000799, train/loss_step=0.226, global_step=3990.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▌| 5692/5971 [54:54<02:41,  1.73it/s, loss=0.0892, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000544, train/loss_step=0.160, global_step=3990.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▌| 5693/5971 [54:55<02:40,  1.73it/s, loss=0.0893, v_num=0, train/loss_simple_step=0.0092, train/loss_vlb_step=3.94e-5, train/loss_step=0.0092, global_step=3991.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▌| 5694/5971 [54:56<02:40,  1.73it/s, loss=0.0985, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000632, train/loss_step=0.188, global_step=3991.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  95%|█████████▌| 5695/5971 [54:57<02:39,  1.73it/s, loss=0.131, v_num=0, train/loss_simple_step=0.841, train/loss_vlb_step=0.0717, train/loss_step=0.841, global_step=3991.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  95%|█████████▌| 5696/5971 [54:59<02:39,  1.73it/s, loss=0.131, v_num=0, train/loss_simple_step=0.841, train/loss_vlb_step=0.0717, train/loss_step=0.841, global_step=3991.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▌| 5696/5971 [54:59<02:39,  1.73it/s, loss=0.128, v_num=0, train/loss_simple_step=0.00158, train/loss_vlb_step=9.41e-6, train/loss_step=0.00158, global_step=3991.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▌| 5697/5971 [55:00<02:38,  1.73it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0753, train/loss_vlb_step=0.000248, train/loss_step=0.0753, global_step=3992.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  95%|█████████▌| 5698/5971 [55:01<02:38,  1.73it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0282, train/loss_vlb_step=0.000112, train/loss_step=0.0282, global_step=3992.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▌| 5699/5971 [55:01<02:37,  1.73it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0194, train/loss_vlb_step=7.73e-5, train/loss_step=0.0194, global_step=3992.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  95%|█████████▌| 5700/5971 [55:04<02:37,  1.73it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0194, train/loss_vlb_step=7.73e-5, train/loss_step=0.0194, global_step=3992.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▌| 5700/5971 [55:04<02:37,  1.73it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00348, train/loss_vlb_step=1.82e-5, train/loss_step=0.00348, global_step=3992.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  95%|█████████▌| 5701/5971 [55:05<02:36,  1.73it/s, loss=0.139, v_num=0, train/loss_simple_step=0.503, train/loss_vlb_step=0.00463, train/loss_step=0.503, global_step=3993.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  95%|█████████▌| 5702/5971 [55:06<02:35,  1.72it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0713, train/loss_vlb_step=0.000238, train/loss_step=0.0713, global_step=3993.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  96%|█████████▌| 5703/5971 [55:07<02:35,  1.72it/s, loss=0.189, v_num=0, train/loss_simple_step=0.921, train/loss_vlb_step=0.155, train/loss_step=0.921, global_step=3993.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:  96%|█████████▌| 5704/5971 [55:09<02:34,  1.72it/s, loss=0.189, v_num=0, train/loss_simple_step=0.921, train/loss_vlb_step=0.155, train/loss_step=0.921, global_step=3993.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  96%|█████████▌| 5704/5971 [55:09<02:34,  1.72it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0633, train/loss_vlb_step=0.000216, train/loss_step=0.0633, global_step=3993.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  96%|█████████▌| 5705/5971 [55:10<02:34,  1.72it/s, loss=0.189, v_num=0, train/loss_simple_step=0.00241, train/loss_vlb_step=1.45e-5, train/loss_step=0.00241, global_step=3994.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  96%|█████████▌| 5706/5971 [55:10<02:33,  1.72it/s, loss=0.19, v_num=0, train/loss_simple_step=0.417, train/loss_vlb_step=0.00187, train/loss_step=0.417, global_step=3994.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6:  96%|█████████▌| 5707/5971 [55:11<02:33,  1.72it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0354, train/loss_vlb_step=0.000123, train/loss_step=0.0354, global_step=3994.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  96%|█████████▌| 5708/5971 [55:13<02:32,  1.72it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0354, train/loss_vlb_step=0.000123, train/loss_step=0.0354, global_step=3994.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  96%|█████████▌| 5708/5971 [55:13<02:32,  1.72it/s, loss=0.179, v_num=0, train/loss_simple_step=0.00657, train/loss_vlb_step=3.23e-5, train/loss_step=0.00657, global_step=3994.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  96%|█████████▌| 5709/5971 [55:14<02:32,  1.72it/s, loss=0.206, v_num=0, train/loss_simple_step=0.539, train/loss_vlb_step=0.00288, train/loss_step=0.539, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]      
Epoch 6:  96%|█████████▌| 5710/5971 [55:15<02:31,  1.72it/s, loss=0.207, v_num=0, train/loss_simple_step=0.0323, train/loss_vlb_step=0.000122, train/loss_step=0.0323, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  96%|█████████▌| 5711/5971 [55:16<02:30,  1.72it/s, loss=0.196, v_num=0, train/loss_simple_step=0.00352, train/loss_vlb_step=1.94e-5, train/loss_step=0.00352, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  96%|█████████▌| 5712/5971 [55:18<02:30,  1.72it/s, loss=0.196, v_num=0, train/loss_simple_step=0.00352, train/loss_vlb_step=1.94e-5, train/loss_step=0.00352, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  96%|█████████▌| 5712/5971 [55:18<02:30,  1.72it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00533, train/loss_vlb_step=2.65e-5, train/loss_step=0.00533, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  96%|█████████▌| 5713/5971 [55:19<02:29,  1.72it/s, loss=0.197, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000663, train/loss_step=0.185, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  96%|█████████▌| 5714/5971 [55:20<02:29,  1.72it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0347, train/loss_vlb_step=0.000137, train/loss_step=0.0347, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  96%|█████████▌| 5715/5971 [55:21<02:28,  1.72it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00911, train/loss_vlb_step=4.45e-5, train/loss_step=0.00911, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  96%|█████████▌| 5716/5971 [55:23<02:28,  1.72it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00911, train/loss_vlb_step=4.45e-5, train/loss_step=0.00911, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  96%|█████████▌| 5716/5971 [55:23<02:28,  1.72it/s, loss=0.168, v_num=0, train/loss_simple_step=0.403, train/loss_vlb_step=0.00234, train/loss_step=0.403, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  96%|█████████▌| 5717/5971 [55:24<02:27,  1.72it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0536, train/loss_vlb_step=0.000182, train/loss_step=0.0536, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  96%|█████████▌| 5718/5971 [55:25<02:27,  1.72it/s, loss=0.176, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000883, train/loss_step=0.217, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  96%|█████████▌| 5719/5971 [55:26<02:26,  1.72it/s, loss=0.194, v_num=0, train/loss_simple_step=0.377, train/loss_vlb_step=0.00254, train/loss_step=0.377, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  96%|█████████▌| 5720/5971 [55:28<02:26,  1.72it/s, loss=0.194, v_num=0, train/loss_simple_step=0.377, train/loss_vlb_step=0.00254, train/loss_step=0.377, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  96%|█████████▌| 5720/5971 [55:28<02:26,  1.72it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0511, train/loss_vlb_step=0.000174, train/loss_step=0.0511, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  96%|█████████▌| 5721/5971 [55:29<02:25,  1.72it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00313, train/loss_vlb_step=1.7e-5, train/loss_step=0.00313, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  96%|█████████▌| 5722/5971 [55:30<02:24,  1.72it/s, loss=0.169, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000106, train/loss_step=0.026, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  96%|█████████▌| 5723/5971 [55:31<02:24,  1.72it/s, loss=0.136, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00107, train/loss_step=0.253, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  96%|█████████▌| 5724/5971 [55:33<02:23,  1.72it/s, loss=0.136, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00107, train/loss_step=0.253, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  96%|█████████▌| 5724/5971 [55:33<02:23,  1.72it/s, loss=0.163, v_num=0, train/loss_simple_step=0.605, train/loss_vlb_step=0.00895, train/loss_step=0.605, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  96%|█████████▌| 5725/5971 [55:34<02:23,  1.72it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00328, train/loss_vlb_step=1.78e-5, train/loss_step=0.00328, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  96%|█████████▌| 5725/5971 [55:47<02:23,  1.71it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00328, train/loss_vlb_step=1.78e-5, train/loss_step=0.00328, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  96%|█████████▌| 5726/5971 [56:06<02:24,  1.70it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00328, train/loss_vlb_step=1.78e-5, train/loss_step=0.00328, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  96%|█████████▌| 5726/5971 [56:06<02:24,  1.70it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00376, train/loss_vlb_step=2e-5, train/loss_step=0.00376, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  96%|█████████▌| 5727/5971 [56:07<02:23,  1.70it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00376, train/loss_vlb_step=2e-5, train/loss_step=0.00376, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  96%|█████████▌| 5727/5971 [56:07<02:23,  1.70it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0551, train/loss_vlb_step=0.000187, train/loss_step=0.0551, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  96%|█████████▌| 5728/5971 [56:09<02:22,  1.70it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0551, train/loss_vlb_step=0.000187, train/loss_step=0.0551, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  96%|█████████▌| 5728/5971 [56:09<02:22,  1.70it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:03,  2.62it/s]
Epoch 6:  96%|█████████▌| 5730/5971 [56:10<02:21,  1.70it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   1%|          | 2/167 [00:00<00:42,  3.86it/s]

Validating:   2%|▏         | 3/167 [00:00<00:31,  5.18it/s]
Epoch 6:  96%|█████████▌| 5732/5971 [56:10<02:20,  1.70it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   4%|▎         | 6/167 [00:00<00:14, 10.81it/s]
Epoch 6:  96%|█████████▌| 5735/5971 [56:10<02:18,  1.70it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   5%|▌         | 9/167 [00:00<00:10, 15.24it/s]
Epoch 6:  96%|█████████▌| 5738/5971 [56:10<02:16,  1.70it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   7%|▋         | 12/167 [00:01<00:08, 18.75it/s]
Epoch 6:  96%|█████████▌| 5741/5971 [56:10<02:15,  1.70it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:   9%|▉         | 15/167 [00:01<00:07, 21.66it/s][A
Epoch 6:  96%|█████████▌| 5744/5971 [56:11<02:13,  1.70it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  11%|█         | 18/167 [00:01<00:06, 23.59it/s][A
Epoch 6:  96%|█████████▌| 5747/5971 [56:11<02:11,  1.71it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  13%|█▎        | 21/167 [00:01<00:05, 25.28it/s][A
Epoch 6:  96%|█████████▋| 5750/5971 [56:11<02:09,  1.71it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  14%|█▍        | 24/167 [00:01<00:05, 25.47it/s][A
Epoch 6:  96%|█████████▋| 5753/5971 [56:11<02:07,  1.71it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 26.01it/s][A
Epoch 6:  96%|█████████▋| 5756/5971 [56:11<02:05,  1.71it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 23.26it/s][A
Epoch 6:  96%|█████████▋| 5759/5971 [56:11<02:04,  1.71it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 24.53it/s][A
Epoch 6:  96%|█████████▋| 5762/5971 [56:11<02:02,  1.71it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  22%|██▏       | 36/167 [00:01<00:05, 25.62it/s][A
Epoch 6:  97%|█████████▋| 5765/5971 [56:11<02:00,  1.71it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  23%|██▎       | 39/167 [00:02<00:04, 26.02it/s][A
Epoch 6:  97%|█████████▋| 5768/5971 [56:11<01:58,  1.71it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 25.71it/s][A
Epoch 6:  97%|█████████▋| 5771/5971 [56:12<01:56,  1.71it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 25.84it/s][A
Epoch 6:  97%|█████████▋| 5774/5971 [56:12<01:55,  1.71it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 26.46it/s][A
Epoch 6:  97%|█████████▋| 5777/5971 [56:12<01:53,  1.71it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  31%|███       | 51/167 [00:02<00:04, 26.65it/s][A
Epoch 6:  97%|█████████▋| 5780/5971 [56:12<01:51,  1.71it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 27.53it/s][A
Epoch 6:  97%|█████████▋| 5783/5971 [56:12<01:49,  1.72it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  34%|███▍      | 57/167 [00:02<00:03, 27.60it/s][A
Epoch 6:  97%|█████████▋| 5786/5971 [56:12<01:47,  1.72it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  36%|███▌      | 60/167 [00:02<00:03, 27.51it/s][A
Epoch 6:  97%|█████████▋| 5789/5971 [56:12<01:46,  1.72it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  38%|███▊      | 64/167 [00:02<00:03, 27.81it/s][A
Epoch 6:  97%|█████████▋| 5793/5971 [56:12<01:43,  1.72it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  40%|████      | 67/167 [00:03<00:03, 27.22it/s][A
Epoch 6:  97%|█████████▋| 5797/5971 [56:13<01:41,  1.72it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 26.78it/s][A
Epoch 6:  97%|█████████▋| 5801/5971 [56:13<01:38,  1.72it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 27.13it/s][A

Validating:  46%|████▌     | 76/167 [00:03<00:03, 25.80it/s][A
Epoch 6:  97%|█████████▋| 5805/5971 [56:13<01:36,  1.72it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 26.34it/s][A
Epoch 6:  97%|█████████▋| 5809/5971 [56:13<01:34,  1.72it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  49%|████▉     | 82/167 [00:03<00:03, 26.13it/s][A
Epoch 6:  97%|█████████▋| 5813/5971 [56:13<01:31,  1.72it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  51%|█████     | 85/167 [00:03<00:03, 24.99it/s][A

Validating:  53%|█████▎    | 88/167 [00:03<00:03, 26.13it/s][A
Epoch 6:  97%|█████████▋| 5817/5971 [56:13<01:29,  1.72it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 27.22it/s][A
Epoch 6:  97%|█████████▋| 5821/5971 [56:13<01:26,  1.73it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 28.63it/s][A
Epoch 6:  98%|█████████▊| 5825/5971 [56:14<01:24,  1.73it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 28.02it/s][A
Epoch 6:  98%|█████████▊| 5829/5971 [56:14<01:22,  1.73it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  61%|██████    | 102/167 [00:04<00:02, 28.23it/s][A
Epoch 6:  98%|█████████▊| 5833/5971 [56:14<01:19,  1.73it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 28.38it/s][A
Epoch 6:  98%|█████████▊| 5837/5971 [56:14<01:17,  1.73it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  66%|██████▌   | 110/167 [00:04<00:01, 28.97it/s][A
Epoch 6:  98%|█████████▊| 5841/5971 [56:14<01:15,  1.73it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  68%|██████▊   | 113/167 [00:04<00:01, 28.02it/s][A

Validating:  69%|██████▉   | 116/167 [00:04<00:01, 26.05it/s][A
Epoch 6:  98%|█████████▊| 5845/5971 [56:14<01:12,  1.73it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  71%|███████▏  | 119/167 [00:04<00:01, 25.74it/s][A
Epoch 6:  98%|█████████▊| 5849/5971 [56:14<01:10,  1.73it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 25.47it/s][A
Epoch 6:  98%|█████████▊| 5853/5971 [56:15<01:08,  1.73it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 25.57it/s][A

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 26.58it/s][A
Epoch 6:  98%|█████████▊| 5857/5971 [56:15<01:05,  1.74it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 25.96it/s][A
Epoch 6:  98%|█████████▊| 5861/5971 [56:15<01:03,  1.74it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  80%|████████  | 134/167 [00:05<00:01, 26.91it/s][A
Epoch 6:  98%|█████████▊| 5865/5971 [56:15<01:00,  1.74it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 26.73it/s][A

Validating:  84%|████████▍ | 140/167 [00:05<00:00, 27.12it/s][A
Epoch 6:  98%|█████████▊| 5869/5971 [56:15<00:58,  1.74it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  86%|████████▌ | 143/167 [00:05<00:00, 26.20it/s][A
Epoch 6:  98%|█████████▊| 5873/5971 [56:15<00:56,  1.74it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 25.02it/s][A
Epoch 6:  98%|█████████▊| 5877/5971 [56:16<00:53,  1.74it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 25.61it/s][A

Validating:  91%|█████████ | 152/167 [00:06<00:00, 25.12it/s][A
Epoch 6:  98%|█████████▊| 5881/5971 [56:16<00:51,  1.74it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 25.74it/s][A
Epoch 6:  99%|█████████▊| 5885/5971 [56:16<00:49,  1.74it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 24.91it/s][A
Epoch 6:  99%|█████████▊| 5889/5971 [56:16<00:47,  1.74it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 25.93it/s][A

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 26.33it/s][A
Epoch 6:  99%|█████████▊| 5893/5971 [56:16<00:44,  1.75it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Validating: 100%|██████████| 167/167 [00:06<00:00, 25.56it/s][A
Epoch 6:  99%|█████████▊| 5896/5971 [56:17<00:42,  1.75it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.42it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.26it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.79it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.28it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.66it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.95it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.16it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.31it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.42it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.47it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.55it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.61it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.64it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.63it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.65it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.65it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.68it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.65it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.68it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.63it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.65it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.66it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.69it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.70it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.71it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.71it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.71it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.71it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.71it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.71it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.71it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:02,  5.70it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.66it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.68it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.69it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.69it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.70it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.71it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.72it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.72it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:07<00:01,  5.71it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.70it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.70it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.71it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.71it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:08<00:00,  5.72it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.71it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.70it/s][A
Epoch 6:  99%|█████████▊| 5896/5971 [56:27<00:43,  1.74it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.68it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.32it/s]

Epoch 6:  99%|█████████▉| 5897/5971 [56:28<00:42,  1.74it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.44e-5, train/loss_step=0.00966, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5897/5971 [56:28<00:42,  1.74it/s, loss=0.117, v_num=0, train/loss_simple_step=0.015, train/loss_vlb_step=6.47e-5, train/loss_step=0.015, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:30,  1.61it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:17,  2.77it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:00<00:13,  3.58it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  4.16it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:09,  4.57it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.88it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.11it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:07,  5.27it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.37it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.45it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.51it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.54it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.56it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:02<00:06,  5.56it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.58it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.57it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.59it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.54it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.57it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.59it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.59it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.59it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.59it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.60it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:04<00:04,  5.61it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.62it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.63it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.62it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.62it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.61it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:05<00:03,  5.62it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.63it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.64it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.65it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.63it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.65it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.66it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.62it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.63it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.63it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.64it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:07<00:01,  5.63it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.64it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.65it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.65it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.64it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:08<00:00,  5.64it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:08<00:00,  5.66it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.66it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.66it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.35it/s]

Epoch 6:  99%|█████████▉| 5898/5971 [56:40<00:42,  1.73it/s, loss=0.117, v_num=0, train/loss_simple_step=0.015, train/loss_vlb_step=6.47e-5, train/loss_step=0.015, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5898/5971 [56:40<00:42,  1.73it/s, loss=0.136, v_num=0, train/loss_simple_step=0.414, train/loss_vlb_step=0.0029, train/loss_step=0.414, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.42it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.26it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.87it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.35it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.72it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.99it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.16it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.29it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.37it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.43it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.47it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.52it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.54it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.58it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.57it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.57it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.59it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.60it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.62it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.62it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.64it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.62it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.63it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.64it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.64it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.64it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.65it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.60it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.56it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.56it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.59it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.61it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.62it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.63it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.65it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.64it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.52it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.45it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.42it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.46it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.50it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.51it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.53it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.55it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.54it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:08<00:00,  5.55it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.58it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.60it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.60it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.25it/s]

Epoch 6:  99%|█████████▉| 5899/5971 [56:52<00:41,  1.73it/s, loss=0.136, v_num=0, train/loss_simple_step=0.414, train/loss_vlb_step=0.0029, train/loss_step=0.414, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5899/5971 [56:52<00:41,  1.73it/s, loss=0.157, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00277, train/loss_step=0.406, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.41it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.27it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.92it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.40it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.76it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.01it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.19it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.33it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.43it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.50it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.55it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.59it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.62it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.55it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.45it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.38it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.34it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.29it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.25it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.24it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.24it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.24it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.24it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.23it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.23it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.28it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.28it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.26it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.36it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.44it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.50it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.56it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.59it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.63it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.65it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.66it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.65it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.63it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.64it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.65it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.65it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.66it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.66it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.66it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.65it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.65it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.63it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.64it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.65it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.20it/s]

Epoch 6:  99%|█████████▉| 5900/5971 [57:05<00:41,  1.72it/s, loss=0.157, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00277, train/loss_step=0.406, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5900/5971 [57:05<00:41,  1.72it/s, loss=0.165, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000572, train/loss_step=0.168, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5901/5971 [57:06<00:40,  1.72it/s, loss=0.165, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000572, train/loss_step=0.168, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5901/5971 [57:06<00:40,  1.72it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0871, train/loss_vlb_step=0.000287, train/loss_step=0.0871, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5902/5971 [57:07<00:40,  1.72it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0871, train/loss_vlb_step=0.000287, train/loss_step=0.0871, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5902/5971 [57:07<00:40,  1.72it/s, loss=0.167, v_num=0, train/loss_simple_step=0.178, train/loss_vlb_step=0.000618, train/loss_step=0.178, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  99%|█████████▉| 5903/5971 [57:08<00:39,  1.72it/s, loss=0.167, v_num=0, train/loss_simple_step=0.178, train/loss_vlb_step=0.000618, train/loss_step=0.178, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5903/5971 [57:08<00:39,  1.72it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0364, train/loss_vlb_step=0.000136, train/loss_step=0.0364, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5904/5971 [57:10<00:38,  1.72it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0364, train/loss_vlb_step=0.000136, train/loss_step=0.0364, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5904/5971 [57:10<00:38,  1.72it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00441, train/loss_vlb_step=2.19e-5, train/loss_step=0.00441, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5905/5971 [57:11<00:38,  1.72it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00441, train/loss_vlb_step=2.19e-5, train/loss_step=0.00441, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5905/5971 [57:11<00:38,  1.72it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00254, train/loss_vlb_step=1.46e-5, train/loss_step=0.00254, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5906/5971 [57:12<00:37,  1.72it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00254, train/loss_vlb_step=1.46e-5, train/loss_step=0.00254, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5906/5971 [57:12<00:37,  1.72it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0333, train/loss_vlb_step=0.000122, train/loss_step=0.0333, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  99%|█████████▉| 5907/5971 [57:13<00:37,  1.72it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0333, train/loss_vlb_step=0.000122, train/loss_step=0.0333, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5907/5971 [57:13<00:37,  1.72it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00529, train/loss_vlb_step=2.59e-5, train/loss_step=0.00529, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5908/5971 [57:15<00:36,  1.72it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00529, train/loss_vlb_step=2.59e-5, train/loss_step=0.00529, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5908/5971 [57:15<00:36,  1.72it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00323, train/loss_vlb_step=1.75e-5, train/loss_step=0.00323, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5909/5971 [57:16<00:36,  1.72it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00323, train/loss_vlb_step=1.75e-5, train/loss_step=0.00323, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5909/5971 [57:16<00:36,  1.72it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00124, train/loss_vlb_step=7.47e-6, train/loss_step=0.00124, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5910/5971 [57:17<00:35,  1.72it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00124, train/loss_vlb_step=7.47e-6, train/loss_step=0.00124, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5910/5971 [57:17<00:35,  1.72it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0218, train/loss_vlb_step=8.66e-5, train/loss_step=0.0218, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  99%|█████████▉| 5911/5971 [57:17<00:34,  1.72it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0218, train/loss_vlb_step=8.66e-5, train/loss_step=0.0218, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5911/5971 [57:17<00:34,  1.72it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0382, train/loss_vlb_step=0.000138, train/loss_step=0.0382, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5912/5971 [57:20<00:34,  1.72it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0382, train/loss_vlb_step=0.000138, train/loss_step=0.0382, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5912/5971 [57:20<00:34,  1.72it/s, loss=0.0828, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000572, train/loss_step=0.170, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  99%|█████████▉| 5913/5971 [57:20<00:33,  1.72it/s, loss=0.0828, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000572, train/loss_step=0.170, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5913/5971 [57:20<00:33,  1.72it/s, loss=0.0875, v_num=0, train/loss_simple_step=0.096, train/loss_vlb_step=0.000317, train/loss_step=0.096, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5914/5971 [57:21<00:33,  1.72it/s, loss=0.0875, v_num=0, train/loss_simple_step=0.096, train/loss_vlb_step=0.000317, train/loss_step=0.096, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5914/5971 [57:21<00:33,  1.72it/s, loss=0.102, v_num=0, train/loss_simple_step=0.286, train/loss_vlb_step=0.00115, train/loss_step=0.286, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  99%|█████████▉| 5915/5971 [57:22<00:32,  1.72it/s, loss=0.102, v_num=0, train/loss_simple_step=0.286, train/loss_vlb_step=0.00115, train/loss_step=0.286, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5915/5971 [57:22<00:32,  1.72it/s, loss=0.113, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.00127, train/loss_step=0.274, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5916/5971 [57:24<00:32,  1.72it/s, loss=0.113, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.00127, train/loss_step=0.274, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5916/5971 [57:24<00:32,  1.72it/s, loss=0.119, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000493, train/loss_step=0.143, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5917/5971 [57:25<00:31,  1.72it/s, loss=0.119, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000493, train/loss_step=0.143, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5917/5971 [57:25<00:31,  1.72it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0581, train/loss_vlb_step=0.000196, train/loss_step=0.0581, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5918/5971 [57:26<00:30,  1.72it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0581, train/loss_vlb_step=0.000196, train/loss_step=0.0581, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5918/5971 [57:26<00:30,  1.72it/s, loss=0.108, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000519, train/loss_step=0.157, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  99%|█████████▉| 5919/5971 [57:27<00:30,  1.72it/s, loss=0.108, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000519, train/loss_step=0.157, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5919/5971 [57:27<00:30,  1.72it/s, loss=0.0994, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000828, train/loss_step=0.225, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5920/5971 [57:29<00:29,  1.72it/s, loss=0.0994, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000828, train/loss_step=0.225, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5920/5971 [57:29<00:29,  1.72it/s, loss=0.0922, v_num=0, train/loss_simple_step=0.0232, train/loss_vlb_step=9.82e-5, train/loss_step=0.0232, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5921/5971 [57:30<00:29,  1.72it/s, loss=0.0922, v_num=0, train/loss_simple_step=0.0232, train/loss_vlb_step=9.82e-5, train/loss_step=0.0232, global_step=4e+3, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5921/5971 [57:30<00:29,  1.72it/s, loss=0.088, v_num=0, train/loss_simple_step=0.00365, train/loss_vlb_step=1.97e-5, train/loss_step=0.00365, global_step=4006.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5922/5971 [57:31<00:28,  1.72it/s, loss=0.088, v_num=0, train/loss_simple_step=0.00365, train/loss_vlb_step=1.97e-5, train/loss_step=0.00365, global_step=4006.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5922/5971 [57:31<00:28,  1.72it/s, loss=0.0811, v_num=0, train/loss_simple_step=0.0404, train/loss_vlb_step=0.00015, train/loss_step=0.0404, global_step=4006.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  99%|█████████▉| 5923/5971 [57:32<00:27,  1.72it/s, loss=0.0811, v_num=0, train/loss_simple_step=0.0404, train/loss_vlb_step=0.00015, train/loss_step=0.0404, global_step=4006.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5923/5971 [57:32<00:27,  1.72it/s, loss=0.0843, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.00033, train/loss_step=0.100, global_step=4006.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  99%|█████████▉| 5924/5971 [57:34<00:27,  1.72it/s, loss=0.0843, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.00033, train/loss_step=0.100, global_step=4006.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5924/5971 [57:34<00:27,  1.72it/s, loss=0.0859, v_num=0, train/loss_simple_step=0.0348, train/loss_vlb_step=0.000129, train/loss_step=0.0348, global_step=4006.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5925/5971 [57:35<00:26,  1.71it/s, loss=0.0859, v_num=0, train/loss_simple_step=0.0348, train/loss_vlb_step=0.000129, train/loss_step=0.0348, global_step=4006.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5925/5971 [57:35<00:26,  1.71it/s, loss=0.0875, v_num=0, train/loss_simple_step=0.0364, train/loss_vlb_step=0.000127, train/loss_step=0.0364, global_step=4007.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5926/5971 [57:36<00:26,  1.71it/s, loss=0.0875, v_num=0, train/loss_simple_step=0.0364, train/loss_vlb_step=0.000127, train/loss_step=0.0364, global_step=4007.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5926/5971 [57:36<00:26,  1.71it/s, loss=0.0912, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.00035, train/loss_step=0.106, global_step=4007.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:  99%|█████████▉| 5927/5971 [57:37<00:25,  1.71it/s, loss=0.0912, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.00035, train/loss_step=0.106, global_step=4007.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5927/5971 [57:37<00:25,  1.71it/s, loss=0.0968, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000382, train/loss_step=0.116, global_step=4007.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5928/5971 [57:39<00:25,  1.71it/s, loss=0.0968, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000382, train/loss_step=0.116, global_step=4007.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5928/5971 [57:39<00:25,  1.71it/s, loss=0.103, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000434, train/loss_step=0.132, global_step=4007.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  99%|█████████▉| 5929/5971 [57:40<00:24,  1.71it/s, loss=0.103, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000434, train/loss_step=0.132, global_step=4007.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5929/5971 [57:40<00:24,  1.71it/s, loss=0.11, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000437, train/loss_step=0.133, global_step=4008.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6:  99%|█████████▉| 5930/5971 [57:41<00:23,  1.71it/s, loss=0.11, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000437, train/loss_step=0.133, global_step=4008.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5930/5971 [57:41<00:23,  1.71it/s, loss=0.14, v_num=0, train/loss_simple_step=0.630, train/loss_vlb_step=0.0176, train/loss_step=0.630, global_step=4008.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  99%|█████████▉| 5931/5971 [57:42<00:23,  1.71it/s, loss=0.14, v_num=0, train/loss_simple_step=0.630, train/loss_vlb_step=0.0176, train/loss_step=0.630, global_step=4008.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5931/5971 [57:42<00:23,  1.71it/s, loss=0.148, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000683, train/loss_step=0.189, global_step=4008.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5932/5971 [57:44<00:22,  1.71it/s, loss=0.148, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000683, train/loss_step=0.189, global_step=4008.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5932/5971 [57:44<00:22,  1.71it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00759, train/loss_vlb_step=3.7e-5, train/loss_step=0.00759, global_step=4008.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5933/5971 [57:45<00:22,  1.71it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00759, train/loss_vlb_step=3.7e-5, train/loss_step=0.00759, global_step=4008.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5933/5971 [57:45<00:22,  1.71it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0177, train/loss_vlb_step=7.27e-5, train/loss_step=0.0177, global_step=4009.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5934/5971 [57:46<00:21,  1.71it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0177, train/loss_vlb_step=7.27e-5, train/loss_step=0.0177, global_step=4009.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5934/5971 [57:46<00:21,  1.71it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00171, train/loss_vlb_step=9.87e-6, train/loss_step=0.00171, global_step=4009.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5935/5971 [57:46<00:21,  1.71it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00171, train/loss_vlb_step=9.87e-6, train/loss_step=0.00171, global_step=4009.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5935/5971 [57:46<00:21,  1.71it/s, loss=0.127, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00186, train/loss_step=0.375, global_step=4009.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]    
Epoch 6:  99%|█████████▉| 5936/5971 [57:49<00:20,  1.71it/s, loss=0.127, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00186, train/loss_step=0.375, global_step=4009.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5936/5971 [57:49<00:20,  1.71it/s, loss=0.155, v_num=0, train/loss_simple_step=0.712, train/loss_vlb_step=0.012, train/loss_step=0.712, global_step=4009.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  99%|█████████▉| 5937/5971 [57:49<00:19,  1.71it/s, loss=0.155, v_num=0, train/loss_simple_step=0.712, train/loss_vlb_step=0.012, train/loss_step=0.712, global_step=4009.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5937/5971 [57:49<00:19,  1.71it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.96e-5, train/loss_step=0.0142, global_step=4010.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5938/5971 [57:50<00:19,  1.71it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.96e-5, train/loss_step=0.0142, global_step=4010.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5938/5971 [57:50<00:19,  1.71it/s, loss=0.158, v_num=0, train/loss_simple_step=0.258, train/loss_vlb_step=0.00101, train/loss_step=0.258, global_step=4010.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6:  99%|█████████▉| 5939/5971 [57:51<00:18,  1.71it/s, loss=0.158, v_num=0, train/loss_simple_step=0.258, train/loss_vlb_step=0.00101, train/loss_step=0.258, global_step=4010.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5939/5971 [57:51<00:18,  1.71it/s, loss=0.164, v_num=0, train/loss_simple_step=0.350, train/loss_vlb_step=0.00194, train/loss_step=0.350, global_step=4010.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5940/5971 [57:53<00:18,  1.71it/s, loss=0.164, v_num=0, train/loss_simple_step=0.350, train/loss_vlb_step=0.00194, train/loss_step=0.350, global_step=4010.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5940/5971 [57:53<00:18,  1.71it/s, loss=0.172, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000592, train/loss_step=0.175, global_step=4010.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5941/5971 [57:54<00:17,  1.71it/s, loss=0.172, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000592, train/loss_step=0.175, global_step=4010.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6:  99%|█████████▉| 5941/5971 [57:54<00:17,  1.71it/s, loss=0.182, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000842, train/loss_step=0.217, global_step=4011.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5942/5971 [57:55<00:16,  1.71it/s, loss=0.182, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000842, train/loss_step=0.217, global_step=4011.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5942/5971 [57:55<00:16,  1.71it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0658, train/loss_vlb_step=0.000221, train/loss_step=0.0658, global_step=4011.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5943/5971 [57:56<00:16,  1.71it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0658, train/loss_vlb_step=0.000221, train/loss_step=0.0658, global_step=4011.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5943/5971 [57:56<00:16,  1.71it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0543, train/loss_vlb_step=0.000187, train/loss_step=0.0543, global_step=4011.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5944/5971 [57:58<00:15,  1.71it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0543, train/loss_vlb_step=0.000187, train/loss_step=0.0543, global_step=4011.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5944/5971 [57:58<00:15,  1.71it/s, loss=0.18, v_num=0, train/loss_simple_step=0.00976, train/loss_vlb_step=4.54e-5, train/loss_step=0.00976, global_step=4011.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5945/5971 [57:59<00:15,  1.71it/s, loss=0.18, v_num=0, train/loss_simple_step=0.00976, train/loss_vlb_step=4.54e-5, train/loss_step=0.00976, global_step=4011.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5945/5971 [57:59<00:15,  1.71it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0741, train/loss_vlb_step=0.000245, train/loss_step=0.0741, global_step=4012.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5946/5971 [58:00<00:14,  1.71it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0741, train/loss_vlb_step=0.000245, train/loss_step=0.0741, global_step=4012.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5946/5971 [58:00<00:14,  1.71it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00545, train/loss_vlb_step=2.78e-5, train/loss_step=0.00545, global_step=4012.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5947/5971 [58:01<00:14,  1.71it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00545, train/loss_vlb_step=2.78e-5, train/loss_step=0.00545, global_step=4012.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5947/5971 [58:01<00:14,  1.71it/s, loss=0.2, v_num=0, train/loss_simple_step=0.579, train/loss_vlb_step=0.010, train/loss_step=0.579, global_step=4012.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]        
Epoch 6: 100%|█████████▉| 5948/5971 [58:03<00:13,  1.71it/s, loss=0.2, v_num=0, train/loss_simple_step=0.579, train/loss_vlb_step=0.010, train/loss_step=0.579, global_step=4012.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5948/5971 [58:03<00:13,  1.71it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=5.83e-5, train/loss_step=0.0138, global_step=4012.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5949/5971 [58:04<00:12,  1.71it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=5.83e-5, train/loss_step=0.0138, global_step=4012.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5949/5971 [58:04<00:12,  1.71it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00604, train/loss_vlb_step=3.06e-5, train/loss_step=0.00604, global_step=4013.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5950/5971 [58:05<00:12,  1.71it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00604, train/loss_vlb_step=3.06e-5, train/loss_step=0.00604, global_step=4013.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5950/5971 [58:05<00:12,  1.71it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0245, train/loss_vlb_step=0.000101, train/loss_step=0.0245, global_step=4013.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6: 100%|█████████▉| 5951/5971 [58:06<00:11,  1.71it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0245, train/loss_vlb_step=0.000101, train/loss_step=0.0245, global_step=4013.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5951/5971 [58:06<00:11,  1.71it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00315, train/loss_vlb_step=1.76e-5, train/loss_step=0.00315, global_step=4013.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5952/5971 [58:08<00:11,  1.71it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00315, train/loss_vlb_step=1.76e-5, train/loss_step=0.00315, global_step=4013.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5952/5971 [58:08<00:11,  1.71it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00648, train/loss_vlb_step=3.22e-5, train/loss_step=0.00648, global_step=4013.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5953/5971 [58:09<00:10,  1.71it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00648, train/loss_vlb_step=3.22e-5, train/loss_step=0.00648, global_step=4013.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5953/5971 [58:09<00:10,  1.71it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0518, train/loss_vlb_step=0.000185, train/loss_step=0.0518, global_step=4014.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6: 100%|█████████▉| 5954/5971 [58:10<00:09,  1.71it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0518, train/loss_vlb_step=0.000185, train/loss_step=0.0518, global_step=4014.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5954/5971 [58:10<00:09,  1.71it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00589, train/loss_vlb_step=2.83e-5, train/loss_step=0.00589, global_step=4014.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5955/5971 [58:11<00:09,  1.71it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00589, train/loss_vlb_step=2.83e-5, train/loss_step=0.00589, global_step=4014.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5955/5971 [58:11<00:09,  1.71it/s, loss=0.17, v_num=0, train/loss_simple_step=0.772, train/loss_vlb_step=0.0137, train/loss_step=0.772, global_step=4014.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]     
Epoch 6: 100%|█████████▉| 5956/5971 [58:13<00:08,  1.71it/s, loss=0.17, v_num=0, train/loss_simple_step=0.772, train/loss_vlb_step=0.0137, train/loss_step=0.772, global_step=4014.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5956/5971 [58:13<00:08,  1.71it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00223, train/loss_vlb_step=1.23e-5, train/loss_step=0.00223, global_step=4014.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5957/5971 [58:14<00:08,  1.71it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00223, train/loss_vlb_step=1.23e-5, train/loss_step=0.00223, global_step=4014.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5957/5971 [58:14<00:08,  1.71it/s, loss=0.141, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000474, train/loss_step=0.143, global_step=4015.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6: 100%|█████████▉| 5958/5971 [58:15<00:07,  1.70it/s, loss=0.141, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000474, train/loss_step=0.143, global_step=4015.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5958/5971 [58:15<00:07,  1.70it/s, loss=0.142, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00121, train/loss_step=0.287, global_step=4015.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6: 100%|█████████▉| 5959/5971 [58:15<00:07,  1.70it/s, loss=0.142, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00121, train/loss_step=0.287, global_step=4015.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5959/5971 [58:15<00:07,  1.70it/s, loss=0.143, v_num=0, train/loss_simple_step=0.354, train/loss_vlb_step=0.00193, train/loss_step=0.354, global_step=4015.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5960/5971 [58:18<00:06,  1.70it/s, loss=0.143, v_num=0, train/loss_simple_step=0.354, train/loss_vlb_step=0.00193, train/loss_step=0.354, global_step=4015.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5960/5971 [58:18<00:06,  1.70it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0649, train/loss_vlb_step=0.00023, train/loss_step=0.0649, global_step=4015.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5961/5971 [58:19<00:05,  1.70it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0649, train/loss_vlb_step=0.00023, train/loss_step=0.0649, global_step=4015.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5961/5971 [58:19<00:05,  1.70it/s, loss=0.152, v_num=0, train/loss_simple_step=0.520, train/loss_vlb_step=0.00364, train/loss_step=0.520, global_step=4016.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6: 100%|█████████▉| 5962/5971 [58:19<00:05,  1.70it/s, loss=0.152, v_num=0, train/loss_simple_step=0.520, train/loss_vlb_step=0.00364, train/loss_step=0.520, global_step=4016.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5962/5971 [58:19<00:05,  1.70it/s, loss=0.176, v_num=0, train/loss_simple_step=0.550, train/loss_vlb_step=0.00533, train/loss_step=0.550, global_step=4016.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5963/5971 [58:20<00:04,  1.70it/s, loss=0.176, v_num=0, train/loss_simple_step=0.550, train/loss_vlb_step=0.00533, train/loss_step=0.550, global_step=4016.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5963/5971 [58:20<00:04,  1.70it/s, loss=0.175, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.36e-5, train/loss_step=0.018, global_step=4016.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5964/5971 [58:22<00:04,  1.70it/s, loss=0.175, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.36e-5, train/loss_step=0.018, global_step=4016.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5964/5971 [58:22<00:04,  1.70it/s, loss=0.174, v_num=0, train/loss_simple_step=0.00442, train/loss_vlb_step=2.36e-5, train/loss_step=0.00442, global_step=4016.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5965/5971 [58:23<00:03,  1.70it/s, loss=0.174, v_num=0, train/loss_simple_step=0.00442, train/loss_vlb_step=2.36e-5, train/loss_step=0.00442, global_step=4016.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5965/5971 [58:23<00:03,  1.70it/s, loss=0.171, v_num=0, train/loss_simple_step=0.00609, train/loss_vlb_step=3.01e-5, train/loss_step=0.00609, global_step=4017.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5966/5971 [58:24<00:02,  1.70it/s, loss=0.171, v_num=0, train/loss_simple_step=0.00609, train/loss_vlb_step=3.01e-5, train/loss_step=0.00609, global_step=4017.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5966/5971 [58:24<00:02,  1.70it/s, loss=0.177, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000451, train/loss_step=0.137, global_step=4017.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6: 100%|█████████▉| 5967/5971 [58:25<00:02,  1.70it/s, loss=0.177, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000451, train/loss_step=0.137, global_step=4017.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5967/5971 [58:25<00:02,  1.70it/s, loss=0.154, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000402, train/loss_step=0.120, global_step=4017.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5968/5971 [58:27<00:01,  1.70it/s, loss=0.154, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000402, train/loss_step=0.120, global_step=4017.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5968/5971 [58:27<00:01,  1.70it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00367, train/loss_vlb_step=2.02e-5, train/loss_step=0.00367, global_step=4017.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5969/5971 [58:28<00:01,  1.70it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00367, train/loss_vlb_step=2.02e-5, train/loss_step=0.00367, global_step=4017.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5969/5971 [58:28<00:01,  1.70it/s, loss=0.164, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000901, train/loss_step=0.202, global_step=4018.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6: 100%|█████████▉| 5970/5971 [58:29<00:00,  1.70it/s, loss=0.164, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000901, train/loss_step=0.202, global_step=4018.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|█████████▉| 5970/5971 [58:29<00:00,  1.70it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00676, train/loss_vlb_step=3.16e-5, train/loss_step=0.00676, global_step=4018.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|██████████| 5971/5971 [58:30<00:00,  1.70it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00676, train/loss_vlb_step=3.16e-5, train/loss_step=0.00676, global_step=4018.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|██████████| 5971/5971 [58:30<00:00,  1.70it/s, loss=0.2, v_num=0, train/loss_simple_step=0.740, train/loss_vlb_step=0.0321, train/loss_step=0.740, global_step=4018.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]       
Epoch 6: 100%|██████████| 5971/5971 [58:32<00:00,  1.70it/s, loss=0.2, v_num=0, train/loss_simple_step=0.00343, train/loss_vlb_step=1.8e-5, train/loss_step=0.00343, global_step=4018.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|██████████| 5971/5971 [58:33<00:00,  1.70it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0022, train/loss_vlb_step=1.26e-5, train/loss_step=0.0022, global_step=4019.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|██████████| 5971/5971 [58:34<00:00,  1.70it/s, loss=0.197, v_num=0, train/loss_simple_step=0.00417, train/loss_vlb_step=2.16e-5, train/loss_step=0.00417, global_step=4019.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|██████████| 5971/5971 [58:35<00:00,  1.70it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0468, train/loss_vlb_step=0.000169, train/loss_step=0.0468, global_step=4019.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6: 100%|██████████| 5971/5971 [58:37<00:00,  1.70it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00658, train/loss_vlb_step=2.93e-5, train/loss_step=0.00658, global_step=4019.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|██████████| 5971/5971 [58:38<00:00,  1.70it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0018, train/loss_vlb_step=1.04e-5, train/loss_step=0.0018, global_step=4020.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]  
Epoch 6: 100%|██████████| 5971/5971 [58:39<00:00,  1.70it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00732, train/loss_vlb_step=3.58e-5, train/loss_step=0.00732, global_step=4020.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|██████████| 5971/5971 [58:40<00:00,  1.70it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0462, train/loss_vlb_step=0.000163, train/loss_step=0.0462, global_step=4020.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|██████████| 5971/5971 [58:42<00:00,  1.70it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0332, train/loss_vlb_step=0.000124, train/loss_step=0.0332, global_step=4020.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|██████████| 5971/5971 [58:43<00:00,  1.69it/s, loss=0.0974, v_num=0, train/loss_simple_step=0.00827, train/loss_vlb_step=3.74e-5, train/loss_step=0.00827, global_step=4021.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|██████████| 5971/5971 [58:44<00:00,  1.69it/s, loss=0.0726, v_num=0, train/loss_simple_step=0.0537, train/loss_vlb_step=0.000188, train/loss_step=0.0537, global_step=4021.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 6: 100%|██████████| 5971/5971 [58:45<00:00,  1.69it/s, loss=0.0718, v_num=0, train/loss_simple_step=0.00176, train/loss_vlb_step=1.04e-5, train/loss_step=0.00176, global_step=4021.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|██████████| 5971/5971 [58:47<00:00,  1.69it/s, loss=0.0821, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.000808, train/loss_step=0.210, global_step=4021.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6: 100%|██████████| 5971/5971 [58:48<00:00,  1.69it/s, loss=0.0952, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.001, train/loss_step=0.269, global_step=4022.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6: 100%|██████████| 5971/5971 [58:48<00:00,  1.69it/s, loss=0.0908, v_num=0, train/loss_simple_step=0.0491, train/loss_vlb_step=0.000176, train/loss_step=0.0491, global_step=4022.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|██████████| 5971/5971 [58:49<00:00,  1.69it/s, loss=0.0895, v_num=0, train/loss_simple_step=0.0932, train/loss_vlb_step=0.000309, train/loss_step=0.0932, global_step=4022.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|██████████| 5971/5971 [58:52<00:00,  1.69it/s, loss=0.09, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.38e-5, train/loss_step=0.0154, global_step=4022.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6: 100%|██████████| 5971/5971 [58:52<00:00,  1.69it/s, loss=0.0856, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000381, train/loss_step=0.113, global_step=4023.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|██████████| 5971/5971 [58:53<00:00,  1.69it/s, loss=0.0878, v_num=0, train/loss_simple_step=0.0504, train/loss_vlb_step=0.000174, train/loss_step=0.0504, global_step=4023.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|██████████| 5971/5971 [58:54<00:00,  1.69it/s, loss=0.0528, v_num=0, train/loss_simple_step=0.0407, train/loss_vlb_step=0.000146, train/loss_step=0.0407, global_step=4023.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|██████████| 5971/5971 [58:56<00:00,  1.69it/s, loss=0.0527, v_num=0, train/loss_simple_step=0.00143, train/loss_vlb_step=8.69e-6, train/loss_step=0.00143, global_step=4023.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 6: 100%|██████████| 5971/5971 [58:59<00:00,  1.69it/s, loss=0.0642, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.000811, train/loss_step=0.231, global_step=4024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]   
Epoch 6:   0%|          | 0/5971 [00:00<00:00, 11305.40it/s, loss=0.0642, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.000811, train/loss_step=0.231, global_step=4024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 7:   0%|          | 0/5971 [00:00<00:02, 2894.62it/s, loss=0.0642, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.000811, train/loss_step=0.231, global_step=4024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151] 
Epoch 7:   0%|          | 1/5971 [00:01<1:37:30,  1.02it/s, loss=0.0642, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.000811, train/loss_step=0.231, global_step=4024.0, train/loss_simple_epoch=0.151, train/loss_vlb_epoch=0.0027, train/loss_epoch=0.151]
Epoch 7:   0%|          | 1/5971 [00:01<1:37:34,  1.02it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.238, train/loss_vlb_step=0.000882, train/loss_step=0.238, global_step=4025.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 2/5971 [00:02<1:34:14,  1.06it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.238, train/loss_vlb_step=0.000882, train/loss_step=0.238, global_step=4025.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 2/5971 [00:02<1:34:15,  1.06it/s, loss=0.11, v_num=0, train/loss_simple_step=0.737, train/loss_vlb_step=0.0206, train/loss_step=0.737, global_step=4025.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:   0%|          | 3/5971 [00:03<1:32:25,  1.08it/s, loss=0.11, v_num=0, train/loss_simple_step=0.737, train/loss_vlb_step=0.0206, train/loss_step=0.737, global_step=4025.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 3/5971 [00:03<1:32:27,  1.08it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0864, train/loss_vlb_step=0.000285, train/loss_step=0.0864, global_step=4025.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 4/5971 [00:05<1:55:29,  1.16s/it, loss=0.114, v_num=0, train/loss_simple_step=0.0864, train/loss_vlb_step=0.000285, train/loss_step=0.0864, global_step=4025.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 4/5971 [00:05<1:55:30,  1.16s/it, loss=0.133, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00191, train/loss_step=0.382, global_step=4025.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:   0%|          | 5/5971 [00:06<1:51:02,  1.12s/it, loss=0.133, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00191, train/loss_step=0.382, global_step=4025.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 5/5971 [00:06<1:51:03,  1.12s/it, loss=0.134, v_num=0, train/loss_simple_step=0.0131, train/loss_vlb_step=5.97e-5, train/loss_step=0.0131, global_step=4026.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 6/5971 [00:07<1:47:27,  1.08s/it, loss=0.134, v_num=0, train/loss_simple_step=0.0131, train/loss_vlb_step=5.97e-5, train/loss_step=0.0131, global_step=4026.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 6/5971 [00:07<1:47:28,  1.08s/it, loss=0.137, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000363, train/loss_step=0.109, global_step=4026.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   0%|          | 7/5971 [00:08<1:44:48,  1.05s/it, loss=0.137, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000363, train/loss_step=0.109, global_step=4026.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 7/5971 [00:08<1:44:48,  1.05s/it, loss=0.146, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.000747, train/loss_step=0.210, global_step=4026.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 8/5971 [00:10<1:56:44,  1.17s/it, loss=0.146, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.000747, train/loss_step=0.210, global_step=4026.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 8/5971 [00:10<1:56:45,  1.17s/it, loss=0.151, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000371, train/loss_step=0.113, global_step=4026.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 9/5971 [00:11<1:53:53,  1.15s/it, loss=0.151, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000371, train/loss_step=0.113, global_step=4026.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 9/5971 [00:11<1:53:54,  1.15s/it, loss=0.155, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000437, train/loss_step=0.133, global_step=4027.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 10/5971 [00:12<1:51:31,  1.12s/it, loss=0.155, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000437, train/loss_step=0.133, global_step=4027.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 10/5971 [00:12<1:51:31,  1.12s/it, loss=0.163, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000579, train/loss_step=0.160, global_step=4027.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 11/5971 [00:13<1:49:23,  1.10s/it, loss=0.163, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000579, train/loss_step=0.160, global_step=4027.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 11/5971 [00:13<1:49:24,  1.10s/it, loss=0.154, v_num=0, train/loss_simple_step=0.0368, train/loss_vlb_step=0.000129, train/loss_step=0.0368, global_step=4027.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 12/5971 [00:15<1:59:07,  1.20s/it, loss=0.154, v_num=0, train/loss_simple_step=0.0368, train/loss_vlb_step=0.000129, train/loss_step=0.0368, global_step=4027.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 12/5971 [00:15<1:59:08,  1.20s/it, loss=0.141, v_num=0, train/loss_simple_step=0.00446, train/loss_vlb_step=2.43e-5, train/loss_step=0.00446, global_step=4027.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 13/5971 [00:16<1:57:03,  1.18s/it, loss=0.141, v_num=0, train/loss_simple_step=0.00446, train/loss_vlb_step=2.43e-5, train/loss_step=0.00446, global_step=4027.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 13/5971 [00:16<1:57:03,  1.18s/it, loss=0.139, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.15e-5, train/loss_step=0.018, global_step=4028.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:   0%|          | 14/5971 [00:17<1:54:59,  1.16s/it, loss=0.139, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.15e-5, train/loss_step=0.018, global_step=4028.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 14/5971 [00:17<1:54:59,  1.16s/it, loss=0.166, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.0166, train/loss_step=0.624, global_step=4028.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   0%|          | 15/5971 [00:18<1:53:06,  1.14s/it, loss=0.166, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.0166, train/loss_step=0.624, global_step=4028.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 15/5971 [00:18<1:53:07,  1.14s/it, loss=0.167, v_num=0, train/loss_simple_step=0.0301, train/loss_vlb_step=0.000117, train/loss_step=0.0301, global_step=4028.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 16/5971 [00:20<1:59:31,  1.20s/it, loss=0.167, v_num=0, train/loss_simple_step=0.0301, train/loss_vlb_step=0.000117, train/loss_step=0.0301, global_step=4028.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 16/5971 [00:20<1:59:32,  1.20s/it, loss=0.186, v_num=0, train/loss_simple_step=0.505, train/loss_vlb_step=0.00337, train/loss_step=0.505, global_step=4028.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:   0%|          | 17/5971 [00:21<1:57:50,  1.19s/it, loss=0.186, v_num=0, train/loss_simple_step=0.505, train/loss_vlb_step=0.00337, train/loss_step=0.505, global_step=4028.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 17/5971 [00:21<1:57:50,  1.19s/it, loss=0.198, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00146, train/loss_step=0.296, global_step=4029.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 18/5971 [00:22<1:56:11,  1.17s/it, loss=0.198, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00146, train/loss_step=0.296, global_step=4029.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 18/5971 [00:22<1:56:11,  1.17s/it, loss=0.202, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000346, train/loss_step=0.105, global_step=4029.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 19/5971 [00:23<1:54:43,  1.16s/it, loss=0.202, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000346, train/loss_step=0.105, global_step=4029.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 19/5971 [00:23<1:54:43,  1.16s/it, loss=0.206, v_num=0, train/loss_simple_step=0.0894, train/loss_vlb_step=0.000294, train/loss_step=0.0894, global_step=4029.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 20/5971 [00:25<1:59:52,  1.21s/it, loss=0.206, v_num=0, train/loss_simple_step=0.0894, train/loss_vlb_step=0.000294, train/loss_step=0.0894, global_step=4029.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 20/5971 [00:25<1:59:52,  1.21s/it, loss=0.195, v_num=0, train/loss_simple_step=0.0176, train/loss_vlb_step=7.8e-5, train/loss_step=0.0176, global_step=4029.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:   0%|          | 21/5971 [00:26<1:58:24,  1.19s/it, loss=0.195, v_num=0, train/loss_simple_step=0.0176, train/loss_vlb_step=7.8e-5, train/loss_step=0.0176, global_step=4029.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 21/5971 [00:26<1:58:25,  1.19s/it, loss=0.201, v_num=0, train/loss_simple_step=0.359, train/loss_vlb_step=0.00224, train/loss_step=0.359, global_step=4030.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   0%|          | 22/5971 [00:27<1:56:57,  1.18s/it, loss=0.201, v_num=0, train/loss_simple_step=0.359, train/loss_vlb_step=0.00224, train/loss_step=0.359, global_step=4030.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 22/5971 [00:27<1:56:57,  1.18s/it, loss=0.17, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000347, train/loss_step=0.106, global_step=4030.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 23/5971 [00:28<1:55:42,  1.17s/it, loss=0.17, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000347, train/loss_step=0.106, global_step=4030.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 23/5971 [00:28<1:55:42,  1.17s/it, loss=0.17, v_num=0, train/loss_simple_step=0.0871, train/loss_vlb_step=0.000286, train/loss_step=0.0871, global_step=4030.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 24/5971 [00:30<1:59:26,  1.21s/it, loss=0.17, v_num=0, train/loss_simple_step=0.0871, train/loss_vlb_step=0.000286, train/loss_step=0.0871, global_step=4030.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 24/5971 [00:30<1:59:26,  1.21s/it, loss=0.197, v_num=0, train/loss_simple_step=0.932, train/loss_vlb_step=0.235, train/loss_step=0.932, global_step=4030.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:   0%|          | 25/5971 [00:31<1:58:15,  1.19s/it, loss=0.197, v_num=0, train/loss_simple_step=0.932, train/loss_vlb_step=0.235, train/loss_step=0.932, global_step=4030.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 25/5971 [00:31<1:58:15,  1.19s/it, loss=0.197, v_num=0, train/loss_simple_step=0.00517, train/loss_vlb_step=2.69e-5, train/loss_step=0.00517, global_step=4031.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 26/5971 [00:31<1:57:02,  1.18s/it, loss=0.197, v_num=0, train/loss_simple_step=0.00517, train/loss_vlb_step=2.69e-5, train/loss_step=0.00517, global_step=4031.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 26/5971 [00:31<1:57:02,  1.18s/it, loss=0.194, v_num=0, train/loss_simple_step=0.0561, train/loss_vlb_step=0.000198, train/loss_step=0.0561, global_step=4031.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   0%|          | 27/5971 [00:32<1:55:53,  1.17s/it, loss=0.194, v_num=0, train/loss_simple_step=0.0561, train/loss_vlb_step=0.000198, train/loss_step=0.0561, global_step=4031.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 27/5971 [00:32<1:55:53,  1.17s/it, loss=0.184, v_num=0, train/loss_simple_step=0.00435, train/loss_vlb_step=2.27e-5, train/loss_step=0.00435, global_step=4031.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 28/5971 [00:34<1:59:01,  1.20s/it, loss=0.184, v_num=0, train/loss_simple_step=0.00435, train/loss_vlb_step=2.27e-5, train/loss_step=0.00435, global_step=4031.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 28/5971 [00:34<1:59:02,  1.20s/it, loss=0.18, v_num=0, train/loss_simple_step=0.0289, train/loss_vlb_step=0.000113, train/loss_step=0.0289, global_step=4031.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:   0%|          | 29/5971 [00:35<1:58:01,  1.19s/it, loss=0.18, v_num=0, train/loss_simple_step=0.0289, train/loss_vlb_step=0.000113, train/loss_step=0.0289, global_step=4031.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   0%|          | 29/5971 [00:35<1:58:01,  1.19s/it, loss=0.181, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000546, train/loss_step=0.151, global_step=4032.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   1%|          | 30/5971 [00:36<1:56:55,  1.18s/it, loss=0.181, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000546, train/loss_step=0.151, global_step=4032.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 30/5971 [00:36<1:56:55,  1.18s/it, loss=0.178, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000349, train/loss_step=0.106, global_step=4032.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 31/5971 [00:37<1:55:56,  1.17s/it, loss=0.178, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000349, train/loss_step=0.106, global_step=4032.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 31/5971 [00:37<1:55:56,  1.17s/it, loss=0.199, v_num=0, train/loss_simple_step=0.453, train/loss_vlb_step=0.00445, train/loss_step=0.453, global_step=4032.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   1%|          | 32/5971 [00:39<1:59:54,  1.21s/it, loss=0.199, v_num=0, train/loss_simple_step=0.453, train/loss_vlb_step=0.00445, train/loss_step=0.453, global_step=4032.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 32/5971 [00:39<1:59:54,  1.21s/it, loss=0.2, v_num=0, train/loss_simple_step=0.0197, train/loss_vlb_step=8.02e-5, train/loss_step=0.0197, global_step=4032.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 33/5971 [00:40<1:58:59,  1.20s/it, loss=0.2, v_num=0, train/loss_simple_step=0.0197, train/loss_vlb_step=8.02e-5, train/loss_step=0.0197, global_step=4032.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 33/5971 [00:40<1:58:59,  1.20s/it, loss=0.223, v_num=0, train/loss_simple_step=0.486, train/loss_vlb_step=0.00471, train/loss_step=0.486, global_step=4033.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 34/5971 [00:41<1:58:05,  1.19s/it, loss=0.223, v_num=0, train/loss_simple_step=0.486, train/loss_vlb_step=0.00471, train/loss_step=0.486, global_step=4033.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 34/5971 [00:41<1:58:05,  1.19s/it, loss=0.192, v_num=0, train/loss_simple_step=0.00681, train/loss_vlb_step=3.39e-5, train/loss_step=0.00681, global_step=4033.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 35/5971 [00:42<1:57:10,  1.18s/it, loss=0.192, v_num=0, train/loss_simple_step=0.00681, train/loss_vlb_step=3.39e-5, train/loss_step=0.00681, global_step=4033.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 35/5971 [00:42<1:57:11,  1.18s/it, loss=0.191, v_num=0, train/loss_simple_step=0.0088, train/loss_vlb_step=4e-5, train/loss_step=0.0088, global_step=4033.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]     
Epoch 7:   1%|          | 36/5971 [00:44<1:59:33,  1.21s/it, loss=0.191, v_num=0, train/loss_simple_step=0.0088, train/loss_vlb_step=4e-5, train/loss_step=0.0088, global_step=4033.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 36/5971 [00:44<1:59:33,  1.21s/it, loss=0.167, v_num=0, train/loss_simple_step=0.0264, train/loss_vlb_step=0.000103, train/loss_step=0.0264, global_step=4033.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 37/5971 [00:45<1:58:46,  1.20s/it, loss=0.167, v_num=0, train/loss_simple_step=0.0264, train/loss_vlb_step=0.000103, train/loss_step=0.0264, global_step=4033.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 37/5971 [00:45<1:58:46,  1.20s/it, loss=0.168, v_num=0, train/loss_simple_step=0.307, train/loss_vlb_step=0.00139, train/loss_step=0.307, global_step=4034.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:   1%|          | 38/5971 [00:46<1:57:55,  1.19s/it, loss=0.168, v_num=0, train/loss_simple_step=0.307, train/loss_vlb_step=0.00139, train/loss_step=0.307, global_step=4034.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 38/5971 [00:46<1:57:56,  1.19s/it, loss=0.164, v_num=0, train/loss_simple_step=0.0197, train/loss_vlb_step=7.31e-5, train/loss_step=0.0197, global_step=4034.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 39/5971 [00:47<1:57:08,  1.18s/it, loss=0.164, v_num=0, train/loss_simple_step=0.0197, train/loss_vlb_step=7.31e-5, train/loss_step=0.0197, global_step=4034.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 39/5971 [00:47<1:57:08,  1.18s/it, loss=0.169, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000739, train/loss_step=0.192, global_step=4034.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   1%|          | 40/5971 [00:49<2:00:07,  1.22s/it, loss=0.169, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000739, train/loss_step=0.192, global_step=4034.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 40/5971 [00:49<2:00:07,  1.22s/it, loss=0.17, v_num=0, train/loss_simple_step=0.0393, train/loss_vlb_step=0.000143, train/loss_step=0.0393, global_step=4034.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 41/5971 [00:50<1:59:20,  1.21s/it, loss=0.17, v_num=0, train/loss_simple_step=0.0393, train/loss_vlb_step=0.000143, train/loss_step=0.0393, global_step=4034.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 41/5971 [00:50<1:59:20,  1.21s/it, loss=0.152, v_num=0, train/loss_simple_step=0.00119, train/loss_vlb_step=7.18e-6, train/loss_step=0.00119, global_step=4035.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 42/5971 [00:51<1:58:33,  1.20s/it, loss=0.152, v_num=0, train/loss_simple_step=0.00119, train/loss_vlb_step=7.18e-6, train/loss_step=0.00119, global_step=4035.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 42/5971 [00:51<1:58:33,  1.20s/it, loss=0.152, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000351, train/loss_step=0.107, global_step=4035.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:   1%|          | 43/5971 [00:52<1:57:48,  1.19s/it, loss=0.152, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000351, train/loss_step=0.107, global_step=4035.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 43/5971 [00:52<1:57:48,  1.19s/it, loss=0.175, v_num=0, train/loss_simple_step=0.553, train/loss_vlb_step=0.0101, train/loss_step=0.553, global_step=4035.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:   1%|          | 44/5971 [00:54<1:59:55,  1.21s/it, loss=0.175, v_num=0, train/loss_simple_step=0.553, train/loss_vlb_step=0.0101, train/loss_step=0.553, global_step=4035.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 44/5971 [00:54<1:59:55,  1.21s/it, loss=0.135, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000478, train/loss_step=0.139, global_step=4035.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 45/5971 [00:55<1:59:12,  1.21s/it, loss=0.135, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000478, train/loss_step=0.139, global_step=4035.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 45/5971 [00:55<1:59:13,  1.21s/it, loss=0.137, v_num=0, train/loss_simple_step=0.0448, train/loss_vlb_step=0.000162, train/loss_step=0.0448, global_step=4036.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 46/5971 [00:56<1:58:29,  1.20s/it, loss=0.137, v_num=0, train/loss_simple_step=0.0448, train/loss_vlb_step=0.000162, train/loss_step=0.0448, global_step=4036.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 46/5971 [00:56<1:58:29,  1.20s/it, loss=0.136, v_num=0, train/loss_simple_step=0.0281, train/loss_vlb_step=0.000106, train/loss_step=0.0281, global_step=4036.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 47/5971 [00:57<1:57:46,  1.19s/it, loss=0.136, v_num=0, train/loss_simple_step=0.0281, train/loss_vlb_step=0.000106, train/loss_step=0.0281, global_step=4036.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 47/5971 [00:57<1:57:46,  1.19s/it, loss=0.136, v_num=0, train/loss_simple_step=0.0014, train/loss_vlb_step=8.42e-6, train/loss_step=0.0014, global_step=4036.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   1%|          | 48/5971 [00:59<1:59:44,  1.21s/it, loss=0.136, v_num=0, train/loss_simple_step=0.0014, train/loss_vlb_step=8.42e-6, train/loss_step=0.0014, global_step=4036.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 48/5971 [00:59<1:59:44,  1.21s/it, loss=0.152, v_num=0, train/loss_simple_step=0.343, train/loss_vlb_step=0.00158, train/loss_step=0.343, global_step=4036.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:   1%|          | 49/5971 [01:00<1:59:07,  1.21s/it, loss=0.152, v_num=0, train/loss_simple_step=0.343, train/loss_vlb_step=0.00158, train/loss_step=0.343, global_step=4036.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 49/5971 [01:00<1:59:07,  1.21s/it, loss=0.144, v_num=0, train/loss_simple_step=0.00531, train/loss_vlb_step=2.63e-5, train/loss_step=0.00531, global_step=4037.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 50/5971 [01:01<1:58:25,  1.20s/it, loss=0.144, v_num=0, train/loss_simple_step=0.00531, train/loss_vlb_step=2.63e-5, train/loss_step=0.00531, global_step=4037.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 50/5971 [01:01<1:58:25,  1.20s/it, loss=0.14, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.66e-5, train/loss_step=0.0128, global_step=4037.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:   1%|          | 51/5971 [01:02<1:57:47,  1.19s/it, loss=0.14, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.66e-5, train/loss_step=0.0128, global_step=4037.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 51/5971 [01:02<1:57:47,  1.19s/it, loss=0.12, v_num=0, train/loss_simple_step=0.0633, train/loss_vlb_step=0.000213, train/loss_step=0.0633, global_step=4037.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 52/5971 [01:04<1:59:35,  1.21s/it, loss=0.12, v_num=0, train/loss_simple_step=0.0633, train/loss_vlb_step=0.000213, train/loss_step=0.0633, global_step=4037.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 52/5971 [01:04<1:59:35,  1.21s/it, loss=0.122, v_num=0, train/loss_simple_step=0.0531, train/loss_vlb_step=0.00018, train/loss_step=0.0531, global_step=4037.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 53/5971 [01:05<1:59:00,  1.21s/it, loss=0.122, v_num=0, train/loss_simple_step=0.0531, train/loss_vlb_step=0.00018, train/loss_step=0.0531, global_step=4037.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 53/5971 [01:05<1:59:00,  1.21s/it, loss=0.122, v_num=0, train/loss_simple_step=0.479, train/loss_vlb_step=0.0032, train/loss_step=0.479, global_step=4038.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:   1%|          | 54/5971 [01:06<1:58:23,  1.20s/it, loss=0.122, v_num=0, train/loss_simple_step=0.479, train/loss_vlb_step=0.0032, train/loss_step=0.479, global_step=4038.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 54/5971 [01:06<1:58:23,  1.20s/it, loss=0.122, v_num=0, train/loss_simple_step=0.022, train/loss_vlb_step=8.95e-5, train/loss_step=0.022, global_step=4038.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 55/5971 [01:06<1:57:48,  1.19s/it, loss=0.122, v_num=0, train/loss_simple_step=0.022, train/loss_vlb_step=8.95e-5, train/loss_step=0.022, global_step=4038.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 55/5971 [01:06<1:57:48,  1.19s/it, loss=0.133, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000747, train/loss_step=0.222, global_step=4038.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 56/5971 [01:09<1:59:25,  1.21s/it, loss=0.133, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000747, train/loss_step=0.222, global_step=4038.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 56/5971 [01:09<1:59:25,  1.21s/it, loss=0.145, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.00109, train/loss_step=0.268, global_step=4038.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   1%|          | 57/5971 [01:09<1:58:52,  1.21s/it, loss=0.145, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.00109, train/loss_step=0.268, global_step=4038.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 57/5971 [01:09<1:58:52,  1.21s/it, loss=0.13, v_num=0, train/loss_simple_step=0.00232, train/loss_vlb_step=1.31e-5, train/loss_step=0.00232, global_step=4039.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 58/5971 [01:10<1:58:19,  1.20s/it, loss=0.13, v_num=0, train/loss_simple_step=0.00232, train/loss_vlb_step=1.31e-5, train/loss_step=0.00232, global_step=4039.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 58/5971 [01:10<1:58:19,  1.20s/it, loss=0.151, v_num=0, train/loss_simple_step=0.453, train/loss_vlb_step=0.00289, train/loss_step=0.453, global_step=4039.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:   1%|          | 59/5971 [01:11<1:57:45,  1.20s/it, loss=0.151, v_num=0, train/loss_simple_step=0.453, train/loss_vlb_step=0.00289, train/loss_step=0.453, global_step=4039.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 59/5971 [01:11<1:57:45,  1.20s/it, loss=0.142, v_num=0, train/loss_simple_step=0.00313, train/loss_vlb_step=1.51e-5, train/loss_step=0.00313, global_step=4039.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 60/5971 [01:14<1:59:52,  1.22s/it, loss=0.142, v_num=0, train/loss_simple_step=0.00313, train/loss_vlb_step=1.51e-5, train/loss_step=0.00313, global_step=4039.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 60/5971 [01:14<1:59:52,  1.22s/it, loss=0.159, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00195, train/loss_step=0.375, global_step=4039.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:   1%|          | 61/5971 [01:15<1:59:22,  1.21s/it, loss=0.159, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00195, train/loss_step=0.375, global_step=4039.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 61/5971 [01:15<1:59:22,  1.21s/it, loss=0.178, v_num=0, train/loss_simple_step=0.394, train/loss_vlb_step=0.00206, train/loss_step=0.394, global_step=4040.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 62/5971 [01:16<1:58:52,  1.21s/it, loss=0.178, v_num=0, train/loss_simple_step=0.394, train/loss_vlb_step=0.00206, train/loss_step=0.394, global_step=4040.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 62/5971 [01:16<1:58:52,  1.21s/it, loss=0.178, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000344, train/loss_step=0.102, global_step=4040.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 63/5971 [01:16<1:58:23,  1.20s/it, loss=0.178, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000344, train/loss_step=0.102, global_step=4040.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 63/5971 [01:16<1:58:23,  1.20s/it, loss=0.153, v_num=0, train/loss_simple_step=0.0391, train/loss_vlb_step=0.000141, train/loss_step=0.0391, global_step=4040.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 64/5971 [01:19<1:59:41,  1.22s/it, loss=0.153, v_num=0, train/loss_simple_step=0.0391, train/loss_vlb_step=0.000141, train/loss_step=0.0391, global_step=4040.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 64/5971 [01:19<1:59:41,  1.22s/it, loss=0.146, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.68e-5, train/loss_step=0.016, global_step=4040.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:   1%|          | 65/5971 [01:19<1:59:11,  1.21s/it, loss=0.146, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.68e-5, train/loss_step=0.016, global_step=4040.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 65/5971 [01:19<1:59:11,  1.21s/it, loss=0.152, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000536, train/loss_step=0.151, global_step=4041.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 66/5971 [01:20<1:58:40,  1.21s/it, loss=0.152, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000536, train/loss_step=0.151, global_step=4041.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 66/5971 [01:20<1:58:41,  1.21s/it, loss=0.16, v_num=0, train/loss_simple_step=0.198, train/loss_vlb_step=0.000701, train/loss_step=0.198, global_step=4041.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   1%|          | 67/5971 [01:21<1:58:13,  1.20s/it, loss=0.16, v_num=0, train/loss_simple_step=0.198, train/loss_vlb_step=0.000701, train/loss_step=0.198, global_step=4041.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 67/5971 [01:21<1:58:13,  1.20s/it, loss=0.166, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.00039, train/loss_step=0.119, global_step=4041.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 68/5971 [01:23<1:59:34,  1.22s/it, loss=0.166, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.00039, train/loss_step=0.119, global_step=4041.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 68/5971 [01:23<1:59:34,  1.22s/it, loss=0.15, v_num=0, train/loss_simple_step=0.0234, train/loss_vlb_step=9.67e-5, train/loss_step=0.0234, global_step=4041.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 69/5971 [01:24<1:59:07,  1.21s/it, loss=0.15, v_num=0, train/loss_simple_step=0.0234, train/loss_vlb_step=9.67e-5, train/loss_step=0.0234, global_step=4041.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 69/5971 [01:24<1:59:07,  1.21s/it, loss=0.162, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.000872, train/loss_step=0.243, global_step=4042.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 70/5971 [01:25<1:58:38,  1.21s/it, loss=0.162, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.000872, train/loss_step=0.243, global_step=4042.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 70/5971 [01:25<1:58:38,  1.21s/it, loss=0.173, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000897, train/loss_step=0.242, global_step=4042.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 71/5971 [01:26<1:58:09,  1.20s/it, loss=0.173, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000897, train/loss_step=0.242, global_step=4042.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 71/5971 [01:26<1:58:09,  1.20s/it, loss=0.178, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000527, train/loss_step=0.158, global_step=4042.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 72/5971 [01:28<1:59:20,  1.21s/it, loss=0.178, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000527, train/loss_step=0.158, global_step=4042.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 72/5971 [01:28<1:59:20,  1.21s/it, loss=0.176, v_num=0, train/loss_simple_step=0.00424, train/loss_vlb_step=2.22e-5, train/loss_step=0.00424, global_step=4042.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 73/5971 [01:29<1:58:52,  1.21s/it, loss=0.176, v_num=0, train/loss_simple_step=0.00424, train/loss_vlb_step=2.22e-5, train/loss_step=0.00424, global_step=4042.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 73/5971 [01:29<1:58:52,  1.21s/it, loss=0.152, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.2e-5, train/loss_step=0.0147, global_step=4043.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:   1%|          | 74/5971 [01:30<1:58:24,  1.20s/it, loss=0.152, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.2e-5, train/loss_step=0.0147, global_step=4043.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|          | 74/5971 [01:30<1:58:24,  1.20s/it, loss=0.157, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000353, train/loss_step=0.104, global_step=4043.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|▏         | 75/5971 [01:31<1:57:56,  1.20s/it, loss=0.157, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000353, train/loss_step=0.104, global_step=4043.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|▏         | 75/5971 [01:31<1:57:57,  1.20s/it, loss=0.166, v_num=0, train/loss_simple_step=0.412, train/loss_vlb_step=0.00526, train/loss_step=0.412, global_step=4043.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   1%|▏         | 76/5971 [01:33<1:59:03,  1.21s/it, loss=0.166, v_num=0, train/loss_simple_step=0.412, train/loss_vlb_step=0.00526, train/loss_step=0.412, global_step=4043.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|▏         | 76/5971 [01:33<1:59:03,  1.21s/it, loss=0.156, v_num=0, train/loss_simple_step=0.0658, train/loss_vlb_step=0.00022, train/loss_step=0.0658, global_step=4043.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|▏         | 77/5971 [01:34<1:58:37,  1.21s/it, loss=0.156, v_num=0, train/loss_simple_step=0.0658, train/loss_vlb_step=0.00022, train/loss_step=0.0658, global_step=4043.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|▏         | 77/5971 [01:34<1:58:37,  1.21s/it, loss=0.156, v_num=0, train/loss_simple_step=0.00601, train/loss_vlb_step=2.92e-5, train/loss_step=0.00601, global_step=4044.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|▏         | 78/5971 [01:35<1:58:10,  1.20s/it, loss=0.156, v_num=0, train/loss_simple_step=0.00601, train/loss_vlb_step=2.92e-5, train/loss_step=0.00601, global_step=4044.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|▏         | 78/5971 [01:35<1:58:10,  1.20s/it, loss=0.134, v_num=0, train/loss_simple_step=0.00317, train/loss_vlb_step=1.77e-5, train/loss_step=0.00317, global_step=4044.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|▏         | 79/5971 [01:35<1:57:43,  1.20s/it, loss=0.134, v_num=0, train/loss_simple_step=0.00317, train/loss_vlb_step=1.77e-5, train/loss_step=0.00317, global_step=4044.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|▏         | 79/5971 [01:35<1:57:44,  1.20s/it, loss=0.134, v_num=0, train/loss_simple_step=0.0148, train/loss_vlb_step=6.65e-5, train/loss_step=0.0148, global_step=4044.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:   1%|▏         | 80/5971 [01:38<1:59:19,  1.22s/it, loss=0.134, v_num=0, train/loss_simple_step=0.0148, train/loss_vlb_step=6.65e-5, train/loss_step=0.0148, global_step=4044.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|▏         | 80/5971 [01:38<1:59:19,  1.22s/it, loss=0.128, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000968, train/loss_step=0.255, global_step=4044.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   1%|▏         | 81/5971 [01:39<1:58:55,  1.21s/it, loss=0.128, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.000968, train/loss_step=0.255, global_step=4044.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|▏         | 81/5971 [01:39<1:58:55,  1.21s/it, loss=0.121, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.00116, train/loss_step=0.249, global_step=4045.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   1%|▏         | 82/5971 [01:40<1:58:30,  1.21s/it, loss=0.121, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.00116, train/loss_step=0.249, global_step=4045.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|▏         | 82/5971 [01:40<1:58:31,  1.21s/it, loss=0.118, v_num=0, train/loss_simple_step=0.0456, train/loss_vlb_step=0.000159, train/loss_step=0.0456, global_step=4045.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|▏         | 83/5971 [01:41<1:58:05,  1.20s/it, loss=0.118, v_num=0, train/loss_simple_step=0.0456, train/loss_vlb_step=0.000159, train/loss_step=0.0456, global_step=4045.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|▏         | 83/5971 [01:41<1:58:05,  1.20s/it, loss=0.123, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000474, train/loss_step=0.143, global_step=4045.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:   1%|▏         | 84/5971 [01:43<1:59:06,  1.21s/it, loss=0.123, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000474, train/loss_step=0.143, global_step=4045.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|▏         | 84/5971 [01:43<1:59:06,  1.21s/it, loss=0.144, v_num=0, train/loss_simple_step=0.428, train/loss_vlb_step=0.00545, train/loss_step=0.428, global_step=4045.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   1%|▏         | 85/5971 [01:44<1:58:47,  1.21s/it, loss=0.144, v_num=0, train/loss_simple_step=0.428, train/loss_vlb_step=0.00545, train/loss_step=0.428, global_step=4045.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|▏         | 85/5971 [01:44<1:58:48,  1.21s/it, loss=0.144, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000511, train/loss_step=0.152, global_step=4046.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|▏         | 86/5971 [01:45<1:58:23,  1.21s/it, loss=0.144, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000511, train/loss_step=0.152, global_step=4046.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|▏         | 86/5971 [01:45<1:58:24,  1.21s/it, loss=0.151, v_num=0, train/loss_simple_step=0.339, train/loss_vlb_step=0.00164, train/loss_step=0.339, global_step=4046.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   1%|▏         | 87/5971 [01:45<1:58:00,  1.20s/it, loss=0.151, v_num=0, train/loss_simple_step=0.339, train/loss_vlb_step=0.00164, train/loss_step=0.339, global_step=4046.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|▏         | 87/5971 [01:45<1:58:00,  1.20s/it, loss=0.146, v_num=0, train/loss_simple_step=0.00733, train/loss_vlb_step=3.47e-5, train/loss_step=0.00733, global_step=4046.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|▏         | 88/5971 [01:48<1:59:14,  1.22s/it, loss=0.146, v_num=0, train/loss_simple_step=0.00733, train/loss_vlb_step=3.47e-5, train/loss_step=0.00733, global_step=4046.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|▏         | 88/5971 [01:48<1:59:14,  1.22s/it, loss=0.147, v_num=0, train/loss_simple_step=0.0528, train/loss_vlb_step=0.000184, train/loss_step=0.0528, global_step=4046.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   1%|▏         | 89/5971 [01:49<1:58:51,  1.21s/it, loss=0.147, v_num=0, train/loss_simple_step=0.0528, train/loss_vlb_step=0.000184, train/loss_step=0.0528, global_step=4046.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   1%|▏         | 89/5971 [01:49<1:58:51,  1.21s/it, loss=0.14, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000349, train/loss_step=0.106, global_step=4047.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:   2%|▏         | 90/5971 [01:49<1:58:28,  1.21s/it, loss=0.14, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000349, train/loss_step=0.106, global_step=4047.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   2%|▏         | 90/5971 [01:50<1:58:28,  1.21s/it, loss=0.129, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=5.73e-5, train/loss_step=0.0143, global_step=4047.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   2%|▏         | 91/5971 [01:50<1:58:06,  1.21s/it, loss=0.129, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=5.73e-5, train/loss_step=0.0143, global_step=4047.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   2%|▏         | 91/5971 [01:50<1:58:06,  1.21s/it, loss=0.123, v_num=0, train/loss_simple_step=0.0419, train/loss_vlb_step=0.000153, train/loss_step=0.0419, global_step=4047.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   2%|▏         | 92/5971 [01:53<1:59:14,  1.22s/it, loss=0.123, v_num=0, train/loss_simple_step=0.0419, train/loss_vlb_step=0.000153, train/loss_step=0.0419, global_step=4047.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   2%|▏         | 92/5971 [01:53<1:59:14,  1.22s/it, loss=0.123, v_num=0, train/loss_simple_step=0.00286, train/loss_vlb_step=1.57e-5, train/loss_step=0.00286, global_step=4047.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   2%|▏         | 93/5971 [01:54<1:58:52,  1.21s/it, loss=0.123, v_num=0, train/loss_simple_step=0.00286, train/loss_vlb_step=1.57e-5, train/loss_step=0.00286, global_step=4047.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   2%|▏         | 93/5971 [01:54<1:58:52,  1.21s/it, loss=0.128, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000361, train/loss_step=0.108, global_step=4048.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:   2%|▏         | 94/5971 [01:54<1:58:29,  1.21s/it, loss=0.128, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000361, train/loss_step=0.108, global_step=4048.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   2%|▏         | 94/5971 [01:54<1:58:29,  1.21s/it, loss=0.141, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00323, train/loss_step=0.365, global_step=4048.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   2%|▏         | 95/5971 [01:55<1:58:08,  1.21s/it, loss=0.141, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.00323, train/loss_step=0.365, global_step=4048.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   2%|▏         | 95/5971 [01:55<1:58:08,  1.21s/it, loss=0.144, v_num=0, train/loss_simple_step=0.471, train/loss_vlb_step=0.00625, train/loss_step=0.471, global_step=4048.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   2%|▏         | 96/5971 [01:58<1:59:07,  1.22s/it, loss=0.144, v_num=0, train/loss_simple_step=0.471, train/loss_vlb_step=0.00625, train/loss_step=0.471, global_step=4048.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   2%|▏         | 96/5971 [01:58<1:59:07,  1.22s/it, loss=0.143, v_num=0, train/loss_simple_step=0.0609, train/loss_vlb_step=0.00021, train/loss_step=0.0609, global_step=4048.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   2%|▏         | 97/5971 [01:58<1:58:46,  1.21s/it, loss=0.143, v_num=0, train/loss_simple_step=0.0609, train/loss_vlb_step=0.00021, train/loss_step=0.0609, global_step=4048.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   2%|▏         | 97/5971 [01:58<1:58:46,  1.21s/it, loss=0.18, v_num=0, train/loss_simple_step=0.743, train/loss_vlb_step=0.0351, train/loss_step=0.743, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:   2%|▏         | 98/5971 [01:59<1:58:24,  1.21s/it, loss=0.18, v_num=0, train/loss_simple_step=0.743, train/loss_vlb_step=0.0351, train/loss_step=0.743, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   2%|▏         | 98/5971 [01:59<1:58:24,  1.21s/it, loss=0.181, v_num=0, train/loss_simple_step=0.0252, train/loss_vlb_step=9.89e-5, train/loss_step=0.0252, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   2%|▏         | 99/5971 [02:00<1:58:03,  1.21s/it, loss=0.181, v_num=0, train/loss_simple_step=0.0252, train/loss_vlb_step=9.89e-5, train/loss_step=0.0252, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   2%|▏         | 99/5971 [02:00<1:58:03,  1.21s/it, loss=0.196, v_num=0, train/loss_simple_step=0.309, train/loss_vlb_step=0.00145, train/loss_step=0.309, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:   2%|▏         | 100/5971 [02:02<1:58:55,  1.22s/it, loss=0.196, v_num=0, train/loss_simple_step=0.309, train/loss_vlb_step=0.00145, train/loss_step=0.309, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   2%|▏         | 100/5971 [02:02<1:58:55,  1.22s/it, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:21,  2.04it/s][A
Epoch 7:   2%|▏         | 102/5971 [02:03<1:57:04,  1.20s/it, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   1%|          | 2/167 [00:00<00:50,  3.25it/s][A
Epoch 7:   2%|▏         | 104/5971 [02:03<1:55:00,  1.18s/it, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   3%|▎         | 5/167 [00:00<00:19,  8.29it/s][A
Epoch 7:   2%|▏         | 107/5971 [02:03<1:51:53,  1.14s/it, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   5%|▍         | 8/167 [00:00<00:12, 12.49it/s][A
Epoch 7:   2%|▏         | 110/5971 [02:03<1:48:54,  1.11s/it, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   7%|▋         | 11/167 [00:01<00:09, 16.40it/s][A
Epoch 7:   2%|▏         | 113/5971 [02:03<1:46:04,  1.09s/it, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.57it/s][A
Epoch 7:   2%|▏         | 116/5971 [02:03<1:43:23,  1.06s/it, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  10%|█         | 17/167 [00:01<00:07, 20.31it/s][A
Epoch 7:   2%|▏         | 119/5971 [02:04<1:40:52,  1.03s/it, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 21.70it/s][A
Epoch 7:   2%|▏         | 122/5971 [02:04<1:38:27,  1.01s/it, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 22.88it/s][A
Epoch 7:   2%|▏         | 125/5971 [02:04<1:36:08,  1.01it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.70it/s][A
Epoch 7:   2%|▏         | 128/5971 [02:04<1:33:56,  1.04it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 25.13it/s][A
Epoch 7:   2%|▏         | 131/5971 [02:04<1:31:51,  1.06it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 25.21it/s][A
Epoch 7:   2%|▏         | 134/5971 [02:04<1:29:50,  1.08it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  21%|██        | 35/167 [00:01<00:04, 26.42it/s][A
Epoch 7:   2%|▏         | 137/5971 [02:04<1:27:54,  1.11it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  23%|██▎       | 38/167 [00:02<00:04, 26.97it/s][A
Epoch 7:   2%|▏         | 140/5971 [02:04<1:26:04,  1.13it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 26.70it/s][A
Epoch 7:   2%|▏         | 143/5971 [02:04<1:24:18,  1.15it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 27.56it/s][A
Epoch 7:   2%|▏         | 146/5971 [02:05<1:22:37,  1.18it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 28.10it/s][A
Epoch 7:   3%|▎         | 150/5971 [02:05<1:20:27,  1.21it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  31%|███       | 51/167 [00:02<00:04, 28.68it/s][A
Epoch 7:   3%|▎         | 154/5971 [02:05<1:18:24,  1.24it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  32%|███▏      | 54/167 [00:02<00:03, 28.57it/s][A

Validating:  34%|███▍      | 57/167 [00:02<00:03, 28.91it/s][A
Epoch 7:   3%|▎         | 158/5971 [02:05<1:16:28,  1.27it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  36%|███▌      | 60/167 [00:02<00:03, 28.20it/s][A
Epoch 7:   3%|▎         | 162/5971 [02:05<1:14:37,  1.30it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  38%|███▊      | 63/167 [00:02<00:03, 26.68it/s][A
Epoch 7:   3%|▎         | 166/5971 [02:05<1:12:54,  1.33it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  40%|███▉      | 66/167 [00:03<00:03, 25.26it/s][A

Validating:  41%|████▏     | 69/167 [00:03<00:03, 26.06it/s][A
Epoch 7:   3%|▎         | 170/5971 [02:05<1:11:13,  1.36it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 25.92it/s][A
Epoch 7:   3%|▎         | 174/5971 [02:06<1:09:38,  1.39it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 26.86it/s][A
Epoch 7:   3%|▎         | 178/5971 [02:06<1:08:06,  1.42it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 26.71it/s][A

Validating:  49%|████▊     | 81/167 [00:03<00:03, 27.12it/s][A
Epoch 7:   3%|▎         | 182/5971 [02:06<1:06:39,  1.45it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  50%|█████     | 84/167 [00:03<00:03, 27.37it/s][A
Epoch 7:   3%|▎         | 186/5971 [02:06<1:05:15,  1.48it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  52%|█████▏    | 87/167 [00:03<00:02, 27.58it/s][A
Epoch 7:   3%|▎         | 190/5971 [02:06<1:03:55,  1.51it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  54%|█████▍    | 90/167 [00:03<00:02, 27.40it/s][A

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 26.96it/s][A
Epoch 7:   3%|▎         | 194/5971 [02:06<1:02:38,  1.54it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 27.47it/s][A
Epoch 7:   3%|▎         | 198/5971 [02:07<1:01:24,  1.57it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 27.74it/s][A
Epoch 7:   3%|▎         | 202/5971 [02:07<1:00:13,  1.60it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  61%|██████    | 102/167 [00:04<00:02, 27.63it/s][A

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 27.89it/s][A
Epoch 7:   3%|▎         | 206/5971 [02:07<59:05,  1.63it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 27.83it/s][A
Epoch 7:   4%|▎         | 210/5971 [02:07<57:59,  1.66it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 27.15it/s][A
Epoch 7:   4%|▎         | 214/5971 [02:07<56:56,  1.69it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  68%|██████▊   | 114/167 [00:04<00:01, 27.48it/s][A

Validating:  70%|███████   | 117/167 [00:04<00:01, 26.12it/s][A
Epoch 7:   4%|▎         | 218/5971 [02:07<55:55,  1.71it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 26.23it/s][A
Epoch 7:   4%|▎         | 222/5971 [02:07<54:57,  1.74it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 26.23it/s][A
Epoch 7:   4%|▍         | 226/5971 [02:08<54:00,  1.77it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 26.43it/s][A

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 25.67it/s][A
Epoch 7:   4%|▍         | 230/5971 [02:08<53:06,  1.80it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 25.94it/s][A
Epoch 7:   4%|▍         | 234/5971 [02:08<52:13,  1.83it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 27.43it/s][A
Epoch 7:   4%|▍         | 238/5971 [02:08<51:22,  1.86it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 27.26it/s][A
Epoch 7:   4%|▍         | 242/5971 [02:08<50:32,  1.89it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  86%|████████▌ | 143/167 [00:05<00:00, 27.24it/s][A
Epoch 7:   4%|▍         | 246/5971 [02:08<49:45,  1.92it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 27.07it/s][A

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 26.52it/s][A
Epoch 7:   4%|▍         | 250/5971 [02:08<48:59,  1.95it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 25.36it/s][A
Epoch 7:   4%|▍         | 254/5971 [02:09<48:14,  1.98it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 25.31it/s][A
Epoch 7:   4%|▍         | 258/5971 [02:09<47:31,  2.00it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 26.16it/s][A

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 26.88it/s][A
Epoch 7:   4%|▍         | 262/5971 [02:09<46:49,  2.03it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 26.83it/s][A
Epoch 7:   4%|▍         | 266/5971 [02:09<46:08,  2.06it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating: 100%|██████████| 167/167 [00:06<00:00, 25.64it/s][A
Epoch 7:   4%|▍         | 268/5971 [02:09<45:54,  2.07it/s, loss=0.221, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4049.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Epoch 7:   5%|▍         | 269/5971 [02:10<46:04,  2.06it/s, loss=0.255, v_num=0, train/loss_simple_step=0.920, train/loss_vlb_step=0.155, train/loss_step=0.920, global_step=4050.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   5%|▍         | 270/5971 [02:11<46:13,  2.06it/s, loss=0.255, v_num=0, train/loss_simple_step=0.920, train/loss_vlb_step=0.155, train/loss_step=0.920, global_step=4050.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▍         | 270/5971 [02:11<46:13,  2.06it/s, loss=0.253, v_num=0, train/loss_simple_step=0.00544, train/loss_vlb_step=2.7e-5, train/loss_step=0.00544, global_step=4050.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▍         | 271/5971 [02:12<46:21,  2.05it/s, loss=0.248, v_num=0, train/loss_simple_step=0.0501, train/loss_vlb_step=0.000184, train/loss_step=0.0501, global_step=4050.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▍         | 272/5971 [02:14<46:54,  2.02it/s, loss=0.24, v_num=0, train/loss_simple_step=0.259, train/loss_vlb_step=0.000979, train/loss_step=0.259, global_step=4050.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:   5%|▍         | 273/5971 [02:15<47:02,  2.02it/s, loss=0.232, v_num=0, train/loss_simple_step=0.00405, train/loss_vlb_step=2.18e-5, train/loss_step=0.00405, global_step=4051.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▍         | 274/5971 [02:16<47:09,  2.01it/s, loss=0.232, v_num=0, train/loss_simple_step=0.00405, train/loss_vlb_step=2.18e-5, train/loss_step=0.00405, global_step=4051.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▍         | 274/5971 [02:16<47:09,  2.01it/s, loss=0.222, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000452, train/loss_step=0.136, global_step=4051.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:   5%|▍         | 275/5971 [02:17<47:16,  2.01it/s, loss=0.243, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00212, train/loss_step=0.426, global_step=4051.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   5%|▍         | 276/5971 [02:19<47:49,  1.98it/s, loss=0.242, v_num=0, train/loss_simple_step=0.0276, train/loss_vlb_step=9.81e-5, train/loss_step=0.0276, global_step=4051.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▍         | 277/5971 [02:20<47:57,  1.98it/s, loss=0.239, v_num=0, train/loss_simple_step=0.0554, train/loss_vlb_step=0.000191, train/loss_step=0.0554, global_step=4052.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▍         | 278/5971 [02:21<48:03,  1.97it/s, loss=0.239, v_num=0, train/loss_simple_step=0.0554, train/loss_vlb_step=0.000191, train/loss_step=0.0554, global_step=4052.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▍         | 278/5971 [02:21<48:03,  1.97it/s, loss=0.243, v_num=0, train/loss_simple_step=0.0815, train/loss_vlb_step=0.00027, train/loss_step=0.0815, global_step=4052.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   5%|▍         | 279/5971 [02:22<48:11,  1.97it/s, loss=0.241, v_num=0, train/loss_simple_step=0.00508, train/loss_vlb_step=2.6e-5, train/loss_step=0.00508, global_step=4052.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▍         | 280/5971 [02:24<48:42,  1.95it/s, loss=0.243, v_num=0, train/loss_simple_step=0.0509, train/loss_vlb_step=0.000179, train/loss_step=0.0509, global_step=4052.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▍         | 281/5971 [02:25<48:49,  1.94it/s, loss=0.243, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000364, train/loss_step=0.111, global_step=4053.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:   5%|▍         | 282/5971 [02:26<48:56,  1.94it/s, loss=0.243, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000364, train/loss_step=0.111, global_step=4053.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▍         | 282/5971 [02:26<48:56,  1.94it/s, loss=0.225, v_num=0, train/loss_simple_step=0.0041, train/loss_vlb_step=2.18e-5, train/loss_step=0.0041, global_step=4053.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▍         | 283/5971 [02:26<49:03,  1.93it/s, loss=0.202, v_num=0, train/loss_simple_step=0.00918, train/loss_vlb_step=4.3e-5, train/loss_step=0.00918, global_step=4053.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▍         | 284/5971 [02:29<49:42,  1.91it/s, loss=0.213, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.00115, train/loss_step=0.268, global_step=4053.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:   5%|▍         | 285/5971 [02:30<49:49,  1.90it/s, loss=0.208, v_num=0, train/loss_simple_step=0.654, train/loss_vlb_step=0.0147, train/loss_step=0.654, global_step=4054.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   5%|▍         | 286/5971 [02:31<49:57,  1.90it/s, loss=0.208, v_num=0, train/loss_simple_step=0.654, train/loss_vlb_step=0.0147, train/loss_step=0.654, global_step=4054.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▍         | 286/5971 [02:31<49:57,  1.90it/s, loss=0.254, v_num=0, train/loss_simple_step=0.933, train/loss_vlb_step=0.470, train/loss_step=0.933, global_step=4054.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   5%|▍         | 287/5971 [02:32<50:04,  1.89it/s, loss=0.244, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000354, train/loss_step=0.108, global_step=4054.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▍         | 288/5971 [02:34<50:34,  1.87it/s, loss=0.207, v_num=0, train/loss_simple_step=0.030, train/loss_vlb_step=0.000118, train/loss_step=0.030, global_step=4054.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▍         | 289/5971 [02:35<50:40,  1.87it/s, loss=0.169, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000508, train/loss_step=0.155, global_step=4055.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▍         | 290/5971 [02:36<50:47,  1.86it/s, loss=0.169, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000508, train/loss_step=0.155, global_step=4055.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▍         | 290/5971 [02:36<50:47,  1.86it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00712, train/loss_vlb_step=3.43e-5, train/loss_step=0.00712, global_step=4055.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▍         | 291/5971 [02:36<50:53,  1.86it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00102, train/loss_vlb_step=6.2e-6, train/loss_step=0.00102, global_step=4055.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   5%|▍         | 292/5971 [02:39<51:23,  1.84it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00176, train/loss_vlb_step=1.01e-5, train/loss_step=0.00176, global_step=4055.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▍         | 293/5971 [02:39<51:29,  1.84it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00544, train/loss_vlb_step=2.61e-5, train/loss_step=0.00544, global_step=4056.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▍         | 294/5971 [02:40<51:35,  1.83it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00544, train/loss_vlb_step=2.61e-5, train/loss_step=0.00544, global_step=4056.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▍         | 294/5971 [02:40<51:35,  1.83it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00299, train/loss_vlb_step=1.63e-5, train/loss_step=0.00299, global_step=4056.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▍         | 295/5971 [02:41<51:40,  1.83it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0456, train/loss_vlb_step=0.000158, train/loss_step=0.0456, global_step=4056.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   5%|▍         | 296/5971 [02:43<52:10,  1.81it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00231, train/loss_vlb_step=1.34e-5, train/loss_step=0.00231, global_step=4056.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▍         | 297/5971 [02:44<52:16,  1.81it/s, loss=0.127, v_num=0, train/loss_simple_step=0.069, train/loss_vlb_step=0.000231, train/loss_step=0.069, global_step=4057.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:   5%|▍         | 298/5971 [02:45<52:22,  1.81it/s, loss=0.127, v_num=0, train/loss_simple_step=0.069, train/loss_vlb_step=0.000231, train/loss_step=0.069, global_step=4057.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▍         | 298/5971 [02:45<52:22,  1.81it/s, loss=0.147, v_num=0, train/loss_simple_step=0.475, train/loss_vlb_step=0.00777, train/loss_step=0.475, global_step=4057.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   5%|▌         | 299/5971 [02:46<52:27,  1.80it/s, loss=0.16, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00113, train/loss_step=0.267, global_step=4057.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   5%|▌         | 300/5971 [02:48<52:56,  1.79it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00754, train/loss_vlb_step=3.64e-5, train/loss_step=0.00754, global_step=4057.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▌         | 301/5971 [02:49<53:02,  1.78it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0548, train/loss_vlb_step=0.000192, train/loss_step=0.0548, global_step=4058.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   5%|▌         | 302/5971 [02:50<53:08,  1.78it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0548, train/loss_vlb_step=0.000192, train/loss_step=0.0548, global_step=4058.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▌         | 302/5971 [02:50<53:08,  1.78it/s, loss=0.161, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000416, train/loss_step=0.126, global_step=4058.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:   5%|▌         | 303/5971 [02:51<53:13,  1.77it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0319, train/loss_vlb_step=0.000117, train/loss_step=0.0319, global_step=4058.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▌         | 304/5971 [02:53<53:49,  1.75it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0289, train/loss_vlb_step=0.000104, train/loss_step=0.0289, global_step=4058.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   5%|▌         | 305/5971 [02:54<53:55,  1.75it/s, loss=0.131, v_num=0, train/loss_simple_step=0.265, train/loss_vlb_step=0.00117, train/loss_step=0.265, global_step=4059.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:   5%|▌         | 306/5971 [02:55<54:00,  1.75it/s, loss=0.131, v_num=0, train/loss_simple_step=0.265, train/loss_vlb_step=0.00117, train/loss_step=0.265, global_step=4059.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▌         | 306/5971 [02:55<54:00,  1.75it/s, loss=0.0845, v_num=0, train/loss_simple_step=0.00556, train/loss_vlb_step=2.87e-5, train/loss_step=0.00556, global_step=4059.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▌         | 307/5971 [02:56<54:05,  1.74it/s, loss=0.105, v_num=0, train/loss_simple_step=0.520, train/loss_vlb_step=0.0072, train/loss_step=0.520, global_step=4059.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]      
Epoch 7:   5%|▌         | 308/5971 [02:58<54:33,  1.73it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00547, train/loss_vlb_step=2.58e-5, train/loss_step=0.00547, global_step=4059.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▌         | 309/5971 [02:59<54:38,  1.73it/s, loss=0.103, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000431, train/loss_step=0.130, global_step=4060.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:   5%|▌         | 310/5971 [03:00<54:43,  1.72it/s, loss=0.103, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000431, train/loss_step=0.130, global_step=4060.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▌         | 310/5971 [03:00<54:43,  1.72it/s, loss=0.112, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000672, train/loss_step=0.196, global_step=4060.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▌         | 311/5971 [03:01<54:48,  1.72it/s, loss=0.115, v_num=0, train/loss_simple_step=0.053, train/loss_vlb_step=0.000193, train/loss_step=0.053, global_step=4060.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▌         | 312/5971 [03:03<55:19,  1.70it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0337, train/loss_vlb_step=0.000123, train/loss_step=0.0337, global_step=4060.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▌         | 313/5971 [03:04<55:25,  1.70it/s, loss=0.131, v_num=0, train/loss_simple_step=0.304, train/loss_vlb_step=0.00115, train/loss_step=0.304, global_step=4061.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:   5%|▌         | 314/5971 [03:05<55:29,  1.70it/s, loss=0.131, v_num=0, train/loss_simple_step=0.304, train/loss_vlb_step=0.00115, train/loss_step=0.304, global_step=4061.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▌         | 314/5971 [03:05<55:29,  1.70it/s, loss=0.162, v_num=0, train/loss_simple_step=0.625, train/loss_vlb_step=0.00864, train/loss_step=0.625, global_step=4061.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▌         | 315/5971 [03:06<55:34,  1.70it/s, loss=0.194, v_num=0, train/loss_simple_step=0.679, train/loss_vlb_step=0.0173, train/loss_step=0.679, global_step=4061.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   5%|▌         | 316/5971 [03:08<56:02,  1.68it/s, loss=0.218, v_num=0, train/loss_simple_step=0.483, train/loss_vlb_step=0.00628, train/loss_step=0.483, global_step=4061.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▌         | 317/5971 [03:09<56:07,  1.68it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0924, train/loss_vlb_step=0.000306, train/loss_step=0.0924, global_step=4062.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▌         | 318/5971 [03:10<56:12,  1.68it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0924, train/loss_vlb_step=0.000306, train/loss_step=0.0924, global_step=4062.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▌         | 318/5971 [03:10<56:12,  1.68it/s, loss=0.199, v_num=0, train/loss_simple_step=0.0654, train/loss_vlb_step=0.00023, train/loss_step=0.0654, global_step=4062.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   5%|▌         | 319/5971 [03:11<56:16,  1.67it/s, loss=0.207, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00456, train/loss_step=0.426, global_step=4062.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:   5%|▌         | 320/5971 [03:13<56:42,  1.66it/s, loss=0.212, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000376, train/loss_step=0.114, global_step=4062.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▌         | 321/5971 [03:14<56:46,  1.66it/s, loss=0.214, v_num=0, train/loss_simple_step=0.0918, train/loss_vlb_step=0.000306, train/loss_step=0.0918, global_step=4063.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▌         | 322/5971 [03:15<56:50,  1.66it/s, loss=0.214, v_num=0, train/loss_simple_step=0.0918, train/loss_vlb_step=0.000306, train/loss_step=0.0918, global_step=4063.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▌         | 322/5971 [03:15<56:50,  1.66it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0691, train/loss_vlb_step=0.000244, train/loss_step=0.0691, global_step=4063.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▌         | 323/5971 [03:15<56:54,  1.65it/s, loss=0.213, v_num=0, train/loss_simple_step=0.064, train/loss_vlb_step=0.000225, train/loss_step=0.064, global_step=4063.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:   5%|▌         | 324/5971 [03:17<57:20,  1.64it/s, loss=0.212, v_num=0, train/loss_simple_step=0.0106, train/loss_vlb_step=4.85e-5, train/loss_step=0.0106, global_step=4063.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▌         | 325/5971 [03:18<57:24,  1.64it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0715, train/loss_vlb_step=0.000235, train/loss_step=0.0715, global_step=4064.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▌         | 326/5971 [03:19<57:28,  1.64it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0715, train/loss_vlb_step=0.000235, train/loss_step=0.0715, global_step=4064.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▌         | 326/5971 [03:19<57:28,  1.64it/s, loss=0.202, v_num=0, train/loss_simple_step=0.00928, train/loss_vlb_step=4.29e-5, train/loss_step=0.00928, global_step=4064.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   5%|▌         | 327/5971 [03:20<57:32,  1.63it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0649, train/loss_vlb_step=0.000217, train/loss_step=0.0649, global_step=4064.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   5%|▌         | 328/5971 [03:22<57:57,  1.62it/s, loss=0.179, v_num=0, train/loss_simple_step=0.00678, train/loss_vlb_step=3.21e-5, train/loss_step=0.00678, global_step=4064.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 329/5971 [03:23<58:01,  1.62it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00186, train/loss_vlb_step=1.1e-5, train/loss_step=0.00186, global_step=4065.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   6%|▌         | 330/5971 [03:24<58:05,  1.62it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00186, train/loss_vlb_step=1.1e-5, train/loss_step=0.00186, global_step=4065.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 330/5971 [03:24<58:05,  1.62it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00618, train/loss_vlb_step=2.95e-5, train/loss_step=0.00618, global_step=4065.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 331/5971 [03:25<58:09,  1.62it/s, loss=0.174, v_num=0, train/loss_simple_step=0.257, train/loss_vlb_step=0.000948, train/loss_step=0.257, global_step=4065.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:   6%|▌         | 332/5971 [03:27<58:38,  1.60it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0426, train/loss_vlb_step=0.000152, train/loss_step=0.0426, global_step=4065.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 333/5971 [03:28<58:42,  1.60it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0355, train/loss_vlb_step=0.000131, train/loss_step=0.0355, global_step=4066.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 334/5971 [03:29<58:45,  1.60it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0355, train/loss_vlb_step=0.000131, train/loss_step=0.0355, global_step=4066.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 334/5971 [03:29<58:45,  1.60it/s, loss=0.137, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000469, train/loss_step=0.143, global_step=4066.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:   6%|▌         | 335/5971 [03:30<58:49,  1.60it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0186, train/loss_vlb_step=7.31e-5, train/loss_step=0.0186, global_step=4066.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 336/5971 [03:32<59:15,  1.58it/s, loss=0.0872, v_num=0, train/loss_simple_step=0.155, train/loss_vlb_step=0.000513, train/loss_step=0.155, global_step=4066.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 337/5971 [03:33<59:19,  1.58it/s, loss=0.0872, v_num=0, train/loss_simple_step=0.0918, train/loss_vlb_step=0.000302, train/loss_step=0.0918, global_step=4067.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 338/5971 [03:34<59:23,  1.58it/s, loss=0.0872, v_num=0, train/loss_simple_step=0.0918, train/loss_vlb_step=0.000302, train/loss_step=0.0918, global_step=4067.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 338/5971 [03:34<59:23,  1.58it/s, loss=0.0866, v_num=0, train/loss_simple_step=0.0524, train/loss_vlb_step=0.000181, train/loss_step=0.0524, global_step=4067.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 339/5971 [03:35<59:27,  1.58it/s, loss=0.0701, v_num=0, train/loss_simple_step=0.0959, train/loss_vlb_step=0.000315, train/loss_step=0.0959, global_step=4067.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 340/5971 [03:37<59:52,  1.57it/s, loss=0.0651, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=5.49e-5, train/loss_step=0.0136, global_step=4067.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   6%|▌         | 341/5971 [03:38<59:56,  1.57it/s, loss=0.0703, v_num=0, train/loss_simple_step=0.198, train/loss_vlb_step=0.00071, train/loss_step=0.198, global_step=4068.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:   6%|▌         | 342/5971 [03:39<1:00:00,  1.56it/s, loss=0.0703, v_num=0, train/loss_simple_step=0.198, train/loss_vlb_step=0.00071, train/loss_step=0.198, global_step=4068.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 342/5971 [03:39<1:00:00,  1.56it/s, loss=0.0747, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000544, train/loss_step=0.157, global_step=4068.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 343/5971 [03:40<1:00:03,  1.56it/s, loss=0.0717, v_num=0, train/loss_simple_step=0.00301, train/loss_vlb_step=1.58e-5, train/loss_step=0.00301, global_step=4068.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 344/5971 [03:42<1:00:26,  1.55it/s, loss=0.0916, v_num=0, train/loss_simple_step=0.409, train/loss_vlb_step=0.00232, train/loss_step=0.409, global_step=4068.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:   6%|▌         | 345/5971 [03:43<1:00:29,  1.55it/s, loss=0.089, v_num=0, train/loss_simple_step=0.0189, train/loss_vlb_step=7.9e-5, train/loss_step=0.0189, global_step=4069.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 346/5971 [03:44<1:00:33,  1.55it/s, loss=0.089, v_num=0, train/loss_simple_step=0.0189, train/loss_vlb_step=7.9e-5, train/loss_step=0.0189, global_step=4069.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 346/5971 [03:44<1:00:33,  1.55it/s, loss=0.0924, v_num=0, train/loss_simple_step=0.0785, train/loss_vlb_step=0.000261, train/loss_step=0.0785, global_step=4069.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 347/5971 [03:45<1:00:36,  1.55it/s, loss=0.116, v_num=0, train/loss_simple_step=0.542, train/loss_vlb_step=0.00555, train/loss_step=0.542, global_step=4069.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:   6%|▌         | 348/5971 [03:47<1:00:58,  1.54it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00831, train/loss_vlb_step=4.03e-5, train/loss_step=0.00831, global_step=4069.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 349/5971 [03:47<1:01:02,  1.54it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00193, train/loss_vlb_step=1.12e-5, train/loss_step=0.00193, global_step=4070.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 350/5971 [03:48<1:01:05,  1.53it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00193, train/loss_vlb_step=1.12e-5, train/loss_step=0.00193, global_step=4070.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 350/5971 [03:48<1:01:05,  1.53it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.00028, train/loss_step=0.0844, global_step=4070.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:   6%|▌         | 351/5971 [03:49<1:01:08,  1.53it/s, loss=0.107, v_num=0, train/loss_simple_step=0.00208, train/loss_vlb_step=1.18e-5, train/loss_step=0.00208, global_step=4070.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 352/5971 [03:52<1:01:34,  1.52it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0336, train/loss_vlb_step=0.000131, train/loss_step=0.0336, global_step=4070.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   6%|▌         | 353/5971 [03:53<1:01:38,  1.52it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0601, train/loss_vlb_step=0.0002, train/loss_step=0.0601, global_step=4071.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:   6%|▌         | 354/5971 [03:53<1:01:40,  1.52it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0601, train/loss_vlb_step=0.0002, train/loss_step=0.0601, global_step=4071.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 354/5971 [03:53<1:01:40,  1.52it/s, loss=0.126, v_num=0, train/loss_simple_step=0.495, train/loss_vlb_step=0.00434, train/loss_step=0.495, global_step=4071.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   6%|▌         | 355/5971 [03:54<1:01:43,  1.52it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0708, train/loss_vlb_step=0.000234, train/loss_step=0.0708, global_step=4071.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 356/5971 [03:56<1:02:05,  1.51it/s, loss=0.145, v_num=0, train/loss_simple_step=0.484, train/loss_vlb_step=0.00307, train/loss_step=0.484, global_step=4071.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:   6%|▌         | 357/5971 [03:57<1:02:08,  1.51it/s, loss=0.173, v_num=0, train/loss_simple_step=0.647, train/loss_vlb_step=0.0106, train/loss_step=0.647, global_step=4072.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   6%|▌         | 358/5971 [03:58<1:02:10,  1.50it/s, loss=0.173, v_num=0, train/loss_simple_step=0.647, train/loss_vlb_step=0.0106, train/loss_step=0.647, global_step=4072.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 358/5971 [03:58<1:02:10,  1.50it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00187, train/loss_vlb_step=1.04e-5, train/loss_step=0.00187, global_step=4072.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 359/5971 [03:59<1:02:13,  1.50it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0168, train/loss_vlb_step=6.79e-5, train/loss_step=0.0168, global_step=4072.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   6%|▌         | 360/5971 [04:01<1:02:39,  1.49it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00252, train/loss_vlb_step=1.41e-5, train/loss_step=0.00252, global_step=4072.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 361/5971 [04:02<1:02:42,  1.49it/s, loss=0.162, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000411, train/loss_step=0.122, global_step=4073.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:   6%|▌         | 362/5971 [04:03<1:02:45,  1.49it/s, loss=0.162, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000411, train/loss_step=0.122, global_step=4073.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 362/5971 [04:03<1:02:45,  1.49it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00485, train/loss_vlb_step=2.51e-5, train/loss_step=0.00485, global_step=4073.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 363/5971 [04:04<1:02:47,  1.49it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0244, train/loss_vlb_step=9.04e-5, train/loss_step=0.0244, global_step=4073.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:   6%|▌         | 364/5971 [04:06<1:03:09,  1.48it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0695, train/loss_vlb_step=0.000231, train/loss_step=0.0695, global_step=4073.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 365/5971 [04:07<1:03:12,  1.48it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0434, train/loss_vlb_step=0.000146, train/loss_step=0.0434, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   6%|▌         | 366/5971 [04:08<1:03:15,  1.48it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0434, train/loss_vlb_step=0.000146, train/loss_step=0.0434, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 366/5971 [04:08<1:03:15,  1.48it/s, loss=0.142, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000414, train/loss_step=0.126, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   6%|▌         | 367/5971 [04:09<1:03:17,  1.48it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00737, train/loss_vlb_step=3.41e-5, train/loss_step=0.00737, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   6%|▌         | 368/5971 [04:11<1:03:38,  1.47it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:16,  2.16it/s]
Epoch 7:   6%|▌         | 370/5971 [04:11<1:03:24,  1.47it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   1%|          | 2/167 [00:00<00:42,  3.93it/s]

Validating:   3%|▎         | 5/167 [00:00<00:16, 10.02it/s]
Epoch 7:   6%|▋         | 374/5971 [04:12<1:02:44,  1.49it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   5%|▍         | 8/167 [00:00<00:11, 14.22it/s]
Epoch 7:   6%|▋         | 378/5971 [04:12<1:02:04,  1.50it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.64it/s]
Epoch 7:   6%|▋         | 382/5971 [04:12<1:01:25,  1.52it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.27it/s]

Validating:  10%|█         | 17/167 [00:01<00:06, 21.53it/s]
Epoch 7:   6%|▋         | 386/5971 [04:12<1:00:46,  1.53it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 23.68it/s]
Epoch 7:   7%|▋         | 390/5971 [04:12<1:00:09,  1.55it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 23.55it/s]
Epoch 7:   7%|▋         | 394/5971 [04:13<59:32,  1.56it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  

Validating:  16%|█▌        | 26/167 [00:01<00:06, 23.32it/s]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 23.83it/s]
Epoch 7:   7%|▋         | 398/5971 [04:13<58:56,  1.58it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 23.99it/s]
Epoch 7:   7%|▋         | 402/5971 [04:13<58:20,  1.59it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  21%|██        | 35/167 [00:01<00:05, 25.15it/s]
Epoch 7:   7%|▋         | 406/5971 [04:13<57:46,  1.61it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  23%|██▎       | 38/167 [00:02<00:05, 24.73it/s]
Epoch 7:   7%|▋         | 410/5971 [04:13<57:11,  1.62it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 26.02it/s]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 26.65it/s]
Epoch 7:   7%|▋         | 414/5971 [04:13<56:38,  1.64it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 25.08it/s]
Epoch 7:   7%|▋         | 418/5971 [04:13<56:05,  1.65it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  31%|███       | 51/167 [00:02<00:04, 26.08it/s]
Epoch 7:   7%|▋         | 422/5971 [04:14<55:33,  1.66it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 25.41it/s]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 25.47it/s]
Epoch 7:   7%|▋         | 426/5971 [04:14<55:01,  1.68it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  36%|███▌      | 60/167 [00:02<00:04, 25.43it/s]
Epoch 7:   7%|▋         | 430/5971 [04:14<54:31,  1.69it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  38%|███▊      | 63/167 [00:02<00:03, 26.17it/s]
Epoch 7:   7%|▋         | 434/5971 [04:14<54:00,  1.71it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  40%|███▉      | 66/167 [00:03<00:03, 25.85it/s]

Validating:  41%|████▏     | 69/167 [00:03<00:04, 24.11it/s]
Epoch 7:   7%|▋         | 438/5971 [04:14<53:30,  1.72it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 24.61it/s]
Epoch 7:   7%|▋         | 442/5971 [04:14<53:01,  1.74it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 24.77it/s]
Epoch 7:   7%|▋         | 446/5971 [04:15<52:32,  1.75it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 24.85it/s]
Epoch 7:   8%|▊         | 450/5971 [04:15<52:04,  1.77it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  49%|████▉     | 82/167 [00:03<00:03, 26.41it/s]
Epoch 7:   8%|▊         | 454/5971 [04:15<51:36,  1.78it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  51%|█████▏    | 86/167 [00:03<00:02, 27.74it/s]

Validating:  53%|█████▎    | 89/167 [00:03<00:02, 26.21it/s]
Epoch 7:   8%|▊         | 458/5971 [04:15<51:08,  1.80it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 26.52it/s]
Epoch 7:   8%|▊         | 462/5971 [04:15<50:41,  1.81it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 25.04it/s]
Epoch 7:   8%|▊         | 466/5971 [04:15<50:15,  1.83it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 25.89it/s]

Validating:  60%|██████    | 101/167 [00:04<00:02, 26.27it/s]
Epoch 7:   8%|▊         | 470/5971 [04:15<49:49,  1.84it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 25.44it/s]
Epoch 7:   8%|▊         | 474/5971 [04:16<49:24,  1.85it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 25.52it/s]
Epoch 7:   8%|▊         | 478/5971 [04:16<48:59,  1.87it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 25.93it/s]

Validating:  68%|██████▊   | 113/167 [00:04<00:02, 25.90it/s]
Epoch 7:   8%|▊         | 482/5971 [04:16<48:34,  1.88it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  69%|██████▉   | 116/167 [00:05<00:01, 25.82it/s]
Epoch 7:   8%|▊         | 486/5971 [04:16<48:10,  1.90it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 26.13it/s]
Epoch 7:   8%|▊         | 490/5971 [04:16<47:46,  1.91it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 24.77it/s][A

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 24.72it/s][A
Epoch 7:   8%|▊         | 494/5971 [04:16<47:22,  1.93it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 25.50it/s][A
Epoch 7:   8%|▊         | 498/5971 [04:17<46:59,  1.94it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 26.94it/s][A
Epoch 7:   8%|▊         | 502/5971 [04:17<46:36,  1.96it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  81%|████████  | 135/167 [00:05<00:01, 25.17it/s][A
Epoch 7:   8%|▊         | 506/5971 [04:17<46:14,  1.97it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 26.15it/s][A

Validating:  84%|████████▍ | 141/167 [00:06<00:00, 26.08it/s][A
Epoch 7:   9%|▊         | 510/5971 [04:17<45:52,  1.98it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 26.17it/s][A
Epoch 7:   9%|▊         | 514/5971 [04:17<45:30,  2.00it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 27.18it/s][A
Epoch 7:   9%|▊         | 518/5971 [04:17<45:08,  2.01it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 27.09it/s][A

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 27.77it/s][A
Epoch 7:   9%|▊         | 522/5971 [04:17<44:47,  2.03it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 27.20it/s][A
Epoch 7:   9%|▉         | 526/5971 [04:18<44:27,  2.04it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 26.67it/s][A
Epoch 7:   9%|▉         | 530/5971 [04:18<44:06,  2.06it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 25.81it/s][A

Validating:  99%|█████████▉| 165/167 [00:06<00:00, 26.13it/s][A
Epoch 7:   9%|▉         | 534/5971 [04:18<43:46,  2.07it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   9%|▉         | 536/5971 [04:18<43:38,  2.08it/s, loss=0.122, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000464, train/loss_step=0.141, global_step=4074.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Epoch 7:   9%|▉         | 537/5971 [04:19<43:42,  2.07it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0757, train/loss_vlb_step=0.000256, train/loss_step=0.0757, global_step=4075.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   9%|▉         | 538/5971 [04:20<43:46,  2.07it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0757, train/loss_vlb_step=0.000256, train/loss_step=0.0757, global_step=4075.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   9%|▉         | 538/5971 [04:20<43:46,  2.07it/s, loss=0.142, v_num=0, train/loss_simple_step=0.407, train/loss_vlb_step=0.00198, train/loss_step=0.407, global_step=4075.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:   9%|▉         | 539/5971 [04:21<43:50,  2.07it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0474, train/loss_vlb_step=0.000165, train/loss_step=0.0474, global_step=4075.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   9%|▉         | 540/5971 [04:24<44:15,  2.05it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.55e-5, train/loss_step=0.0121, global_step=4075.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   9%|▉         | 541/5971 [04:25<44:19,  2.04it/s, loss=0.16, v_num=0, train/loss_simple_step=0.395, train/loss_vlb_step=0.00197, train/loss_step=0.395, global_step=4076.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:   9%|▉         | 542/5971 [04:26<44:22,  2.04it/s, loss=0.16, v_num=0, train/loss_simple_step=0.395, train/loss_vlb_step=0.00197, train/loss_step=0.395, global_step=4076.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   9%|▉         | 542/5971 [04:26<44:22,  2.04it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0074, train/loss_vlb_step=3.36e-5, train/loss_step=0.0074, global_step=4076.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   9%|▉         | 543/5971 [04:27<44:26,  2.04it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.74e-5, train/loss_step=0.0108, global_step=4076.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   9%|▉         | 544/5971 [04:29<44:43,  2.02it/s, loss=0.127, v_num=0, train/loss_simple_step=0.374, train/loss_vlb_step=0.00178, train/loss_step=0.374, global_step=4076.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:   9%|▉         | 545/5971 [04:30<44:47,  2.02it/s, loss=0.101, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000451, train/loss_step=0.135, global_step=4077.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   9%|▉         | 546/5971 [04:31<44:50,  2.02it/s, loss=0.101, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000451, train/loss_step=0.135, global_step=4077.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   9%|▉         | 546/5971 [04:31<44:50,  2.02it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00319, train/loss_vlb_step=1.71e-5, train/loss_step=0.00319, global_step=4077.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   9%|▉         | 547/5971 [04:32<44:53,  2.01it/s, loss=0.125, v_num=0, train/loss_simple_step=0.490, train/loss_vlb_step=0.00332, train/loss_step=0.490, global_step=4077.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:   9%|▉         | 548/5971 [04:34<45:15,  2.00it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0254, train/loss_vlb_step=9.56e-5, train/loss_step=0.0254, global_step=4077.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   9%|▉         | 549/5971 [04:35<45:19,  1.99it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00157, train/loss_vlb_step=9.5e-6, train/loss_step=0.00157, global_step=4078.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   9%|▉         | 550/5971 [04:36<45:22,  1.99it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00157, train/loss_vlb_step=9.5e-6, train/loss_step=0.00157, global_step=4078.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   9%|▉         | 550/5971 [04:36<45:22,  1.99it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00358, train/loss_vlb_step=1.87e-5, train/loss_step=0.00358, global_step=4078.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   9%|▉         | 551/5971 [04:37<45:25,  1.99it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.59e-5, train/loss_step=0.0157, global_step=4078.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:   9%|▉         | 552/5971 [04:39<45:40,  1.98it/s, loss=0.133, v_num=0, train/loss_simple_step=0.338, train/loss_vlb_step=0.00161, train/loss_step=0.338, global_step=4078.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   9%|▉         | 553/5971 [04:40<45:43,  1.97it/s, loss=0.145, v_num=0, train/loss_simple_step=0.294, train/loss_vlb_step=0.00127, train/loss_step=0.294, global_step=4079.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   9%|▉         | 554/5971 [04:41<45:46,  1.97it/s, loss=0.145, v_num=0, train/loss_simple_step=0.294, train/loss_vlb_step=0.00127, train/loss_step=0.294, global_step=4079.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   9%|▉         | 554/5971 [04:41<45:46,  1.97it/s, loss=0.152, v_num=0, train/loss_simple_step=0.246, train/loss_vlb_step=0.000895, train/loss_step=0.246, global_step=4079.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   9%|▉         | 555/5971 [04:42<45:49,  1.97it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00182, train/loss_vlb_step=1.1e-5, train/loss_step=0.00182, global_step=4079.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   9%|▉         | 556/5971 [04:44<46:04,  1.96it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0557, train/loss_vlb_step=0.000187, train/loss_step=0.0557, global_step=4079.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   9%|▉         | 557/5971 [04:45<46:07,  1.96it/s, loss=0.149, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.00037, train/loss_step=0.109, global_step=4080.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:   9%|▉         | 558/5971 [04:46<46:10,  1.95it/s, loss=0.149, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.00037, train/loss_step=0.109, global_step=4080.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   9%|▉         | 558/5971 [04:46<46:10,  1.95it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0669, train/loss_vlb_step=0.000221, train/loss_step=0.0669, global_step=4080.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   9%|▉         | 559/5971 [04:47<46:13,  1.95it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.57e-5, train/loss_step=0.0128, global_step=4080.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:   9%|▉         | 560/5971 [04:49<46:32,  1.94it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0319, train/loss_vlb_step=0.000127, train/loss_step=0.0319, global_step=4080.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   9%|▉         | 561/5971 [04:50<46:35,  1.94it/s, loss=0.128, v_num=0, train/loss_simple_step=0.328, train/loss_vlb_step=0.00174, train/loss_step=0.328, global_step=4081.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:   9%|▉         | 562/5971 [04:51<46:38,  1.93it/s, loss=0.128, v_num=0, train/loss_simple_step=0.328, train/loss_vlb_step=0.00174, train/loss_step=0.328, global_step=4081.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   9%|▉         | 562/5971 [04:51<46:38,  1.93it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0956, train/loss_vlb_step=0.000318, train/loss_step=0.0956, global_step=4081.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   9%|▉         | 563/5971 [04:52<46:41,  1.93it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0353, train/loss_vlb_step=0.000127, train/loss_step=0.0353, global_step=4081.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   9%|▉         | 564/5971 [04:54<46:56,  1.92it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=5.99e-5, train/loss_step=0.0143, global_step=4081.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:   9%|▉         | 565/5971 [04:55<46:59,  1.92it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00341, train/loss_vlb_step=1.82e-5, train/loss_step=0.00341, global_step=4082.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   9%|▉         | 566/5971 [04:56<47:02,  1.92it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00341, train/loss_vlb_step=1.82e-5, train/loss_step=0.00341, global_step=4082.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:   9%|▉         | 566/5971 [04:56<47:02,  1.92it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0704, train/loss_vlb_step=0.00024, train/loss_step=0.0704, global_step=4082.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:   9%|▉         | 567/5971 [04:56<47:05,  1.91it/s, loss=0.111, v_num=0, train/loss_simple_step=0.473, train/loss_vlb_step=0.00314, train/loss_step=0.473, global_step=4082.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  10%|▉         | 568/5971 [04:59<47:22,  1.90it/s, loss=0.127, v_num=0, train/loss_simple_step=0.342, train/loss_vlb_step=0.00223, train/loss_step=0.342, global_step=4082.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|▉         | 569/5971 [05:00<47:25,  1.90it/s, loss=0.129, v_num=0, train/loss_simple_step=0.042, train/loss_vlb_step=0.000155, train/loss_step=0.042, global_step=4083.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|▉         | 570/5971 [05:01<47:28,  1.90it/s, loss=0.129, v_num=0, train/loss_simple_step=0.042, train/loss_vlb_step=0.000155, train/loss_step=0.042, global_step=4083.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|▉         | 570/5971 [05:01<47:28,  1.90it/s, loss=0.155, v_num=0, train/loss_simple_step=0.515, train/loss_vlb_step=0.00657, train/loss_step=0.515, global_step=4083.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  10%|▉         | 571/5971 [05:02<47:31,  1.89it/s, loss=0.163, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000581, train/loss_step=0.175, global_step=4083.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|▉         | 572/5971 [05:04<47:45,  1.88it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00429, train/loss_vlb_step=2.17e-5, train/loss_step=0.00429, global_step=4083.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|▉         | 573/5971 [05:05<47:48,  1.88it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00754, train/loss_vlb_step=3.55e-5, train/loss_step=0.00754, global_step=4084.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|▉         | 574/5971 [05:05<47:51,  1.88it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00754, train/loss_vlb_step=3.55e-5, train/loss_step=0.00754, global_step=4084.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|▉         | 574/5971 [05:05<47:51,  1.88it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0226, train/loss_vlb_step=8.97e-5, train/loss_step=0.0226, global_step=4084.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  10%|▉         | 575/5971 [05:06<47:54,  1.88it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0763, train/loss_vlb_step=0.00026, train/loss_step=0.0763, global_step=4084.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|▉         | 576/5971 [05:08<48:08,  1.87it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0953, train/loss_vlb_step=0.000321, train/loss_step=0.0953, global_step=4084.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|▉         | 577/5971 [05:09<48:11,  1.87it/s, loss=0.147, v_num=0, train/loss_simple_step=0.526, train/loss_vlb_step=0.00484, train/loss_step=0.526, global_step=4085.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  10%|▉         | 578/5971 [05:10<48:13,  1.86it/s, loss=0.147, v_num=0, train/loss_simple_step=0.526, train/loss_vlb_step=0.00484, train/loss_step=0.526, global_step=4085.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|▉         | 578/5971 [05:10<48:13,  1.86it/s, loss=0.165, v_num=0, train/loss_simple_step=0.420, train/loss_vlb_step=0.00225, train/loss_step=0.420, global_step=4085.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|▉         | 579/5971 [05:11<48:16,  1.86it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0507, train/loss_vlb_step=0.000184, train/loss_step=0.0507, global_step=4085.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|▉         | 580/5971 [05:14<48:33,  1.85it/s, loss=0.174, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000588, train/loss_step=0.175, global_step=4085.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  10%|▉         | 581/5971 [05:14<48:36,  1.85it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00381, train/loss_vlb_step=2.04e-5, train/loss_step=0.00381, global_step=4086.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|▉         | 582/5971 [05:15<48:39,  1.85it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00381, train/loss_vlb_step=2.04e-5, train/loss_step=0.00381, global_step=4086.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|▉         | 582/5971 [05:15<48:39,  1.85it/s, loss=0.157, v_num=0, train/loss_simple_step=0.084, train/loss_vlb_step=0.000276, train/loss_step=0.084, global_step=4086.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  10%|▉         | 583/5971 [05:16<48:41,  1.84it/s, loss=0.178, v_num=0, train/loss_simple_step=0.451, train/loss_vlb_step=0.00288, train/loss_step=0.451, global_step=4086.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  10%|▉         | 584/5971 [05:19<48:59,  1.83it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=5.95e-5, train/loss_step=0.0136, global_step=4086.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|▉         | 585/5971 [05:20<49:01,  1.83it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.26e-6, train/loss_step=0.00152, global_step=4087.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|▉         | 586/5971 [05:20<49:04,  1.83it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00152, train/loss_vlb_step=9.26e-6, train/loss_step=0.00152, global_step=4087.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|▉         | 586/5971 [05:20<49:04,  1.83it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0148, train/loss_vlb_step=6.46e-5, train/loss_step=0.0148, global_step=4087.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  10%|▉         | 587/5971 [05:21<49:06,  1.83it/s, loss=0.169, v_num=0, train/loss_simple_step=0.366, train/loss_vlb_step=0.0027, train/loss_step=0.366, global_step=4087.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  10%|▉         | 588/5971 [05:24<49:23,  1.82it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0089, train/loss_vlb_step=4.03e-5, train/loss_step=0.0089, global_step=4087.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|▉         | 589/5971 [05:25<49:25,  1.81it/s, loss=0.16, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.000718, train/loss_step=0.186, global_step=4088.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  10%|▉         | 590/5971 [05:25<49:28,  1.81it/s, loss=0.16, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.000718, train/loss_step=0.186, global_step=4088.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|▉         | 590/5971 [05:25<49:28,  1.81it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=6.6e-5, train/loss_step=0.0166, global_step=4088.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|▉         | 591/5971 [05:26<49:30,  1.81it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0565, train/loss_vlb_step=0.000192, train/loss_step=0.0565, global_step=4088.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|▉         | 592/5971 [05:28<49:43,  1.80it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0666, train/loss_vlb_step=0.00022, train/loss_step=0.0666, global_step=4088.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  10%|▉         | 593/5971 [05:29<49:46,  1.80it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00638, train/loss_vlb_step=3.17e-5, train/loss_step=0.00638, global_step=4089.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|▉         | 594/5971 [05:30<49:48,  1.80it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00638, train/loss_vlb_step=3.17e-5, train/loss_step=0.00638, global_step=4089.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|▉         | 594/5971 [05:30<49:48,  1.80it/s, loss=0.144, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00104, train/loss_step=0.253, global_step=4089.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  10%|▉         | 595/5971 [05:31<49:50,  1.80it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0683, train/loss_vlb_step=0.000228, train/loss_step=0.0683, global_step=4089.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|▉         | 596/5971 [05:33<50:04,  1.79it/s, loss=0.152, v_num=0, train/loss_simple_step=0.275, train/loss_vlb_step=0.00114, train/loss_step=0.275, global_step=4089.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  10%|▉         | 597/5971 [05:34<50:07,  1.79it/s, loss=0.131, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000371, train/loss_step=0.111, global_step=4090.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|█         | 598/5971 [05:35<50:09,  1.79it/s, loss=0.131, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000371, train/loss_step=0.111, global_step=4090.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|█         | 598/5971 [05:35<50:09,  1.79it/s, loss=0.121, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000705, train/loss_step=0.211, global_step=4090.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|█         | 599/5971 [05:36<50:11,  1.78it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0036, train/loss_vlb_step=1.95e-5, train/loss_step=0.0036, global_step=4090.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|█         | 600/5971 [05:38<50:26,  1.77it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0168, train/loss_vlb_step=7.03e-5, train/loss_step=0.0168, global_step=4090.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|█         | 601/5971 [05:39<50:28,  1.77it/s, loss=0.118, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.00049, train/loss_step=0.147, global_step=4091.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  10%|█         | 602/5971 [05:40<50:30,  1.77it/s, loss=0.118, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.00049, train/loss_step=0.147, global_step=4091.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|█         | 602/5971 [05:40<50:30,  1.77it/s, loss=0.129, v_num=0, train/loss_simple_step=0.310, train/loss_vlb_step=0.00129, train/loss_step=0.310, global_step=4091.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|█         | 603/5971 [05:41<50:33,  1.77it/s, loss=0.114, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000502, train/loss_step=0.152, global_step=4091.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|█         | 604/5971 [05:43<50:46,  1.76it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00556, train/loss_vlb_step=2.77e-5, train/loss_step=0.00556, global_step=4091.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|█         | 605/5971 [05:44<50:48,  1.76it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00676, train/loss_vlb_step=3.31e-5, train/loss_step=0.00676, global_step=4092.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|█         | 606/5971 [05:45<50:50,  1.76it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00676, train/loss_vlb_step=3.31e-5, train/loss_step=0.00676, global_step=4092.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|█         | 606/5971 [05:45<50:50,  1.76it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0662, train/loss_vlb_step=0.00022, train/loss_step=0.0662, global_step=4092.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  10%|█         | 607/5971 [05:46<50:52,  1.76it/s, loss=0.0984, v_num=0, train/loss_simple_step=0.00161, train/loss_vlb_step=9.55e-6, train/loss_step=0.00161, global_step=4092.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|█         | 608/5971 [05:48<51:07,  1.75it/s, loss=0.118, v_num=0, train/loss_simple_step=0.400, train/loss_vlb_step=0.00178, train/loss_step=0.400, global_step=4092.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]     
Epoch 7:  10%|█         | 609/5971 [05:49<51:10,  1.75it/s, loss=0.112, v_num=0, train/loss_simple_step=0.061, train/loss_vlb_step=0.000206, train/loss_step=0.061, global_step=4093.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|█         | 610/5971 [05:50<51:12,  1.75it/s, loss=0.112, v_num=0, train/loss_simple_step=0.061, train/loss_vlb_step=0.000206, train/loss_step=0.061, global_step=4093.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|█         | 610/5971 [05:50<51:12,  1.75it/s, loss=0.121, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000694, train/loss_step=0.194, global_step=4093.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|█         | 611/5971 [05:50<51:14,  1.74it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00384, train/loss_vlb_step=2.06e-5, train/loss_step=0.00384, global_step=4093.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|█         | 612/5971 [05:53<51:28,  1.74it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.72e-5, train/loss_step=0.0108, global_step=4093.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  10%|█         | 613/5971 [05:54<51:30,  1.73it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00197, train/loss_vlb_step=1.16e-5, train/loss_step=0.00197, global_step=4094.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|█         | 614/5971 [05:55<51:32,  1.73it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00197, train/loss_vlb_step=1.16e-5, train/loss_step=0.00197, global_step=4094.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|█         | 614/5971 [05:55<51:32,  1.73it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00291, train/loss_vlb_step=1.66e-5, train/loss_step=0.00291, global_step=4094.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|█         | 615/5971 [05:55<51:34,  1.73it/s, loss=0.0992, v_num=0, train/loss_simple_step=0.00297, train/loss_vlb_step=1.56e-5, train/loss_step=0.00297, global_step=4094.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|█         | 616/5971 [05:58<51:50,  1.72it/s, loss=0.0857, v_num=0, train/loss_simple_step=0.00531, train/loss_vlb_step=2.62e-5, train/loss_step=0.00531, global_step=4094.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|█         | 617/5971 [05:59<51:52,  1.72it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.00243, train/loss_vlb_step=1.41e-5, train/loss_step=0.00243, global_step=4095.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|█         | 618/5971 [06:00<51:54,  1.72it/s, loss=0.0803, v_num=0, train/loss_simple_step=0.00243, train/loss_vlb_step=1.41e-5, train/loss_step=0.00243, global_step=4095.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|█         | 618/5971 [06:00<51:54,  1.72it/s, loss=0.0715, v_num=0, train/loss_simple_step=0.0354, train/loss_vlb_step=0.000128, train/loss_step=0.0354, global_step=4095.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  10%|█         | 619/5971 [06:01<51:56,  1.72it/s, loss=0.113, v_num=0, train/loss_simple_step=0.840, train/loss_vlb_step=0.0481, train/loss_step=0.840, global_step=4095.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]     
Epoch 7:  10%|█         | 620/5971 [06:03<52:10,  1.71it/s, loss=0.125, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.000961, train/loss_step=0.247, global_step=4095.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|█         | 621/5971 [06:04<52:12,  1.71it/s, loss=0.157, v_num=0, train/loss_simple_step=0.798, train/loss_vlb_step=0.0235, train/loss_step=0.798, global_step=4096.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  10%|█         | 622/5971 [06:05<52:14,  1.71it/s, loss=0.157, v_num=0, train/loss_simple_step=0.798, train/loss_vlb_step=0.0235, train/loss_step=0.798, global_step=4096.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|█         | 622/5971 [06:05<52:14,  1.71it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0309, train/loss_vlb_step=0.000118, train/loss_step=0.0309, global_step=4096.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|█         | 623/5971 [06:05<52:15,  1.71it/s, loss=0.146, v_num=0, train/loss_simple_step=0.198, train/loss_vlb_step=0.000678, train/loss_step=0.198, global_step=4096.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  10%|█         | 624/5971 [06:08<52:28,  1.70it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0568, train/loss_vlb_step=0.000191, train/loss_step=0.0568, global_step=4096.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|█         | 625/5971 [06:08<52:30,  1.70it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0514, train/loss_vlb_step=0.000177, train/loss_step=0.0514, global_step=4097.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  10%|█         | 626/5971 [06:09<52:32,  1.70it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0514, train/loss_vlb_step=0.000177, train/loss_step=0.0514, global_step=4097.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  10%|█         | 626/5971 [06:09<52:32,  1.70it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0755, train/loss_vlb_step=0.000265, train/loss_step=0.0755, global_step=4097.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  11%|█         | 627/5971 [06:10<52:34,  1.69it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0095, train/loss_vlb_step=4.51e-5, train/loss_step=0.0095, global_step=4097.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  11%|█         | 628/5971 [06:12<52:48,  1.69it/s, loss=0.143, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000812, train/loss_step=0.228, global_step=4097.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  11%|█         | 629/5971 [06:13<52:49,  1.69it/s, loss=0.146, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000418, train/loss_step=0.126, global_step=4098.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  11%|█         | 630/5971 [06:14<52:51,  1.68it/s, loss=0.146, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000418, train/loss_step=0.126, global_step=4098.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  11%|█         | 630/5971 [06:14<52:51,  1.68it/s, loss=0.144, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000484, train/loss_step=0.147, global_step=4098.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  11%|█         | 631/5971 [06:15<52:53,  1.68it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00266, train/loss_vlb_step=1.54e-5, train/loss_step=0.00266, global_step=4098.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  11%|█         | 632/5971 [06:17<53:05,  1.68it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00301, train/loss_vlb_step=1.64e-5, train/loss_step=0.00301, global_step=4098.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  11%|█         | 633/5971 [06:18<53:07,  1.67it/s, loss=0.162, v_num=0, train/loss_simple_step=0.370, train/loss_vlb_step=0.00156, train/loss_step=0.370, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  11%|█         | 634/5971 [06:19<53:09,  1.67it/s, loss=0.162, v_num=0, train/loss_simple_step=0.370, train/loss_vlb_step=0.00156, train/loss_step=0.370, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  11%|█         | 634/5971 [06:19<53:09,  1.67it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0474, train/loss_vlb_step=0.000175, train/loss_step=0.0474, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  11%|█         | 635/5971 [06:20<53:11,  1.67it/s, loss=0.175, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.000819, train/loss_step=0.220, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  11%|█         | 636/5971 [06:22<53:24,  1.66it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:14,  2.24it/s][A
Epoch 7:  11%|█         | 638/5971 [06:23<53:17,  1.67it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   1%|          | 2/167 [00:00<00:41,  3.98it/s][A

Validating:   3%|▎         | 5/167 [00:00<00:16,  9.90it/s][A
Epoch 7:  11%|█         | 642/5971 [06:23<52:57,  1.68it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   5%|▍         | 8/167 [00:00<00:11, 14.44it/s][A
Epoch 7:  11%|█         | 646/5971 [06:23<52:36,  1.69it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   7%|▋         | 11/167 [00:00<00:09, 17.12it/s][A
Epoch 7:  11%|█         | 650/5971 [06:23<52:16,  1.70it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.62it/s][A

Validating:  10%|█         | 17/167 [00:01<00:06, 21.65it/s][A
Epoch 7:  11%|█         | 654/5971 [06:23<51:56,  1.71it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 21.87it/s][A
Epoch 7:  11%|█         | 658/5971 [06:24<51:36,  1.72it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 22.61it/s][A
Epoch 7:  11%|█         | 662/5971 [06:24<51:16,  1.73it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  16%|█▌        | 26/167 [00:01<00:06, 22.80it/s][A

Validating:  17%|█▋        | 29/167 [00:01<00:05, 23.92it/s][A
Epoch 7:  11%|█         | 666/5971 [06:24<50:57,  1.74it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 26.01it/s][A
Epoch 7:  11%|█         | 670/5971 [06:24<50:37,  1.75it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  22%|██▏       | 36/167 [00:01<00:05, 25.04it/s][A
Epoch 7:  11%|█▏        | 674/5971 [06:24<50:18,  1.75it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  23%|██▎       | 39/167 [00:02<00:04, 25.74it/s][A
Epoch 7:  11%|█▏        | 678/5971 [06:24<49:59,  1.76it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 26.63it/s][A

Validating:  27%|██▋       | 45/167 [00:02<00:04, 27.13it/s][A
Epoch 7:  11%|█▏        | 682/5971 [06:24<49:40,  1.77it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 26.91it/s][A
Epoch 7:  11%|█▏        | 686/5971 [06:25<49:22,  1.78it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  31%|███       | 51/167 [00:02<00:04, 26.97it/s][A
Epoch 7:  12%|█▏        | 690/5971 [06:25<49:04,  1.79it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 26.44it/s][A

Validating:  34%|███▍      | 57/167 [00:02<00:04, 26.52it/s][A
Epoch 7:  12%|█▏        | 694/5971 [06:25<48:46,  1.80it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  36%|███▌      | 60/167 [00:02<00:04, 26.00it/s][A
Epoch 7:  12%|█▏        | 698/5971 [06:25<48:28,  1.81it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  38%|███▊      | 63/167 [00:02<00:03, 26.82it/s][A
Epoch 7:  12%|█▏        | 702/5971 [06:25<48:10,  1.82it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  40%|████      | 67/167 [00:03<00:03, 28.42it/s][A
Epoch 7:  12%|█▏        | 706/5971 [06:25<47:53,  1.83it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 28.74it/s][A

Validating:  44%|████▎     | 73/167 [00:03<00:03, 29.04it/s][A
Epoch 7:  12%|█▏        | 710/5971 [06:25<47:35,  1.84it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 28.27it/s][A
Epoch 7:  12%|█▏        | 714/5971 [06:26<47:18,  1.85it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 28.37it/s][A
Epoch 7:  12%|█▏        | 718/5971 [06:26<47:01,  1.86it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 26.37it/s][A
Epoch 7:  12%|█▏        | 722/5971 [06:26<46:45,  1.87it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  51%|█████▏    | 86/167 [00:03<00:02, 27.18it/s][A

Validating:  53%|█████▎    | 89/167 [00:03<00:02, 26.01it/s][A
Epoch 7:  12%|█▏        | 726/5971 [06:26<46:28,  1.88it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  55%|█████▌    | 92/167 [00:03<00:02, 26.45it/s][A
Epoch 7:  12%|█▏        | 730/5971 [06:26<46:12,  1.89it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 26.71it/s][A
Epoch 7:  12%|█▏        | 734/5971 [06:26<45:56,  1.90it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 27.14it/s][A

Validating:  60%|██████    | 101/167 [00:04<00:02, 27.22it/s][A
Epoch 7:  12%|█▏        | 738/5971 [06:27<45:40,  1.91it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 27.58it/s][A
Epoch 7:  12%|█▏        | 742/5971 [06:27<45:24,  1.92it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 27.06it/s][A
Epoch 7:  12%|█▏        | 746/5971 [06:27<45:09,  1.93it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 27.17it/s][A

Validating:  68%|██████▊   | 113/167 [00:04<00:01, 27.19it/s][A
Epoch 7:  13%|█▎        | 750/5971 [06:27<44:53,  1.94it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  69%|██████▉   | 116/167 [00:04<00:01, 27.52it/s][A
Epoch 7:  13%|█▎        | 754/5971 [06:27<44:38,  1.95it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  71%|███████▏  | 119/167 [00:04<00:01, 27.28it/s][A
Epoch 7:  13%|█▎        | 758/5971 [06:27<44:23,  1.96it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 26.34it/s][A

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 25.31it/s][A
Epoch 7:  13%|█▎        | 762/5971 [06:27<44:08,  1.97it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 24.48it/s][A
Epoch 7:  13%|█▎        | 766/5971 [06:28<43:53,  1.98it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 24.99it/s][A
Epoch 7:  13%|█▎        | 770/5971 [06:28<43:38,  1.99it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  80%|████████  | 134/167 [00:05<00:01, 25.57it/s][A

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 26.42it/s][A
Epoch 7:  13%|█▎        | 774/5971 [06:28<43:24,  2.00it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  84%|████████▍ | 140/167 [00:05<00:01, 26.82it/s][A
Epoch 7:  13%|█▎        | 778/5971 [06:28<43:10,  2.00it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  86%|████████▌ | 143/167 [00:05<00:00, 25.89it/s][A
Epoch 7:  13%|█▎        | 782/5971 [06:28<42:55,  2.01it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 26.50it/s][A

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 26.81it/s][A
Epoch 7:  13%|█▎        | 786/5971 [06:28<42:41,  2.02it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 27.12it/s][A
Epoch 7:  13%|█▎        | 790/5971 [06:28<42:27,  2.03it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 26.60it/s][A
Epoch 7:  13%|█▎        | 794/5971 [06:29<42:13,  2.04it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 26.47it/s][A

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 26.77it/s][A
Epoch 7:  13%|█▎        | 798/5971 [06:29<42:00,  2.05it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 26.86it/s][A
Epoch 7:  13%|█▎        | 802/5971 [06:29<41:46,  2.06it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating: 100%|██████████| 167/167 [00:06<00:00, 27.68it/s][A
Epoch 7:  13%|█▎        | 804/5971 [06:29<41:41,  2.07it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:28,  1.69it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:16,  2.86it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:00<00:12,  3.66it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:10,  4.23it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:09,  4.61it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.89it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.07it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.19it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.29it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.36it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.41it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.44it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.46it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:02<00:06,  5.48it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.44it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.46it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.47it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.49it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.51it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.52it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.53it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.54it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.55it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.56it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:04<00:04,  5.56it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.56it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.56it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.55it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.55it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.55it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.54it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.53it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.52it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.51it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.51it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.51it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.52it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.53it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.53it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.51it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.52it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.52it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.51it/s][A
Epoch 7:  13%|█▎        | 804/5971 [06:39<42:43,  2.02it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.49it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.50it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.50it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:08<00:00,  5.50it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.43it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.46it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.48it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.28it/s]

Epoch 7:  13%|█▎        | 805/5971 [06:41<42:55,  2.01it/s, loss=0.184, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000667, train/loss_step=0.193, global_step=4099.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  13%|█▎        | 805/5971 [06:41<42:55,  2.01it/s, loss=0.198, v_num=0, train/loss_simple_step=0.290, train/loss_vlb_step=0.00121, train/loss_step=0.290, global_step=4100.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.43it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.26it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.92it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.40it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.76it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.02it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.21it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.35it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.45it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.52it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.56it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.60it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.58it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.59it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.63it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.64it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.65it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.66it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.66it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.65it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.65it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.66it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.66it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:04<00:04,  5.65it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.66it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.67it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.67it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.68it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.68it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.69it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.69it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:02,  5.69it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.69it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.69it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.69it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.64it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.52it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.45it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.40it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.34it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.31it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.30it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.30it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.29it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.17it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.13it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.23it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.31it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.31it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.22it/s]

Epoch 7:  13%|█▎        | 806/5971 [06:53<44:08,  1.95it/s, loss=0.198, v_num=0, train/loss_simple_step=0.290, train/loss_vlb_step=0.00121, train/loss_step=0.290, global_step=4100.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  13%|█▎        | 806/5971 [06:53<44:08,  1.95it/s, loss=0.204, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000566, train/loss_step=0.151, global_step=4100.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.19it/s]

Epoch 7:  14%|█▎        | 807/5971 [07:05<45:21,  1.90it/s, loss=0.204, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000566, train/loss_step=0.151, global_step=4100.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▎        | 807/5971 [07:05<45:21,  1.90it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00175, train/loss_vlb_step=1.07e-5, train/loss_step=0.00175, global_step=4100.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.04it/s]

Epoch 7:  14%|█▎        | 808/5971 [07:19<46:42,  1.84it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00175, train/loss_vlb_step=1.07e-5, train/loss_step=0.00175, global_step=4100.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▎        | 808/5971 [07:19<46:42,  1.84it/s, loss=0.164, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00113, train/loss_step=0.282, global_step=4100.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  14%|█▎        | 809/5971 [07:20<46:44,  1.84it/s, loss=0.164, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00113, train/loss_step=0.282, global_step=4100.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▎        | 809/5971 [07:20<46:44,  1.84it/s, loss=0.131, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000449, train/loss_step=0.137, global_step=4101.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▎        | 810/5971 [07:20<46:46,  1.84it/s, loss=0.131, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000449, train/loss_step=0.137, global_step=4101.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▎        | 810/5971 [07:20<46:46,  1.84it/s, loss=0.147, v_num=0, train/loss_simple_step=0.346, train/loss_vlb_step=0.00202, train/loss_step=0.346, global_step=4101.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  14%|█▎        | 811/5971 [07:21<46:47,  1.84it/s, loss=0.147, v_num=0, train/loss_simple_step=0.346, train/loss_vlb_step=0.00202, train/loss_step=0.346, global_step=4101.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▎        | 811/5971 [07:21<46:47,  1.84it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0761, train/loss_vlb_step=0.000262, train/loss_step=0.0761, global_step=4101.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▎        | 812/5971 [07:23<46:57,  1.83it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0761, train/loss_vlb_step=0.000262, train/loss_step=0.0761, global_step=4101.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▎        | 812/5971 [07:23<46:57,  1.83it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.68e-5, train/loss_step=0.0102, global_step=4101.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  14%|█▎        | 813/5971 [07:24<46:59,  1.83it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.68e-5, train/loss_step=0.0102, global_step=4101.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▎        | 813/5971 [07:24<46:59,  1.83it/s, loss=0.178, v_num=0, train/loss_simple_step=0.842, train/loss_vlb_step=0.0617, train/loss_step=0.842, global_step=4102.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  14%|█▎        | 814/5971 [07:25<47:00,  1.83it/s, loss=0.178, v_num=0, train/loss_simple_step=0.842, train/loss_vlb_step=0.0617, train/loss_step=0.842, global_step=4102.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▎        | 814/5971 [07:25<47:00,  1.83it/s, loss=0.208, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4102.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▎        | 815/5971 [07:26<47:01,  1.83it/s, loss=0.208, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4102.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▎        | 815/5971 [07:26<47:01,  1.83it/s, loss=0.21, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000212, train/loss_step=0.0626, global_step=4102.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▎        | 816/5971 [07:28<47:12,  1.82it/s, loss=0.21, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000212, train/loss_step=0.0626, global_step=4102.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▎        | 816/5971 [07:28<47:12,  1.82it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0299, train/loss_vlb_step=0.000117, train/loss_step=0.0299, global_step=4102.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  14%|█▎        | 817/5971 [07:29<47:14,  1.82it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0299, train/loss_vlb_step=0.000117, train/loss_step=0.0299, global_step=4102.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▎        | 817/5971 [07:29<47:14,  1.82it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0189, train/loss_vlb_step=7.61e-5, train/loss_step=0.0189, global_step=4103.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▎        | 818/5971 [07:30<47:15,  1.82it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0189, train/loss_vlb_step=7.61e-5, train/loss_step=0.0189, global_step=4103.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▎        | 818/5971 [07:30<47:15,  1.82it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0547, train/loss_vlb_step=0.000192, train/loss_step=0.0547, global_step=4103.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▎        | 819/5971 [07:31<47:17,  1.82it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0547, train/loss_vlb_step=0.000192, train/loss_step=0.0547, global_step=4103.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▎        | 819/5971 [07:31<47:17,  1.82it/s, loss=0.219, v_num=0, train/loss_simple_step=0.577, train/loss_vlb_step=0.00834, train/loss_step=0.577, global_step=4103.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  14%|█▎        | 820/5971 [07:33<47:27,  1.81it/s, loss=0.219, v_num=0, train/loss_simple_step=0.577, train/loss_vlb_step=0.00834, train/loss_step=0.577, global_step=4103.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▎        | 820/5971 [07:33<47:27,  1.81it/s, loss=0.239, v_num=0, train/loss_simple_step=0.398, train/loss_vlb_step=0.00357, train/loss_step=0.398, global_step=4103.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▎        | 821/5971 [07:34<47:28,  1.81it/s, loss=0.239, v_num=0, train/loss_simple_step=0.398, train/loss_vlb_step=0.00357, train/loss_step=0.398, global_step=4103.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▎        | 821/5971 [07:34<47:28,  1.81it/s, loss=0.245, v_num=0, train/loss_simple_step=0.502, train/loss_vlb_step=0.004, train/loss_step=0.502, global_step=4104.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  14%|█▍        | 822/5971 [07:35<47:30,  1.81it/s, loss=0.245, v_num=0, train/loss_simple_step=0.502, train/loss_vlb_step=0.004, train/loss_step=0.502, global_step=4104.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 822/5971 [07:35<47:30,  1.81it/s, loss=0.244, v_num=0, train/loss_simple_step=0.0181, train/loss_vlb_step=7.51e-5, train/loss_step=0.0181, global_step=4104.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 823/5971 [07:36<47:31,  1.81it/s, loss=0.244, v_num=0, train/loss_simple_step=0.0181, train/loss_vlb_step=7.51e-5, train/loss_step=0.0181, global_step=4104.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 823/5971 [07:36<47:31,  1.81it/s, loss=0.252, v_num=0, train/loss_simple_step=0.381, train/loss_vlb_step=0.00188, train/loss_step=0.381, global_step=4104.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  14%|█▍        | 824/5971 [07:38<47:41,  1.80it/s, loss=0.252, v_num=0, train/loss_simple_step=0.381, train/loss_vlb_step=0.00188, train/loss_step=0.381, global_step=4104.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 824/5971 [07:38<47:41,  1.80it/s, loss=0.259, v_num=0, train/loss_simple_step=0.331, train/loss_vlb_step=0.00135, train/loss_step=0.331, global_step=4104.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 825/5971 [07:39<47:42,  1.80it/s, loss=0.259, v_num=0, train/loss_simple_step=0.331, train/loss_vlb_step=0.00135, train/loss_step=0.331, global_step=4104.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 825/5971 [07:39<47:42,  1.80it/s, loss=0.258, v_num=0, train/loss_simple_step=0.270, train/loss_vlb_step=0.00103, train/loss_step=0.270, global_step=4105.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 826/5971 [07:40<47:44,  1.80it/s, loss=0.258, v_num=0, train/loss_simple_step=0.270, train/loss_vlb_step=0.00103, train/loss_step=0.270, global_step=4105.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 826/5971 [07:40<47:44,  1.80it/s, loss=0.259, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.00063, train/loss_step=0.176, global_step=4105.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 827/5971 [07:41<47:45,  1.79it/s, loss=0.259, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.00063, train/loss_step=0.176, global_step=4105.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 827/5971 [07:41<47:45,  1.79it/s, loss=0.302, v_num=0, train/loss_simple_step=0.869, train/loss_vlb_step=0.110, train/loss_step=0.869, global_step=4105.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  14%|█▍        | 828/5971 [07:43<47:54,  1.79it/s, loss=0.302, v_num=0, train/loss_simple_step=0.869, train/loss_vlb_step=0.110, train/loss_step=0.869, global_step=4105.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 828/5971 [07:43<47:54,  1.79it/s, loss=0.296, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000562, train/loss_step=0.156, global_step=4105.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 829/5971 [07:44<47:56,  1.79it/s, loss=0.296, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000562, train/loss_step=0.156, global_step=4105.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 829/5971 [07:44<47:56,  1.79it/s, loss=0.289, v_num=0, train/loss_simple_step=0.00243, train/loss_vlb_step=1.42e-5, train/loss_step=0.00243, global_step=4106.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 830/5971 [07:45<47:57,  1.79it/s, loss=0.289, v_num=0, train/loss_simple_step=0.00243, train/loss_vlb_step=1.42e-5, train/loss_step=0.00243, global_step=4106.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 830/5971 [07:45<47:57,  1.79it/s, loss=0.281, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000577, train/loss_step=0.175, global_step=4106.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  14%|█▍        | 831/5971 [07:46<47:59,  1.79it/s, loss=0.281, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000577, train/loss_step=0.175, global_step=4106.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 831/5971 [07:46<47:59,  1.79it/s, loss=0.284, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000436, train/loss_step=0.133, global_step=4106.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 832/5971 [07:48<48:08,  1.78it/s, loss=0.284, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000436, train/loss_step=0.133, global_step=4106.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 832/5971 [07:48<48:08,  1.78it/s, loss=0.283, v_num=0, train/loss_simple_step=0.00308, train/loss_vlb_step=1.73e-5, train/loss_step=0.00308, global_step=4106.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 833/5971 [07:49<48:09,  1.78it/s, loss=0.283, v_num=0, train/loss_simple_step=0.00308, train/loss_vlb_step=1.73e-5, train/loss_step=0.00308, global_step=4106.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 833/5971 [07:49<48:09,  1.78it/s, loss=0.266, v_num=0, train/loss_simple_step=0.493, train/loss_vlb_step=0.00641, train/loss_step=0.493, global_step=4107.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  14%|█▍        | 834/5971 [07:49<48:10,  1.78it/s, loss=0.266, v_num=0, train/loss_simple_step=0.493, train/loss_vlb_step=0.00641, train/loss_step=0.493, global_step=4107.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 834/5971 [07:49<48:10,  1.78it/s, loss=0.236, v_num=0, train/loss_simple_step=0.0644, train/loss_vlb_step=0.000212, train/loss_step=0.0644, global_step=4107.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 835/5971 [07:50<48:12,  1.78it/s, loss=0.236, v_num=0, train/loss_simple_step=0.0644, train/loss_vlb_step=0.000212, train/loss_step=0.0644, global_step=4107.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 835/5971 [07:50<48:12,  1.78it/s, loss=0.236, v_num=0, train/loss_simple_step=0.0747, train/loss_vlb_step=0.000251, train/loss_step=0.0747, global_step=4107.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 836/5971 [07:53<48:22,  1.77it/s, loss=0.236, v_num=0, train/loss_simple_step=0.0747, train/loss_vlb_step=0.000251, train/loss_step=0.0747, global_step=4107.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 836/5971 [07:53<48:22,  1.77it/s, loss=0.239, v_num=0, train/loss_simple_step=0.0738, train/loss_vlb_step=0.000246, train/loss_step=0.0738, global_step=4107.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 837/5971 [07:54<48:24,  1.77it/s, loss=0.239, v_num=0, train/loss_simple_step=0.0738, train/loss_vlb_step=0.000246, train/loss_step=0.0738, global_step=4107.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 837/5971 [07:54<48:24,  1.77it/s, loss=0.239, v_num=0, train/loss_simple_step=0.0313, train/loss_vlb_step=0.00012, train/loss_step=0.0313, global_step=4108.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  14%|█▍        | 838/5971 [07:54<48:25,  1.77it/s, loss=0.239, v_num=0, train/loss_simple_step=0.0313, train/loss_vlb_step=0.00012, train/loss_step=0.0313, global_step=4108.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 838/5971 [07:54<48:25,  1.77it/s, loss=0.237, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.49e-5, train/loss_step=0.0125, global_step=4108.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 839/5971 [07:55<48:26,  1.77it/s, loss=0.237, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.49e-5, train/loss_step=0.0125, global_step=4108.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 839/5971 [07:55<48:26,  1.77it/s, loss=0.21, v_num=0, train/loss_simple_step=0.0333, train/loss_vlb_step=0.00012, train/loss_step=0.0333, global_step=4108.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  14%|█▍        | 840/5971 [07:57<48:35,  1.76it/s, loss=0.21, v_num=0, train/loss_simple_step=0.0333, train/loss_vlb_step=0.00012, train/loss_step=0.0333, global_step=4108.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 840/5971 [07:57<48:35,  1.76it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0239, train/loss_vlb_step=9.55e-5, train/loss_step=0.0239, global_step=4108.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 841/5971 [07:58<48:36,  1.76it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0239, train/loss_vlb_step=9.55e-5, train/loss_step=0.0239, global_step=4108.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 841/5971 [07:58<48:36,  1.76it/s, loss=0.211, v_num=0, train/loss_simple_step=0.903, train/loss_vlb_step=0.228, train/loss_step=0.903, global_step=4109.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  14%|█▍        | 842/5971 [07:59<48:38,  1.76it/s, loss=0.211, v_num=0, train/loss_simple_step=0.903, train/loss_vlb_step=0.228, train/loss_step=0.903, global_step=4109.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 842/5971 [07:59<48:38,  1.76it/s, loss=0.212, v_num=0, train/loss_simple_step=0.0258, train/loss_vlb_step=0.000104, train/loss_step=0.0258, global_step=4109.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 843/5971 [08:00<48:39,  1.76it/s, loss=0.212, v_num=0, train/loss_simple_step=0.0258, train/loss_vlb_step=0.000104, train/loss_step=0.0258, global_step=4109.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 843/5971 [08:00<48:39,  1.76it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0813, train/loss_vlb_step=0.000273, train/loss_step=0.0813, global_step=4109.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 844/5971 [08:02<48:49,  1.75it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0813, train/loss_vlb_step=0.000273, train/loss_step=0.0813, global_step=4109.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 844/5971 [08:02<48:49,  1.75it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0186, train/loss_vlb_step=7.51e-5, train/loss_step=0.0186, global_step=4109.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  14%|█▍        | 845/5971 [08:03<48:50,  1.75it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0186, train/loss_vlb_step=7.51e-5, train/loss_step=0.0186, global_step=4109.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 845/5971 [08:03<48:50,  1.75it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.08e-5, train/loss_step=0.00184, global_step=4110.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 846/5971 [08:04<48:52,  1.75it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.08e-5, train/loss_step=0.00184, global_step=4110.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 846/5971 [08:04<48:52,  1.75it/s, loss=0.172, v_num=0, train/loss_simple_step=0.254, train/loss_vlb_step=0.000924, train/loss_step=0.254, global_step=4110.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  14%|█▍        | 847/5971 [08:05<48:53,  1.75it/s, loss=0.172, v_num=0, train/loss_simple_step=0.254, train/loss_vlb_step=0.000924, train/loss_step=0.254, global_step=4110.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 847/5971 [08:05<48:53,  1.75it/s, loss=0.148, v_num=0, train/loss_simple_step=0.397, train/loss_vlb_step=0.00247, train/loss_step=0.397, global_step=4110.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  14%|█▍        | 848/5971 [08:07<49:01,  1.74it/s, loss=0.148, v_num=0, train/loss_simple_step=0.397, train/loss_vlb_step=0.00247, train/loss_step=0.397, global_step=4110.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 848/5971 [08:07<49:01,  1.74it/s, loss=0.153, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.00119, train/loss_step=0.266, global_step=4110.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 849/5971 [08:08<49:03,  1.74it/s, loss=0.153, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.00119, train/loss_step=0.266, global_step=4110.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 849/5971 [08:08<49:03,  1.74it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0013, train/loss_vlb_step=7.89e-6, train/loss_step=0.0013, global_step=4111.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 850/5971 [08:09<49:04,  1.74it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0013, train/loss_vlb_step=7.89e-6, train/loss_step=0.0013, global_step=4111.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 850/5971 [08:09<49:04,  1.74it/s, loss=0.164, v_num=0, train/loss_simple_step=0.397, train/loss_vlb_step=0.00272, train/loss_step=0.397, global_step=4111.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  14%|█▍        | 851/5971 [08:10<49:05,  1.74it/s, loss=0.164, v_num=0, train/loss_simple_step=0.397, train/loss_vlb_step=0.00272, train/loss_step=0.397, global_step=4111.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 851/5971 [08:10<49:05,  1.74it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00213, train/loss_vlb_step=1.21e-5, train/loss_step=0.00213, global_step=4111.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 852/5971 [08:12<49:14,  1.73it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00213, train/loss_vlb_step=1.21e-5, train/loss_step=0.00213, global_step=4111.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 852/5971 [08:12<49:14,  1.73it/s, loss=0.169, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000743, train/loss_step=0.218, global_step=4111.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  14%|█▍        | 853/5971 [08:13<49:15,  1.73it/s, loss=0.169, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000743, train/loss_step=0.218, global_step=4111.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 853/5971 [08:13<49:15,  1.73it/s, loss=0.154, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000719, train/loss_step=0.204, global_step=4112.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 854/5971 [08:14<49:16,  1.73it/s, loss=0.154, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000719, train/loss_step=0.204, global_step=4112.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 854/5971 [08:14<49:16,  1.73it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00246, train/loss_vlb_step=1.44e-5, train/loss_step=0.00246, global_step=4112.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 855/5971 [08:14<49:17,  1.73it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00246, train/loss_vlb_step=1.44e-5, train/loss_step=0.00246, global_step=4112.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 855/5971 [08:14<49:17,  1.73it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0165, train/loss_vlb_step=6.47e-5, train/loss_step=0.0165, global_step=4112.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  14%|█▍        | 856/5971 [08:17<49:26,  1.72it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0165, train/loss_vlb_step=6.47e-5, train/loss_step=0.0165, global_step=4112.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 856/5971 [08:17<49:26,  1.72it/s, loss=0.173, v_num=0, train/loss_simple_step=0.565, train/loss_vlb_step=0.0107, train/loss_step=0.565, global_step=4112.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  14%|█▍        | 857/5971 [08:17<49:28,  1.72it/s, loss=0.173, v_num=0, train/loss_simple_step=0.565, train/loss_vlb_step=0.0107, train/loss_step=0.565, global_step=4112.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 857/5971 [08:17<49:28,  1.72it/s, loss=0.177, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000353, train/loss_step=0.107, global_step=4113.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 858/5971 [08:18<49:29,  1.72it/s, loss=0.177, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000353, train/loss_step=0.107, global_step=4113.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 858/5971 [08:18<49:29,  1.72it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00911, train/loss_vlb_step=4.14e-5, train/loss_step=0.00911, global_step=4113.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 859/5971 [08:19<49:30,  1.72it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00911, train/loss_vlb_step=4.14e-5, train/loss_step=0.00911, global_step=4113.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 859/5971 [08:19<49:30,  1.72it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0477, train/loss_vlb_step=0.000166, train/loss_step=0.0477, global_step=4113.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  14%|█▍        | 860/5971 [08:21<49:38,  1.72it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0477, train/loss_vlb_step=0.000166, train/loss_step=0.0477, global_step=4113.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 860/5971 [08:21<49:38,  1.72it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00415, train/loss_vlb_step=2.14e-5, train/loss_step=0.00415, global_step=4113.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 861/5971 [08:22<49:40,  1.71it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00415, train/loss_vlb_step=2.14e-5, train/loss_step=0.00415, global_step=4113.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 861/5971 [08:22<49:40,  1.71it/s, loss=0.157, v_num=0, train/loss_simple_step=0.510, train/loss_vlb_step=0.00352, train/loss_step=0.510, global_step=4114.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  14%|█▍        | 862/5971 [08:23<49:41,  1.71it/s, loss=0.157, v_num=0, train/loss_simple_step=0.510, train/loss_vlb_step=0.00352, train/loss_step=0.510, global_step=4114.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 862/5971 [08:23<49:41,  1.71it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0093, train/loss_vlb_step=4.45e-5, train/loss_step=0.0093, global_step=4114.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 863/5971 [08:24<49:42,  1.71it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0093, train/loss_vlb_step=4.45e-5, train/loss_step=0.0093, global_step=4114.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 863/5971 [08:24<49:42,  1.71it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0372, train/loss_vlb_step=0.000135, train/loss_step=0.0372, global_step=4114.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 864/5971 [08:26<49:52,  1.71it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0372, train/loss_vlb_step=0.000135, train/loss_step=0.0372, global_step=4114.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 864/5971 [08:26<49:52,  1.71it/s, loss=0.161, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000607, train/loss_step=0.175, global_step=4114.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  14%|█▍        | 865/5971 [08:27<49:54,  1.71it/s, loss=0.161, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000607, train/loss_step=0.175, global_step=4114.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  14%|█▍        | 865/5971 [08:27<49:54,  1.71it/s, loss=0.163, v_num=0, train/loss_simple_step=0.037, train/loss_vlb_step=0.000126, train/loss_step=0.037, global_step=4115.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 866/5971 [08:28<49:55,  1.70it/s, loss=0.163, v_num=0, train/loss_simple_step=0.037, train/loss_vlb_step=0.000126, train/loss_step=0.037, global_step=4115.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 866/5971 [08:28<49:55,  1.70it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00236, train/loss_vlb_step=1.37e-5, train/loss_step=0.00236, global_step=4115.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 867/5971 [08:29<49:56,  1.70it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00236, train/loss_vlb_step=1.37e-5, train/loss_step=0.00236, global_step=4115.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 867/5971 [08:29<49:56,  1.70it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00405, train/loss_vlb_step=2.13e-5, train/loss_step=0.00405, global_step=4115.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 868/5971 [08:31<50:05,  1.70it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00405, train/loss_vlb_step=2.13e-5, train/loss_step=0.00405, global_step=4115.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 868/5971 [08:31<50:05,  1.70it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00441, train/loss_vlb_step=2.22e-5, train/loss_step=0.00441, global_step=4115.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 869/5971 [08:32<50:06,  1.70it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00441, train/loss_vlb_step=2.22e-5, train/loss_step=0.00441, global_step=4115.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 869/5971 [08:32<50:06,  1.70it/s, loss=0.155, v_num=0, train/loss_simple_step=0.749, train/loss_vlb_step=0.0325, train/loss_step=0.749, global_step=4116.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]     
Epoch 7:  15%|█▍        | 870/5971 [08:33<50:07,  1.70it/s, loss=0.155, v_num=0, train/loss_simple_step=0.749, train/loss_vlb_step=0.0325, train/loss_step=0.749, global_step=4116.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 870/5971 [08:33<50:07,  1.70it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0393, train/loss_vlb_step=0.000135, train/loss_step=0.0393, global_step=4116.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 871/5971 [08:34<50:08,  1.70it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0393, train/loss_vlb_step=0.000135, train/loss_step=0.0393, global_step=4116.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 871/5971 [08:34<50:08,  1.70it/s, loss=0.147, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000733, train/loss_step=0.205, global_step=4116.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  15%|█▍        | 872/5971 [08:36<50:16,  1.69it/s, loss=0.147, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000733, train/loss_step=0.205, global_step=4116.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 872/5971 [08:36<50:16,  1.69it/s, loss=0.15, v_num=0, train/loss_simple_step=0.270, train/loss_vlb_step=0.00116, train/loss_step=0.270, global_step=4116.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  15%|█▍        | 873/5971 [08:37<50:18,  1.69it/s, loss=0.15, v_num=0, train/loss_simple_step=0.270, train/loss_vlb_step=0.00116, train/loss_step=0.270, global_step=4116.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 873/5971 [08:37<50:18,  1.69it/s, loss=0.148, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.000597, train/loss_step=0.171, global_step=4117.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 874/5971 [08:38<50:19,  1.69it/s, loss=0.148, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.000597, train/loss_step=0.171, global_step=4117.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 874/5971 [08:38<50:19,  1.69it/s, loss=0.186, v_num=0, train/loss_simple_step=0.762, train/loss_vlb_step=0.0394, train/loss_step=0.762, global_step=4117.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  15%|█▍        | 875/5971 [08:39<50:20,  1.69it/s, loss=0.186, v_num=0, train/loss_simple_step=0.762, train/loss_vlb_step=0.0394, train/loss_step=0.762, global_step=4117.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 875/5971 [08:39<50:20,  1.69it/s, loss=0.21, v_num=0, train/loss_simple_step=0.481, train/loss_vlb_step=0.00324, train/loss_step=0.481, global_step=4117.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 876/5971 [08:41<50:28,  1.68it/s, loss=0.21, v_num=0, train/loss_simple_step=0.481, train/loss_vlb_step=0.00324, train/loss_step=0.481, global_step=4117.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 876/5971 [08:41<50:28,  1.68it/s, loss=0.181, v_num=0, train/loss_simple_step=0.00286, train/loss_vlb_step=1.59e-5, train/loss_step=0.00286, global_step=4117.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 877/5971 [08:42<50:29,  1.68it/s, loss=0.181, v_num=0, train/loss_simple_step=0.00286, train/loss_vlb_step=1.59e-5, train/loss_step=0.00286, global_step=4117.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 877/5971 [08:42<50:29,  1.68it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00133, train/loss_vlb_step=8.14e-6, train/loss_step=0.00133, global_step=4118.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 878/5971 [08:43<50:30,  1.68it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00133, train/loss_vlb_step=8.14e-6, train/loss_step=0.00133, global_step=4118.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 878/5971 [08:43<50:30,  1.68it/s, loss=0.192, v_num=0, train/loss_simple_step=0.325, train/loss_vlb_step=0.00164, train/loss_step=0.325, global_step=4118.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  15%|█▍        | 879/5971 [08:43<50:32,  1.68it/s, loss=0.192, v_num=0, train/loss_simple_step=0.325, train/loss_vlb_step=0.00164, train/loss_step=0.325, global_step=4118.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 879/5971 [08:43<50:32,  1.68it/s, loss=0.206, v_num=0, train/loss_simple_step=0.339, train/loss_vlb_step=0.00141, train/loss_step=0.339, global_step=4118.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 880/5971 [08:46<50:40,  1.67it/s, loss=0.206, v_num=0, train/loss_simple_step=0.339, train/loss_vlb_step=0.00141, train/loss_step=0.339, global_step=4118.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 880/5971 [08:46<50:40,  1.67it/s, loss=0.207, v_num=0, train/loss_simple_step=0.00927, train/loss_vlb_step=4.14e-5, train/loss_step=0.00927, global_step=4118.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 881/5971 [08:47<50:41,  1.67it/s, loss=0.207, v_num=0, train/loss_simple_step=0.00927, train/loss_vlb_step=4.14e-5, train/loss_step=0.00927, global_step=4118.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 881/5971 [08:47<50:41,  1.67it/s, loss=0.192, v_num=0, train/loss_simple_step=0.221, train/loss_vlb_step=0.000811, train/loss_step=0.221, global_step=4119.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  15%|█▍        | 882/5971 [08:47<50:42,  1.67it/s, loss=0.192, v_num=0, train/loss_simple_step=0.221, train/loss_vlb_step=0.000811, train/loss_step=0.221, global_step=4119.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 882/5971 [08:47<50:42,  1.67it/s, loss=0.204, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.000908, train/loss_step=0.243, global_step=4119.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 883/5971 [08:48<50:43,  1.67it/s, loss=0.204, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.000908, train/loss_step=0.243, global_step=4119.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 883/5971 [08:48<50:43,  1.67it/s, loss=0.202, v_num=0, train/loss_simple_step=0.00441, train/loss_vlb_step=2.28e-5, train/loss_step=0.00441, global_step=4119.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 884/5971 [08:51<50:52,  1.67it/s, loss=0.202, v_num=0, train/loss_simple_step=0.00441, train/loss_vlb_step=2.28e-5, train/loss_step=0.00441, global_step=4119.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 884/5971 [08:51<50:52,  1.67it/s, loss=0.215, v_num=0, train/loss_simple_step=0.425, train/loss_vlb_step=0.00372, train/loss_step=0.425, global_step=4119.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  15%|█▍        | 885/5971 [08:52<50:53,  1.67it/s, loss=0.215, v_num=0, train/loss_simple_step=0.425, train/loss_vlb_step=0.00372, train/loss_step=0.425, global_step=4119.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 885/5971 [08:52<50:53,  1.67it/s, loss=0.214, v_num=0, train/loss_simple_step=0.0169, train/loss_vlb_step=7e-5, train/loss_step=0.0169, global_step=4120.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  15%|█▍        | 886/5971 [08:52<50:54,  1.66it/s, loss=0.214, v_num=0, train/loss_simple_step=0.0169, train/loss_vlb_step=7e-5, train/loss_step=0.0169, global_step=4120.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 886/5971 [08:52<50:54,  1.66it/s, loss=0.215, v_num=0, train/loss_simple_step=0.029, train/loss_vlb_step=0.000114, train/loss_step=0.029, global_step=4120.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 887/5971 [08:53<50:55,  1.66it/s, loss=0.215, v_num=0, train/loss_simple_step=0.029, train/loss_vlb_step=0.000114, train/loss_step=0.029, global_step=4120.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 887/5971 [08:53<50:55,  1.66it/s, loss=0.221, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000381, train/loss_step=0.115, global_step=4120.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 888/5971 [08:55<51:03,  1.66it/s, loss=0.221, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000381, train/loss_step=0.115, global_step=4120.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 888/5971 [08:55<51:03,  1.66it/s, loss=0.231, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000728, train/loss_step=0.208, global_step=4120.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 889/5971 [08:56<51:04,  1.66it/s, loss=0.231, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000728, train/loss_step=0.208, global_step=4120.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 889/5971 [08:56<51:04,  1.66it/s, loss=0.206, v_num=0, train/loss_simple_step=0.258, train/loss_vlb_step=0.00108, train/loss_step=0.258, global_step=4121.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  15%|█▍        | 890/5971 [08:57<51:05,  1.66it/s, loss=0.206, v_num=0, train/loss_simple_step=0.258, train/loss_vlb_step=0.00108, train/loss_step=0.258, global_step=4121.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 890/5971 [08:57<51:05,  1.66it/s, loss=0.21, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000372, train/loss_step=0.113, global_step=4121.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 891/5971 [08:58<51:06,  1.66it/s, loss=0.21, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000372, train/loss_step=0.113, global_step=4121.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 891/5971 [08:58<51:06,  1.66it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0922, train/loss_vlb_step=0.000311, train/loss_step=0.0922, global_step=4121.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 892/5971 [09:00<51:16,  1.65it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0922, train/loss_vlb_step=0.000311, train/loss_step=0.0922, global_step=4121.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 892/5971 [09:00<51:16,  1.65it/s, loss=0.215, v_num=0, train/loss_simple_step=0.482, train/loss_vlb_step=0.00561, train/loss_step=0.482, global_step=4121.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  15%|█▍        | 893/5971 [09:01<51:17,  1.65it/s, loss=0.215, v_num=0, train/loss_simple_step=0.482, train/loss_vlb_step=0.00561, train/loss_step=0.482, global_step=4121.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 893/5971 [09:01<51:17,  1.65it/s, loss=0.209, v_num=0, train/loss_simple_step=0.0493, train/loss_vlb_step=0.000179, train/loss_step=0.0493, global_step=4122.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 894/5971 [09:02<51:18,  1.65it/s, loss=0.209, v_num=0, train/loss_simple_step=0.0493, train/loss_vlb_step=0.000179, train/loss_step=0.0493, global_step=4122.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 894/5971 [09:02<51:18,  1.65it/s, loss=0.184, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.000967, train/loss_step=0.266, global_step=4122.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  15%|█▍        | 895/5971 [09:03<51:19,  1.65it/s, loss=0.184, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.000967, train/loss_step=0.266, global_step=4122.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▍        | 895/5971 [09:03<51:19,  1.65it/s, loss=0.169, v_num=0, train/loss_simple_step=0.183, train/loss_vlb_step=0.000654, train/loss_step=0.183, global_step=4122.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▌        | 896/5971 [09:05<51:27,  1.64it/s, loss=0.169, v_num=0, train/loss_simple_step=0.183, train/loss_vlb_step=0.000654, train/loss_step=0.183, global_step=4122.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▌        | 896/5971 [09:05<51:27,  1.64it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0117, train/loss_vlb_step=5.01e-5, train/loss_step=0.0117, global_step=4122.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▌        | 897/5971 [09:06<51:28,  1.64it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0117, train/loss_vlb_step=5.01e-5, train/loss_step=0.0117, global_step=4122.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▌        | 897/5971 [09:06<51:28,  1.64it/s, loss=0.176, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000424, train/loss_step=0.129, global_step=4123.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▌        | 898/5971 [09:07<51:28,  1.64it/s, loss=0.176, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000424, train/loss_step=0.129, global_step=4123.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▌        | 898/5971 [09:07<51:28,  1.64it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0218, train/loss_vlb_step=8.71e-5, train/loss_step=0.0218, global_step=4123.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▌        | 899/5971 [09:08<51:29,  1.64it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0218, train/loss_vlb_step=8.71e-5, train/loss_step=0.0218, global_step=4123.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▌        | 899/5971 [09:08<51:29,  1.64it/s, loss=0.164, v_num=0, train/loss_simple_step=0.410, train/loss_vlb_step=0.00272, train/loss_step=0.410, global_step=4123.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  15%|█▌        | 900/5971 [09:10<51:37,  1.64it/s, loss=0.164, v_num=0, train/loss_simple_step=0.410, train/loss_vlb_step=0.00272, train/loss_step=0.410, global_step=4123.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▌        | 900/5971 [09:10<51:37,  1.64it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4123.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▌        | 901/5971 [09:11<51:38,  1.64it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4123.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▌        | 901/5971 [09:11<51:38,  1.64it/s, loss=0.165, v_num=0, train/loss_simple_step=0.159, train/loss_vlb_step=0.000573, train/loss_step=0.159, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  15%|█▌        | 902/5971 [09:12<51:39,  1.64it/s, loss=0.165, v_num=0, train/loss_simple_step=0.159, train/loss_vlb_step=0.000573, train/loss_step=0.159, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▌        | 902/5971 [09:12<51:39,  1.64it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0394, train/loss_vlb_step=0.000144, train/loss_step=0.0394, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▌        | 903/5971 [09:13<51:40,  1.63it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0394, train/loss_vlb_step=0.000144, train/loss_step=0.0394, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▌        | 903/5971 [09:13<51:40,  1.63it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00836, train/loss_vlb_step=3.98e-5, train/loss_step=0.00836, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▌        | 904/5971 [09:15<51:48,  1.63it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00836, train/loss_vlb_step=3.98e-5, train/loss_step=0.00836, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  15%|█▌        | 904/5971 [09:15<51:48,  1.63it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:06,  2.51it/s]
Epoch 7:  15%|█▌        | 906/5971 [09:15<51:42,  1.63it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   1%|          | 2/167 [00:00<00:41,  4.01it/s]
Epoch 7:  15%|█▌        | 908/5971 [09:15<51:35,  1.64it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   3%|▎         | 5/167 [00:00<00:15, 10.17it/s]
Epoch 7:  15%|█▌        | 911/5971 [09:15<51:24,  1.64it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   5%|▍         | 8/167 [00:00<00:11, 14.07it/s]
Epoch 7:  15%|█▌        | 914/5971 [09:16<51:13,  1.65it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.56it/s]
Epoch 7:  15%|█▌        | 917/5971 [09:16<51:02,  1.65it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.56it/s]
Epoch 7:  15%|█▌        | 920/5971 [09:16<50:50,  1.66it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  10%|█         | 17/167 [00:01<00:07, 21.40it/s]
Epoch 7:  15%|█▌        | 923/5971 [09:16<50:39,  1.66it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 21.76it/s]
Epoch 7:  16%|█▌        | 926/5971 [09:16<50:28,  1.67it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 21.58it/s]
Epoch 7:  16%|█▌        | 929/5971 [09:16<50:18,  1.67it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  16%|█▌        | 26/167 [00:01<00:06, 20.41it/s]
Epoch 7:  16%|█▌        | 932/5971 [09:16<50:07,  1.68it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  17%|█▋        | 29/167 [00:01<00:06, 21.09it/s]
Epoch 7:  16%|█▌        | 935/5971 [09:16<49:56,  1.68it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  19%|█▉        | 32/167 [00:01<00:06, 22.09it/s]
Epoch 7:  16%|█▌        | 938/5971 [09:17<49:46,  1.69it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  21%|██        | 35/167 [00:01<00:06, 21.56it/s]
Epoch 7:  16%|█▌        | 941/5971 [09:17<49:35,  1.69it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  23%|██▎       | 38/167 [00:02<00:05, 22.69it/s]
Epoch 7:  16%|█▌        | 944/5971 [09:17<49:24,  1.70it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  25%|██▍       | 41/167 [00:02<00:05, 23.48it/s]
Epoch 7:  16%|█▌        | 947/5971 [09:17<49:14,  1.70it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  26%|██▋       | 44/167 [00:02<00:05, 23.06it/s]
Epoch 7:  16%|█▌        | 950/5971 [09:17<49:04,  1.71it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 24.09it/s]
Epoch 7:  16%|█▌        | 953/5971 [09:17<48:53,  1.71it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  31%|███       | 51/167 [00:02<00:04, 26.14it/s][A
Epoch 7:  16%|█▌        | 957/5971 [09:17<48:39,  1.72it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  33%|███▎      | 55/167 [00:02<00:04, 27.44it/s][A
Epoch 7:  16%|█▌        | 961/5971 [09:17<48:25,  1.72it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  35%|███▍      | 58/167 [00:02<00:03, 27.53it/s][A
Epoch 7:  16%|█▌        | 965/5971 [09:18<48:12,  1.73it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  37%|███▋      | 61/167 [00:02<00:03, 26.62it/s][A

Validating:  38%|███▊      | 64/167 [00:03<00:03, 26.97it/s][A
Epoch 7:  16%|█▌        | 969/5971 [09:18<47:58,  1.74it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  40%|████      | 67/167 [00:03<00:04, 24.85it/s][A
Epoch 7:  16%|█▋        | 973/5971 [09:18<47:45,  1.74it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 25.47it/s][A
Epoch 7:  16%|█▋        | 977/5971 [09:18<47:32,  1.75it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 25.52it/s][A

Validating:  46%|████▌     | 76/167 [00:03<00:03, 25.46it/s][A
Epoch 7:  16%|█▋        | 981/5971 [09:18<47:19,  1.76it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 26.34it/s][A
Epoch 7:  16%|█▋        | 985/5971 [09:18<47:06,  1.76it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  49%|████▉     | 82/167 [00:03<00:03, 26.54it/s][A
Epoch 7:  17%|█▋        | 989/5971 [09:19<46:53,  1.77it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  51%|█████     | 85/167 [00:03<00:03, 25.72it/s][A

Validating:  53%|█████▎    | 88/167 [00:04<00:03, 23.47it/s][A
Epoch 7:  17%|█▋        | 993/5971 [09:19<46:40,  1.78it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  54%|█████▍    | 91/167 [00:04<00:03, 23.37it/s][A
Epoch 7:  17%|█▋        | 997/5971 [09:19<46:28,  1.78it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 24.50it/s][A
Epoch 7:  17%|█▋        | 1001/5971 [09:19<46:15,  1.79it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 23.79it/s][A

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 23.87it/s][A
Epoch 7:  17%|█▋        | 1005/5971 [09:19<46:03,  1.80it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 24.12it/s][A
Epoch 7:  17%|█▋        | 1009/5971 [09:19<45:50,  1.80it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 24.73it/s][A
Epoch 7:  17%|█▋        | 1013/5971 [09:20<45:38,  1.81it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 26.04it/s][A

Validating:  67%|██████▋   | 112/167 [00:04<00:02, 26.66it/s][A
Epoch 7:  17%|█▋        | 1017/5971 [09:20<45:26,  1.82it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  69%|██████▉   | 115/167 [00:05<00:01, 26.78it/s][A
Epoch 7:  17%|█▋        | 1021/5971 [09:20<45:14,  1.82it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  71%|███████   | 118/167 [00:05<00:01, 27.05it/s][A
Epoch 7:  17%|█▋        | 1025/5971 [09:20<45:02,  1.83it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 27.57it/s][A

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 27.90it/s][A
Epoch 7:  17%|█▋        | 1029/5971 [09:20<44:50,  1.84it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 26.57it/s][A
Epoch 7:  17%|█▋        | 1033/5971 [09:20<44:38,  1.84it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 27.20it/s][A
Epoch 7:  17%|█▋        | 1037/5971 [09:20<44:26,  1.85it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 25.33it/s][A

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 26.16it/s][A
Epoch 7:  17%|█▋        | 1041/5971 [09:21<44:14,  1.86it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  83%|████████▎ | 139/167 [00:06<00:01, 25.80it/s][A
Epoch 7:  18%|█▊        | 1045/5971 [09:21<44:03,  1.86it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 27.49it/s][A
Epoch 7:  18%|█▊        | 1049/5971 [09:21<43:51,  1.87it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 26.51it/s][A
Epoch 7:  18%|█▊        | 1053/5971 [09:21<43:40,  1.88it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 27.38it/s][A

Validating:  91%|█████████ | 152/167 [00:06<00:00, 27.91it/s][A
Epoch 7:  18%|█▊        | 1057/5971 [09:21<43:28,  1.88it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 26.88it/s][A
Epoch 7:  18%|█▊        | 1061/5971 [09:21<43:17,  1.89it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 26.96it/s][A
Epoch 7:  18%|█▊        | 1065/5971 [09:21<43:06,  1.90it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 27.62it/s][A

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 25.74it/s][A
Epoch 7:  18%|█▊        | 1069/5971 [09:22<42:55,  1.90it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating: 100%|██████████| 167/167 [00:07<00:00, 26.49it/s][A
Epoch 7:  18%|█▊        | 1072/5971 [09:22<42:47,  1.91it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  18%|█▊        | 1073/5971 [09:23<42:49,  1.91it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=9.11e-6, train/loss_step=0.00151, global_step=4124.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  18%|█▊        | 1073/5971 [09:23<42:49,  1.91it/s, loss=0.143, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000779, train/loss_step=0.204, global_step=4125.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  18%|█▊        | 1074/5971 [09:24<42:50,  1.91it/s, loss=0.148, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000377, train/loss_step=0.115, global_step=4125.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  18%|█▊        | 1075/5971 [09:25<42:51,  1.90it/s, loss=0.162, v_num=0, train/loss_simple_step=0.398, train/loss_vlb_step=0.0021, train/loss_step=0.398, global_step=4125.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  18%|█▊        | 1076/5971 [09:27<42:58,  1.90it/s, loss=0.168, v_num=0, train/loss_simple_step=0.331, train/loss_vlb_step=0.00159, train/loss_step=0.331, global_step=4125.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  18%|█▊        | 1077/5971 [09:28<42:59,  1.90it/s, loss=0.168, v_num=0, train/loss_simple_step=0.331, train/loss_vlb_step=0.00159, train/loss_step=0.331, global_step=4125.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  18%|█▊        | 1077/5971 [09:28<42:59,  1.90it/s, loss=0.166, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.000817, train/loss_step=0.229, global_step=4126.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  18%|█▊        | 1078/5971 [09:29<43:00,  1.90it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.21e-5, train/loss_step=0.00216, global_step=4126.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  18%|█▊        | 1079/5971 [09:29<43:01,  1.89it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00615, train/loss_vlb_step=2.79e-5, train/loss_step=0.00615, global_step=4126.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  18%|█▊        | 1080/5971 [09:32<43:08,  1.89it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00602, train/loss_vlb_step=2.83e-5, train/loss_step=0.00602, global_step=4126.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  18%|█▊        | 1081/5971 [09:33<43:09,  1.89it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00602, train/loss_vlb_step=2.83e-5, train/loss_step=0.00602, global_step=4126.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  18%|█▊        | 1081/5971 [09:33<43:09,  1.89it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00437, train/loss_vlb_step=2.27e-5, train/loss_step=0.00437, global_step=4127.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  18%|█▊        | 1082/5971 [09:33<43:10,  1.89it/s, loss=0.138, v_num=0, train/loss_simple_step=0.422, train/loss_vlb_step=0.00268, train/loss_step=0.422, global_step=4127.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  18%|█▊        | 1083/5971 [09:34<43:12,  1.89it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0106, train/loss_vlb_step=4.97e-5, train/loss_step=0.0106, global_step=4127.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  18%|█▊        | 1084/5971 [09:37<43:18,  1.88it/s, loss=0.137, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000534, train/loss_step=0.161, global_step=4127.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  18%|█▊        | 1085/5971 [09:37<43:19,  1.88it/s, loss=0.137, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000534, train/loss_step=0.161, global_step=4127.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  18%|█▊        | 1085/5971 [09:37<43:19,  1.88it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00359, train/loss_vlb_step=1.85e-5, train/loss_step=0.00359, global_step=4128.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  18%|█▊        | 1086/5971 [09:38<43:20,  1.88it/s, loss=0.138, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.000578, train/loss_step=0.171, global_step=4128.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  18%|█▊        | 1087/5971 [09:39<43:21,  1.88it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0613, train/loss_vlb_step=0.000218, train/loss_step=0.0613, global_step=4128.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  18%|█▊        | 1088/5971 [09:42<43:30,  1.87it/s, loss=0.126, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000727, train/loss_step=0.193, global_step=4128.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  18%|█▊        | 1089/5971 [09:42<43:31,  1.87it/s, loss=0.126, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000727, train/loss_step=0.193, global_step=4128.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  18%|█▊        | 1089/5971 [09:42<43:31,  1.87it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00925, train/loss_vlb_step=3.9e-5, train/loss_step=0.00925, global_step=4129.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  18%|█▊        | 1090/5971 [09:43<43:31,  1.87it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0749, train/loss_vlb_step=0.000247, train/loss_step=0.0749, global_step=4129.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  18%|█▊        | 1091/5971 [09:44<43:32,  1.87it/s, loss=0.136, v_num=0, train/loss_simple_step=0.321, train/loss_vlb_step=0.00136, train/loss_step=0.321, global_step=4129.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  18%|█▊        | 1092/5971 [09:46<43:39,  1.86it/s, loss=0.147, v_num=0, train/loss_simple_step=0.226, train/loss_vlb_step=0.000889, train/loss_step=0.226, global_step=4129.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  18%|█▊        | 1093/5971 [09:47<43:40,  1.86it/s, loss=0.147, v_num=0, train/loss_simple_step=0.226, train/loss_vlb_step=0.000889, train/loss_step=0.226, global_step=4129.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  18%|█▊        | 1093/5971 [09:47<43:40,  1.86it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0643, train/loss_vlb_step=0.000211, train/loss_step=0.0643, global_step=4130.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  18%|█▊        | 1094/5971 [09:48<43:41,  1.86it/s, loss=0.137, v_num=0, train/loss_simple_step=0.037, train/loss_vlb_step=0.000134, train/loss_step=0.037, global_step=4130.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  18%|█▊        | 1095/5971 [09:49<43:42,  1.86it/s, loss=0.14, v_num=0, train/loss_simple_step=0.466, train/loss_vlb_step=0.00371, train/loss_step=0.466, global_step=4130.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  18%|█▊        | 1096/5971 [09:51<43:50,  1.85it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0205, train/loss_vlb_step=8.73e-5, train/loss_step=0.0205, global_step=4130.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  18%|█▊        | 1097/5971 [09:52<43:51,  1.85it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0205, train/loss_vlb_step=8.73e-5, train/loss_step=0.0205, global_step=4130.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  18%|█▊        | 1097/5971 [09:52<43:51,  1.85it/s, loss=0.132, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00242, train/loss_step=0.382, global_step=4131.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  18%|█▊        | 1098/5971 [09:53<43:52,  1.85it/s, loss=0.155, v_num=0, train/loss_simple_step=0.468, train/loss_vlb_step=0.00334, train/loss_step=0.468, global_step=4131.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  18%|█▊        | 1099/5971 [09:54<43:53,  1.85it/s, loss=0.16, v_num=0, train/loss_simple_step=0.094, train/loss_vlb_step=0.000309, train/loss_step=0.094, global_step=4131.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  18%|█▊        | 1100/5971 [09:56<43:59,  1.85it/s, loss=0.167, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000504, train/loss_step=0.143, global_step=4131.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  18%|█▊        | 1101/5971 [09:57<44:00,  1.84it/s, loss=0.167, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000504, train/loss_step=0.143, global_step=4131.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  18%|█▊        | 1101/5971 [09:57<44:00,  1.84it/s, loss=0.17, v_num=0, train/loss_simple_step=0.065, train/loss_vlb_step=0.000222, train/loss_step=0.065, global_step=4132.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  18%|█▊        | 1102/5971 [09:58<44:01,  1.84it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00346, train/loss_vlb_step=1.94e-5, train/loss_step=0.00346, global_step=4132.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  18%|█▊        | 1103/5971 [09:59<44:02,  1.84it/s, loss=0.182, v_num=0, train/loss_simple_step=0.675, train/loss_vlb_step=0.0158, train/loss_step=0.675, global_step=4132.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]     
Epoch 7:  18%|█▊        | 1104/5971 [10:01<44:08,  1.84it/s, loss=0.213, v_num=0, train/loss_simple_step=0.778, train/loss_vlb_step=0.0208, train/loss_step=0.778, global_step=4132.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▊        | 1105/5971 [10:02<44:09,  1.84it/s, loss=0.213, v_num=0, train/loss_simple_step=0.778, train/loss_vlb_step=0.0208, train/loss_step=0.778, global_step=4132.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▊        | 1105/5971 [10:02<44:09,  1.84it/s, loss=0.257, v_num=0, train/loss_simple_step=0.890, train/loss_vlb_step=0.150, train/loss_step=0.890, global_step=4133.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  19%|█▊        | 1106/5971 [10:03<44:10,  1.84it/s, loss=0.257, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000535, train/loss_step=0.158, global_step=4133.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▊        | 1107/5971 [10:03<44:11,  1.83it/s, loss=0.258, v_num=0, train/loss_simple_step=0.099, train/loss_vlb_step=0.000329, train/loss_step=0.099, global_step=4133.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▊        | 1108/5971 [10:06<44:17,  1.83it/s, loss=0.252, v_num=0, train/loss_simple_step=0.0671, train/loss_vlb_step=0.000226, train/loss_step=0.0671, global_step=4133.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▊        | 1109/5971 [10:07<44:18,  1.83it/s, loss=0.252, v_num=0, train/loss_simple_step=0.0671, train/loss_vlb_step=0.000226, train/loss_step=0.0671, global_step=4133.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▊        | 1109/5971 [10:07<44:18,  1.83it/s, loss=0.272, v_num=0, train/loss_simple_step=0.413, train/loss_vlb_step=0.00339, train/loss_step=0.413, global_step=4134.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  19%|█▊        | 1110/5971 [10:07<44:19,  1.83it/s, loss=0.28, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000863, train/loss_step=0.222, global_step=4134.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▊        | 1111/5971 [10:08<44:20,  1.83it/s, loss=0.269, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000388, train/loss_step=0.118, global_step=4134.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▊        | 1112/5971 [10:10<44:26,  1.82it/s, loss=0.26, v_num=0, train/loss_simple_step=0.0283, train/loss_vlb_step=0.000113, train/loss_step=0.0283, global_step=4134.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▊        | 1113/5971 [10:11<44:27,  1.82it/s, loss=0.26, v_num=0, train/loss_simple_step=0.0283, train/loss_vlb_step=0.000113, train/loss_step=0.0283, global_step=4134.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▊        | 1113/5971 [10:11<44:27,  1.82it/s, loss=0.257, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.43e-5, train/loss_step=0.0132, global_step=4135.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▊        | 1114/5971 [10:12<44:28,  1.82it/s, loss=0.288, v_num=0, train/loss_simple_step=0.648, train/loss_vlb_step=0.0126, train/loss_step=0.648, global_step=4135.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  19%|█▊        | 1115/5971 [10:13<44:29,  1.82it/s, loss=0.265, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.68e-5, train/loss_step=0.0102, global_step=4135.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▊        | 1116/5971 [10:16<44:37,  1.81it/s, loss=0.268, v_num=0, train/loss_simple_step=0.092, train/loss_vlb_step=0.000303, train/loss_step=0.092, global_step=4135.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  19%|█▊        | 1117/5971 [10:16<44:38,  1.81it/s, loss=0.268, v_num=0, train/loss_simple_step=0.092, train/loss_vlb_step=0.000303, train/loss_step=0.092, global_step=4135.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▊        | 1117/5971 [10:16<44:38,  1.81it/s, loss=0.258, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000622, train/loss_step=0.170, global_step=4136.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▊        | 1118/5971 [10:17<44:39,  1.81it/s, loss=0.238, v_num=0, train/loss_simple_step=0.0793, train/loss_vlb_step=0.000261, train/loss_step=0.0793, global_step=4136.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▊        | 1119/5971 [10:18<44:40,  1.81it/s, loss=0.235, v_num=0, train/loss_simple_step=0.0263, train/loss_vlb_step=0.000103, train/loss_step=0.0263, global_step=4136.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1120/5971 [10:20<44:47,  1.81it/s, loss=0.233, v_num=0, train/loss_simple_step=0.098, train/loss_vlb_step=0.000324, train/loss_step=0.098, global_step=4136.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  19%|█▉        | 1121/5971 [10:21<44:48,  1.80it/s, loss=0.233, v_num=0, train/loss_simple_step=0.098, train/loss_vlb_step=0.000324, train/loss_step=0.098, global_step=4136.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1121/5971 [10:21<44:48,  1.80it/s, loss=0.23, v_num=0, train/loss_simple_step=0.00568, train/loss_vlb_step=2.83e-5, train/loss_step=0.00568, global_step=4137.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1122/5971 [10:22<44:49,  1.80it/s, loss=0.239, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000718, train/loss_step=0.190, global_step=4137.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  19%|█▉        | 1123/5971 [10:23<44:49,  1.80it/s, loss=0.209, v_num=0, train/loss_simple_step=0.0737, train/loss_vlb_step=0.000244, train/loss_step=0.0737, global_step=4137.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1124/5971 [10:25<44:55,  1.80it/s, loss=0.184, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.00105, train/loss_step=0.274, global_step=4137.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  19%|█▉        | 1125/5971 [10:26<44:56,  1.80it/s, loss=0.184, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.00105, train/loss_step=0.274, global_step=4137.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1125/5971 [10:26<44:56,  1.80it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00699, train/loss_vlb_step=3.29e-5, train/loss_step=0.00699, global_step=4138.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1126/5971 [10:27<44:57,  1.80it/s, loss=0.141, v_num=0, train/loss_simple_step=0.182, train/loss_vlb_step=0.00067, train/loss_step=0.182, global_step=4138.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  19%|█▉        | 1127/5971 [10:28<44:58,  1.80it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0103, train/loss_vlb_step=4.65e-5, train/loss_step=0.0103, global_step=4138.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1128/5971 [10:30<45:04,  1.79it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00682, train/loss_vlb_step=3.29e-5, train/loss_step=0.00682, global_step=4138.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1129/5971 [10:31<45:05,  1.79it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00682, train/loss_vlb_step=3.29e-5, train/loss_step=0.00682, global_step=4138.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1129/5971 [10:31<45:05,  1.79it/s, loss=0.139, v_num=0, train/loss_simple_step=0.519, train/loss_vlb_step=0.00648, train/loss_step=0.519, global_step=4139.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  19%|█▉        | 1130/5971 [10:32<45:06,  1.79it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0914, train/loss_vlb_step=0.0003, train/loss_step=0.0914, global_step=4139.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1131/5971 [10:33<45:07,  1.79it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00718, train/loss_vlb_step=3.53e-5, train/loss_step=0.00718, global_step=4139.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1132/5971 [10:35<45:13,  1.78it/s, loss=0.147, v_num=0, train/loss_simple_step=0.446, train/loss_vlb_step=0.00323, train/loss_step=0.446, global_step=4139.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  19%|█▉        | 1133/5971 [10:36<45:14,  1.78it/s, loss=0.147, v_num=0, train/loss_simple_step=0.446, train/loss_vlb_step=0.00323, train/loss_step=0.446, global_step=4139.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1133/5971 [10:36<45:14,  1.78it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00335, train/loss_vlb_step=1.79e-5, train/loss_step=0.00335, global_step=4140.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1134/5971 [10:37<45:15,  1.78it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0208, train/loss_vlb_step=8.58e-5, train/loss_step=0.0208, global_step=4140.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  19%|█▉        | 1135/5971 [10:38<45:16,  1.78it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.97e-5, train/loss_step=0.0108, global_step=4140.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1136/5971 [10:40<45:23,  1.78it/s, loss=0.128, v_num=0, train/loss_simple_step=0.347, train/loss_vlb_step=0.00224, train/loss_step=0.347, global_step=4140.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  19%|█▉        | 1137/5971 [10:41<45:23,  1.77it/s, loss=0.128, v_num=0, train/loss_simple_step=0.347, train/loss_vlb_step=0.00224, train/loss_step=0.347, global_step=4140.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1137/5971 [10:41<45:23,  1.77it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0361, train/loss_vlb_step=0.000132, train/loss_step=0.0361, global_step=4141.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1138/5971 [10:42<45:24,  1.77it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0702, train/loss_vlb_step=0.000237, train/loss_step=0.0702, global_step=4141.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1139/5971 [10:42<45:25,  1.77it/s, loss=0.145, v_num=0, train/loss_simple_step=0.509, train/loss_vlb_step=0.00468, train/loss_step=0.509, global_step=4141.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  19%|█▉        | 1140/5971 [10:45<45:31,  1.77it/s, loss=0.147, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000467, train/loss_step=0.138, global_step=4141.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1141/5971 [10:46<45:32,  1.77it/s, loss=0.147, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000467, train/loss_step=0.138, global_step=4141.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1141/5971 [10:46<45:32,  1.77it/s, loss=0.153, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000421, train/loss_step=0.124, global_step=4142.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1142/5971 [10:46<45:33,  1.77it/s, loss=0.151, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000463, train/loss_step=0.140, global_step=4142.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1143/5971 [10:47<45:33,  1.77it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00392, train/loss_vlb_step=2.2e-5, train/loss_step=0.00392, global_step=4142.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1144/5971 [10:50<45:43,  1.76it/s, loss=0.151, v_num=0, train/loss_simple_step=0.350, train/loss_vlb_step=0.00171, train/loss_step=0.350, global_step=4142.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  19%|█▉        | 1145/5971 [10:51<45:44,  1.76it/s, loss=0.151, v_num=0, train/loss_simple_step=0.350, train/loss_vlb_step=0.00171, train/loss_step=0.350, global_step=4142.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1145/5971 [10:51<45:44,  1.76it/s, loss=0.153, v_num=0, train/loss_simple_step=0.037, train/loss_vlb_step=0.000135, train/loss_step=0.037, global_step=4143.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1146/5971 [10:52<45:44,  1.76it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00166, train/loss_vlb_step=9.73e-6, train/loss_step=0.00166, global_step=4143.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1147/5971 [10:53<45:45,  1.76it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0178, train/loss_vlb_step=7.01e-5, train/loss_step=0.0178, global_step=4143.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  19%|█▉        | 1148/5971 [10:55<45:51,  1.75it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0716, train/loss_vlb_step=0.000235, train/loss_step=0.0716, global_step=4143.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1149/5971 [10:56<45:52,  1.75it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0716, train/loss_vlb_step=0.000235, train/loss_step=0.0716, global_step=4143.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1149/5971 [10:56<45:52,  1.75it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0928, train/loss_vlb_step=0.000308, train/loss_step=0.0928, global_step=4144.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1150/5971 [10:57<45:52,  1.75it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0366, train/loss_vlb_step=0.000132, train/loss_step=0.0366, global_step=4144.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1151/5971 [10:58<45:53,  1.75it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00559, train/loss_vlb_step=2.67e-5, train/loss_step=0.00559, global_step=4144.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1152/5971 [11:00<45:59,  1.75it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0268, train/loss_vlb_step=9.84e-5, train/loss_step=0.0268, global_step=4144.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  19%|█▉        | 1153/5971 [11:01<46:00,  1.75it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0268, train/loss_vlb_step=9.84e-5, train/loss_step=0.0268, global_step=4144.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1153/5971 [11:01<46:00,  1.75it/s, loss=0.119, v_num=0, train/loss_simple_step=0.346, train/loss_vlb_step=0.00275, train/loss_step=0.346, global_step=4145.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  19%|█▉        | 1154/5971 [11:02<46:01,  1.74it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00812, train/loss_vlb_step=4.05e-5, train/loss_step=0.00812, global_step=4145.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1155/5971 [11:02<46:02,  1.74it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0754, train/loss_vlb_step=0.000261, train/loss_step=0.0754, global_step=4145.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  19%|█▉        | 1156/5971 [11:05<46:09,  1.74it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0398, train/loss_vlb_step=0.00015, train/loss_step=0.0398, global_step=4145.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  19%|█▉        | 1157/5971 [11:06<46:09,  1.74it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0398, train/loss_vlb_step=0.00015, train/loss_step=0.0398, global_step=4145.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1157/5971 [11:06<46:09,  1.74it/s, loss=0.121, v_num=0, train/loss_simple_step=0.334, train/loss_vlb_step=0.00148, train/loss_step=0.334, global_step=4146.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  19%|█▉        | 1158/5971 [11:07<46:10,  1.74it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0201, train/loss_vlb_step=8.27e-5, train/loss_step=0.0201, global_step=4146.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1159/5971 [11:08<46:11,  1.74it/s, loss=0.1, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000459, train/loss_step=0.138, global_step=4146.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  19%|█▉        | 1160/5971 [11:10<46:17,  1.73it/s, loss=0.0936, v_num=0, train/loss_simple_step=0.00276, train/loss_vlb_step=1.56e-5, train/loss_step=0.00276, global_step=4146.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1161/5971 [11:11<46:18,  1.73it/s, loss=0.0936, v_num=0, train/loss_simple_step=0.00276, train/loss_vlb_step=1.56e-5, train/loss_step=0.00276, global_step=4146.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  19%|█▉        | 1161/5971 [11:11<46:18,  1.73it/s, loss=0.0881, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.12e-5, train/loss_step=0.0138, global_step=4147.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  19%|█▉        | 1162/5971 [11:12<46:18,  1.73it/s, loss=0.0818, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=5.8e-5, train/loss_step=0.0141, global_step=4147.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  19%|█▉        | 1163/5971 [11:12<46:19,  1.73it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00136, train/loss_step=0.292, global_step=4147.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  19%|█▉        | 1164/5971 [11:15<46:26,  1.73it/s, loss=0.0982, v_num=0, train/loss_simple_step=0.389, train/loss_vlb_step=0.00214, train/loss_step=0.389, global_step=4147.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  20%|█▉        | 1165/5971 [11:16<46:26,  1.72it/s, loss=0.0982, v_num=0, train/loss_simple_step=0.389, train/loss_vlb_step=0.00214, train/loss_step=0.389, global_step=4147.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  20%|█▉        | 1165/5971 [11:16<46:26,  1.72it/s, loss=0.126, v_num=0, train/loss_simple_step=0.603, train/loss_vlb_step=0.008, train/loss_step=0.603, global_step=4148.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  20%|█▉        | 1166/5971 [11:16<46:27,  1.72it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.7e-5, train/loss_step=0.0104, global_step=4148.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  20%|█▉        | 1167/5971 [11:17<46:27,  1.72it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0399, train/loss_vlb_step=0.000151, train/loss_step=0.0399, global_step=4148.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  20%|█▉        | 1168/5971 [11:20<46:34,  1.72it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0172, train/loss_vlb_step=6.9e-5, train/loss_step=0.0172, global_step=4148.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  20%|█▉        | 1169/5971 [11:20<46:34,  1.72it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0172, train/loss_vlb_step=6.9e-5, train/loss_step=0.0172, global_step=4148.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  20%|█▉        | 1169/5971 [11:20<46:34,  1.72it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=5.04e-5, train/loss_step=0.0114, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  20%|█▉        | 1170/5971 [11:21<46:35,  1.72it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0312, train/loss_vlb_step=0.000118, train/loss_step=0.0312, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  20%|█▉        | 1171/5971 [11:22<46:36,  1.72it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0274, train/loss_vlb_step=0.000112, train/loss_step=0.0274, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  20%|█▉        | 1172/5971 [11:24<46:41,  1.71it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  20%|█▉        | 1173/5971 [11:24<46:39,  1.71it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:05,  2.53it/s][A

Validating:   1%|          | 2/167 [00:00<00:38,  4.26it/s][A
Epoch 7:  20%|█▉        | 1177/5971 [11:25<46:29,  1.72it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   3%|▎         | 5/167 [00:00<00:15, 10.56it/s][A

Validating:   5%|▍         | 8/167 [00:00<00:10, 14.90it/s][A
Epoch 7:  20%|█▉        | 1181/5971 [11:25<46:18,  1.72it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   7%|▋         | 11/167 [00:00<00:08, 17.39it/s][A
Epoch 7:  20%|█▉        | 1185/5971 [11:25<46:07,  1.73it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.42it/s][A
Epoch 7:  20%|█▉        | 1189/5971 [11:25<45:56,  1.73it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  10%|█         | 17/167 [00:01<00:06, 21.56it/s][A

Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.98it/s][A
Epoch 7:  20%|█▉        | 1193/5971 [11:26<45:45,  1.74it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  14%|█▍        | 23/167 [00:01<00:05, 24.06it/s][A
Epoch 7:  20%|██        | 1197/5971 [11:26<45:34,  1.75it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 25.12it/s][A
Epoch 7:  20%|██        | 1201/5971 [11:26<45:23,  1.75it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 25.55it/s][A

Validating:  19%|█▉        | 32/167 [00:01<00:05, 26.22it/s][A
Epoch 7:  20%|██        | 1205/5971 [11:26<45:13,  1.76it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  21%|██        | 35/167 [00:01<00:04, 26.89it/s][A
Epoch 7:  20%|██        | 1209/5971 [11:26<45:02,  1.76it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  23%|██▎       | 38/167 [00:01<00:05, 25.25it/s][A
Epoch 7:  20%|██        | 1213/5971 [11:26<44:52,  1.77it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 25.59it/s][A

Validating:  26%|██▋       | 44/167 [00:02<00:04, 26.16it/s][A
Epoch 7:  20%|██        | 1217/5971 [11:27<44:41,  1.77it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 27.04it/s][A
Epoch 7:  20%|██        | 1221/5971 [11:27<44:31,  1.78it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 26.36it/s][A
Epoch 7:  21%|██        | 1225/5971 [11:27<44:20,  1.78it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 26.33it/s][A

Validating:  34%|███▎      | 56/167 [00:02<00:04, 25.85it/s][A
Epoch 7:  21%|██        | 1229/5971 [11:27<44:10,  1.79it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  35%|███▌      | 59/167 [00:02<00:04, 26.10it/s][A
Epoch 7:  21%|██        | 1233/5971 [11:27<44:00,  1.79it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  37%|███▋      | 62/167 [00:02<00:03, 26.74it/s][A
Epoch 7:  21%|██        | 1237/5971 [11:27<43:50,  1.80it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  39%|███▉      | 65/167 [00:02<00:03, 26.90it/s][A

Validating:  41%|████      | 68/167 [00:03<00:03, 26.51it/s][A
Epoch 7:  21%|██        | 1241/5971 [11:27<43:39,  1.81it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 25.80it/s][A
Epoch 7:  21%|██        | 1245/5971 [11:28<43:29,  1.81it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 25.38it/s][A
Epoch 7:  21%|██        | 1249/5971 [11:28<43:20,  1.82it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 25.18it/s][A

Validating:  48%|████▊     | 80/167 [00:03<00:03, 24.98it/s][A
Epoch 7:  21%|██        | 1253/5971 [11:28<43:10,  1.82it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 24.69it/s][A
Epoch 7:  21%|██        | 1257/5971 [11:28<43:00,  1.83it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  51%|█████▏    | 86/167 [00:03<00:03, 25.39it/s][A
Epoch 7:  21%|██        | 1261/5971 [11:28<42:50,  1.83it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  53%|█████▎    | 89/167 [00:03<00:03, 25.39it/s][A
Epoch 7:  21%|██        | 1265/5971 [11:28<42:40,  1.84it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 26.04it/s][A

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 26.47it/s][A
Epoch 7:  21%|██▏       | 1269/5971 [11:29<42:31,  1.84it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 26.84it/s][A
Epoch 7:  21%|██▏       | 1273/5971 [11:29<42:21,  1.85it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  61%|██████    | 102/167 [00:04<00:02, 26.68it/s][A
Epoch 7:  21%|██▏       | 1277/5971 [11:29<42:11,  1.85it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 26.98it/s][A

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 27.01it/s][A
Epoch 7:  21%|██▏       | 1281/5971 [11:29<42:02,  1.86it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 26.90it/s][A
Epoch 7:  22%|██▏       | 1285/5971 [11:29<41:52,  1.86it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  68%|██████▊   | 114/167 [00:04<00:01, 26.55it/s][A
Epoch 7:  22%|██▏       | 1289/5971 [11:29<41:43,  1.87it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  70%|███████   | 117/167 [00:04<00:01, 26.65it/s][A

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 25.92it/s][A
Epoch 7:  22%|██▏       | 1293/5971 [11:29<41:34,  1.88it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 25.76it/s][A
Epoch 7:  22%|██▏       | 1297/5971 [11:30<41:24,  1.88it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 25.60it/s][A
Epoch 7:  22%|██▏       | 1301/5971 [11:30<41:15,  1.89it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 26.21it/s][A

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 26.82it/s][A
Epoch 7:  22%|██▏       | 1305/5971 [11:30<41:06,  1.89it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  81%|████████  | 135/167 [00:05<00:01, 26.79it/s][A
Epoch 7:  22%|██▏       | 1309/5971 [11:30<40:57,  1.90it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 26.00it/s][A
Epoch 7:  22%|██▏       | 1313/5971 [11:30<40:48,  1.90it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  84%|████████▍ | 141/167 [00:05<00:01, 25.21it/s][A

Validating:  86%|████████▌ | 144/167 [00:05<00:00, 24.96it/s][A
Epoch 7:  22%|██▏       | 1317/5971 [11:30<40:39,  1.91it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 25.66it/s][A
Epoch 7:  22%|██▏       | 1321/5971 [11:31<40:30,  1.91it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 25.21it/s][A
Epoch 7:  22%|██▏       | 1325/5971 [11:31<40:21,  1.92it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 26.74it/s][A
Epoch 7:  22%|██▏       | 1329/5971 [11:31<40:12,  1.92it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 27.22it/s][A

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 27.18it/s][A
Epoch 7:  22%|██▏       | 1333/5971 [11:31<40:04,  1.93it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 26.09it/s][A
Epoch 7:  22%|██▏       | 1337/5971 [11:31<39:55,  1.93it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 26.51it/s][A
Epoch 7:  22%|██▏       | 1340/5971 [11:32<39:49,  1.94it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Epoch 7:  22%|██▏       | 1341/5971 [11:32<39:50,  1.94it/s, loss=0.128, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000502, train/loss_step=0.153, global_step=4149.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  22%|██▏       | 1341/5971 [11:32<39:50,  1.94it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0762, train/loss_vlb_step=0.000258, train/loss_step=0.0762, global_step=4150.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  22%|██▏       | 1342/5971 [11:33<39:51,  1.94it/s, loss=0.131, v_num=0, train/loss_simple_step=0.326, train/loss_vlb_step=0.00138, train/loss_step=0.326, global_step=4150.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  22%|██▏       | 1343/5971 [11:34<39:52,  1.93it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00265, train/loss_vlb_step=1.5e-5, train/loss_step=0.00265, global_step=4150.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1344/5971 [11:37<39:58,  1.93it/s, loss=0.138, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.000968, train/loss_step=0.249, global_step=4150.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  23%|██▎       | 1345/5971 [11:38<39:59,  1.93it/s, loss=0.138, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.000968, train/loss_step=0.249, global_step=4150.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1345/5971 [11:38<39:59,  1.93it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00328, train/loss_vlb_step=1.73e-5, train/loss_step=0.00328, global_step=4151.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1346/5971 [11:39<40:00,  1.93it/s, loss=0.151, v_num=0, train/loss_simple_step=0.627, train/loss_vlb_step=0.0111, train/loss_step=0.627, global_step=4151.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]     
Epoch 7:  23%|██▎       | 1347/5971 [11:39<40:00,  1.93it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.21e-5, train/loss_step=0.00216, global_step=4151.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1348/5971 [11:41<40:05,  1.92it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00218, train/loss_vlb_step=1.28e-5, train/loss_step=0.00218, global_step=4151.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1349/5971 [11:42<40:06,  1.92it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00218, train/loss_vlb_step=1.28e-5, train/loss_step=0.00218, global_step=4151.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1349/5971 [11:42<40:06,  1.92it/s, loss=0.165, v_num=0, train/loss_simple_step=0.421, train/loss_vlb_step=0.00269, train/loss_step=0.421, global_step=4152.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  23%|██▎       | 1350/5971 [11:43<40:07,  1.92it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000145, train/loss_step=0.0406, global_step=4152.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1351/5971 [11:44<40:07,  1.92it/s, loss=0.154, v_num=0, train/loss_simple_step=0.056, train/loss_vlb_step=0.000191, train/loss_step=0.056, global_step=4152.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  23%|██▎       | 1352/5971 [11:47<40:13,  1.91it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0647, train/loss_vlb_step=0.00022, train/loss_step=0.0647, global_step=4152.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1353/5971 [11:47<40:14,  1.91it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0647, train/loss_vlb_step=0.00022, train/loss_step=0.0647, global_step=4152.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1353/5971 [11:47<40:14,  1.91it/s, loss=0.143, v_num=0, train/loss_simple_step=0.696, train/loss_vlb_step=0.0157, train/loss_step=0.696, global_step=4153.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  23%|██▎       | 1354/5971 [11:48<40:15,  1.91it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.37e-5, train/loss_step=0.0128, global_step=4153.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1355/5971 [11:49<40:15,  1.91it/s, loss=0.153, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.000917, train/loss_step=0.235, global_step=4153.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  23%|██▎       | 1356/5971 [11:51<40:20,  1.91it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00253, train/loss_vlb_step=1.43e-5, train/loss_step=0.00253, global_step=4153.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1357/5971 [11:52<40:21,  1.91it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00253, train/loss_vlb_step=1.43e-5, train/loss_step=0.00253, global_step=4153.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1357/5971 [11:52<40:21,  1.91it/s, loss=0.153, v_num=0, train/loss_simple_step=0.036, train/loss_vlb_step=0.000138, train/loss_step=0.036, global_step=4154.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  23%|██▎       | 1358/5971 [11:53<40:22,  1.90it/s, loss=0.164, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.00092, train/loss_step=0.243, global_step=4154.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  23%|██▎       | 1359/5971 [11:54<40:22,  1.90it/s, loss=0.168, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000379, train/loss_step=0.115, global_step=4154.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1360/5971 [11:56<40:28,  1.90it/s, loss=0.171, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000767, train/loss_step=0.213, global_step=4154.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1361/5971 [11:57<40:29,  1.90it/s, loss=0.171, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000767, train/loss_step=0.213, global_step=4154.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1361/5971 [11:57<40:29,  1.90it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0376, train/loss_vlb_step=0.000132, train/loss_step=0.0376, global_step=4155.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1362/5971 [11:58<40:30,  1.90it/s, loss=0.16, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000481, train/loss_step=0.145, global_step=4155.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  23%|██▎       | 1363/5971 [11:59<40:30,  1.90it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.00015, train/loss_step=0.0392, global_step=4155.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1364/5971 [12:01<40:35,  1.89it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0454, train/loss_vlb_step=0.000166, train/loss_step=0.0454, global_step=4155.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1365/5971 [12:02<40:36,  1.89it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0454, train/loss_vlb_step=0.000166, train/loss_step=0.0454, global_step=4155.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1365/5971 [12:02<40:36,  1.89it/s, loss=0.163, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000838, train/loss_step=0.233, global_step=4156.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  23%|██▎       | 1366/5971 [12:03<40:36,  1.89it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=6.15e-5, train/loss_step=0.0144, global_step=4156.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1367/5971 [12:04<40:37,  1.89it/s, loss=0.144, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000842, train/loss_step=0.218, global_step=4156.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  23%|██▎       | 1368/5971 [12:06<40:43,  1.88it/s, loss=0.158, v_num=0, train/loss_simple_step=0.298, train/loss_vlb_step=0.0013, train/loss_step=0.298, global_step=4156.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  23%|██▎       | 1369/5971 [12:07<40:44,  1.88it/s, loss=0.158, v_num=0, train/loss_simple_step=0.298, train/loss_vlb_step=0.0013, train/loss_step=0.298, global_step=4156.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1369/5971 [12:07<40:44,  1.88it/s, loss=0.147, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000663, train/loss_step=0.189, global_step=4157.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1370/5971 [12:08<40:45,  1.88it/s, loss=0.155, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000736, train/loss_step=0.208, global_step=4157.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1371/5971 [12:09<40:45,  1.88it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0307, train/loss_vlb_step=0.000114, train/loss_step=0.0307, global_step=4157.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1372/5971 [12:11<40:50,  1.88it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00359, train/loss_vlb_step=1.97e-5, train/loss_step=0.00359, global_step=4157.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1373/5971 [12:12<40:51,  1.88it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00359, train/loss_vlb_step=1.97e-5, train/loss_step=0.00359, global_step=4157.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1373/5971 [12:12<40:51,  1.88it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0379, train/loss_vlb_step=0.000142, train/loss_step=0.0379, global_step=4158.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  23%|██▎       | 1374/5971 [12:13<40:51,  1.88it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0232, train/loss_vlb_step=8.94e-5, train/loss_step=0.0232, global_step=4158.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  23%|██▎       | 1375/5971 [12:14<40:52,  1.87it/s, loss=0.113, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000411, train/loss_step=0.122, global_step=4158.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  23%|██▎       | 1376/5971 [12:16<40:57,  1.87it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=4.93e-5, train/loss_step=0.0113, global_step=4158.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1377/5971 [12:17<40:57,  1.87it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=4.93e-5, train/loss_step=0.0113, global_step=4158.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1377/5971 [12:17<40:57,  1.87it/s, loss=0.154, v_num=0, train/loss_simple_step=0.842, train/loss_vlb_step=0.0717, train/loss_step=0.842, global_step=4159.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  23%|██▎       | 1378/5971 [12:18<40:58,  1.87it/s, loss=0.157, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.00152, train/loss_step=0.322, global_step=4159.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1379/5971 [12:18<40:58,  1.87it/s, loss=0.163, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000907, train/loss_step=0.228, global_step=4159.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1380/5971 [12:21<41:05,  1.86it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00191, train/loss_vlb_step=1.12e-5, train/loss_step=0.00191, global_step=4159.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1381/5971 [12:22<41:05,  1.86it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00191, train/loss_vlb_step=1.12e-5, train/loss_step=0.00191, global_step=4159.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1381/5971 [12:22<41:05,  1.86it/s, loss=0.17, v_num=0, train/loss_simple_step=0.390, train/loss_vlb_step=0.00218, train/loss_step=0.390, global_step=4160.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]     
Epoch 7:  23%|██▎       | 1382/5971 [12:23<41:06,  1.86it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00308, train/loss_vlb_step=1.66e-5, train/loss_step=0.00308, global_step=4160.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1383/5971 [12:24<41:06,  1.86it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.99e-5, train/loss_step=0.0135, global_step=4160.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  23%|██▎       | 1384/5971 [12:26<41:11,  1.86it/s, loss=0.176, v_num=0, train/loss_simple_step=0.323, train/loss_vlb_step=0.00126, train/loss_step=0.323, global_step=4160.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  23%|██▎       | 1385/5971 [12:27<41:12,  1.85it/s, loss=0.176, v_num=0, train/loss_simple_step=0.323, train/loss_vlb_step=0.00126, train/loss_step=0.323, global_step=4160.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1385/5971 [12:27<41:12,  1.85it/s, loss=0.176, v_num=0, train/loss_simple_step=0.239, train/loss_vlb_step=0.000962, train/loss_step=0.239, global_step=4161.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1386/5971 [12:28<41:12,  1.85it/s, loss=0.193, v_num=0, train/loss_simple_step=0.346, train/loss_vlb_step=0.00147, train/loss_step=0.346, global_step=4161.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  23%|██▎       | 1387/5971 [12:28<41:13,  1.85it/s, loss=0.219, v_num=0, train/loss_simple_step=0.753, train/loss_vlb_step=0.0282, train/loss_step=0.753, global_step=4161.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  23%|██▎       | 1388/5971 [12:31<41:18,  1.85it/s, loss=0.208, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000213, train/loss_step=0.0642, global_step=4161.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1389/5971 [12:31<41:18,  1.85it/s, loss=0.208, v_num=0, train/loss_simple_step=0.0642, train/loss_vlb_step=0.000213, train/loss_step=0.0642, global_step=4161.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1389/5971 [12:31<41:18,  1.85it/s, loss=0.199, v_num=0, train/loss_simple_step=0.0268, train/loss_vlb_step=0.000105, train/loss_step=0.0268, global_step=4162.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1390/5971 [12:32<41:19,  1.85it/s, loss=0.209, v_num=0, train/loss_simple_step=0.397, train/loss_vlb_step=0.00249, train/loss_step=0.397, global_step=4162.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  23%|██▎       | 1391/5971 [12:33<41:19,  1.85it/s, loss=0.209, v_num=0, train/loss_simple_step=0.0391, train/loss_vlb_step=0.000138, train/loss_step=0.0391, global_step=4162.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1392/5971 [12:35<41:24,  1.84it/s, loss=0.21, v_num=0, train/loss_simple_step=0.0191, train/loss_vlb_step=8.09e-5, train/loss_step=0.0191, global_step=4162.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  23%|██▎       | 1393/5971 [12:36<41:25,  1.84it/s, loss=0.21, v_num=0, train/loss_simple_step=0.0191, train/loss_vlb_step=8.09e-5, train/loss_step=0.0191, global_step=4162.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1393/5971 [12:36<41:25,  1.84it/s, loss=0.209, v_num=0, train/loss_simple_step=0.0216, train/loss_vlb_step=8.23e-5, train/loss_step=0.0216, global_step=4163.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1394/5971 [12:37<41:25,  1.84it/s, loss=0.212, v_num=0, train/loss_simple_step=0.082, train/loss_vlb_step=0.000271, train/loss_step=0.082, global_step=4163.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  23%|██▎       | 1395/5971 [12:38<41:26,  1.84it/s, loss=0.21, v_num=0, train/loss_simple_step=0.0688, train/loss_vlb_step=0.000235, train/loss_step=0.0688, global_step=4163.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1396/5971 [12:40<41:30,  1.84it/s, loss=0.214, v_num=0, train/loss_simple_step=0.0974, train/loss_vlb_step=0.000324, train/loss_step=0.0974, global_step=4163.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1397/5971 [12:41<41:31,  1.84it/s, loss=0.214, v_num=0, train/loss_simple_step=0.0974, train/loss_vlb_step=0.000324, train/loss_step=0.0974, global_step=4163.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1397/5971 [12:41<41:31,  1.84it/s, loss=0.181, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000685, train/loss_step=0.190, global_step=4164.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  23%|██▎       | 1398/5971 [12:42<41:32,  1.84it/s, loss=0.197, v_num=0, train/loss_simple_step=0.638, train/loss_vlb_step=0.011, train/loss_step=0.638, global_step=4164.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  23%|██▎       | 1399/5971 [12:43<41:32,  1.83it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0402, train/loss_vlb_step=0.000142, train/loss_step=0.0402, global_step=4164.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1400/5971 [12:45<41:38,  1.83it/s, loss=0.194, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000448, train/loss_step=0.135, global_step=4164.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  23%|██▎       | 1401/5971 [12:46<41:38,  1.83it/s, loss=0.194, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000448, train/loss_step=0.135, global_step=4164.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1401/5971 [12:46<41:38,  1.83it/s, loss=0.186, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.00135, train/loss_step=0.225, global_step=4165.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  23%|██▎       | 1402/5971 [12:47<41:39,  1.83it/s, loss=0.197, v_num=0, train/loss_simple_step=0.212, train/loss_vlb_step=0.000808, train/loss_step=0.212, global_step=4165.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  23%|██▎       | 1403/5971 [12:48<41:39,  1.83it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0169, train/loss_vlb_step=6.51e-5, train/loss_step=0.0169, global_step=4165.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▎       | 1404/5971 [12:50<41:44,  1.82it/s, loss=0.197, v_num=0, train/loss_simple_step=0.338, train/loss_vlb_step=0.00157, train/loss_step=0.338, global_step=4165.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  24%|██▎       | 1405/5971 [12:51<41:44,  1.82it/s, loss=0.197, v_num=0, train/loss_simple_step=0.338, train/loss_vlb_step=0.00157, train/loss_step=0.338, global_step=4165.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▎       | 1405/5971 [12:51<41:44,  1.82it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00588, train/loss_vlb_step=2.87e-5, train/loss_step=0.00588, global_step=4166.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▎       | 1406/5971 [12:52<41:45,  1.82it/s, loss=0.176, v_num=0, train/loss_simple_step=0.159, train/loss_vlb_step=0.000542, train/loss_step=0.159, global_step=4166.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  24%|██▎       | 1407/5971 [12:53<41:45,  1.82it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0717, train/loss_vlb_step=0.000236, train/loss_step=0.0717, global_step=4166.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▎       | 1408/5971 [12:55<41:51,  1.82it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0097, train/loss_vlb_step=4.39e-5, train/loss_step=0.0097, global_step=4166.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  24%|██▎       | 1409/5971 [12:56<41:51,  1.82it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0097, train/loss_vlb_step=4.39e-5, train/loss_step=0.0097, global_step=4166.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▎       | 1409/5971 [12:56<41:51,  1.82it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0659, train/loss_vlb_step=0.000226, train/loss_step=0.0659, global_step=4167.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▎       | 1410/5971 [12:57<41:52,  1.82it/s, loss=0.138, v_num=0, train/loss_simple_step=0.333, train/loss_vlb_step=0.00175, train/loss_step=0.333, global_step=4167.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  24%|██▎       | 1411/5971 [12:58<41:52,  1.81it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00505, train/loss_vlb_step=2.64e-5, train/loss_step=0.00505, global_step=4167.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▎       | 1412/5971 [13:00<41:57,  1.81it/s, loss=0.176, v_num=0, train/loss_simple_step=0.806, train/loss_vlb_step=0.0417, train/loss_step=0.806, global_step=4167.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]     
Epoch 7:  24%|██▎       | 1413/5971 [13:01<41:57,  1.81it/s, loss=0.176, v_num=0, train/loss_simple_step=0.806, train/loss_vlb_step=0.0417, train/loss_step=0.806, global_step=4167.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▎       | 1413/5971 [13:01<41:57,  1.81it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0224, train/loss_vlb_step=8.94e-5, train/loss_step=0.0224, global_step=4168.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▎       | 1414/5971 [13:01<41:58,  1.81it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0129, train/loss_vlb_step=5.55e-5, train/loss_step=0.0129, global_step=4168.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▎       | 1415/5971 [13:02<41:58,  1.81it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0107, train/loss_vlb_step=4.91e-5, train/loss_step=0.0107, global_step=4168.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  24%|██▎       | 1416/5971 [13:04<42:03,  1.81it/s, loss=0.167, v_num=0, train/loss_simple_step=0.033, train/loss_vlb_step=0.000126, train/loss_step=0.033, global_step=4168.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▎       | 1417/5971 [13:05<42:03,  1.80it/s, loss=0.167, v_num=0, train/loss_simple_step=0.033, train/loss_vlb_step=0.000126, train/loss_step=0.033, global_step=4168.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▎       | 1417/5971 [13:05<42:03,  1.80it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00277, train/loss_vlb_step=1.5e-5, train/loss_step=0.00277, global_step=4169.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▎       | 1418/5971 [13:06<42:04,  1.80it/s, loss=0.131, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000345, train/loss_step=0.105, global_step=4169.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  24%|██▍       | 1419/5971 [13:07<42:04,  1.80it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0388, train/loss_vlb_step=0.000142, train/loss_step=0.0388, global_step=4169.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▍       | 1420/5971 [13:09<42:09,  1.80it/s, loss=0.142, v_num=0, train/loss_simple_step=0.367, train/loss_vlb_step=0.00213, train/loss_step=0.367, global_step=4169.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  24%|██▍       | 1421/5971 [13:10<42:09,  1.80it/s, loss=0.142, v_num=0, train/loss_simple_step=0.367, train/loss_vlb_step=0.00213, train/loss_step=0.367, global_step=4169.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▍       | 1421/5971 [13:10<42:09,  1.80it/s, loss=0.136, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000356, train/loss_step=0.108, global_step=4170.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▍       | 1422/5971 [13:11<42:10,  1.80it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0607, train/loss_vlb_step=0.000204, train/loss_step=0.0607, global_step=4170.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▍       | 1423/5971 [13:12<42:10,  1.80it/s, loss=0.147, v_num=0, train/loss_simple_step=0.376, train/loss_vlb_step=0.00282, train/loss_step=0.376, global_step=4170.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  24%|██▍       | 1424/5971 [13:14<42:15,  1.79it/s, loss=0.152, v_num=0, train/loss_simple_step=0.455, train/loss_vlb_step=0.00302, train/loss_step=0.455, global_step=4170.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▍       | 1425/5971 [13:15<42:15,  1.79it/s, loss=0.152, v_num=0, train/loss_simple_step=0.455, train/loss_vlb_step=0.00302, train/loss_step=0.455, global_step=4170.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▍       | 1425/5971 [13:15<42:15,  1.79it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0042, train/loss_vlb_step=2.26e-5, train/loss_step=0.0042, global_step=4171.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▍       | 1426/5971 [13:16<42:16,  1.79it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00208, train/loss_vlb_step=1.22e-5, train/loss_step=0.00208, global_step=4171.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▍       | 1427/5971 [13:17<42:16,  1.79it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00123, train/loss_vlb_step=7.4e-6, train/loss_step=0.00123, global_step=4171.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  24%|██▍       | 1428/5971 [13:19<42:22,  1.79it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00695, train/loss_vlb_step=3.46e-5, train/loss_step=0.00695, global_step=4171.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▍       | 1429/5971 [13:20<42:22,  1.79it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00695, train/loss_vlb_step=3.46e-5, train/loss_step=0.00695, global_step=4171.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▍       | 1429/5971 [13:20<42:22,  1.79it/s, loss=0.148, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000834, train/loss_step=0.208, global_step=4172.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  24%|██▍       | 1430/5971 [13:21<42:23,  1.79it/s, loss=0.168, v_num=0, train/loss_simple_step=0.726, train/loss_vlb_step=0.0254, train/loss_step=0.726, global_step=4172.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  24%|██▍       | 1431/5971 [13:22<42:24,  1.78it/s, loss=0.172, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000331, train/loss_step=0.101, global_step=4172.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▍       | 1432/5971 [13:24<42:28,  1.78it/s, loss=0.139, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000441, train/loss_step=0.134, global_step=4172.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▍       | 1433/5971 [13:25<42:28,  1.78it/s, loss=0.139, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000441, train/loss_step=0.134, global_step=4172.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▍       | 1433/5971 [13:25<42:28,  1.78it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00364, train/loss_vlb_step=1.85e-5, train/loss_step=0.00364, global_step=4173.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▍       | 1434/5971 [13:26<42:29,  1.78it/s, loss=0.142, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.00034, train/loss_step=0.103, global_step=4173.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  24%|██▍       | 1435/5971 [13:27<42:29,  1.78it/s, loss=0.154, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.00102, train/loss_step=0.233, global_step=4173.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▍       | 1436/5971 [13:29<42:33,  1.78it/s, loss=0.162, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000695, train/loss_step=0.202, global_step=4173.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▍       | 1437/5971 [13:30<42:34,  1.77it/s, loss=0.162, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000695, train/loss_step=0.202, global_step=4173.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▍       | 1437/5971 [13:30<42:34,  1.77it/s, loss=0.182, v_num=0, train/loss_simple_step=0.403, train/loss_vlb_step=0.00206, train/loss_step=0.403, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  24%|██▍       | 1438/5971 [13:31<42:34,  1.77it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00979, train/loss_vlb_step=4.48e-5, train/loss_step=0.00979, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▍       | 1439/5971 [13:31<42:35,  1.77it/s, loss=0.189, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00113, train/loss_step=0.264, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  24%|██▍       | 1440/5971 [13:34<42:39,  1.77it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  24%|██▍       | 1441/5971 [13:34<42:37,  1.77it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating: 0it [00:00, ?it/s]
Validating:   0%|          | 0/167 [00:00<?, ?it/s]
Validating:   1%|          | 1/167 [00:00<01:06,  2.50it/s]
Validating:   2%|▏         | 3/167 [00:00<00:25,  6.52it/s]
Epoch 7:  24%|██▍       | 1445/5971 [13:34<42:29,  1.78it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   4%|▎         | 6/167 [00:00<00:13, 12.07it/s]
Epoch 7:  24%|██▍       | 1449/5971 [13:34<42:21,  1.78it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   5%|▌         | 9/167 [00:00<00:10, 15.76it/s]
Validating:   7%|▋         | 12/167 [00:00<00:08, 19.14it/s]
Epoch 7:  24%|██▍       | 1453/5971 [13:34<42:12,  1.78it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   9%|▉         | 15/167 [00:00<00:07, 21.57it/s]
Epoch 7:  24%|██▍       | 1457/5971 [13:35<42:03,  1.79it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  11%|█         | 18/167 [00:01<00:06, 23.15it/s]
Epoch 7:  24%|██▍       | 1461/5971 [13:35<41:54,  1.79it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  13%|█▎        | 21/167 [00:01<00:05, 24.55it/s]
Validating:  14%|█▍        | 24/167 [00:01<00:05, 25.72it/s]
Epoch 7:  25%|██▍       | 1465/5971 [13:35<41:46,  1.80it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 25.77it/s][A
Epoch 7:  25%|██▍       | 1469/5971 [13:35<41:37,  1.80it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 25.80it/s][A
Epoch 7:  25%|██▍       | 1473/5971 [13:35<41:29,  1.81it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 26.21it/s][A

Validating:  22%|██▏       | 36/167 [00:01<00:04, 26.99it/s][A
Epoch 7:  25%|██▍       | 1477/5971 [13:35<41:20,  1.81it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  23%|██▎       | 39/167 [00:01<00:04, 27.00it/s][A
Epoch 7:  25%|██▍       | 1481/5971 [13:35<41:12,  1.82it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  25%|██▌       | 42/167 [00:01<00:04, 27.39it/s][A
Epoch 7:  25%|██▍       | 1485/5971 [13:36<41:03,  1.82it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 26.80it/s][A

Validating:  29%|██▊       | 48/167 [00:02<00:04, 27.13it/s][A
Epoch 7:  25%|██▍       | 1489/5971 [13:36<40:55,  1.83it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  31%|███       | 51/167 [00:02<00:04, 27.05it/s][A
Epoch 7:  25%|██▌       | 1493/5971 [13:36<40:47,  1.83it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 27.73it/s][A
Epoch 7:  25%|██▌       | 1497/5971 [13:36<40:38,  1.83it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  34%|███▍      | 57/167 [00:02<00:03, 27.91it/s][A
Epoch 7:  25%|██▌       | 1501/5971 [13:36<40:30,  1.84it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  37%|███▋      | 61/167 [00:02<00:03, 29.03it/s][A

Validating:  38%|███▊      | 64/167 [00:02<00:03, 27.66it/s][A
Epoch 7:  25%|██▌       | 1505/5971 [13:36<40:22,  1.84it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  41%|████      | 68/167 [00:02<00:03, 28.55it/s][A
Epoch 7:  25%|██▌       | 1509/5971 [13:36<40:14,  1.85it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 28.35it/s][A
Epoch 7:  25%|██▌       | 1513/5971 [13:37<40:05,  1.85it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 28.51it/s][A
Epoch 7:  25%|██▌       | 1517/5971 [13:37<39:57,  1.86it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 27.17it/s][A

Validating:  48%|████▊     | 80/167 [00:03<00:03, 27.57it/s][A
Epoch 7:  25%|██▌       | 1521/5971 [13:37<39:49,  1.86it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 25.72it/s][A
Epoch 7:  26%|██▌       | 1525/5971 [13:37<39:42,  1.87it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  51%|█████▏    | 86/167 [00:03<00:03, 26.11it/s][A
Epoch 7:  26%|██▌       | 1529/5971 [13:37<39:34,  1.87it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  53%|█████▎    | 89/167 [00:03<00:02, 26.40it/s][A

Validating:  55%|█████▌    | 92/167 [00:03<00:02, 26.97it/s][A
Epoch 7:  26%|██▌       | 1533/5971 [13:37<39:26,  1.88it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  57%|█████▋    | 95/167 [00:03<00:02, 25.85it/s][A
Epoch 7:  26%|██▌       | 1537/5971 [13:38<39:18,  1.88it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 25.92it/s][A
Epoch 7:  26%|██▌       | 1541/5971 [13:38<39:10,  1.88it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  61%|██████    | 102/167 [00:04<00:02, 27.54it/s][A
Epoch 7:  26%|██▌       | 1545/5971 [13:38<39:02,  1.89it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 28.13it/s][A

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 27.63it/s][A
Epoch 7:  26%|██▌       | 1549/5971 [13:38<38:54,  1.89it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 27.71it/s][A
Epoch 7:  26%|██▌       | 1553/5971 [13:38<38:47,  1.90it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  68%|██████▊   | 114/167 [00:04<00:01, 27.36it/s][A
Epoch 7:  26%|██▌       | 1557/5971 [13:38<38:39,  1.90it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  70%|███████   | 117/167 [00:04<00:01, 27.42it/s][A
Epoch 7:  26%|██▌       | 1561/5971 [13:38<38:31,  1.91it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  72%|███████▏  | 121/167 [00:04<00:01, 28.49it/s][A

Validating:  74%|███████▍  | 124/167 [00:04<00:01, 28.00it/s][A
Epoch 7:  26%|██▌       | 1565/5971 [13:39<38:24,  1.91it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 27.26it/s][A
Epoch 7:  26%|██▋       | 1569/5971 [13:39<38:16,  1.92it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 27.06it/s][A
Epoch 7:  26%|██▋       | 1573/5971 [13:39<38:09,  1.92it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 27.18it/s][A

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 27.93it/s][A
Epoch 7:  26%|██▋       | 1577/5971 [13:39<38:01,  1.93it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 26.46it/s][A
Epoch 7:  26%|██▋       | 1581/5971 [13:39<37:54,  1.93it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  85%|████████▌ | 142/167 [00:05<00:00, 26.75it/s][A
Epoch 7:  27%|██▋       | 1585/5971 [13:39<37:47,  1.93it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  87%|████████▋ | 146/167 [00:05<00:00, 27.84it/s][A
Epoch 7:  27%|██▋       | 1589/5971 [13:39<37:39,  1.94it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  89%|████████▉ | 149/167 [00:05<00:00, 28.05it/s][A

Validating:  91%|█████████ | 152/167 [00:05<00:00, 27.58it/s][A
Epoch 7:  27%|██▋       | 1593/5971 [13:40<37:32,  1.94it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 28.93it/s][A
Epoch 7:  27%|██▋       | 1597/5971 [13:40<37:24,  1.95it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 26.57it/s][A
Epoch 7:  27%|██▋       | 1601/5971 [13:40<37:17,  1.95it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 26.79it/s][A
Epoch 7:  27%|██▋       | 1605/5971 [13:40<37:10,  1.96it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  99%|█████████▉| 165/167 [00:06<00:00, 26.70it/s][A
Epoch 7:  27%|██▋       | 1608/5971 [13:40<37:05,  1.96it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Epoch 7:  27%|██▋       | 1609/5971 [13:41<37:06,  1.96it/s, loss=0.177, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4174.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  27%|██▋       | 1609/5971 [13:41<37:06,  1.96it/s, loss=0.171, v_num=0, train/loss_simple_step=0.00446, train/loss_vlb_step=2.32e-5, train/loss_step=0.00446, global_step=4175.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  27%|██▋       | 1610/5971 [13:42<37:07,  1.96it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0342, train/loss_vlb_step=0.000128, train/loss_step=0.0342, global_step=4175.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  27%|██▋       | 1611/5971 [13:43<37:07,  1.96it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0821, train/loss_vlb_step=0.000275, train/loss_step=0.0821, global_step=4175.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  27%|██▋       | 1612/5971 [13:45<37:12,  1.95it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.33e-5, train/loss_step=0.0104, global_step=4175.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  27%|██▋       | 1613/5971 [13:46<37:12,  1.95it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.33e-5, train/loss_step=0.0104, global_step=4175.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  27%|██▋       | 1613/5971 [13:46<37:12,  1.95it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00198, train/loss_vlb_step=1.18e-5, train/loss_step=0.00198, global_step=4176.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  27%|██▋       | 1614/5971 [13:47<37:12,  1.95it/s, loss=0.139, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000398, train/loss_step=0.121, global_step=4176.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  27%|██▋       | 1615/5971 [13:48<37:13,  1.95it/s, loss=0.148, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.00066, train/loss_step=0.185, global_step=4176.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  27%|██▋       | 1616/5971 [13:50<37:17,  1.95it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0403, train/loss_vlb_step=0.000148, train/loss_step=0.0403, global_step=4176.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  27%|██▋       | 1617/5971 [13:51<37:17,  1.95it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0403, train/loss_vlb_step=0.000148, train/loss_step=0.0403, global_step=4176.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  27%|██▋       | 1617/5971 [13:51<37:17,  1.95it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0572, train/loss_vlb_step=0.000195, train/loss_step=0.0572, global_step=4177.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  27%|██▋       | 1618/5971 [13:52<37:18,  1.94it/s, loss=0.121, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.00105, train/loss_step=0.289, global_step=4177.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  27%|██▋       | 1619/5971 [13:53<37:18,  1.94it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0167, train/loss_vlb_step=7.27e-5, train/loss_step=0.0167, global_step=4177.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  27%|██▋       | 1620/5971 [13:55<37:22,  1.94it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0517, train/loss_vlb_step=0.000182, train/loss_step=0.0517, global_step=4177.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  27%|██▋       | 1621/5971 [13:56<37:22,  1.94it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0517, train/loss_vlb_step=0.000182, train/loss_step=0.0517, global_step=4177.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  27%|██▋       | 1621/5971 [13:56<37:22,  1.94it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=6.88e-5, train/loss_step=0.0166, global_step=4178.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  27%|██▋       | 1622/5971 [13:57<37:23,  1.94it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00651, train/loss_vlb_step=3.18e-5, train/loss_step=0.00651, global_step=4178.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  27%|██▋       | 1623/5971 [13:58<37:23,  1.94it/s, loss=0.0964, v_num=0, train/loss_simple_step=0.00185, train/loss_vlb_step=1.04e-5, train/loss_step=0.00185, global_step=4178.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  27%|██▋       | 1624/5971 [14:00<37:28,  1.93it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.0091, train/loss_vlb_step=4.15e-5, train/loss_step=0.0091, global_step=4178.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  27%|██▋       | 1625/5971 [14:01<37:28,  1.93it/s, loss=0.0868, v_num=0, train/loss_simple_step=0.0091, train/loss_vlb_step=4.15e-5, train/loss_step=0.0091, global_step=4178.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  27%|██▋       | 1625/5971 [14:01<37:28,  1.93it/s, loss=0.0719, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000353, train/loss_step=0.107, global_step=4179.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  27%|██▋       | 1626/5971 [14:02<37:29,  1.93it/s, loss=0.0852, v_num=0, train/loss_simple_step=0.275, train/loss_vlb_step=0.00116, train/loss_step=0.275, global_step=4179.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  27%|██▋       | 1627/5971 [14:03<37:29,  1.93it/s, loss=0.0888, v_num=0, train/loss_simple_step=0.337, train/loss_vlb_step=0.00184, train/loss_step=0.337, global_step=4179.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  27%|██▋       | 1628/5971 [14:05<37:33,  1.93it/s, loss=0.0826, v_num=0, train/loss_simple_step=0.00428, train/loss_vlb_step=2.17e-5, train/loss_step=0.00428, global_step=4179.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  27%|██▋       | 1629/5971 [14:06<37:34,  1.93it/s, loss=0.0826, v_num=0, train/loss_simple_step=0.00428, train/loss_vlb_step=2.17e-5, train/loss_step=0.00428, global_step=4179.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  27%|██▋       | 1629/5971 [14:06<37:34,  1.93it/s, loss=0.0849, v_num=0, train/loss_simple_step=0.0506, train/loss_vlb_step=0.000173, train/loss_step=0.0506, global_step=4180.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  27%|██▋       | 1630/5971 [14:07<37:34,  1.93it/s, loss=0.0832, v_num=0, train/loss_simple_step=0.0011, train/loss_vlb_step=6.7e-6, train/loss_step=0.0011, global_step=4180.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  27%|██▋       | 1631/5971 [14:08<37:35,  1.92it/s, loss=0.0938, v_num=0, train/loss_simple_step=0.293, train/loss_vlb_step=0.00119, train/loss_step=0.293, global_step=4180.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  27%|██▋       | 1632/5971 [14:10<37:40,  1.92it/s, loss=0.105, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00101, train/loss_step=0.234, global_step=4180.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  27%|██▋       | 1633/5971 [14:11<37:40,  1.92it/s, loss=0.105, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.00101, train/loss_step=0.234, global_step=4180.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  27%|██▋       | 1633/5971 [14:11<37:40,  1.92it/s, loss=0.125, v_num=0, train/loss_simple_step=0.410, train/loss_vlb_step=0.00256, train/loss_step=0.410, global_step=4181.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  27%|██▋       | 1634/5971 [14:12<37:41,  1.92it/s, loss=0.139, v_num=0, train/loss_simple_step=0.395, train/loss_vlb_step=0.00242, train/loss_step=0.395, global_step=4181.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  27%|██▋       | 1635/5971 [14:13<37:41,  1.92it/s, loss=0.148, v_num=0, train/loss_simple_step=0.358, train/loss_vlb_step=0.00159, train/loss_step=0.358, global_step=4181.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  27%|██▋       | 1636/5971 [14:15<37:46,  1.91it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0153, train/loss_vlb_step=6.71e-5, train/loss_step=0.0153, global_step=4181.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  27%|██▋       | 1637/5971 [14:16<37:46,  1.91it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0153, train/loss_vlb_step=6.71e-5, train/loss_step=0.0153, global_step=4181.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  27%|██▋       | 1637/5971 [14:16<37:46,  1.91it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00155, train/loss_vlb_step=9.43e-6, train/loss_step=0.00155, global_step=4182.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  27%|██▋       | 1638/5971 [14:17<37:47,  1.91it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0722, train/loss_vlb_step=0.000243, train/loss_step=0.0722, global_step=4182.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  27%|██▋       | 1639/5971 [14:18<37:47,  1.91it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.53e-5, train/loss_step=0.0122, global_step=4182.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  27%|██▋       | 1640/5971 [14:20<37:51,  1.91it/s, loss=0.139, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000602, train/loss_step=0.172, global_step=4182.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  27%|██▋       | 1641/5971 [14:21<37:51,  1.91it/s, loss=0.139, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000602, train/loss_step=0.172, global_step=4182.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  27%|██▋       | 1641/5971 [14:21<37:51,  1.91it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0618, train/loss_vlb_step=0.000205, train/loss_step=0.0618, global_step=4183.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  27%|██▋       | 1642/5971 [14:22<37:52,  1.91it/s, loss=0.162, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00333, train/loss_step=0.427, global_step=4183.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  28%|██▊       | 1643/5971 [14:23<37:52,  1.90it/s, loss=0.196, v_num=0, train/loss_simple_step=0.692, train/loss_vlb_step=0.015, train/loss_step=0.692, global_step=4183.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  28%|██▊       | 1644/5971 [14:25<37:56,  1.90it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0298, train/loss_vlb_step=0.000114, train/loss_step=0.0298, global_step=4183.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1645/5971 [14:26<37:57,  1.90it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0298, train/loss_vlb_step=0.000114, train/loss_step=0.0298, global_step=4183.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1645/5971 [14:26<37:57,  1.90it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0381, train/loss_vlb_step=0.000146, train/loss_step=0.0381, global_step=4184.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1646/5971 [14:27<37:57,  1.90it/s, loss=0.21, v_num=0, train/loss_simple_step=0.594, train/loss_vlb_step=0.0139, train/loss_step=0.594, global_step=4184.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]     
Epoch 7:  28%|██▊       | 1647/5971 [14:28<37:57,  1.90it/s, loss=0.193, v_num=0, train/loss_simple_step=0.00691, train/loss_vlb_step=3.38e-5, train/loss_step=0.00691, global_step=4184.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1648/5971 [14:30<38:01,  1.89it/s, loss=0.193, v_num=0, train/loss_simple_step=0.00193, train/loss_vlb_step=1.13e-5, train/loss_step=0.00193, global_step=4184.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1649/5971 [14:31<38:01,  1.89it/s, loss=0.193, v_num=0, train/loss_simple_step=0.00193, train/loss_vlb_step=1.13e-5, train/loss_step=0.00193, global_step=4184.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1649/5971 [14:31<38:01,  1.89it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0761, train/loss_vlb_step=0.000254, train/loss_step=0.0761, global_step=4185.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  28%|██▊       | 1650/5971 [14:32<38:02,  1.89it/s, loss=0.195, v_num=0, train/loss_simple_step=0.00312, train/loss_vlb_step=1.75e-5, train/loss_step=0.00312, global_step=4185.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1651/5971 [14:32<38:02,  1.89it/s, loss=0.18, v_num=0, train/loss_simple_step=0.00527, train/loss_vlb_step=2.45e-5, train/loss_step=0.00527, global_step=4185.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  28%|██▊       | 1652/5971 [14:35<38:06,  1.89it/s, loss=0.185, v_num=0, train/loss_simple_step=0.323, train/loss_vlb_step=0.00169, train/loss_step=0.323, global_step=4185.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  28%|██▊       | 1653/5971 [14:36<38:07,  1.89it/s, loss=0.185, v_num=0, train/loss_simple_step=0.323, train/loss_vlb_step=0.00169, train/loss_step=0.323, global_step=4185.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1653/5971 [14:36<38:07,  1.89it/s, loss=0.172, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000526, train/loss_step=0.157, global_step=4186.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1654/5971 [14:36<38:07,  1.89it/s, loss=0.183, v_num=0, train/loss_simple_step=0.614, train/loss_vlb_step=0.00908, train/loss_step=0.614, global_step=4186.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  28%|██▊       | 1655/5971 [14:37<38:07,  1.89it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00168, train/loss_vlb_step=1.01e-5, train/loss_step=0.00168, global_step=4186.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1656/5971 [14:39<38:11,  1.88it/s, loss=0.174, v_num=0, train/loss_simple_step=0.183, train/loss_vlb_step=0.00066, train/loss_step=0.183, global_step=4186.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  28%|██▊       | 1657/5971 [14:40<38:11,  1.88it/s, loss=0.174, v_num=0, train/loss_simple_step=0.183, train/loss_vlb_step=0.00066, train/loss_step=0.183, global_step=4186.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1657/5971 [14:40<38:11,  1.88it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0153, train/loss_vlb_step=6.34e-5, train/loss_step=0.0153, global_step=4187.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1658/5971 [14:41<38:12,  1.88it/s, loss=0.189, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00179, train/loss_step=0.364, global_step=4187.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  28%|██▊       | 1659/5971 [14:42<38:12,  1.88it/s, loss=0.196, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000512, train/loss_step=0.153, global_step=4187.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1660/5971 [14:44<38:16,  1.88it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00401, train/loss_vlb_step=1.94e-5, train/loss_step=0.00401, global_step=4187.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1661/5971 [14:45<38:16,  1.88it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00401, train/loss_vlb_step=1.94e-5, train/loss_step=0.00401, global_step=4187.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1661/5971 [14:45<38:16,  1.88it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0622, train/loss_vlb_step=0.000211, train/loss_step=0.0622, global_step=4188.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  28%|██▊       | 1662/5971 [14:46<38:16,  1.88it/s, loss=0.18, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.00108, train/loss_step=0.283, global_step=4188.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  28%|██▊       | 1663/5971 [14:47<38:17,  1.88it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00758, train/loss_vlb_step=3.58e-5, train/loss_step=0.00758, global_step=4188.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1664/5971 [14:49<38:20,  1.87it/s, loss=0.151, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000417, train/loss_step=0.126, global_step=4188.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  28%|██▊       | 1665/5971 [14:50<38:21,  1.87it/s, loss=0.151, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000417, train/loss_step=0.126, global_step=4188.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1665/5971 [14:50<38:21,  1.87it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0123, train/loss_vlb_step=5.63e-5, train/loss_step=0.0123, global_step=4189.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1666/5971 [14:51<38:21,  1.87it/s, loss=0.128, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000567, train/loss_step=0.163, global_step=4189.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1667/5971 [14:52<38:21,  1.87it/s, loss=0.134, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.00039, train/loss_step=0.119, global_step=4189.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  28%|██▊       | 1668/5971 [14:54<38:25,  1.87it/s, loss=0.151, v_num=0, train/loss_simple_step=0.338, train/loss_vlb_step=0.00202, train/loss_step=0.338, global_step=4189.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1669/5971 [14:55<38:25,  1.87it/s, loss=0.151, v_num=0, train/loss_simple_step=0.338, train/loss_vlb_step=0.00202, train/loss_step=0.338, global_step=4189.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1669/5971 [14:55<38:25,  1.87it/s, loss=0.156, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.00071, train/loss_step=0.180, global_step=4190.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1670/5971 [14:55<38:26,  1.87it/s, loss=0.165, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000622, train/loss_step=0.184, global_step=4190.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1671/5971 [14:56<38:26,  1.86it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0968, train/loss_vlb_step=0.000324, train/loss_step=0.0968, global_step=4190.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1672/5971 [14:59<38:31,  1.86it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0448, train/loss_vlb_step=0.000161, train/loss_step=0.0448, global_step=4190.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1673/5971 [15:00<38:31,  1.86it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0448, train/loss_vlb_step=0.000161, train/loss_step=0.0448, global_step=4190.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1673/5971 [15:00<38:31,  1.86it/s, loss=0.161, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00115, train/loss_step=0.269, global_step=4191.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  28%|██▊       | 1674/5971 [15:01<38:31,  1.86it/s, loss=0.146, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00162, train/loss_step=0.305, global_step=4191.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1675/5971 [15:02<38:32,  1.86it/s, loss=0.151, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000339, train/loss_step=0.103, global_step=4191.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1676/5971 [15:04<38:35,  1.85it/s, loss=0.164, v_num=0, train/loss_simple_step=0.441, train/loss_vlb_step=0.00344, train/loss_step=0.441, global_step=4191.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  28%|██▊       | 1677/5971 [15:05<38:36,  1.85it/s, loss=0.164, v_num=0, train/loss_simple_step=0.441, train/loss_vlb_step=0.00344, train/loss_step=0.441, global_step=4191.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1677/5971 [15:05<38:36,  1.85it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=6.97e-5, train/loss_step=0.0156, global_step=4192.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1678/5971 [15:05<38:36,  1.85it/s, loss=0.151, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000346, train/loss_step=0.105, global_step=4192.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  28%|██▊       | 1679/5971 [15:06<38:36,  1.85it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0601, train/loss_vlb_step=0.000206, train/loss_step=0.0601, global_step=4192.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1680/5971 [15:08<38:40,  1.85it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0749, train/loss_vlb_step=0.000257, train/loss_step=0.0749, global_step=4192.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  28%|██▊       | 1681/5971 [15:09<38:40,  1.85it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0749, train/loss_vlb_step=0.000257, train/loss_step=0.0749, global_step=4192.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1681/5971 [15:09<38:40,  1.85it/s, loss=0.148, v_num=0, train/loss_simple_step=0.031, train/loss_vlb_step=0.000126, train/loss_step=0.031, global_step=4193.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  28%|██▊       | 1682/5971 [15:10<38:40,  1.85it/s, loss=0.135, v_num=0, train/loss_simple_step=0.017, train/loss_vlb_step=7.38e-5, train/loss_step=0.017, global_step=4193.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  28%|██▊       | 1683/5971 [15:11<38:41,  1.85it/s, loss=0.148, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.000976, train/loss_step=0.267, global_step=4193.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1684/5971 [15:13<38:44,  1.84it/s, loss=0.156, v_num=0, train/loss_simple_step=0.291, train/loss_vlb_step=0.00129, train/loss_step=0.291, global_step=4193.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  28%|██▊       | 1685/5971 [15:14<38:45,  1.84it/s, loss=0.156, v_num=0, train/loss_simple_step=0.291, train/loss_vlb_step=0.00129, train/loss_step=0.291, global_step=4193.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1685/5971 [15:14<38:45,  1.84it/s, loss=0.166, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000805, train/loss_step=0.217, global_step=4194.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1686/5971 [15:15<38:45,  1.84it/s, loss=0.167, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000621, train/loss_step=0.181, global_step=4194.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1687/5971 [15:16<38:45,  1.84it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00232, train/loss_vlb_step=1.34e-5, train/loss_step=0.00232, global_step=4194.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1688/5971 [15:18<38:49,  1.84it/s, loss=0.174, v_num=0, train/loss_simple_step=0.589, train/loss_vlb_step=0.0119, train/loss_step=0.589, global_step=4194.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]     
Epoch 7:  28%|██▊       | 1689/5971 [15:19<38:49,  1.84it/s, loss=0.174, v_num=0, train/loss_simple_step=0.589, train/loss_vlb_step=0.0119, train/loss_step=0.589, global_step=4194.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1689/5971 [15:19<38:49,  1.84it/s, loss=0.172, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000561, train/loss_step=0.135, global_step=4195.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1690/5971 [15:20<38:49,  1.84it/s, loss=0.167, v_num=0, train/loss_simple_step=0.087, train/loss_vlb_step=0.0003, train/loss_step=0.087, global_step=4195.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  28%|██▊       | 1691/5971 [15:21<38:50,  1.84it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0389, train/loss_vlb_step=0.000142, train/loss_step=0.0389, global_step=4195.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1692/5971 [15:23<38:53,  1.83it/s, loss=0.173, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.000908, train/loss_step=0.230, global_step=4195.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  28%|██▊       | 1693/5971 [15:24<38:54,  1.83it/s, loss=0.173, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.000908, train/loss_step=0.230, global_step=4195.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1693/5971 [15:24<38:54,  1.83it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00533, train/loss_vlb_step=2.77e-5, train/loss_step=0.00533, global_step=4196.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1694/5971 [15:25<38:54,  1.83it/s, loss=0.186, v_num=0, train/loss_simple_step=0.819, train/loss_vlb_step=0.0386, train/loss_step=0.819, global_step=4196.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  28%|██▊       | 1695/5971 [15:26<38:54,  1.83it/s, loss=0.189, v_num=0, train/loss_simple_step=0.178, train/loss_vlb_step=0.000672, train/loss_step=0.178, global_step=4196.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1696/5971 [15:28<38:58,  1.83it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00608, train/loss_vlb_step=3.02e-5, train/loss_step=0.00608, global_step=4196.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1697/5971 [15:29<38:58,  1.83it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00608, train/loss_vlb_step=3.02e-5, train/loss_step=0.00608, global_step=4196.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1697/5971 [15:29<38:58,  1.83it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00512, train/loss_vlb_step=2.65e-5, train/loss_step=0.00512, global_step=4197.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1698/5971 [15:29<38:58,  1.83it/s, loss=0.192, v_num=0, train/loss_simple_step=0.609, train/loss_vlb_step=0.00762, train/loss_step=0.609, global_step=4197.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  28%|██▊       | 1699/5971 [15:30<38:59,  1.83it/s, loss=0.198, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000563, train/loss_step=0.166, global_step=4197.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1700/5971 [15:33<39:03,  1.82it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0383, train/loss_vlb_step=0.000142, train/loss_step=0.0383, global_step=4197.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1701/5971 [15:34<39:03,  1.82it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0383, train/loss_vlb_step=0.000142, train/loss_step=0.0383, global_step=4197.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  28%|██▊       | 1701/5971 [15:34<39:03,  1.82it/s, loss=0.194, v_num=0, train/loss_simple_step=0.00116, train/loss_vlb_step=6.93e-6, train/loss_step=0.00116, global_step=4198.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  29%|██▊       | 1702/5971 [15:34<39:03,  1.82it/s, loss=0.214, v_num=0, train/loss_simple_step=0.417, train/loss_vlb_step=0.00361, train/loss_step=0.417, global_step=4198.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  29%|██▊       | 1703/5971 [15:35<39:03,  1.82it/s, loss=0.201, v_num=0, train/loss_simple_step=0.00315, train/loss_vlb_step=1.63e-5, train/loss_step=0.00315, global_step=4198.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  29%|██▊       | 1704/5971 [15:37<39:07,  1.82it/s, loss=0.215, v_num=0, train/loss_simple_step=0.570, train/loss_vlb_step=0.0108, train/loss_step=0.570, global_step=4198.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]     
Epoch 7:  29%|██▊       | 1705/5971 [15:38<39:07,  1.82it/s, loss=0.215, v_num=0, train/loss_simple_step=0.570, train/loss_vlb_step=0.0108, train/loss_step=0.570, global_step=4198.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  29%|██▊       | 1705/5971 [15:38<39:07,  1.82it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0145, train/loss_vlb_step=6.25e-5, train/loss_step=0.0145, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  29%|██▊       | 1706/5971 [15:39<39:07,  1.82it/s, loss=0.21, v_num=0, train/loss_simple_step=0.277, train/loss_vlb_step=0.00126, train/loss_step=0.277, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  29%|██▊       | 1707/5971 [15:40<39:08,  1.82it/s, loss=0.221, v_num=0, train/loss_simple_step=0.223, train/loss_vlb_step=0.000882, train/loss_step=0.223, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  29%|██▊       | 1708/5971 [15:42<39:11,  1.81it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  29%|██▊       | 1709/5971 [15:42<39:09,  1.81it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:09,  2.39it/s][A

Validating:   2%|▏         | 3/167 [00:00<00:25,  6.31it/s][A
Epoch 7:  29%|██▊       | 1713/5971 [15:43<39:03,  1.82it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   3%|▎         | 5/167 [00:00<00:18,  8.97it/s][A

Validating:   5%|▍         | 8/167 [00:00<00:12, 12.83it/s][A
Epoch 7:  29%|██▉       | 1717/5971 [15:43<38:56,  1.82it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.77it/s][A
Epoch 7:  29%|██▉       | 1721/5971 [15:43<38:49,  1.82it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.63it/s][A
Epoch 7:  29%|██▉       | 1725/5971 [15:43<38:41,  1.83it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  10%|█         | 17/167 [00:01<00:07, 20.86it/s][A

Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.46it/s][A
Epoch 7:  29%|██▉       | 1729/5971 [15:43<38:34,  1.83it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 23.12it/s][A
Epoch 7:  29%|██▉       | 1733/5971 [15:44<38:27,  1.84it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.01it/s][A
Epoch 7:  29%|██▉       | 1737/5971 [15:44<38:20,  1.84it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 24.13it/s][A

Validating:  19%|█▉        | 32/167 [00:01<00:05, 24.68it/s][A
Epoch 7:  29%|██▉       | 1741/5971 [15:44<38:13,  1.84it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  21%|██        | 35/167 [00:01<00:05, 24.68it/s][A
Epoch 7:  29%|██▉       | 1745/5971 [15:44<38:06,  1.85it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  23%|██▎       | 38/167 [00:01<00:05, 25.52it/s][A
Epoch 7:  29%|██▉       | 1749/5971 [15:44<37:59,  1.85it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 26.44it/s][A

Validating:  26%|██▋       | 44/167 [00:02<00:04, 26.98it/s][A
Epoch 7:  29%|██▉       | 1753/5971 [15:44<37:52,  1.86it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 27.71it/s][A
Epoch 7:  29%|██▉       | 1757/5971 [15:45<37:45,  1.86it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 26.11it/s][A
Epoch 7:  29%|██▉       | 1761/5971 [15:45<37:38,  1.86it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 26.93it/s][A
Epoch 7:  30%|██▉       | 1765/5971 [15:45<37:31,  1.87it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  34%|███▍      | 57/167 [00:02<00:03, 28.08it/s][A

Validating:  36%|███▌      | 60/167 [00:02<00:03, 28.49it/s][A
Epoch 7:  30%|██▉       | 1769/5971 [15:45<37:24,  1.87it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  38%|███▊      | 63/167 [00:02<00:03, 28.44it/s][A
Epoch 7:  30%|██▉       | 1773/5971 [15:45<37:17,  1.88it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  40%|███▉      | 66/167 [00:02<00:03, 28.13it/s][A
Epoch 7:  30%|██▉       | 1777/5971 [15:45<37:10,  1.88it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 27.00it/s][A

Validating:  43%|████▎     | 72/167 [00:03<00:03, 27.35it/s][A
Epoch 7:  30%|██▉       | 1781/5971 [15:45<37:04,  1.88it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 27.38it/s][A
Epoch 7:  30%|██▉       | 1785/5971 [15:46<36:57,  1.89it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 27.38it/s][A
Epoch 7:  30%|██▉       | 1789/5971 [15:46<36:50,  1.89it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 27.30it/s][A

Validating:  50%|█████     | 84/167 [00:03<00:03, 26.94it/s][A
Epoch 7:  30%|███       | 1793/5971 [15:46<36:43,  1.90it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  52%|█████▏    | 87/167 [00:03<00:03, 26.58it/s][A
Epoch 7:  30%|███       | 1797/5971 [15:46<36:37,  1.90it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  54%|█████▍    | 90/167 [00:03<00:02, 27.29it/s][A
Epoch 7:  30%|███       | 1801/5971 [15:46<36:30,  1.90it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  56%|█████▌    | 93/167 [00:03<00:02, 26.89it/s][A

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 27.44it/s][A
Epoch 7:  30%|███       | 1805/5971 [15:46<36:24,  1.91it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 27.71it/s][A
Epoch 7:  30%|███       | 1809/5971 [15:46<36:17,  1.91it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  61%|██████    | 102/167 [00:04<00:02, 27.72it/s][A
Epoch 7:  30%|███       | 1813/5971 [15:47<36:10,  1.92it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 27.52it/s][A

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 27.87it/s][A
Epoch 7:  30%|███       | 1817/5971 [15:47<36:04,  1.92it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  66%|██████▋   | 111/167 [00:04<00:01, 28.01it/s][A
Epoch 7:  30%|███       | 1821/5971 [15:47<35:57,  1.92it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  68%|██████▊   | 114/167 [00:04<00:01, 27.43it/s][A
Epoch 7:  31%|███       | 1825/5971 [15:47<35:51,  1.93it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  70%|███████   | 117/167 [00:04<00:01, 27.65it/s][A

Validating:  72%|███████▏  | 120/167 [00:04<00:01, 26.34it/s][A
Epoch 7:  31%|███       | 1829/5971 [15:47<35:44,  1.93it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 26.92it/s][A
Epoch 7:  31%|███       | 1833/5971 [15:47<35:38,  1.93it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 26.83it/s][A
Epoch 7:  31%|███       | 1837/5971 [15:47<35:32,  1.94it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 27.45it/s][A

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 27.87it/s][A
Epoch 7:  31%|███       | 1841/5971 [15:48<35:25,  1.94it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  81%|████████  | 135/167 [00:05<00:01, 28.22it/s][A
Epoch 7:  31%|███       | 1845/5971 [15:48<35:19,  1.95it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 25.74it/s][A
Epoch 7:  31%|███       | 1849/5971 [15:48<35:13,  1.95it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  84%|████████▍ | 141/167 [00:05<00:01, 25.72it/s][A
Epoch 7:  31%|███       | 1853/5971 [15:48<35:06,  1.95it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  87%|████████▋ | 145/167 [00:05<00:00, 27.44it/s][A

Validating:  89%|████████▊ | 148/167 [00:05<00:00, 27.97it/s][A
Epoch 7:  31%|███       | 1857/5971 [15:48<35:00,  1.96it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 27.42it/s][A
Epoch 7:  31%|███       | 1861/5971 [15:48<34:54,  1.96it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 27.22it/s][A
Epoch 7:  31%|███       | 1865/5971 [15:49<34:48,  1.97it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 25.82it/s][A

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 26.43it/s][A
Epoch 7:  31%|███▏      | 1869/5971 [15:49<34:42,  1.97it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 26.39it/s][A
Epoch 7:  31%|███▏      | 1873/5971 [15:49<34:35,  1.97it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating: 100%|██████████| 167/167 [00:06<00:00, 26.84it/s][A
Epoch 7:  31%|███▏      | 1876/5971 [15:49<34:32,  1.98it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Epoch 7:  31%|███▏      | 1876/5971 [15:59<34:53,  1.96it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.16it/s]

Epoch 7:  31%|███▏      | 1877/5971 [16:01<34:57,  1.95it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.32e-5, train/loss_step=0.00665, global_step=4199.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  31%|███▏      | 1877/5971 [16:01<34:57,  1.95it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=6.31e-5, train/loss_step=0.0143, global_step=4200.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.29it/s]

Epoch 7:  31%|███▏      | 1878/5971 [16:13<35:20,  1.93it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0143, train/loss_vlb_step=6.31e-5, train/loss_step=0.0143, global_step=4200.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  31%|███▏      | 1878/5971 [16:13<35:20,  1.93it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0469, train/loss_vlb_step=0.000174, train/loss_step=0.0469, global_step=4200.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.33it/s]

Epoch 7:  31%|███▏      | 1879/5971 [16:25<35:44,  1.91it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0469, train/loss_vlb_step=0.000174, train/loss_step=0.0469, global_step=4200.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  31%|███▏      | 1879/5971 [16:25<35:44,  1.91it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0399, train/loss_vlb_step=0.000148, train/loss_step=0.0399, global_step=4200.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.33it/s]

Epoch 7:  31%|███▏      | 1880/5971 [16:38<36:10,  1.88it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0399, train/loss_vlb_step=0.000148, train/loss_step=0.0399, global_step=4200.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  31%|███▏      | 1880/5971 [16:38<36:10,  1.88it/s, loss=0.206, v_num=0, train/loss_simple_step=0.671, train/loss_vlb_step=0.0171, train/loss_step=0.671, global_step=4200.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  32%|███▏      | 1881/5971 [16:39<36:11,  1.88it/s, loss=0.206, v_num=0, train/loss_simple_step=0.671, train/loss_vlb_step=0.0171, train/loss_step=0.671, global_step=4200.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1881/5971 [16:39<36:11,  1.88it/s, loss=0.206, v_num=0, train/loss_simple_step=0.00653, train/loss_vlb_step=3.04e-5, train/loss_step=0.00653, global_step=4201.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1882/5971 [16:39<36:11,  1.88it/s, loss=0.206, v_num=0, train/loss_simple_step=0.00653, train/loss_vlb_step=3.04e-5, train/loss_step=0.00653, global_step=4201.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1882/5971 [16:39<36:11,  1.88it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0181, train/loss_vlb_step=7.6e-5, train/loss_step=0.0181, global_step=4201.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  32%|███▏      | 1883/5971 [16:40<36:11,  1.88it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0181, train/loss_vlb_step=7.6e-5, train/loss_step=0.0181, global_step=4201.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1883/5971 [16:40<36:11,  1.88it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.38e-5, train/loss_step=0.0125, global_step=4201.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1884/5971 [16:43<36:14,  1.88it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.38e-5, train/loss_step=0.0125, global_step=4201.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1884/5971 [16:43<36:14,  1.88it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00238, train/loss_vlb_step=1.29e-5, train/loss_step=0.00238, global_step=4201.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1885/5971 [16:43<36:15,  1.88it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00238, train/loss_vlb_step=1.29e-5, train/loss_step=0.00238, global_step=4201.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1885/5971 [16:43<36:15,  1.88it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0694, train/loss_vlb_step=0.000232, train/loss_step=0.0694, global_step=4202.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  32%|███▏      | 1886/5971 [16:44<36:15,  1.88it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0694, train/loss_vlb_step=0.000232, train/loss_step=0.0694, global_step=4202.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1886/5971 [16:44<36:15,  1.88it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00327, train/loss_vlb_step=1.77e-5, train/loss_step=0.00327, global_step=4202.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1887/5971 [16:45<36:15,  1.88it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00327, train/loss_vlb_step=1.77e-5, train/loss_step=0.00327, global_step=4202.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1887/5971 [16:45<36:15,  1.88it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00179, train/loss_vlb_step=1.04e-5, train/loss_step=0.00179, global_step=4202.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1888/5971 [16:48<36:18,  1.87it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00179, train/loss_vlb_step=1.04e-5, train/loss_step=0.00179, global_step=4202.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1888/5971 [16:48<36:18,  1.87it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00756, train/loss_vlb_step=3.67e-5, train/loss_step=0.00756, global_step=4202.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  32%|███▏      | 1889/5971 [16:48<36:19,  1.87it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00756, train/loss_vlb_step=3.67e-5, train/loss_step=0.00756, global_step=4202.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1889/5971 [16:48<36:19,  1.87it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00528, train/loss_vlb_step=2.67e-5, train/loss_step=0.00528, global_step=4203.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1890/5971 [16:49<36:19,  1.87it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00528, train/loss_vlb_step=2.67e-5, train/loss_step=0.00528, global_step=4203.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1890/5971 [16:49<36:19,  1.87it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0784, train/loss_vlb_step=0.00026, train/loss_step=0.0784, global_step=4203.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  32%|███▏      | 1891/5971 [16:50<36:19,  1.87it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0784, train/loss_vlb_step=0.00026, train/loss_step=0.0784, global_step=4203.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1891/5971 [16:50<36:19,  1.87it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000164, train/loss_step=0.0453, global_step=4203.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1892/5971 [16:52<36:22,  1.87it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000164, train/loss_step=0.0453, global_step=4203.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1892/5971 [16:52<36:22,  1.87it/s, loss=0.0954, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00179, train/loss_step=0.364, global_step=4203.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  32%|███▏      | 1893/5971 [16:53<36:22,  1.87it/s, loss=0.0954, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00179, train/loss_step=0.364, global_step=4203.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1893/5971 [16:53<36:22,  1.87it/s, loss=0.0948, v_num=0, train/loss_simple_step=0.00169, train/loss_vlb_step=9.84e-6, train/loss_step=0.00169, global_step=4204.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1894/5971 [16:54<36:22,  1.87it/s, loss=0.0948, v_num=0, train/loss_simple_step=0.00169, train/loss_vlb_step=9.84e-6, train/loss_step=0.00169, global_step=4204.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1894/5971 [16:54<36:22,  1.87it/s, loss=0.0811, v_num=0, train/loss_simple_step=0.00377, train/loss_vlb_step=1.95e-5, train/loss_step=0.00377, global_step=4204.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1895/5971 [16:55<36:22,  1.87it/s, loss=0.0811, v_num=0, train/loss_simple_step=0.00377, train/loss_vlb_step=1.95e-5, train/loss_step=0.00377, global_step=4204.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1895/5971 [16:55<36:22,  1.87it/s, loss=0.071, v_num=0, train/loss_simple_step=0.0212, train/loss_vlb_step=8.85e-5, train/loss_step=0.0212, global_step=4204.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  32%|███▏      | 1896/5971 [16:57<36:25,  1.86it/s, loss=0.071, v_num=0, train/loss_simple_step=0.0212, train/loss_vlb_step=8.85e-5, train/loss_step=0.0212, global_step=4204.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1896/5971 [16:57<36:25,  1.86it/s, loss=0.0875, v_num=0, train/loss_simple_step=0.336, train/loss_vlb_step=0.00145, train/loss_step=0.336, global_step=4204.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  32%|███▏      | 1897/5971 [16:58<36:25,  1.86it/s, loss=0.0875, v_num=0, train/loss_simple_step=0.336, train/loss_vlb_step=0.00145, train/loss_step=0.336, global_step=4204.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1897/5971 [16:58<36:25,  1.86it/s, loss=0.088, v_num=0, train/loss_simple_step=0.0249, train/loss_vlb_step=9.73e-5, train/loss_step=0.0249, global_step=4205.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1898/5971 [16:59<36:26,  1.86it/s, loss=0.088, v_num=0, train/loss_simple_step=0.0249, train/loss_vlb_step=9.73e-5, train/loss_step=0.0249, global_step=4205.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1898/5971 [16:59<36:26,  1.86it/s, loss=0.0864, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.2e-5, train/loss_step=0.0141, global_step=4205.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1899/5971 [17:00<36:26,  1.86it/s, loss=0.0864, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.2e-5, train/loss_step=0.0141, global_step=4205.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1899/5971 [17:00<36:26,  1.86it/s, loss=0.0846, v_num=0, train/loss_simple_step=0.0033, train/loss_vlb_step=1.77e-5, train/loss_step=0.0033, global_step=4205.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1900/5971 [17:02<36:29,  1.86it/s, loss=0.0846, v_num=0, train/loss_simple_step=0.0033, train/loss_vlb_step=1.77e-5, train/loss_step=0.0033, global_step=4205.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1900/5971 [17:02<36:29,  1.86it/s, loss=0.0512, v_num=0, train/loss_simple_step=0.00471, train/loss_vlb_step=2.47e-5, train/loss_step=0.00471, global_step=4205.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1901/5971 [17:03<36:29,  1.86it/s, loss=0.0512, v_num=0, train/loss_simple_step=0.00471, train/loss_vlb_step=2.47e-5, train/loss_step=0.00471, global_step=4205.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1901/5971 [17:03<36:29,  1.86it/s, loss=0.053, v_num=0, train/loss_simple_step=0.0421, train/loss_vlb_step=0.000152, train/loss_step=0.0421, global_step=4206.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  32%|███▏      | 1902/5971 [17:04<36:29,  1.86it/s, loss=0.053, v_num=0, train/loss_simple_step=0.0421, train/loss_vlb_step=0.000152, train/loss_step=0.0421, global_step=4206.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1902/5971 [17:04<36:29,  1.86it/s, loss=0.0531, v_num=0, train/loss_simple_step=0.0203, train/loss_vlb_step=8.43e-5, train/loss_step=0.0203, global_step=4206.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1903/5971 [17:04<36:29,  1.86it/s, loss=0.0531, v_num=0, train/loss_simple_step=0.0203, train/loss_vlb_step=8.43e-5, train/loss_step=0.0203, global_step=4206.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1903/5971 [17:04<36:29,  1.86it/s, loss=0.0526, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.52e-6, train/loss_step=0.00164, global_step=4206.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1904/5971 [17:06<36:32,  1.85it/s, loss=0.0526, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.52e-6, train/loss_step=0.00164, global_step=4206.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1904/5971 [17:06<36:32,  1.85it/s, loss=0.0549, v_num=0, train/loss_simple_step=0.0491, train/loss_vlb_step=0.000171, train/loss_step=0.0491, global_step=4206.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  32%|███▏      | 1905/5971 [17:07<36:32,  1.85it/s, loss=0.0549, v_num=0, train/loss_simple_step=0.0491, train/loss_vlb_step=0.000171, train/loss_step=0.0491, global_step=4206.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1905/5971 [17:07<36:32,  1.85it/s, loss=0.0658, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.00115, train/loss_step=0.288, global_step=4207.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  32%|███▏      | 1906/5971 [17:08<36:32,  1.85it/s, loss=0.0658, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.00115, train/loss_step=0.288, global_step=4207.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1906/5971 [17:08<36:32,  1.85it/s, loss=0.0671, v_num=0, train/loss_simple_step=0.0292, train/loss_vlb_step=0.000119, train/loss_step=0.0292, global_step=4207.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1907/5971 [17:09<36:33,  1.85it/s, loss=0.0671, v_num=0, train/loss_simple_step=0.0292, train/loss_vlb_step=0.000119, train/loss_step=0.0292, global_step=4207.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1907/5971 [17:09<36:33,  1.85it/s, loss=0.0676, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=5.2e-5, train/loss_step=0.0116, global_step=4207.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  32%|███▏      | 1908/5971 [17:12<36:36,  1.85it/s, loss=0.0676, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=5.2e-5, train/loss_step=0.0116, global_step=4207.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1908/5971 [17:12<36:36,  1.85it/s, loss=0.0673, v_num=0, train/loss_simple_step=0.00154, train/loss_vlb_step=9.28e-6, train/loss_step=0.00154, global_step=4207.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1909/5971 [17:12<36:36,  1.85it/s, loss=0.0673, v_num=0, train/loss_simple_step=0.00154, train/loss_vlb_step=9.28e-6, train/loss_step=0.00154, global_step=4207.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1909/5971 [17:12<36:36,  1.85it/s, loss=0.1, v_num=0, train/loss_simple_step=0.663, train/loss_vlb_step=0.0169, train/loss_step=0.663, global_step=4208.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]        
Epoch 7:  32%|███▏      | 1910/5971 [17:13<36:36,  1.85it/s, loss=0.1, v_num=0, train/loss_simple_step=0.663, train/loss_vlb_step=0.0169, train/loss_step=0.663, global_step=4208.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1910/5971 [17:13<36:36,  1.85it/s, loss=0.0997, v_num=0, train/loss_simple_step=0.0674, train/loss_vlb_step=0.000226, train/loss_step=0.0674, global_step=4208.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1911/5971 [17:14<36:37,  1.85it/s, loss=0.0997, v_num=0, train/loss_simple_step=0.0674, train/loss_vlb_step=0.000226, train/loss_step=0.0674, global_step=4208.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1911/5971 [17:14<36:37,  1.85it/s, loss=0.117, v_num=0, train/loss_simple_step=0.400, train/loss_vlb_step=0.00259, train/loss_step=0.400, global_step=4208.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  32%|███▏      | 1912/5971 [17:16<36:39,  1.85it/s, loss=0.117, v_num=0, train/loss_simple_step=0.400, train/loss_vlb_step=0.00259, train/loss_step=0.400, global_step=4208.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1912/5971 [17:16<36:39,  1.85it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0895, train/loss_vlb_step=0.000299, train/loss_step=0.0895, global_step=4208.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1913/5971 [17:17<36:40,  1.84it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0895, train/loss_vlb_step=0.000299, train/loss_step=0.0895, global_step=4208.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1913/5971 [17:17<36:40,  1.84it/s, loss=0.116, v_num=0, train/loss_simple_step=0.241, train/loss_vlb_step=0.000941, train/loss_step=0.241, global_step=4209.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  32%|███▏      | 1914/5971 [17:18<36:40,  1.84it/s, loss=0.116, v_num=0, train/loss_simple_step=0.241, train/loss_vlb_step=0.000941, train/loss_step=0.241, global_step=4209.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1914/5971 [17:18<36:40,  1.84it/s, loss=0.14, v_num=0, train/loss_simple_step=0.498, train/loss_vlb_step=0.0031, train/loss_step=0.498, global_step=4209.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  32%|███▏      | 1915/5971 [17:19<36:40,  1.84it/s, loss=0.14, v_num=0, train/loss_simple_step=0.498, train/loss_vlb_step=0.0031, train/loss_step=0.498, global_step=4209.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1915/5971 [17:19<36:40,  1.84it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0737, train/loss_vlb_step=0.000247, train/loss_step=0.0737, global_step=4209.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1916/5971 [17:21<36:43,  1.84it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0737, train/loss_vlb_step=0.000247, train/loss_step=0.0737, global_step=4209.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1916/5971 [17:21<36:43,  1.84it/s, loss=0.153, v_num=0, train/loss_simple_step=0.534, train/loss_vlb_step=0.0048, train/loss_step=0.534, global_step=4209.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  32%|███▏      | 1917/5971 [17:22<36:43,  1.84it/s, loss=0.153, v_num=0, train/loss_simple_step=0.534, train/loss_vlb_step=0.0048, train/loss_step=0.534, global_step=4209.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1917/5971 [17:22<36:43,  1.84it/s, loss=0.195, v_num=0, train/loss_simple_step=0.875, train/loss_vlb_step=0.0641, train/loss_step=0.875, global_step=4210.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1918/5971 [17:23<36:43,  1.84it/s, loss=0.195, v_num=0, train/loss_simple_step=0.875, train/loss_vlb_step=0.0641, train/loss_step=0.875, global_step=4210.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1918/5971 [17:23<36:43,  1.84it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0361, train/loss_vlb_step=0.000132, train/loss_step=0.0361, global_step=4210.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1919/5971 [17:24<36:43,  1.84it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0361, train/loss_vlb_step=0.000132, train/loss_step=0.0361, global_step=4210.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1919/5971 [17:24<36:43,  1.84it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0191, train/loss_vlb_step=7.65e-5, train/loss_step=0.0191, global_step=4210.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  32%|███▏      | 1920/5971 [17:26<36:46,  1.84it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0191, train/loss_vlb_step=7.65e-5, train/loss_step=0.0191, global_step=4210.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1920/5971 [17:26<36:46,  1.84it/s, loss=0.198, v_num=0, train/loss_simple_step=0.00837, train/loss_vlb_step=3.92e-5, train/loss_step=0.00837, global_step=4210.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1921/5971 [17:27<36:46,  1.84it/s, loss=0.198, v_num=0, train/loss_simple_step=0.00837, train/loss_vlb_step=3.92e-5, train/loss_step=0.00837, global_step=4210.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1921/5971 [17:27<36:46,  1.84it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0017, train/loss_vlb_step=9.97e-6, train/loss_step=0.0017, global_step=4211.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  32%|███▏      | 1922/5971 [17:28<36:46,  1.83it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0017, train/loss_vlb_step=9.97e-6, train/loss_step=0.0017, global_step=4211.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1922/5971 [17:28<36:46,  1.83it/s, loss=0.195, v_num=0, train/loss_simple_step=0.00379, train/loss_vlb_step=1.96e-5, train/loss_step=0.00379, global_step=4211.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1923/5971 [17:28<36:46,  1.83it/s, loss=0.195, v_num=0, train/loss_simple_step=0.00379, train/loss_vlb_step=1.96e-5, train/loss_step=0.00379, global_step=4211.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1923/5971 [17:28<36:46,  1.83it/s, loss=0.216, v_num=0, train/loss_simple_step=0.434, train/loss_vlb_step=0.00294, train/loss_step=0.434, global_step=4211.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  32%|███▏      | 1924/5971 [17:31<36:49,  1.83it/s, loss=0.216, v_num=0, train/loss_simple_step=0.434, train/loss_vlb_step=0.00294, train/loss_step=0.434, global_step=4211.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1924/5971 [17:31<36:49,  1.83it/s, loss=0.23, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00186, train/loss_step=0.329, global_step=4211.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  32%|███▏      | 1925/5971 [17:31<36:49,  1.83it/s, loss=0.23, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00186, train/loss_step=0.329, global_step=4211.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1925/5971 [17:31<36:49,  1.83it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0649, train/loss_vlb_step=0.000216, train/loss_step=0.0649, global_step=4212.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1926/5971 [17:32<36:50,  1.83it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0649, train/loss_vlb_step=0.000216, train/loss_step=0.0649, global_step=4212.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1926/5971 [17:32<36:50,  1.83it/s, loss=0.229, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.000954, train/loss_step=0.235, global_step=4212.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  32%|███▏      | 1927/5971 [17:33<36:50,  1.83it/s, loss=0.229, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.000954, train/loss_step=0.235, global_step=4212.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1927/5971 [17:33<36:50,  1.83it/s, loss=0.236, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000474, train/loss_step=0.142, global_step=4212.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1928/5971 [17:36<36:53,  1.83it/s, loss=0.236, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000474, train/loss_step=0.142, global_step=4212.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1928/5971 [17:36<36:53,  1.83it/s, loss=0.24, v_num=0, train/loss_simple_step=0.0773, train/loss_vlb_step=0.000259, train/loss_step=0.0773, global_step=4212.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1929/5971 [17:36<36:53,  1.83it/s, loss=0.24, v_num=0, train/loss_simple_step=0.0773, train/loss_vlb_step=0.000259, train/loss_step=0.0773, global_step=4212.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1929/5971 [17:36<36:53,  1.83it/s, loss=0.21, v_num=0, train/loss_simple_step=0.0654, train/loss_vlb_step=0.000221, train/loss_step=0.0654, global_step=4213.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1930/5971 [17:37<36:53,  1.83it/s, loss=0.21, v_num=0, train/loss_simple_step=0.0654, train/loss_vlb_step=0.000221, train/loss_step=0.0654, global_step=4213.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1930/5971 [17:37<36:53,  1.83it/s, loss=0.221, v_num=0, train/loss_simple_step=0.285, train/loss_vlb_step=0.00108, train/loss_step=0.285, global_step=4213.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  32%|███▏      | 1931/5971 [17:38<36:53,  1.82it/s, loss=0.221, v_num=0, train/loss_simple_step=0.285, train/loss_vlb_step=0.00108, train/loss_step=0.285, global_step=4213.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1931/5971 [17:38<36:53,  1.82it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0643, train/loss_vlb_step=0.000222, train/loss_step=0.0643, global_step=4213.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1932/5971 [17:40<36:56,  1.82it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0643, train/loss_vlb_step=0.000222, train/loss_step=0.0643, global_step=4213.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1932/5971 [17:40<36:56,  1.82it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0284, train/loss_vlb_step=0.000111, train/loss_step=0.0284, global_step=4213.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1933/5971 [17:41<36:56,  1.82it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0284, train/loss_vlb_step=0.000111, train/loss_step=0.0284, global_step=4213.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1933/5971 [17:41<36:56,  1.82it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0493, train/loss_vlb_step=0.000173, train/loss_step=0.0493, global_step=4214.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1934/5971 [17:42<36:56,  1.82it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0493, train/loss_vlb_step=0.000173, train/loss_step=0.0493, global_step=4214.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1934/5971 [17:42<36:56,  1.82it/s, loss=0.188, v_num=0, train/loss_simple_step=0.434, train/loss_vlb_step=0.00445, train/loss_step=0.434, global_step=4214.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  32%|███▏      | 1935/5971 [17:43<36:56,  1.82it/s, loss=0.188, v_num=0, train/loss_simple_step=0.434, train/loss_vlb_step=0.00445, train/loss_step=0.434, global_step=4214.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1935/5971 [17:43<36:56,  1.82it/s, loss=0.194, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000703, train/loss_step=0.184, global_step=4214.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1936/5971 [17:45<37:00,  1.82it/s, loss=0.194, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000703, train/loss_step=0.184, global_step=4214.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1936/5971 [17:45<37:00,  1.82it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00255, train/loss_vlb_step=1.43e-5, train/loss_step=0.00255, global_step=4214.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1937/5971 [17:46<37:00,  1.82it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00255, train/loss_vlb_step=1.43e-5, train/loss_step=0.00255, global_step=4214.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1937/5971 [17:46<37:00,  1.82it/s, loss=0.136, v_num=0, train/loss_simple_step=0.265, train/loss_vlb_step=0.00104, train/loss_step=0.265, global_step=4215.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  32%|███▏      | 1938/5971 [17:47<37:00,  1.82it/s, loss=0.136, v_num=0, train/loss_simple_step=0.265, train/loss_vlb_step=0.00104, train/loss_step=0.265, global_step=4215.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1938/5971 [17:47<37:00,  1.82it/s, loss=0.142, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000494, train/loss_step=0.148, global_step=4215.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1939/5971 [17:48<37:00,  1.82it/s, loss=0.142, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000494, train/loss_step=0.148, global_step=4215.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1939/5971 [17:48<37:00,  1.82it/s, loss=0.176, v_num=0, train/loss_simple_step=0.700, train/loss_vlb_step=0.0281, train/loss_step=0.700, global_step=4215.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  32%|███▏      | 1940/5971 [17:50<37:03,  1.81it/s, loss=0.176, v_num=0, train/loss_simple_step=0.700, train/loss_vlb_step=0.0281, train/loss_step=0.700, global_step=4215.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  32%|███▏      | 1940/5971 [17:50<37:03,  1.81it/s, loss=0.199, v_num=0, train/loss_simple_step=0.475, train/loss_vlb_step=0.00344, train/loss_step=0.475, global_step=4215.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1941/5971 [17:51<37:03,  1.81it/s, loss=0.199, v_num=0, train/loss_simple_step=0.475, train/loss_vlb_step=0.00344, train/loss_step=0.475, global_step=4215.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1941/5971 [17:51<37:03,  1.81it/s, loss=0.2, v_num=0, train/loss_simple_step=0.00657, train/loss_vlb_step=3.12e-5, train/loss_step=0.00657, global_step=4216.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1942/5971 [17:52<37:03,  1.81it/s, loss=0.2, v_num=0, train/loss_simple_step=0.00657, train/loss_vlb_step=3.12e-5, train/loss_step=0.00657, global_step=4216.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1942/5971 [17:52<37:03,  1.81it/s, loss=0.213, v_num=0, train/loss_simple_step=0.263, train/loss_vlb_step=0.00115, train/loss_step=0.263, global_step=4216.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  33%|███▎      | 1943/5971 [17:53<37:04,  1.81it/s, loss=0.213, v_num=0, train/loss_simple_step=0.263, train/loss_vlb_step=0.00115, train/loss_step=0.263, global_step=4216.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1943/5971 [17:53<37:04,  1.81it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0665, train/loss_vlb_step=0.000224, train/loss_step=0.0665, global_step=4216.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1944/5971 [17:55<37:06,  1.81it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0665, train/loss_vlb_step=0.000224, train/loss_step=0.0665, global_step=4216.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1944/5971 [17:55<37:06,  1.81it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00125, train/loss_vlb_step=7.52e-6, train/loss_step=0.00125, global_step=4216.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1945/5971 [17:56<37:07,  1.81it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00125, train/loss_vlb_step=7.52e-6, train/loss_step=0.00125, global_step=4216.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1945/5971 [17:56<37:07,  1.81it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0343, train/loss_vlb_step=0.000129, train/loss_step=0.0343, global_step=4217.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  33%|███▎      | 1946/5971 [17:57<37:07,  1.81it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0343, train/loss_vlb_step=0.000129, train/loss_step=0.0343, global_step=4217.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1946/5971 [17:57<37:07,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.27e-5, train/loss_step=0.0122, global_step=4217.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  33%|███▎      | 1947/5971 [17:58<37:07,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.27e-5, train/loss_step=0.0122, global_step=4217.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1947/5971 [17:58<37:07,  1.81it/s, loss=0.166, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.00055, train/loss_step=0.163, global_step=4217.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  33%|███▎      | 1948/5971 [18:00<37:10,  1.80it/s, loss=0.166, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.00055, train/loss_step=0.163, global_step=4217.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1948/5971 [18:00<37:10,  1.80it/s, loss=0.179, v_num=0, train/loss_simple_step=0.339, train/loss_vlb_step=0.00151, train/loss_step=0.339, global_step=4217.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1949/5971 [18:01<37:10,  1.80it/s, loss=0.179, v_num=0, train/loss_simple_step=0.339, train/loss_vlb_step=0.00151, train/loss_step=0.339, global_step=4217.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1949/5971 [18:01<37:10,  1.80it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00402, train/loss_vlb_step=2.09e-5, train/loss_step=0.00402, global_step=4218.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1950/5971 [18:02<37:10,  1.80it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00402, train/loss_vlb_step=2.09e-5, train/loss_step=0.00402, global_step=4218.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1950/5971 [18:02<37:10,  1.80it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0921, train/loss_vlb_step=0.000311, train/loss_step=0.0921, global_step=4218.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  33%|███▎      | 1951/5971 [18:03<37:10,  1.80it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0921, train/loss_vlb_step=0.000311, train/loss_step=0.0921, global_step=4218.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1951/5971 [18:03<37:10,  1.80it/s, loss=0.184, v_num=0, train/loss_simple_step=0.413, train/loss_vlb_step=0.00292, train/loss_step=0.413, global_step=4218.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  33%|███▎      | 1952/5971 [18:05<37:13,  1.80it/s, loss=0.184, v_num=0, train/loss_simple_step=0.413, train/loss_vlb_step=0.00292, train/loss_step=0.413, global_step=4218.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1952/5971 [18:05<37:13,  1.80it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00874, train/loss_vlb_step=4.01e-5, train/loss_step=0.00874, global_step=4218.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1953/5971 [18:06<37:13,  1.80it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00874, train/loss_vlb_step=4.01e-5, train/loss_step=0.00874, global_step=4218.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1953/5971 [18:06<37:13,  1.80it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0167, train/loss_vlb_step=7.46e-5, train/loss_step=0.0167, global_step=4219.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  33%|███▎      | 1954/5971 [18:06<37:13,  1.80it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0167, train/loss_vlb_step=7.46e-5, train/loss_step=0.0167, global_step=4219.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1954/5971 [18:06<37:13,  1.80it/s, loss=0.164, v_num=0, train/loss_simple_step=0.077, train/loss_vlb_step=0.000259, train/loss_step=0.077, global_step=4219.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  33%|███▎      | 1955/5971 [18:07<37:13,  1.80it/s, loss=0.164, v_num=0, train/loss_simple_step=0.077, train/loss_vlb_step=0.000259, train/loss_step=0.077, global_step=4219.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1955/5971 [18:07<37:13,  1.80it/s, loss=0.163, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000601, train/loss_step=0.175, global_step=4219.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1956/5971 [18:10<37:16,  1.80it/s, loss=0.163, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000601, train/loss_step=0.175, global_step=4219.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1956/5971 [18:10<37:16,  1.80it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0645, train/loss_vlb_step=0.000219, train/loss_step=0.0645, global_step=4219.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1957/5971 [18:11<37:16,  1.79it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0645, train/loss_vlb_step=0.000219, train/loss_step=0.0645, global_step=4219.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1957/5971 [18:11<37:16,  1.79it/s, loss=0.158, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000352, train/loss_step=0.106, global_step=4220.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  33%|███▎      | 1958/5971 [18:11<37:16,  1.79it/s, loss=0.158, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000352, train/loss_step=0.106, global_step=4220.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1958/5971 [18:11<37:16,  1.79it/s, loss=0.179, v_num=0, train/loss_simple_step=0.558, train/loss_vlb_step=0.00607, train/loss_step=0.558, global_step=4220.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  33%|███▎      | 1959/5971 [18:12<37:16,  1.79it/s, loss=0.179, v_num=0, train/loss_simple_step=0.558, train/loss_vlb_step=0.00607, train/loss_step=0.558, global_step=4220.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1959/5971 [18:12<37:16,  1.79it/s, loss=0.175, v_num=0, train/loss_simple_step=0.618, train/loss_vlb_step=0.0121, train/loss_step=0.618, global_step=4220.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  33%|███▎      | 1960/5971 [18:14<37:19,  1.79it/s, loss=0.175, v_num=0, train/loss_simple_step=0.618, train/loss_vlb_step=0.0121, train/loss_step=0.618, global_step=4220.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1960/5971 [18:14<37:19,  1.79it/s, loss=0.164, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00096, train/loss_step=0.267, global_step=4220.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1961/5971 [18:15<37:19,  1.79it/s, loss=0.164, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00096, train/loss_step=0.267, global_step=4220.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1961/5971 [18:15<37:19,  1.79it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00531, train/loss_vlb_step=2.77e-5, train/loss_step=0.00531, global_step=4221.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1962/5971 [18:16<37:19,  1.79it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00531, train/loss_vlb_step=2.77e-5, train/loss_step=0.00531, global_step=4221.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1962/5971 [18:16<37:19,  1.79it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00439, train/loss_vlb_step=2.29e-5, train/loss_step=0.00439, global_step=4221.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1963/5971 [18:17<37:19,  1.79it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00439, train/loss_vlb_step=2.29e-5, train/loss_step=0.00439, global_step=4221.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1963/5971 [18:17<37:19,  1.79it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0588, train/loss_vlb_step=0.000208, train/loss_step=0.0588, global_step=4221.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  33%|███▎      | 1964/5971 [18:19<37:22,  1.79it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0588, train/loss_vlb_step=0.000208, train/loss_step=0.0588, global_step=4221.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1964/5971 [18:19<37:22,  1.79it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0258, train/loss_vlb_step=9.96e-5, train/loss_step=0.0258, global_step=4221.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  33%|███▎      | 1965/5971 [18:20<37:22,  1.79it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0258, train/loss_vlb_step=9.96e-5, train/loss_step=0.0258, global_step=4221.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1965/5971 [18:20<37:22,  1.79it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00536, train/loss_vlb_step=2.73e-5, train/loss_step=0.00536, global_step=4222.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1966/5971 [18:21<37:22,  1.79it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00536, train/loss_vlb_step=2.73e-5, train/loss_step=0.00536, global_step=4222.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1966/5971 [18:21<37:22,  1.79it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0887, train/loss_vlb_step=0.000291, train/loss_step=0.0887, global_step=4222.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  33%|███▎      | 1967/5971 [18:22<37:23,  1.79it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0887, train/loss_vlb_step=0.000291, train/loss_step=0.0887, global_step=4222.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1967/5971 [18:22<37:23,  1.79it/s, loss=0.153, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000457, train/loss_step=0.137, global_step=4222.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  33%|███▎      | 1968/5971 [18:24<37:25,  1.78it/s, loss=0.153, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000457, train/loss_step=0.137, global_step=4222.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1968/5971 [18:24<37:25,  1.78it/s, loss=0.15, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.00106, train/loss_step=0.268, global_step=4222.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  33%|███▎      | 1969/5971 [18:25<37:25,  1.78it/s, loss=0.15, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.00106, train/loss_step=0.268, global_step=4222.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1969/5971 [18:25<37:25,  1.78it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.58e-5, train/loss_step=0.0121, global_step=4223.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1970/5971 [18:26<37:26,  1.78it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.58e-5, train/loss_step=0.0121, global_step=4223.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1970/5971 [18:26<37:26,  1.78it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0678, train/loss_vlb_step=0.000237, train/loss_step=0.0678, global_step=4223.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1971/5971 [18:27<37:26,  1.78it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0678, train/loss_vlb_step=0.000237, train/loss_step=0.0678, global_step=4223.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1971/5971 [18:27<37:26,  1.78it/s, loss=0.135, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.00045, train/loss_step=0.136, global_step=4223.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  33%|███▎      | 1972/5971 [18:29<37:28,  1.78it/s, loss=0.135, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.00045, train/loss_step=0.136, global_step=4223.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1972/5971 [18:29<37:28,  1.78it/s, loss=0.141, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000391, train/loss_step=0.119, global_step=4223.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1973/5971 [18:30<37:28,  1.78it/s, loss=0.141, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000391, train/loss_step=0.119, global_step=4223.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1973/5971 [18:30<37:28,  1.78it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0432, train/loss_vlb_step=0.000149, train/loss_step=0.0432, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1974/5971 [18:31<37:28,  1.78it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0432, train/loss_vlb_step=0.000149, train/loss_step=0.0432, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1974/5971 [18:31<37:28,  1.78it/s, loss=0.156, v_num=0, train/loss_simple_step=0.358, train/loss_vlb_step=0.00191, train/loss_step=0.358, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  33%|███▎      | 1975/5971 [18:32<37:28,  1.78it/s, loss=0.156, v_num=0, train/loss_simple_step=0.358, train/loss_vlb_step=0.00191, train/loss_step=0.358, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1975/5971 [18:32<37:28,  1.78it/s, loss=0.164, v_num=0, train/loss_simple_step=0.330, train/loss_vlb_step=0.00128, train/loss_step=0.330, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1976/5971 [18:34<37:31,  1.77it/s, loss=0.164, v_num=0, train/loss_simple_step=0.330, train/loss_vlb_step=0.00128, train/loss_step=0.330, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  33%|███▎      | 1976/5971 [18:34<37:31,  1.77it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:11,  2.32it/s][A
Epoch 7:  33%|███▎      | 1978/5971 [18:34<37:28,  1.78it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   1%|          | 2/167 [00:00<00:44,  3.71it/s][A
Epoch 7:  33%|███▎      | 1980/5971 [18:34<37:26,  1.78it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.49it/s][A
Epoch 7:  33%|███▎      | 1983/5971 [18:34<37:21,  1.78it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.95it/s][A
Epoch 7:  33%|███▎      | 1986/5971 [18:35<37:16,  1.78it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   7%|▋         | 11/167 [00:00<00:08, 17.50it/s][A
Epoch 7:  33%|███▎      | 1989/5971 [18:35<37:11,  1.78it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.96it/s][A
Epoch 7:  33%|███▎      | 1992/5971 [18:35<37:06,  1.79it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  10%|█         | 17/167 [00:01<00:06, 21.69it/s][A
Epoch 7:  33%|███▎      | 1995/5971 [18:35<37:01,  1.79it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 23.14it/s][A
Epoch 7:  33%|███▎      | 1998/5971 [18:35<36:57,  1.79it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  14%|█▍        | 23/167 [00:01<00:05, 24.65it/s][A
Epoch 7:  34%|███▎      | 2001/5971 [18:35<36:52,  1.79it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.05it/s][A
Epoch 7:  34%|███▎      | 2004/5971 [18:35<36:47,  1.80it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 24.10it/s][A
Epoch 7:  34%|███▎      | 2007/5971 [18:35<36:42,  1.80it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 25.50it/s][A
Epoch 7:  34%|███▎      | 2010/5971 [18:35<36:38,  1.80it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  21%|██        | 35/167 [00:01<00:05, 26.34it/s][A
Epoch 7:  34%|███▎      | 2013/5971 [18:36<36:33,  1.80it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  23%|██▎       | 38/167 [00:01<00:04, 26.69it/s][A
Epoch 7:  34%|███▍      | 2016/5971 [18:36<36:28,  1.81it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 26.89it/s][A
Epoch 7:  34%|███▍      | 2019/5971 [18:36<36:24,  1.81it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 25.42it/s][A
Epoch 7:  34%|███▍      | 2022/5971 [18:36<36:19,  1.81it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 26.34it/s][A
Epoch 7:  34%|███▍      | 2026/5971 [18:36<36:13,  1.82it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 27.01it/s][A

Validating:  32%|███▏      | 53/167 [00:02<00:04, 26.54it/s][A
Epoch 7:  34%|███▍      | 2030/5971 [18:36<36:06,  1.82it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 27.16it/s][A
Epoch 7:  34%|███▍      | 2034/5971 [18:36<36:00,  1.82it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  35%|███▌      | 59/167 [00:02<00:03, 27.63it/s][A
Epoch 7:  34%|███▍      | 2038/5971 [18:37<35:54,  1.83it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  37%|███▋      | 62/167 [00:02<00:03, 27.28it/s][A

Validating:  39%|███▉      | 65/167 [00:02<00:03, 27.39it/s][A
Epoch 7:  34%|███▍      | 2042/5971 [18:37<35:48,  1.83it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  41%|████      | 68/167 [00:03<00:03, 27.96it/s][A
Epoch 7:  34%|███▍      | 2046/5971 [18:37<35:42,  1.83it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 28.03it/s][A
Epoch 7:  34%|███▍      | 2050/5971 [18:37<35:36,  1.84it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 29.10it/s][A
Epoch 7:  34%|███▍      | 2054/5971 [18:37<35:30,  1.84it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 27.83it/s][A
Epoch 7:  34%|███▍      | 2058/5971 [18:37<35:24,  1.84it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  49%|████▉     | 82/167 [00:03<00:02, 28.78it/s][A

Validating:  51%|█████     | 85/167 [00:03<00:02, 28.27it/s][A
Epoch 7:  35%|███▍      | 2062/5971 [18:37<35:18,  1.85it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  53%|█████▎    | 88/167 [00:03<00:03, 25.21it/s][A
Epoch 7:  35%|███▍      | 2066/5971 [18:38<35:12,  1.85it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  54%|█████▍    | 91/167 [00:03<00:03, 25.33it/s][A
Epoch 7:  35%|███▍      | 2070/5971 [18:38<35:06,  1.85it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 26.12it/s][A

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 25.61it/s][A
Epoch 7:  35%|███▍      | 2074/5971 [18:38<35:00,  1.86it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 25.87it/s][A
Epoch 7:  35%|███▍      | 2078/5971 [18:38<34:54,  1.86it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 25.57it/s][A
Epoch 7:  35%|███▍      | 2082/5971 [18:38<34:48,  1.86it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 26.32it/s][A

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 26.73it/s][A
Epoch 7:  35%|███▍      | 2086/5971 [18:38<34:42,  1.87it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  68%|██████▊   | 113/167 [00:04<00:01, 27.59it/s][A
Epoch 7:  35%|███▌      | 2090/5971 [18:38<34:36,  1.87it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  70%|███████   | 117/167 [00:04<00:01, 28.50it/s][A
Epoch 7:  35%|███▌      | 2094/5971 [18:39<34:30,  1.87it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  72%|███████▏  | 120/167 [00:04<00:01, 27.60it/s][A
Epoch 7:  35%|███▌      | 2098/5971 [18:39<34:25,  1.88it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 26.92it/s][A
Epoch 7:  35%|███▌      | 2102/5971 [18:39<34:19,  1.88it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 27.08it/s][A
Epoch 7:  35%|███▌      | 2106/5971 [18:39<34:13,  1.88it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 27.54it/s][A

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 27.29it/s][A
Epoch 7:  35%|███▌      | 2110/5971 [18:39<34:07,  1.89it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 27.98it/s][A
Epoch 7:  35%|███▌      | 2114/5971 [18:39<34:02,  1.89it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 27.47it/s][A
Epoch 7:  35%|███▌      | 2118/5971 [18:39<33:56,  1.89it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  85%|████████▌ | 142/167 [00:05<00:00, 27.38it/s][A

Validating:  87%|████████▋ | 145/167 [00:05<00:00, 25.48it/s][A
Epoch 7:  36%|███▌      | 2122/5971 [18:40<33:50,  1.90it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 26.46it/s][A
Epoch 7:  36%|███▌      | 2126/5971 [18:40<33:45,  1.90it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 27.06it/s][A
Epoch 7:  36%|███▌      | 2130/5971 [18:40<33:39,  1.90it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 26.70it/s][A

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 27.25it/s][A
Epoch 7:  36%|███▌      | 2134/5971 [18:40<33:33,  1.91it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 26.76it/s][A
Epoch 7:  36%|███▌      | 2138/5971 [18:40<33:28,  1.91it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 27.64it/s][A
Epoch 7:  36%|███▌      | 2142/5971 [18:40<33:22,  1.91it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 27.65it/s][A
Epoch 7:  36%|███▌      | 2144/5971 [18:41<33:20,  1.91it/s, loss=0.176, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00175, train/loss_step=0.320, global_step=4224.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Epoch 7:  36%|███▌      | 2145/5971 [18:42<33:20,  1.91it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0805, train/loss_vlb_step=0.000269, train/loss_step=0.0805, global_step=4225.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  36%|███▌      | 2146/5971 [18:43<33:20,  1.91it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0805, train/loss_vlb_step=0.000269, train/loss_step=0.0805, global_step=4225.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  36%|███▌      | 2146/5971 [18:43<33:20,  1.91it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0208, train/loss_vlb_step=8.89e-5, train/loss_step=0.0208, global_step=4225.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  36%|███▌      | 2147/5971 [18:43<33:20,  1.91it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0495, train/loss_vlb_step=0.000172, train/loss_step=0.0495, global_step=4225.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  36%|███▌      | 2148/5971 [18:47<33:25,  1.91it/s, loss=0.112, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000372, train/loss_step=0.113, global_step=4225.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  36%|███▌      | 2149/5971 [18:48<33:25,  1.91it/s, loss=0.137, v_num=0, train/loss_simple_step=0.502, train/loss_vlb_step=0.00547, train/loss_step=0.502, global_step=4226.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  36%|███▌      | 2150/5971 [18:48<33:25,  1.91it/s, loss=0.137, v_num=0, train/loss_simple_step=0.502, train/loss_vlb_step=0.00547, train/loss_step=0.502, global_step=4226.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  36%|███▌      | 2150/5971 [18:48<33:25,  1.91it/s, loss=0.143, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000375, train/loss_step=0.114, global_step=4226.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  36%|███▌      | 2151/5971 [18:49<33:25,  1.90it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00137, train/loss_vlb_step=8.13e-6, train/loss_step=0.00137, global_step=4226.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  36%|███▌      | 2152/5971 [18:52<33:28,  1.90it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00925, train/loss_vlb_step=4.26e-5, train/loss_step=0.00925, global_step=4226.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  36%|███▌      | 2153/5971 [18:53<33:28,  1.90it/s, loss=0.145, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000437, train/loss_step=0.129, global_step=4227.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  36%|███▌      | 2154/5971 [18:53<33:28,  1.90it/s, loss=0.145, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000437, train/loss_step=0.129, global_step=4227.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  36%|███▌      | 2154/5971 [18:53<33:28,  1.90it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00327, train/loss_vlb_step=1.73e-5, train/loss_step=0.00327, global_step=4227.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  36%|███▌      | 2155/5971 [18:54<33:28,  1.90it/s, loss=0.144, v_num=0, train/loss_simple_step=0.198, train/loss_vlb_step=0.000675, train/loss_step=0.198, global_step=4227.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  36%|███▌      | 2156/5971 [18:57<33:30,  1.90it/s, loss=0.137, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000417, train/loss_step=0.126, global_step=4227.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  36%|███▌      | 2157/5971 [18:57<33:31,  1.90it/s, loss=0.142, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=4228.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  36%|███▌      | 2158/5971 [18:58<33:31,  1.90it/s, loss=0.142, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=4228.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  36%|███▌      | 2158/5971 [18:58<33:31,  1.90it/s, loss=0.151, v_num=0, train/loss_simple_step=0.257, train/loss_vlb_step=0.000966, train/loss_step=0.257, global_step=4228.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  36%|███▌      | 2159/5971 [18:59<33:31,  1.90it/s, loss=0.164, v_num=0, train/loss_simple_step=0.390, train/loss_vlb_step=0.00225, train/loss_step=0.390, global_step=4228.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  36%|███▌      | 2160/5971 [19:02<33:34,  1.89it/s, loss=0.179, v_num=0, train/loss_simple_step=0.415, train/loss_vlb_step=0.0018, train/loss_step=0.415, global_step=4228.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  36%|███▌      | 2161/5971 [19:02<33:34,  1.89it/s, loss=0.199, v_num=0, train/loss_simple_step=0.435, train/loss_vlb_step=0.0049, train/loss_step=0.435, global_step=4229.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  36%|███▌      | 2162/5971 [19:03<33:34,  1.89it/s, loss=0.199, v_num=0, train/loss_simple_step=0.435, train/loss_vlb_step=0.0049, train/loss_step=0.435, global_step=4229.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  36%|███▌      | 2162/5971 [19:03<33:34,  1.89it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0231, train/loss_vlb_step=9.43e-5, train/loss_step=0.0231, global_step=4229.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  36%|███▌      | 2163/5971 [19:04<33:34,  1.89it/s, loss=0.181, v_num=0, train/loss_simple_step=0.311, train/loss_vlb_step=0.00136, train/loss_step=0.311, global_step=4229.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  36%|███▌      | 2164/5971 [19:06<33:36,  1.89it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00773, train/loss_vlb_step=3.81e-5, train/loss_step=0.00773, global_step=4229.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  36%|███▋      | 2165/5971 [19:07<33:36,  1.89it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0133, train/loss_vlb_step=5.5e-5, train/loss_step=0.0133, global_step=4230.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  36%|███▋      | 2166/5971 [19:08<33:36,  1.89it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0133, train/loss_vlb_step=5.5e-5, train/loss_step=0.0133, global_step=4230.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  36%|███▋      | 2166/5971 [19:08<33:36,  1.89it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00193, train/loss_vlb_step=1.11e-5, train/loss_step=0.00193, global_step=4230.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  36%|███▋      | 2167/5971 [19:09<33:36,  1.89it/s, loss=0.173, v_num=0, train/loss_simple_step=0.285, train/loss_vlb_step=0.00116, train/loss_step=0.285, global_step=4230.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  36%|███▋      | 2168/5971 [19:11<33:39,  1.88it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00151, train/loss_vlb_step=8.54e-6, train/loss_step=0.00151, global_step=4230.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  36%|███▋      | 2169/5971 [19:12<33:39,  1.88it/s, loss=0.169, v_num=0, train/loss_simple_step=0.542, train/loss_vlb_step=0.0044, train/loss_step=0.542, global_step=4231.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]     
Epoch 7:  36%|███▋      | 2170/5971 [19:13<33:39,  1.88it/s, loss=0.169, v_num=0, train/loss_simple_step=0.542, train/loss_vlb_step=0.0044, train/loss_step=0.542, global_step=4231.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  36%|███▋      | 2170/5971 [19:13<33:39,  1.88it/s, loss=0.172, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000605, train/loss_step=0.172, global_step=4231.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  36%|███▋      | 2171/5971 [19:14<33:39,  1.88it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0847, train/loss_vlb_step=0.000282, train/loss_step=0.0847, global_step=4231.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  36%|███▋      | 2172/5971 [19:16<33:42,  1.88it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0952, train/loss_vlb_step=0.000313, train/loss_step=0.0952, global_step=4231.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  36%|███▋      | 2173/5971 [19:17<33:42,  1.88it/s, loss=0.188, v_num=0, train/loss_simple_step=0.284, train/loss_vlb_step=0.00113, train/loss_step=0.284, global_step=4232.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  36%|███▋      | 2174/5971 [19:18<33:42,  1.88it/s, loss=0.188, v_num=0, train/loss_simple_step=0.284, train/loss_vlb_step=0.00113, train/loss_step=0.284, global_step=4232.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  36%|███▋      | 2174/5971 [19:18<33:42,  1.88it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0059, train/loss_vlb_step=2.91e-5, train/loss_step=0.0059, global_step=4232.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  36%|███▋      | 2175/5971 [19:19<33:42,  1.88it/s, loss=0.189, v_num=0, train/loss_simple_step=0.215, train/loss_vlb_step=0.000749, train/loss_step=0.215, global_step=4232.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  36%|███▋      | 2176/5971 [19:21<33:44,  1.87it/s, loss=0.185, v_num=0, train/loss_simple_step=0.034, train/loss_vlb_step=0.000127, train/loss_step=0.034, global_step=4232.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  36%|███▋      | 2177/5971 [19:22<33:44,  1.87it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0033, train/loss_vlb_step=1.79e-5, train/loss_step=0.0033, global_step=4233.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  36%|███▋      | 2178/5971 [19:23<33:44,  1.87it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0033, train/loss_vlb_step=1.79e-5, train/loss_step=0.0033, global_step=4233.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  36%|███▋      | 2178/5971 [19:23<33:44,  1.87it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0697, train/loss_vlb_step=0.000233, train/loss_step=0.0697, global_step=4233.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  36%|███▋      | 2179/5971 [19:24<33:44,  1.87it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00507, train/loss_vlb_step=2.58e-5, train/loss_step=0.00507, global_step=4233.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2180/5971 [19:26<33:47,  1.87it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00159, train/loss_vlb_step=9.72e-6, train/loss_step=0.00159, global_step=4233.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2181/5971 [19:27<33:47,  1.87it/s, loss=0.116, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000565, train/loss_step=0.170, global_step=4234.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  37%|███▋      | 2182/5971 [19:28<33:47,  1.87it/s, loss=0.116, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000565, train/loss_step=0.170, global_step=4234.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2182/5971 [19:28<33:47,  1.87it/s, loss=0.126, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000845, train/loss_step=0.217, global_step=4234.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2183/5971 [19:29<33:47,  1.87it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0623, train/loss_vlb_step=0.000213, train/loss_step=0.0623, global_step=4234.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2184/5971 [19:31<33:49,  1.87it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00749, train/loss_vlb_step=3.71e-5, train/loss_step=0.00749, global_step=4234.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2185/5971 [19:32<33:50,  1.86it/s, loss=0.119, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000412, train/loss_step=0.125, global_step=4235.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  37%|███▋      | 2186/5971 [19:33<33:50,  1.86it/s, loss=0.119, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000412, train/loss_step=0.125, global_step=4235.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2186/5971 [19:33<33:50,  1.86it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0756, train/loss_vlb_step=0.000252, train/loss_step=0.0756, global_step=4235.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2187/5971 [19:33<33:50,  1.86it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=5.06e-5, train/loss_step=0.0118, global_step=4235.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  37%|███▋      | 2188/5971 [19:36<33:52,  1.86it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=7.12e-5, train/loss_step=0.0173, global_step=4235.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  37%|███▋      | 2189/5971 [19:36<33:52,  1.86it/s, loss=0.096, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.0011, train/loss_step=0.262, global_step=4236.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  37%|███▋      | 2190/5971 [19:37<33:52,  1.86it/s, loss=0.096, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.0011, train/loss_step=0.262, global_step=4236.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2190/5971 [19:37<33:52,  1.86it/s, loss=0.12, v_num=0, train/loss_simple_step=0.654, train/loss_vlb_step=0.016, train/loss_step=0.654, global_step=4236.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  37%|███▋      | 2191/5971 [19:38<33:52,  1.86it/s, loss=0.129, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.00116, train/loss_step=0.255, global_step=4236.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2192/5971 [19:40<33:55,  1.86it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00409, train/loss_vlb_step=2.07e-5, train/loss_step=0.00409, global_step=4236.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2193/5971 [19:41<33:55,  1.86it/s, loss=0.151, v_num=0, train/loss_simple_step=0.828, train/loss_vlb_step=0.0607, train/loss_step=0.828, global_step=4237.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]     
Epoch 7:  37%|███▋      | 2194/5971 [19:42<33:55,  1.86it/s, loss=0.151, v_num=0, train/loss_simple_step=0.828, train/loss_vlb_step=0.0607, train/loss_step=0.828, global_step=4237.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2194/5971 [19:42<33:55,  1.86it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0645, train/loss_vlb_step=0.000217, train/loss_step=0.0645, global_step=4237.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2195/5971 [19:43<33:55,  1.86it/s, loss=0.152, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000641, train/loss_step=0.181, global_step=4237.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  37%|███▋      | 2196/5971 [19:45<33:57,  1.85it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0163, train/loss_vlb_step=6.81e-5, train/loss_step=0.0163, global_step=4237.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2197/5971 [19:46<33:57,  1.85it/s, loss=0.17, v_num=0, train/loss_simple_step=0.368, train/loss_vlb_step=0.00177, train/loss_step=0.368, global_step=4238.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  37%|███▋      | 2198/5971 [19:47<33:57,  1.85it/s, loss=0.17, v_num=0, train/loss_simple_step=0.368, train/loss_vlb_step=0.00177, train/loss_step=0.368, global_step=4238.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2198/5971 [19:47<33:57,  1.85it/s, loss=0.17, v_num=0, train/loss_simple_step=0.064, train/loss_vlb_step=0.000211, train/loss_step=0.064, global_step=4238.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2199/5971 [19:48<33:57,  1.85it/s, loss=0.175, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=4238.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2200/5971 [19:50<34:00,  1.85it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0548, train/loss_vlb_step=0.000183, train/loss_step=0.0548, global_step=4238.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2201/5971 [19:51<34:00,  1.85it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00657, train/loss_vlb_step=3.28e-5, train/loss_step=0.00657, global_step=4239.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2202/5971 [19:52<34:00,  1.85it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00657, train/loss_vlb_step=3.28e-5, train/loss_step=0.00657, global_step=4239.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2202/5971 [19:52<34:00,  1.85it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0814, train/loss_vlb_step=0.000272, train/loss_step=0.0814, global_step=4239.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2203/5971 [19:53<34:00,  1.85it/s, loss=0.164, v_num=0, train/loss_simple_step=0.074, train/loss_vlb_step=0.000248, train/loss_step=0.074, global_step=4239.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  37%|███▋      | 2204/5971 [19:55<34:02,  1.84it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0175, train/loss_vlb_step=7.07e-5, train/loss_step=0.0175, global_step=4239.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2205/5971 [19:56<34:02,  1.84it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.52e-5, train/loss_step=0.0128, global_step=4240.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2206/5971 [19:57<34:02,  1.84it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.52e-5, train/loss_step=0.0128, global_step=4240.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2206/5971 [19:57<34:02,  1.84it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0996, train/loss_vlb_step=0.000328, train/loss_step=0.0996, global_step=4240.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2207/5971 [19:58<34:02,  1.84it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0237, train/loss_vlb_step=9.3e-5, train/loss_step=0.0237, global_step=4240.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  37%|███▋      | 2208/5971 [20:00<34:04,  1.84it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00711, train/loss_vlb_step=3.35e-5, train/loss_step=0.00711, global_step=4240.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2209/5971 [20:01<34:05,  1.84it/s, loss=0.159, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.000955, train/loss_step=0.250, global_step=4241.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  37%|███▋      | 2210/5971 [20:02<34:05,  1.84it/s, loss=0.159, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.000955, train/loss_step=0.250, global_step=4241.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2210/5971 [20:02<34:05,  1.84it/s, loss=0.134, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000521, train/loss_step=0.157, global_step=4241.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2211/5971 [20:03<34:05,  1.84it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00259, train/loss_vlb_step=1.51e-5, train/loss_step=0.00259, global_step=4241.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2212/5971 [20:05<34:07,  1.84it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0087, train/loss_vlb_step=4.19e-5, train/loss_step=0.0087, global_step=4241.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  37%|███▋      | 2213/5971 [20:06<34:07,  1.84it/s, loss=0.0819, v_num=0, train/loss_simple_step=0.0263, train/loss_vlb_step=9.75e-5, train/loss_step=0.0263, global_step=4242.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2214/5971 [20:07<34:07,  1.84it/s, loss=0.0819, v_num=0, train/loss_simple_step=0.0263, train/loss_vlb_step=9.75e-5, train/loss_step=0.0263, global_step=4242.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2214/5971 [20:07<34:07,  1.84it/s, loss=0.11, v_num=0, train/loss_simple_step=0.630, train/loss_vlb_step=0.0161, train/loss_step=0.630, global_step=4242.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]     
Epoch 7:  37%|███▋      | 2215/5971 [20:07<34:07,  1.83it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00355, train/loss_vlb_step=1.94e-5, train/loss_step=0.00355, global_step=4242.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2216/5971 [20:10<34:09,  1.83it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00177, train/loss_vlb_step=1.06e-5, train/loss_step=0.00177, global_step=4242.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2217/5971 [20:10<34:09,  1.83it/s, loss=0.0824, v_num=0, train/loss_simple_step=0.00501, train/loss_vlb_step=2.44e-5, train/loss_step=0.00501, global_step=4243.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2218/5971 [20:11<34:09,  1.83it/s, loss=0.0824, v_num=0, train/loss_simple_step=0.00501, train/loss_vlb_step=2.44e-5, train/loss_step=0.00501, global_step=4243.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2218/5971 [20:11<34:09,  1.83it/s, loss=0.08, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=6.7e-5, train/loss_step=0.0156, global_step=4243.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]     
Epoch 7:  37%|███▋      | 2219/5971 [20:12<34:09,  1.83it/s, loss=0.0946, v_num=0, train/loss_simple_step=0.415, train/loss_vlb_step=0.0024, train/loss_step=0.415, global_step=4243.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2220/5971 [20:15<34:12,  1.83it/s, loss=0.0954, v_num=0, train/loss_simple_step=0.0709, train/loss_vlb_step=0.000239, train/loss_step=0.0709, global_step=4243.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2221/5971 [20:16<34:12,  1.83it/s, loss=0.138, v_num=0, train/loss_simple_step=0.854, train/loss_vlb_step=0.049, train/loss_step=0.854, global_step=4244.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]      
Epoch 7:  37%|███▋      | 2222/5971 [20:16<34:12,  1.83it/s, loss=0.138, v_num=0, train/loss_simple_step=0.854, train/loss_vlb_step=0.049, train/loss_step=0.854, global_step=4244.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2222/5971 [20:16<34:12,  1.83it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0982, train/loss_vlb_step=0.000323, train/loss_step=0.0982, global_step=4244.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2223/5971 [20:17<34:12,  1.83it/s, loss=0.157, v_num=0, train/loss_simple_step=0.446, train/loss_vlb_step=0.00279, train/loss_step=0.446, global_step=4244.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  37%|███▋      | 2224/5971 [20:19<34:14,  1.82it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0185, train/loss_vlb_step=7.8e-5, train/loss_step=0.0185, global_step=4244.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2225/5971 [20:20<34:14,  1.82it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00456, train/loss_vlb_step=2.46e-5, train/loss_step=0.00456, global_step=4245.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2226/5971 [20:21<34:14,  1.82it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00456, train/loss_vlb_step=2.46e-5, train/loss_step=0.00456, global_step=4245.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2226/5971 [20:21<34:14,  1.82it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=6.5e-5, train/loss_step=0.0156, global_step=4245.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  37%|███▋      | 2227/5971 [20:22<34:14,  1.82it/s, loss=0.169, v_num=0, train/loss_simple_step=0.347, train/loss_vlb_step=0.00143, train/loss_step=0.347, global_step=4245.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  37%|███▋      | 2228/5971 [20:24<34:16,  1.82it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00364, train/loss_vlb_step=1.95e-5, train/loss_step=0.00364, global_step=4245.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2229/5971 [20:25<34:16,  1.82it/s, loss=0.165, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.00067, train/loss_step=0.176, global_step=4246.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  37%|███▋      | 2230/5971 [20:26<34:16,  1.82it/s, loss=0.165, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.00067, train/loss_step=0.176, global_step=4246.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2230/5971 [20:26<34:16,  1.82it/s, loss=0.197, v_num=0, train/loss_simple_step=0.789, train/loss_vlb_step=0.026, train/loss_step=0.789, global_step=4246.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  37%|███▋      | 2231/5971 [20:27<34:16,  1.82it/s, loss=0.203, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000458, train/loss_step=0.138, global_step=4246.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2232/5971 [20:29<34:18,  1.82it/s, loss=0.212, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000598, train/loss_step=0.179, global_step=4246.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2233/5971 [20:30<34:18,  1.82it/s, loss=0.211, v_num=0, train/loss_simple_step=0.00252, train/loss_vlb_step=1.44e-5, train/loss_step=0.00252, global_step=4247.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2234/5971 [20:31<34:18,  1.82it/s, loss=0.211, v_num=0, train/loss_simple_step=0.00252, train/loss_vlb_step=1.44e-5, train/loss_step=0.00252, global_step=4247.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2234/5971 [20:31<34:18,  1.82it/s, loss=0.183, v_num=0, train/loss_simple_step=0.068, train/loss_vlb_step=0.000229, train/loss_step=0.068, global_step=4247.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  37%|███▋      | 2235/5971 [20:32<34:18,  1.81it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00219, train/loss_vlb_step=1.29e-5, train/loss_step=0.00219, global_step=4247.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2236/5971 [20:34<34:20,  1.81it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0053, train/loss_vlb_step=2.66e-5, train/loss_step=0.0053, global_step=4247.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  37%|███▋      | 2237/5971 [20:35<34:20,  1.81it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0188, train/loss_vlb_step=7.98e-5, train/loss_step=0.0188, global_step=4248.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2238/5971 [20:36<34:20,  1.81it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0188, train/loss_vlb_step=7.98e-5, train/loss_step=0.0188, global_step=4248.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2238/5971 [20:36<34:20,  1.81it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00647, train/loss_vlb_step=3.14e-5, train/loss_step=0.00647, global_step=4248.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  37%|███▋      | 2239/5971 [20:36<34:20,  1.81it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00578, train/loss_vlb_step=2.99e-5, train/loss_step=0.00578, global_step=4248.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  38%|███▊      | 2240/5971 [20:39<34:23,  1.81it/s, loss=0.177, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00165, train/loss_step=0.353, global_step=4248.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  38%|███▊      | 2241/5971 [20:40<34:23,  1.81it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0953, train/loss_vlb_step=0.000313, train/loss_step=0.0953, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  38%|███▊      | 2242/5971 [20:40<34:23,  1.81it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0953, train/loss_vlb_step=0.000313, train/loss_step=0.0953, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  38%|███▊      | 2242/5971 [20:40<34:23,  1.81it/s, loss=0.141, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000471, train/loss_step=0.143, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  38%|███▊      | 2243/5971 [20:41<34:22,  1.81it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00639, train/loss_vlb_step=3.2e-5, train/loss_step=0.00639, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  38%|███▊      | 2244/5971 [20:43<34:25,  1.80it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  

Validating: 0it [00:00, ?it/s]
Validating:   0%|          | 0/167 [00:00<?, ?it/s]
Validating:   1%|          | 1/167 [00:00<02:34,  1.08it/s]
Epoch 7:  38%|███▊      | 2246/5971 [20:44<34:23,  1.80it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   1%|          | 2/167 [00:01<01:15,  2.20it/s]
Validating:   3%|▎         | 5/167 [00:01<00:26,  6.19it/s]
Epoch 7:  38%|███▊      | 2250/5971 [20:45<34:18,  1.81it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   5%|▍         | 8/167 [00:01<00:16,  9.82it/s]
Epoch 7:  38%|███▊      | 2254/5971 [20:45<34:12,  1.81it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   7%|▋         | 11/167 [00:01<00:12, 12.74it/s]
Epoch 7:  38%|███▊      | 2258/5971 [20:45<34:07,  1.81it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   8%|▊         | 14/167 [00:01<00:09, 15.90it/s]
Validating:  10%|█         | 17/167 [00:01<00:08, 18.33it/s]
Epoch 7:  38%|███▊      | 2262/5971 [20:45<34:01,  1.82it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 21.58it/s]
Epoch 7:  38%|███▊      | 2266/5971 [20:45<33:56,  1.82it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  14%|█▍        | 24/167 [00:01<00:06, 23.53it/s]
Epoch 7:  38%|███▊      | 2270/5971 [20:45<33:50,  1.82it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  16%|█▌        | 27/167 [00:02<00:05, 24.57it/s]
Epoch 7:  38%|███▊      | 2274/5971 [20:46<33:44,  1.83it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  18%|█▊        | 30/167 [00:02<00:05, 25.23it/s]
Validating:  20%|█▉        | 33/167 [00:02<00:05, 25.77it/s]
Epoch 7:  38%|███▊      | 2278/5971 [20:46<33:39,  1.83it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  22%|██▏       | 36/167 [00:02<00:05, 25.83it/s]
Epoch 7:  38%|███▊      | 2282/5971 [20:46<33:33,  1.83it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  23%|██▎       | 39/167 [00:02<00:05, 24.90it/s]
Epoch 7:  38%|███▊      | 2286/5971 [20:46<33:28,  1.83it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 25.95it/s]
Validating:  27%|██▋       | 45/167 [00:02<00:04, 26.21it/s]
Epoch 7:  38%|███▊      | 2290/5971 [20:46<33:23,  1.84it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 27.03it/s]
Epoch 7:  38%|███▊      | 2294/5971 [20:46<33:17,  1.84it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  31%|███       | 51/167 [00:02<00:04, 26.23it/s]
Epoch 7:  38%|███▊      | 2298/5971 [20:46<33:12,  1.84it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  32%|███▏      | 54/167 [00:03<00:04, 26.34it/s]
Validating:  34%|███▍      | 57/167 [00:03<00:04, 26.15it/s]
Epoch 7:  39%|███▊      | 2302/5971 [20:47<33:06,  1.85it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  36%|███▌      | 60/167 [00:03<00:04, 25.22it/s]
Epoch 7:  39%|███▊      | 2306/5971 [20:47<33:01,  1.85it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  38%|███▊      | 64/167 [00:03<00:03, 26.81it/s]
Epoch 7:  39%|███▊      | 2310/5971 [20:47<32:56,  1.85it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  40%|████      | 67/167 [00:03<00:03, 27.01it/s]
Epoch 7:  39%|███▉      | 2314/5971 [20:47<32:50,  1.86it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 27.55it/s]
Validating:  44%|████▎     | 73/167 [00:03<00:03, 27.37it/s]
Epoch 7:  39%|███▉      | 2318/5971 [20:47<32:45,  1.86it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 26.32it/s]
Epoch 7:  39%|███▉      | 2322/5971 [20:47<32:40,  1.86it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 26.34it/s]
Epoch 7:  39%|███▉      | 2326/5971 [20:48<32:34,  1.86it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  49%|████▉     | 82/167 [00:04<00:03, 25.65it/s]
Validating:  51%|█████     | 85/167 [00:04<00:03, 26.69it/s]
Epoch 7:  39%|███▉      | 2330/5971 [20:48<32:29,  1.87it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  53%|█████▎    | 88/167 [00:04<00:03, 25.98it/s]
Epoch 7:  39%|███▉      | 2334/5971 [20:48<32:24,  1.87it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  54%|█████▍    | 91/167 [00:04<00:02, 26.69it/s]
Epoch 7:  39%|███▉      | 2338/5971 [20:48<32:19,  1.87it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 26.30it/s]
Validating:  58%|█████▊    | 97/167 [00:04<00:02, 25.65it/s]
Epoch 7:  39%|███▉      | 2342/5971 [20:48<32:14,  1.88it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 26.30it/s]
Epoch 7:  39%|███▉      | 2346/5971 [20:48<32:08,  1.88it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 25.44it/s]
Epoch 7:  39%|███▉      | 2350/5971 [20:48<32:03,  1.88it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  64%|██████▍   | 107/167 [00:05<00:02, 27.47it/s]
Epoch 7:  39%|███▉      | 2354/5971 [20:49<31:58,  1.89it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  66%|██████▌   | 110/167 [00:05<00:02, 26.73it/s][A

Validating:  68%|██████▊   | 113/167 [00:05<00:01, 27.49it/s]
Epoch 7:  39%|███▉      | 2358/5971 [20:49<31:53,  1.89it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  69%|██████▉   | 116/167 [00:05<00:01, 27.16it/s]
Epoch 7:  40%|███▉      | 2362/5971 [20:49<31:48,  1.89it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 27.68it/s]
Epoch 7:  40%|███▉      | 2366/5971 [20:49<31:43,  1.89it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 27.87it/s]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 27.03it/s]
Epoch 7:  40%|███▉      | 2370/5971 [20:49<31:37,  1.90it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 27.27it/s]
Epoch 7:  40%|███▉      | 2374/5971 [20:49<31:32,  1.90it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 27.33it/s]
Epoch 7:  40%|███▉      | 2378/5971 [20:49<31:27,  1.90it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  80%|████████  | 134/167 [00:06<00:01, 26.64it/s]

Validating:  82%|████████▏ | 137/167 [00:06<00:01, 26.96it/s]
Epoch 7:  40%|███▉      | 2382/5971 [20:50<31:22,  1.91it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  84%|████████▍ | 141/167 [00:06<00:00, 27.75it/s]
Epoch 7:  40%|███▉      | 2386/5971 [20:50<31:17,  1.91it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 28.42it/s]
Epoch 7:  40%|████      | 2390/5971 [20:50<31:12,  1.91it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 29.02it/s]
Epoch 7:  40%|████      | 2394/5971 [20:50<31:07,  1.92it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 27.69it/s]
Epoch 7:  40%|████      | 2398/5971 [20:50<31:02,  1.92it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 27.67it/s]
Epoch 7:  40%|████      | 2402/5971 [20:50<30:57,  1.92it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 27.56it/s]

Validating:  96%|█████████▋| 161/167 [00:07<00:00, 27.89it/s]
Epoch 7:  40%|████      | 2406/5971 [20:50<30:52,  1.92it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  98%|█████████▊| 164/167 [00:07<00:00, 27.70it/s]
Epoch 7:  40%|████      | 2410/5971 [20:51<30:47,  1.93it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating: 100%|██████████| 167/167 [00:07<00:00, 28.21it/s]
Epoch 7:  40%|████      | 2412/5971 [20:51<30:45,  1.93it/s, loss=0.123, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4249.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Epoch 7:  40%|████      | 2413/5971 [20:52<30:45,  1.93it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0332, train/loss_vlb_step=0.000129, train/loss_step=0.0332, global_step=4250.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  40%|████      | 2414/5971 [20:53<30:45,  1.93it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0332, train/loss_vlb_step=0.000129, train/loss_step=0.0332, global_step=4250.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  40%|████      | 2414/5971 [20:53<30:45,  1.93it/s, loss=0.136, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.000969, train/loss_step=0.237, global_step=4250.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  40%|████      | 2415/5971 [20:54<30:45,  1.93it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0072, train/loss_vlb_step=3.37e-5, train/loss_step=0.0072, global_step=4250.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  40%|████      | 2416/5971 [20:56<30:48,  1.92it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=7.47e-5, train/loss_step=0.0173, global_step=4250.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  40%|████      | 2417/5971 [20:57<30:48,  1.92it/s, loss=0.131, v_num=0, train/loss_simple_step=0.417, train/loss_vlb_step=0.00256, train/loss_step=0.417, global_step=4251.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  40%|████      | 2418/5971 [20:58<30:48,  1.92it/s, loss=0.131, v_num=0, train/loss_simple_step=0.417, train/loss_vlb_step=0.00256, train/loss_step=0.417, global_step=4251.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  40%|████      | 2418/5971 [20:58<30:48,  1.92it/s, loss=0.117, v_num=0, train/loss_simple_step=0.490, train/loss_vlb_step=0.00291, train/loss_step=0.490, global_step=4251.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2419/5971 [20:59<30:48,  1.92it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.000256, train/loss_step=0.0771, global_step=4251.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2420/5971 [21:01<30:50,  1.92it/s, loss=0.136, v_num=0, train/loss_simple_step=0.633, train/loss_vlb_step=0.0109, train/loss_step=0.633, global_step=4251.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  41%|████      | 2421/5971 [21:02<30:50,  1.92it/s, loss=0.148, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.000836, train/loss_step=0.236, global_step=4252.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2422/5971 [21:03<30:50,  1.92it/s, loss=0.148, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.000836, train/loss_step=0.236, global_step=4252.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2422/5971 [21:03<30:50,  1.92it/s, loss=0.151, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000425, train/loss_step=0.128, global_step=4252.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2423/5971 [21:04<30:50,  1.92it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00212, train/loss_vlb_step=1.17e-5, train/loss_step=0.00212, global_step=4252.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2424/5971 [21:06<30:52,  1.91it/s, loss=0.175, v_num=0, train/loss_simple_step=0.482, train/loss_vlb_step=0.00355, train/loss_step=0.482, global_step=4252.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  41%|████      | 2425/5971 [21:07<30:52,  1.91it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0404, train/loss_vlb_step=0.000149, train/loss_step=0.0404, global_step=4253.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2426/5971 [21:08<30:52,  1.91it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0404, train/loss_vlb_step=0.000149, train/loss_step=0.0404, global_step=4253.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2426/5971 [21:08<30:52,  1.91it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.000257, train/loss_step=0.0771, global_step=4253.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2427/5971 [21:09<30:52,  1.91it/s, loss=0.196, v_num=0, train/loss_simple_step=0.333, train/loss_vlb_step=0.00173, train/loss_step=0.333, global_step=4253.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  41%|████      | 2428/5971 [21:11<30:54,  1.91it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0354, train/loss_vlb_step=0.000132, train/loss_step=0.0354, global_step=4253.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2429/5971 [21:12<30:54,  1.91it/s, loss=0.183, v_num=0, train/loss_simple_step=0.159, train/loss_vlb_step=0.000525, train/loss_step=0.159, global_step=4254.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  41%|████      | 2430/5971 [21:13<30:54,  1.91it/s, loss=0.183, v_num=0, train/loss_simple_step=0.159, train/loss_vlb_step=0.000525, train/loss_step=0.159, global_step=4254.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2430/5971 [21:13<30:54,  1.91it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0463, train/loss_vlb_step=0.00016, train/loss_step=0.0463, global_step=4254.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2431/5971 [21:13<30:54,  1.91it/s, loss=0.189, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000761, train/loss_step=0.216, global_step=4254.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  41%|████      | 2432/5971 [21:16<30:56,  1.91it/s, loss=0.201, v_num=0, train/loss_simple_step=0.359, train/loss_vlb_step=0.00163, train/loss_step=0.359, global_step=4254.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  41%|████      | 2433/5971 [21:16<30:56,  1.91it/s, loss=0.225, v_num=0, train/loss_simple_step=0.499, train/loss_vlb_step=0.0049, train/loss_step=0.499, global_step=4255.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  41%|████      | 2434/5971 [21:17<30:56,  1.91it/s, loss=0.225, v_num=0, train/loss_simple_step=0.499, train/loss_vlb_step=0.0049, train/loss_step=0.499, global_step=4255.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2434/5971 [21:17<30:56,  1.91it/s, loss=0.226, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00113, train/loss_step=0.267, global_step=4255.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2435/5971 [21:18<30:56,  1.91it/s, loss=0.257, v_num=0, train/loss_simple_step=0.617, train/loss_vlb_step=0.00854, train/loss_step=0.617, global_step=4255.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2436/5971 [21:20<30:58,  1.90it/s, loss=0.259, v_num=0, train/loss_simple_step=0.0568, train/loss_vlb_step=0.000196, train/loss_step=0.0568, global_step=4255.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2437/5971 [21:21<30:58,  1.90it/s, loss=0.274, v_num=0, train/loss_simple_step=0.727, train/loss_vlb_step=0.0147, train/loss_step=0.727, global_step=4256.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  41%|████      | 2438/5971 [21:22<30:58,  1.90it/s, loss=0.274, v_num=0, train/loss_simple_step=0.727, train/loss_vlb_step=0.0147, train/loss_step=0.727, global_step=4256.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2438/5971 [21:22<30:58,  1.90it/s, loss=0.25, v_num=0, train/loss_simple_step=0.0178, train/loss_vlb_step=6.81e-5, train/loss_step=0.0178, global_step=4256.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2439/5971 [21:23<30:58,  1.90it/s, loss=0.27, v_num=0, train/loss_simple_step=0.467, train/loss_vlb_step=0.00327, train/loss_step=0.467, global_step=4256.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  41%|████      | 2440/5971 [21:25<30:59,  1.90it/s, loss=0.238, v_num=0, train/loss_simple_step=0.00179, train/loss_vlb_step=1.03e-5, train/loss_step=0.00179, global_step=4256.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2441/5971 [21:26<30:59,  1.90it/s, loss=0.238, v_num=0, train/loss_simple_step=0.223, train/loss_vlb_step=0.000874, train/loss_step=0.223, global_step=4257.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  41%|████      | 2442/5971 [21:27<30:59,  1.90it/s, loss=0.238, v_num=0, train/loss_simple_step=0.223, train/loss_vlb_step=0.000874, train/loss_step=0.223, global_step=4257.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2442/5971 [21:27<30:59,  1.90it/s, loss=0.251, v_num=0, train/loss_simple_step=0.392, train/loss_vlb_step=0.00251, train/loss_step=0.392, global_step=4257.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  41%|████      | 2443/5971 [21:28<30:59,  1.90it/s, loss=0.255, v_num=0, train/loss_simple_step=0.0744, train/loss_vlb_step=0.000251, train/loss_step=0.0744, global_step=4257.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2444/5971 [21:30<31:02,  1.89it/s, loss=0.236, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000341, train/loss_step=0.104, global_step=4257.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  41%|████      | 2445/5971 [21:31<31:02,  1.89it/s, loss=0.238, v_num=0, train/loss_simple_step=0.0842, train/loss_vlb_step=0.000277, train/loss_step=0.0842, global_step=4258.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2446/5971 [21:32<31:02,  1.89it/s, loss=0.238, v_num=0, train/loss_simple_step=0.0842, train/loss_vlb_step=0.000277, train/loss_step=0.0842, global_step=4258.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2446/5971 [21:32<31:02,  1.89it/s, loss=0.236, v_num=0, train/loss_simple_step=0.0396, train/loss_vlb_step=0.000154, train/loss_step=0.0396, global_step=4258.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2447/5971 [21:33<31:02,  1.89it/s, loss=0.238, v_num=0, train/loss_simple_step=0.374, train/loss_vlb_step=0.00233, train/loss_step=0.374, global_step=4258.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  41%|████      | 2448/5971 [21:35<31:04,  1.89it/s, loss=0.244, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000545, train/loss_step=0.161, global_step=4258.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2449/5971 [21:36<31:04,  1.89it/s, loss=0.237, v_num=0, train/loss_simple_step=0.0169, train/loss_vlb_step=6.87e-5, train/loss_step=0.0169, global_step=4259.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2450/5971 [21:37<31:04,  1.89it/s, loss=0.237, v_num=0, train/loss_simple_step=0.0169, train/loss_vlb_step=6.87e-5, train/loss_step=0.0169, global_step=4259.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2450/5971 [21:37<31:04,  1.89it/s, loss=0.244, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000584, train/loss_step=0.173, global_step=4259.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  41%|████      | 2451/5971 [21:38<31:04,  1.89it/s, loss=0.233, v_num=0, train/loss_simple_step=0.0129, train/loss_vlb_step=5.29e-5, train/loss_step=0.0129, global_step=4259.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2452/5971 [21:40<31:06,  1.89it/s, loss=0.216, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.38e-5, train/loss_step=0.0125, global_step=4259.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2453/5971 [21:41<31:06,  1.89it/s, loss=0.219, v_num=0, train/loss_simple_step=0.561, train/loss_vlb_step=0.0083, train/loss_step=0.561, global_step=4260.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  41%|████      | 2454/5971 [21:42<31:06,  1.88it/s, loss=0.219, v_num=0, train/loss_simple_step=0.561, train/loss_vlb_step=0.0083, train/loss_step=0.561, global_step=4260.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2454/5971 [21:42<31:06,  1.88it/s, loss=0.214, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000596, train/loss_step=0.173, global_step=4260.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2455/5971 [21:43<31:06,  1.88it/s, loss=0.192, v_num=0, train/loss_simple_step=0.167, train/loss_vlb_step=0.000605, train/loss_step=0.167, global_step=4260.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2456/5971 [21:45<31:07,  1.88it/s, loss=0.209, v_num=0, train/loss_simple_step=0.393, train/loss_vlb_step=0.0019, train/loss_step=0.393, global_step=4260.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  41%|████      | 2457/5971 [21:46<31:07,  1.88it/s, loss=0.204, v_num=0, train/loss_simple_step=0.628, train/loss_vlb_step=0.00744, train/loss_step=0.628, global_step=4261.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2458/5971 [21:47<31:07,  1.88it/s, loss=0.204, v_num=0, train/loss_simple_step=0.628, train/loss_vlb_step=0.00744, train/loss_step=0.628, global_step=4261.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2458/5971 [21:47<31:07,  1.88it/s, loss=0.237, v_num=0, train/loss_simple_step=0.683, train/loss_vlb_step=0.050, train/loss_step=0.683, global_step=4261.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  41%|████      | 2459/5971 [21:48<31:07,  1.88it/s, loss=0.214, v_num=0, train/loss_simple_step=0.00264, train/loss_vlb_step=1.48e-5, train/loss_step=0.00264, global_step=4261.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2460/5971 [21:50<31:09,  1.88it/s, loss=0.238, v_num=0, train/loss_simple_step=0.474, train/loss_vlb_step=0.00312, train/loss_step=0.474, global_step=4261.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  41%|████      | 2461/5971 [21:51<31:09,  1.88it/s, loss=0.232, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000369, train/loss_step=0.112, global_step=4262.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2462/5971 [21:52<31:09,  1.88it/s, loss=0.232, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000369, train/loss_step=0.112, global_step=4262.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2462/5971 [21:52<31:09,  1.88it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0135, train/loss_vlb_step=5.55e-5, train/loss_step=0.0135, global_step=4262.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████      | 2463/5971 [21:53<31:09,  1.88it/s, loss=0.216, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000447, train/loss_step=0.132, global_step=4262.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  41%|████▏     | 2464/5971 [21:55<31:11,  1.87it/s, loss=0.225, v_num=0, train/loss_simple_step=0.284, train/loss_vlb_step=0.00133, train/loss_step=0.284, global_step=4262.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  41%|████▏     | 2465/5971 [21:56<31:11,  1.87it/s, loss=0.226, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.00036, train/loss_step=0.109, global_step=4263.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████▏     | 2466/5971 [21:57<31:11,  1.87it/s, loss=0.226, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.00036, train/loss_step=0.109, global_step=4263.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████▏     | 2466/5971 [21:57<31:11,  1.87it/s, loss=0.232, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000504, train/loss_step=0.152, global_step=4263.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████▏     | 2467/5971 [21:58<31:11,  1.87it/s, loss=0.236, v_num=0, train/loss_simple_step=0.462, train/loss_vlb_step=0.00344, train/loss_step=0.462, global_step=4263.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  41%|████▏     | 2468/5971 [22:00<31:13,  1.87it/s, loss=0.23, v_num=0, train/loss_simple_step=0.0292, train/loss_vlb_step=0.000117, train/loss_step=0.0292, global_step=4263.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████▏     | 2469/5971 [22:01<31:13,  1.87it/s, loss=0.251, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.00329, train/loss_step=0.454, global_step=4264.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  41%|████▏     | 2470/5971 [22:02<31:13,  1.87it/s, loss=0.251, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.00329, train/loss_step=0.454, global_step=4264.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████▏     | 2470/5971 [22:02<31:13,  1.87it/s, loss=0.244, v_num=0, train/loss_simple_step=0.0316, train/loss_vlb_step=0.000116, train/loss_step=0.0316, global_step=4264.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████▏     | 2471/5971 [22:02<31:13,  1.87it/s, loss=0.26, v_num=0, train/loss_simple_step=0.336, train/loss_vlb_step=0.00161, train/loss_step=0.336, global_step=4264.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  41%|████▏     | 2472/5971 [22:05<31:15,  1.87it/s, loss=0.26, v_num=0, train/loss_simple_step=0.00169, train/loss_vlb_step=1e-5, train/loss_step=0.00169, global_step=4264.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████▏     | 2473/5971 [22:06<31:15,  1.87it/s, loss=0.252, v_num=0, train/loss_simple_step=0.399, train/loss_vlb_step=0.00248, train/loss_step=0.399, global_step=4265.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████▏     | 2474/5971 [22:07<31:15,  1.87it/s, loss=0.252, v_num=0, train/loss_simple_step=0.399, train/loss_vlb_step=0.00248, train/loss_step=0.399, global_step=4265.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████▏     | 2474/5971 [22:07<31:15,  1.87it/s, loss=0.281, v_num=0, train/loss_simple_step=0.754, train/loss_vlb_step=0.0768, train/loss_step=0.754, global_step=4265.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  41%|████▏     | 2475/5971 [22:07<31:14,  1.86it/s, loss=0.292, v_num=0, train/loss_simple_step=0.395, train/loss_vlb_step=0.00208, train/loss_step=0.395, global_step=4265.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████▏     | 2476/5971 [22:10<31:17,  1.86it/s, loss=0.283, v_num=0, train/loss_simple_step=0.212, train/loss_vlb_step=0.000778, train/loss_step=0.212, global_step=4265.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  41%|████▏     | 2477/5971 [22:11<31:17,  1.86it/s, loss=0.273, v_num=0, train/loss_simple_step=0.423, train/loss_vlb_step=0.0038, train/loss_step=0.423, global_step=4266.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  42%|████▏     | 2478/5971 [22:12<31:16,  1.86it/s, loss=0.273, v_num=0, train/loss_simple_step=0.423, train/loss_vlb_step=0.0038, train/loss_step=0.423, global_step=4266.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2478/5971 [22:12<31:16,  1.86it/s, loss=0.245, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.00041, train/loss_step=0.125, global_step=4266.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2479/5971 [22:12<31:16,  1.86it/s, loss=0.249, v_num=0, train/loss_simple_step=0.0877, train/loss_vlb_step=0.000288, train/loss_step=0.0877, global_step=4266.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2480/5971 [22:15<31:18,  1.86it/s, loss=0.228, v_num=0, train/loss_simple_step=0.0483, train/loss_vlb_step=0.000173, train/loss_step=0.0483, global_step=4266.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2481/5971 [22:16<31:18,  1.86it/s, loss=0.246, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00314, train/loss_step=0.477, global_step=4267.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  42%|████▏     | 2482/5971 [22:16<31:18,  1.86it/s, loss=0.246, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.00314, train/loss_step=0.477, global_step=4267.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2482/5971 [22:16<31:18,  1.86it/s, loss=0.255, v_num=0, train/loss_simple_step=0.195, train/loss_vlb_step=0.000734, train/loss_step=0.195, global_step=4267.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2483/5971 [22:17<31:18,  1.86it/s, loss=0.252, v_num=0, train/loss_simple_step=0.0644, train/loss_vlb_step=0.000224, train/loss_step=0.0644, global_step=4267.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2484/5971 [22:19<31:20,  1.85it/s, loss=0.246, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000519, train/loss_step=0.158, global_step=4267.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  42%|████▏     | 2485/5971 [22:20<31:20,  1.85it/s, loss=0.26, v_num=0, train/loss_simple_step=0.398, train/loss_vlb_step=0.00202, train/loss_step=0.398, global_step=4268.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  42%|████▏     | 2486/5971 [22:21<31:20,  1.85it/s, loss=0.26, v_num=0, train/loss_simple_step=0.398, train/loss_vlb_step=0.00202, train/loss_step=0.398, global_step=4268.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2486/5971 [22:21<31:20,  1.85it/s, loss=0.255, v_num=0, train/loss_simple_step=0.0473, train/loss_vlb_step=0.000164, train/loss_step=0.0473, global_step=4268.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2487/5971 [22:22<31:20,  1.85it/s, loss=0.233, v_num=0, train/loss_simple_step=0.0215, train/loss_vlb_step=8.28e-5, train/loss_step=0.0215, global_step=4268.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  42%|████▏     | 2488/5971 [22:24<31:22,  1.85it/s, loss=0.239, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000622, train/loss_step=0.152, global_step=4268.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  42%|████▏     | 2489/5971 [22:25<31:22,  1.85it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0178, train/loss_vlb_step=7.32e-5, train/loss_step=0.0178, global_step=4269.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2490/5971 [22:26<31:21,  1.85it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0178, train/loss_vlb_step=7.32e-5, train/loss_step=0.0178, global_step=4269.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2490/5971 [22:26<31:21,  1.85it/s, loss=0.224, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000612, train/loss_step=0.175, global_step=4269.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  42%|████▏     | 2491/5971 [22:27<31:21,  1.85it/s, loss=0.215, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000495, train/loss_step=0.148, global_step=4269.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2492/5971 [22:30<31:24,  1.85it/s, loss=0.215, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.37e-5, train/loss_step=0.0126, global_step=4269.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2493/5971 [22:30<31:23,  1.85it/s, loss=0.2, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.00033, train/loss_step=0.100, global_step=4270.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  42%|████▏     | 2494/5971 [22:31<31:23,  1.85it/s, loss=0.2, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.00033, train/loss_step=0.100, global_step=4270.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2494/5971 [22:31<31:23,  1.85it/s, loss=0.171, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000564, train/loss_step=0.163, global_step=4270.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2495/5971 [22:32<31:23,  1.85it/s, loss=0.161, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000669, train/loss_step=0.189, global_step=4270.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2496/5971 [22:34<31:25,  1.84it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0503, train/loss_vlb_step=0.000179, train/loss_step=0.0503, global_step=4270.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2497/5971 [22:35<31:25,  1.84it/s, loss=0.137, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000343, train/loss_step=0.104, global_step=4271.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  42%|████▏     | 2498/5971 [22:36<31:25,  1.84it/s, loss=0.137, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000343, train/loss_step=0.104, global_step=4271.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2498/5971 [22:36<31:25,  1.84it/s, loss=0.141, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000726, train/loss_step=0.217, global_step=4271.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2499/5971 [22:37<31:25,  1.84it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.81e-5, train/loss_step=0.0128, global_step=4271.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2500/5971 [22:39<31:26,  1.84it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0922, train/loss_vlb_step=0.000303, train/loss_step=0.0922, global_step=4271.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2501/5971 [22:40<31:26,  1.84it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0413, train/loss_vlb_step=0.000146, train/loss_step=0.0413, global_step=4272.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2502/5971 [22:41<31:26,  1.84it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0413, train/loss_vlb_step=0.000146, train/loss_step=0.0413, global_step=4272.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2502/5971 [22:41<31:26,  1.84it/s, loss=0.126, v_num=0, train/loss_simple_step=0.361, train/loss_vlb_step=0.00249, train/loss_step=0.361, global_step=4272.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  42%|████▏     | 2503/5971 [22:42<31:26,  1.84it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0521, train/loss_vlb_step=0.000178, train/loss_step=0.0521, global_step=4272.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2504/5971 [22:44<31:28,  1.84it/s, loss=0.125, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000511, train/loss_step=0.154, global_step=4272.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  42%|████▏     | 2505/5971 [22:45<31:28,  1.84it/s, loss=0.117, v_num=0, train/loss_simple_step=0.226, train/loss_vlb_step=0.000865, train/loss_step=0.226, global_step=4273.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2506/5971 [22:46<31:28,  1.83it/s, loss=0.117, v_num=0, train/loss_simple_step=0.226, train/loss_vlb_step=0.000865, train/loss_step=0.226, global_step=4273.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2506/5971 [22:46<31:28,  1.83it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0117, train/loss_vlb_step=5.06e-5, train/loss_step=0.0117, global_step=4273.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2507/5971 [22:47<31:28,  1.83it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0467, train/loss_vlb_step=0.00016, train/loss_step=0.0467, global_step=4273.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2508/5971 [22:49<31:29,  1.83it/s, loss=0.119, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.000749, train/loss_step=0.197, global_step=4273.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  42%|████▏     | 2509/5971 [22:50<31:30,  1.83it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0294, train/loss_vlb_step=0.000104, train/loss_step=0.0294, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2510/5971 [22:51<31:29,  1.83it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0294, train/loss_vlb_step=0.000104, train/loss_step=0.0294, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2510/5971 [22:51<31:29,  1.83it/s, loss=0.122, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000781, train/loss_step=0.222, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  42%|████▏     | 2511/5971 [22:52<31:29,  1.83it/s, loss=0.122, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000505, train/loss_step=0.148, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  42%|████▏     | 2512/5971 [22:54<31:31,  1.83it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:27,  1.89it/s][A
Epoch 7:  42%|████▏     | 2514/5971 [22:54<31:29,  1.83it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   1%|          | 2/167 [00:00<00:47,  3.50it/s][A

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.42it/s][A
Epoch 7:  42%|████▏     | 2518/5971 [22:55<31:25,  1.83it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   4%|▍         | 7/167 [00:00<00:13, 11.82it/s][A
Epoch 7:  42%|████▏     | 2522/5971 [22:55<31:20,  1.83it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   6%|▌         | 10/167 [00:00<00:10, 15.41it/s][A

Validating:   8%|▊         | 13/167 [00:01<00:08, 18.33it/s][A
Epoch 7:  42%|████▏     | 2526/5971 [22:55<31:15,  1.84it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  10%|▉         | 16/167 [00:01<00:07, 19.80it/s][A
Epoch 7:  42%|████▏     | 2530/5971 [22:55<31:10,  1.84it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  11%|█▏        | 19/167 [00:01<00:06, 21.87it/s][A
Epoch 7:  42%|████▏     | 2534/5971 [22:55<31:05,  1.84it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  13%|█▎        | 22/167 [00:01<00:06, 23.45it/s][A

Validating:  15%|█▍        | 25/167 [00:01<00:05, 24.30it/s][A
Epoch 7:  43%|████▎     | 2538/5971 [22:55<31:00,  1.85it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  17%|█▋        | 28/167 [00:01<00:05, 23.83it/s][A
Epoch 7:  43%|████▎     | 2542/5971 [22:56<30:55,  1.85it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  19%|█▊        | 31/167 [00:01<00:05, 24.46it/s][A
Epoch 7:  43%|████▎     | 2546/5971 [22:56<30:50,  1.85it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  20%|██        | 34/167 [00:01<00:05, 25.27it/s][A

Validating:  22%|██▏       | 37/167 [00:02<00:05, 25.77it/s][A
Epoch 7:  43%|████▎     | 2550/5971 [22:56<30:45,  1.85it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  24%|██▍       | 40/167 [00:02<00:05, 25.24it/s][A
Epoch 7:  43%|████▎     | 2554/5971 [22:56<30:40,  1.86it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  26%|██▌       | 43/167 [00:02<00:04, 25.80it/s][A
Epoch 7:  43%|████▎     | 2558/5971 [22:56<30:36,  1.86it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  28%|██▊       | 46/167 [00:02<00:04, 25.88it/s][A

Validating:  29%|██▉       | 49/167 [00:02<00:04, 26.29it/s][A
Epoch 7:  43%|████▎     | 2562/5971 [22:56<30:31,  1.86it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  31%|███       | 52/167 [00:02<00:04, 26.73it/s][A
Epoch 7:  43%|████▎     | 2566/5971 [22:57<30:26,  1.86it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  33%|███▎      | 55/167 [00:02<00:04, 27.11it/s][A
Epoch 7:  43%|████▎     | 2570/5971 [22:57<30:21,  1.87it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  35%|███▍      | 58/167 [00:02<00:04, 26.25it/s][A

Validating:  37%|███▋      | 61/167 [00:02<00:04, 25.77it/s][A
Epoch 7:  43%|████▎     | 2574/5971 [22:57<30:17,  1.87it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  38%|███▊      | 64/167 [00:03<00:03, 26.23it/s][A
Epoch 7:  43%|████▎     | 2578/5971 [22:57<30:12,  1.87it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  40%|████      | 67/167 [00:03<00:03, 26.89it/s][A
Epoch 7:  43%|████▎     | 2582/5971 [22:57<30:07,  1.87it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 27.26it/s][A

Validating:  44%|████▎     | 73/167 [00:03<00:03, 27.24it/s][A
Epoch 7:  43%|████▎     | 2586/5971 [22:57<30:02,  1.88it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 27.02it/s][A
Epoch 7:  43%|████▎     | 2590/5971 [22:57<29:58,  1.88it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 27.59it/s][A
Epoch 7:  43%|████▎     | 2594/5971 [22:58<29:53,  1.88it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  50%|████▉     | 83/167 [00:03<00:02, 28.71it/s][A
Epoch 7:  44%|████▎     | 2598/5971 [22:58<29:48,  1.89it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  51%|█████▏    | 86/167 [00:03<00:02, 27.99it/s][A

Validating:  53%|█████▎    | 89/167 [00:03<00:02, 26.98it/s][A
Epoch 7:  44%|████▎     | 2602/5971 [22:58<29:43,  1.89it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 27.06it/s][A
Epoch 7:  44%|████▎     | 2606/5971 [22:58<29:39,  1.89it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 27.60it/s][A
Epoch 7:  44%|████▎     | 2610/5971 [22:58<29:34,  1.89it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 27.70it/s][A

Validating:  60%|██████    | 101/167 [00:04<00:02, 26.24it/s][A
Epoch 7:  44%|████▍     | 2614/5971 [22:58<29:30,  1.90it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 25.68it/s][A
Epoch 7:  44%|████▍     | 2618/5971 [22:58<29:25,  1.90it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 26.32it/s][A
Epoch 7:  44%|████▍     | 2622/5971 [22:59<29:20,  1.90it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 27.45it/s][A
Epoch 7:  44%|████▍     | 2626/5971 [22:59<29:16,  1.90it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  69%|██████▉   | 115/167 [00:04<00:01, 28.51it/s][A
Epoch 7:  44%|████▍     | 2630/5971 [22:59<29:11,  1.91it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  71%|███████   | 118/167 [00:05<00:01, 27.16it/s][A

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 27.33it/s][A
Epoch 7:  44%|████▍     | 2634/5971 [22:59<29:07,  1.91it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 26.79it/s][A
Epoch 7:  44%|████▍     | 2638/5971 [22:59<29:02,  1.91it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 27.26it/s][A
Epoch 7:  44%|████▍     | 2642/5971 [22:59<28:57,  1.92it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 26.68it/s][A

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 26.41it/s][A
Epoch 7:  44%|████▍     | 2646/5971 [22:59<28:53,  1.92it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 26.04it/s][A
Epoch 7:  44%|████▍     | 2650/5971 [23:00<28:48,  1.92it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 26.50it/s][A
Epoch 7:  44%|████▍     | 2654/5971 [23:00<28:44,  1.92it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  85%|████████▌ | 142/167 [00:05<00:00, 26.40it/s][A

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 26.86it/s][A
Epoch 7:  45%|████▍     | 2658/5971 [23:00<28:39,  1.93it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 27.64it/s][A
Epoch 7:  45%|████▍     | 2662/5971 [23:00<28:35,  1.93it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 27.29it/s][A
Epoch 7:  45%|████▍     | 2666/5971 [23:00<28:31,  1.93it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 27.90it/s][A
Epoch 7:  45%|████▍     | 2670/5971 [23:00<28:26,  1.93it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 28.05it/s][A

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 27.14it/s][A
Epoch 7:  45%|████▍     | 2674/5971 [23:01<28:22,  1.94it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 26.92it/s][A
Epoch 7:  45%|████▍     | 2678/5971 [23:01<28:17,  1.94it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating: 100%|██████████| 167/167 [00:06<00:00, 27.05it/s][A
Epoch 7:  45%|████▍     | 2680/5971 [23:01<28:15,  1.94it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0837, train/loss_vlb_step=0.000277, train/loss_step=0.0837, global_step=4274.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Epoch 7:  45%|████▍     | 2681/5971 [23:02<28:15,  1.94it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0249, train/loss_vlb_step=9.9e-5, train/loss_step=0.0249, global_step=4275.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  45%|████▍     | 2682/5971 [23:03<28:15,  1.94it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0249, train/loss_vlb_step=9.9e-5, train/loss_step=0.0249, global_step=4275.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▍     | 2682/5971 [23:03<28:15,  1.94it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0242, train/loss_vlb_step=9.47e-5, train/loss_step=0.0242, global_step=4275.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▍     | 2683/5971 [23:04<28:15,  1.94it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.95e-5, train/loss_step=0.0037, global_step=4275.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▍     | 2684/5971 [23:06<28:17,  1.94it/s, loss=0.114, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.000902, train/loss_step=0.235, global_step=4275.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  45%|████▍     | 2685/5971 [23:07<28:17,  1.94it/s, loss=0.114, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000343, train/loss_step=0.104, global_step=4276.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▍     | 2686/5971 [23:08<28:17,  1.94it/s, loss=0.114, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000343, train/loss_step=0.104, global_step=4276.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▍     | 2686/5971 [23:08<28:17,  1.94it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0151, train/loss_vlb_step=6.67e-5, train/loss_step=0.0151, global_step=4276.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▌     | 2687/5971 [23:09<28:17,  1.94it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.89e-5, train/loss_step=0.0108, global_step=4276.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▌     | 2688/5971 [23:11<28:18,  1.93it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0885, train/loss_vlb_step=0.000292, train/loss_step=0.0885, global_step=4276.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▌     | 2689/5971 [23:12<28:18,  1.93it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0175, train/loss_vlb_step=7.04e-5, train/loss_step=0.0175, global_step=4277.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  45%|████▌     | 2690/5971 [23:13<28:18,  1.93it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0175, train/loss_vlb_step=7.04e-5, train/loss_step=0.0175, global_step=4277.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▌     | 2690/5971 [23:13<28:18,  1.93it/s, loss=0.0855, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=6.36e-5, train/loss_step=0.0144, global_step=4277.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▌     | 2691/5971 [23:14<28:18,  1.93it/s, loss=0.0965, v_num=0, train/loss_simple_step=0.272, train/loss_vlb_step=0.000942, train/loss_step=0.272, global_step=4277.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  45%|████▌     | 2692/5971 [23:16<28:19,  1.93it/s, loss=0.093, v_num=0, train/loss_simple_step=0.0845, train/loss_vlb_step=0.00028, train/loss_step=0.0845, global_step=4277.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▌     | 2693/5971 [23:17<28:19,  1.93it/s, loss=0.0843, v_num=0, train/loss_simple_step=0.0525, train/loss_vlb_step=0.000178, train/loss_step=0.0525, global_step=4278.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▌     | 2694/5971 [23:17<28:19,  1.93it/s, loss=0.0843, v_num=0, train/loss_simple_step=0.0525, train/loss_vlb_step=0.000178, train/loss_step=0.0525, global_step=4278.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▌     | 2694/5971 [23:17<28:19,  1.93it/s, loss=0.0922, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000589, train/loss_step=0.170, global_step=4278.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  45%|████▌     | 2695/5971 [23:18<28:19,  1.93it/s, loss=0.0899, v_num=0, train/loss_simple_step=0.00146, train/loss_vlb_step=8.71e-6, train/loss_step=0.00146, global_step=4278.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▌     | 2696/5971 [23:21<28:21,  1.92it/s, loss=0.0823, v_num=0, train/loss_simple_step=0.0453, train/loss_vlb_step=0.000164, train/loss_step=0.0453, global_step=4278.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  45%|████▌     | 2697/5971 [23:22<28:21,  1.92it/s, loss=0.0889, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000548, train/loss_step=0.160, global_step=4279.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  45%|████▌     | 2698/5971 [23:22<28:21,  1.92it/s, loss=0.0889, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000548, train/loss_step=0.160, global_step=4279.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▌     | 2698/5971 [23:22<28:21,  1.92it/s, loss=0.0782, v_num=0, train/loss_simple_step=0.00877, train/loss_vlb_step=4.15e-5, train/loss_step=0.00877, global_step=4279.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▌     | 2699/5971 [23:23<28:21,  1.92it/s, loss=0.0861, v_num=0, train/loss_simple_step=0.306, train/loss_vlb_step=0.00123, train/loss_step=0.306, global_step=4279.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  45%|████▌     | 2700/5971 [23:26<28:22,  1.92it/s, loss=0.0826, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=5.91e-5, train/loss_step=0.0141, global_step=4279.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▌     | 2701/5971 [23:27<28:22,  1.92it/s, loss=0.0919, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000895, train/loss_step=0.211, global_step=4280.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  45%|████▌     | 2702/5971 [23:27<28:22,  1.92it/s, loss=0.0919, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000895, train/loss_step=0.211, global_step=4280.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▌     | 2702/5971 [23:27<28:22,  1.92it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.0162, train/loss_vlb_step=6.47e-5, train/loss_step=0.0162, global_step=4280.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▌     | 2703/5971 [23:28<28:22,  1.92it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.00383, train/loss_vlb_step=2.07e-5, train/loss_step=0.00383, global_step=4280.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▌     | 2704/5971 [23:31<28:24,  1.92it/s, loss=0.0801, v_num=0, train/loss_simple_step=0.00792, train/loss_vlb_step=3.88e-5, train/loss_step=0.00792, global_step=4280.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▌     | 2705/5971 [23:32<28:24,  1.92it/s, loss=0.0755, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=4.94e-5, train/loss_step=0.0116, global_step=4281.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  45%|████▌     | 2706/5971 [23:32<28:24,  1.92it/s, loss=0.0755, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=4.94e-5, train/loss_step=0.0116, global_step=4281.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▌     | 2706/5971 [23:32<28:24,  1.92it/s, loss=0.0755, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.15e-5, train/loss_step=0.0141, global_step=4281.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▌     | 2707/5971 [23:33<28:24,  1.92it/s, loss=0.116, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0326, train/loss_step=0.812, global_step=4281.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  45%|████▌     | 2708/5971 [23:36<28:25,  1.91it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00237, train/loss_vlb_step=1.31e-5, train/loss_step=0.00237, global_step=4281.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▌     | 2709/5971 [23:37<28:25,  1.91it/s, loss=0.111, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=4.91e-5, train/loss_step=0.011, global_step=4282.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  45%|████▌     | 2710/5971 [23:38<28:25,  1.91it/s, loss=0.111, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=4.91e-5, train/loss_step=0.011, global_step=4282.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▌     | 2710/5971 [23:38<28:25,  1.91it/s, loss=0.133, v_num=0, train/loss_simple_step=0.455, train/loss_vlb_step=0.00348, train/loss_step=0.455, global_step=4282.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▌     | 2711/5971 [23:38<28:25,  1.91it/s, loss=0.126, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000417, train/loss_step=0.127, global_step=4282.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▌     | 2712/5971 [23:41<28:27,  1.91it/s, loss=0.123, v_num=0, train/loss_simple_step=0.029, train/loss_vlb_step=0.000113, train/loss_step=0.029, global_step=4282.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▌     | 2713/5971 [23:41<28:26,  1.91it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0832, train/loss_vlb_step=0.000285, train/loss_step=0.0832, global_step=4283.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▌     | 2714/5971 [23:42<28:26,  1.91it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0832, train/loss_vlb_step=0.000285, train/loss_step=0.0832, global_step=4283.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▌     | 2714/5971 [23:42<28:26,  1.91it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.23e-5, train/loss_step=0.0119, global_step=4283.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  45%|████▌     | 2715/5971 [23:43<28:26,  1.91it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00191, train/loss_vlb_step=1.13e-5, train/loss_step=0.00191, global_step=4283.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  45%|████▌     | 2716/5971 [23:46<28:28,  1.91it/s, loss=0.125, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000717, train/loss_step=0.216, global_step=4283.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  46%|████▌     | 2717/5971 [23:47<28:28,  1.90it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0719, train/loss_vlb_step=0.000238, train/loss_step=0.0719, global_step=4284.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2718/5971 [23:48<28:28,  1.90it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0719, train/loss_vlb_step=0.000238, train/loss_step=0.0719, global_step=4284.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2718/5971 [23:48<28:28,  1.90it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0284, train/loss_vlb_step=0.000105, train/loss_step=0.0284, global_step=4284.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2719/5971 [23:49<28:28,  1.90it/s, loss=0.133, v_num=0, train/loss_simple_step=0.532, train/loss_vlb_step=0.00665, train/loss_step=0.532, global_step=4284.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  46%|████▌     | 2720/5971 [23:51<28:29,  1.90it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000146, train/loss_step=0.0406, global_step=4284.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2721/5971 [23:52<28:29,  1.90it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0531, train/loss_vlb_step=0.000185, train/loss_step=0.0531, global_step=4285.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2722/5971 [23:52<28:29,  1.90it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0531, train/loss_vlb_step=0.000185, train/loss_step=0.0531, global_step=4285.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2722/5971 [23:52<28:29,  1.90it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0529, train/loss_vlb_step=0.00018, train/loss_step=0.0529, global_step=4285.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  46%|████▌     | 2723/5971 [23:53<28:29,  1.90it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0319, train/loss_vlb_step=0.000127, train/loss_step=0.0319, global_step=4285.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2724/5971 [23:55<28:31,  1.90it/s, loss=0.136, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000446, train/loss_step=0.134, global_step=4285.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  46%|████▌     | 2725/5971 [23:56<28:31,  1.90it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0107, train/loss_vlb_step=4.5e-5, train/loss_step=0.0107, global_step=4286.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2726/5971 [23:57<28:30,  1.90it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0107, train/loss_vlb_step=4.5e-5, train/loss_step=0.0107, global_step=4286.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2726/5971 [23:57<28:30,  1.90it/s, loss=0.143, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.00057, train/loss_step=0.165, global_step=4286.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  46%|████▌     | 2727/5971 [23:58<28:30,  1.90it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=4.83e-5, train/loss_step=0.0115, global_step=4286.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2728/5971 [24:01<28:32,  1.89it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00389, train/loss_vlb_step=2.17e-5, train/loss_step=0.00389, global_step=4286.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2729/5971 [24:01<28:32,  1.89it/s, loss=0.11, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.000479, train/loss_step=0.146, global_step=4287.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  46%|████▌     | 2730/5971 [24:02<28:32,  1.89it/s, loss=0.11, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.000479, train/loss_step=0.146, global_step=4287.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2730/5971 [24:02<28:32,  1.89it/s, loss=0.0887, v_num=0, train/loss_simple_step=0.0237, train/loss_vlb_step=9.25e-5, train/loss_step=0.0237, global_step=4287.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2731/5971 [24:03<28:32,  1.89it/s, loss=0.0865, v_num=0, train/loss_simple_step=0.0834, train/loss_vlb_step=0.000279, train/loss_step=0.0834, global_step=4287.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2732/5971 [24:05<28:33,  1.89it/s, loss=0.0851, v_num=0, train/loss_simple_step=0.00167, train/loss_vlb_step=9.71e-6, train/loss_step=0.00167, global_step=4287.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2733/5971 [24:06<28:33,  1.89it/s, loss=0.0853, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000281, train/loss_step=0.0854, global_step=4288.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  46%|████▌     | 2734/5971 [24:07<28:33,  1.89it/s, loss=0.0853, v_num=0, train/loss_simple_step=0.0854, train/loss_vlb_step=0.000281, train/loss_step=0.0854, global_step=4288.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2734/5971 [24:07<28:33,  1.89it/s, loss=0.0948, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000731, train/loss_step=0.204, global_step=4288.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  46%|████▌     | 2735/5971 [24:08<28:33,  1.89it/s, loss=0.0955, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=5.87e-5, train/loss_step=0.0141, global_step=4288.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2736/5971 [24:11<28:35,  1.89it/s, loss=0.0848, v_num=0, train/loss_simple_step=0.00259, train/loss_vlb_step=1.39e-5, train/loss_step=0.00259, global_step=4288.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2737/5971 [24:11<28:34,  1.89it/s, loss=0.0814, v_num=0, train/loss_simple_step=0.00277, train/loss_vlb_step=1.56e-5, train/loss_step=0.00277, global_step=4289.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2738/5971 [24:12<28:34,  1.89it/s, loss=0.0814, v_num=0, train/loss_simple_step=0.00277, train/loss_vlb_step=1.56e-5, train/loss_step=0.00277, global_step=4289.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2738/5971 [24:12<28:34,  1.89it/s, loss=0.088, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000541, train/loss_step=0.162, global_step=4289.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  46%|████▌     | 2739/5971 [24:13<28:34,  1.88it/s, loss=0.0658, v_num=0, train/loss_simple_step=0.0879, train/loss_vlb_step=0.000296, train/loss_step=0.0879, global_step=4289.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2740/5971 [24:15<28:36,  1.88it/s, loss=0.0841, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00231, train/loss_step=0.406, global_step=4289.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  46%|████▌     | 2741/5971 [24:16<28:35,  1.88it/s, loss=0.0818, v_num=0, train/loss_simple_step=0.00779, train/loss_vlb_step=3.46e-5, train/loss_step=0.00779, global_step=4290.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2742/5971 [24:17<28:35,  1.88it/s, loss=0.0818, v_num=0, train/loss_simple_step=0.00779, train/loss_vlb_step=3.46e-5, train/loss_step=0.00779, global_step=4290.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2742/5971 [24:17<28:35,  1.88it/s, loss=0.0815, v_num=0, train/loss_simple_step=0.0468, train/loss_vlb_step=0.000174, train/loss_step=0.0468, global_step=4290.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  46%|████▌     | 2743/5971 [24:18<28:35,  1.88it/s, loss=0.0955, v_num=0, train/loss_simple_step=0.312, train/loss_vlb_step=0.00118, train/loss_step=0.312, global_step=4290.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  46%|████▌     | 2744/5971 [24:20<28:37,  1.88it/s, loss=0.0995, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000839, train/loss_step=0.214, global_step=4290.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2745/5971 [24:21<28:37,  1.88it/s, loss=0.128, v_num=0, train/loss_simple_step=0.572, train/loss_vlb_step=0.00729, train/loss_step=0.572, global_step=4291.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  46%|████▌     | 2746/5971 [24:22<28:37,  1.88it/s, loss=0.128, v_num=0, train/loss_simple_step=0.572, train/loss_vlb_step=0.00729, train/loss_step=0.572, global_step=4291.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2746/5971 [24:22<28:37,  1.88it/s, loss=0.127, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000548, train/loss_step=0.157, global_step=4291.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2747/5971 [24:23<28:36,  1.88it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00464, train/loss_vlb_step=2.4e-5, train/loss_step=0.00464, global_step=4291.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2748/5971 [24:25<28:38,  1.88it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0216, train/loss_vlb_step=9.1e-5, train/loss_step=0.0216, global_step=4291.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  46%|████▌     | 2749/5971 [24:26<28:38,  1.88it/s, loss=0.13, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000677, train/loss_step=0.188, global_step=4292.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  46%|████▌     | 2750/5971 [24:27<28:37,  1.87it/s, loss=0.13, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.000677, train/loss_step=0.188, global_step=4292.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2750/5971 [24:27<28:37,  1.87it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0924, train/loss_vlb_step=0.000311, train/loss_step=0.0924, global_step=4292.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2751/5971 [24:28<28:37,  1.87it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00987, train/loss_vlb_step=4.53e-5, train/loss_step=0.00987, global_step=4292.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2752/5971 [24:30<28:39,  1.87it/s, loss=0.152, v_num=0, train/loss_simple_step=0.441, train/loss_vlb_step=0.00237, train/loss_step=0.441, global_step=4292.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  46%|████▌     | 2753/5971 [24:31<28:39,  1.87it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0339, train/loss_vlb_step=0.000123, train/loss_step=0.0339, global_step=4293.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2754/5971 [24:32<28:39,  1.87it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0339, train/loss_vlb_step=0.000123, train/loss_step=0.0339, global_step=4293.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2754/5971 [24:32<28:39,  1.87it/s, loss=0.16, v_num=0, train/loss_simple_step=0.432, train/loss_vlb_step=0.00362, train/loss_step=0.432, global_step=4293.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  46%|████▌     | 2755/5971 [24:33<28:38,  1.87it/s, loss=0.178, v_num=0, train/loss_simple_step=0.357, train/loss_vlb_step=0.0015, train/loss_step=0.357, global_step=4293.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2756/5971 [24:35<28:40,  1.87it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0313, train/loss_vlb_step=0.000115, train/loss_step=0.0313, global_step=4293.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2757/5971 [24:36<28:40,  1.87it/s, loss=0.186, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000442, train/loss_step=0.135, global_step=4294.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  46%|████▌     | 2758/5971 [24:37<28:40,  1.87it/s, loss=0.186, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000442, train/loss_step=0.135, global_step=4294.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2758/5971 [24:37<28:40,  1.87it/s, loss=0.195, v_num=0, train/loss_simple_step=0.354, train/loss_vlb_step=0.00158, train/loss_step=0.354, global_step=4294.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  46%|████▌     | 2759/5971 [24:38<28:40,  1.87it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00396, train/loss_vlb_step=2.17e-5, train/loss_step=0.00396, global_step=4294.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2760/5971 [24:40<28:41,  1.86it/s, loss=0.171, v_num=0, train/loss_simple_step=0.00906, train/loss_vlb_step=4.11e-5, train/loss_step=0.00906, global_step=4294.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▌     | 2761/5971 [24:41<28:41,  1.86it/s, loss=0.188, v_num=0, train/loss_simple_step=0.341, train/loss_vlb_step=0.00139, train/loss_step=0.341, global_step=4295.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  46%|████▋     | 2762/5971 [24:42<28:41,  1.86it/s, loss=0.188, v_num=0, train/loss_simple_step=0.341, train/loss_vlb_step=0.00139, train/loss_step=0.341, global_step=4295.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▋     | 2762/5971 [24:42<28:41,  1.86it/s, loss=0.208, v_num=0, train/loss_simple_step=0.459, train/loss_vlb_step=0.00381, train/loss_step=0.459, global_step=4295.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▋     | 2763/5971 [24:43<28:41,  1.86it/s, loss=0.223, v_num=0, train/loss_simple_step=0.609, train/loss_vlb_step=0.018, train/loss_step=0.609, global_step=4295.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  46%|████▋     | 2764/5971 [24:45<28:42,  1.86it/s, loss=0.234, v_num=0, train/loss_simple_step=0.42, train/loss_vlb_step=0.00246, train/loss_step=0.42, global_step=4295.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▋     | 2765/5971 [24:46<28:42,  1.86it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.52e-5, train/loss_step=0.0125, global_step=4296.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▋     | 2766/5971 [24:47<28:42,  1.86it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0125, train/loss_vlb_step=5.52e-5, train/loss_step=0.0125, global_step=4296.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▋     | 2766/5971 [24:47<28:42,  1.86it/s, loss=0.199, v_num=0, train/loss_simple_step=0.0217, train/loss_vlb_step=9.16e-5, train/loss_step=0.0217, global_step=4296.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▋     | 2767/5971 [24:48<28:42,  1.86it/s, loss=0.199, v_num=0, train/loss_simple_step=0.0107, train/loss_vlb_step=4.49e-5, train/loss_step=0.0107, global_step=4296.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▋     | 2768/5971 [24:50<28:43,  1.86it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0677, train/loss_vlb_step=0.000226, train/loss_step=0.0677, global_step=4296.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▋     | 2769/5971 [24:50<28:43,  1.86it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0421, train/loss_vlb_step=0.00015, train/loss_step=0.0421, global_step=4297.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  46%|████▋     | 2770/5971 [24:51<28:43,  1.86it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0421, train/loss_vlb_step=0.00015, train/loss_step=0.0421, global_step=4297.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▋     | 2770/5971 [24:51<28:43,  1.86it/s, loss=0.204, v_num=0, train/loss_simple_step=0.297, train/loss_vlb_step=0.00122, train/loss_step=0.297, global_step=4297.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  46%|████▋     | 2771/5971 [24:52<28:43,  1.86it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0358, train/loss_vlb_step=0.000136, train/loss_step=0.0358, global_step=4297.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▋     | 2772/5971 [24:55<28:44,  1.85it/s, loss=0.189, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000353, train/loss_step=0.107, global_step=4297.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  46%|████▋     | 2773/5971 [24:55<28:44,  1.85it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0381, train/loss_vlb_step=0.000131, train/loss_step=0.0381, global_step=4298.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▋     | 2774/5971 [24:56<28:44,  1.85it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0381, train/loss_vlb_step=0.000131, train/loss_step=0.0381, global_step=4298.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▋     | 2774/5971 [24:56<28:44,  1.85it/s, loss=0.173, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000378, train/loss_step=0.109, global_step=4298.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  46%|████▋     | 2775/5971 [24:57<28:44,  1.85it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0025, train/loss_vlb_step=1.4e-5, train/loss_step=0.0025, global_step=4298.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  46%|████▋     | 2776/5971 [24:59<28:45,  1.85it/s, loss=0.163, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.00066, train/loss_step=0.189, global_step=4298.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  47%|████▋     | 2777/5971 [25:00<28:45,  1.85it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00651, train/loss_vlb_step=3.26e-5, train/loss_step=0.00651, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  47%|████▋     | 2778/5971 [25:01<28:45,  1.85it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00651, train/loss_vlb_step=3.26e-5, train/loss_step=0.00651, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  47%|████▋     | 2778/5971 [25:01<28:45,  1.85it/s, loss=0.158, v_num=0, train/loss_simple_step=0.384, train/loss_vlb_step=0.00232, train/loss_step=0.384, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  47%|████▋     | 2779/5971 [25:02<28:45,  1.85it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00324, train/loss_vlb_step=1.8e-5, train/loss_step=0.00324, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  47%|████▋     | 2780/5971 [25:04<28:46,  1.85it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:02,  2.65it/s]
Epoch 7:  47%|████▋     | 2782/5971 [25:05<28:44,  1.85it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   1%|          | 2/167 [00:00<00:57,  2.88it/s]

Validating:   3%|▎         | 5/167 [00:00<00:21,  7.61it/s]
Epoch 7:  47%|████▋     | 2786/5971 [25:05<28:40,  1.85it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   5%|▍         | 8/167 [00:00<00:13, 11.82it/s]
Epoch 7:  47%|████▋     | 2790/5971 [25:05<28:36,  1.85it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   7%|▋         | 11/167 [00:01<00:10, 14.75it/s]
Epoch 7:  47%|████▋     | 2794/5971 [25:06<28:31,  1.86it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   8%|▊         | 14/167 [00:01<00:08, 17.83it/s]

Validating:  10%|█         | 17/167 [00:01<00:07, 19.84it/s]
Epoch 7:  47%|████▋     | 2798/5971 [25:06<28:27,  1.86it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 21.26it/s]
Epoch 7:  47%|████▋     | 2802/5971 [25:06<28:23,  1.86it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 22.66it/s]
Epoch 7:  47%|████▋     | 2806/5971 [25:06<28:18,  1.86it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  16%|█▌        | 26/167 [00:01<00:06, 23.30it/s]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 24.43it/s]
Epoch 7:  47%|████▋     | 2810/5971 [25:06<28:14,  1.87it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 24.27it/s]
Epoch 7:  47%|████▋     | 2814/5971 [25:06<28:10,  1.87it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  21%|██        | 35/167 [00:02<00:05, 25.00it/s]
Epoch 7:  47%|████▋     | 2818/5971 [25:07<28:05,  1.87it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  23%|██▎       | 38/167 [00:02<00:05, 24.50it/s]

Validating:  25%|██▍       | 41/167 [00:02<00:05, 24.46it/s]
Epoch 7:  47%|████▋     | 2822/5971 [25:07<28:01,  1.87it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 25.02it/s]
Epoch 7:  47%|████▋     | 2826/5971 [25:07<27:56,  1.88it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 25.71it/s]
Epoch 7:  47%|████▋     | 2830/5971 [25:07<27:52,  1.88it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  30%|██▉       | 50/167 [00:02<00:05, 21.32it/s]

Validating:  32%|███▏      | 53/167 [00:02<00:06, 18.03it/s]
Epoch 7:  47%|████▋     | 2834/5971 [25:07<27:48,  1.88it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  33%|███▎      | 55/167 [00:03<00:06, 16.22it/s]

Validating:  34%|███▍      | 57/167 [00:03<00:07, 15.13it/s]
Epoch 7:  48%|████▊     | 2838/5971 [25:08<27:44,  1.88it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  36%|███▌      | 60/167 [00:03<00:06, 17.10it/s]
Epoch 7:  48%|████▊     | 2842/5971 [25:08<27:40,  1.88it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  37%|███▋      | 62/167 [00:03<00:06, 16.20it/s]

Validating:  39%|███▉      | 65/167 [00:03<00:05, 19.22it/s]
Epoch 7:  48%|████▊     | 2846/5971 [25:08<27:35,  1.89it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  41%|████      | 68/167 [00:03<00:04, 21.15it/s]
Epoch 7:  48%|████▊     | 2850/5971 [25:08<27:31,  1.89it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  43%|████▎     | 71/167 [00:03<00:04, 23.19it/s]
Epoch 7:  48%|████▊     | 2854/5971 [25:08<27:27,  1.89it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 24.59it/s]

Validating:  46%|████▌     | 77/167 [00:04<00:03, 25.71it/s]
Epoch 7:  48%|████▊     | 2858/5971 [25:09<27:23,  1.89it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  48%|████▊     | 80/167 [00:04<00:03, 26.79it/s]
Epoch 7:  48%|████▊     | 2862/5971 [25:09<27:18,  1.90it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  50%|████▉     | 83/167 [00:04<00:03, 27.36it/s][A
Epoch 7:  48%|████▊     | 2866/5971 [25:09<27:14,  1.90it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  51%|█████▏    | 86/167 [00:04<00:03, 26.67it/s][A

Validating:  53%|█████▎    | 89/167 [00:04<00:02, 26.33it/s][A
Epoch 7:  48%|████▊     | 2870/5971 [25:09<27:10,  1.90it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 27.15it/s][A
Epoch 7:  48%|████▊     | 2874/5971 [25:09<27:06,  1.90it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 28.35it/s][A
Epoch 7:  48%|████▊     | 2878/5971 [25:09<27:01,  1.91it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 27.92it/s][A
Epoch 7:  48%|████▊     | 2882/5971 [25:09<26:57,  1.91it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 28.33it/s][A
Epoch 7:  48%|████▊     | 2886/5971 [25:10<26:53,  1.91it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  63%|██████▎   | 106/167 [00:05<00:02, 27.68it/s][A

Validating:  65%|██████▌   | 109/167 [00:05<00:02, 27.91it/s][A
Epoch 7:  48%|████▊     | 2890/5971 [25:10<26:49,  1.91it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  67%|██████▋   | 112/167 [00:05<00:02, 27.49it/s][A
Epoch 7:  48%|████▊     | 2894/5971 [25:10<26:45,  1.92it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  69%|██████▉   | 116/167 [00:05<00:01, 28.60it/s][A
Epoch 7:  49%|████▊     | 2898/5971 [25:10<26:41,  1.92it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 28.68it/s][A
Epoch 7:  49%|████▊     | 2902/5971 [25:10<26:36,  1.92it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 28.57it/s][A

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 28.27it/s][A
Epoch 7:  49%|████▊     | 2906/5971 [25:10<26:32,  1.92it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 28.19it/s][A
Epoch 7:  49%|████▊     | 2910/5971 [25:10<26:28,  1.93it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  79%|███████▉  | 132/167 [00:06<00:01, 28.14it/s][A
Epoch 7:  49%|████▉     | 2914/5971 [25:11<26:24,  1.93it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  81%|████████  | 135/167 [00:06<00:01, 27.72it/s][A
Epoch 7:  49%|████▉     | 2918/5971 [25:11<26:20,  1.93it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  83%|████████▎ | 138/167 [00:06<00:01, 27.98it/s][A

Validating:  84%|████████▍ | 141/167 [00:06<00:00, 27.78it/s][A
Epoch 7:  49%|████▉     | 2922/5971 [25:11<26:16,  1.93it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 27.76it/s][A
Epoch 7:  49%|████▉     | 2926/5971 [25:11<26:12,  1.94it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 27.35it/s][A
Epoch 7:  49%|████▉     | 2930/5971 [25:11<26:08,  1.94it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 26.98it/s][A

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 27.43it/s][A
Epoch 7:  49%|████▉     | 2934/5971 [25:11<26:04,  1.94it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 27.81it/s][A
Epoch 7:  49%|████▉     | 2938/5971 [25:11<26:00,  1.94it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 28.14it/s][A
Epoch 7:  49%|████▉     | 2942/5971 [25:12<25:56,  1.95it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  98%|█████████▊| 163/167 [00:07<00:00, 27.35it/s][A
Epoch 7:  49%|████▉     | 2946/5971 [25:12<25:52,  1.95it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  99%|█████████▉| 166/167 [00:07<00:00, 27.00it/s][A
Epoch 7:  49%|████▉     | 2948/5971 [25:12<25:50,  1.95it/s, loss=0.166, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000577, train/loss_step=0.173, global_step=4299.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.35it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.45it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.30it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.92it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.42it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.79it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.97it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.04it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.09it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.13it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.15it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.23it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.37it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.47it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.52it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.57it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.57it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.61it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.51it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.51it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.49it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.50it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.53it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.35it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.30it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.25it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.20it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.12it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:04,  5.11it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.10it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.10it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.10it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.12it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:03,  5.23it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.32it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.38it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.31it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.27it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.27it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.28it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.33it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.25it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.16it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.06it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:00,  5.02it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.08it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.24it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.35it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.45it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.50it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.05it/s]

Epoch 7:  49%|████▉     | 2949/5971 [25:25<26:02,  1.93it/s, loss=0.156, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000452, train/loss_step=0.132, global_step=4300.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.43it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.27it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.92it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.41it/s][A
Epoch 7:  49%|████▉     | 2949/5971 [25:27<26:05,  1.93it/s, loss=0.156, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000452, train/loss_step=0.132, global_step=4300.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.76it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.02it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.22it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.35it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.45it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.52it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.56it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.58it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.60it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.62it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.63it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.64it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.66it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.66it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.67it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.66it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.67it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.67it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.67it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:04<00:04,  5.66it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.67it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.68it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.69it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.69it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.67it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.67it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.68it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:02,  5.68it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.69it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.70it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.69it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.68it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.55it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.18it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.31it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.42it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.49it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.53it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.58it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.60it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.61it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:08<00:00,  5.63it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.65it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.66it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.67it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.29it/s]

Epoch 7:  49%|████▉     | 2950/5971 [25:36<26:13,  1.92it/s, loss=0.156, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000452, train/loss_step=0.132, global_step=4300.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  49%|████▉     | 2950/5971 [25:36<26:13,  1.92it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0023, train/loss_vlb_step=1.34e-5, train/loss_step=0.0023, global_step=4300.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.31it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.37it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.22it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.89it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.28it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.63it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.90it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.12it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.29it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.40it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.49it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.56it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.60it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.62it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.64it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.64it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.63it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.64it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.65it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.66it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.67it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.66it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.65it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.65it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.65it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.67it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.66it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.47it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.53it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.04it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.17it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.26it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.21it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:03,  5.21it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.22it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.22it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.22it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.10it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.09it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.11it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.12it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.13it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.17it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.20it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.23it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.24it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.25it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.26it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.25it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.26it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.09it/s]

Epoch 7:  49%|████▉     | 2951/5971 [25:49<26:24,  1.91it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0023, train/loss_vlb_step=1.34e-5, train/loss_step=0.0023, global_step=4300.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  49%|████▉     | 2951/5971 [25:49<26:24,  1.91it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00846, train/loss_vlb_step=3.95e-5, train/loss_step=0.00846, global_step=4300.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.41it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.26it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.90it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.38it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.74it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.03it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.24it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.38it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.48it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.55it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.61it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.65it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.67it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.70it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:05,  5.69it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.70it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.72it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.71it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.72it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.73it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.69it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.68it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.69it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:04<00:04,  5.71it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.57it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.50it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.45it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.41it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.40it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.48it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.55it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.60it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.63it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.61it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.64it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.65it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.67it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.67it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.66it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.55it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.47it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.42it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.40it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.38it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.36it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:08<00:00,  5.35it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.42it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.49it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.56it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.26it/s]

Epoch 7:  49%|████▉     | 2952/5971 [26:02<26:37,  1.89it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00846, train/loss_vlb_step=3.95e-5, train/loss_step=0.00846, global_step=4300.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  49%|████▉     | 2952/5971 [26:02<26:37,  1.89it/s, loss=0.0965, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00126, train/loss_step=0.287, global_step=4300.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  49%|████▉     | 2953/5971 [26:03<26:37,  1.89it/s, loss=0.0965, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00126, train/loss_step=0.287, global_step=4300.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  49%|████▉     | 2953/5971 [26:03<26:37,  1.89it/s, loss=0.105, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000655, train/loss_step=0.192, global_step=4301.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  49%|████▉     | 2954/5971 [26:04<26:37,  1.89it/s, loss=0.105, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000655, train/loss_step=0.192, global_step=4301.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  49%|████▉     | 2954/5971 [26:04<26:37,  1.89it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0691, train/loss_vlb_step=0.000235, train/loss_step=0.0691, global_step=4301.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  49%|████▉     | 2955/5971 [26:05<26:36,  1.89it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0691, train/loss_vlb_step=0.000235, train/loss_step=0.0691, global_step=4301.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  49%|████▉     | 2955/5971 [26:05<26:36,  1.89it/s, loss=0.147, v_num=0, train/loss_simple_step=0.801, train/loss_vlb_step=0.0348, train/loss_step=0.801, global_step=4301.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  50%|████▉     | 2956/5971 [26:07<26:38,  1.89it/s, loss=0.147, v_num=0, train/loss_simple_step=0.801, train/loss_vlb_step=0.0348, train/loss_step=0.801, global_step=4301.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2956/5971 [26:07<26:38,  1.89it/s, loss=0.163, v_num=0, train/loss_simple_step=0.390, train/loss_vlb_step=0.00172, train/loss_step=0.390, global_step=4301.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2957/5971 [26:08<26:38,  1.89it/s, loss=0.163, v_num=0, train/loss_simple_step=0.390, train/loss_vlb_step=0.00172, train/loss_step=0.390, global_step=4301.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2957/5971 [26:08<26:38,  1.89it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00296, train/loss_vlb_step=1.67e-5, train/loss_step=0.00296, global_step=4302.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2958/5971 [26:09<26:37,  1.89it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00296, train/loss_vlb_step=1.67e-5, train/loss_step=0.00296, global_step=4302.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2958/5971 [26:09<26:37,  1.89it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0351, train/loss_vlb_step=0.000126, train/loss_step=0.0351, global_step=4302.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  50%|████▉     | 2959/5971 [26:10<26:37,  1.89it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0351, train/loss_vlb_step=0.000126, train/loss_step=0.0351, global_step=4302.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2959/5971 [26:10<26:37,  1.89it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0896, train/loss_vlb_step=0.000297, train/loss_step=0.0896, global_step=4302.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2960/5971 [26:12<26:38,  1.88it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0896, train/loss_vlb_step=0.000297, train/loss_step=0.0896, global_step=4302.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2960/5971 [26:12<26:38,  1.88it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00482, train/loss_vlb_step=2.58e-5, train/loss_step=0.00482, global_step=4302.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2961/5971 [26:13<26:38,  1.88it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00482, train/loss_vlb_step=2.58e-5, train/loss_step=0.00482, global_step=4302.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2961/5971 [26:13<26:38,  1.88it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.69e-5, train/loss_step=0.0101, global_step=4303.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  50%|████▉     | 2962/5971 [26:14<26:38,  1.88it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.69e-5, train/loss_step=0.0101, global_step=4303.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2962/5971 [26:14<26:38,  1.88it/s, loss=0.182, v_num=0, train/loss_simple_step=0.859, train/loss_vlb_step=0.0553, train/loss_step=0.859, global_step=4303.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  50%|████▉     | 2963/5971 [26:14<26:38,  1.88it/s, loss=0.182, v_num=0, train/loss_simple_step=0.859, train/loss_vlb_step=0.0553, train/loss_step=0.859, global_step=4303.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2963/5971 [26:14<26:38,  1.88it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0082, train/loss_vlb_step=3.94e-5, train/loss_step=0.0082, global_step=4303.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2964/5971 [26:17<26:39,  1.88it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0082, train/loss_vlb_step=3.94e-5, train/loss_step=0.0082, global_step=4303.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2964/5971 [26:17<26:39,  1.88it/s, loss=0.194, v_num=0, train/loss_simple_step=0.412, train/loss_vlb_step=0.00331, train/loss_step=0.412, global_step=4303.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  50%|████▉     | 2965/5971 [26:18<26:39,  1.88it/s, loss=0.194, v_num=0, train/loss_simple_step=0.412, train/loss_vlb_step=0.00331, train/loss_step=0.412, global_step=4303.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2965/5971 [26:18<26:39,  1.88it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0544, train/loss_vlb_step=0.000193, train/loss_step=0.0544, global_step=4304.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2966/5971 [26:19<26:39,  1.88it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0544, train/loss_vlb_step=0.000193, train/loss_step=0.0544, global_step=4304.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2966/5971 [26:19<26:39,  1.88it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00153, train/loss_vlb_step=8.55e-6, train/loss_step=0.00153, global_step=4304.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2967/5971 [26:19<26:39,  1.88it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00153, train/loss_vlb_step=8.55e-6, train/loss_step=0.00153, global_step=4304.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2967/5971 [26:19<26:39,  1.88it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0169, train/loss_vlb_step=6.91e-5, train/loss_step=0.0169, global_step=4304.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  50%|████▉     | 2968/5971 [26:22<26:40,  1.88it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0169, train/loss_vlb_step=6.91e-5, train/loss_step=0.0169, global_step=4304.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2968/5971 [26:22<26:40,  1.88it/s, loss=0.177, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000569, train/loss_step=0.165, global_step=4304.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  50%|████▉     | 2969/5971 [26:23<26:40,  1.88it/s, loss=0.177, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000569, train/loss_step=0.165, global_step=4304.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2969/5971 [26:23<26:40,  1.88it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0544, train/loss_vlb_step=0.000194, train/loss_step=0.0544, global_step=4305.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2970/5971 [26:23<26:39,  1.88it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0544, train/loss_vlb_step=0.000194, train/loss_step=0.0544, global_step=4305.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2970/5971 [26:23<26:39,  1.88it/s, loss=0.186, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.001, train/loss_step=0.250, global_step=4305.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]     
Epoch 7:  50%|████▉     | 2971/5971 [26:24<26:39,  1.88it/s, loss=0.186, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.001, train/loss_step=0.250, global_step=4305.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2971/5971 [26:24<26:39,  1.88it/s, loss=0.204, v_num=0, train/loss_simple_step=0.377, train/loss_vlb_step=0.00191, train/loss_step=0.377, global_step=4305.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2972/5971 [26:27<26:41,  1.87it/s, loss=0.204, v_num=0, train/loss_simple_step=0.377, train/loss_vlb_step=0.00191, train/loss_step=0.377, global_step=4305.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2972/5971 [26:27<26:41,  1.87it/s, loss=0.196, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000423, train/loss_step=0.129, global_step=4305.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2973/5971 [26:28<26:40,  1.87it/s, loss=0.196, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000423, train/loss_step=0.129, global_step=4305.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2973/5971 [26:28<26:40,  1.87it/s, loss=0.218, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.0166, train/loss_step=0.624, global_step=4306.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  50%|████▉     | 2974/5971 [26:29<26:40,  1.87it/s, loss=0.218, v_num=0, train/loss_simple_step=0.624, train/loss_vlb_step=0.0166, train/loss_step=0.624, global_step=4306.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2974/5971 [26:29<26:40,  1.87it/s, loss=0.216, v_num=0, train/loss_simple_step=0.0349, train/loss_vlb_step=0.000127, train/loss_step=0.0349, global_step=4306.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2975/5971 [26:29<26:40,  1.87it/s, loss=0.216, v_num=0, train/loss_simple_step=0.0349, train/loss_vlb_step=0.000127, train/loss_step=0.0349, global_step=4306.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2975/5971 [26:29<26:40,  1.87it/s, loss=0.194, v_num=0, train/loss_simple_step=0.366, train/loss_vlb_step=0.00188, train/loss_step=0.366, global_step=4306.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  50%|████▉     | 2976/5971 [26:32<26:41,  1.87it/s, loss=0.194, v_num=0, train/loss_simple_step=0.366, train/loss_vlb_step=0.00188, train/loss_step=0.366, global_step=4306.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2976/5971 [26:32<26:41,  1.87it/s, loss=0.176, v_num=0, train/loss_simple_step=0.024, train/loss_vlb_step=9.29e-5, train/loss_step=0.024, global_step=4306.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2977/5971 [26:33<26:41,  1.87it/s, loss=0.176, v_num=0, train/loss_simple_step=0.024, train/loss_vlb_step=9.29e-5, train/loss_step=0.024, global_step=4306.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2977/5971 [26:33<26:41,  1.87it/s, loss=0.182, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000395, train/loss_step=0.120, global_step=4307.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2978/5971 [26:33<26:41,  1.87it/s, loss=0.182, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000395, train/loss_step=0.120, global_step=4307.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2978/5971 [26:33<26:41,  1.87it/s, loss=0.18, v_num=0, train/loss_simple_step=0.00485, train/loss_vlb_step=2.52e-5, train/loss_step=0.00485, global_step=4307.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2979/5971 [26:34<26:41,  1.87it/s, loss=0.18, v_num=0, train/loss_simple_step=0.00485, train/loss_vlb_step=2.52e-5, train/loss_step=0.00485, global_step=4307.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2979/5971 [26:34<26:41,  1.87it/s, loss=0.182, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000419, train/loss_step=0.125, global_step=4307.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  50%|████▉     | 2980/5971 [26:36<26:42,  1.87it/s, loss=0.182, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000419, train/loss_step=0.125, global_step=4307.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2980/5971 [26:36<26:42,  1.87it/s, loss=0.216, v_num=0, train/loss_simple_step=0.682, train/loss_vlb_step=0.0125, train/loss_step=0.682, global_step=4307.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  50%|████▉     | 2981/5971 [26:37<26:42,  1.87it/s, loss=0.216, v_num=0, train/loss_simple_step=0.682, train/loss_vlb_step=0.0125, train/loss_step=0.682, global_step=4307.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2981/5971 [26:37<26:42,  1.87it/s, loss=0.226, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000758, train/loss_step=0.204, global_step=4308.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2982/5971 [26:38<26:41,  1.87it/s, loss=0.226, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000758, train/loss_step=0.204, global_step=4308.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2982/5971 [26:38<26:41,  1.87it/s, loss=0.192, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000646, train/loss_step=0.180, global_step=4308.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2983/5971 [26:39<26:41,  1.87it/s, loss=0.192, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000646, train/loss_step=0.180, global_step=4308.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2983/5971 [26:39<26:41,  1.87it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0259, train/loss_vlb_step=9.9e-5, train/loss_step=0.0259, global_step=4308.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2984/5971 [26:42<26:43,  1.86it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0259, train/loss_vlb_step=9.9e-5, train/loss_step=0.0259, global_step=4308.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2984/5971 [26:42<26:43,  1.86it/s, loss=0.2, v_num=0, train/loss_simple_step=0.557, train/loss_vlb_step=0.00627, train/loss_step=0.557, global_step=4308.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  50%|████▉     | 2985/5971 [26:42<26:42,  1.86it/s, loss=0.2, v_num=0, train/loss_simple_step=0.557, train/loss_vlb_step=0.00627, train/loss_step=0.557, global_step=4308.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|████▉     | 2985/5971 [26:42<26:42,  1.86it/s, loss=0.205, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.0005, train/loss_step=0.151, global_step=4309.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 2986/5971 [26:43<26:42,  1.86it/s, loss=0.205, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.0005, train/loss_step=0.151, global_step=4309.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 2986/5971 [26:43<26:42,  1.86it/s, loss=0.224, v_num=0, train/loss_simple_step=0.395, train/loss_vlb_step=0.00277, train/loss_step=0.395, global_step=4309.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 2987/5971 [26:44<26:42,  1.86it/s, loss=0.224, v_num=0, train/loss_simple_step=0.395, train/loss_vlb_step=0.00277, train/loss_step=0.395, global_step=4309.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 2987/5971 [26:44<26:42,  1.86it/s, loss=0.23, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000418, train/loss_step=0.126, global_step=4309.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 2988/5971 [26:46<26:43,  1.86it/s, loss=0.23, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000418, train/loss_step=0.126, global_step=4309.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 2988/5971 [26:46<26:43,  1.86it/s, loss=0.222, v_num=0, train/loss_simple_step=0.00179, train/loss_vlb_step=1.07e-5, train/loss_step=0.00179, global_step=4309.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 2989/5971 [26:47<26:43,  1.86it/s, loss=0.222, v_num=0, train/loss_simple_step=0.00179, train/loss_vlb_step=1.07e-5, train/loss_step=0.00179, global_step=4309.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 2989/5971 [26:47<26:43,  1.86it/s, loss=0.248, v_num=0, train/loss_simple_step=0.584, train/loss_vlb_step=0.00808, train/loss_step=0.584, global_step=4310.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  50%|█████     | 2990/5971 [26:48<26:43,  1.86it/s, loss=0.248, v_num=0, train/loss_simple_step=0.584, train/loss_vlb_step=0.00808, train/loss_step=0.584, global_step=4310.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 2990/5971 [26:48<26:43,  1.86it/s, loss=0.243, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.00046, train/loss_step=0.140, global_step=4310.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 2991/5971 [26:49<26:42,  1.86it/s, loss=0.243, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.00046, train/loss_step=0.140, global_step=4310.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 2991/5971 [26:49<26:42,  1.86it/s, loss=0.228, v_num=0, train/loss_simple_step=0.0877, train/loss_vlb_step=0.000289, train/loss_step=0.0877, global_step=4310.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 2992/5971 [26:51<26:44,  1.86it/s, loss=0.228, v_num=0, train/loss_simple_step=0.0877, train/loss_vlb_step=0.000289, train/loss_step=0.0877, global_step=4310.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 2992/5971 [26:51<26:44,  1.86it/s, loss=0.231, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000578, train/loss_step=0.175, global_step=4310.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  50%|█████     | 2993/5971 [26:52<26:43,  1.86it/s, loss=0.231, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000578, train/loss_step=0.175, global_step=4310.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 2993/5971 [26:52<26:43,  1.86it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.1e-5, train/loss_step=0.0122, global_step=4311.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  50%|█████     | 2994/5971 [26:53<26:43,  1.86it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0122, train/loss_vlb_step=5.1e-5, train/loss_step=0.0122, global_step=4311.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 2994/5971 [26:53<26:43,  1.86it/s, loss=0.206, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=4311.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 2995/5971 [26:54<26:43,  1.86it/s, loss=0.206, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000497, train/loss_step=0.148, global_step=4311.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 2995/5971 [26:54<26:43,  1.86it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=4.9e-5, train/loss_step=0.0112, global_step=4311.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 2996/5971 [26:56<26:44,  1.85it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=4.9e-5, train/loss_step=0.0112, global_step=4311.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 2996/5971 [26:56<26:44,  1.85it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0502, train/loss_vlb_step=0.000175, train/loss_step=0.0502, global_step=4311.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 2997/5971 [26:57<26:44,  1.85it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0502, train/loss_vlb_step=0.000175, train/loss_step=0.0502, global_step=4311.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 2997/5971 [26:57<26:44,  1.85it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00264, train/loss_vlb_step=1.45e-5, train/loss_step=0.00264, global_step=4312.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 2998/5971 [26:58<26:44,  1.85it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00264, train/loss_vlb_step=1.45e-5, train/loss_step=0.00264, global_step=4312.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 2998/5971 [26:58<26:44,  1.85it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00194, train/loss_vlb_step=1.1e-5, train/loss_step=0.00194, global_step=4312.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  50%|█████     | 2999/5971 [26:59<26:43,  1.85it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00194, train/loss_vlb_step=1.1e-5, train/loss_step=0.00194, global_step=4312.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 2999/5971 [26:59<26:43,  1.85it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0926, train/loss_vlb_step=0.000304, train/loss_step=0.0926, global_step=4312.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 3000/5971 [27:01<26:45,  1.85it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0926, train/loss_vlb_step=0.000304, train/loss_step=0.0926, global_step=4312.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 3000/5971 [27:01<26:45,  1.85it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.12e-5, train/loss_step=0.0141, global_step=4312.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  50%|█████     | 3001/5971 [27:02<26:44,  1.85it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0141, train/loss_vlb_step=6.12e-5, train/loss_step=0.0141, global_step=4312.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 3001/5971 [27:02<26:44,  1.85it/s, loss=0.158, v_num=0, train/loss_simple_step=0.397, train/loss_vlb_step=0.00245, train/loss_step=0.397, global_step=4313.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  50%|█████     | 3002/5971 [27:03<26:44,  1.85it/s, loss=0.158, v_num=0, train/loss_simple_step=0.397, train/loss_vlb_step=0.00245, train/loss_step=0.397, global_step=4313.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 3002/5971 [27:03<26:44,  1.85it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00624, train/loss_vlb_step=3.07e-5, train/loss_step=0.00624, global_step=4313.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 3003/5971 [27:03<26:44,  1.85it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00624, train/loss_vlb_step=3.07e-5, train/loss_step=0.00624, global_step=4313.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 3003/5971 [27:03<26:44,  1.85it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00989, train/loss_vlb_step=4.55e-5, train/loss_step=0.00989, global_step=4313.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 3004/5971 [27:06<26:46,  1.85it/s, loss=0.148, v_num=0, train/loss_simple_step=0.00989, train/loss_vlb_step=4.55e-5, train/loss_step=0.00989, global_step=4313.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 3004/5971 [27:06<26:46,  1.85it/s, loss=0.164, v_num=0, train/loss_simple_step=0.875, train/loss_vlb_step=0.111, train/loss_step=0.875, global_step=4313.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]      
Epoch 7:  50%|█████     | 3005/5971 [27:07<26:45,  1.85it/s, loss=0.164, v_num=0, train/loss_simple_step=0.875, train/loss_vlb_step=0.111, train/loss_step=0.875, global_step=4313.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 3005/5971 [27:07<26:45,  1.85it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0357, train/loss_vlb_step=0.000129, train/loss_step=0.0357, global_step=4314.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 3006/5971 [27:08<26:45,  1.85it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0357, train/loss_vlb_step=0.000129, train/loss_step=0.0357, global_step=4314.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 3006/5971 [27:08<26:45,  1.85it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00946, train/loss_vlb_step=4.34e-5, train/loss_step=0.00946, global_step=4314.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 3007/5971 [27:09<26:45,  1.85it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00946, train/loss_vlb_step=4.34e-5, train/loss_step=0.00946, global_step=4314.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 3007/5971 [27:09<26:45,  1.85it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0926, train/loss_vlb_step=0.000304, train/loss_step=0.0926, global_step=4314.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  50%|█████     | 3008/5971 [27:11<26:46,  1.84it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0926, train/loss_vlb_step=0.000304, train/loss_step=0.0926, global_step=4314.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 3008/5971 [27:11<26:46,  1.84it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0301, train/loss_vlb_step=0.00011, train/loss_step=0.0301, global_step=4314.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  50%|█████     | 3009/5971 [27:12<26:46,  1.84it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0301, train/loss_vlb_step=0.00011, train/loss_step=0.0301, global_step=4314.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 3009/5971 [27:12<26:46,  1.84it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00295, train/loss_vlb_step=1.64e-5, train/loss_step=0.00295, global_step=4315.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 3010/5971 [27:13<26:46,  1.84it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00295, train/loss_vlb_step=1.64e-5, train/loss_step=0.00295, global_step=4315.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 3010/5971 [27:13<26:46,  1.84it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00497, train/loss_vlb_step=2.36e-5, train/loss_step=0.00497, global_step=4315.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 3011/5971 [27:14<26:45,  1.84it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00497, train/loss_vlb_step=2.36e-5, train/loss_step=0.00497, global_step=4315.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 3011/5971 [27:14<26:45,  1.84it/s, loss=0.0998, v_num=0, train/loss_simple_step=0.024, train/loss_vlb_step=9.43e-5, train/loss_step=0.024, global_step=4315.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  50%|█████     | 3012/5971 [27:16<26:46,  1.84it/s, loss=0.0998, v_num=0, train/loss_simple_step=0.024, train/loss_vlb_step=9.43e-5, train/loss_step=0.024, global_step=4315.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 3012/5971 [27:16<26:46,  1.84it/s, loss=0.0934, v_num=0, train/loss_simple_step=0.0465, train/loss_vlb_step=0.000161, train/loss_step=0.0465, global_step=4315.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 3013/5971 [27:17<26:46,  1.84it/s, loss=0.0934, v_num=0, train/loss_simple_step=0.0465, train/loss_vlb_step=0.000161, train/loss_step=0.0465, global_step=4315.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 3013/5971 [27:17<26:46,  1.84it/s, loss=0.105, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000915, train/loss_step=0.242, global_step=4316.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  50%|█████     | 3014/5971 [27:18<26:46,  1.84it/s, loss=0.105, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000915, train/loss_step=0.242, global_step=4316.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 3014/5971 [27:18<26:46,  1.84it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0918, train/loss_vlb_step=0.000304, train/loss_step=0.0918, global_step=4316.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 3015/5971 [27:18<26:46,  1.84it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0918, train/loss_vlb_step=0.000304, train/loss_step=0.0918, global_step=4316.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  50%|█████     | 3015/5971 [27:18<26:46,  1.84it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0478, train/loss_vlb_step=0.000169, train/loss_step=0.0478, global_step=4316.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3016/5971 [27:21<26:47,  1.84it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0478, train/loss_vlb_step=0.000169, train/loss_step=0.0478, global_step=4316.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3016/5971 [27:21<26:47,  1.84it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0965, train/loss_vlb_step=0.000336, train/loss_step=0.0965, global_step=4316.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3017/5971 [27:22<26:47,  1.84it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0965, train/loss_vlb_step=0.000336, train/loss_step=0.0965, global_step=4316.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3017/5971 [27:22<26:47,  1.84it/s, loss=0.129, v_num=0, train/loss_simple_step=0.453, train/loss_vlb_step=0.00417, train/loss_step=0.453, global_step=4317.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  51%|█████     | 3018/5971 [27:22<26:47,  1.84it/s, loss=0.129, v_num=0, train/loss_simple_step=0.453, train/loss_vlb_step=0.00417, train/loss_step=0.453, global_step=4317.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3018/5971 [27:22<26:47,  1.84it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.000121, train/loss_step=0.0325, global_step=4317.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3019/5971 [27:23<26:46,  1.84it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.000121, train/loss_step=0.0325, global_step=4317.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3019/5971 [27:23<26:46,  1.84it/s, loss=0.132, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000402, train/loss_step=0.122, global_step=4317.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  51%|█████     | 3020/5971 [27:25<26:47,  1.84it/s, loss=0.132, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000402, train/loss_step=0.122, global_step=4317.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3020/5971 [27:25<26:47,  1.84it/s, loss=0.141, v_num=0, train/loss_simple_step=0.199, train/loss_vlb_step=0.000733, train/loss_step=0.199, global_step=4317.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3021/5971 [27:26<26:47,  1.84it/s, loss=0.141, v_num=0, train/loss_simple_step=0.199, train/loss_vlb_step=0.000733, train/loss_step=0.199, global_step=4317.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3021/5971 [27:26<26:47,  1.84it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0176, train/loss_vlb_step=7.15e-5, train/loss_step=0.0176, global_step=4318.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3022/5971 [27:27<26:47,  1.83it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0176, train/loss_vlb_step=7.15e-5, train/loss_step=0.0176, global_step=4318.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3022/5971 [27:27<26:47,  1.83it/s, loss=0.146, v_num=0, train/loss_simple_step=0.496, train/loss_vlb_step=0.0042, train/loss_step=0.496, global_step=4318.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  51%|█████     | 3023/5971 [27:28<26:47,  1.83it/s, loss=0.146, v_num=0, train/loss_simple_step=0.496, train/loss_vlb_step=0.0042, train/loss_step=0.496, global_step=4318.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3023/5971 [27:28<26:47,  1.83it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00467, train/loss_vlb_step=2.43e-5, train/loss_step=0.00467, global_step=4318.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3024/5971 [27:30<26:48,  1.83it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00467, train/loss_vlb_step=2.43e-5, train/loss_step=0.00467, global_step=4318.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3024/5971 [27:30<26:48,  1.83it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.78e-5, train/loss_step=0.0101, global_step=4318.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  51%|█████     | 3025/5971 [27:31<26:48,  1.83it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.78e-5, train/loss_step=0.0101, global_step=4318.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3025/5971 [27:31<26:48,  1.83it/s, loss=0.108, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000435, train/loss_step=0.132, global_step=4319.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  51%|█████     | 3026/5971 [27:32<26:47,  1.83it/s, loss=0.108, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000435, train/loss_step=0.132, global_step=4319.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3026/5971 [27:32<26:47,  1.83it/s, loss=0.113, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000359, train/loss_step=0.109, global_step=4319.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3027/5971 [27:33<26:47,  1.83it/s, loss=0.113, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000359, train/loss_step=0.109, global_step=4319.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3027/5971 [27:33<26:47,  1.83it/s, loss=0.12, v_num=0, train/loss_simple_step=0.246, train/loss_vlb_step=0.000987, train/loss_step=0.246, global_step=4319.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  51%|█████     | 3028/5971 [27:35<26:48,  1.83it/s, loss=0.12, v_num=0, train/loss_simple_step=0.246, train/loss_vlb_step=0.000987, train/loss_step=0.246, global_step=4319.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3028/5971 [27:35<26:48,  1.83it/s, loss=0.137, v_num=0, train/loss_simple_step=0.371, train/loss_vlb_step=0.00186, train/loss_step=0.371, global_step=4319.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3029/5971 [27:36<26:48,  1.83it/s, loss=0.137, v_num=0, train/loss_simple_step=0.371, train/loss_vlb_step=0.00186, train/loss_step=0.371, global_step=4319.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3029/5971 [27:36<26:48,  1.83it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.38e-6, train/loss_step=0.00164, global_step=4320.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3030/5971 [27:37<26:48,  1.83it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00164, train/loss_vlb_step=9.38e-6, train/loss_step=0.00164, global_step=4320.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3030/5971 [27:37<26:48,  1.83it/s, loss=0.173, v_num=0, train/loss_simple_step=0.710, train/loss_vlb_step=0.0108, train/loss_step=0.710, global_step=4320.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]     
Epoch 7:  51%|█████     | 3031/5971 [27:38<26:47,  1.83it/s, loss=0.173, v_num=0, train/loss_simple_step=0.710, train/loss_vlb_step=0.0108, train/loss_step=0.710, global_step=4320.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3031/5971 [27:38<26:47,  1.83it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0975, train/loss_vlb_step=0.00032, train/loss_step=0.0975, global_step=4320.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3032/5971 [27:40<26:49,  1.83it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0975, train/loss_vlb_step=0.00032, train/loss_step=0.0975, global_step=4320.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3032/5971 [27:40<26:49,  1.83it/s, loss=0.209, v_num=0, train/loss_simple_step=0.709, train/loss_vlb_step=0.0143, train/loss_step=0.709, global_step=4320.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  51%|█████     | 3033/5971 [27:41<26:48,  1.83it/s, loss=0.209, v_num=0, train/loss_simple_step=0.709, train/loss_vlb_step=0.0143, train/loss_step=0.709, global_step=4320.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3033/5971 [27:41<26:48,  1.83it/s, loss=0.208, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000774, train/loss_step=0.218, global_step=4321.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3034/5971 [27:42<26:48,  1.83it/s, loss=0.208, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000774, train/loss_step=0.218, global_step=4321.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3034/5971 [27:42<26:48,  1.83it/s, loss=0.204, v_num=0, train/loss_simple_step=0.00337, train/loss_vlb_step=1.81e-5, train/loss_step=0.00337, global_step=4321.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3035/5971 [27:43<26:48,  1.83it/s, loss=0.204, v_num=0, train/loss_simple_step=0.00337, train/loss_vlb_step=1.81e-5, train/loss_step=0.00337, global_step=4321.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3035/5971 [27:43<26:48,  1.83it/s, loss=0.229, v_num=0, train/loss_simple_step=0.554, train/loss_vlb_step=0.0116, train/loss_step=0.554, global_step=4321.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]     
Epoch 7:  51%|█████     | 3036/5971 [27:45<26:49,  1.82it/s, loss=0.229, v_num=0, train/loss_simple_step=0.554, train/loss_vlb_step=0.0116, train/loss_step=0.554, global_step=4321.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3036/5971 [27:45<26:49,  1.82it/s, loss=0.256, v_num=0, train/loss_simple_step=0.634, train/loss_vlb_step=0.0109, train/loss_step=0.634, global_step=4321.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3037/5971 [27:46<26:49,  1.82it/s, loss=0.256, v_num=0, train/loss_simple_step=0.634, train/loss_vlb_step=0.0109, train/loss_step=0.634, global_step=4321.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3037/5971 [27:46<26:49,  1.82it/s, loss=0.258, v_num=0, train/loss_simple_step=0.485, train/loss_vlb_step=0.00355, train/loss_step=0.485, global_step=4322.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3038/5971 [27:47<26:48,  1.82it/s, loss=0.258, v_num=0, train/loss_simple_step=0.485, train/loss_vlb_step=0.00355, train/loss_step=0.485, global_step=4322.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3038/5971 [27:47<26:48,  1.82it/s, loss=0.256, v_num=0, train/loss_simple_step=0.00346, train/loss_vlb_step=1.78e-5, train/loss_step=0.00346, global_step=4322.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3039/5971 [27:48<26:48,  1.82it/s, loss=0.256, v_num=0, train/loss_simple_step=0.00346, train/loss_vlb_step=1.78e-5, train/loss_step=0.00346, global_step=4322.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3039/5971 [27:48<26:48,  1.82it/s, loss=0.263, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.000957, train/loss_step=0.251, global_step=4322.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  51%|█████     | 3040/5971 [27:50<26:49,  1.82it/s, loss=0.263, v_num=0, train/loss_simple_step=0.251, train/loss_vlb_step=0.000957, train/loss_step=0.251, global_step=4322.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3040/5971 [27:50<26:49,  1.82it/s, loss=0.274, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00271, train/loss_step=0.426, global_step=4322.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  51%|█████     | 3041/5971 [27:51<26:49,  1.82it/s, loss=0.274, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00271, train/loss_step=0.426, global_step=4322.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3041/5971 [27:51<26:49,  1.82it/s, loss=0.273, v_num=0, train/loss_simple_step=0.00229, train/loss_vlb_step=1.22e-5, train/loss_step=0.00229, global_step=4323.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3042/5971 [27:52<26:49,  1.82it/s, loss=0.273, v_num=0, train/loss_simple_step=0.00229, train/loss_vlb_step=1.22e-5, train/loss_step=0.00229, global_step=4323.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3042/5971 [27:52<26:49,  1.82it/s, loss=0.274, v_num=0, train/loss_simple_step=0.509, train/loss_vlb_step=0.00348, train/loss_step=0.509, global_step=4323.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  51%|█████     | 3043/5971 [27:52<26:49,  1.82it/s, loss=0.274, v_num=0, train/loss_simple_step=0.509, train/loss_vlb_step=0.00348, train/loss_step=0.509, global_step=4323.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3043/5971 [27:52<26:49,  1.82it/s, loss=0.274, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=4.77e-5, train/loss_step=0.011, global_step=4323.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3044/5971 [27:55<26:50,  1.82it/s, loss=0.274, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=4.77e-5, train/loss_step=0.011, global_step=4323.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3044/5971 [27:55<26:50,  1.82it/s, loss=0.275, v_num=0, train/loss_simple_step=0.0243, train/loss_vlb_step=9.22e-5, train/loss_step=0.0243, global_step=4323.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3045/5971 [27:55<26:49,  1.82it/s, loss=0.275, v_num=0, train/loss_simple_step=0.0243, train/loss_vlb_step=9.22e-5, train/loss_step=0.0243, global_step=4323.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3045/5971 [27:55<26:49,  1.82it/s, loss=0.27, v_num=0, train/loss_simple_step=0.0444, train/loss_vlb_step=0.000156, train/loss_step=0.0444, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3046/5971 [27:56<26:49,  1.82it/s, loss=0.27, v_num=0, train/loss_simple_step=0.0444, train/loss_vlb_step=0.000156, train/loss_step=0.0444, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3046/5971 [27:56<26:49,  1.82it/s, loss=0.275, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.00083, train/loss_step=0.209, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  51%|█████     | 3047/5971 [27:57<26:49,  1.82it/s, loss=0.275, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.00083, train/loss_step=0.209, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3047/5971 [27:57<26:49,  1.82it/s, loss=0.279, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00168, train/loss_step=0.318, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3048/5971 [27:59<26:50,  1.82it/s, loss=0.279, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00168, train/loss_step=0.318, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  51%|█████     | 3048/5971 [27:59<26:50,  1.82it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:16,  2.17it/s][A
Epoch 7:  51%|█████     | 3050/5971 [28:00<26:48,  1.82it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   1%|          | 2/167 [00:00<00:49,  3.36it/s][A
Epoch 7:  51%|█████     | 3052/5971 [28:00<26:46,  1.82it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.03it/s][A
Epoch 7:  51%|█████     | 3055/5971 [28:00<26:43,  1.82it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.84it/s][A
Epoch 7:  51%|█████     | 3058/5971 [28:00<26:40,  1.82it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   7%|▋         | 11/167 [00:00<00:09, 17.03it/s][A
Epoch 7:  51%|█████▏    | 3061/5971 [28:00<26:37,  1.82it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   8%|▊         | 14/167 [00:01<00:07, 20.15it/s][A
Epoch 7:  51%|█████▏    | 3064/5971 [28:01<26:34,  1.82it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  10%|█         | 17/167 [00:01<00:06, 21.78it/s][A
Epoch 7:  51%|█████▏    | 3067/5971 [28:01<26:31,  1.82it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 23.20it/s][A
Epoch 7:  51%|█████▏    | 3070/5971 [28:01<26:28,  1.83it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  14%|█▍        | 23/167 [00:01<00:05, 24.31it/s][A
Epoch 7:  51%|█████▏    | 3073/5971 [28:01<26:25,  1.83it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 25.17it/s][A
Epoch 7:  52%|█████▏    | 3076/5971 [28:01<26:22,  1.83it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 27.05it/s][A
Epoch 7:  52%|█████▏    | 3080/5971 [28:01<26:17,  1.83it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  20%|█▉        | 33/167 [00:01<00:04, 27.79it/s][A
Epoch 7:  52%|█████▏    | 3084/5971 [28:01<26:13,  1.83it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  22%|██▏       | 36/167 [00:01<00:04, 26.56it/s][A

Validating:  23%|██▎       | 39/167 [00:01<00:04, 26.80it/s][A
Epoch 7:  52%|█████▏    | 3088/5971 [28:01<26:09,  1.84it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 27.54it/s][A
Epoch 7:  52%|█████▏    | 3092/5971 [28:02<26:05,  1.84it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 28.13it/s][A
Epoch 7:  52%|█████▏    | 3096/5971 [28:02<26:01,  1.84it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 28.16it/s][A

Validating:  31%|███       | 51/167 [00:02<00:04, 26.41it/s][A
Epoch 7:  52%|█████▏    | 3100/5971 [28:02<25:57,  1.84it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 26.49it/s][A
Epoch 7:  52%|█████▏    | 3104/5971 [28:02<25:53,  1.85it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 27.01it/s][A
Epoch 7:  52%|█████▏    | 3108/5971 [28:02<25:49,  1.85it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  36%|███▌      | 60/167 [00:02<00:04, 26.44it/s][A

Validating:  38%|███▊      | 63/167 [00:02<00:03, 26.09it/s][A
Epoch 7:  52%|█████▏    | 3112/5971 [28:02<25:45,  1.85it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  40%|███▉      | 66/167 [00:03<00:03, 26.34it/s][A
Epoch 7:  52%|█████▏    | 3116/5971 [28:02<25:41,  1.85it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 26.47it/s][A
Epoch 7:  52%|█████▏    | 3120/5971 [28:03<25:37,  1.85it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 26.94it/s][A

Validating:  45%|████▍     | 75/167 [00:03<00:03, 27.33it/s][A
Epoch 7:  52%|█████▏    | 3124/5971 [28:03<25:33,  1.86it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 26.49it/s][A
Epoch 7:  52%|█████▏    | 3128/5971 [28:03<25:29,  1.86it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 26.72it/s][A
Epoch 7:  52%|█████▏    | 3132/5971 [28:03<25:25,  1.86it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  50%|█████     | 84/167 [00:03<00:03, 26.33it/s][A

Validating:  52%|█████▏    | 87/167 [00:03<00:03, 23.14it/s][A
Epoch 7:  53%|█████▎    | 3136/5971 [28:03<25:21,  1.86it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  54%|█████▍    | 90/167 [00:04<00:03, 20.74it/s][A
Epoch 7:  53%|█████▎    | 3140/5971 [28:03<25:17,  1.87it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  56%|█████▌    | 93/167 [00:04<00:03, 21.46it/s][A
Epoch 7:  53%|█████▎    | 3144/5971 [28:04<25:13,  1.87it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  57%|█████▋    | 96/167 [00:04<00:03, 21.70it/s][A

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 22.78it/s][A
Epoch 7:  53%|█████▎    | 3148/5971 [28:04<25:09,  1.87it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 25.13it/s][A
Epoch 7:  53%|█████▎    | 3152/5971 [28:04<25:06,  1.87it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 26.31it/s][A
Epoch 7:  53%|█████▎    | 3156/5971 [28:04<25:02,  1.87it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 27.13it/s][A
Epoch 7:  53%|█████▎    | 3160/5971 [28:04<24:58,  1.88it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  67%|██████▋   | 112/167 [00:04<00:02, 27.13it/s][A

Validating:  69%|██████▉   | 115/167 [00:04<00:01, 26.80it/s][A
Epoch 7:  53%|█████▎    | 3164/5971 [28:04<24:54,  1.88it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  71%|███████   | 118/167 [00:05<00:01, 26.74it/s][A
Epoch 7:  53%|█████▎    | 3168/5971 [28:05<24:50,  1.88it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 25.99it/s][A
Epoch 7:  53%|█████▎    | 3172/5971 [28:05<24:46,  1.88it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 27.26it/s][A
Epoch 7:  53%|█████▎    | 3176/5971 [28:05<24:42,  1.89it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 26.16it/s][A

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 26.61it/s][A
Epoch 7:  53%|█████▎    | 3180/5971 [28:05<24:38,  1.89it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  80%|████████  | 134/167 [00:05<00:01, 25.94it/s][A
Epoch 7:  53%|█████▎    | 3184/5971 [28:05<24:34,  1.89it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 26.61it/s][A
Epoch 7:  53%|█████▎    | 3188/5971 [28:05<24:31,  1.89it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  84%|████████▍ | 140/167 [00:05<00:00, 27.20it/s][A

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 25.92it/s][A
Epoch 7:  53%|█████▎    | 3192/5971 [28:05<24:27,  1.89it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 25.34it/s][A
Epoch 7:  54%|█████▎    | 3196/5971 [28:06<24:23,  1.90it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 24.27it/s][A
Epoch 7:  54%|█████▎    | 3200/5971 [28:06<24:19,  1.90it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 24.72it/s][A
Epoch 7:  54%|█████▎    | 3204/5971 [28:06<24:15,  1.90it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 26.33it/s][A

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 26.81it/s][A
Epoch 7:  54%|█████▎    | 3208/5971 [28:06<24:12,  1.90it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 27.64it/s][A
Epoch 7:  54%|█████▍    | 3212/5971 [28:06<24:08,  1.90it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 27.99it/s][A
Epoch 7:  54%|█████▍    | 3216/5971 [28:06<24:04,  1.91it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3216/5971 [28:07<24:04,  1.91it/s, loss=0.294, v_num=0, train/loss_simple_step=0.668, train/loss_vlb_step=0.0109, train/loss_step=0.668, global_step=4324.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Epoch 7:  54%|█████▍    | 3217/5971 [28:08<24:04,  1.91it/s, loss=0.298, v_num=0, train/loss_simple_step=0.078, train/loss_vlb_step=0.000261, train/loss_step=0.078, global_step=4325.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3218/5971 [28:08<24:04,  1.91it/s, loss=0.263, v_num=0, train/loss_simple_step=0.00782, train/loss_vlb_step=3.77e-5, train/loss_step=0.00782, global_step=4325.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3219/5971 [28:09<24:04,  1.91it/s, loss=0.291, v_num=0, train/loss_simple_step=0.655, train/loss_vlb_step=0.00886, train/loss_step=0.655, global_step=4325.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  54%|█████▍    | 3220/5971 [28:12<24:05,  1.90it/s, loss=0.291, v_num=0, train/loss_simple_step=0.655, train/loss_vlb_step=0.00886, train/loss_step=0.655, global_step=4325.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3220/5971 [28:12<24:05,  1.90it/s, loss=0.267, v_num=0, train/loss_simple_step=0.236, train/loss_vlb_step=0.000962, train/loss_step=0.236, global_step=4325.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3221/5971 [28:13<24:05,  1.90it/s, loss=0.258, v_num=0, train/loss_simple_step=0.045, train/loss_vlb_step=0.000158, train/loss_step=0.045, global_step=4326.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3222/5971 [28:14<24:05,  1.90it/s, loss=0.267, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000609, train/loss_step=0.175, global_step=4326.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3223/5971 [28:15<24:04,  1.90it/s, loss=0.248, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000641, train/loss_step=0.175, global_step=4326.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3224/5971 [28:17<24:05,  1.90it/s, loss=0.248, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000641, train/loss_step=0.175, global_step=4326.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3224/5971 [28:17<24:05,  1.90it/s, loss=0.216, v_num=0, train/loss_simple_step=0.00243, train/loss_vlb_step=1.38e-5, train/loss_step=0.00243, global_step=4326.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3225/5971 [28:18<24:05,  1.90it/s, loss=0.193, v_num=0, train/loss_simple_step=0.00886, train/loss_vlb_step=4.07e-5, train/loss_step=0.00886, global_step=4327.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3226/5971 [28:19<24:05,  1.90it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0428, train/loss_vlb_step=0.000145, train/loss_step=0.0428, global_step=4327.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  54%|█████▍    | 3227/5971 [28:20<24:05,  1.90it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0151, train/loss_vlb_step=6.6e-5, train/loss_step=0.0151, global_step=4327.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  54%|█████▍    | 3228/5971 [28:22<24:06,  1.90it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0151, train/loss_vlb_step=6.6e-5, train/loss_step=0.0151, global_step=4327.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3228/5971 [28:22<24:06,  1.90it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0206, train/loss_vlb_step=8.08e-5, train/loss_step=0.0206, global_step=4327.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3229/5971 [28:23<24:05,  1.90it/s, loss=0.176, v_num=0, train/loss_simple_step=0.275, train/loss_vlb_step=0.0012, train/loss_step=0.275, global_step=4328.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  54%|█████▍    | 3230/5971 [28:24<24:05,  1.90it/s, loss=0.168, v_num=0, train/loss_simple_step=0.344, train/loss_vlb_step=0.00153, train/loss_step=0.344, global_step=4328.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3231/5971 [28:25<24:05,  1.90it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0134, train/loss_vlb_step=5.59e-5, train/loss_step=0.0134, global_step=4328.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3232/5971 [28:27<24:06,  1.89it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0134, train/loss_vlb_step=5.59e-5, train/loss_step=0.0134, global_step=4328.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3232/5971 [28:27<24:06,  1.89it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00175, train/loss_vlb_step=1.05e-5, train/loss_step=0.00175, global_step=4328.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3233/5971 [28:28<24:06,  1.89it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.1e-5, train/loss_step=0.00382, global_step=4329.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  54%|█████▍    | 3234/5971 [28:28<24:05,  1.89it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00392, train/loss_vlb_step=1.98e-5, train/loss_step=0.00392, global_step=4329.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3235/5971 [28:29<24:05,  1.89it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00425, train/loss_vlb_step=2.21e-5, train/loss_step=0.00425, global_step=4329.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3236/5971 [28:31<24:06,  1.89it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00425, train/loss_vlb_step=2.21e-5, train/loss_step=0.00425, global_step=4329.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3236/5971 [28:31<24:06,  1.89it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0398, train/loss_vlb_step=0.00015, train/loss_step=0.0398, global_step=4329.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  54%|█████▍    | 3237/5971 [28:32<24:06,  1.89it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=5.08e-5, train/loss_step=0.0114, global_step=4330.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3238/5971 [28:33<24:06,  1.89it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0963, train/loss_vlb_step=0.000316, train/loss_step=0.0963, global_step=4330.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3239/5971 [28:34<24:05,  1.89it/s, loss=0.0828, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000478, train/loss_step=0.143, global_step=4330.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  54%|█████▍    | 3240/5971 [28:36<24:06,  1.89it/s, loss=0.0828, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000478, train/loss_step=0.143, global_step=4330.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3240/5971 [28:36<24:06,  1.89it/s, loss=0.0934, v_num=0, train/loss_simple_step=0.447, train/loss_vlb_step=0.00301, train/loss_step=0.447, global_step=4330.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  54%|█████▍    | 3241/5971 [28:37<24:06,  1.89it/s, loss=0.0972, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000397, train/loss_step=0.121, global_step=4331.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3242/5971 [28:38<24:06,  1.89it/s, loss=0.0897, v_num=0, train/loss_simple_step=0.0256, train/loss_vlb_step=0.000104, train/loss_step=0.0256, global_step=4331.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3243/5971 [28:39<24:05,  1.89it/s, loss=0.0813, v_num=0, train/loss_simple_step=0.00643, train/loss_vlb_step=3.11e-5, train/loss_step=0.00643, global_step=4331.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3244/5971 [28:41<24:06,  1.88it/s, loss=0.0813, v_num=0, train/loss_simple_step=0.00643, train/loss_vlb_step=3.11e-5, train/loss_step=0.00643, global_step=4331.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3244/5971 [28:41<24:06,  1.88it/s, loss=0.0813, v_num=0, train/loss_simple_step=0.00274, train/loss_vlb_step=1.53e-5, train/loss_step=0.00274, global_step=4331.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3245/5971 [28:42<24:06,  1.88it/s, loss=0.0848, v_num=0, train/loss_simple_step=0.0799, train/loss_vlb_step=0.000266, train/loss_step=0.0799, global_step=4332.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  54%|█████▍    | 3246/5971 [28:43<24:06,  1.88it/s, loss=0.0828, v_num=0, train/loss_simple_step=0.00307, train/loss_vlb_step=1.73e-5, train/loss_step=0.00307, global_step=4332.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3247/5971 [28:44<24:06,  1.88it/s, loss=0.083, v_num=0, train/loss_simple_step=0.0171, train/loss_vlb_step=7.2e-5, train/loss_step=0.0171, global_step=4332.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  54%|█████▍    | 3248/5971 [28:46<24:07,  1.88it/s, loss=0.083, v_num=0, train/loss_simple_step=0.0171, train/loss_vlb_step=7.2e-5, train/loss_step=0.0171, global_step=4332.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3248/5971 [28:46<24:07,  1.88it/s, loss=0.0876, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000396, train/loss_step=0.113, global_step=4332.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3249/5971 [28:47<24:06,  1.88it/s, loss=0.0814, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000549, train/loss_step=0.150, global_step=4333.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3250/5971 [28:48<24:06,  1.88it/s, loss=0.0824, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00196, train/loss_step=0.364, global_step=4333.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  54%|█████▍    | 3251/5971 [28:49<24:06,  1.88it/s, loss=0.083, v_num=0, train/loss_simple_step=0.0263, train/loss_vlb_step=9.91e-5, train/loss_step=0.0263, global_step=4333.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3252/5971 [28:51<24:07,  1.88it/s, loss=0.083, v_num=0, train/loss_simple_step=0.0263, train/loss_vlb_step=9.91e-5, train/loss_step=0.0263, global_step=4333.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3252/5971 [28:51<24:07,  1.88it/s, loss=0.0842, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000108, train/loss_step=0.026, global_step=4333.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3253/5971 [28:52<24:06,  1.88it/s, loss=0.0898, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000379, train/loss_step=0.115, global_step=4334.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  54%|█████▍    | 3254/5971 [28:53<24:06,  1.88it/s, loss=0.0982, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000571, train/loss_step=0.173, global_step=4334.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▍    | 3255/5971 [28:54<24:06,  1.88it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000169, train/loss_step=0.0497, global_step=4334.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▍    | 3256/5971 [28:56<24:07,  1.88it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0497, train/loss_vlb_step=0.000169, train/loss_step=0.0497, global_step=4334.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▍    | 3256/5971 [28:56<24:07,  1.88it/s, loss=0.104, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000336, train/loss_step=0.102, global_step=4334.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  55%|█████▍    | 3257/5971 [28:57<24:07,  1.88it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.12e-5, train/loss_step=0.00183, global_step=4335.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▍    | 3258/5971 [28:58<24:06,  1.87it/s, loss=0.0992, v_num=0, train/loss_simple_step=0.0179, train/loss_vlb_step=7.53e-5, train/loss_step=0.0179, global_step=4335.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  55%|█████▍    | 3259/5971 [28:59<24:06,  1.87it/s, loss=0.0953, v_num=0, train/loss_simple_step=0.0635, train/loss_vlb_step=0.000217, train/loss_step=0.0635, global_step=4335.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▍    | 3260/5971 [29:01<24:07,  1.87it/s, loss=0.0953, v_num=0, train/loss_simple_step=0.0635, train/loss_vlb_step=0.000217, train/loss_step=0.0635, global_step=4335.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▍    | 3260/5971 [29:01<24:07,  1.87it/s, loss=0.0844, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.000897, train/loss_step=0.230, global_step=4335.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  55%|█████▍    | 3261/5971 [29:02<24:07,  1.87it/s, loss=0.0931, v_num=0, train/loss_simple_step=0.294, train/loss_vlb_step=0.00136, train/loss_step=0.294, global_step=4336.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  55%|█████▍    | 3262/5971 [29:03<24:07,  1.87it/s, loss=0.102, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000769, train/loss_step=0.209, global_step=4336.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▍    | 3263/5971 [29:03<24:06,  1.87it/s, loss=0.107, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000364, train/loss_step=0.111, global_step=4336.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▍    | 3264/5971 [29:06<24:07,  1.87it/s, loss=0.107, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000364, train/loss_step=0.111, global_step=4336.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▍    | 3264/5971 [29:06<24:07,  1.87it/s, loss=0.128, v_num=0, train/loss_simple_step=0.419, train/loss_vlb_step=0.0034, train/loss_step=0.419, global_step=4336.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  55%|█████▍    | 3265/5971 [29:07<24:07,  1.87it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00579, train/loss_vlb_step=2.87e-5, train/loss_step=0.00579, global_step=4337.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▍    | 3266/5971 [29:08<24:07,  1.87it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0221, train/loss_vlb_step=8.68e-5, train/loss_step=0.0221, global_step=4337.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  55%|█████▍    | 3267/5971 [29:08<24:07,  1.87it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0123, train/loss_vlb_step=5.72e-5, train/loss_step=0.0123, global_step=4337.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▍    | 3268/5971 [29:11<24:08,  1.87it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0123, train/loss_vlb_step=5.72e-5, train/loss_step=0.0123, global_step=4337.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▍    | 3268/5971 [29:11<24:08,  1.87it/s, loss=0.156, v_num=0, train/loss_simple_step=0.725, train/loss_vlb_step=0.0177, train/loss_step=0.725, global_step=4337.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  55%|█████▍    | 3269/5971 [29:12<24:07,  1.87it/s, loss=0.155, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000436, train/loss_step=0.131, global_step=4338.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▍    | 3270/5971 [29:13<24:07,  1.87it/s, loss=0.149, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.00111, train/loss_step=0.243, global_step=4338.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  55%|█████▍    | 3271/5971 [29:13<24:07,  1.87it/s, loss=0.154, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000394, train/loss_step=0.120, global_step=4338.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▍    | 3272/5971 [29:16<24:08,  1.86it/s, loss=0.154, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000394, train/loss_step=0.120, global_step=4338.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▍    | 3272/5971 [29:16<24:08,  1.86it/s, loss=0.171, v_num=0, train/loss_simple_step=0.381, train/loss_vlb_step=0.00237, train/loss_step=0.381, global_step=4338.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  55%|█████▍    | 3273/5971 [29:16<24:07,  1.86it/s, loss=0.172, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000442, train/loss_step=0.135, global_step=4339.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▍    | 3274/5971 [29:17<24:07,  1.86it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0019, train/loss_vlb_step=1.12e-5, train/loss_step=0.0019, global_step=4339.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▍    | 3275/5971 [29:18<24:07,  1.86it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=6.79e-5, train/loss_step=0.0156, global_step=4339.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▍    | 3276/5971 [29:21<24:08,  1.86it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=6.79e-5, train/loss_step=0.0156, global_step=4339.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▍    | 3276/5971 [29:21<24:08,  1.86it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00352, train/loss_vlb_step=1.91e-5, train/loss_step=0.00352, global_step=4339.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▍    | 3277/5971 [29:22<24:08,  1.86it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00702, train/loss_vlb_step=3.49e-5, train/loss_step=0.00702, global_step=4340.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▍    | 3278/5971 [29:22<24:07,  1.86it/s, loss=0.168, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.001, train/loss_step=0.234, global_step=4340.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]      
Epoch 7:  55%|█████▍    | 3279/5971 [29:23<24:07,  1.86it/s, loss=0.175, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.0007, train/loss_step=0.194, global_step=4340.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▍    | 3280/5971 [29:25<24:08,  1.86it/s, loss=0.175, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.0007, train/loss_step=0.194, global_step=4340.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▍    | 3280/5971 [29:25<24:08,  1.86it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00457, train/loss_vlb_step=2.23e-5, train/loss_step=0.00457, global_step=4340.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▍    | 3281/5971 [29:26<24:08,  1.86it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00823, train/loss_vlb_step=3.83e-5, train/loss_step=0.00823, global_step=4341.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▍    | 3282/5971 [29:27<24:07,  1.86it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00521, train/loss_vlb_step=2.82e-5, train/loss_step=0.00521, global_step=4341.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▍    | 3283/5971 [29:28<24:07,  1.86it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00394, train/loss_vlb_step=2.02e-5, train/loss_step=0.00394, global_step=4341.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▍    | 3284/5971 [29:30<24:08,  1.86it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00394, train/loss_vlb_step=2.02e-5, train/loss_step=0.00394, global_step=4341.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▍    | 3284/5971 [29:30<24:08,  1.86it/s, loss=0.13, v_num=0, train/loss_simple_step=0.343, train/loss_vlb_step=0.00175, train/loss_step=0.343, global_step=4341.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]     
Epoch 7:  55%|█████▌    | 3285/5971 [29:31<24:08,  1.85it/s, loss=0.14, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000699, train/loss_step=0.201, global_step=4342.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▌    | 3286/5971 [29:32<24:07,  1.85it/s, loss=0.147, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000568, train/loss_step=0.166, global_step=4342.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▌    | 3287/5971 [29:33<24:07,  1.85it/s, loss=0.162, v_num=0, train/loss_simple_step=0.312, train/loss_vlb_step=0.00204, train/loss_step=0.312, global_step=4342.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  55%|█████▌    | 3288/5971 [29:35<24:08,  1.85it/s, loss=0.162, v_num=0, train/loss_simple_step=0.312, train/loss_vlb_step=0.00204, train/loss_step=0.312, global_step=4342.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▌    | 3288/5971 [29:35<24:08,  1.85it/s, loss=0.134, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.000593, train/loss_step=0.176, global_step=4342.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▌    | 3289/5971 [29:36<24:08,  1.85it/s, loss=0.139, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000798, train/loss_step=0.217, global_step=4343.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▌    | 3290/5971 [29:37<24:07,  1.85it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0681, train/loss_vlb_step=0.000227, train/loss_step=0.0681, global_step=4343.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▌    | 3291/5971 [29:38<24:07,  1.85it/s, loss=0.126, v_num=0, train/loss_simple_step=0.035, train/loss_vlb_step=0.000133, train/loss_step=0.035, global_step=4343.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  55%|█████▌    | 3292/5971 [29:40<24:08,  1.85it/s, loss=0.126, v_num=0, train/loss_simple_step=0.035, train/loss_vlb_step=0.000133, train/loss_step=0.035, global_step=4343.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▌    | 3292/5971 [29:40<24:08,  1.85it/s, loss=0.107, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.19e-5, train/loss_step=0.018, global_step=4343.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  55%|█████▌    | 3293/5971 [29:41<24:08,  1.85it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00334, train/loss_vlb_step=1.73e-5, train/loss_step=0.00334, global_step=4344.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▌    | 3294/5971 [29:42<24:07,  1.85it/s, loss=0.113, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.00103, train/loss_step=0.247, global_step=4344.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  55%|█████▌    | 3295/5971 [29:42<24:07,  1.85it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0881, train/loss_vlb_step=0.00029, train/loss_step=0.0881, global_step=4344.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▌    | 3296/5971 [29:45<24:08,  1.85it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0881, train/loss_vlb_step=0.00029, train/loss_step=0.0881, global_step=4344.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▌    | 3296/5971 [29:45<24:08,  1.85it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000152, train/loss_step=0.0423, global_step=4344.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▌    | 3297/5971 [29:46<24:08,  1.85it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0316, train/loss_vlb_step=0.000117, train/loss_step=0.0316, global_step=4345.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  55%|█████▌    | 3298/5971 [29:47<24:08,  1.85it/s, loss=0.134, v_num=0, train/loss_simple_step=0.514, train/loss_vlb_step=0.00386, train/loss_step=0.514, global_step=4345.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  55%|█████▌    | 3299/5971 [29:48<24:07,  1.85it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0364, train/loss_vlb_step=0.000128, train/loss_step=0.0364, global_step=4345.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▌    | 3300/5971 [29:50<24:08,  1.84it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0364, train/loss_vlb_step=0.000128, train/loss_step=0.0364, global_step=4345.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▌    | 3300/5971 [29:50<24:08,  1.84it/s, loss=0.126, v_num=0, train/loss_simple_step=0.002, train/loss_vlb_step=1.14e-5, train/loss_step=0.002, global_step=4345.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  55%|█████▌    | 3301/5971 [29:51<24:08,  1.84it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0271, train/loss_vlb_step=0.000103, train/loss_step=0.0271, global_step=4346.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▌    | 3302/5971 [29:51<24:07,  1.84it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00936, train/loss_vlb_step=4.37e-5, train/loss_step=0.00936, global_step=4346.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▌    | 3303/5971 [29:52<24:07,  1.84it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0346, train/loss_vlb_step=0.000125, train/loss_step=0.0346, global_step=4346.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  55%|█████▌    | 3304/5971 [29:54<24:08,  1.84it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0346, train/loss_vlb_step=0.000125, train/loss_step=0.0346, global_step=4346.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▌    | 3304/5971 [29:54<24:08,  1.84it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0279, train/loss_vlb_step=0.000106, train/loss_step=0.0279, global_step=4346.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▌    | 3305/5971 [29:55<24:08,  1.84it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0396, train/loss_vlb_step=0.000146, train/loss_step=0.0396, global_step=4347.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▌    | 3306/5971 [29:56<24:07,  1.84it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0948, train/loss_vlb_step=0.000318, train/loss_step=0.0948, global_step=4347.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▌    | 3307/5971 [29:57<24:07,  1.84it/s, loss=0.0861, v_num=0, train/loss_simple_step=0.00979, train/loss_vlb_step=4.86e-5, train/loss_step=0.00979, global_step=4347.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▌    | 3308/5971 [29:59<24:08,  1.84it/s, loss=0.0861, v_num=0, train/loss_simple_step=0.00979, train/loss_vlb_step=4.86e-5, train/loss_step=0.00979, global_step=4347.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▌    | 3308/5971 [29:59<24:08,  1.84it/s, loss=0.0778, v_num=0, train/loss_simple_step=0.0106, train/loss_vlb_step=4.47e-5, train/loss_step=0.0106, global_step=4347.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  55%|█████▌    | 3309/5971 [30:00<24:08,  1.84it/s, loss=0.0799, v_num=0, train/loss_simple_step=0.260, train/loss_vlb_step=0.00115, train/loss_step=0.260, global_step=4348.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  55%|█████▌    | 3310/5971 [30:01<24:07,  1.84it/s, loss=0.102, v_num=0, train/loss_simple_step=0.503, train/loss_vlb_step=0.00333, train/loss_step=0.503, global_step=4348.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  55%|█████▌    | 3311/5971 [30:02<24:07,  1.84it/s, loss=0.107, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000499, train/loss_step=0.142, global_step=4348.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▌    | 3312/5971 [30:04<24:08,  1.84it/s, loss=0.107, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000499, train/loss_step=0.142, global_step=4348.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  55%|█████▌    | 3312/5971 [30:04<24:08,  1.84it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0217, train/loss_vlb_step=9e-5, train/loss_step=0.0217, global_step=4348.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  55%|█████▌    | 3313/5971 [30:05<24:08,  1.84it/s, loss=0.113, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.00037, train/loss_step=0.112, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  56%|█████▌    | 3314/5971 [30:06<24:07,  1.84it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00602, train/loss_vlb_step=2.99e-5, train/loss_step=0.00602, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  56%|█████▌    | 3315/5971 [30:07<24:07,  1.83it/s, loss=0.0964, v_num=0, train/loss_simple_step=0.00477, train/loss_vlb_step=2.41e-5, train/loss_step=0.00477, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  56%|█████▌    | 3316/5971 [30:09<24:08,  1.83it/s, loss=0.0964, v_num=0, train/loss_simple_step=0.00477, train/loss_vlb_step=2.41e-5, train/loss_step=0.00477, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  56%|█████▌    | 3316/5971 [30:09<24:08,  1.83it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Validating: 0it [00:00, ?it/s]
Validating:   0%|          | 0/167 [00:00<?, ?it/s]
Validating:   1%|          | 1/167 [00:00<01:14,  2.23it/s]
Validating:   1%|          | 2/167 [00:00<00:41,  3.94it/s]
Epoch 7:  56%|█████▌    | 3320/5971 [30:10<24:04,  1.83it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Validating:   3%|▎         | 5/167 [00:00<00:16, 10.06it/s]
Epoch 7:  56%|█████▌    | 3324/5971 [30:10<24:01,  1.84it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Validating:   5%|▍         | 8/167 [00:00<00:11, 14.32it/s]
Validating:   7%|▋         | 11/167 [00:00<00:08, 18.15it/s]
Epoch 7:  56%|█████▌    | 3328/5971 [30:10<23:57,  1.84it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Validating:   8%|▊         | 14/167 [00:01<00:07, 20.64it/s]
Epoch 7:  56%|█████▌    | 3332/5971 [30:10<23:53,  1.84it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Validating:  10%|█         | 17/167 [00:01<00:06, 21.99it/s]
Epoch 7:  56%|█████▌    | 3336/5971 [30:10<23:49,  1.84it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.10it/s]
Validating:  14%|█▍        | 23/167 [00:01<00:06, 23.69it/s]
Epoch 7:  56%|█████▌    | 3340/5971 [30:10<23:46,  1.84it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Validating:  16%|█▌        | 26/167 [00:01<00:05, 25.29it/s]
Epoch 7:  56%|█████▌    | 3344/5971 [30:11<23:42,  1.85it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Validating:  17%|█▋        | 29/167 [00:01<00:05, 24.94it/s]
Epoch 7:  56%|█████▌    | 3348/5971 [30:11<23:38,  1.85it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Validating:  19%|█▉        | 32/167 [00:01<00:05, 26.19it/s]
Validating:  21%|██        | 35/167 [00:01<00:04, 26.63it/s]
Epoch 7:  56%|█████▌    | 3352/5971 [30:11<23:34,  1.85it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Validating:  23%|██▎       | 38/167 [00:01<00:04, 27.34it/s]
Epoch 7:  56%|█████▌    | 3356/5971 [30:11<23:31,  1.85it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 27.94it/s][A
Epoch 7:  56%|█████▋    | 3360/5971 [30:11<23:27,  1.86it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 28.40it/s][A

Validating:  28%|██▊       | 47/167 [00:02<00:04, 28.22it/s][A
Epoch 7:  56%|█████▋    | 3364/5971 [30:11<23:23,  1.86it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  31%|███       | 51/167 [00:02<00:03, 29.01it/s][A
Epoch 7:  56%|█████▋    | 3368/5971 [30:11<23:19,  1.86it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  32%|███▏      | 54/167 [00:02<00:03, 28.85it/s][A
Epoch 7:  56%|█████▋    | 3372/5971 [30:12<23:16,  1.86it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 27.05it/s][A
Epoch 7:  57%|█████▋    | 3376/5971 [30:12<23:12,  1.86it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  36%|███▌      | 60/167 [00:02<00:03, 27.38it/s][A

Validating:  38%|███▊      | 63/167 [00:02<00:04, 25.21it/s][A
Epoch 7:  57%|█████▋    | 3380/5971 [30:12<23:08,  1.87it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  40%|███▉      | 66/167 [00:02<00:03, 25.57it/s][A
Epoch 7:  57%|█████▋    | 3384/5971 [30:12<23:05,  1.87it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 25.85it/s][A
Epoch 7:  57%|█████▋    | 3388/5971 [30:12<23:01,  1.87it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 26.17it/s][A

Validating:  45%|████▍     | 75/167 [00:03<00:03, 26.57it/s][A
Epoch 7:  57%|█████▋    | 3392/5971 [30:12<22:57,  1.87it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 26.09it/s][A
Epoch 7:  57%|█████▋    | 3396/5971 [30:13<22:54,  1.87it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 25.73it/s][A
Epoch 7:  57%|█████▋    | 3400/5971 [30:13<22:50,  1.88it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  50%|█████     | 84/167 [00:03<00:03, 26.52it/s][A

Validating:  52%|█████▏    | 87/167 [00:03<00:03, 26.56it/s][A
Epoch 7:  57%|█████▋    | 3404/5971 [30:13<22:47,  1.88it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  54%|█████▍    | 90/167 [00:03<00:02, 26.93it/s][A
Epoch 7:  57%|█████▋    | 3408/5971 [30:13<22:43,  1.88it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  56%|█████▌    | 93/167 [00:03<00:02, 27.43it/s][A
Epoch 7:  57%|█████▋    | 3412/5971 [30:13<22:39,  1.88it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 28.02it/s][A

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 27.37it/s][A
Epoch 7:  57%|█████▋    | 3416/5971 [30:13<22:36,  1.88it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  61%|██████    | 102/167 [00:04<00:02, 27.37it/s][A
Epoch 7:  57%|█████▋    | 3420/5971 [30:13<22:32,  1.89it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 27.25it/s][A
Epoch 7:  57%|█████▋    | 3424/5971 [30:14<22:29,  1.89it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 25.99it/s][A
Epoch 7:  57%|█████▋    | 3428/5971 [30:14<22:25,  1.89it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  67%|██████▋   | 112/167 [00:04<00:02, 26.57it/s][A

Validating:  69%|██████▉   | 115/167 [00:04<00:01, 27.22it/s][A
Epoch 7:  57%|█████▋    | 3432/5971 [30:14<22:21,  1.89it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  71%|███████   | 118/167 [00:04<00:01, 27.15it/s][A
Epoch 7:  58%|█████▊    | 3436/5971 [30:14<22:18,  1.89it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 28.13it/s][A
Epoch 7:  58%|█████▊    | 3440/5971 [30:14<22:14,  1.90it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 28.57it/s][A
Epoch 7:  58%|█████▊    | 3444/5971 [30:14<22:11,  1.90it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 27.81it/s][A

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 28.09it/s][A
Epoch 7:  58%|█████▊    | 3448/5971 [30:14<22:07,  1.90it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  80%|████████  | 134/167 [00:05<00:01, 26.73it/s][A
Epoch 7:  58%|█████▊    | 3452/5971 [30:15<22:04,  1.90it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 28.02it/s][A
Epoch 7:  58%|█████▊    | 3456/5971 [30:15<22:00,  1.90it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  84%|████████▍ | 141/167 [00:05<00:00, 28.27it/s][A
Epoch 7:  58%|█████▊    | 3460/5971 [30:15<21:57,  1.91it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  86%|████████▌ | 144/167 [00:05<00:00, 27.84it/s][A
Epoch 7:  58%|█████▊    | 3464/5971 [30:15<21:53,  1.91it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  89%|████████▊ | 148/167 [00:05<00:00, 28.25it/s][A

Validating:  90%|█████████ | 151/167 [00:06<00:00, 28.35it/s][A
Epoch 7:  58%|█████▊    | 3468/5971 [30:15<21:50,  1.91it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 28.63it/s][A
Epoch 7:  58%|█████▊    | 3472/5971 [30:15<21:46,  1.91it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 28.22it/s][A
Epoch 7:  58%|█████▊    | 3476/5971 [30:15<21:43,  1.91it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 26.82it/s][A
Epoch 7:  58%|█████▊    | 3480/5971 [30:16<21:39,  1.92it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 27.07it/s][A

Validating: 100%|██████████| 167/167 [00:06<00:00, 27.29it/s][A
Epoch 7:  58%|█████▊    | 3484/5971 [30:16<21:36,  1.92it/s, loss=0.106, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000951, train/loss_step=0.224, global_step=4349.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Epoch 7:  58%|█████▊    | 3485/5971 [30:17<21:36,  1.92it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0405, train/loss_vlb_step=0.000141, train/loss_step=0.0405, global_step=4350.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  58%|█████▊    | 3486/5971 [30:18<21:35,  1.92it/s, loss=0.0859, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000372, train/loss_step=0.113, global_step=4350.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  58%|█████▊    | 3487/5971 [30:19<21:35,  1.92it/s, loss=0.107, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.00602, train/loss_step=0.454, global_step=4350.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  58%|█████▊    | 3488/5971 [30:21<21:36,  1.92it/s, loss=0.107, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.00602, train/loss_step=0.454, global_step=4350.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  58%|█████▊    | 3488/5971 [30:21<21:36,  1.92it/s, loss=0.114, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000469, train/loss_step=0.140, global_step=4350.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  58%|█████▊    | 3489/5971 [30:22<21:36,  1.91it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0112, train/loss_vlb_step=5.06e-5, train/loss_step=0.0112, global_step=4351.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  58%|█████▊    | 3490/5971 [30:23<21:36,  1.91it/s, loss=0.132, v_num=0, train/loss_simple_step=0.395, train/loss_vlb_step=0.0027, train/loss_step=0.395, global_step=4351.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  58%|█████▊    | 3491/5971 [30:24<21:35,  1.91it/s, loss=0.139, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.00055, train/loss_step=0.162, global_step=4351.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  58%|█████▊    | 3492/5971 [30:26<21:36,  1.91it/s, loss=0.139, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.00055, train/loss_step=0.162, global_step=4351.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  58%|█████▊    | 3492/5971 [30:26<21:36,  1.91it/s, loss=0.144, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000524, train/loss_step=0.147, global_step=4351.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  58%|█████▊    | 3493/5971 [30:27<21:36,  1.91it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0861, train/loss_vlb_step=0.000289, train/loss_step=0.0861, global_step=4352.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▊    | 3494/5971 [30:28<21:36,  1.91it/s, loss=0.148, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000363, train/loss_step=0.109, global_step=4352.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  59%|█████▊    | 3495/5971 [30:29<21:35,  1.91it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00204, train/loss_vlb_step=1.2e-5, train/loss_step=0.00204, global_step=4352.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▊    | 3496/5971 [30:31<21:36,  1.91it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00204, train/loss_vlb_step=1.2e-5, train/loss_step=0.00204, global_step=4352.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▊    | 3496/5971 [30:31<21:36,  1.91it/s, loss=0.157, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000709, train/loss_step=0.203, global_step=4352.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  59%|█████▊    | 3497/5971 [30:32<21:36,  1.91it/s, loss=0.15, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000404, train/loss_step=0.122, global_step=4353.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  59%|█████▊    | 3498/5971 [30:33<21:35,  1.91it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0056, train/loss_vlb_step=2.78e-5, train/loss_step=0.0056, global_step=4353.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▊    | 3499/5971 [30:34<21:35,  1.91it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00666, train/loss_vlb_step=3.2e-5, train/loss_step=0.00666, global_step=4353.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▊    | 3500/5971 [30:36<21:36,  1.91it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00666, train/loss_vlb_step=3.2e-5, train/loss_step=0.00666, global_step=4353.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▊    | 3500/5971 [30:36<21:36,  1.91it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0076, train/loss_vlb_step=3.48e-5, train/loss_step=0.0076, global_step=4353.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  59%|█████▊    | 3501/5971 [30:37<21:35,  1.91it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00251, train/loss_vlb_step=1.36e-5, train/loss_step=0.00251, global_step=4354.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▊    | 3502/5971 [30:38<21:35,  1.91it/s, loss=0.139, v_num=0, train/loss_simple_step=0.536, train/loss_vlb_step=0.00422, train/loss_step=0.536, global_step=4354.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  59%|█████▊    | 3503/5971 [30:39<21:35,  1.91it/s, loss=0.148, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000654, train/loss_step=0.189, global_step=4354.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▊    | 3504/5971 [30:41<21:35,  1.90it/s, loss=0.148, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000654, train/loss_step=0.189, global_step=4354.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▊    | 3504/5971 [30:41<21:35,  1.90it/s, loss=0.171, v_num=0, train/loss_simple_step=0.679, train/loss_vlb_step=0.0173, train/loss_step=0.679, global_step=4354.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  59%|█████▊    | 3505/5971 [30:42<21:35,  1.90it/s, loss=0.175, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000455, train/loss_step=0.138, global_step=4355.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▊    | 3506/5971 [30:43<21:35,  1.90it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0342, train/loss_vlb_step=0.000124, train/loss_step=0.0342, global_step=4355.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▊    | 3507/5971 [30:43<21:35,  1.90it/s, loss=0.158, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.00066, train/loss_step=0.189, global_step=4355.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  59%|█████▉    | 3508/5971 [30:46<21:35,  1.90it/s, loss=0.158, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.00066, train/loss_step=0.189, global_step=4355.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3508/5971 [30:46<21:35,  1.90it/s, loss=0.157, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000411, train/loss_step=0.121, global_step=4355.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3509/5971 [30:46<21:35,  1.90it/s, loss=0.162, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000348, train/loss_step=0.105, global_step=4356.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3510/5971 [30:47<21:35,  1.90it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0123, train/loss_vlb_step=5.57e-5, train/loss_step=0.0123, global_step=4356.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3511/5971 [30:48<21:34,  1.90it/s, loss=0.135, v_num=0, train/loss_simple_step=0.000972, train/loss_vlb_step=5.73e-6, train/loss_step=0.000972, global_step=4356.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3512/5971 [30:51<21:35,  1.90it/s, loss=0.135, v_num=0, train/loss_simple_step=0.000972, train/loss_vlb_step=5.73e-6, train/loss_step=0.000972, global_step=4356.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3512/5971 [30:51<21:35,  1.90it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0579, train/loss_vlb_step=0.000201, train/loss_step=0.0579, global_step=4356.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  59%|█████▉    | 3513/5971 [30:52<21:35,  1.90it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0488, train/loss_vlb_step=0.000167, train/loss_step=0.0488, global_step=4357.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3514/5971 [30:53<21:35,  1.90it/s, loss=0.152, v_num=0, train/loss_simple_step=0.586, train/loss_vlb_step=0.00561, train/loss_step=0.586, global_step=4357.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  59%|█████▉    | 3515/5971 [30:54<21:35,  1.90it/s, loss=0.163, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000776, train/loss_step=0.209, global_step=4357.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3516/5971 [30:56<21:35,  1.89it/s, loss=0.163, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000776, train/loss_step=0.209, global_step=4357.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3516/5971 [30:56<21:35,  1.89it/s, loss=0.158, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000359, train/loss_step=0.108, global_step=4357.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3517/5971 [30:57<21:35,  1.89it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00163, train/loss_vlb_step=9.75e-6, train/loss_step=0.00163, global_step=4358.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3518/5971 [30:58<21:35,  1.89it/s, loss=0.162, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000763, train/loss_step=0.201, global_step=4358.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  59%|█████▉    | 3519/5971 [30:59<21:34,  1.89it/s, loss=0.172, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.000787, train/loss_step=0.220, global_step=4358.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3520/5971 [31:01<21:35,  1.89it/s, loss=0.172, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.000787, train/loss_step=0.220, global_step=4358.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3520/5971 [31:01<21:35,  1.89it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00387, train/loss_vlb_step=2.16e-5, train/loss_step=0.00387, global_step=4358.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3521/5971 [31:02<21:35,  1.89it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0247, train/loss_vlb_step=9.4e-5, train/loss_step=0.0247, global_step=4359.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  59%|█████▉    | 3522/5971 [31:02<21:34,  1.89it/s, loss=0.17, v_num=0, train/loss_simple_step=0.477, train/loss_vlb_step=0.0045, train/loss_step=0.477, global_step=4359.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  59%|█████▉    | 3523/5971 [31:03<21:34,  1.89it/s, loss=0.172, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000811, train/loss_step=0.225, global_step=4359.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3524/5971 [31:05<21:35,  1.89it/s, loss=0.172, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000811, train/loss_step=0.225, global_step=4359.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3524/5971 [31:05<21:35,  1.89it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00363, train/loss_vlb_step=1.86e-5, train/loss_step=0.00363, global_step=4359.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3525/5971 [31:06<21:34,  1.89it/s, loss=0.139, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000504, train/loss_step=0.151, global_step=4360.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  59%|█████▉    | 3526/5971 [31:07<21:34,  1.89it/s, loss=0.146, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000565, train/loss_step=0.166, global_step=4360.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3527/5971 [31:08<21:34,  1.89it/s, loss=0.145, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.000584, train/loss_step=0.176, global_step=4360.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3528/5971 [31:10<21:34,  1.89it/s, loss=0.145, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.000584, train/loss_step=0.176, global_step=4360.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3528/5971 [31:10<21:34,  1.89it/s, loss=0.169, v_num=0, train/loss_simple_step=0.602, train/loss_vlb_step=0.0117, train/loss_step=0.602, global_step=4360.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  59%|█████▉    | 3529/5971 [31:11<21:34,  1.89it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0741, train/loss_vlb_step=0.000253, train/loss_step=0.0741, global_step=4361.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3530/5971 [31:12<21:34,  1.89it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00361, train/loss_vlb_step=1.89e-5, train/loss_step=0.00361, global_step=4361.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3531/5971 [31:13<21:34,  1.89it/s, loss=0.175, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000569, train/loss_step=0.166, global_step=4361.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  59%|█████▉    | 3532/5971 [31:15<21:34,  1.88it/s, loss=0.175, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000569, train/loss_step=0.166, global_step=4361.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3532/5971 [31:15<21:34,  1.88it/s, loss=0.181, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000577, train/loss_step=0.163, global_step=4361.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3533/5971 [31:16<21:34,  1.88it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00261, train/loss_vlb_step=1.47e-5, train/loss_step=0.00261, global_step=4362.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3534/5971 [31:17<21:34,  1.88it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0278, train/loss_vlb_step=0.000108, train/loss_step=0.0278, global_step=4362.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  59%|█████▉    | 3535/5971 [31:18<21:33,  1.88it/s, loss=0.141, v_num=0, train/loss_simple_step=0.027, train/loss_vlb_step=9.97e-5, train/loss_step=0.027, global_step=4362.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  59%|█████▉    | 3536/5971 [31:20<21:34,  1.88it/s, loss=0.141, v_num=0, train/loss_simple_step=0.027, train/loss_vlb_step=9.97e-5, train/loss_step=0.027, global_step=4362.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3536/5971 [31:20<21:34,  1.88it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0324, train/loss_vlb_step=0.000117, train/loss_step=0.0324, global_step=4362.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3537/5971 [31:21<21:34,  1.88it/s, loss=0.157, v_num=0, train/loss_simple_step=0.388, train/loss_vlb_step=0.00171, train/loss_step=0.388, global_step=4363.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  59%|█████▉    | 3538/5971 [31:22<21:33,  1.88it/s, loss=0.157, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.000798, train/loss_step=0.206, global_step=4363.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3539/5971 [31:23<21:33,  1.88it/s, loss=0.159, v_num=0, train/loss_simple_step=0.256, train/loss_vlb_step=0.00095, train/loss_step=0.256, global_step=4363.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  59%|█████▉    | 3540/5971 [31:25<21:34,  1.88it/s, loss=0.159, v_num=0, train/loss_simple_step=0.256, train/loss_vlb_step=0.00095, train/loss_step=0.256, global_step=4363.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3540/5971 [31:25<21:34,  1.88it/s, loss=0.174, v_num=0, train/loss_simple_step=0.300, train/loss_vlb_step=0.00132, train/loss_step=0.300, global_step=4363.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3541/5971 [31:26<21:34,  1.88it/s, loss=0.172, v_num=0, train/loss_simple_step=0.00244, train/loss_vlb_step=1.34e-5, train/loss_step=0.00244, global_step=4364.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3542/5971 [31:27<21:33,  1.88it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00264, train/loss_vlb_step=1.48e-5, train/loss_step=0.00264, global_step=4364.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3543/5971 [31:28<21:33,  1.88it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00159, train/loss_vlb_step=9.47e-6, train/loss_step=0.00159, global_step=4364.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3544/5971 [31:30<21:34,  1.88it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00159, train/loss_vlb_step=9.47e-6, train/loss_step=0.00159, global_step=4364.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3544/5971 [31:30<21:34,  1.88it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00908, train/loss_vlb_step=4.33e-5, train/loss_step=0.00908, global_step=4364.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3545/5971 [31:31<21:33,  1.87it/s, loss=0.145, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.00114, train/loss_step=0.289, global_step=4365.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  59%|█████▉    | 3546/5971 [31:32<21:33,  1.87it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0214, train/loss_vlb_step=8.41e-5, train/loss_step=0.0214, global_step=4365.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3547/5971 [31:32<21:33,  1.87it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00751, train/loss_vlb_step=3.61e-5, train/loss_step=0.00751, global_step=4365.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3548/5971 [31:35<21:33,  1.87it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00751, train/loss_vlb_step=3.61e-5, train/loss_step=0.00751, global_step=4365.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3548/5971 [31:35<21:33,  1.87it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0809, train/loss_vlb_step=0.000269, train/loss_step=0.0809, global_step=4365.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  59%|█████▉    | 3549/5971 [31:36<21:33,  1.87it/s, loss=0.132, v_num=0, train/loss_simple_step=0.648, train/loss_vlb_step=0.0086, train/loss_step=0.648, global_step=4366.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  59%|█████▉    | 3550/5971 [31:37<21:33,  1.87it/s, loss=0.148, v_num=0, train/loss_simple_step=0.338, train/loss_vlb_step=0.00164, train/loss_step=0.338, global_step=4366.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3551/5971 [31:37<21:33,  1.87it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0889, train/loss_vlb_step=0.000295, train/loss_step=0.0889, global_step=4366.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3552/5971 [31:40<21:33,  1.87it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0889, train/loss_vlb_step=0.000295, train/loss_step=0.0889, global_step=4366.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  59%|█████▉    | 3552/5971 [31:40<21:33,  1.87it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0262, train/loss_vlb_step=9.8e-5, train/loss_step=0.0262, global_step=4366.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  60%|█████▉    | 3553/5971 [31:41<21:33,  1.87it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0039, train/loss_vlb_step=2.01e-5, train/loss_step=0.0039, global_step=4367.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  60%|█████▉    | 3554/5971 [31:42<21:33,  1.87it/s, loss=0.162, v_num=0, train/loss_simple_step=0.511, train/loss_vlb_step=0.00557, train/loss_step=0.511, global_step=4367.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  60%|█████▉    | 3555/5971 [31:42<21:32,  1.87it/s, loss=0.177, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.00143, train/loss_step=0.322, global_step=4367.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  60%|█████▉    | 3556/5971 [31:45<21:33,  1.87it/s, loss=0.177, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.00143, train/loss_step=0.322, global_step=4367.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  60%|█████▉    | 3556/5971 [31:45<21:33,  1.87it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0214, train/loss_vlb_step=8.42e-5, train/loss_step=0.0214, global_step=4367.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  60%|█████▉    | 3557/5971 [31:45<21:33,  1.87it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0419, train/loss_vlb_step=0.000147, train/loss_step=0.0419, global_step=4368.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  60%|█████▉    | 3558/5971 [31:46<21:32,  1.87it/s, loss=0.158, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.000725, train/loss_step=0.194, global_step=4368.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  60%|█████▉    | 3559/5971 [31:47<21:32,  1.87it/s, loss=0.152, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=4368.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  60%|█████▉    | 3560/5971 [31:50<21:33,  1.86it/s, loss=0.152, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000408, train/loss_step=0.124, global_step=4368.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  60%|█████▉    | 3560/5971 [31:50<21:33,  1.86it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0045, train/loss_vlb_step=2.37e-5, train/loss_step=0.0045, global_step=4368.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  60%|█████▉    | 3561/5971 [31:51<21:32,  1.86it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0358, train/loss_vlb_step=0.000134, train/loss_step=0.0358, global_step=4369.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  60%|█████▉    | 3562/5971 [31:51<21:32,  1.86it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0883, train/loss_vlb_step=0.00029, train/loss_step=0.0883, global_step=4369.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  60%|█████▉    | 3563/5971 [31:52<21:32,  1.86it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0181, train/loss_vlb_step=7.78e-5, train/loss_step=0.0181, global_step=4369.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  60%|█████▉    | 3564/5971 [31:54<21:32,  1.86it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0181, train/loss_vlb_step=7.78e-5, train/loss_step=0.0181, global_step=4369.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  60%|█████▉    | 3564/5971 [31:54<21:32,  1.86it/s, loss=0.163, v_num=0, train/loss_simple_step=0.386, train/loss_vlb_step=0.00214, train/loss_step=0.386, global_step=4369.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  60%|█████▉    | 3565/5971 [31:55<21:32,  1.86it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0405, train/loss_vlb_step=0.000152, train/loss_step=0.0405, global_step=4370.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  60%|█████▉    | 3566/5971 [31:56<21:32,  1.86it/s, loss=0.173, v_num=0, train/loss_simple_step=0.471, train/loss_vlb_step=0.00456, train/loss_step=0.471, global_step=4370.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  60%|█████▉    | 3567/5971 [31:57<21:31,  1.86it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0201, train/loss_vlb_step=8.51e-5, train/loss_step=0.0201, global_step=4370.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  60%|█████▉    | 3568/5971 [31:59<21:32,  1.86it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0201, train/loss_vlb_step=8.51e-5, train/loss_step=0.0201, global_step=4370.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  60%|█████▉    | 3568/5971 [31:59<21:32,  1.86it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00656, train/loss_vlb_step=3.11e-5, train/loss_step=0.00656, global_step=4370.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  60%|█████▉    | 3569/5971 [32:00<21:32,  1.86it/s, loss=0.138, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.41e-5, train/loss_step=0.018, global_step=4371.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  60%|█████▉    | 3570/5971 [32:01<21:31,  1.86it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0507, train/loss_vlb_step=0.000173, train/loss_step=0.0507, global_step=4371.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  60%|█████▉    | 3571/5971 [32:02<21:31,  1.86it/s, loss=0.122, v_num=0, train/loss_simple_step=0.052, train/loss_vlb_step=0.000187, train/loss_step=0.052, global_step=4371.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  60%|█████▉    | 3572/5971 [32:04<21:32,  1.86it/s, loss=0.122, v_num=0, train/loss_simple_step=0.052, train/loss_vlb_step=0.000187, train/loss_step=0.052, global_step=4371.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  60%|█████▉    | 3572/5971 [32:04<21:32,  1.86it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0474, train/loss_vlb_step=0.00017, train/loss_step=0.0474, global_step=4371.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  60%|█████▉    | 3573/5971 [32:05<21:31,  1.86it/s, loss=0.155, v_num=0, train/loss_simple_step=0.651, train/loss_vlb_step=0.0106, train/loss_step=0.651, global_step=4372.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  60%|█████▉    | 3574/5971 [32:06<21:31,  1.86it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00258, train/loss_vlb_step=1.48e-5, train/loss_step=0.00258, global_step=4372.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  60%|█████▉    | 3575/5971 [32:07<21:31,  1.86it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0856, train/loss_vlb_step=0.000284, train/loss_step=0.0856, global_step=4372.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  60%|█████▉    | 3576/5971 [32:09<21:31,  1.85it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0856, train/loss_vlb_step=0.000284, train/loss_step=0.0856, global_step=4372.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  60%|█████▉    | 3576/5971 [32:09<21:31,  1.85it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0692, train/loss_vlb_step=0.000235, train/loss_step=0.0692, global_step=4372.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  60%|█████▉    | 3577/5971 [32:10<21:31,  1.85it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0928, train/loss_vlb_step=0.000305, train/loss_step=0.0928, global_step=4373.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  60%|█████▉    | 3578/5971 [32:11<21:31,  1.85it/s, loss=0.139, v_num=0, train/loss_simple_step=0.508, train/loss_vlb_step=0.00553, train/loss_step=0.508, global_step=4373.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  60%|█████▉    | 3579/5971 [32:12<21:30,  1.85it/s, loss=0.155, v_num=0, train/loss_simple_step=0.462, train/loss_vlb_step=0.00316, train/loss_step=0.462, global_step=4373.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  60%|█████▉    | 3580/5971 [32:14<21:31,  1.85it/s, loss=0.155, v_num=0, train/loss_simple_step=0.462, train/loss_vlb_step=0.00316, train/loss_step=0.462, global_step=4373.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  60%|█████▉    | 3580/5971 [32:14<21:31,  1.85it/s, loss=0.156, v_num=0, train/loss_simple_step=0.015, train/loss_vlb_step=6.38e-5, train/loss_step=0.015, global_step=4373.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  60%|█████▉    | 3581/5971 [32:15<21:31,  1.85it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.00014, train/loss_step=0.0386, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  60%|█████▉    | 3582/5971 [32:16<21:31,  1.85it/s, loss=0.18, v_num=0, train/loss_simple_step=0.564, train/loss_vlb_step=0.00458, train/loss_step=0.564, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  60%|██████    | 3583/5971 [32:17<21:30,  1.85it/s, loss=0.191, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000892, train/loss_step=0.240, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  60%|██████    | 3584/5971 [32:19<21:31,  1.85it/s, loss=0.191, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000892, train/loss_step=0.240, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  60%|██████    | 3584/5971 [32:19<21:31,  1.85it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:05,  2.54it/s][A

Validating:   1%|          | 2/167 [00:00<00:49,  3.36it/s][A
Epoch 7:  60%|██████    | 3588/5971 [32:20<21:28,  1.85it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   3%|▎         | 5/167 [00:00<00:18,  8.65it/s][A
Epoch 7:  60%|██████    | 3592/5971 [32:20<21:24,  1.85it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.28it/s][A

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.86it/s][A
Epoch 7:  60%|██████    | 3596/5971 [32:20<21:21,  1.85it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   8%|▊         | 14/167 [00:01<00:08, 18.75it/s][A
Epoch 7:  60%|██████    | 3600/5971 [32:20<21:17,  1.86it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  10%|█         | 17/167 [00:01<00:07, 20.22it/s][A
Epoch 7:  60%|██████    | 3604/5971 [32:20<21:14,  1.86it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 23.20it/s][A
Epoch 7:  60%|██████    | 3608/5971 [32:20<21:10,  1.86it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  14%|█▍        | 24/167 [00:01<00:06, 23.28it/s][A

Validating:  16%|█▌        | 27/167 [00:01<00:05, 24.07it/s][A
Epoch 7:  60%|██████    | 3612/5971 [32:21<21:07,  1.86it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 24.66it/s][A
Epoch 7:  61%|██████    | 3616/5971 [32:21<21:03,  1.86it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 24.06it/s][A
Epoch 7:  61%|██████    | 3620/5971 [32:21<21:00,  1.87it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  22%|██▏       | 36/167 [00:01<00:05, 24.44it/s][A

Validating:  23%|██▎       | 39/167 [00:02<00:05, 23.69it/s][A
Epoch 7:  61%|██████    | 3624/5971 [32:21<20:57,  1.87it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  25%|██▌       | 42/167 [00:02<00:05, 23.17it/s][A
Epoch 7:  61%|██████    | 3628/5971 [32:21<20:53,  1.87it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  27%|██▋       | 45/167 [00:02<00:05, 23.64it/s][A
Epoch 7:  61%|██████    | 3632/5971 [32:21<20:50,  1.87it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  29%|██▊       | 48/167 [00:02<00:05, 23.38it/s][A

Validating:  31%|███       | 51/167 [00:02<00:04, 24.98it/s][A
Epoch 7:  61%|██████    | 3636/5971 [32:22<20:46,  1.87it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 26.22it/s][A
Epoch 7:  61%|██████    | 3640/5971 [32:22<20:43,  1.87it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 26.33it/s][A
Epoch 7:  61%|██████    | 3644/5971 [32:22<20:40,  1.88it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  36%|███▌      | 60/167 [00:02<00:03, 27.22it/s][A

Validating:  38%|███▊      | 63/167 [00:03<00:03, 26.44it/s][A
Epoch 7:  61%|██████    | 3648/5971 [32:22<20:36,  1.88it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  40%|███▉      | 66/167 [00:03<00:03, 27.40it/s][A
Epoch 7:  61%|██████    | 3652/5971 [32:22<20:33,  1.88it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 25.48it/s][A
Epoch 7:  61%|██████    | 3656/5971 [32:22<20:29,  1.88it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 25.56it/s][A

Validating:  45%|████▍     | 75/167 [00:03<00:03, 26.35it/s][A
Epoch 7:  61%|██████▏   | 3660/5971 [32:22<20:26,  1.88it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 27.14it/s][A
Epoch 7:  61%|██████▏   | 3664/5971 [32:23<20:23,  1.89it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 27.90it/s][A
Epoch 7:  61%|██████▏   | 3668/5971 [32:23<20:19,  1.89it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  50%|█████     | 84/167 [00:03<00:02, 28.41it/s][A

Validating:  52%|█████▏    | 87/167 [00:03<00:02, 28.29it/s][A
Epoch 7:  61%|██████▏   | 3672/5971 [32:23<20:16,  1.89it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  54%|█████▍    | 90/167 [00:04<00:02, 28.62it/s][A
Epoch 7:  62%|██████▏   | 3676/5971 [32:23<20:13,  1.89it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 29.34it/s][A
Epoch 7:  62%|██████▏   | 3680/5971 [32:23<20:09,  1.89it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 28.75it/s][A
Epoch 7:  62%|██████▏   | 3684/5971 [32:23<20:06,  1.90it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 28.51it/s][A

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 27.48it/s][A
Epoch 7:  62%|██████▏   | 3688/5971 [32:23<20:03,  1.90it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 26.43it/s][A
Epoch 7:  62%|██████▏   | 3692/5971 [32:24<19:59,  1.90it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 26.74it/s][A
Epoch 7:  62%|██████▏   | 3696/5971 [32:24<19:56,  1.90it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  67%|██████▋   | 112/167 [00:04<00:02, 26.20it/s][A

Validating:  69%|██████▉   | 115/167 [00:04<00:02, 25.01it/s][A
Epoch 7:  62%|██████▏   | 3700/5971 [32:24<19:53,  1.90it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  71%|███████   | 118/167 [00:05<00:01, 26.06it/s][A
Epoch 7:  62%|██████▏   | 3704/5971 [32:24<19:49,  1.91it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 24.90it/s][A
Epoch 7:  62%|██████▏   | 3708/5971 [32:24<19:46,  1.91it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 26.02it/s][A
Epoch 7:  62%|██████▏   | 3712/5971 [32:24<19:43,  1.91it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 25.90it/s][A

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 26.26it/s][A
Epoch 7:  62%|██████▏   | 3716/5971 [32:25<19:39,  1.91it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  80%|████████  | 134/167 [00:05<00:01, 25.29it/s][A
Epoch 7:  62%|██████▏   | 3720/5971 [32:25<19:36,  1.91it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 25.92it/s][A
Epoch 7:  62%|██████▏   | 3724/5971 [32:25<19:33,  1.91it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  84%|████████▍ | 140/167 [00:05<00:01, 26.70it/s][A

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 26.68it/s][A
Epoch 7:  62%|██████▏   | 3728/5971 [32:25<19:30,  1.92it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 27.01it/s][A
Epoch 7:  63%|██████▎   | 3732/5971 [32:25<19:26,  1.92it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 27.52it/s][A
Epoch 7:  63%|██████▎   | 3736/5971 [32:25<19:23,  1.92it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 27.94it/s][A

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 28.08it/s][A
Epoch 7:  63%|██████▎   | 3740/5971 [32:25<19:20,  1.92it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 28.36it/s][A
Epoch 7:  63%|██████▎   | 3744/5971 [32:26<19:17,  1.92it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 27.25it/s][A
Epoch 7:  63%|██████▎   | 3748/5971 [32:26<19:14,  1.93it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 26.39it/s][A

Validating: 100%|██████████| 167/167 [00:06<00:00, 26.46it/s][A
Epoch 7:  63%|██████▎   | 3752/5971 [32:26<19:10,  1.93it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.49e-5, train/loss_step=0.0157, global_step=4374.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Epoch 7:  63%|██████▎   | 3753/5971 [32:27<19:10,  1.93it/s, loss=0.171, v_num=0, train/loss_simple_step=0.00694, train/loss_vlb_step=3.29e-5, train/loss_step=0.00694, global_step=4375.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  63%|██████▎   | 3754/5971 [32:28<19:10,  1.93it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00315, train/loss_vlb_step=1.7e-5, train/loss_step=0.00315, global_step=4375.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  63%|██████▎   | 3755/5971 [32:29<19:10,  1.93it/s, loss=0.162, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00205, train/loss_step=0.319, global_step=4375.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  63%|██████▎   | 3756/5971 [32:31<19:10,  1.92it/s, loss=0.162, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00205, train/loss_step=0.319, global_step=4375.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  63%|██████▎   | 3756/5971 [32:31<19:10,  1.92it/s, loss=0.164, v_num=0, train/loss_simple_step=0.034, train/loss_vlb_step=0.000123, train/loss_step=0.034, global_step=4375.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  63%|██████▎   | 3757/5971 [32:32<19:10,  1.92it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00966, train/loss_vlb_step=4.45e-5, train/loss_step=0.00966, global_step=4376.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  63%|██████▎   | 3758/5971 [32:33<19:10,  1.92it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0016, train/loss_vlb_step=9.55e-6, train/loss_step=0.0016, global_step=4376.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  63%|██████▎   | 3759/5971 [32:34<19:09,  1.92it/s, loss=0.178, v_num=0, train/loss_simple_step=0.398, train/loss_vlb_step=0.0033, train/loss_step=0.398, global_step=4376.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  63%|██████▎   | 3760/5971 [32:36<19:10,  1.92it/s, loss=0.178, v_num=0, train/loss_simple_step=0.398, train/loss_vlb_step=0.0033, train/loss_step=0.398, global_step=4376.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  63%|██████▎   | 3760/5971 [32:36<19:10,  1.92it/s, loss=0.188, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000916, train/loss_step=0.242, global_step=4376.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  63%|██████▎   | 3761/5971 [32:37<19:09,  1.92it/s, loss=0.172, v_num=0, train/loss_simple_step=0.338, train/loss_vlb_step=0.0016, train/loss_step=0.338, global_step=4377.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  63%|██████▎   | 3762/5971 [32:38<19:09,  1.92it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0456, train/loss_vlb_step=0.000164, train/loss_step=0.0456, global_step=4377.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  63%|██████▎   | 3763/5971 [32:39<19:09,  1.92it/s, loss=0.171, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.66e-5, train/loss_step=0.021, global_step=4377.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  63%|██████▎   | 3764/5971 [32:41<19:09,  1.92it/s, loss=0.171, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.66e-5, train/loss_step=0.021, global_step=4377.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  63%|██████▎   | 3764/5971 [32:41<19:09,  1.92it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0555, train/loss_vlb_step=0.000191, train/loss_step=0.0555, global_step=4377.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  63%|██████▎   | 3765/5971 [32:42<19:09,  1.92it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0225, train/loss_vlb_step=8.79e-5, train/loss_step=0.0225, global_step=4378.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  63%|██████▎   | 3766/5971 [32:43<19:09,  1.92it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0237, train/loss_vlb_step=9.85e-5, train/loss_step=0.0237, global_step=4378.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  63%|██████▎   | 3767/5971 [32:44<19:08,  1.92it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00695, train/loss_vlb_step=3.53e-5, train/loss_step=0.00695, global_step=4378.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  63%|██████▎   | 3768/5971 [32:46<19:09,  1.92it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00695, train/loss_vlb_step=3.53e-5, train/loss_step=0.00695, global_step=4378.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  63%|██████▎   | 3768/5971 [32:46<19:09,  1.92it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00608, train/loss_vlb_step=3e-5, train/loss_step=0.00608, global_step=4378.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  63%|██████▎   | 3769/5971 [32:47<19:09,  1.92it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00411, train/loss_vlb_step=2.01e-5, train/loss_step=0.00411, global_step=4379.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  63%|██████▎   | 3770/5971 [32:48<19:08,  1.92it/s, loss=0.0909, v_num=0, train/loss_simple_step=0.0239, train/loss_vlb_step=9.09e-5, train/loss_step=0.0239, global_step=4379.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  63%|██████▎   | 3771/5971 [32:49<19:08,  1.92it/s, loss=0.082, v_num=0, train/loss_simple_step=0.0631, train/loss_vlb_step=0.000219, train/loss_step=0.0631, global_step=4379.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  63%|██████▎   | 3772/5971 [32:51<19:08,  1.91it/s, loss=0.082, v_num=0, train/loss_simple_step=0.0631, train/loss_vlb_step=0.000219, train/loss_step=0.0631, global_step=4379.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  63%|██████▎   | 3772/5971 [32:51<19:08,  1.91it/s, loss=0.0827, v_num=0, train/loss_simple_step=0.0296, train/loss_vlb_step=0.000108, train/loss_step=0.0296, global_step=4379.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  63%|██████▎   | 3773/5971 [32:52<19:08,  1.91it/s, loss=0.0832, v_num=0, train/loss_simple_step=0.0168, train/loss_vlb_step=6.87e-5, train/loss_step=0.0168, global_step=4380.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  63%|██████▎   | 3774/5971 [32:53<19:08,  1.91it/s, loss=0.0835, v_num=0, train/loss_simple_step=0.00911, train/loss_vlb_step=4.05e-5, train/loss_step=0.00911, global_step=4380.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  63%|██████▎   | 3775/5971 [32:54<19:08,  1.91it/s, loss=0.0856, v_num=0, train/loss_simple_step=0.360, train/loss_vlb_step=0.00246, train/loss_step=0.360, global_step=4380.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  63%|██████▎   | 3776/5971 [32:56<19:08,  1.91it/s, loss=0.0856, v_num=0, train/loss_simple_step=0.360, train/loss_vlb_step=0.00246, train/loss_step=0.360, global_step=4380.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  63%|██████▎   | 3776/5971 [32:56<19:08,  1.91it/s, loss=0.0903, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000428, train/loss_step=0.128, global_step=4380.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  63%|██████▎   | 3777/5971 [32:57<19:08,  1.91it/s, loss=0.0902, v_num=0, train/loss_simple_step=0.00712, train/loss_vlb_step=3.44e-5, train/loss_step=0.00712, global_step=4381.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  63%|██████▎   | 3778/5971 [32:58<19:07,  1.91it/s, loss=0.0972, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000482, train/loss_step=0.142, global_step=4381.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  63%|██████▎   | 3779/5971 [32:59<19:07,  1.91it/s, loss=0.0783, v_num=0, train/loss_simple_step=0.0193, train/loss_vlb_step=7.97e-5, train/loss_step=0.0193, global_step=4381.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  63%|██████▎   | 3780/5971 [33:01<19:08,  1.91it/s, loss=0.0783, v_num=0, train/loss_simple_step=0.0193, train/loss_vlb_step=7.97e-5, train/loss_step=0.0193, global_step=4381.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  63%|██████▎   | 3780/5971 [33:01<19:08,  1.91it/s, loss=0.0664, v_num=0, train/loss_simple_step=0.00588, train/loss_vlb_step=3.07e-5, train/loss_step=0.00588, global_step=4381.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  63%|██████▎   | 3781/5971 [33:02<19:07,  1.91it/s, loss=0.0897, v_num=0, train/loss_simple_step=0.803, train/loss_vlb_step=0.0684, train/loss_step=0.803, global_step=4382.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]     
Epoch 7:  63%|██████▎   | 3782/5971 [33:03<19:07,  1.91it/s, loss=0.114, v_num=0, train/loss_simple_step=0.540, train/loss_vlb_step=0.00537, train/loss_step=0.540, global_step=4382.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  63%|██████▎   | 3783/5971 [33:03<19:07,  1.91it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00219, train/loss_vlb_step=1.24e-5, train/loss_step=0.00219, global_step=4382.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  63%|██████▎   | 3784/5971 [33:06<19:07,  1.91it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00219, train/loss_vlb_step=1.24e-5, train/loss_step=0.00219, global_step=4382.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  63%|██████▎   | 3784/5971 [33:06<19:07,  1.91it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0358, train/loss_vlb_step=0.000135, train/loss_step=0.0358, global_step=4382.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  63%|██████▎   | 3785/5971 [33:06<19:07,  1.91it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00468, train/loss_vlb_step=2.47e-5, train/loss_step=0.00468, global_step=4383.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  63%|██████▎   | 3786/5971 [33:07<19:06,  1.91it/s, loss=0.117, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000457, train/loss_step=0.135, global_step=4383.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  63%|██████▎   | 3787/5971 [33:08<19:06,  1.90it/s, loss=0.136, v_num=0, train/loss_simple_step=0.389, train/loss_vlb_step=0.00204, train/loss_step=0.389, global_step=4383.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  63%|██████▎   | 3788/5971 [33:10<19:06,  1.90it/s, loss=0.136, v_num=0, train/loss_simple_step=0.389, train/loss_vlb_step=0.00204, train/loss_step=0.389, global_step=4383.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  63%|██████▎   | 3788/5971 [33:10<19:06,  1.90it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0245, train/loss_vlb_step=9.16e-5, train/loss_step=0.0245, global_step=4383.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  63%|██████▎   | 3789/5971 [33:11<19:06,  1.90it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0261, train/loss_vlb_step=0.000107, train/loss_step=0.0261, global_step=4384.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  63%|██████▎   | 3790/5971 [33:12<19:06,  1.90it/s, loss=0.15, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00096, train/loss_step=0.264, global_step=4384.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  63%|██████▎   | 3791/5971 [33:13<19:06,  1.90it/s, loss=0.153, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000384, train/loss_step=0.113, global_step=4384.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▎   | 3792/5971 [33:15<19:06,  1.90it/s, loss=0.153, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000384, train/loss_step=0.113, global_step=4384.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▎   | 3792/5971 [33:15<19:06,  1.90it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0102, train/loss_vlb_step=4.67e-5, train/loss_step=0.0102, global_step=4384.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▎   | 3793/5971 [33:16<19:06,  1.90it/s, loss=0.157, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000408, train/loss_step=0.120, global_step=4385.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  64%|██████▎   | 3794/5971 [33:17<19:05,  1.90it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0758, train/loss_vlb_step=0.000251, train/loss_step=0.0758, global_step=4385.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▎   | 3795/5971 [33:18<19:05,  1.90it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.07e-5, train/loss_step=0.00184, global_step=4385.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▎   | 3796/5971 [33:20<19:06,  1.90it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00184, train/loss_vlb_step=1.07e-5, train/loss_step=0.00184, global_step=4385.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▎   | 3796/5971 [33:20<19:06,  1.90it/s, loss=0.147, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000802, train/loss_step=0.213, global_step=4385.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  64%|██████▎   | 3797/5971 [33:21<19:05,  1.90it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00509, train/loss_vlb_step=2.55e-5, train/loss_step=0.00509, global_step=4386.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▎   | 3798/5971 [33:22<19:05,  1.90it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0255, train/loss_vlb_step=9.37e-5, train/loss_step=0.0255, global_step=4386.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  64%|██████▎   | 3799/5971 [33:23<19:05,  1.90it/s, loss=0.146, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=4386.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  64%|██████▎   | 3800/5971 [33:25<19:05,  1.90it/s, loss=0.146, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000404, train/loss_step=0.123, global_step=4386.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▎   | 3800/5971 [33:25<19:05,  1.90it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0376, train/loss_vlb_step=0.000139, train/loss_step=0.0376, global_step=4386.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▎   | 3801/5971 [33:26<19:05,  1.89it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0984, train/loss_vlb_step=0.000332, train/loss_step=0.0984, global_step=4387.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▎   | 3802/5971 [33:27<19:04,  1.89it/s, loss=0.11, v_num=0, train/loss_simple_step=0.489, train/loss_vlb_step=0.00524, train/loss_step=0.489, global_step=4387.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  64%|██████▎   | 3803/5971 [33:28<19:04,  1.89it/s, loss=0.116, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000426, train/loss_step=0.125, global_step=4387.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▎   | 3804/5971 [33:30<19:04,  1.89it/s, loss=0.116, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000426, train/loss_step=0.125, global_step=4387.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▎   | 3804/5971 [33:30<19:04,  1.89it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0103, train/loss_vlb_step=4.88e-5, train/loss_step=0.0103, global_step=4387.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▎   | 3805/5971 [33:31<19:04,  1.89it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00167, train/loss_vlb_step=9.91e-6, train/loss_step=0.00167, global_step=4388.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▎   | 3806/5971 [33:32<19:04,  1.89it/s, loss=0.108, v_num=0, train/loss_simple_step=0.00558, train/loss_vlb_step=2.94e-5, train/loss_step=0.00558, global_step=4388.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3807/5971 [33:33<19:03,  1.89it/s, loss=0.091, v_num=0, train/loss_simple_step=0.0499, train/loss_vlb_step=0.000169, train/loss_step=0.0499, global_step=4388.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  64%|██████▍   | 3808/5971 [33:35<19:04,  1.89it/s, loss=0.091, v_num=0, train/loss_simple_step=0.0499, train/loss_vlb_step=0.000169, train/loss_step=0.0499, global_step=4388.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3808/5971 [33:35<19:04,  1.89it/s, loss=0.109, v_num=0, train/loss_simple_step=0.393, train/loss_vlb_step=0.00224, train/loss_step=0.393, global_step=4388.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  64%|██████▍   | 3809/5971 [33:36<19:04,  1.89it/s, loss=0.111, v_num=0, train/loss_simple_step=0.065, train/loss_vlb_step=0.000217, train/loss_step=0.065, global_step=4389.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3810/5971 [33:36<19:03,  1.89it/s, loss=0.0989, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=6.34e-5, train/loss_step=0.0157, global_step=4389.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3811/5971 [33:37<19:03,  1.89it/s, loss=0.114, v_num=0, train/loss_simple_step=0.419, train/loss_vlb_step=0.00261, train/loss_step=0.419, global_step=4389.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  64%|██████▍   | 3812/5971 [33:39<19:03,  1.89it/s, loss=0.114, v_num=0, train/loss_simple_step=0.419, train/loss_vlb_step=0.00261, train/loss_step=0.419, global_step=4389.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3812/5971 [33:39<19:03,  1.89it/s, loss=0.144, v_num=0, train/loss_simple_step=0.614, train/loss_vlb_step=0.00603, train/loss_step=0.614, global_step=4389.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3813/5971 [33:40<19:03,  1.89it/s, loss=0.156, v_num=0, train/loss_simple_step=0.347, train/loss_vlb_step=0.00148, train/loss_step=0.347, global_step=4390.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3814/5971 [33:41<19:03,  1.89it/s, loss=0.175, v_num=0, train/loss_simple_step=0.465, train/loss_vlb_step=0.00413, train/loss_step=0.465, global_step=4390.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3815/5971 [33:42<19:02,  1.89it/s, loss=0.197, v_num=0, train/loss_simple_step=0.443, train/loss_vlb_step=0.003, train/loss_step=0.443, global_step=4390.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  64%|██████▍   | 3816/5971 [33:44<19:03,  1.89it/s, loss=0.197, v_num=0, train/loss_simple_step=0.443, train/loss_vlb_step=0.003, train/loss_step=0.443, global_step=4390.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3816/5971 [33:44<19:03,  1.89it/s, loss=0.219, v_num=0, train/loss_simple_step=0.655, train/loss_vlb_step=0.00652, train/loss_step=0.655, global_step=4390.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3817/5971 [33:45<19:02,  1.88it/s, loss=0.22, v_num=0, train/loss_simple_step=0.00783, train/loss_vlb_step=3.62e-5, train/loss_step=0.00783, global_step=4391.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3818/5971 [33:46<19:02,  1.88it/s, loss=0.221, v_num=0, train/loss_simple_step=0.0636, train/loss_vlb_step=0.000219, train/loss_step=0.0636, global_step=4391.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3819/5971 [33:47<19:02,  1.88it/s, loss=0.226, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000801, train/loss_step=0.218, global_step=4391.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  64%|██████▍   | 3820/5971 [33:49<19:02,  1.88it/s, loss=0.226, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000801, train/loss_step=0.218, global_step=4391.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3820/5971 [33:49<19:02,  1.88it/s, loss=0.226, v_num=0, train/loss_simple_step=0.037, train/loss_vlb_step=0.00014, train/loss_step=0.037, global_step=4391.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  64%|██████▍   | 3821/5971 [33:50<19:02,  1.88it/s, loss=0.238, v_num=0, train/loss_simple_step=0.334, train/loss_vlb_step=0.00171, train/loss_step=0.334, global_step=4392.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3822/5971 [33:51<19:01,  1.88it/s, loss=0.229, v_num=0, train/loss_simple_step=0.311, train/loss_vlb_step=0.00172, train/loss_step=0.311, global_step=4392.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3823/5971 [33:52<19:01,  1.88it/s, loss=0.225, v_num=0, train/loss_simple_step=0.0475, train/loss_vlb_step=0.00017, train/loss_step=0.0475, global_step=4392.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3824/5971 [33:54<19:02,  1.88it/s, loss=0.225, v_num=0, train/loss_simple_step=0.0475, train/loss_vlb_step=0.00017, train/loss_step=0.0475, global_step=4392.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3824/5971 [33:54<19:02,  1.88it/s, loss=0.226, v_num=0, train/loss_simple_step=0.0352, train/loss_vlb_step=0.000131, train/loss_step=0.0352, global_step=4392.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3825/5971 [33:55<19:01,  1.88it/s, loss=0.232, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000348, train/loss_step=0.106, global_step=4393.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  64%|██████▍   | 3826/5971 [33:56<19:01,  1.88it/s, loss=0.236, v_num=0, train/loss_simple_step=0.0938, train/loss_vlb_step=0.000315, train/loss_step=0.0938, global_step=4393.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3827/5971 [33:57<19:01,  1.88it/s, loss=0.247, v_num=0, train/loss_simple_step=0.275, train/loss_vlb_step=0.00105, train/loss_step=0.275, global_step=4393.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  64%|██████▍   | 3828/5971 [33:59<19:01,  1.88it/s, loss=0.247, v_num=0, train/loss_simple_step=0.275, train/loss_vlb_step=0.00105, train/loss_step=0.275, global_step=4393.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3828/5971 [33:59<19:01,  1.88it/s, loss=0.233, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.00033, train/loss_step=0.101, global_step=4393.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3829/5971 [34:00<19:01,  1.88it/s, loss=0.232, v_num=0, train/loss_simple_step=0.0543, train/loss_vlb_step=0.000187, train/loss_step=0.0543, global_step=4394.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3830/5971 [34:01<19:00,  1.88it/s, loss=0.236, v_num=0, train/loss_simple_step=0.0893, train/loss_vlb_step=0.000293, train/loss_step=0.0893, global_step=4394.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3831/5971 [34:02<19:00,  1.88it/s, loss=0.218, v_num=0, train/loss_simple_step=0.073, train/loss_vlb_step=0.000248, train/loss_step=0.073, global_step=4394.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  64%|██████▍   | 3832/5971 [34:04<19:00,  1.88it/s, loss=0.218, v_num=0, train/loss_simple_step=0.073, train/loss_vlb_step=0.000248, train/loss_step=0.073, global_step=4394.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3832/5971 [34:04<19:00,  1.88it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00202, train/loss_vlb_step=1.19e-5, train/loss_step=0.00202, global_step=4394.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3833/5971 [34:05<19:00,  1.87it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0155, train/loss_vlb_step=6.96e-5, train/loss_step=0.0155, global_step=4395.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  64%|██████▍   | 3834/5971 [34:05<19:00,  1.87it/s, loss=0.172, v_num=0, train/loss_simple_step=0.485, train/loss_vlb_step=0.00425, train/loss_step=0.485, global_step=4395.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  64%|██████▍   | 3835/5971 [34:06<18:59,  1.87it/s, loss=0.191, v_num=0, train/loss_simple_step=0.808, train/loss_vlb_step=0.0302, train/loss_step=0.808, global_step=4395.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  64%|██████▍   | 3836/5971 [34:08<19:00,  1.87it/s, loss=0.191, v_num=0, train/loss_simple_step=0.808, train/loss_vlb_step=0.0302, train/loss_step=0.808, global_step=4395.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3836/5971 [34:08<19:00,  1.87it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00898, train/loss_vlb_step=3.96e-5, train/loss_step=0.00898, global_step=4395.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3837/5971 [34:09<18:59,  1.87it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00337, train/loss_vlb_step=1.84e-5, train/loss_step=0.00337, global_step=4396.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3838/5971 [34:10<18:59,  1.87it/s, loss=0.181, v_num=0, train/loss_simple_step=0.524, train/loss_vlb_step=0.0034, train/loss_step=0.524, global_step=4396.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]     
Epoch 7:  64%|██████▍   | 3839/5971 [34:11<18:59,  1.87it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0676, train/loss_vlb_step=0.000224, train/loss_step=0.0676, global_step=4396.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3840/5971 [34:13<18:59,  1.87it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0676, train/loss_vlb_step=0.000224, train/loss_step=0.0676, global_step=4396.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3840/5971 [34:13<18:59,  1.87it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0525, train/loss_vlb_step=0.000181, train/loss_step=0.0525, global_step=4396.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3841/5971 [34:14<18:59,  1.87it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0247, train/loss_vlb_step=9.51e-5, train/loss_step=0.0247, global_step=4397.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  64%|██████▍   | 3842/5971 [34:15<18:58,  1.87it/s, loss=0.151, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000637, train/loss_step=0.164, global_step=4397.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  64%|██████▍   | 3843/5971 [34:16<18:58,  1.87it/s, loss=0.156, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.00048, train/loss_step=0.143, global_step=4397.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  64%|██████▍   | 3844/5971 [34:18<18:58,  1.87it/s, loss=0.156, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.00048, train/loss_step=0.143, global_step=4397.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3844/5971 [34:18<18:58,  1.87it/s, loss=0.16, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000381, train/loss_step=0.113, global_step=4397.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3845/5971 [34:19<18:58,  1.87it/s, loss=0.164, v_num=0, train/loss_simple_step=0.182, train/loss_vlb_step=0.000631, train/loss_step=0.182, global_step=4398.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3846/5971 [34:20<18:58,  1.87it/s, loss=0.184, v_num=0, train/loss_simple_step=0.504, train/loss_vlb_step=0.00557, train/loss_step=0.504, global_step=4398.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  64%|██████▍   | 3847/5971 [34:21<18:57,  1.87it/s, loss=0.179, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000584, train/loss_step=0.166, global_step=4398.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3848/5971 [34:23<18:58,  1.87it/s, loss=0.179, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000584, train/loss_step=0.166, global_step=4398.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3848/5971 [34:23<18:58,  1.87it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0926, train/loss_vlb_step=0.000305, train/loss_step=0.0926, global_step=4398.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3849/5971 [34:24<18:57,  1.87it/s, loss=0.196, v_num=0, train/loss_simple_step=0.411, train/loss_vlb_step=0.00259, train/loss_step=0.411, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  64%|██████▍   | 3850/5971 [34:25<18:57,  1.86it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0283, train/loss_vlb_step=0.000109, train/loss_step=0.0283, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  64%|██████▍   | 3851/5971 [34:25<18:57,  1.86it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0649, train/loss_vlb_step=0.000231, train/loss_step=0.0649, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  65%|██████▍   | 3852/5971 [34:28<18:57,  1.86it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0649, train/loss_vlb_step=0.000231, train/loss_step=0.0649, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  65%|██████▍   | 3852/5971 [34:28<18:57,  1.86it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:17,  2.14it/s][A

Validating:   1%|          | 2/167 [00:00<00:56,  2.92it/s][A
Epoch 7:  65%|██████▍   | 3856/5971 [34:28<18:54,  1.86it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   2%|▏         | 4/167 [00:00<00:27,  5.95it/s][A

Validating:   4%|▍         | 7/167 [00:00<00:15, 10.54it/s][A
Epoch 7:  65%|██████▍   | 3860/5971 [34:29<18:51,  1.87it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   6%|▌         | 10/167 [00:01<00:11, 13.95it/s][A
Epoch 7:  65%|██████▍   | 3864/5971 [34:29<18:48,  1.87it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   8%|▊         | 13/167 [00:01<00:08, 17.41it/s][A
Epoch 7:  65%|██████▍   | 3868/5971 [34:29<18:44,  1.87it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  10%|▉         | 16/167 [00:01<00:07, 19.77it/s][A

Validating:  11%|█▏        | 19/167 [00:01<00:07, 20.95it/s][A
Epoch 7:  65%|██████▍   | 3872/5971 [34:29<18:41,  1.87it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  13%|█▎        | 22/167 [00:01<00:06, 21.89it/s][A
Epoch 7:  65%|██████▍   | 3876/5971 [34:29<18:38,  1.87it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  15%|█▍        | 25/167 [00:01<00:05, 23.84it/s][A
Epoch 7:  65%|██████▍   | 3880/5971 [34:29<18:35,  1.88it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  17%|█▋        | 28/167 [00:01<00:05, 23.69it/s][A

Validating:  19%|█▊        | 31/167 [00:01<00:05, 23.92it/s][A
Epoch 7:  65%|██████▌   | 3884/5971 [34:30<18:32,  1.88it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  21%|██        | 35/167 [00:02<00:05, 25.93it/s][A
Epoch 7:  65%|██████▌   | 3888/5971 [34:30<18:28,  1.88it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  23%|██▎       | 39/167 [00:02<00:04, 27.22it/s][A
Epoch 7:  65%|██████▌   | 3892/5971 [34:30<18:25,  1.88it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 25.82it/s][A
Epoch 7:  65%|██████▌   | 3896/5971 [34:30<18:22,  1.88it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 25.70it/s][A
Epoch 7:  65%|██████▌   | 3900/5971 [34:30<18:19,  1.88it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 26.64it/s][A

Validating:  31%|███       | 51/167 [00:02<00:04, 25.59it/s][A
Epoch 7:  65%|██████▌   | 3904/5971 [34:30<18:16,  1.89it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 26.21it/s][A
Epoch 7:  65%|██████▌   | 3908/5971 [34:30<18:12,  1.89it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 26.44it/s][A
Epoch 7:  66%|██████▌   | 3912/5971 [34:31<18:09,  1.89it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  37%|███▋      | 61/167 [00:03<00:03, 27.98it/s][A
Epoch 7:  66%|██████▌   | 3916/5971 [34:31<18:06,  1.89it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  38%|███▊      | 64/167 [00:03<00:03, 26.84it/s][A

Validating:  40%|████      | 67/167 [00:03<00:03, 26.81it/s][A
Epoch 7:  66%|██████▌   | 3920/5971 [34:31<18:03,  1.89it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 26.27it/s][A
Epoch 7:  66%|██████▌   | 3924/5971 [34:31<18:00,  1.89it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 26.36it/s][A
Epoch 7:  66%|██████▌   | 3928/5971 [34:31<17:57,  1.90it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 27.65it/s][A
Epoch 7:  66%|██████▌   | 3932/5971 [34:31<17:54,  1.90it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 28.11it/s][A
Epoch 7:  66%|██████▌   | 3936/5971 [34:31<17:50,  1.90it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  50%|█████     | 84/167 [00:03<00:02, 28.58it/s][A
Epoch 7:  66%|██████▌   | 3940/5971 [34:32<17:47,  1.90it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  53%|█████▎    | 88/167 [00:03<00:02, 29.08it/s][A

Validating:  54%|█████▍    | 91/167 [00:04<00:02, 28.42it/s][A
Epoch 7:  66%|██████▌   | 3944/5971 [34:32<17:44,  1.90it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 28.22it/s][A
Epoch 7:  66%|██████▌   | 3948/5971 [34:32<17:41,  1.91it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 28.17it/s][A
Epoch 7:  66%|██████▌   | 3952/5971 [34:32<17:38,  1.91it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  60%|██████    | 101/167 [00:04<00:02, 28.37it/s][A
Epoch 7:  66%|██████▋   | 3956/5971 [34:32<17:35,  1.91it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 27.57it/s][A
Epoch 7:  66%|██████▋   | 3960/5971 [34:32<17:32,  1.91it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 28.64it/s][A

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 27.51it/s][A
Epoch 7:  66%|██████▋   | 3964/5971 [34:32<17:29,  1.91it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  68%|██████▊   | 114/167 [00:04<00:02, 24.84it/s][A
Epoch 7:  66%|██████▋   | 3968/5971 [34:33<17:26,  1.91it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  70%|███████   | 117/167 [00:05<00:01, 25.08it/s][A
Epoch 7:  67%|██████▋   | 3972/5971 [34:33<17:23,  1.92it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 26.22it/s][A
Epoch 7:  67%|██████▋   | 3976/5971 [34:33<17:20,  1.92it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 26.42it/s][A

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 26.32it/s][A
Epoch 7:  67%|██████▋   | 3980/5971 [34:33<17:17,  1.92it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 26.20it/s][A
Epoch 7:  67%|██████▋   | 3984/5971 [34:33<17:13,  1.92it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  80%|████████  | 134/167 [00:05<00:01, 26.94it/s][A
Epoch 7:  67%|██████▋   | 3988/5971 [34:33<17:10,  1.92it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 26.72it/s][A
Epoch 7:  67%|██████▋   | 3992/5971 [34:33<17:07,  1.93it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  84%|████████▍ | 140/167 [00:05<00:00, 27.18it/s][A

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 26.73it/s][A
Epoch 7:  67%|██████▋   | 3996/5971 [34:34<17:04,  1.93it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 26.39it/s][A
Epoch 7:  67%|██████▋   | 4000/5971 [34:34<17:01,  1.93it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 25.96it/s][A
Epoch 7:  67%|██████▋   | 4004/5971 [34:34<16:58,  1.93it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 25.99it/s][A

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 25.95it/s][A
Epoch 7:  67%|██████▋   | 4008/5971 [34:34<16:55,  1.93it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 26.19it/s][A
Epoch 7:  67%|██████▋   | 4012/5971 [34:34<16:52,  1.93it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 26.86it/s][A
Epoch 7:  67%|██████▋   | 4016/5971 [34:34<16:49,  1.94it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 26.49it/s][A

Validating: 100%|██████████| 167/167 [00:06<00:00, 27.45it/s][A
Epoch 7:  67%|██████▋   | 4020/5971 [34:35<16:46,  1.94it/s, loss=0.223, v_num=0, train/loss_simple_step=0.606, train/loss_vlb_step=0.0154, train/loss_step=0.606, global_step=4399.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.35it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.45it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.31it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.85it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.32it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.64it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.94it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.17it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.33it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.44it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.50it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.57it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.62it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.64it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.64it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.64it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.64it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.65it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.66it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.68it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.70it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.71it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.72it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.72it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:04<00:04,  5.72it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.72it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.72it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.67it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.57it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.59it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.60it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.60it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.57it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.60it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.63it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.64it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.66it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.66it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.65it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.65it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.66it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.64it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.61it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.61it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.60it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.61it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:08<00:00,  5.62it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.63it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.65it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.66it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.30it/s]

Epoch 7:  67%|██████▋   | 4021/5971 [34:47<16:51,  1.93it/s, loss=0.226, v_num=0, train/loss_simple_step=0.0739, train/loss_vlb_step=0.000249, train/loss_step=0.0739, global_step=4400.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  67%|██████▋   | 4021/5971 [34:47<16:52,  1.93it/s, loss=0.226, v_num=0, train/loss_simple_step=0.0739, train/loss_vlb_step=0.000249, train/loss_step=0.0739, global_step=4400.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.42it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.27it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.91it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.38it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.73it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.94it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.15it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.30it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.42it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.50it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.52it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.52it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.48it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.46it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.45it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.47it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.49it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.52it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.55it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.57it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.60it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.58it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.52it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.52it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.50it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.48it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.46it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.41it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.43it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.43it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.44it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.29it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:03,  5.33it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.37it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.40it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.47it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.55it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.58it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.56it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.58it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.63it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.67it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.67it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.67it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.66it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.66it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.67it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.67it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.68it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.22it/s]

Epoch 7:  67%|██████▋   | 4022/5971 [34:58<16:56,  1.92it/s, loss=0.226, v_num=0, train/loss_simple_step=0.0739, train/loss_vlb_step=0.000249, train/loss_step=0.0739, global_step=4400.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  67%|██████▋   | 4022/5971 [34:58<16:56,  1.92it/s, loss=0.214, v_num=0, train/loss_simple_step=0.232, train/loss_vlb_step=0.000803, train/loss_step=0.232, global_step=4400.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.35it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.45it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.31it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.90it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.36it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.69it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.89it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.04it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  5.03it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.08it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.23it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.28it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.32it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.38it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.41it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.46it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.38it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.41it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.41it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.40it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.36it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.37it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.39it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.43it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.48it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.54it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.58it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.61it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.63it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.64it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.64it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.65it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.64it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.60it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.50it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.47it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.47it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.49it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.54it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.58it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.61it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.62it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.62it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.60it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.51it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.46it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.45it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.53it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.54it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.54it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.18it/s]

Epoch 7:  67%|██████▋   | 4023/5971 [35:11<17:01,  1.91it/s, loss=0.214, v_num=0, train/loss_simple_step=0.232, train/loss_vlb_step=0.000803, train/loss_step=0.232, global_step=4400.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  67%|██████▋   | 4023/5971 [35:11<17:01,  1.91it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00406, train/loss_vlb_step=2.19e-5, train/loss_step=0.00406, global_step=4400.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.42it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.26it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.91it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.40it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.76it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.03it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.23it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.37it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.47it/s]
Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.49it/s]
Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.50it/s]
Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.51it/s]
Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.45it/s]
Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.42it/s]
Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.42it/s]
Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.38it/s]
Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.37it/s]
Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.35it/s]
Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.40it/s]
Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.44it/s]
Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.31it/s]
Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.26it/s]
Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.24it/s]
Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.25it/s]
Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.37it/s]
Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.44it/s]
Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.50it/s]
Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.54it/s]
Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.41it/s]
Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.43it/s]
Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.36it/s]
Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.33it/s]
Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.35it/s]
Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.45it/s]
Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.52it/s]
Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.58it/s]
Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.59it/s]
Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.57it/s]
Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.59it/s]
Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.58it/s]
Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.52it/s]
Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.51it/s]
Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.50it/s]
Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.37it/s]
Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.27it/s]
Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.22it/s]
Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.14it/s]
Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.02it/s]
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  4.95it/s]
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.10it/s]

Epoch 7:  67%|██████▋   | 4024/5971 [35:24<17:07,  1.89it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00406, train/loss_vlb_step=2.19e-5, train/loss_step=0.00406, global_step=4400.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  67%|██████▋   | 4024/5971 [35:24<17:07,  1.89it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00536, train/loss_vlb_step=2.77e-5, train/loss_step=0.00536, global_step=4400.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  67%|██████▋   | 4025/5971 [35:25<17:07,  1.89it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00536, train/loss_vlb_step=2.77e-5, train/loss_step=0.00536, global_step=4400.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  67%|██████▋   | 4025/5971 [35:25<17:07,  1.89it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0218, train/loss_vlb_step=8.25e-5, train/loss_step=0.0218, global_step=4401.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  67%|██████▋   | 4026/5971 [35:26<17:06,  1.89it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0218, train/loss_vlb_step=8.25e-5, train/loss_step=0.0218, global_step=4401.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  67%|██████▋   | 4026/5971 [35:26<17:06,  1.89it/s, loss=0.176, v_num=0, train/loss_simple_step=0.560, train/loss_vlb_step=0.00566, train/loss_step=0.560, global_step=4401.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  67%|██████▋   | 4027/5971 [35:27<17:06,  1.89it/s, loss=0.176, v_num=0, train/loss_simple_step=0.560, train/loss_vlb_step=0.00566, train/loss_step=0.560, global_step=4401.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  67%|██████▋   | 4027/5971 [35:27<17:06,  1.89it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0162, train/loss_vlb_step=6.86e-5, train/loss_step=0.0162, global_step=4401.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  67%|██████▋   | 4028/5971 [35:29<17:06,  1.89it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0162, train/loss_vlb_step=6.86e-5, train/loss_step=0.0162, global_step=4401.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  67%|██████▋   | 4028/5971 [35:29<17:06,  1.89it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0936, train/loss_vlb_step=0.000313, train/loss_step=0.0936, global_step=4401.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  67%|██████▋   | 4029/5971 [35:30<17:06,  1.89it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0936, train/loss_vlb_step=0.000313, train/loss_step=0.0936, global_step=4401.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  67%|██████▋   | 4029/5971 [35:30<17:06,  1.89it/s, loss=0.174, v_num=0, train/loss_simple_step=0.00171, train/loss_vlb_step=9.57e-6, train/loss_step=0.00171, global_step=4402.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  67%|██████▋   | 4030/5971 [35:30<17:06,  1.89it/s, loss=0.174, v_num=0, train/loss_simple_step=0.00171, train/loss_vlb_step=9.57e-6, train/loss_step=0.00171, global_step=4402.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  67%|██████▋   | 4030/5971 [35:30<17:06,  1.89it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00374, train/loss_vlb_step=1.99e-5, train/loss_step=0.00374, global_step=4402.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4031/5971 [35:31<17:05,  1.89it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00374, train/loss_vlb_step=1.99e-5, train/loss_step=0.00374, global_step=4402.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4031/5971 [35:31<17:05,  1.89it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0179, train/loss_vlb_step=7.4e-5, train/loss_step=0.0179, global_step=4402.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  68%|██████▊   | 4032/5971 [35:33<17:05,  1.89it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0179, train/loss_vlb_step=7.4e-5, train/loss_step=0.0179, global_step=4402.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4032/5971 [35:33<17:05,  1.89it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0382, train/loss_vlb_step=0.000144, train/loss_step=0.0382, global_step=4402.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4033/5971 [35:34<17:05,  1.89it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0382, train/loss_vlb_step=0.000144, train/loss_step=0.0382, global_step=4402.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4033/5971 [35:34<17:05,  1.89it/s, loss=0.172, v_num=0, train/loss_simple_step=0.499, train/loss_vlb_step=0.00561, train/loss_step=0.499, global_step=4403.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  68%|██████▊   | 4034/5971 [35:35<17:05,  1.89it/s, loss=0.172, v_num=0, train/loss_simple_step=0.499, train/loss_vlb_step=0.00561, train/loss_step=0.499, global_step=4403.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4034/5971 [35:35<17:05,  1.89it/s, loss=0.194, v_num=0, train/loss_simple_step=0.948, train/loss_vlb_step=0.239, train/loss_step=0.948, global_step=4403.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  68%|██████▊   | 4035/5971 [35:36<17:04,  1.89it/s, loss=0.194, v_num=0, train/loss_simple_step=0.948, train/loss_vlb_step=0.239, train/loss_step=0.948, global_step=4403.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4035/5971 [35:36<17:04,  1.89it/s, loss=0.194, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000516, train/loss_step=0.154, global_step=4403.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4036/5971 [35:38<17:05,  1.89it/s, loss=0.194, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000516, train/loss_step=0.154, global_step=4403.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4036/5971 [35:38<17:05,  1.89it/s, loss=0.196, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.00047, train/loss_step=0.142, global_step=4403.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  68%|██████▊   | 4037/5971 [35:39<17:04,  1.89it/s, loss=0.196, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.00047, train/loss_step=0.142, global_step=4403.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4037/5971 [35:39<17:04,  1.89it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00217, train/loss_vlb_step=1.26e-5, train/loss_step=0.00217, global_step=4404.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4038/5971 [35:40<17:04,  1.89it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00217, train/loss_vlb_step=1.26e-5, train/loss_step=0.00217, global_step=4404.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4038/5971 [35:40<17:04,  1.89it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0288, train/loss_vlb_step=0.000115, train/loss_step=0.0288, global_step=4404.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  68%|██████▊   | 4039/5971 [35:41<17:04,  1.89it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0288, train/loss_vlb_step=0.000115, train/loss_step=0.0288, global_step=4404.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4039/5971 [35:41<17:04,  1.89it/s, loss=0.181, v_num=0, train/loss_simple_step=0.178, train/loss_vlb_step=0.000595, train/loss_step=0.178, global_step=4404.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  68%|██████▊   | 4040/5971 [35:43<17:04,  1.89it/s, loss=0.181, v_num=0, train/loss_simple_step=0.178, train/loss_vlb_step=0.000595, train/loss_step=0.178, global_step=4404.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4040/5971 [35:43<17:04,  1.89it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00422, train/loss_vlb_step=2.2e-5, train/loss_step=0.00422, global_step=4404.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4041/5971 [35:44<17:03,  1.88it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00422, train/loss_vlb_step=2.2e-5, train/loss_step=0.00422, global_step=4404.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4041/5971 [35:44<17:03,  1.88it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0435, train/loss_vlb_step=0.000152, train/loss_step=0.0435, global_step=4405.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  68%|██████▊   | 4042/5971 [35:45<17:03,  1.88it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0435, train/loss_vlb_step=0.000152, train/loss_step=0.0435, global_step=4405.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4042/5971 [35:45<17:03,  1.88it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0894, train/loss_vlb_step=0.000296, train/loss_step=0.0894, global_step=4405.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4043/5971 [35:46<17:03,  1.88it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0894, train/loss_vlb_step=0.000296, train/loss_step=0.0894, global_step=4405.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4043/5971 [35:46<17:03,  1.88it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0189, train/loss_vlb_step=7.59e-5, train/loss_step=0.0189, global_step=4405.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  68%|██████▊   | 4044/5971 [35:48<17:03,  1.88it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0189, train/loss_vlb_step=7.59e-5, train/loss_step=0.0189, global_step=4405.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4044/5971 [35:48<17:03,  1.88it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0726, train/loss_vlb_step=0.00025, train/loss_step=0.0726, global_step=4405.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4045/5971 [35:49<17:03,  1.88it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0726, train/loss_vlb_step=0.00025, train/loss_step=0.0726, global_step=4405.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4045/5971 [35:49<17:03,  1.88it/s, loss=0.165, v_num=0, train/loss_simple_step=0.379, train/loss_vlb_step=0.00201, train/loss_step=0.379, global_step=4406.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  68%|██████▊   | 4046/5971 [35:50<17:02,  1.88it/s, loss=0.165, v_num=0, train/loss_simple_step=0.379, train/loss_vlb_step=0.00201, train/loss_step=0.379, global_step=4406.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4046/5971 [35:50<17:02,  1.88it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.64e-5, train/loss_step=0.0104, global_step=4406.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4047/5971 [35:51<17:02,  1.88it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.64e-5, train/loss_step=0.0104, global_step=4406.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4047/5971 [35:51<17:02,  1.88it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00426, train/loss_vlb_step=2.18e-5, train/loss_step=0.00426, global_step=4406.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4048/5971 [35:53<17:02,  1.88it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00426, train/loss_vlb_step=2.18e-5, train/loss_step=0.00426, global_step=4406.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4048/5971 [35:53<17:02,  1.88it/s, loss=0.163, v_num=0, train/loss_simple_step=0.618, train/loss_vlb_step=0.00599, train/loss_step=0.618, global_step=4406.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  68%|██████▊   | 4049/5971 [35:54<17:02,  1.88it/s, loss=0.163, v_num=0, train/loss_simple_step=0.618, train/loss_vlb_step=0.00599, train/loss_step=0.618, global_step=4406.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4049/5971 [35:54<17:02,  1.88it/s, loss=0.182, v_num=0, train/loss_simple_step=0.386, train/loss_vlb_step=0.00279, train/loss_step=0.386, global_step=4407.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4050/5971 [35:54<17:01,  1.88it/s, loss=0.182, v_num=0, train/loss_simple_step=0.386, train/loss_vlb_step=0.00279, train/loss_step=0.386, global_step=4407.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4050/5971 [35:54<17:01,  1.88it/s, loss=0.187, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000361, train/loss_step=0.108, global_step=4407.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4051/5971 [35:55<17:01,  1.88it/s, loss=0.187, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000361, train/loss_step=0.108, global_step=4407.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4051/5971 [35:55<17:01,  1.88it/s, loss=0.187, v_num=0, train/loss_simple_step=0.00981, train/loss_vlb_step=4.66e-5, train/loss_step=0.00981, global_step=4407.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4052/5971 [35:57<17:01,  1.88it/s, loss=0.187, v_num=0, train/loss_simple_step=0.00981, train/loss_vlb_step=4.66e-5, train/loss_step=0.00981, global_step=4407.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4052/5971 [35:57<17:01,  1.88it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0059, train/loss_vlb_step=2.96e-5, train/loss_step=0.0059, global_step=4407.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  68%|██████▊   | 4053/5971 [35:59<17:01,  1.88it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0059, train/loss_vlb_step=2.96e-5, train/loss_step=0.0059, global_step=4407.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4053/5971 [35:59<17:01,  1.88it/s, loss=0.169, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000661, train/loss_step=0.181, global_step=4408.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  68%|██████▊   | 4054/5971 [35:59<17:01,  1.88it/s, loss=0.169, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000661, train/loss_step=0.181, global_step=4408.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4054/5971 [35:59<17:01,  1.88it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0236, train/loss_vlb_step=9.08e-5, train/loss_step=0.0236, global_step=4408.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4055/5971 [36:00<17:00,  1.88it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0236, train/loss_vlb_step=9.08e-5, train/loss_step=0.0236, global_step=4408.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4055/5971 [36:00<17:00,  1.88it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00274, train/loss_vlb_step=1.46e-5, train/loss_step=0.00274, global_step=4408.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4056/5971 [36:02<17:00,  1.88it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00274, train/loss_vlb_step=1.46e-5, train/loss_step=0.00274, global_step=4408.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4056/5971 [36:02<17:00,  1.88it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0219, train/loss_vlb_step=8.7e-5, train/loss_step=0.0219, global_step=4408.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  68%|██████▊   | 4057/5971 [36:03<17:00,  1.88it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0219, train/loss_vlb_step=8.7e-5, train/loss_step=0.0219, global_step=4408.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4057/5971 [36:03<17:00,  1.88it/s, loss=0.127, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00226, train/loss_step=0.353, global_step=4409.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  68%|██████▊   | 4058/5971 [36:04<17:00,  1.88it/s, loss=0.127, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00226, train/loss_step=0.353, global_step=4409.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4058/5971 [36:04<17:00,  1.88it/s, loss=0.13, v_num=0, train/loss_simple_step=0.099, train/loss_vlb_step=0.000326, train/loss_step=0.099, global_step=4409.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4059/5971 [36:05<16:59,  1.87it/s, loss=0.13, v_num=0, train/loss_simple_step=0.099, train/loss_vlb_step=0.000326, train/loss_step=0.099, global_step=4409.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4059/5971 [36:05<16:59,  1.87it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0706, train/loss_vlb_step=0.000233, train/loss_step=0.0706, global_step=4409.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4060/5971 [36:07<17:00,  1.87it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0706, train/loss_vlb_step=0.000233, train/loss_step=0.0706, global_step=4409.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4060/5971 [36:07<17:00,  1.87it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.8e-5, train/loss_step=0.0111, global_step=4409.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  68%|██████▊   | 4061/5971 [36:08<16:59,  1.87it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.8e-5, train/loss_step=0.0111, global_step=4409.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4061/5971 [36:08<16:59,  1.87it/s, loss=0.142, v_num=0, train/loss_simple_step=0.385, train/loss_vlb_step=0.00212, train/loss_step=0.385, global_step=4410.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  68%|██████▊   | 4062/5971 [36:09<16:59,  1.87it/s, loss=0.142, v_num=0, train/loss_simple_step=0.385, train/loss_vlb_step=0.00212, train/loss_step=0.385, global_step=4410.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4062/5971 [36:09<16:59,  1.87it/s, loss=0.148, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000693, train/loss_step=0.200, global_step=4410.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4063/5971 [36:10<16:58,  1.87it/s, loss=0.148, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000693, train/loss_step=0.200, global_step=4410.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4063/5971 [36:10<16:58,  1.87it/s, loss=0.172, v_num=0, train/loss_simple_step=0.498, train/loss_vlb_step=0.00551, train/loss_step=0.498, global_step=4410.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  68%|██████▊   | 4064/5971 [36:12<16:59,  1.87it/s, loss=0.172, v_num=0, train/loss_simple_step=0.498, train/loss_vlb_step=0.00551, train/loss_step=0.498, global_step=4410.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4064/5971 [36:12<16:59,  1.87it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0171, train/loss_vlb_step=7.22e-5, train/loss_step=0.0171, global_step=4410.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4065/5971 [36:13<16:58,  1.87it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0171, train/loss_vlb_step=7.22e-5, train/loss_step=0.0171, global_step=4410.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4065/5971 [36:13<16:58,  1.87it/s, loss=0.153, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000198, train/loss_step=0.055, global_step=4411.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  68%|██████▊   | 4066/5971 [36:14<16:58,  1.87it/s, loss=0.153, v_num=0, train/loss_simple_step=0.055, train/loss_vlb_step=0.000198, train/loss_step=0.055, global_step=4411.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4066/5971 [36:14<16:58,  1.87it/s, loss=0.159, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000436, train/loss_step=0.133, global_step=4411.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4067/5971 [36:15<16:58,  1.87it/s, loss=0.159, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000436, train/loss_step=0.133, global_step=4411.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4067/5971 [36:15<16:58,  1.87it/s, loss=0.182, v_num=0, train/loss_simple_step=0.458, train/loss_vlb_step=0.00306, train/loss_step=0.458, global_step=4411.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  68%|██████▊   | 4068/5971 [36:17<16:58,  1.87it/s, loss=0.182, v_num=0, train/loss_simple_step=0.458, train/loss_vlb_step=0.00306, train/loss_step=0.458, global_step=4411.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4068/5971 [36:17<16:58,  1.87it/s, loss=0.157, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000424, train/loss_step=0.127, global_step=4411.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4069/5971 [36:18<16:57,  1.87it/s, loss=0.157, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000424, train/loss_step=0.127, global_step=4411.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4069/5971 [36:18<16:57,  1.87it/s, loss=0.141, v_num=0, train/loss_simple_step=0.053, train/loss_vlb_step=0.000182, train/loss_step=0.053, global_step=4412.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4070/5971 [36:18<16:57,  1.87it/s, loss=0.141, v_num=0, train/loss_simple_step=0.053, train/loss_vlb_step=0.000182, train/loss_step=0.053, global_step=4412.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4070/5971 [36:19<16:57,  1.87it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0461, train/loss_vlb_step=0.000165, train/loss_step=0.0461, global_step=4412.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4071/5971 [36:19<16:57,  1.87it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0461, train/loss_vlb_step=0.000165, train/loss_step=0.0461, global_step=4412.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4071/5971 [36:19<16:57,  1.87it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0425, train/loss_vlb_step=0.000155, train/loss_step=0.0425, global_step=4412.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4072/5971 [36:22<16:57,  1.87it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0425, train/loss_vlb_step=0.000155, train/loss_step=0.0425, global_step=4412.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4072/5971 [36:22<16:57,  1.87it/s, loss=0.16, v_num=0, train/loss_simple_step=0.423, train/loss_vlb_step=0.0027, train/loss_step=0.423, global_step=4412.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]     
Epoch 7:  68%|██████▊   | 4073/5971 [36:22<16:56,  1.87it/s, loss=0.16, v_num=0, train/loss_simple_step=0.423, train/loss_vlb_step=0.0027, train/loss_step=0.423, global_step=4412.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4073/5971 [36:22<16:56,  1.87it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0299, train/loss_vlb_step=0.000113, train/loss_step=0.0299, global_step=4413.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4074/5971 [36:23<16:56,  1.87it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0299, train/loss_vlb_step=0.000113, train/loss_step=0.0299, global_step=4413.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4074/5971 [36:23<16:56,  1.87it/s, loss=0.156, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000339, train/loss_step=0.103, global_step=4413.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  68%|██████▊   | 4075/5971 [36:24<16:56,  1.87it/s, loss=0.156, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000339, train/loss_step=0.103, global_step=4413.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4075/5971 [36:24<16:56,  1.87it/s, loss=0.171, v_num=0, train/loss_simple_step=0.303, train/loss_vlb_step=0.0016, train/loss_step=0.303, global_step=4413.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  68%|██████▊   | 4076/5971 [36:26<16:56,  1.86it/s, loss=0.171, v_num=0, train/loss_simple_step=0.303, train/loss_vlb_step=0.0016, train/loss_step=0.303, global_step=4413.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4076/5971 [36:26<16:56,  1.86it/s, loss=0.183, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.00104, train/loss_step=0.247, global_step=4413.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4077/5971 [36:27<16:56,  1.86it/s, loss=0.183, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.00104, train/loss_step=0.247, global_step=4413.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4077/5971 [36:27<16:56,  1.86it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00265, train/loss_vlb_step=1.49e-5, train/loss_step=0.00265, global_step=4414.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4078/5971 [36:28<16:55,  1.86it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00265, train/loss_vlb_step=1.49e-5, train/loss_step=0.00265, global_step=4414.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4078/5971 [36:28<16:55,  1.86it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0311, train/loss_vlb_step=0.000117, train/loss_step=0.0311, global_step=4414.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  68%|██████▊   | 4079/5971 [36:29<16:55,  1.86it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0311, train/loss_vlb_step=0.000117, train/loss_step=0.0311, global_step=4414.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4079/5971 [36:29<16:55,  1.86it/s, loss=0.165, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000456, train/loss_step=0.139, global_step=4414.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  68%|██████▊   | 4080/5971 [36:31<16:55,  1.86it/s, loss=0.165, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000456, train/loss_step=0.139, global_step=4414.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4080/5971 [36:31<16:55,  1.86it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00165, train/loss_vlb_step=9.64e-6, train/loss_step=0.00165, global_step=4414.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4081/5971 [36:32<16:55,  1.86it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00165, train/loss_vlb_step=9.64e-6, train/loss_step=0.00165, global_step=4414.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4081/5971 [36:32<16:55,  1.86it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0417, train/loss_vlb_step=0.000144, train/loss_step=0.0417, global_step=4415.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  68%|██████▊   | 4082/5971 [36:33<16:54,  1.86it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0417, train/loss_vlb_step=0.000144, train/loss_step=0.0417, global_step=4415.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4082/5971 [36:33<16:54,  1.86it/s, loss=0.141, v_num=0, train/loss_simple_step=0.065, train/loss_vlb_step=0.000225, train/loss_step=0.065, global_step=4415.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  68%|██████▊   | 4083/5971 [36:34<16:54,  1.86it/s, loss=0.141, v_num=0, train/loss_simple_step=0.065, train/loss_vlb_step=0.000225, train/loss_step=0.065, global_step=4415.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4083/5971 [36:34<16:54,  1.86it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=4415.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4084/5971 [36:36<16:54,  1.86it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.04e-5, train/loss_step=0.00174, global_step=4415.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4084/5971 [36:36<16:54,  1.86it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0266, train/loss_vlb_step=0.000101, train/loss_step=0.0266, global_step=4415.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  68%|██████▊   | 4085/5971 [36:37<16:54,  1.86it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0266, train/loss_vlb_step=0.000101, train/loss_step=0.0266, global_step=4415.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4085/5971 [36:37<16:54,  1.86it/s, loss=0.124, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000833, train/loss_step=0.214, global_step=4416.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  68%|██████▊   | 4086/5971 [36:38<16:53,  1.86it/s, loss=0.124, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000833, train/loss_step=0.214, global_step=4416.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4086/5971 [36:38<16:53,  1.86it/s, loss=0.137, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00188, train/loss_step=0.382, global_step=4416.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  68%|██████▊   | 4087/5971 [36:39<16:53,  1.86it/s, loss=0.137, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00188, train/loss_step=0.382, global_step=4416.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4087/5971 [36:39<16:53,  1.86it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0841, train/loss_vlb_step=0.000288, train/loss_step=0.0841, global_step=4416.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4088/5971 [36:41<16:53,  1.86it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0841, train/loss_vlb_step=0.000288, train/loss_step=0.0841, global_step=4416.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4088/5971 [36:41<16:53,  1.86it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00483, train/loss_vlb_step=2.4e-5, train/loss_step=0.00483, global_step=4416.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4089/5971 [36:42<16:53,  1.86it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00483, train/loss_vlb_step=2.4e-5, train/loss_step=0.00483, global_step=4416.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4089/5971 [36:42<16:53,  1.86it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.11e-5, train/loss_step=0.00199, global_step=4417.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4090/5971 [36:43<16:53,  1.86it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.11e-5, train/loss_step=0.00199, global_step=4417.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  68%|██████▊   | 4090/5971 [36:43<16:53,  1.86it/s, loss=0.131, v_num=0, train/loss_simple_step=0.472, train/loss_vlb_step=0.00368, train/loss_step=0.472, global_step=4417.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  69%|██████▊   | 4091/5971 [36:44<16:52,  1.86it/s, loss=0.131, v_num=0, train/loss_simple_step=0.472, train/loss_vlb_step=0.00368, train/loss_step=0.472, global_step=4417.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▊   | 4091/5971 [36:44<16:52,  1.86it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0137, train/loss_vlb_step=6.19e-5, train/loss_step=0.0137, global_step=4417.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▊   | 4092/5971 [36:46<16:52,  1.86it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0137, train/loss_vlb_step=6.19e-5, train/loss_step=0.0137, global_step=4417.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▊   | 4092/5971 [36:46<16:52,  1.86it/s, loss=0.133, v_num=0, train/loss_simple_step=0.492, train/loss_vlb_step=0.00258, train/loss_step=0.492, global_step=4417.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  69%|██████▊   | 4093/5971 [36:47<16:52,  1.85it/s, loss=0.133, v_num=0, train/loss_simple_step=0.492, train/loss_vlb_step=0.00258, train/loss_step=0.492, global_step=4417.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▊   | 4093/5971 [36:47<16:52,  1.85it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0098, train/loss_vlb_step=4.66e-5, train/loss_step=0.0098, global_step=4418.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▊   | 4094/5971 [36:47<16:52,  1.85it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0098, train/loss_vlb_step=4.66e-5, train/loss_step=0.0098, global_step=4418.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▊   | 4094/5971 [36:47<16:52,  1.85it/s, loss=0.139, v_num=0, train/loss_simple_step=0.256, train/loss_vlb_step=0.000943, train/loss_step=0.256, global_step=4418.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  69%|██████▊   | 4095/5971 [36:48<16:51,  1.85it/s, loss=0.139, v_num=0, train/loss_simple_step=0.256, train/loss_vlb_step=0.000943, train/loss_step=0.256, global_step=4418.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▊   | 4095/5971 [36:48<16:51,  1.85it/s, loss=0.135, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000681, train/loss_step=0.204, global_step=4418.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▊   | 4096/5971 [36:51<16:51,  1.85it/s, loss=0.135, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000681, train/loss_step=0.204, global_step=4418.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▊   | 4096/5971 [36:51<16:51,  1.85it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00298, train/loss_vlb_step=1.64e-5, train/loss_step=0.00298, global_step=4418.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▊   | 4097/5971 [36:51<16:51,  1.85it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00298, train/loss_vlb_step=1.64e-5, train/loss_step=0.00298, global_step=4418.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▊   | 4097/5971 [36:51<16:51,  1.85it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0172, train/loss_vlb_step=7.22e-5, train/loss_step=0.0172, global_step=4419.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  69%|██████▊   | 4098/5971 [36:52<16:51,  1.85it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0172, train/loss_vlb_step=7.22e-5, train/loss_step=0.0172, global_step=4419.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▊   | 4098/5971 [36:52<16:51,  1.85it/s, loss=0.146, v_num=0, train/loss_simple_step=0.488, train/loss_vlb_step=0.00351, train/loss_step=0.488, global_step=4419.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  69%|██████▊   | 4099/5971 [36:53<16:50,  1.85it/s, loss=0.146, v_num=0, train/loss_simple_step=0.488, train/loss_vlb_step=0.00351, train/loss_step=0.488, global_step=4419.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▊   | 4099/5971 [36:53<16:50,  1.85it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=6.48e-5, train/loss_step=0.0156, global_step=4419.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▊   | 4100/5971 [36:55<16:50,  1.85it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=6.48e-5, train/loss_step=0.0156, global_step=4419.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▊   | 4100/5971 [36:55<16:50,  1.85it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00117, train/loss_vlb_step=7.11e-6, train/loss_step=0.00117, global_step=4419.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▊   | 4101/5971 [36:56<16:50,  1.85it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00117, train/loss_vlb_step=7.11e-6, train/loss_step=0.00117, global_step=4419.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▊   | 4101/5971 [36:56<16:50,  1.85it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0861, train/loss_vlb_step=0.000284, train/loss_step=0.0861, global_step=4420.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▊   | 4102/5971 [36:57<16:50,  1.85it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0861, train/loss_vlb_step=0.000284, train/loss_step=0.0861, global_step=4420.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▊   | 4102/5971 [36:57<16:50,  1.85it/s, loss=0.149, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000732, train/loss_step=0.216, global_step=4420.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  69%|██████▊   | 4103/5971 [36:58<16:49,  1.85it/s, loss=0.149, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000732, train/loss_step=0.216, global_step=4420.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▊   | 4103/5971 [36:58<16:49,  1.85it/s, loss=0.174, v_num=0, train/loss_simple_step=0.493, train/loss_vlb_step=0.0145, train/loss_step=0.493, global_step=4420.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  69%|██████▊   | 4104/5971 [37:00<16:49,  1.85it/s, loss=0.174, v_num=0, train/loss_simple_step=0.493, train/loss_vlb_step=0.0145, train/loss_step=0.493, global_step=4420.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▊   | 4104/5971 [37:00<16:49,  1.85it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0019, train/loss_vlb_step=1.11e-5, train/loss_step=0.0019, global_step=4420.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▊   | 4105/5971 [37:01<16:49,  1.85it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0019, train/loss_vlb_step=1.11e-5, train/loss_step=0.0019, global_step=4420.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▊   | 4105/5971 [37:01<16:49,  1.85it/s, loss=0.168, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000421, train/loss_step=0.127, global_step=4421.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  69%|██████▉   | 4106/5971 [37:02<16:49,  1.85it/s, loss=0.168, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000421, train/loss_step=0.127, global_step=4421.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▉   | 4106/5971 [37:02<16:49,  1.85it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0374, train/loss_vlb_step=0.000132, train/loss_step=0.0374, global_step=4421.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▉   | 4107/5971 [37:03<16:48,  1.85it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0374, train/loss_vlb_step=0.000132, train/loss_step=0.0374, global_step=4421.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▉   | 4107/5971 [37:03<16:48,  1.85it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00445, train/loss_vlb_step=2.42e-5, train/loss_step=0.00445, global_step=4421.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▉   | 4108/5971 [37:05<16:49,  1.85it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00445, train/loss_vlb_step=2.42e-5, train/loss_step=0.00445, global_step=4421.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▉   | 4108/5971 [37:05<16:49,  1.85it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0275, train/loss_vlb_step=0.000107, train/loss_step=0.0275, global_step=4421.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  69%|██████▉   | 4109/5971 [37:06<16:48,  1.85it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0275, train/loss_vlb_step=0.000107, train/loss_step=0.0275, global_step=4421.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▉   | 4109/5971 [37:06<16:48,  1.85it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0287, train/loss_vlb_step=0.00011, train/loss_step=0.0287, global_step=4422.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  69%|██████▉   | 4110/5971 [37:07<16:48,  1.85it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0287, train/loss_vlb_step=0.00011, train/loss_step=0.0287, global_step=4422.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▉   | 4110/5971 [37:07<16:48,  1.85it/s, loss=0.168, v_num=0, train/loss_simple_step=0.837, train/loss_vlb_step=0.0337, train/loss_step=0.837, global_step=4422.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  69%|██████▉   | 4111/5971 [37:08<16:47,  1.85it/s, loss=0.168, v_num=0, train/loss_simple_step=0.837, train/loss_vlb_step=0.0337, train/loss_step=0.837, global_step=4422.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▉   | 4111/5971 [37:08<16:47,  1.85it/s, loss=0.175, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000502, train/loss_step=0.148, global_step=4422.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▉   | 4112/5971 [37:10<16:48,  1.84it/s, loss=0.175, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000502, train/loss_step=0.148, global_step=4422.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▉   | 4112/5971 [37:10<16:48,  1.84it/s, loss=0.158, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000552, train/loss_step=0.164, global_step=4422.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▉   | 4113/5971 [37:11<16:47,  1.84it/s, loss=0.158, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000552, train/loss_step=0.164, global_step=4422.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▉   | 4113/5971 [37:11<16:47,  1.84it/s, loss=0.167, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000682, train/loss_step=0.184, global_step=4423.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▉   | 4114/5971 [37:12<16:47,  1.84it/s, loss=0.167, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000682, train/loss_step=0.184, global_step=4423.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▉   | 4114/5971 [37:12<16:47,  1.84it/s, loss=0.171, v_num=0, train/loss_simple_step=0.330, train/loss_vlb_step=0.00173, train/loss_step=0.330, global_step=4423.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  69%|██████▉   | 4115/5971 [37:13<16:46,  1.84it/s, loss=0.171, v_num=0, train/loss_simple_step=0.330, train/loss_vlb_step=0.00173, train/loss_step=0.330, global_step=4423.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▉   | 4115/5971 [37:13<16:46,  1.84it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0123, train/loss_vlb_step=5.67e-5, train/loss_step=0.0123, global_step=4423.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▉   | 4116/5971 [37:15<16:47,  1.84it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0123, train/loss_vlb_step=5.67e-5, train/loss_step=0.0123, global_step=4423.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▉   | 4116/5971 [37:15<16:47,  1.84it/s, loss=0.168, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000462, train/loss_step=0.140, global_step=4423.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  69%|██████▉   | 4117/5971 [37:16<16:46,  1.84it/s, loss=0.168, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000462, train/loss_step=0.140, global_step=4423.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▉   | 4117/5971 [37:16<16:46,  1.84it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0455, train/loss_vlb_step=0.000161, train/loss_step=0.0455, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▉   | 4118/5971 [37:16<16:46,  1.84it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0455, train/loss_vlb_step=0.000161, train/loss_step=0.0455, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▉   | 4118/5971 [37:16<16:46,  1.84it/s, loss=0.177, v_num=0, train/loss_simple_step=0.638, train/loss_vlb_step=0.00944, train/loss_step=0.638, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  69%|██████▉   | 4119/5971 [37:17<16:45,  1.84it/s, loss=0.177, v_num=0, train/loss_simple_step=0.638, train/loss_vlb_step=0.00944, train/loss_step=0.638, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▉   | 4119/5971 [37:17<16:45,  1.84it/s, loss=0.197, v_num=0, train/loss_simple_step=0.408, train/loss_vlb_step=0.00222, train/loss_step=0.408, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▉   | 4120/5971 [37:20<16:46,  1.84it/s, loss=0.197, v_num=0, train/loss_simple_step=0.408, train/loss_vlb_step=0.00222, train/loss_step=0.408, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  69%|██████▉   | 4120/5971 [37:20<16:46,  1.84it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:04,  2.59it/s]
Epoch 7:  69%|██████▉   | 4122/5971 [37:20<16:44,  1.84it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   1%|          | 2/167 [00:00<00:44,  3.69it/s]
Epoch 7:  69%|██████▉   | 4124/5971 [37:20<16:43,  1.84it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   3%|▎         | 5/167 [00:00<00:16,  9.76it/s]
Epoch 7:  69%|██████▉   | 4127/5971 [37:20<16:41,  1.84it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.60it/s]
Epoch 7:  69%|██████▉   | 4130/5971 [37:21<16:38,  1.84it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   7%|▋         | 11/167 [00:00<00:09, 17.02it/s]
Epoch 7:  69%|██████▉   | 4133/5971 [37:21<16:36,  1.84it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.92it/s]
Epoch 7:  69%|██████▉   | 4136/5971 [37:21<16:34,  1.85it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  10%|█         | 17/167 [00:01<00:06, 21.53it/s]
Epoch 7:  69%|██████▉   | 4139/5971 [37:21<16:31,  1.85it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 23.56it/s]
Epoch 7:  69%|██████▉   | 4142/5971 [37:21<16:29,  1.85it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  14%|█▍        | 23/167 [00:01<00:05, 24.53it/s]
Epoch 7:  69%|██████▉   | 4145/5971 [37:21<16:27,  1.85it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  16%|█▌        | 26/167 [00:01<00:06, 23.39it/s]
Epoch 7:  69%|██████▉   | 4148/5971 [37:21<16:24,  1.85it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 23.53it/s]
Epoch 7:  70%|██████▉   | 4151/5971 [37:21<16:22,  1.85it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 24.32it/s]
Epoch 7:  70%|██████▉   | 4154/5971 [37:21<16:20,  1.85it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  21%|██        | 35/167 [00:01<00:05, 25.09it/s]
Epoch 7:  70%|██████▉   | 4157/5971 [37:22<16:18,  1.85it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  23%|██▎       | 38/167 [00:01<00:05, 25.52it/s]
Epoch 7:  70%|██████▉   | 4160/5971 [37:22<16:15,  1.86it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 25.32it/s]
Epoch 7:  70%|██████▉   | 4163/5971 [37:22<16:13,  1.86it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 24.64it/s]
Epoch 7:  70%|██████▉   | 4166/5971 [37:22<16:11,  1.86it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 25.76it/s]
Epoch 7:  70%|██████▉   | 4169/5971 [37:22<16:09,  1.86it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 26.61it/s]
Epoch 7:  70%|██████▉   | 4172/5971 [37:22<16:06,  1.86it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 26.64it/s]
Epoch 7:  70%|██████▉   | 4175/5971 [37:22<16:04,  1.86it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 26.43it/s]
Epoch 7:  70%|██████▉   | 4178/5971 [37:22<16:02,  1.86it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  35%|███▌      | 59/167 [00:02<00:03, 27.28it/s]
Epoch 7:  70%|███████   | 4181/5971 [37:22<16:00,  1.86it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  37%|███▋      | 62/167 [00:02<00:03, 26.91it/s]
Epoch 7:  70%|███████   | 4184/5971 [37:23<15:57,  1.87it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  39%|███▉      | 65/167 [00:03<00:03, 26.52it/s]
Epoch 7:  70%|███████   | 4187/5971 [37:23<15:55,  1.87it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  41%|████      | 68/167 [00:03<00:03, 25.03it/s]
Epoch 7:  70%|███████   | 4190/5971 [37:23<15:53,  1.87it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 25.31it/s]
Epoch 7:  70%|███████   | 4193/5971 [37:23<15:51,  1.87it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 26.34it/s]
Epoch 7:  70%|███████   | 4196/5971 [37:23<15:48,  1.87it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 26.77it/s]
Epoch 7:  70%|███████   | 4199/5971 [37:23<15:46,  1.87it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 28.18it/s]
Epoch 7:  70%|███████   | 4203/5971 [37:23<15:43,  1.87it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  50%|█████     | 84/167 [00:03<00:03, 26.02it/s]
Epoch 7:  70%|███████   | 4207/5971 [37:23<15:40,  1.88it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  52%|█████▏    | 87/167 [00:03<00:03, 26.04it/s]

Validating:  54%|█████▍    | 90/167 [00:03<00:03, 25.56it/s]
Epoch 7:  71%|███████   | 4211/5971 [37:24<15:37,  1.88it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 26.12it/s]
Epoch 7:  71%|███████   | 4215/5971 [37:24<15:34,  1.88it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 25.64it/s]
Epoch 7:  71%|███████   | 4219/5971 [37:24<15:31,  1.88it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 26.54it/s][A

Validating:  61%|██████    | 102/167 [00:04<00:02, 26.92it/s][A
Epoch 7:  71%|███████   | 4223/5971 [37:24<15:28,  1.88it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 25.30it/s][A
Epoch 7:  71%|███████   | 4227/5971 [37:24<15:25,  1.88it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 24.30it/s][A
Epoch 7:  71%|███████   | 4231/5971 [37:24<15:22,  1.89it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 25.56it/s][A

Validating:  68%|██████▊   | 114/167 [00:04<00:02, 25.59it/s][A
Epoch 7:  71%|███████   | 4235/5971 [37:25<15:20,  1.89it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  70%|███████   | 117/167 [00:05<00:01, 26.68it/s][A
Epoch 7:  71%|███████   | 4239/5971 [37:25<15:17,  1.89it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 25.60it/s][A
Epoch 7:  71%|███████   | 4243/5971 [37:25<15:14,  1.89it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 25.97it/s][A

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 27.02it/s][A
Epoch 7:  71%|███████   | 4247/5971 [37:25<15:11,  1.89it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 27.90it/s][A
Epoch 7:  71%|███████   | 4251/5971 [37:25<15:08,  1.89it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 26.67it/s][A
Epoch 7:  71%|███████▏  | 4255/5971 [37:25<15:05,  1.90it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 26.60it/s][A
Epoch 7:  71%|███████▏  | 4259/5971 [37:25<15:02,  1.90it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 24.62it/s][A

Validating:  85%|████████▌ | 142/167 [00:05<00:00, 25.68it/s][A
Epoch 7:  71%|███████▏  | 4263/5971 [37:26<14:59,  1.90it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 26.01it/s][A
Epoch 7:  71%|███████▏  | 4267/5971 [37:26<14:56,  1.90it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 26.46it/s][A
Epoch 7:  72%|███████▏  | 4271/5971 [37:26<14:53,  1.90it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 25.77it/s][A

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 24.59it/s][A
Epoch 7:  72%|███████▏  | 4275/5971 [37:26<14:51,  1.90it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 24.92it/s][A
Epoch 7:  72%|███████▏  | 4279/5971 [37:26<14:48,  1.90it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 25.81it/s][A
Epoch 7:  72%|███████▏  | 4283/5971 [37:26<14:45,  1.91it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 25.61it/s][A
Epoch 7:  72%|███████▏  | 4287/5971 [37:27<14:42,  1.91it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating: 100%|██████████| 167/167 [00:06<00:00, 27.03it/s][A
Epoch 7:  72%|███████▏  | 4288/5971 [37:27<14:41,  1.91it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0851, train/loss_vlb_step=0.000281, train/loss_step=0.0851, global_step=4424.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4289/5971 [37:28<14:41,  1.91it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.93e-5, train/loss_step=0.0108, global_step=4425.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  72%|███████▏  | 4290/5971 [37:29<14:41,  1.91it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00195, train/loss_vlb_step=1.13e-5, train/loss_step=0.00195, global_step=4425.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4291/5971 [37:30<14:40,  1.91it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00195, train/loss_vlb_step=1.13e-5, train/loss_step=0.00195, global_step=4425.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4291/5971 [37:30<14:40,  1.91it/s, loss=0.175, v_num=0, train/loss_simple_step=0.266, train/loss_vlb_step=0.000978, train/loss_step=0.266, global_step=4425.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  72%|███████▏  | 4292/5971 [37:32<14:40,  1.91it/s, loss=0.198, v_num=0, train/loss_simple_step=0.470, train/loss_vlb_step=0.00256, train/loss_step=0.470, global_step=4425.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  72%|███████▏  | 4293/5971 [37:33<14:40,  1.91it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0279, train/loss_vlb_step=0.000106, train/loss_step=0.0279, global_step=4426.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4294/5971 [37:34<14:40,  1.91it/s, loss=0.198, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000394, train/loss_step=0.120, global_step=4426.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  72%|███████▏  | 4295/5971 [37:34<14:39,  1.91it/s, loss=0.198, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000394, train/loss_step=0.120, global_step=4426.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4295/5971 [37:34<14:39,  1.91it/s, loss=0.203, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000384, train/loss_step=0.117, global_step=4426.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4296/5971 [37:37<14:39,  1.90it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.44e-5, train/loss_step=0.0127, global_step=4426.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4297/5971 [37:37<14:39,  1.90it/s, loss=0.216, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00138, train/loss_step=0.305, global_step=4427.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  72%|███████▏  | 4298/5971 [37:38<14:39,  1.90it/s, loss=0.203, v_num=0, train/loss_simple_step=0.581, train/loss_vlb_step=0.00563, train/loss_step=0.581, global_step=4427.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4299/5971 [37:39<14:38,  1.90it/s, loss=0.203, v_num=0, train/loss_simple_step=0.581, train/loss_vlb_step=0.00563, train/loss_step=0.581, global_step=4427.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4299/5971 [37:39<14:38,  1.90it/s, loss=0.196, v_num=0, train/loss_simple_step=0.00316, train/loss_vlb_step=1.68e-5, train/loss_step=0.00316, global_step=4427.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4300/5971 [37:41<14:38,  1.90it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.09e-5, train/loss_step=0.0139, global_step=4427.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  72%|███████▏  | 4301/5971 [37:42<14:38,  1.90it/s, loss=0.18, v_num=0, train/loss_simple_step=0.00591, train/loss_vlb_step=3.13e-5, train/loss_step=0.00591, global_step=4428.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4302/5971 [37:43<14:37,  1.90it/s, loss=0.18, v_num=0, train/loss_simple_step=0.332, train/loss_vlb_step=0.00153, train/loss_step=0.332, global_step=4428.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  72%|███████▏  | 4303/5971 [37:44<14:37,  1.90it/s, loss=0.18, v_num=0, train/loss_simple_step=0.332, train/loss_vlb_step=0.00153, train/loss_step=0.332, global_step=4428.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4303/5971 [37:44<14:37,  1.90it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.000256, train/loss_step=0.0771, global_step=4428.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4304/5971 [37:46<14:37,  1.90it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00398, train/loss_vlb_step=2.07e-5, train/loss_step=0.00398, global_step=4428.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4305/5971 [37:47<14:37,  1.90it/s, loss=0.174, v_num=0, train/loss_simple_step=0.00348, train/loss_vlb_step=1.83e-5, train/loss_step=0.00348, global_step=4429.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4306/5971 [37:48<14:36,  1.90it/s, loss=0.162, v_num=0, train/loss_simple_step=0.405, train/loss_vlb_step=0.00382, train/loss_step=0.405, global_step=4429.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  72%|███████▏  | 4307/5971 [37:49<14:36,  1.90it/s, loss=0.162, v_num=0, train/loss_simple_step=0.405, train/loss_vlb_step=0.00382, train/loss_step=0.405, global_step=4429.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4307/5971 [37:49<14:36,  1.90it/s, loss=0.159, v_num=0, train/loss_simple_step=0.334, train/loss_vlb_step=0.00127, train/loss_step=0.334, global_step=4429.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4308/5971 [37:51<14:36,  1.90it/s, loss=0.165, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.00073, train/loss_step=0.202, global_step=4429.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4309/5971 [37:52<14:36,  1.90it/s, loss=0.182, v_num=0, train/loss_simple_step=0.351, train/loss_vlb_step=0.00186, train/loss_step=0.351, global_step=4430.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4310/5971 [37:53<14:35,  1.90it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0319, train/loss_vlb_step=0.000118, train/loss_step=0.0319, global_step=4430.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4311/5971 [37:53<14:35,  1.90it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0319, train/loss_vlb_step=0.000118, train/loss_step=0.0319, global_step=4430.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4311/5971 [37:53<14:35,  1.90it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0548, train/loss_vlb_step=0.000192, train/loss_step=0.0548, global_step=4430.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4312/5971 [37:56<14:35,  1.89it/s, loss=0.163, v_num=0, train/loss_simple_step=0.276, train/loss_vlb_step=0.00119, train/loss_step=0.276, global_step=4430.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  72%|███████▏  | 4313/5971 [37:57<14:35,  1.89it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00183, train/loss_vlb_step=1.1e-5, train/loss_step=0.00183, global_step=4431.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4314/5971 [37:57<14:34,  1.89it/s, loss=0.164, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000655, train/loss_step=0.168, global_step=4431.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  72%|███████▏  | 4315/5971 [37:58<14:34,  1.89it/s, loss=0.164, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000655, train/loss_step=0.168, global_step=4431.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4315/5971 [37:58<14:34,  1.89it/s, loss=0.166, v_num=0, train/loss_simple_step=0.167, train/loss_vlb_step=0.000598, train/loss_step=0.167, global_step=4431.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4316/5971 [38:00<14:34,  1.89it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0855, train/loss_vlb_step=0.000286, train/loss_step=0.0855, global_step=4431.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4317/5971 [38:01<14:34,  1.89it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0645, train/loss_vlb_step=0.000216, train/loss_step=0.0645, global_step=4432.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4318/5971 [38:02<14:33,  1.89it/s, loss=0.139, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000774, train/loss_step=0.207, global_step=4432.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  72%|███████▏  | 4319/5971 [38:03<14:33,  1.89it/s, loss=0.139, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000774, train/loss_step=0.207, global_step=4432.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4319/5971 [38:03<14:33,  1.89it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0314, train/loss_vlb_step=0.000123, train/loss_step=0.0314, global_step=4432.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4320/5971 [38:05<14:33,  1.89it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00739, train/loss_vlb_step=3.55e-5, train/loss_step=0.00739, global_step=4432.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4321/5971 [38:06<14:32,  1.89it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0954, train/loss_vlb_step=0.000317, train/loss_step=0.0954, global_step=4433.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4322/5971 [38:07<14:32,  1.89it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0348, train/loss_vlb_step=0.000122, train/loss_step=0.0348, global_step=4433.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  72%|███████▏  | 4323/5971 [38:08<14:32,  1.89it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0348, train/loss_vlb_step=0.000122, train/loss_step=0.0348, global_step=4433.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4323/5971 [38:08<14:32,  1.89it/s, loss=0.138, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.000861, train/loss_step=0.234, global_step=4433.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  72%|███████▏  | 4324/5971 [38:10<14:32,  1.89it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00907, train/loss_vlb_step=4.12e-5, train/loss_step=0.00907, global_step=4433.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4325/5971 [38:11<14:31,  1.89it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0361, train/loss_vlb_step=0.000134, train/loss_step=0.0361, global_step=4434.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  72%|███████▏  | 4326/5971 [38:12<14:31,  1.89it/s, loss=0.167, v_num=0, train/loss_simple_step=0.945, train/loss_vlb_step=0.475, train/loss_step=0.945, global_step=4434.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  72%|███████▏  | 4327/5971 [38:13<14:31,  1.89it/s, loss=0.167, v_num=0, train/loss_simple_step=0.945, train/loss_vlb_step=0.475, train/loss_step=0.945, global_step=4434.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4327/5971 [38:13<14:31,  1.89it/s, loss=0.157, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000473, train/loss_step=0.141, global_step=4434.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  72%|███████▏  | 4328/5971 [38:15<14:31,  1.89it/s, loss=0.16, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00155, train/loss_step=0.267, global_step=4434.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  73%|███████▎  | 4329/5971 [38:16<14:30,  1.89it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0602, train/loss_vlb_step=0.000204, train/loss_step=0.0602, global_step=4435.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4330/5971 [38:16<14:30,  1.89it/s, loss=0.156, v_num=0, train/loss_simple_step=0.238, train/loss_vlb_step=0.000846, train/loss_step=0.238, global_step=4435.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  73%|███████▎  | 4331/5971 [38:17<14:29,  1.89it/s, loss=0.156, v_num=0, train/loss_simple_step=0.238, train/loss_vlb_step=0.000846, train/loss_step=0.238, global_step=4435.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4331/5971 [38:17<14:29,  1.89it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=4.91e-5, train/loss_step=0.0113, global_step=4435.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4332/5971 [38:19<14:29,  1.88it/s, loss=0.162, v_num=0, train/loss_simple_step=0.430, train/loss_vlb_step=0.00267, train/loss_step=0.430, global_step=4435.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  73%|███████▎  | 4333/5971 [38:20<14:29,  1.88it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0021, train/loss_vlb_step=1.23e-5, train/loss_step=0.0021, global_step=4436.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4334/5971 [38:21<14:29,  1.88it/s, loss=0.159, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000383, train/loss_step=0.113, global_step=4436.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  73%|███████▎  | 4335/5971 [38:22<14:28,  1.88it/s, loss=0.159, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000383, train/loss_step=0.113, global_step=4436.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4335/5971 [38:22<14:28,  1.88it/s, loss=0.161, v_num=0, train/loss_simple_step=0.212, train/loss_vlb_step=0.000827, train/loss_step=0.212, global_step=4436.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4336/5971 [38:24<14:28,  1.88it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0438, train/loss_vlb_step=0.000157, train/loss_step=0.0438, global_step=4436.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4337/5971 [38:25<14:28,  1.88it/s, loss=0.181, v_num=0, train/loss_simple_step=0.502, train/loss_vlb_step=0.00333, train/loss_step=0.502, global_step=4437.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  73%|███████▎  | 4338/5971 [38:26<14:28,  1.88it/s, loss=0.192, v_num=0, train/loss_simple_step=0.423, train/loss_vlb_step=0.00225, train/loss_step=0.423, global_step=4437.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4339/5971 [38:27<14:27,  1.88it/s, loss=0.192, v_num=0, train/loss_simple_step=0.423, train/loss_vlb_step=0.00225, train/loss_step=0.423, global_step=4437.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4339/5971 [38:27<14:27,  1.88it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0219, train/loss_vlb_step=8.42e-5, train/loss_step=0.0219, global_step=4437.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4340/5971 [38:29<14:27,  1.88it/s, loss=0.219, v_num=0, train/loss_simple_step=0.558, train/loss_vlb_step=0.00444, train/loss_step=0.558, global_step=4437.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  73%|███████▎  | 4341/5971 [38:30<14:27,  1.88it/s, loss=0.214, v_num=0, train/loss_simple_step=0.00573, train/loss_vlb_step=2.75e-5, train/loss_step=0.00573, global_step=4438.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4342/5971 [38:31<14:26,  1.88it/s, loss=0.215, v_num=0, train/loss_simple_step=0.0464, train/loss_vlb_step=0.000171, train/loss_step=0.0464, global_step=4438.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  73%|███████▎  | 4343/5971 [38:32<14:26,  1.88it/s, loss=0.215, v_num=0, train/loss_simple_step=0.0464, train/loss_vlb_step=0.000171, train/loss_step=0.0464, global_step=4438.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4343/5971 [38:32<14:26,  1.88it/s, loss=0.208, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000341, train/loss_step=0.102, global_step=4438.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  73%|███████▎  | 4344/5971 [38:34<14:26,  1.88it/s, loss=0.245, v_num=0, train/loss_simple_step=0.751, train/loss_vlb_step=0.0326, train/loss_step=0.751, global_step=4438.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  73%|███████▎  | 4345/5971 [38:35<14:26,  1.88it/s, loss=0.245, v_num=0, train/loss_simple_step=0.0185, train/loss_vlb_step=7.45e-5, train/loss_step=0.0185, global_step=4439.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4346/5971 [38:36<14:25,  1.88it/s, loss=0.198, v_num=0, train/loss_simple_step=0.00431, train/loss_vlb_step=2.35e-5, train/loss_step=0.00431, global_step=4439.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4347/5971 [38:36<14:25,  1.88it/s, loss=0.198, v_num=0, train/loss_simple_step=0.00431, train/loss_vlb_step=2.35e-5, train/loss_step=0.00431, global_step=4439.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4347/5971 [38:36<14:25,  1.88it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00232, train/loss_vlb_step=1.29e-5, train/loss_step=0.00232, global_step=4439.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4348/5971 [38:39<14:25,  1.88it/s, loss=0.185, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000502, train/loss_step=0.152, global_step=4439.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  73%|███████▎  | 4349/5971 [38:39<14:25,  1.88it/s, loss=0.191, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000611, train/loss_step=0.179, global_step=4440.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4350/5971 [38:40<14:24,  1.87it/s, loss=0.179, v_num=0, train/loss_simple_step=0.00914, train/loss_vlb_step=4.39e-5, train/loss_step=0.00914, global_step=4440.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4351/5971 [38:41<14:24,  1.87it/s, loss=0.179, v_num=0, train/loss_simple_step=0.00914, train/loss_vlb_step=4.39e-5, train/loss_step=0.00914, global_step=4440.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4351/5971 [38:41<14:24,  1.87it/s, loss=0.223, v_num=0, train/loss_simple_step=0.880, train/loss_vlb_step=0.0645, train/loss_step=0.880, global_step=4440.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]     
Epoch 7:  73%|███████▎  | 4352/5971 [38:43<14:24,  1.87it/s, loss=0.202, v_num=0, train/loss_simple_step=0.00775, train/loss_vlb_step=3.76e-5, train/loss_step=0.00775, global_step=4440.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4353/5971 [38:44<14:23,  1.87it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0563, train/loss_vlb_step=0.000192, train/loss_step=0.0563, global_step=4441.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  73%|███████▎  | 4354/5971 [38:45<14:23,  1.87it/s, loss=0.199, v_num=0, train/loss_simple_step=0.00484, train/loss_vlb_step=2.53e-5, train/loss_step=0.00484, global_step=4441.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4355/5971 [38:46<14:23,  1.87it/s, loss=0.199, v_num=0, train/loss_simple_step=0.00484, train/loss_vlb_step=2.53e-5, train/loss_step=0.00484, global_step=4441.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4355/5971 [38:46<14:23,  1.87it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0991, train/loss_vlb_step=0.000329, train/loss_step=0.0991, global_step=4441.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  73%|███████▎  | 4356/5971 [38:48<14:23,  1.87it/s, loss=0.197, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000397, train/loss_step=0.120, global_step=4441.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  73%|███████▎  | 4357/5971 [38:49<14:22,  1.87it/s, loss=0.182, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000636, train/loss_step=0.189, global_step=4442.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4358/5971 [38:50<14:22,  1.87it/s, loss=0.165, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000334, train/loss_step=0.102, global_step=4442.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4359/5971 [38:51<14:21,  1.87it/s, loss=0.165, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000334, train/loss_step=0.102, global_step=4442.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4359/5971 [38:51<14:21,  1.87it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0209, train/loss_vlb_step=9.01e-5, train/loss_step=0.0209, global_step=4442.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4360/5971 [38:53<14:22,  1.87it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0167, train/loss_vlb_step=7.09e-5, train/loss_step=0.0167, global_step=4442.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4361/5971 [38:54<14:21,  1.87it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00924, train/loss_vlb_step=4.18e-5, train/loss_step=0.00924, global_step=4443.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4362/5971 [38:55<14:21,  1.87it/s, loss=0.144, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000566, train/loss_step=0.166, global_step=4443.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  73%|███████▎  | 4363/5971 [38:56<14:20,  1.87it/s, loss=0.144, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000566, train/loss_step=0.166, global_step=4443.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4363/5971 [38:56<14:20,  1.87it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00656, train/loss_vlb_step=3.28e-5, train/loss_step=0.00656, global_step=4443.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4364/5971 [38:58<14:20,  1.87it/s, loss=0.109, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000466, train/loss_step=0.136, global_step=4443.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  73%|███████▎  | 4365/5971 [38:59<14:20,  1.87it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0306, train/loss_vlb_step=0.000117, train/loss_step=0.0306, global_step=4444.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4366/5971 [39:00<14:20,  1.87it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0255, train/loss_vlb_step=9.52e-5, train/loss_step=0.0255, global_step=4444.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4367/5971 [39:01<14:19,  1.87it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0255, train/loss_vlb_step=9.52e-5, train/loss_step=0.0255, global_step=4444.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4367/5971 [39:01<14:19,  1.87it/s, loss=0.146, v_num=0, train/loss_simple_step=0.713, train/loss_vlb_step=0.0409, train/loss_step=0.713, global_step=4444.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  73%|███████▎  | 4368/5971 [39:03<14:19,  1.86it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0884, train/loss_vlb_step=0.0003, train/loss_step=0.0884, global_step=4444.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4369/5971 [39:04<14:19,  1.86it/s, loss=0.154, v_num=0, train/loss_simple_step=0.400, train/loss_vlb_step=0.00261, train/loss_step=0.400, global_step=4445.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  73%|███████▎  | 4370/5971 [39:05<14:18,  1.86it/s, loss=0.166, v_num=0, train/loss_simple_step=0.245, train/loss_vlb_step=0.000847, train/loss_step=0.245, global_step=4445.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4371/5971 [39:05<14:18,  1.86it/s, loss=0.166, v_num=0, train/loss_simple_step=0.245, train/loss_vlb_step=0.000847, train/loss_step=0.245, global_step=4445.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4371/5971 [39:05<14:18,  1.86it/s, loss=0.129, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000503, train/loss_step=0.152, global_step=4445.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4372/5971 [39:08<14:18,  1.86it/s, loss=0.144, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00127, train/loss_step=0.292, global_step=4445.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  73%|███████▎  | 4373/5971 [39:09<14:18,  1.86it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0895, train/loss_vlb_step=0.000294, train/loss_step=0.0895, global_step=4446.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4374/5971 [39:09<14:17,  1.86it/s, loss=0.156, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000961, train/loss_step=0.225, global_step=4446.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  73%|███████▎  | 4375/5971 [39:10<14:17,  1.86it/s, loss=0.156, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000961, train/loss_step=0.225, global_step=4446.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4375/5971 [39:10<14:17,  1.86it/s, loss=0.161, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000686, train/loss_step=0.201, global_step=4446.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4376/5971 [39:12<14:17,  1.86it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0332, train/loss_vlb_step=0.000114, train/loss_step=0.0332, global_step=4446.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4377/5971 [39:13<14:16,  1.86it/s, loss=0.156, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000528, train/loss_step=0.160, global_step=4447.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  73%|███████▎  | 4378/5971 [39:14<14:16,  1.86it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0688, train/loss_vlb_step=0.00023, train/loss_step=0.0688, global_step=4447.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4379/5971 [39:15<14:16,  1.86it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0688, train/loss_vlb_step=0.00023, train/loss_step=0.0688, global_step=4447.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4379/5971 [39:15<14:16,  1.86it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00445, train/loss_vlb_step=2.11e-5, train/loss_step=0.00445, global_step=4447.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4380/5971 [39:17<14:16,  1.86it/s, loss=0.167, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.00126, train/loss_step=0.289, global_step=4447.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  73%|███████▎  | 4381/5971 [39:18<14:15,  1.86it/s, loss=0.177, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.000776, train/loss_step=0.220, global_step=4448.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4382/5971 [39:19<14:15,  1.86it/s, loss=0.213, v_num=0, train/loss_simple_step=0.884, train/loss_vlb_step=0.112, train/loss_step=0.884, global_step=4448.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  73%|███████▎  | 4383/5971 [39:20<14:15,  1.86it/s, loss=0.213, v_num=0, train/loss_simple_step=0.884, train/loss_vlb_step=0.112, train/loss_step=0.884, global_step=4448.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4383/5971 [39:20<14:15,  1.86it/s, loss=0.218, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000356, train/loss_step=0.108, global_step=4448.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4384/5971 [39:22<14:15,  1.86it/s, loss=0.212, v_num=0, train/loss_simple_step=0.00262, train/loss_vlb_step=1.46e-5, train/loss_step=0.00262, global_step=4448.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4385/5971 [39:23<14:14,  1.86it/s, loss=0.21, v_num=0, train/loss_simple_step=0.00175, train/loss_vlb_step=1.05e-5, train/loss_step=0.00175, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  73%|███████▎  | 4386/5971 [39:24<14:14,  1.86it/s, loss=0.21, v_num=0, train/loss_simple_step=0.0221, train/loss_vlb_step=8.53e-5, train/loss_step=0.0221, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  73%|███████▎  | 4387/5971 [39:25<14:13,  1.86it/s, loss=0.21, v_num=0, train/loss_simple_step=0.0221, train/loss_vlb_step=8.53e-5, train/loss_step=0.0221, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4387/5971 [39:25<14:13,  1.86it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0767, train/loss_vlb_step=0.000254, train/loss_step=0.0767, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  73%|███████▎  | 4388/5971 [39:27<14:13,  1.85it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:13,  2.25it/s][A

Validating:   1%|          | 2/167 [00:00<00:45,  3.61it/s][A
Epoch 7:  74%|███████▎  | 4391/5971 [39:28<14:11,  1.85it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   2%|▏         | 4/167 [00:00<00:22,  7.41it/s][A
Epoch 7:  74%|███████▎  | 4395/5971 [39:28<14:09,  1.86it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   4%|▍         | 7/167 [00:00<00:12, 12.72it/s][A

Validating:   6%|▌         | 10/167 [00:00<00:09, 16.58it/s][A
Epoch 7:  74%|███████▎  | 4399/5971 [39:28<14:06,  1.86it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   8%|▊         | 13/167 [00:01<00:08, 18.94it/s][A
Epoch 7:  74%|███████▎  | 4403/5971 [39:28<14:03,  1.86it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  10%|▉         | 16/167 [00:01<00:07, 19.77it/s][A
Epoch 7:  74%|███████▍  | 4407/5971 [39:28<14:00,  1.86it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  11%|█▏        | 19/167 [00:01<00:06, 21.79it/s][A

Validating:  13%|█▎        | 22/167 [00:01<00:06, 22.39it/s][A
Epoch 7:  74%|███████▍  | 4411/5971 [39:28<13:57,  1.86it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  15%|█▍        | 25/167 [00:01<00:06, 23.41it/s][A
Epoch 7:  74%|███████▍  | 4415/5971 [39:29<13:54,  1.86it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  17%|█▋        | 28/167 [00:01<00:05, 24.21it/s][A
Epoch 7:  74%|███████▍  | 4419/5971 [39:29<13:51,  1.87it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  19%|█▊        | 31/167 [00:01<00:05, 25.09it/s][A

Validating:  20%|██        | 34/167 [00:01<00:05, 25.42it/s][A
Epoch 7:  74%|███████▍  | 4423/5971 [39:29<13:49,  1.87it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  22%|██▏       | 37/167 [00:02<00:05, 25.15it/s][A
Epoch 7:  74%|███████▍  | 4427/5971 [39:29<13:46,  1.87it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  24%|██▍       | 40/167 [00:02<00:05, 25.40it/s][A
Epoch 7:  74%|███████▍  | 4431/5971 [39:29<13:43,  1.87it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  26%|██▌       | 43/167 [00:02<00:04, 25.77it/s][A

Validating:  28%|██▊       | 46/167 [00:02<00:04, 26.73it/s][A
Epoch 7:  74%|███████▍  | 4435/5971 [39:29<13:40,  1.87it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  29%|██▉       | 49/167 [00:02<00:04, 26.99it/s][A
Epoch 7:  74%|███████▍  | 4439/5971 [39:29<13:37,  1.87it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  31%|███       | 52/167 [00:02<00:04, 27.32it/s][A
Epoch 7:  74%|███████▍  | 4443/5971 [39:30<13:34,  1.88it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  34%|███▎      | 56/167 [00:02<00:03, 28.39it/s][A
Epoch 7:  74%|███████▍  | 4447/5971 [39:30<13:32,  1.88it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  35%|███▌      | 59/167 [00:02<00:03, 28.15it/s][A

Validating:  37%|███▋      | 62/167 [00:02<00:03, 26.89it/s][A
Epoch 7:  75%|███████▍  | 4451/5971 [39:30<13:29,  1.88it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  40%|███▉      | 66/167 [00:03<00:03, 27.17it/s][A
Epoch 7:  75%|███████▍  | 4455/5971 [39:30<13:26,  1.88it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 25.86it/s][A
Epoch 7:  75%|███████▍  | 4459/5971 [39:30<13:23,  1.88it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 25.57it/s][A
Epoch 7:  75%|███████▍  | 4463/5971 [39:30<13:20,  1.88it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 25.45it/s][A

Validating:  47%|████▋     | 78/167 [00:03<00:03, 25.62it/s][A
Epoch 7:  75%|███████▍  | 4467/5971 [39:31<13:18,  1.88it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 26.56it/s][A
Epoch 7:  75%|███████▍  | 4471/5971 [39:31<13:15,  1.89it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  51%|█████     | 85/167 [00:03<00:03, 27.04it/s][A
Epoch 7:  75%|███████▍  | 4475/5971 [39:31<13:12,  1.89it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  53%|█████▎    | 89/167 [00:03<00:02, 27.92it/s][A
Epoch 7:  75%|███████▌  | 4479/5971 [39:31<13:09,  1.89it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 28.30it/s][A
Epoch 7:  75%|███████▌  | 4483/5971 [39:31<13:07,  1.89it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 26.37it/s][A

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 25.89it/s][A
Epoch 7:  75%|███████▌  | 4487/5971 [39:31<13:04,  1.89it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  60%|██████    | 101/167 [00:04<00:02, 25.56it/s][A
Epoch 7:  75%|███████▌  | 4491/5971 [39:31<13:01,  1.89it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 24.81it/s][A
Epoch 7:  75%|███████▌  | 4495/5971 [39:32<12:58,  1.90it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 23.72it/s][A

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 24.11it/s][A
Epoch 7:  75%|███████▌  | 4499/5971 [39:32<12:55,  1.90it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  68%|██████▊   | 113/167 [00:04<00:02, 24.82it/s][A
Epoch 7:  75%|███████▌  | 4503/5971 [39:32<12:53,  1.90it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  69%|██████▉   | 116/167 [00:05<00:02, 25.23it/s][A
Epoch 7:  75%|███████▌  | 4507/5971 [39:32<12:50,  1.90it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 24.01it/s][A

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 24.45it/s][A
Epoch 7:  76%|███████▌  | 4511/5971 [39:32<12:47,  1.90it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 24.55it/s][A
Epoch 7:  76%|███████▌  | 4515/5971 [39:32<12:45,  1.90it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 24.55it/s][A
Epoch 7:  76%|███████▌  | 4519/5971 [39:33<12:42,  1.90it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 25.54it/s][A
Epoch 7:  76%|███████▌  | 4523/5971 [39:33<12:39,  1.91it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  81%|████████  | 135/167 [00:05<00:01, 27.02it/s][A

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 27.55it/s][A
Epoch 7:  76%|███████▌  | 4527/5971 [39:33<12:36,  1.91it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  84%|████████▍ | 141/167 [00:05<00:00, 27.97it/s][A
Epoch 7:  76%|███████▌  | 4531/5971 [39:33<12:34,  1.91it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 25.02it/s][A
Epoch 7:  76%|███████▌  | 4535/5971 [39:33<12:31,  1.91it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 24.32it/s][A

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 24.05it/s][A
Epoch 7:  76%|███████▌  | 4539/5971 [39:33<12:28,  1.91it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 24.54it/s][A
Epoch 7:  76%|███████▌  | 4543/5971 [39:33<12:26,  1.91it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 24.50it/s][A
Epoch 7:  76%|███████▌  | 4547/5971 [39:34<12:23,  1.92it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 24.75it/s][A

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 25.55it/s][A
Epoch 7:  76%|███████▌  | 4551/5971 [39:34<12:20,  1.92it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 26.93it/s][A
Epoch 7:  76%|███████▋  | 4555/5971 [39:34<12:17,  1.92it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  76%|███████▋  | 4556/5971 [39:34<12:17,  1.92it/s, loss=0.206, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4449.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Epoch 7:  76%|███████▋  | 4557/5971 [39:35<12:16,  1.92it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00883, train/loss_vlb_step=4.18e-5, train/loss_step=0.00883, global_step=4450.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  76%|███████▋  | 4558/5971 [39:36<12:16,  1.92it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0316, train/loss_vlb_step=0.000113, train/loss_step=0.0316, global_step=4450.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  76%|███████▋  | 4559/5971 [39:37<12:16,  1.92it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0316, train/loss_vlb_step=0.000113, train/loss_step=0.0316, global_step=4450.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  76%|███████▋  | 4559/5971 [39:37<12:16,  1.92it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0923, train/loss_vlb_step=0.000305, train/loss_step=0.0923, global_step=4450.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  76%|███████▋  | 4560/5971 [39:39<12:16,  1.92it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0123, train/loss_vlb_step=5.6e-5, train/loss_step=0.0123, global_step=4450.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  76%|███████▋  | 4561/5971 [39:40<12:15,  1.92it/s, loss=0.162, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000517, train/loss_step=0.157, global_step=4451.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  76%|███████▋  | 4562/5971 [39:41<12:15,  1.92it/s, loss=0.161, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.0007, train/loss_step=0.206, global_step=4451.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  76%|███████▋  | 4563/5971 [39:42<12:15,  1.92it/s, loss=0.161, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.0007, train/loss_step=0.206, global_step=4451.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  76%|███████▋  | 4563/5971 [39:42<12:15,  1.92it/s, loss=0.186, v_num=0, train/loss_simple_step=0.695, train/loss_vlb_step=0.0328, train/loss_step=0.695, global_step=4451.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  76%|███████▋  | 4564/5971 [39:44<12:15,  1.91it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0267, train/loss_vlb_step=0.000101, train/loss_step=0.0267, global_step=4451.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  76%|███████▋  | 4565/5971 [39:45<12:14,  1.91it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0328, train/loss_vlb_step=0.000112, train/loss_step=0.0328, global_step=4452.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  76%|███████▋  | 4566/5971 [39:46<12:14,  1.91it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0523, train/loss_vlb_step=0.00018, train/loss_step=0.0523, global_step=4452.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  76%|███████▋  | 4567/5971 [39:47<12:13,  1.91it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0523, train/loss_vlb_step=0.00018, train/loss_step=0.0523, global_step=4452.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  76%|███████▋  | 4567/5971 [39:47<12:13,  1.91it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0364, train/loss_vlb_step=0.000133, train/loss_step=0.0364, global_step=4452.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4568/5971 [39:49<12:13,  1.91it/s, loss=0.174, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000607, train/loss_step=0.170, global_step=4452.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  77%|███████▋  | 4569/5971 [39:50<12:13,  1.91it/s, loss=0.173, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000713, train/loss_step=0.209, global_step=4453.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4570/5971 [39:51<12:12,  1.91it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0159, train/loss_vlb_step=6.42e-5, train/loss_step=0.0159, global_step=4453.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4571/5971 [39:52<12:12,  1.91it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0159, train/loss_vlb_step=6.42e-5, train/loss_step=0.0159, global_step=4453.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4571/5971 [39:52<12:12,  1.91it/s, loss=0.126, v_num=0, train/loss_simple_step=0.020, train/loss_vlb_step=7.92e-5, train/loss_step=0.020, global_step=4453.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  77%|███████▋  | 4572/5971 [39:54<12:12,  1.91it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0215, train/loss_vlb_step=8.67e-5, train/loss_step=0.0215, global_step=4453.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4573/5971 [39:55<12:12,  1.91it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0563, train/loss_vlb_step=0.000195, train/loss_step=0.0563, global_step=4454.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4574/5971 [39:56<12:11,  1.91it/s, loss=0.133, v_num=0, train/loss_simple_step=0.097, train/loss_vlb_step=0.00032, train/loss_step=0.097, global_step=4454.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  77%|███████▋  | 4575/5971 [39:57<12:11,  1.91it/s, loss=0.133, v_num=0, train/loss_simple_step=0.097, train/loss_vlb_step=0.00032, train/loss_step=0.097, global_step=4454.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4575/5971 [39:57<12:11,  1.91it/s, loss=0.135, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000351, train/loss_step=0.106, global_step=4454.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4576/5971 [39:59<12:11,  1.91it/s, loss=0.108, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000363, train/loss_step=0.110, global_step=4454.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4577/5971 [40:00<12:10,  1.91it/s, loss=0.12, v_num=0, train/loss_simple_step=0.257, train/loss_vlb_step=0.00113, train/loss_step=0.257, global_step=4455.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  77%|███████▋  | 4578/5971 [40:00<12:10,  1.91it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00313, train/loss_vlb_step=1.71e-5, train/loss_step=0.00313, global_step=4455.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4579/5971 [40:01<12:09,  1.91it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00313, train/loss_vlb_step=1.71e-5, train/loss_step=0.00313, global_step=4455.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4579/5971 [40:01<12:09,  1.91it/s, loss=0.12, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=4455.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  77%|███████▋  | 4580/5971 [40:03<12:09,  1.91it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00328, train/loss_vlb_step=1.72e-5, train/loss_step=0.00328, global_step=4455.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4581/5971 [40:04<12:09,  1.91it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0161, train/loss_vlb_step=6.72e-5, train/loss_step=0.0161, global_step=4456.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  77%|███████▋  | 4582/5971 [40:05<12:09,  1.91it/s, loss=0.114, v_num=0, train/loss_simple_step=0.239, train/loss_vlb_step=0.000851, train/loss_step=0.239, global_step=4456.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  77%|███████▋  | 4583/5971 [40:06<12:08,  1.90it/s, loss=0.114, v_num=0, train/loss_simple_step=0.239, train/loss_vlb_step=0.000851, train/loss_step=0.239, global_step=4456.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4583/5971 [40:06<12:08,  1.90it/s, loss=0.0807, v_num=0, train/loss_simple_step=0.0193, train/loss_vlb_step=8.07e-5, train/loss_step=0.0193, global_step=4456.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4584/5971 [40:08<12:08,  1.90it/s, loss=0.0794, v_num=0, train/loss_simple_step=0.00143, train/loss_vlb_step=8.45e-6, train/loss_step=0.00143, global_step=4456.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4585/5971 [40:09<12:08,  1.90it/s, loss=0.0781, v_num=0, train/loss_simple_step=0.00788, train/loss_vlb_step=3.67e-5, train/loss_step=0.00788, global_step=4457.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4586/5971 [40:10<12:07,  1.90it/s, loss=0.115, v_num=0, train/loss_simple_step=0.799, train/loss_vlb_step=0.0321, train/loss_step=0.799, global_step=4457.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]      
Epoch 7:  77%|███████▋  | 4587/5971 [40:11<12:07,  1.90it/s, loss=0.115, v_num=0, train/loss_simple_step=0.799, train/loss_vlb_step=0.0321, train/loss_step=0.799, global_step=4457.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4587/5971 [40:11<12:07,  1.90it/s, loss=0.122, v_num=0, train/loss_simple_step=0.176, train/loss_vlb_step=0.000586, train/loss_step=0.176, global_step=4457.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4588/5971 [40:13<12:07,  1.90it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=6.07e-5, train/loss_step=0.0138, global_step=4457.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4589/5971 [40:14<12:06,  1.90it/s, loss=0.12, v_num=0, train/loss_simple_step=0.325, train/loss_vlb_step=0.00161, train/loss_step=0.325, global_step=4458.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  77%|███████▋  | 4590/5971 [40:15<12:06,  1.90it/s, loss=0.127, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000526, train/loss_step=0.156, global_step=4458.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4591/5971 [40:16<12:06,  1.90it/s, loss=0.127, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000526, train/loss_step=0.156, global_step=4458.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4591/5971 [40:16<12:06,  1.90it/s, loss=0.137, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000719, train/loss_step=0.201, global_step=4458.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4592/5971 [40:18<12:06,  1.90it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00248, train/loss_vlb_step=1.38e-5, train/loss_step=0.00248, global_step=4458.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4593/5971 [40:19<12:05,  1.90it/s, loss=0.139, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.00043, train/loss_step=0.130, global_step=4459.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  77%|███████▋  | 4594/5971 [40:20<12:05,  1.90it/s, loss=0.147, v_num=0, train/loss_simple_step=0.260, train/loss_vlb_step=0.00102, train/loss_step=0.260, global_step=4459.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4595/5971 [40:20<12:04,  1.90it/s, loss=0.147, v_num=0, train/loss_simple_step=0.260, train/loss_vlb_step=0.00102, train/loss_step=0.260, global_step=4459.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4595/5971 [40:20<12:04,  1.90it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0728, train/loss_vlb_step=0.000247, train/loss_step=0.0728, global_step=4459.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4596/5971 [40:23<12:04,  1.90it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0451, train/loss_vlb_step=0.000149, train/loss_step=0.0451, global_step=4459.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4597/5971 [40:24<12:04,  1.90it/s, loss=0.137, v_num=0, train/loss_simple_step=0.149, train/loss_vlb_step=0.000489, train/loss_step=0.149, global_step=4460.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  77%|███████▋  | 4598/5971 [40:24<12:03,  1.90it/s, loss=0.165, v_num=0, train/loss_simple_step=0.568, train/loss_vlb_step=0.00481, train/loss_step=0.568, global_step=4460.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  77%|███████▋  | 4599/5971 [40:25<12:03,  1.90it/s, loss=0.165, v_num=0, train/loss_simple_step=0.568, train/loss_vlb_step=0.00481, train/loss_step=0.568, global_step=4460.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4599/5971 [40:25<12:03,  1.90it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0542, train/loss_vlb_step=0.000191, train/loss_step=0.0542, global_step=4460.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4600/5971 [40:27<12:03,  1.90it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0838, train/loss_vlb_step=0.000281, train/loss_step=0.0838, global_step=4460.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4601/5971 [40:28<12:03,  1.89it/s, loss=0.18, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.00113, train/loss_step=0.289, global_step=4461.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  77%|███████▋  | 4602/5971 [40:29<12:02,  1.89it/s, loss=0.191, v_num=0, train/loss_simple_step=0.471, train/loss_vlb_step=0.00497, train/loss_step=0.471, global_step=4461.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4603/5971 [40:30<12:02,  1.89it/s, loss=0.191, v_num=0, train/loss_simple_step=0.471, train/loss_vlb_step=0.00497, train/loss_step=0.471, global_step=4461.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4603/5971 [40:30<12:02,  1.89it/s, loss=0.196, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000358, train/loss_step=0.108, global_step=4461.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4604/5971 [40:32<12:02,  1.89it/s, loss=0.213, v_num=0, train/loss_simple_step=0.351, train/loss_vlb_step=0.00227, train/loss_step=0.351, global_step=4461.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  77%|███████▋  | 4605/5971 [40:33<12:01,  1.89it/s, loss=0.226, v_num=0, train/loss_simple_step=0.273, train/loss_vlb_step=0.00121, train/loss_step=0.273, global_step=4462.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4606/5971 [40:34<12:01,  1.89it/s, loss=0.205, v_num=0, train/loss_simple_step=0.369, train/loss_vlb_step=0.003, train/loss_step=0.369, global_step=4462.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  77%|███████▋  | 4607/5971 [40:35<12:00,  1.89it/s, loss=0.205, v_num=0, train/loss_simple_step=0.369, train/loss_vlb_step=0.003, train/loss_step=0.369, global_step=4462.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4607/5971 [40:35<12:00,  1.89it/s, loss=0.202, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000425, train/loss_step=0.126, global_step=4462.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4608/5971 [40:37<12:00,  1.89it/s, loss=0.219, v_num=0, train/loss_simple_step=0.337, train/loss_vlb_step=0.00173, train/loss_step=0.337, global_step=4462.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  77%|███████▋  | 4609/5971 [40:38<12:00,  1.89it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0273, train/loss_vlb_step=9.94e-5, train/loss_step=0.0273, global_step=4463.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4610/5971 [40:39<11:59,  1.89it/s, loss=0.196, v_num=0, train/loss_simple_step=0.00162, train/loss_vlb_step=9.22e-6, train/loss_step=0.00162, global_step=4463.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4611/5971 [40:40<11:59,  1.89it/s, loss=0.196, v_num=0, train/loss_simple_step=0.00162, train/loss_vlb_step=9.22e-6, train/loss_step=0.00162, global_step=4463.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4611/5971 [40:40<11:59,  1.89it/s, loss=0.195, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000611, train/loss_step=0.177, global_step=4463.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  77%|███████▋  | 4612/5971 [40:42<11:59,  1.89it/s, loss=0.202, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000513, train/loss_step=0.143, global_step=4463.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4613/5971 [40:43<11:59,  1.89it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0169, train/loss_vlb_step=7.07e-5, train/loss_step=0.0169, global_step=4464.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4614/5971 [40:44<11:58,  1.89it/s, loss=0.191, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000565, train/loss_step=0.165, global_step=4464.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  77%|███████▋  | 4615/5971 [40:45<11:58,  1.89it/s, loss=0.191, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000565, train/loss_step=0.165, global_step=4464.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4615/5971 [40:45<11:58,  1.89it/s, loss=0.197, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.0007, train/loss_step=0.189, global_step=4464.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  77%|███████▋  | 4616/5971 [40:47<11:58,  1.89it/s, loss=0.199, v_num=0, train/loss_simple_step=0.0718, train/loss_vlb_step=0.000237, train/loss_step=0.0718, global_step=4464.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4617/5971 [40:48<11:57,  1.89it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0195, train/loss_vlb_step=7.85e-5, train/loss_step=0.0195, global_step=4465.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  77%|███████▋  | 4618/5971 [40:49<11:57,  1.89it/s, loss=0.173, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.00067, train/loss_step=0.194, global_step=4465.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  77%|███████▋  | 4619/5971 [40:50<11:57,  1.89it/s, loss=0.173, v_num=0, train/loss_simple_step=0.194, train/loss_vlb_step=0.00067, train/loss_step=0.194, global_step=4465.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4619/5971 [40:50<11:57,  1.89it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0313, train/loss_vlb_step=0.000118, train/loss_step=0.0313, global_step=4465.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4620/5971 [40:52<11:56,  1.88it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00379, train/loss_vlb_step=2.02e-5, train/loss_step=0.00379, global_step=4465.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4621/5971 [40:53<11:56,  1.88it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0078, train/loss_vlb_step=3.63e-5, train/loss_step=0.0078, global_step=4466.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  77%|███████▋  | 4622/5971 [40:54<11:56,  1.88it/s, loss=0.144, v_num=0, train/loss_simple_step=0.271, train/loss_vlb_step=0.00105, train/loss_step=0.271, global_step=4466.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  77%|███████▋  | 4623/5971 [40:54<11:55,  1.88it/s, loss=0.144, v_num=0, train/loss_simple_step=0.271, train/loss_vlb_step=0.00105, train/loss_step=0.271, global_step=4466.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4623/5971 [40:54<11:55,  1.88it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0763, train/loss_vlb_step=0.000255, train/loss_step=0.0763, global_step=4466.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4624/5971 [40:57<11:55,  1.88it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0941, train/loss_vlb_step=0.000311, train/loss_step=0.0941, global_step=4466.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  77%|███████▋  | 4625/5971 [40:58<11:55,  1.88it/s, loss=0.126, v_num=0, train/loss_simple_step=0.204, train/loss_vlb_step=0.000732, train/loss_step=0.204, global_step=4467.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  77%|███████▋  | 4626/5971 [40:59<11:54,  1.88it/s, loss=0.123, v_num=0, train/loss_simple_step=0.309, train/loss_vlb_step=0.00141, train/loss_step=0.309, global_step=4467.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  77%|███████▋  | 4627/5971 [40:59<11:54,  1.88it/s, loss=0.123, v_num=0, train/loss_simple_step=0.309, train/loss_vlb_step=0.00141, train/loss_step=0.309, global_step=4467.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  77%|███████▋  | 4627/5971 [40:59<11:54,  1.88it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0744, train/loss_vlb_step=0.000247, train/loss_step=0.0744, global_step=4467.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  78%|███████▊  | 4628/5971 [41:02<11:54,  1.88it/s, loss=0.129, v_num=0, train/loss_simple_step=0.501, train/loss_vlb_step=0.00425, train/loss_step=0.501, global_step=4467.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  78%|███████▊  | 4629/5971 [41:03<11:53,  1.88it/s, loss=0.141, v_num=0, train/loss_simple_step=0.263, train/loss_vlb_step=0.00104, train/loss_step=0.263, global_step=4468.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  78%|███████▊  | 4630/5971 [41:03<11:53,  1.88it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0356, train/loss_vlb_step=0.000135, train/loss_step=0.0356, global_step=4468.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  78%|███████▊  | 4631/5971 [41:04<11:53,  1.88it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0356, train/loss_vlb_step=0.000135, train/loss_step=0.0356, global_step=4468.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  78%|███████▊  | 4631/5971 [41:04<11:53,  1.88it/s, loss=0.149, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00126, train/loss_step=0.308, global_step=4468.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  78%|███████▊  | 4632/5971 [41:06<11:52,  1.88it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0708, train/loss_vlb_step=0.000235, train/loss_step=0.0708, global_step=4468.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  78%|███████▊  | 4633/5971 [41:07<11:52,  1.88it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.39e-5, train/loss_step=0.0128, global_step=4469.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  78%|███████▊  | 4634/5971 [41:08<11:52,  1.88it/s, loss=0.158, v_num=0, train/loss_simple_step=0.415, train/loss_vlb_step=0.00267, train/loss_step=0.415, global_step=4469.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  78%|███████▊  | 4635/5971 [41:09<11:51,  1.88it/s, loss=0.158, v_num=0, train/loss_simple_step=0.415, train/loss_vlb_step=0.00267, train/loss_step=0.415, global_step=4469.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  78%|███████▊  | 4635/5971 [41:09<11:51,  1.88it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0515, train/loss_vlb_step=0.000175, train/loss_step=0.0515, global_step=4469.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  78%|███████▊  | 4636/5971 [41:11<11:51,  1.88it/s, loss=0.169, v_num=0, train/loss_simple_step=0.433, train/loss_vlb_step=0.00338, train/loss_step=0.433, global_step=4469.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  78%|███████▊  | 4637/5971 [41:12<11:51,  1.88it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0482, train/loss_vlb_step=0.000176, train/loss_step=0.0482, global_step=4470.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  78%|███████▊  | 4638/5971 [41:13<11:50,  1.88it/s, loss=0.174, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00105, train/loss_step=0.269, global_step=4470.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  78%|███████▊  | 4639/5971 [41:14<11:50,  1.88it/s, loss=0.174, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00105, train/loss_step=0.269, global_step=4470.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  78%|███████▊  | 4639/5971 [41:14<11:50,  1.88it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0726, train/loss_vlb_step=0.000242, train/loss_step=0.0726, global_step=4470.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  78%|███████▊  | 4640/5971 [41:16<11:50,  1.87it/s, loss=0.186, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000724, train/loss_step=0.203, global_step=4470.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  78%|███████▊  | 4641/5971 [41:17<11:49,  1.87it/s, loss=0.225, v_num=0, train/loss_simple_step=0.791, train/loss_vlb_step=0.0454, train/loss_step=0.791, global_step=4471.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  78%|███████▊  | 4642/5971 [41:18<11:49,  1.87it/s, loss=0.22, v_num=0, train/loss_simple_step=0.159, train/loss_vlb_step=0.00059, train/loss_step=0.159, global_step=4471.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  78%|███████▊  | 4643/5971 [41:19<11:49,  1.87it/s, loss=0.22, v_num=0, train/loss_simple_step=0.159, train/loss_vlb_step=0.00059, train/loss_step=0.159, global_step=4471.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  78%|███████▊  | 4643/5971 [41:19<11:49,  1.87it/s, loss=0.216, v_num=0, train/loss_simple_step=0.00364, train/loss_vlb_step=1.95e-5, train/loss_step=0.00364, global_step=4471.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  78%|███████▊  | 4644/5971 [41:21<11:48,  1.87it/s, loss=0.221, v_num=0, train/loss_simple_step=0.191, train/loss_vlb_step=0.000643, train/loss_step=0.191, global_step=4471.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  78%|███████▊  | 4645/5971 [41:22<11:48,  1.87it/s, loss=0.24, v_num=0, train/loss_simple_step=0.586, train/loss_vlb_step=0.006, train/loss_step=0.586, global_step=4472.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  78%|███████▊  | 4646/5971 [41:23<11:48,  1.87it/s, loss=0.244, v_num=0, train/loss_simple_step=0.388, train/loss_vlb_step=0.0021, train/loss_step=0.388, global_step=4472.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  78%|███████▊  | 4647/5971 [41:24<11:47,  1.87it/s, loss=0.244, v_num=0, train/loss_simple_step=0.388, train/loss_vlb_step=0.0021, train/loss_step=0.388, global_step=4472.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  78%|███████▊  | 4647/5971 [41:24<11:47,  1.87it/s, loss=0.243, v_num=0, train/loss_simple_step=0.0618, train/loss_vlb_step=0.000212, train/loss_step=0.0618, global_step=4472.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  78%|███████▊  | 4648/5971 [41:26<11:47,  1.87it/s, loss=0.244, v_num=0, train/loss_simple_step=0.507, train/loss_vlb_step=0.00445, train/loss_step=0.507, global_step=4472.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  78%|███████▊  | 4649/5971 [41:27<11:47,  1.87it/s, loss=0.235, v_num=0, train/loss_simple_step=0.0861, train/loss_vlb_step=0.00029, train/loss_step=0.0861, global_step=4473.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  78%|███████▊  | 4650/5971 [41:28<11:46,  1.87it/s, loss=0.238, v_num=0, train/loss_simple_step=0.0959, train/loss_vlb_step=0.000316, train/loss_step=0.0959, global_step=4473.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  78%|███████▊  | 4651/5971 [41:28<11:46,  1.87it/s, loss=0.238, v_num=0, train/loss_simple_step=0.0959, train/loss_vlb_step=0.000316, train/loss_step=0.0959, global_step=4473.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  78%|███████▊  | 4651/5971 [41:28<11:46,  1.87it/s, loss=0.223, v_num=0, train/loss_simple_step=0.00431, train/loss_vlb_step=2.31e-5, train/loss_step=0.00431, global_step=4473.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  78%|███████▊  | 4652/5971 [41:31<11:46,  1.87it/s, loss=0.219, v_num=0, train/loss_simple_step=0.00842, train/loss_vlb_step=3.93e-5, train/loss_step=0.00842, global_step=4473.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  78%|███████▊  | 4653/5971 [41:31<11:45,  1.87it/s, loss=0.234, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.00139, train/loss_step=0.313, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  78%|███████▊  | 4654/5971 [41:32<11:45,  1.87it/s, loss=0.216, v_num=0, train/loss_simple_step=0.0466, train/loss_vlb_step=0.000162, train/loss_step=0.0466, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  78%|███████▊  | 4655/5971 [41:33<11:44,  1.87it/s, loss=0.216, v_num=0, train/loss_simple_step=0.0466, train/loss_vlb_step=0.000162, train/loss_step=0.0466, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  78%|███████▊  | 4655/5971 [41:33<11:44,  1.87it/s, loss=0.214, v_num=0, train/loss_simple_step=0.00843, train/loss_vlb_step=3.91e-5, train/loss_step=0.00843, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  78%|███████▊  | 4656/5971 [41:36<11:44,  1.87it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]      

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:15,  2.21it/s]
Epoch 7:  78%|███████▊  | 4659/5971 [41:36<11:42,  1.87it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   2%|▏         | 3/167 [00:00<00:26,  6.24it/s]

Validating:   4%|▎         | 6/167 [00:00<00:13, 11.55it/s]
Epoch 7:  78%|███████▊  | 4663/5971 [41:36<11:40,  1.87it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   5%|▌         | 9/167 [00:00<00:10, 14.79it/s]
Epoch 7:  78%|███████▊  | 4667/5971 [41:37<11:37,  1.87it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   7%|▋         | 12/167 [00:00<00:09, 16.88it/s]
Epoch 7:  78%|███████▊  | 4671/5971 [41:37<11:34,  1.87it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   9%|▉         | 15/167 [00:01<00:08, 18.62it/s]

Validating:  11%|█         | 18/167 [00:01<00:07, 20.40it/s]
Epoch 7:  78%|███████▊  | 4675/5971 [41:37<11:32,  1.87it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 22.24it/s]
Epoch 7:  78%|███████▊  | 4679/5971 [41:37<11:29,  1.87it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  15%|█▍        | 25/167 [00:01<00:05, 24.90it/s]
Epoch 7:  78%|███████▊  | 4683/5971 [41:37<11:26,  1.88it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  17%|█▋        | 28/167 [00:01<00:05, 24.58it/s]
Epoch 7:  78%|███████▊  | 4687/5971 [41:37<11:24,  1.88it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  19%|█▊        | 31/167 [00:01<00:05, 24.83it/s]

Validating:  20%|██        | 34/167 [00:01<00:05, 25.29it/s]
Epoch 7:  79%|███████▊  | 4691/5971 [41:38<11:21,  1.88it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  22%|██▏       | 37/167 [00:01<00:05, 25.74it/s]
Epoch 7:  79%|███████▊  | 4695/5971 [41:38<11:18,  1.88it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  24%|██▍       | 40/167 [00:02<00:04, 25.61it/s]
Epoch 7:  79%|███████▊  | 4699/5971 [41:38<11:16,  1.88it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  26%|██▌       | 43/167 [00:02<00:04, 26.66it/s]

Validating:  28%|██▊       | 46/167 [00:02<00:04, 27.20it/s]
Epoch 7:  79%|███████▉  | 4703/5971 [41:38<11:13,  1.88it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 28.60it/s]
Epoch 7:  79%|███████▉  | 4707/5971 [41:38<11:10,  1.88it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  32%|███▏      | 53/167 [00:02<00:03, 28.81it/s]
Epoch 7:  79%|███████▉  | 4711/5971 [41:38<11:08,  1.89it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  34%|███▎      | 56/167 [00:02<00:03, 28.10it/s]
Epoch 7:  79%|███████▉  | 4715/5971 [41:38<11:05,  1.89it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  35%|███▌      | 59/167 [00:02<00:03, 27.59it/s]

Validating:  37%|███▋      | 62/167 [00:02<00:03, 27.97it/s]
Epoch 7:  79%|███████▉  | 4719/5971 [41:39<11:02,  1.89it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  39%|███▉      | 65/167 [00:02<00:03, 27.48it/s]
Epoch 7:  79%|███████▉  | 4723/5971 [41:39<11:00,  1.89it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  41%|████      | 68/167 [00:03<00:03, 26.09it/s]
Epoch 7:  79%|███████▉  | 4727/5971 [41:39<10:57,  1.89it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 25.44it/s]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 24.94it/s]
Epoch 7:  79%|███████▉  | 4731/5971 [41:39<10:54,  1.89it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 25.23it/s]
Epoch 7:  79%|███████▉  | 4735/5971 [41:39<10:52,  1.89it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 25.41it/s][A
Epoch 7:  79%|███████▉  | 4739/5971 [41:39<10:49,  1.90it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 26.31it/s][A

Validating:  51%|█████▏    | 86/167 [00:03<00:02, 27.27it/s][A
Epoch 7:  79%|███████▉  | 4743/5971 [41:39<10:47,  1.90it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  53%|█████▎    | 89/167 [00:03<00:02, 27.78it/s][A
Epoch 7:  80%|███████▉  | 4747/5971 [41:40<10:44,  1.90it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  55%|█████▌    | 92/167 [00:03<00:02, 27.26it/s][A
Epoch 7:  80%|███████▉  | 4751/5971 [41:40<10:41,  1.90it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 27.96it/s][A

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 28.53it/s][A
Epoch 7:  80%|███████▉  | 4755/5971 [41:40<10:39,  1.90it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  60%|██████    | 101/167 [00:04<00:02, 28.64it/s][A
Epoch 7:  80%|███████▉  | 4759/5971 [41:40<10:36,  1.90it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 27.72it/s][A
Epoch 7:  80%|███████▉  | 4763/5971 [41:40<10:34,  1.91it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 27.26it/s][A

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 27.65it/s][A
Epoch 7:  80%|███████▉  | 4767/5971 [41:40<10:31,  1.91it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  68%|██████▊   | 113/167 [00:04<00:01, 27.29it/s][A
Epoch 7:  80%|███████▉  | 4771/5971 [41:40<10:28,  1.91it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  69%|██████▉   | 116/167 [00:04<00:01, 28.01it/s][A
Epoch 7:  80%|███████▉  | 4775/5971 [41:41<10:26,  1.91it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  71%|███████▏  | 119/167 [00:04<00:01, 28.09it/s][A

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 27.68it/s][A
Epoch 7:  80%|████████  | 4779/5971 [41:41<10:23,  1.91it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 27.21it/s][A
Epoch 7:  80%|████████  | 4783/5971 [41:41<10:21,  1.91it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 27.20it/s][A
Epoch 7:  80%|████████  | 4787/5971 [41:41<10:18,  1.91it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 24.52it/s][A

Validating:  80%|████████  | 134/167 [00:05<00:01, 24.24it/s][A
Epoch 7:  80%|████████  | 4791/5971 [41:41<10:16,  1.92it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 25.33it/s][A
Epoch 7:  80%|████████  | 4795/5971 [41:41<10:13,  1.92it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  84%|████████▍ | 140/167 [00:05<00:01, 25.89it/s][A
Epoch 7:  80%|████████  | 4799/5971 [41:42<10:10,  1.92it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  86%|████████▌ | 143/167 [00:05<00:00, 25.96it/s][A

Validating:  87%|████████▋ | 146/167 [00:05<00:00, 26.49it/s][A
Epoch 7:  80%|████████  | 4803/5971 [41:42<10:08,  1.92it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 25.96it/s][A
Epoch 7:  81%|████████  | 4807/5971 [41:42<10:05,  1.92it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 24.73it/s][A
Epoch 7:  81%|████████  | 4811/5971 [41:42<10:03,  1.92it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 25.61it/s][A

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 25.44it/s][A
Epoch 7:  81%|████████  | 4815/5971 [41:42<10:00,  1.92it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 25.93it/s][A
Epoch 7:  81%|████████  | 4819/5971 [41:42<09:58,  1.93it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 26.28it/s][A
Epoch 7:  81%|████████  | 4823/5971 [41:42<09:55,  1.93it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating: 100%|██████████| 167/167 [00:06<00:00, 26.13it/s][A
Epoch 7:  81%|████████  | 4824/5971 [41:43<09:55,  1.93it/s, loss=0.22, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.0092, train/loss_step=0.563, global_step=4474.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Epoch 7:  81%|████████  | 4825/5971 [41:44<09:54,  1.93it/s, loss=0.218, v_num=0, train/loss_simple_step=0.00444, train/loss_vlb_step=2.34e-5, train/loss_step=0.00444, global_step=4475.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████  | 4826/5971 [41:45<09:54,  1.93it/s, loss=0.24, v_num=0, train/loss_simple_step=0.702, train/loss_vlb_step=0.0187, train/loss_step=0.702, global_step=4475.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]      
Epoch 7:  81%|████████  | 4827/5971 [41:46<09:53,  1.93it/s, loss=0.24, v_num=0, train/loss_simple_step=0.702, train/loss_vlb_step=0.0187, train/loss_step=0.702, global_step=4475.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████  | 4827/5971 [41:46<09:53,  1.93it/s, loss=0.238, v_num=0, train/loss_simple_step=0.0273, train/loss_vlb_step=0.000108, train/loss_step=0.0273, global_step=4475.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████  | 4828/5971 [41:49<09:53,  1.92it/s, loss=0.236, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000558, train/loss_step=0.166, global_step=4475.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  81%|████████  | 4829/5971 [41:50<09:53,  1.92it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0301, train/loss_vlb_step=0.000107, train/loss_step=0.0301, global_step=4476.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████  | 4830/5971 [41:51<09:53,  1.92it/s, loss=0.199, v_num=0, train/loss_simple_step=0.182, train/loss_vlb_step=0.000655, train/loss_step=0.182, global_step=4476.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  81%|████████  | 4831/5971 [41:51<09:52,  1.92it/s, loss=0.199, v_num=0, train/loss_simple_step=0.182, train/loss_vlb_step=0.000655, train/loss_step=0.182, global_step=4476.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████  | 4831/5971 [41:51<09:52,  1.92it/s, loss=0.199, v_num=0, train/loss_simple_step=0.00787, train/loss_vlb_step=3.87e-5, train/loss_step=0.00787, global_step=4476.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████  | 4832/5971 [41:53<09:52,  1.92it/s, loss=0.217, v_num=0, train/loss_simple_step=0.553, train/loss_vlb_step=0.00551, train/loss_step=0.553, global_step=4476.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  81%|████████  | 4833/5971 [41:54<09:52,  1.92it/s, loss=0.206, v_num=0, train/loss_simple_step=0.368, train/loss_vlb_step=0.00154, train/loss_step=0.368, global_step=4477.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████  | 4834/5971 [41:55<09:51,  1.92it/s, loss=0.199, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.000954, train/loss_step=0.250, global_step=4477.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████  | 4835/5971 [41:56<09:51,  1.92it/s, loss=0.199, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.000954, train/loss_step=0.250, global_step=4477.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████  | 4835/5971 [41:56<09:51,  1.92it/s, loss=0.205, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000587, train/loss_step=0.172, global_step=4477.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████  | 4836/5971 [41:58<09:51,  1.92it/s, loss=0.202, v_num=0, train/loss_simple_step=0.448, train/loss_vlb_step=0.00247, train/loss_step=0.448, global_step=4477.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  81%|████████  | 4837/5971 [41:59<09:50,  1.92it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.28e-5, train/loss_step=0.0115, global_step=4478.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████  | 4838/5971 [42:00<09:50,  1.92it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0945, train/loss_vlb_step=0.000322, train/loss_step=0.0945, global_step=4478.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████  | 4839/5971 [42:01<09:49,  1.92it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0945, train/loss_vlb_step=0.000322, train/loss_step=0.0945, global_step=4478.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████  | 4839/5971 [42:01<09:49,  1.92it/s, loss=0.199, v_num=0, train/loss_simple_step=0.0336, train/loss_vlb_step=0.000128, train/loss_step=0.0336, global_step=4478.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████  | 4840/5971 [42:04<09:49,  1.92it/s, loss=0.199, v_num=0, train/loss_simple_step=0.0058, train/loss_vlb_step=2.86e-5, train/loss_step=0.0058, global_step=4478.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  81%|████████  | 4841/5971 [42:05<09:49,  1.92it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0257, train/loss_vlb_step=9.96e-5, train/loss_step=0.0257, global_step=4479.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████  | 4842/5971 [42:05<09:48,  1.92it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0488, train/loss_vlb_step=0.000174, train/loss_step=0.0488, global_step=4479.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████  | 4843/5971 [42:06<09:48,  1.92it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0488, train/loss_vlb_step=0.000174, train/loss_step=0.0488, global_step=4479.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████  | 4843/5971 [42:06<09:48,  1.92it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=4.94e-5, train/loss_step=0.0109, global_step=4479.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  81%|████████  | 4844/5971 [42:08<09:48,  1.92it/s, loss=0.171, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.00122, train/loss_step=0.288, global_step=4479.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  81%|████████  | 4845/5971 [42:09<09:47,  1.92it/s, loss=0.182, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000791, train/loss_step=0.217, global_step=4480.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████  | 4846/5971 [42:10<09:47,  1.92it/s, loss=0.153, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000421, train/loss_step=0.128, global_step=4480.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████  | 4847/5971 [42:11<09:46,  1.91it/s, loss=0.153, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000421, train/loss_step=0.128, global_step=4480.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████  | 4847/5971 [42:11<09:46,  1.91it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0187, train/loss_vlb_step=7.96e-5, train/loss_step=0.0187, global_step=4480.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████  | 4848/5971 [42:13<09:46,  1.91it/s, loss=0.163, v_num=0, train/loss_simple_step=0.368, train/loss_vlb_step=0.00153, train/loss_step=0.368, global_step=4480.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  81%|████████  | 4849/5971 [42:14<09:46,  1.91it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00602, train/loss_vlb_step=3.09e-5, train/loss_step=0.00602, global_step=4481.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████  | 4850/5971 [42:15<09:45,  1.91it/s, loss=0.165, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000863, train/loss_step=0.242, global_step=4481.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  81%|████████  | 4851/5971 [42:16<09:45,  1.91it/s, loss=0.165, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.000863, train/loss_step=0.242, global_step=4481.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████  | 4851/5971 [42:16<09:45,  1.91it/s, loss=0.176, v_num=0, train/loss_simple_step=0.221, train/loss_vlb_step=0.000771, train/loss_step=0.221, global_step=4481.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████▏ | 4852/5971 [42:18<09:45,  1.91it/s, loss=0.154, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000421, train/loss_step=0.127, global_step=4481.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████▏ | 4853/5971 [42:19<09:44,  1.91it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0021, train/loss_vlb_step=1.22e-5, train/loss_step=0.0021, global_step=4482.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████▏ | 4854/5971 [42:20<09:44,  1.91it/s, loss=0.134, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000759, train/loss_step=0.209, global_step=4482.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  81%|████████▏ | 4855/5971 [42:21<09:44,  1.91it/s, loss=0.134, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000759, train/loss_step=0.209, global_step=4482.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████▏ | 4855/5971 [42:21<09:44,  1.91it/s, loss=0.131, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000405, train/loss_step=0.120, global_step=4482.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████▏ | 4856/5971 [42:23<09:43,  1.91it/s, loss=0.115, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000408, train/loss_step=0.123, global_step=4482.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████▏ | 4857/5971 [42:24<09:43,  1.91it/s, loss=0.122, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000536, train/loss_step=0.156, global_step=4483.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████▏ | 4858/5971 [42:25<09:43,  1.91it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0022, train/loss_vlb_step=1.29e-5, train/loss_step=0.0022, global_step=4483.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████▏ | 4859/5971 [42:26<09:42,  1.91it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0022, train/loss_vlb_step=1.29e-5, train/loss_step=0.0022, global_step=4483.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████▏ | 4859/5971 [42:26<09:42,  1.91it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0809, train/loss_vlb_step=0.000269, train/loss_step=0.0809, global_step=4483.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████▏ | 4860/5971 [42:28<09:42,  1.91it/s, loss=0.131, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.00085, train/loss_step=0.219, global_step=4483.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  81%|████████▏ | 4861/5971 [42:29<09:42,  1.91it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0117, train/loss_vlb_step=5.03e-5, train/loss_step=0.0117, global_step=4484.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████▏ | 4862/5971 [42:30<09:41,  1.91it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0507, train/loss_vlb_step=0.000171, train/loss_step=0.0507, global_step=4484.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████▏ | 4863/5971 [42:31<09:41,  1.91it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0507, train/loss_vlb_step=0.000171, train/loss_step=0.0507, global_step=4484.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████▏ | 4863/5971 [42:31<09:41,  1.91it/s, loss=0.134, v_num=0, train/loss_simple_step=0.078, train/loss_vlb_step=0.000257, train/loss_step=0.078, global_step=4484.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  81%|████████▏ | 4864/5971 [42:33<09:41,  1.91it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0343, train/loss_vlb_step=0.000123, train/loss_step=0.0343, global_step=4484.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  81%|████████▏ | 4865/5971 [42:34<09:40,  1.90it/s, loss=0.128, v_num=0, train/loss_simple_step=0.368, train/loss_vlb_step=0.00237, train/loss_step=0.368, global_step=4485.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  81%|████████▏ | 4866/5971 [42:35<09:40,  1.90it/s, loss=0.131, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.000683, train/loss_step=0.187, global_step=4485.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4867/5971 [42:36<09:39,  1.90it/s, loss=0.131, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.000683, train/loss_step=0.187, global_step=4485.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4867/5971 [42:36<09:39,  1.90it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00807, train/loss_vlb_step=3.68e-5, train/loss_step=0.00807, global_step=4485.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4868/5971 [42:38<09:39,  1.90it/s, loss=0.117, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000336, train/loss_step=0.102, global_step=4485.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  82%|████████▏ | 4869/5971 [42:39<09:39,  1.90it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00843, train/loss_vlb_step=3.98e-5, train/loss_step=0.00843, global_step=4486.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4870/5971 [42:40<09:38,  1.90it/s, loss=0.112, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.00041, train/loss_step=0.123, global_step=4486.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  82%|████████▏ | 4871/5971 [42:41<09:38,  1.90it/s, loss=0.112, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.00041, train/loss_step=0.123, global_step=4486.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4871/5971 [42:41<09:38,  1.90it/s, loss=0.127, v_num=0, train/loss_simple_step=0.528, train/loss_vlb_step=0.00574, train/loss_step=0.528, global_step=4486.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4872/5971 [42:43<09:38,  1.90it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0702, train/loss_vlb_step=0.000238, train/loss_step=0.0702, global_step=4486.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4873/5971 [42:44<09:37,  1.90it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00179, train/loss_vlb_step=1.08e-5, train/loss_step=0.00179, global_step=4487.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4874/5971 [42:45<09:37,  1.90it/s, loss=0.137, v_num=0, train/loss_simple_step=0.468, train/loss_vlb_step=0.0038, train/loss_step=0.468, global_step=4487.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]     
Epoch 7:  82%|████████▏ | 4875/5971 [42:46<09:36,  1.90it/s, loss=0.137, v_num=0, train/loss_simple_step=0.468, train/loss_vlb_step=0.0038, train/loss_step=0.468, global_step=4487.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4875/5971 [42:46<09:36,  1.90it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0251, train/loss_vlb_step=9.84e-5, train/loss_step=0.0251, global_step=4487.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4876/5971 [42:48<09:36,  1.90it/s, loss=0.141, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.00156, train/loss_step=0.288, global_step=4487.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  82%|████████▏ | 4877/5971 [42:49<09:36,  1.90it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00223, train/loss_vlb_step=1.26e-5, train/loss_step=0.00223, global_step=4488.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4878/5971 [42:50<09:35,  1.90it/s, loss=0.138, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000383, train/loss_step=0.116, global_step=4488.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  82%|████████▏ | 4879/5971 [42:51<09:35,  1.90it/s, loss=0.138, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000383, train/loss_step=0.116, global_step=4488.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4879/5971 [42:51<09:35,  1.90it/s, loss=0.142, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000497, train/loss_step=0.144, global_step=4488.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4880/5971 [42:53<09:35,  1.90it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0732, train/loss_vlb_step=0.000242, train/loss_step=0.0732, global_step=4488.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4881/5971 [42:54<09:34,  1.90it/s, loss=0.143, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000633, train/loss_step=0.181, global_step=4489.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  82%|████████▏ | 4882/5971 [42:55<09:34,  1.90it/s, loss=0.161, v_num=0, train/loss_simple_step=0.418, train/loss_vlb_step=0.00232, train/loss_step=0.418, global_step=4489.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  82%|████████▏ | 4883/5971 [42:56<09:33,  1.90it/s, loss=0.161, v_num=0, train/loss_simple_step=0.418, train/loss_vlb_step=0.00232, train/loss_step=0.418, global_step=4489.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4883/5971 [42:56<09:33,  1.90it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00384, train/loss_vlb_step=1.99e-5, train/loss_step=0.00384, global_step=4489.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4884/5971 [42:58<09:33,  1.89it/s, loss=0.166, v_num=0, train/loss_simple_step=0.199, train/loss_vlb_step=0.000682, train/loss_step=0.199, global_step=4489.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  82%|████████▏ | 4885/5971 [42:59<09:33,  1.89it/s, loss=0.18, v_num=0, train/loss_simple_step=0.663, train/loss_vlb_step=0.0676, train/loss_step=0.663, global_step=4490.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  82%|████████▏ | 4886/5971 [42:59<09:32,  1.89it/s, loss=0.177, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000376, train/loss_step=0.114, global_step=4490.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4887/5971 [43:00<09:32,  1.89it/s, loss=0.177, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000376, train/loss_step=0.114, global_step=4490.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4887/5971 [43:00<09:32,  1.89it/s, loss=0.189, v_num=0, train/loss_simple_step=0.259, train/loss_vlb_step=0.00106, train/loss_step=0.259, global_step=4490.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  82%|████████▏ | 4888/5971 [43:02<09:32,  1.89it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0779, train/loss_vlb_step=0.000258, train/loss_step=0.0779, global_step=4490.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4889/5971 [43:03<09:31,  1.89it/s, loss=0.199, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000851, train/loss_step=0.225, global_step=4491.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  82%|████████▏ | 4890/5971 [43:04<09:31,  1.89it/s, loss=0.193, v_num=0, train/loss_simple_step=0.00103, train/loss_vlb_step=6.27e-6, train/loss_step=0.00103, global_step=4491.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4891/5971 [43:05<09:30,  1.89it/s, loss=0.193, v_num=0, train/loss_simple_step=0.00103, train/loss_vlb_step=6.27e-6, train/loss_step=0.00103, global_step=4491.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4891/5971 [43:05<09:30,  1.89it/s, loss=0.184, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00191, train/loss_step=0.353, global_step=4491.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  82%|████████▏ | 4892/5971 [43:07<09:30,  1.89it/s, loss=0.217, v_num=0, train/loss_simple_step=0.730, train/loss_vlb_step=0.0469, train/loss_step=0.730, global_step=4491.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  82%|████████▏ | 4893/5971 [43:08<09:30,  1.89it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0472, train/loss_vlb_step=0.000168, train/loss_step=0.0472, global_step=4492.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4894/5971 [43:09<09:29,  1.89it/s, loss=0.196, v_num=0, train/loss_simple_step=0.00286, train/loss_vlb_step=1.58e-5, train/loss_step=0.00286, global_step=4492.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4895/5971 [43:10<09:29,  1.89it/s, loss=0.196, v_num=0, train/loss_simple_step=0.00286, train/loss_vlb_step=1.58e-5, train/loss_step=0.00286, global_step=4492.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4895/5971 [43:10<09:29,  1.89it/s, loss=0.221, v_num=0, train/loss_simple_step=0.528, train/loss_vlb_step=0.00625, train/loss_step=0.528, global_step=4492.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  82%|████████▏ | 4896/5971 [43:12<09:29,  1.89it/s, loss=0.207, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.96e-5, train/loss_step=0.0111, global_step=4492.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4897/5971 [43:13<09:28,  1.89it/s, loss=0.221, v_num=0, train/loss_simple_step=0.275, train/loss_vlb_step=0.00112, train/loss_step=0.275, global_step=4493.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  82%|████████▏ | 4898/5971 [43:14<09:28,  1.89it/s, loss=0.216, v_num=0, train/loss_simple_step=0.0213, train/loss_vlb_step=8.85e-5, train/loss_step=0.0213, global_step=4493.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4899/5971 [43:15<09:27,  1.89it/s, loss=0.216, v_num=0, train/loss_simple_step=0.0213, train/loss_vlb_step=8.85e-5, train/loss_step=0.0213, global_step=4493.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4899/5971 [43:15<09:27,  1.89it/s, loss=0.21, v_num=0, train/loss_simple_step=0.0197, train/loss_vlb_step=8.19e-5, train/loss_step=0.0197, global_step=4493.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  82%|████████▏ | 4900/5971 [43:17<09:27,  1.89it/s, loss=0.214, v_num=0, train/loss_simple_step=0.151, train/loss_vlb_step=0.000519, train/loss_step=0.151, global_step=4493.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4901/5971 [43:18<09:27,  1.89it/s, loss=0.215, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000676, train/loss_step=0.192, global_step=4494.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4902/5971 [43:19<09:26,  1.89it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0409, train/loss_vlb_step=0.000141, train/loss_step=0.0409, global_step=4494.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4903/5971 [43:20<09:26,  1.89it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0409, train/loss_vlb_step=0.000141, train/loss_step=0.0409, global_step=4494.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4903/5971 [43:20<09:26,  1.89it/s, loss=0.203, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.00046, train/loss_step=0.139, global_step=4494.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  82%|████████▏ | 4904/5971 [43:22<09:26,  1.88it/s, loss=0.194, v_num=0, train/loss_simple_step=0.0224, train/loss_vlb_step=8.2e-5, train/loss_step=0.0224, global_step=4494.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4905/5971 [43:23<09:25,  1.88it/s, loss=0.177, v_num=0, train/loss_simple_step=0.327, train/loss_vlb_step=0.00164, train/loss_step=0.327, global_step=4495.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  82%|████████▏ | 4906/5971 [43:24<09:25,  1.88it/s, loss=0.184, v_num=0, train/loss_simple_step=0.259, train/loss_vlb_step=0.00104, train/loss_step=0.259, global_step=4495.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4907/5971 [43:25<09:24,  1.88it/s, loss=0.184, v_num=0, train/loss_simple_step=0.259, train/loss_vlb_step=0.00104, train/loss_step=0.259, global_step=4495.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4907/5971 [43:25<09:24,  1.88it/s, loss=0.186, v_num=0, train/loss_simple_step=0.294, train/loss_vlb_step=0.0011, train/loss_step=0.294, global_step=4495.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  82%|████████▏ | 4908/5971 [43:27<09:24,  1.88it/s, loss=0.182, v_num=0, train/loss_simple_step=0.00885, train/loss_vlb_step=4.18e-5, train/loss_step=0.00885, global_step=4495.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4909/5971 [43:28<09:24,  1.88it/s, loss=0.189, v_num=0, train/loss_simple_step=0.362, train/loss_vlb_step=0.00205, train/loss_step=0.362, global_step=4496.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  82%|████████▏ | 4910/5971 [43:29<09:23,  1.88it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0828, train/loss_vlb_step=0.000272, train/loss_step=0.0828, global_step=4496.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4911/5971 [43:29<09:23,  1.88it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0828, train/loss_vlb_step=0.000272, train/loss_step=0.0828, global_step=4496.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4911/5971 [43:29<09:23,  1.88it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0614, train/loss_vlb_step=0.000208, train/loss_step=0.0614, global_step=4496.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4912/5971 [43:32<09:23,  1.88it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0189, train/loss_vlb_step=7.77e-5, train/loss_step=0.0189, global_step=4496.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  82%|████████▏ | 4913/5971 [43:32<09:22,  1.88it/s, loss=0.146, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000336, train/loss_step=0.101, global_step=4497.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  82%|████████▏ | 4914/5971 [43:33<09:22,  1.88it/s, loss=0.151, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000375, train/loss_step=0.114, global_step=4497.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4915/5971 [43:34<09:21,  1.88it/s, loss=0.151, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000375, train/loss_step=0.114, global_step=4497.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4915/5971 [43:34<09:21,  1.88it/s, loss=0.133, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.00056, train/loss_step=0.165, global_step=4497.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  82%|████████▏ | 4916/5971 [43:37<09:21,  1.88it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0372, train/loss_vlb_step=0.000139, train/loss_step=0.0372, global_step=4497.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4917/5971 [43:37<09:21,  1.88it/s, loss=0.131, v_num=0, train/loss_simple_step=0.195, train/loss_vlb_step=0.000708, train/loss_step=0.195, global_step=4498.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  82%|████████▏ | 4918/5971 [43:38<09:20,  1.88it/s, loss=0.138, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000578, train/loss_step=0.175, global_step=4498.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4919/5971 [43:39<09:20,  1.88it/s, loss=0.138, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000578, train/loss_step=0.175, global_step=4498.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4919/5971 [43:39<09:20,  1.88it/s, loss=0.15, v_num=0, train/loss_simple_step=0.256, train/loss_vlb_step=0.00109, train/loss_step=0.256, global_step=4498.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  82%|████████▏ | 4920/5971 [43:41<09:19,  1.88it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00378, train/loss_vlb_step=2.02e-5, train/loss_step=0.00378, global_step=4498.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4921/5971 [43:42<09:19,  1.88it/s, loss=0.144, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000733, train/loss_step=0.218, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  82%|████████▏ | 4921/5971 [43:58<09:22,  1.87it/s, loss=0.144, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000733, train/loss_step=0.218, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4922/5971 [44:17<09:26,  1.85it/s, loss=0.144, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000733, train/loss_step=0.218, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4922/5971 [44:17<09:26,  1.85it/s, loss=0.153, v_num=0, train/loss_simple_step=0.221, train/loss_vlb_step=0.00085, train/loss_step=0.221, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  82%|████████▏ | 4923/5971 [44:18<09:25,  1.85it/s, loss=0.153, v_num=0, train/loss_simple_step=0.221, train/loss_vlb_step=0.00085, train/loss_step=0.221, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4923/5971 [44:18<09:25,  1.85it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00999, train/loss_vlb_step=4.59e-5, train/loss_step=0.00999, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4924/5971 [44:20<09:25,  1.85it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00999, train/loss_vlb_step=4.59e-5, train/loss_step=0.00999, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  82%|████████▏ | 4924/5971 [44:20<09:25,  1.85it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:11,  2.33it/s][A
Epoch 7:  82%|████████▏ | 4926/5971 [44:21<09:24,  1.85it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   1%|          | 2/167 [00:00<00:41,  3.94it/s][A
Epoch 7:  83%|████████▎ | 4928/5971 [44:21<09:23,  1.85it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.53it/s][A
Epoch 7:  83%|████████▎ | 4931/5971 [44:21<09:21,  1.85it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   5%|▍         | 8/167 [00:00<00:11, 14.43it/s][A
Epoch 7:  83%|████████▎ | 4934/5971 [44:21<09:19,  1.85it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   7%|▋         | 11/167 [00:00<00:08, 17.69it/s][A
Epoch 7:  83%|████████▎ | 4937/5971 [44:21<09:17,  1.86it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   8%|▊         | 14/167 [00:01<00:07, 20.26it/s][A
Epoch 7:  83%|████████▎ | 4940/5971 [44:21<09:15,  1.86it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  10%|█         | 17/167 [00:01<00:06, 22.56it/s][A
Epoch 7:  83%|████████▎ | 4944/5971 [44:22<09:12,  1.86it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 23.68it/s][A

Validating:  14%|█▍        | 23/167 [00:01<00:06, 23.55it/s][A
Epoch 7:  83%|████████▎ | 4948/5971 [44:22<09:10,  1.86it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.51it/s][A
Epoch 7:  83%|████████▎ | 4952/5971 [44:22<09:07,  1.86it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 25.69it/s][A
Epoch 7:  83%|████████▎ | 4956/5971 [44:22<09:05,  1.86it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 25.19it/s][A

Validating:  21%|██        | 35/167 [00:01<00:05, 26.27it/s][A
Epoch 7:  83%|████████▎ | 4960/5971 [44:22<09:02,  1.86it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  23%|██▎       | 38/167 [00:01<00:04, 27.15it/s][A
Epoch 7:  83%|████████▎ | 4964/5971 [44:22<09:00,  1.86it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 28.33it/s][A
Epoch 7:  83%|████████▎ | 4968/5971 [44:22<08:57,  1.87it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 27.68it/s][A
Epoch 7:  83%|████████▎ | 4972/5971 [44:23<08:54,  1.87it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 27.41it/s][A
Epoch 7:  83%|████████▎ | 4976/5971 [44:23<08:52,  1.87it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  31%|███       | 52/167 [00:02<00:04, 28.57it/s][A

Validating:  33%|███▎      | 55/167 [00:02<00:03, 28.10it/s][A
Epoch 7:  83%|████████▎ | 4980/5971 [44:23<08:49,  1.87it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  35%|███▌      | 59/167 [00:02<00:03, 28.26it/s][A
Epoch 7:  83%|████████▎ | 4984/5971 [44:23<08:47,  1.87it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  37%|███▋      | 62/167 [00:02<00:03, 27.18it/s][A
Epoch 7:  84%|████████▎ | 4988/5971 [44:23<08:44,  1.87it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  40%|███▉      | 66/167 [00:02<00:03, 27.83it/s][A
Epoch 7:  84%|████████▎ | 4992/5971 [44:23<08:42,  1.87it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  41%|████▏     | 69/167 [00:03<00:03, 26.96it/s][A
Epoch 7:  84%|████████▎ | 4996/5971 [44:23<08:39,  1.88it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 27.43it/s][A

Validating:  45%|████▍     | 75/167 [00:03<00:03, 27.74it/s][A
Epoch 7:  84%|████████▎ | 5000/5971 [44:24<08:37,  1.88it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 26.89it/s][A
Epoch 7:  84%|████████▍ | 5004/5971 [44:24<08:34,  1.88it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 27.68it/s][A
Epoch 7:  84%|████████▍ | 5008/5971 [44:24<08:32,  1.88it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  50%|█████     | 84/167 [00:03<00:02, 27.97it/s][A

Validating:  52%|█████▏    | 87/167 [00:03<00:03, 23.95it/s][A
Epoch 7:  84%|████████▍ | 5012/5971 [44:24<08:29,  1.88it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  54%|█████▍    | 90/167 [00:03<00:03, 24.36it/s][A
Epoch 7:  84%|████████▍ | 5016/5971 [44:24<08:27,  1.88it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  56%|█████▌    | 93/167 [00:03<00:02, 24.77it/s][A
Epoch 7:  84%|████████▍ | 5020/5971 [44:24<08:24,  1.88it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 25.20it/s][A

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 24.80it/s][A
Epoch 7:  84%|████████▍ | 5024/5971 [44:25<08:22,  1.89it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  61%|██████    | 102/167 [00:04<00:02, 25.95it/s][A
Epoch 7:  84%|████████▍ | 5028/5971 [44:25<08:19,  1.89it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 25.95it/s][A
Epoch 7:  84%|████████▍ | 5032/5971 [44:25<08:17,  1.89it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 26.94it/s][A

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 25.22it/s][A
Epoch 7:  84%|████████▍ | 5036/5971 [44:25<08:14,  1.89it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  68%|██████▊   | 114/167 [00:04<00:02, 24.63it/s][A
Epoch 7:  84%|████████▍ | 5040/5971 [44:25<08:12,  1.89it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  70%|███████   | 117/167 [00:04<00:02, 24.05it/s][A
Epoch 7:  84%|████████▍ | 5044/5971 [44:25<08:09,  1.89it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 24.13it/s][A

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 24.22it/s][A
Epoch 7:  85%|████████▍ | 5048/5971 [44:26<08:07,  1.89it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 25.66it/s][A
Epoch 7:  85%|████████▍ | 5052/5971 [44:26<08:04,  1.90it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 27.06it/s][A
Epoch 7:  85%|████████▍ | 5056/5971 [44:26<08:02,  1.90it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 26.91it/s][A
Epoch 7:  85%|████████▍ | 5060/5971 [44:26<07:59,  1.90it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 26.45it/s][A

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 26.71it/s][A
Epoch 7:  85%|████████▍ | 5064/5971 [44:26<07:57,  1.90it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  85%|████████▌ | 142/167 [00:05<00:00, 26.62it/s][A
Epoch 7:  85%|████████▍ | 5068/5971 [44:26<07:55,  1.90it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  87%|████████▋ | 145/167 [00:05<00:00, 26.60it/s][A
Epoch 7:  85%|████████▍ | 5072/5971 [44:26<07:52,  1.90it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 25.79it/s][A
Epoch 7:  85%|████████▌ | 5076/5971 [44:27<07:50,  1.90it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 27.51it/s][A

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 28.12it/s][A
Epoch 7:  85%|████████▌ | 5080/5971 [44:27<07:47,  1.91it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 28.50it/s][A
Epoch 7:  85%|████████▌ | 5084/5971 [44:27<07:45,  1.91it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 28.53it/s][A
Epoch 7:  85%|████████▌ | 5088/5971 [44:27<07:42,  1.91it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 28.68it/s][A

Validating: 100%|██████████| 167/167 [00:06<00:00, 28.64it/s][A
Epoch 7:  85%|████████▌ | 5092/5971 [44:27<07:40,  1.91it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  85%|████████▌ | 5092/5971 [44:28<07:40,  1.91it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:35,  1.36it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.45it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.28it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.94it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.44it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.80it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.07it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:07,  5.26it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.39it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.49it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.57it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.61it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.63it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.66it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.67it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:05,  5.68it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.69it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.69it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.70it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.69it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.69it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.68it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.70it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.71it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:04<00:04,  5.72it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.71it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.71it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.71it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.70it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.69it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.70it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.70it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:02,  5.70it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.71it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.70it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.69it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.69it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.68it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.68it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.68it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.68it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:07<00:01,  5.68it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.69it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.69it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.70it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.69it/s][A
Epoch 7:  85%|████████▌ | 5092/5971 [44:38<07:42,  1.90it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Spaced Sampler:  94%|█████████▍| 47/50 [00:08<00:00,  5.69it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:08<00:00,  5.70it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.71it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.70it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.35it/s]

Epoch 7:  85%|████████▌ | 5093/5971 [44:39<07:41,  1.90it/s, loss=0.156, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000779, train/loss_step=0.207, global_step=4499.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  85%|████████▌ | 5093/5971 [44:39<07:41,  1.90it/s, loss=0.171, v_num=0, train/loss_simple_step=0.629, train/loss_vlb_step=0.0142, train/loss_step=0.629, global_step=4500.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:34,  1.43it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:18,  2.55it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:13,  3.40it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  4.02it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.50it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.84it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.07it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.25it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.37it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.46it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.51it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.54it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.58it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.61it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.62it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.60it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.46it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.40it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.39it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.42it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.40it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.42it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.43it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.46it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.49it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.55it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.55it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.53it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.48it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.44it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.47it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.47it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.47it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.49it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.40it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.36it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.42it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.50it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.54it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.55it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.57it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.59it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.62it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.64it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.65it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.66it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:08<00:00,  5.66it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.66it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.66it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.66it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.25it/s]

Epoch 7:  85%|████████▌ | 5094/5971 [44:51<07:43,  1.89it/s, loss=0.171, v_num=0, train/loss_simple_step=0.629, train/loss_vlb_step=0.0142, train/loss_step=0.629, global_step=4500.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  85%|████████▌ | 5094/5971 [44:51<07:43,  1.89it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0206, train/loss_vlb_step=7.89e-5, train/loss_step=0.0206, global_step=4500.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.35it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.44it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.30it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.92it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.35it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.73it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.02it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.23it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.28it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.33it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.39it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.44it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.51it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.58it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.63it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.65it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.63it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.65it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.68it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.65it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.54it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.45it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.45it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.49it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.55it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.59it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.60it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.60it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.63it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.66it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.67it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.68it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:02,  5.69it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.68it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.68it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.68it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.67it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.66it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.68it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.68it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.68it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.68it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.67it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.67it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.66it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.56it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:08<00:00,  5.46it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.37it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.29it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.36it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.26it/s]

Epoch 7:  85%|████████▌ | 5095/5971 [45:03<07:44,  1.89it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0206, train/loss_vlb_step=7.89e-5, train/loss_step=0.0206, global_step=4500.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  85%|████████▌ | 5095/5971 [45:03<07:44,  1.89it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0468, train/loss_vlb_step=0.000163, train/loss_step=0.0468, global_step=4500.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.30it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:01<00:21,  2.19it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:16,  2.92it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:13,  3.50it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:11,  3.90it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:10,  4.21it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:02<00:09,  4.45it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:09,  4.64it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  4.66it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:08,  4.73it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:08,  4.73it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:03<00:07,  4.85it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:07,  4.92it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:07,  4.95it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.00it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  4.90it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:04<00:06,  5.00it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:04<00:06,  4.92it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:06,  4.91it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:06,  5.00it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.15it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.26it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:05<00:05,  5.33it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.41it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.39it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.44it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.46it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:06<00:04,  5.23it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:06<00:04,  4.94it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:04,  4.90it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  4.84it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  4.93it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:07<00:03,  4.90it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:07<00:03,  4.97it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:07<00:03,  4.94it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.01it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  4.98it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:08<00:02,  5.05it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:08<00:02,  5.12it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.18it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.11it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.11it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:09<00:01,  5.18it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:09<00:01,  5.15it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:09<00:00,  5.16it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.02it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.06it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:10<00:00,  5.09it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:10<00:00,  5.12it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  4.95it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:10<00:00,  4.75it/s]

Epoch 7:  85%|████████▌ | 5096/5971 [45:17<07:46,  1.88it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0468, train/loss_vlb_step=0.000163, train/loss_step=0.0468, global_step=4500.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  85%|████████▌ | 5096/5971 [45:17<07:46,  1.88it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00434, train/loss_vlb_step=2.2e-5, train/loss_step=0.00434, global_step=4500.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  85%|████████▌ | 5097/5971 [45:18<07:46,  1.88it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00434, train/loss_vlb_step=2.2e-5, train/loss_step=0.00434, global_step=4500.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  85%|████████▌ | 5097/5971 [45:18<07:46,  1.88it/s, loss=0.153, v_num=0, train/loss_simple_step=0.490, train/loss_vlb_step=0.00469, train/loss_step=0.490, global_step=4501.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  85%|████████▌ | 5098/5971 [45:19<07:45,  1.88it/s, loss=0.153, v_num=0, train/loss_simple_step=0.490, train/loss_vlb_step=0.00469, train/loss_step=0.490, global_step=4501.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  85%|████████▌ | 5098/5971 [45:19<07:45,  1.88it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0247, train/loss_vlb_step=9.5e-5, train/loss_step=0.0247, global_step=4501.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  85%|████████▌ | 5099/5971 [45:20<07:45,  1.87it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0247, train/loss_vlb_step=9.5e-5, train/loss_step=0.0247, global_step=4501.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  85%|████████▌ | 5099/5971 [45:20<07:45,  1.87it/s, loss=0.148, v_num=0, train/loss_simple_step=0.015, train/loss_vlb_step=6.04e-5, train/loss_step=0.015, global_step=4501.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  85%|████████▌ | 5100/5971 [45:22<07:44,  1.87it/s, loss=0.148, v_num=0, train/loss_simple_step=0.015, train/loss_vlb_step=6.04e-5, train/loss_step=0.015, global_step=4501.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  85%|████████▌ | 5100/5971 [45:22<07:44,  1.87it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0783, train/loss_vlb_step=0.000258, train/loss_step=0.0783, global_step=4501.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  85%|████████▌ | 5101/5971 [45:23<07:44,  1.87it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0783, train/loss_vlb_step=0.000258, train/loss_step=0.0783, global_step=4501.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  85%|████████▌ | 5101/5971 [45:23<07:44,  1.87it/s, loss=0.151, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000343, train/loss_step=0.104, global_step=4502.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  85%|████████▌ | 5102/5971 [45:23<07:43,  1.87it/s, loss=0.151, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000343, train/loss_step=0.104, global_step=4502.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  85%|████████▌ | 5102/5971 [45:23<07:43,  1.87it/s, loss=0.151, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000362, train/loss_step=0.110, global_step=4502.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  85%|████████▌ | 5103/5971 [45:24<07:43,  1.87it/s, loss=0.151, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000362, train/loss_step=0.110, global_step=4502.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  85%|████████▌ | 5103/5971 [45:24<07:43,  1.87it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00765, train/loss_vlb_step=3.61e-5, train/loss_step=0.00765, global_step=4502.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  85%|████████▌ | 5104/5971 [45:27<07:43,  1.87it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00765, train/loss_vlb_step=3.61e-5, train/loss_step=0.00765, global_step=4502.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  85%|████████▌ | 5104/5971 [45:27<07:43,  1.87it/s, loss=0.148, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000483, train/loss_step=0.144, global_step=4502.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  85%|████████▌ | 5105/5971 [45:28<07:42,  1.87it/s, loss=0.148, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000483, train/loss_step=0.144, global_step=4502.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  85%|████████▌ | 5105/5971 [45:28<07:42,  1.87it/s, loss=0.175, v_num=0, train/loss_simple_step=0.743, train/loss_vlb_step=0.0245, train/loss_step=0.743, global_step=4503.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  86%|████████▌ | 5106/5971 [45:28<07:42,  1.87it/s, loss=0.175, v_num=0, train/loss_simple_step=0.743, train/loss_vlb_step=0.0245, train/loss_step=0.743, global_step=4503.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5106/5971 [45:28<07:42,  1.87it/s, loss=0.185, v_num=0, train/loss_simple_step=0.360, train/loss_vlb_step=0.00168, train/loss_step=0.360, global_step=4503.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5107/5971 [45:29<07:41,  1.87it/s, loss=0.185, v_num=0, train/loss_simple_step=0.360, train/loss_vlb_step=0.00168, train/loss_step=0.360, global_step=4503.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5107/5971 [45:29<07:41,  1.87it/s, loss=0.18, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.00055, train/loss_step=0.165, global_step=4503.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  86%|████████▌ | 5108/5971 [45:32<07:41,  1.87it/s, loss=0.18, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.00055, train/loss_step=0.165, global_step=4503.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5108/5971 [45:32<07:41,  1.87it/s, loss=0.19, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.00075, train/loss_step=0.201, global_step=4503.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5109/5971 [45:32<07:41,  1.87it/s, loss=0.19, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.00075, train/loss_step=0.201, global_step=4503.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5109/5971 [45:32<07:41,  1.87it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0442, train/loss_vlb_step=0.000158, train/loss_step=0.0442, global_step=4504.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5110/5971 [45:33<07:40,  1.87it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0442, train/loss_vlb_step=0.000158, train/loss_step=0.0442, global_step=4504.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5110/5971 [45:33<07:40,  1.87it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00149, train/loss_vlb_step=9.03e-6, train/loss_step=0.00149, global_step=4504.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5111/5971 [45:34<07:40,  1.87it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00149, train/loss_vlb_step=9.03e-6, train/loss_step=0.00149, global_step=4504.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5111/5971 [45:34<07:40,  1.87it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0188, train/loss_vlb_step=7.79e-5, train/loss_step=0.0188, global_step=4504.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  86%|████████▌ | 5112/5971 [45:36<07:39,  1.87it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0188, train/loss_vlb_step=7.79e-5, train/loss_step=0.0188, global_step=4504.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5112/5971 [45:36<07:39,  1.87it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00456, train/loss_vlb_step=2.26e-5, train/loss_step=0.00456, global_step=4504.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5113/5971 [45:37<07:39,  1.87it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00456, train/loss_vlb_step=2.26e-5, train/loss_step=0.00456, global_step=4504.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5113/5971 [45:37<07:39,  1.87it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00914, train/loss_vlb_step=4.21e-5, train/loss_step=0.00914, global_step=4505.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  86%|████████▌ | 5114/5971 [45:38<07:38,  1.87it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00914, train/loss_vlb_step=4.21e-5, train/loss_step=0.00914, global_step=4505.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5114/5971 [45:38<07:38,  1.87it/s, loss=0.147, v_num=0, train/loss_simple_step=0.359, train/loss_vlb_step=0.00164, train/loss_step=0.359, global_step=4505.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  86%|████████▌ | 5115/5971 [45:39<07:38,  1.87it/s, loss=0.147, v_num=0, train/loss_simple_step=0.359, train/loss_vlb_step=0.00164, train/loss_step=0.359, global_step=4505.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5115/5971 [45:39<07:38,  1.87it/s, loss=0.155, v_num=0, train/loss_simple_step=0.223, train/loss_vlb_step=0.000767, train/loss_step=0.223, global_step=4505.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5116/5971 [45:41<07:38,  1.87it/s, loss=0.155, v_num=0, train/loss_simple_step=0.223, train/loss_vlb_step=0.000767, train/loss_step=0.223, global_step=4505.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5116/5971 [45:41<07:38,  1.87it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0486, train/loss_vlb_step=0.00017, train/loss_step=0.0486, global_step=4505.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5117/5971 [45:42<07:37,  1.87it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0486, train/loss_vlb_step=0.00017, train/loss_step=0.0486, global_step=4505.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5117/5971 [45:42<07:37,  1.87it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0163, train/loss_vlb_step=7.02e-5, train/loss_step=0.0163, global_step=4506.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5118/5971 [45:43<07:37,  1.87it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0163, train/loss_vlb_step=7.02e-5, train/loss_step=0.0163, global_step=4506.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5118/5971 [45:43<07:37,  1.87it/s, loss=0.141, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000618, train/loss_step=0.175, global_step=4506.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  86%|████████▌ | 5119/5971 [45:44<07:36,  1.87it/s, loss=0.141, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000618, train/loss_step=0.175, global_step=4506.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5119/5971 [45:44<07:36,  1.87it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0732, train/loss_vlb_step=0.000253, train/loss_step=0.0732, global_step=4506.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5120/5971 [45:46<07:36,  1.86it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0732, train/loss_vlb_step=0.000253, train/loss_step=0.0732, global_step=4506.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5120/5971 [45:46<07:36,  1.86it/s, loss=0.147, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000456, train/loss_step=0.139, global_step=4506.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  86%|████████▌ | 5121/5971 [45:47<07:35,  1.86it/s, loss=0.147, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000456, train/loss_step=0.139, global_step=4506.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5121/5971 [45:47<07:35,  1.86it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0692, train/loss_vlb_step=0.000233, train/loss_step=0.0692, global_step=4507.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5122/5971 [45:48<07:35,  1.86it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0692, train/loss_vlb_step=0.000233, train/loss_step=0.0692, global_step=4507.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5122/5971 [45:48<07:35,  1.86it/s, loss=0.152, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.000948, train/loss_step=0.229, global_step=4507.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  86%|████████▌ | 5123/5971 [45:49<07:34,  1.86it/s, loss=0.152, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.000948, train/loss_step=0.229, global_step=4507.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5123/5971 [45:49<07:34,  1.86it/s, loss=0.161, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.000729, train/loss_step=0.206, global_step=4507.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5124/5971 [45:51<07:34,  1.86it/s, loss=0.161, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.000729, train/loss_step=0.206, global_step=4507.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5124/5971 [45:51<07:34,  1.86it/s, loss=0.175, v_num=0, train/loss_simple_step=0.422, train/loss_vlb_step=0.00266, train/loss_step=0.422, global_step=4507.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  86%|████████▌ | 5125/5971 [45:52<07:34,  1.86it/s, loss=0.175, v_num=0, train/loss_simple_step=0.422, train/loss_vlb_step=0.00266, train/loss_step=0.422, global_step=4507.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5125/5971 [45:52<07:34,  1.86it/s, loss=0.148, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000653, train/loss_step=0.192, global_step=4508.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5126/5971 [45:53<07:33,  1.86it/s, loss=0.148, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000653, train/loss_step=0.192, global_step=4508.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5126/5971 [45:53<07:33,  1.86it/s, loss=0.159, v_num=0, train/loss_simple_step=0.575, train/loss_vlb_step=0.00779, train/loss_step=0.575, global_step=4508.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  86%|████████▌ | 5127/5971 [45:54<07:33,  1.86it/s, loss=0.159, v_num=0, train/loss_simple_step=0.575, train/loss_vlb_step=0.00779, train/loss_step=0.575, global_step=4508.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5127/5971 [45:54<07:33,  1.86it/s, loss=0.178, v_num=0, train/loss_simple_step=0.559, train/loss_vlb_step=0.00699, train/loss_step=0.559, global_step=4508.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5128/5971 [45:56<07:33,  1.86it/s, loss=0.178, v_num=0, train/loss_simple_step=0.559, train/loss_vlb_step=0.00699, train/loss_step=0.559, global_step=4508.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5128/5971 [45:56<07:33,  1.86it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=4508.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5129/5971 [45:57<07:32,  1.86it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00281, train/loss_vlb_step=1.55e-5, train/loss_step=0.00281, global_step=4508.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5129/5971 [45:57<07:32,  1.86it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00687, train/loss_vlb_step=3.39e-5, train/loss_step=0.00687, global_step=4509.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5130/5971 [45:57<07:32,  1.86it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00687, train/loss_vlb_step=3.39e-5, train/loss_step=0.00687, global_step=4509.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5130/5971 [45:57<07:32,  1.86it/s, loss=0.207, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4509.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]     
Epoch 7:  86%|████████▌ | 5131/5971 [45:58<07:31,  1.86it/s, loss=0.207, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4509.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5131/5971 [45:58<07:31,  1.86it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0992, train/loss_vlb_step=0.000326, train/loss_step=0.0992, global_step=4509.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5132/5971 [46:01<07:31,  1.86it/s, loss=0.211, v_num=0, train/loss_simple_step=0.0992, train/loss_vlb_step=0.000326, train/loss_step=0.0992, global_step=4509.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5132/5971 [46:01<07:31,  1.86it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0496, train/loss_vlb_step=0.000175, train/loss_step=0.0496, global_step=4509.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5133/5971 [46:02<07:30,  1.86it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0496, train/loss_vlb_step=0.000175, train/loss_step=0.0496, global_step=4509.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5133/5971 [46:02<07:30,  1.86it/s, loss=0.236, v_num=0, train/loss_simple_step=0.465, train/loss_vlb_step=0.00334, train/loss_step=0.465, global_step=4510.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  86%|████████▌ | 5134/5971 [46:02<07:30,  1.86it/s, loss=0.236, v_num=0, train/loss_simple_step=0.465, train/loss_vlb_step=0.00334, train/loss_step=0.465, global_step=4510.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5134/5971 [46:02<07:30,  1.86it/s, loss=0.219, v_num=0, train/loss_simple_step=0.00827, train/loss_vlb_step=3.89e-5, train/loss_step=0.00827, global_step=4510.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5135/5971 [46:03<07:29,  1.86it/s, loss=0.219, v_num=0, train/loss_simple_step=0.00827, train/loss_vlb_step=3.89e-5, train/loss_step=0.00827, global_step=4510.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5135/5971 [46:03<07:29,  1.86it/s, loss=0.208, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.82e-5, train/loss_step=0.0104, global_step=4510.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  86%|████████▌ | 5136/5971 [46:05<07:29,  1.86it/s, loss=0.208, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.82e-5, train/loss_step=0.0104, global_step=4510.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5136/5971 [46:05<07:29,  1.86it/s, loss=0.211, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000395, train/loss_step=0.120, global_step=4510.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  86%|████████▌ | 5137/5971 [46:06<07:29,  1.86it/s, loss=0.211, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000395, train/loss_step=0.120, global_step=4510.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5137/5971 [46:06<07:29,  1.86it/s, loss=0.211, v_num=0, train/loss_simple_step=0.00598, train/loss_vlb_step=2.8e-5, train/loss_step=0.00598, global_step=4511.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5138/5971 [46:07<07:28,  1.86it/s, loss=0.211, v_num=0, train/loss_simple_step=0.00598, train/loss_vlb_step=2.8e-5, train/loss_step=0.00598, global_step=4511.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5138/5971 [46:07<07:28,  1.86it/s, loss=0.216, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.00105, train/loss_step=0.274, global_step=4511.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  86%|████████▌ | 5139/5971 [46:08<07:28,  1.86it/s, loss=0.216, v_num=0, train/loss_simple_step=0.274, train/loss_vlb_step=0.00105, train/loss_step=0.274, global_step=4511.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5139/5971 [46:08<07:28,  1.86it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.75e-5, train/loss_step=0.0126, global_step=4511.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5140/5971 [46:10<07:27,  1.86it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.75e-5, train/loss_step=0.0126, global_step=4511.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5140/5971 [46:10<07:27,  1.86it/s, loss=0.214, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000531, train/loss_step=0.153, global_step=4511.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  86%|████████▌ | 5141/5971 [46:11<07:27,  1.86it/s, loss=0.214, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000531, train/loss_step=0.153, global_step=4511.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5141/5971 [46:11<07:27,  1.86it/s, loss=0.233, v_num=0, train/loss_simple_step=0.460, train/loss_vlb_step=0.00359, train/loss_step=0.460, global_step=4512.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  86%|████████▌ | 5142/5971 [46:12<07:26,  1.86it/s, loss=0.233, v_num=0, train/loss_simple_step=0.460, train/loss_vlb_step=0.00359, train/loss_step=0.460, global_step=4512.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5142/5971 [46:12<07:26,  1.86it/s, loss=0.229, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.00047, train/loss_step=0.143, global_step=4512.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5143/5971 [46:13<07:26,  1.85it/s, loss=0.229, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.00047, train/loss_step=0.143, global_step=4512.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5143/5971 [46:13<07:26,  1.85it/s, loss=0.223, v_num=0, train/loss_simple_step=0.0806, train/loss_vlb_step=0.000277, train/loss_step=0.0806, global_step=4512.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5144/5971 [46:15<07:26,  1.85it/s, loss=0.223, v_num=0, train/loss_simple_step=0.0806, train/loss_vlb_step=0.000277, train/loss_step=0.0806, global_step=4512.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5144/5971 [46:15<07:26,  1.85it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=5.94e-5, train/loss_step=0.0136, global_step=4512.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  86%|████████▌ | 5145/5971 [46:16<07:25,  1.85it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=5.94e-5, train/loss_step=0.0136, global_step=4512.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5145/5971 [46:16<07:25,  1.85it/s, loss=0.193, v_num=0, train/loss_simple_step=0.00181, train/loss_vlb_step=1.01e-5, train/loss_step=0.00181, global_step=4513.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5146/5971 [46:17<07:25,  1.85it/s, loss=0.193, v_num=0, train/loss_simple_step=0.00181, train/loss_vlb_step=1.01e-5, train/loss_step=0.00181, global_step=4513.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5146/5971 [46:17<07:25,  1.85it/s, loss=0.165, v_num=0, train/loss_simple_step=0.027, train/loss_vlb_step=9.63e-5, train/loss_step=0.027, global_step=4513.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  86%|████████▌ | 5147/5971 [46:18<07:24,  1.85it/s, loss=0.165, v_num=0, train/loss_simple_step=0.027, train/loss_vlb_step=9.63e-5, train/loss_step=0.027, global_step=4513.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5147/5971 [46:18<07:24,  1.85it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00949, train/loss_vlb_step=4.21e-5, train/loss_step=0.00949, global_step=4513.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5148/5971 [46:20<07:24,  1.85it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00949, train/loss_vlb_step=4.21e-5, train/loss_step=0.00949, global_step=4513.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5148/5971 [46:20<07:24,  1.85it/s, loss=0.157, v_num=0, train/loss_simple_step=0.387, train/loss_vlb_step=0.00188, train/loss_step=0.387, global_step=4513.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  86%|████████▌ | 5149/5971 [46:21<07:23,  1.85it/s, loss=0.157, v_num=0, train/loss_simple_step=0.387, train/loss_vlb_step=0.00188, train/loss_step=0.387, global_step=4513.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▌ | 5149/5971 [46:21<07:23,  1.85it/s, loss=0.17, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00122, train/loss_step=0.264, global_step=4514.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  86%|████████▋ | 5150/5971 [46:22<07:23,  1.85it/s, loss=0.17, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00122, train/loss_step=0.264, global_step=4514.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▋ | 5150/5971 [46:22<07:23,  1.85it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00312, train/loss_vlb_step=1.68e-5, train/loss_step=0.00312, global_step=4514.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▋ | 5151/5971 [46:23<07:22,  1.85it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00312, train/loss_vlb_step=1.68e-5, train/loss_step=0.00312, global_step=4514.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▋ | 5151/5971 [46:23<07:22,  1.85it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.17e-5, train/loss_step=0.00199, global_step=4514.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▋ | 5152/5971 [46:25<07:22,  1.85it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.17e-5, train/loss_step=0.00199, global_step=4514.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▋ | 5152/5971 [46:25<07:22,  1.85it/s, loss=0.133, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000833, train/loss_step=0.216, global_step=4514.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  86%|████████▋ | 5153/5971 [46:26<07:22,  1.85it/s, loss=0.133, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000833, train/loss_step=0.216, global_step=4514.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▋ | 5153/5971 [46:26<07:22,  1.85it/s, loss=0.118, v_num=0, train/loss_simple_step=0.159, train/loss_vlb_step=0.000527, train/loss_step=0.159, global_step=4515.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▋ | 5154/5971 [46:27<07:21,  1.85it/s, loss=0.118, v_num=0, train/loss_simple_step=0.159, train/loss_vlb_step=0.000527, train/loss_step=0.159, global_step=4515.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▋ | 5154/5971 [46:27<07:21,  1.85it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0181, train/loss_vlb_step=7.4e-5, train/loss_step=0.0181, global_step=4515.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▋ | 5155/5971 [46:28<07:21,  1.85it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0181, train/loss_vlb_step=7.4e-5, train/loss_step=0.0181, global_step=4515.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▋ | 5155/5971 [46:28<07:21,  1.85it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00448, train/loss_vlb_step=2.32e-5, train/loss_step=0.00448, global_step=4515.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▋ | 5156/5971 [46:30<07:20,  1.85it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00448, train/loss_vlb_step=2.32e-5, train/loss_step=0.00448, global_step=4515.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▋ | 5156/5971 [46:30<07:20,  1.85it/s, loss=0.116, v_num=0, train/loss_simple_step=0.081, train/loss_vlb_step=0.00027, train/loss_step=0.081, global_step=4515.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  86%|████████▋ | 5157/5971 [46:31<07:20,  1.85it/s, loss=0.116, v_num=0, train/loss_simple_step=0.081, train/loss_vlb_step=0.00027, train/loss_step=0.081, global_step=4515.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▋ | 5157/5971 [46:31<07:20,  1.85it/s, loss=0.138, v_num=0, train/loss_simple_step=0.443, train/loss_vlb_step=0.00333, train/loss_step=0.443, global_step=4516.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▋ | 5158/5971 [46:32<07:19,  1.85it/s, loss=0.138, v_num=0, train/loss_simple_step=0.443, train/loss_vlb_step=0.00333, train/loss_step=0.443, global_step=4516.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▋ | 5158/5971 [46:32<07:19,  1.85it/s, loss=0.147, v_num=0, train/loss_simple_step=0.462, train/loss_vlb_step=0.00266, train/loss_step=0.462, global_step=4516.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▋ | 5159/5971 [46:32<07:19,  1.85it/s, loss=0.147, v_num=0, train/loss_simple_step=0.462, train/loss_vlb_step=0.00266, train/loss_step=0.462, global_step=4516.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▋ | 5159/5971 [46:32<07:19,  1.85it/s, loss=0.152, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000377, train/loss_step=0.115, global_step=4516.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▋ | 5160/5971 [46:35<07:19,  1.85it/s, loss=0.152, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000377, train/loss_step=0.115, global_step=4516.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▋ | 5160/5971 [46:35<07:19,  1.85it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00149, train/loss_vlb_step=9.07e-6, train/loss_step=0.00149, global_step=4516.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▋ | 5161/5971 [46:35<07:18,  1.85it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00149, train/loss_vlb_step=9.07e-6, train/loss_step=0.00149, global_step=4516.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▋ | 5161/5971 [46:35<07:18,  1.85it/s, loss=0.127, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.00036, train/loss_step=0.110, global_step=4517.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  86%|████████▋ | 5162/5971 [46:36<07:18,  1.85it/s, loss=0.127, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.00036, train/loss_step=0.110, global_step=4517.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▋ | 5162/5971 [46:36<07:18,  1.85it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0208, train/loss_vlb_step=8.27e-5, train/loss_step=0.0208, global_step=4517.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▋ | 5163/5971 [46:37<07:17,  1.85it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0208, train/loss_vlb_step=8.27e-5, train/loss_step=0.0208, global_step=4517.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▋ | 5163/5971 [46:37<07:17,  1.85it/s, loss=0.139, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00312, train/loss_step=0.438, global_step=4517.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  86%|████████▋ | 5164/5971 [46:39<07:17,  1.84it/s, loss=0.139, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.00312, train/loss_step=0.438, global_step=4517.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  86%|████████▋ | 5164/5971 [46:39<07:17,  1.84it/s, loss=0.144, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000417, train/loss_step=0.126, global_step=4517.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5165/5971 [46:40<07:16,  1.84it/s, loss=0.144, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000417, train/loss_step=0.126, global_step=4517.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5165/5971 [46:40<07:16,  1.84it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.6e-5, train/loss_step=0.0126, global_step=4518.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5166/5971 [46:41<07:16,  1.84it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.6e-5, train/loss_step=0.0126, global_step=4518.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5166/5971 [46:41<07:16,  1.84it/s, loss=0.147, v_num=0, train/loss_simple_step=0.072, train/loss_vlb_step=0.00024, train/loss_step=0.072, global_step=4518.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  87%|████████▋ | 5167/5971 [46:42<07:15,  1.84it/s, loss=0.147, v_num=0, train/loss_simple_step=0.072, train/loss_vlb_step=0.00024, train/loss_step=0.072, global_step=4518.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5167/5971 [46:42<07:15,  1.84it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0836, train/loss_vlb_step=0.000285, train/loss_step=0.0836, global_step=4518.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5168/5971 [46:44<07:15,  1.84it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0836, train/loss_vlb_step=0.000285, train/loss_step=0.0836, global_step=4518.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5168/5971 [46:44<07:15,  1.84it/s, loss=0.139, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000507, train/loss_step=0.153, global_step=4518.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  87%|████████▋ | 5169/5971 [46:45<07:15,  1.84it/s, loss=0.139, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000507, train/loss_step=0.153, global_step=4518.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5169/5971 [46:45<07:15,  1.84it/s, loss=0.137, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000727, train/loss_step=0.214, global_step=4519.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5170/5971 [46:46<07:14,  1.84it/s, loss=0.137, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000727, train/loss_step=0.214, global_step=4519.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5170/5971 [46:46<07:14,  1.84it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00372, train/loss_vlb_step=1.95e-5, train/loss_step=0.00372, global_step=4519.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5171/5971 [46:47<07:14,  1.84it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00372, train/loss_vlb_step=1.95e-5, train/loss_step=0.00372, global_step=4519.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5171/5971 [46:47<07:14,  1.84it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0413, train/loss_vlb_step=0.000157, train/loss_step=0.0413, global_step=4519.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  87%|████████▋ | 5172/5971 [46:49<07:13,  1.84it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0413, train/loss_vlb_step=0.000157, train/loss_step=0.0413, global_step=4519.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5172/5971 [46:49<07:13,  1.84it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.000121, train/loss_step=0.0325, global_step=4519.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  87%|████████▋ | 5173/5971 [46:50<07:13,  1.84it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0325, train/loss_vlb_step=0.000121, train/loss_step=0.0325, global_step=4519.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5173/5971 [46:50<07:13,  1.84it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0855, train/loss_vlb_step=0.000287, train/loss_step=0.0855, global_step=4520.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5174/5971 [46:51<07:13,  1.84it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0855, train/loss_vlb_step=0.000287, train/loss_step=0.0855, global_step=4520.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5174/5971 [46:51<07:13,  1.84it/s, loss=0.149, v_num=0, train/loss_simple_step=0.474, train/loss_vlb_step=0.00373, train/loss_step=0.474, global_step=4520.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  87%|████████▋ | 5175/5971 [46:52<07:12,  1.84it/s, loss=0.149, v_num=0, train/loss_simple_step=0.474, train/loss_vlb_step=0.00373, train/loss_step=0.474, global_step=4520.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5175/5971 [46:52<07:12,  1.84it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0755, train/loss_vlb_step=0.000256, train/loss_step=0.0755, global_step=4520.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5176/5971 [46:54<07:12,  1.84it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0755, train/loss_vlb_step=0.000256, train/loss_step=0.0755, global_step=4520.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5176/5971 [46:54<07:12,  1.84it/s, loss=0.159, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000801, train/loss_step=0.214, global_step=4520.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  87%|████████▋ | 5177/5971 [46:55<07:11,  1.84it/s, loss=0.159, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000801, train/loss_step=0.214, global_step=4520.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5177/5971 [46:55<07:11,  1.84it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0402, train/loss_vlb_step=0.000144, train/loss_step=0.0402, global_step=4521.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5178/5971 [46:56<07:11,  1.84it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0402, train/loss_vlb_step=0.000144, train/loss_step=0.0402, global_step=4521.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5178/5971 [46:56<07:11,  1.84it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00155, train/loss_vlb_step=9.44e-6, train/loss_step=0.00155, global_step=4521.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5179/5971 [46:57<07:10,  1.84it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00155, train/loss_vlb_step=9.44e-6, train/loss_step=0.00155, global_step=4521.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5179/5971 [46:57<07:10,  1.84it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=4521.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  87%|████████▋ | 5180/5971 [46:59<07:10,  1.84it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0868, train/loss_vlb_step=0.000286, train/loss_step=0.0868, global_step=4521.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5180/5971 [46:59<07:10,  1.84it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0035, train/loss_vlb_step=1.86e-5, train/loss_step=0.0035, global_step=4521.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  87%|████████▋ | 5181/5971 [47:00<07:09,  1.84it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0035, train/loss_vlb_step=1.86e-5, train/loss_step=0.0035, global_step=4521.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5181/5971 [47:00<07:09,  1.84it/s, loss=0.118, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000583, train/loss_step=0.172, global_step=4522.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  87%|████████▋ | 5182/5971 [47:01<07:09,  1.84it/s, loss=0.118, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000583, train/loss_step=0.172, global_step=4522.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5182/5971 [47:01<07:09,  1.84it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00324, train/loss_vlb_step=1.75e-5, train/loss_step=0.00324, global_step=4522.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5183/5971 [47:02<07:08,  1.84it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00324, train/loss_vlb_step=1.75e-5, train/loss_step=0.00324, global_step=4522.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5183/5971 [47:02<07:08,  1.84it/s, loss=0.0949, v_num=0, train/loss_simple_step=0.00234, train/loss_vlb_step=1.32e-5, train/loss_step=0.00234, global_step=4522.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5184/5971 [47:04<07:08,  1.84it/s, loss=0.0949, v_num=0, train/loss_simple_step=0.00234, train/loss_vlb_step=1.32e-5, train/loss_step=0.00234, global_step=4522.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5184/5971 [47:04<07:08,  1.84it/s, loss=0.0905, v_num=0, train/loss_simple_step=0.0394, train/loss_vlb_step=0.00015, train/loss_step=0.0394, global_step=4522.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  87%|████████▋ | 5185/5971 [47:05<07:08,  1.84it/s, loss=0.0905, v_num=0, train/loss_simple_step=0.0394, train/loss_vlb_step=0.00015, train/loss_step=0.0394, global_step=4522.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5185/5971 [47:05<07:08,  1.84it/s, loss=0.0998, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.000658, train/loss_step=0.197, global_step=4523.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  87%|████████▋ | 5186/5971 [47:06<07:07,  1.84it/s, loss=0.0998, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.000658, train/loss_step=0.197, global_step=4523.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5186/5971 [47:06<07:07,  1.84it/s, loss=0.102, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.00037, train/loss_step=0.111, global_step=4523.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  87%|████████▋ | 5187/5971 [47:06<07:07,  1.84it/s, loss=0.102, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.00037, train/loss_step=0.111, global_step=4523.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5187/5971 [47:06<07:07,  1.84it/s, loss=0.115, v_num=0, train/loss_simple_step=0.358, train/loss_vlb_step=0.00167, train/loss_step=0.358, global_step=4523.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5188/5971 [47:09<07:06,  1.83it/s, loss=0.115, v_num=0, train/loss_simple_step=0.358, train/loss_vlb_step=0.00167, train/loss_step=0.358, global_step=4523.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5188/5971 [47:09<07:06,  1.83it/s, loss=0.113, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.000335, train/loss_step=0.100, global_step=4523.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5189/5971 [47:09<07:06,  1.83it/s, loss=0.113, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.000335, train/loss_step=0.100, global_step=4523.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5189/5971 [47:09<07:06,  1.83it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0168, train/loss_vlb_step=6.84e-5, train/loss_step=0.0168, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5190/5971 [47:10<07:05,  1.83it/s, loss=0.103, v_num=0, train/loss_simple_step=0.0168, train/loss_vlb_step=6.84e-5, train/loss_step=0.0168, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5190/5971 [47:10<07:05,  1.83it/s, loss=0.116, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00112, train/loss_step=0.269, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  87%|████████▋ | 5191/5971 [47:11<07:05,  1.83it/s, loss=0.116, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00112, train/loss_step=0.269, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5191/5971 [47:11<07:05,  1.83it/s, loss=0.133, v_num=0, train/loss_simple_step=0.384, train/loss_vlb_step=0.00203, train/loss_step=0.384, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5192/5971 [47:14<07:05,  1.83it/s, loss=0.133, v_num=0, train/loss_simple_step=0.384, train/loss_vlb_step=0.00203, train/loss_step=0.384, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  87%|████████▋ | 5192/5971 [47:14<07:05,  1.83it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:20,  2.06it/s][A
Epoch 7:  87%|████████▋ | 5194/5971 [47:14<07:03,  1.83it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   1%|          | 2/167 [00:00<00:44,  3.68it/s][A
Epoch 7:  87%|████████▋ | 5196/5971 [47:14<07:02,  1.83it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   3%|▎         | 5/167 [00:00<00:16,  9.53it/s][A
Epoch 7:  87%|████████▋ | 5199/5971 [47:14<07:00,  1.83it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   5%|▍         | 8/167 [00:00<00:11, 14.12it/s][A
Epoch 7:  87%|████████▋ | 5202/5971 [47:14<06:59,  1.84it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   7%|▋         | 11/167 [00:00<00:08, 17.58it/s][A
Epoch 7:  87%|████████▋ | 5205/5971 [47:15<06:57,  1.84it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.27it/s][A
Epoch 7:  87%|████████▋ | 5208/5971 [47:15<06:55,  1.84it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  10%|█         | 17/167 [00:01<00:06, 21.60it/s][A
Epoch 7:  87%|████████▋ | 5211/5971 [47:15<06:53,  1.84it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 23.30it/s][A
Epoch 7:  87%|████████▋ | 5214/5971 [47:15<06:51,  1.84it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  14%|█▍        | 23/167 [00:01<00:05, 24.16it/s][A
Epoch 7:  87%|████████▋ | 5217/5971 [47:15<06:49,  1.84it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.84it/s][A
Epoch 7:  87%|████████▋ | 5220/5971 [47:15<06:47,  1.84it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 25.27it/s][A
Epoch 7:  87%|████████▋ | 5223/5971 [47:15<06:46,  1.84it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 25.91it/s][A
Epoch 7:  88%|████████▊ | 5226/5971 [47:15<06:44,  1.84it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  21%|██        | 35/167 [00:01<00:05, 25.93it/s][A
Epoch 7:  88%|████████▊ | 5229/5971 [47:15<06:42,  1.84it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  23%|██▎       | 38/167 [00:01<00:05, 25.10it/s][A
Epoch 7:  88%|████████▊ | 5232/5971 [47:16<06:40,  1.85it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 25.91it/s][A
Epoch 7:  88%|████████▊ | 5235/5971 [47:16<06:38,  1.85it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 26.62it/s][A
Epoch 7:  88%|████████▊ | 5238/5971 [47:16<06:36,  1.85it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 26.80it/s][A
Epoch 7:  88%|████████▊ | 5241/5971 [47:16<06:34,  1.85it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 26.08it/s][A
Epoch 7:  88%|████████▊ | 5244/5971 [47:16<06:33,  1.85it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 26.45it/s][A
Epoch 7:  88%|████████▊ | 5247/5971 [47:16<06:31,  1.85it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 26.28it/s][A
Epoch 7:  88%|████████▊ | 5250/5971 [47:16<06:29,  1.85it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  35%|███▌      | 59/167 [00:02<00:04, 26.43it/s][A
Epoch 7:  88%|████████▊ | 5253/5971 [47:16<06:27,  1.85it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  37%|███▋      | 62/167 [00:02<00:03, 26.68it/s][A
Epoch 7:  88%|████████▊ | 5256/5971 [47:16<06:25,  1.85it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  39%|███▉      | 65/167 [00:03<00:04, 25.48it/s][A
Epoch 7:  88%|████████▊ | 5259/5971 [47:17<06:24,  1.85it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  41%|████      | 68/167 [00:03<00:03, 25.11it/s][A
Epoch 7:  88%|████████▊ | 5262/5971 [47:17<06:22,  1.85it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 25.46it/s][A
Epoch 7:  88%|████████▊ | 5265/5971 [47:17<06:20,  1.86it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 25.17it/s][A
Epoch 7:  88%|████████▊ | 5268/5971 [47:17<06:18,  1.86it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 26.83it/s][A
Epoch 7:  88%|████████▊ | 5271/5971 [47:17<06:16,  1.86it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 26.15it/s][A
Epoch 7:  88%|████████▊ | 5274/5971 [47:17<06:14,  1.86it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  50%|█████     | 84/167 [00:03<00:03, 24.92it/s][A
Epoch 7:  88%|████████▊ | 5277/5971 [47:17<06:13,  1.86it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  52%|█████▏    | 87/167 [00:03<00:03, 25.08it/s][A
Epoch 7:  88%|████████▊ | 5280/5971 [47:17<06:11,  1.86it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  54%|█████▍    | 90/167 [00:03<00:02, 26.28it/s][A
Epoch 7:  88%|████████▊ | 5283/5971 [47:18<06:09,  1.86it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 25.45it/s][A
Epoch 7:  89%|████████▊ | 5286/5971 [47:18<06:07,  1.86it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 25.79it/s][A
Epoch 7:  89%|████████▊ | 5289/5971 [47:18<06:05,  1.86it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 26.38it/s][A
Epoch 7:  89%|████████▊ | 5292/5971 [47:18<06:04,  1.86it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  61%|██████    | 102/167 [00:04<00:02, 26.22it/s][A
Epoch 7:  89%|████████▊ | 5295/5971 [47:18<06:02,  1.87it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 24.52it/s][A
Epoch 7:  89%|████████▊ | 5298/5971 [47:18<06:00,  1.87it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 24.12it/s][A
Epoch 7:  89%|████████▉ | 5301/5971 [47:18<05:58,  1.87it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 25.57it/s][A
Epoch 7:  89%|████████▉ | 5304/5971 [47:18<05:56,  1.87it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  68%|██████▊   | 114/167 [00:04<00:02, 25.20it/s][A
Epoch 7:  89%|████████▉ | 5307/5971 [47:18<05:55,  1.87it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  70%|███████   | 117/167 [00:05<00:01, 26.14it/s][A
Epoch 7:  89%|████████▉ | 5310/5971 [47:19<05:53,  1.87it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 26.50it/s][A
Epoch 7:  89%|████████▉ | 5313/5971 [47:19<05:51,  1.87it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 25.81it/s][A
Epoch 7:  89%|████████▉ | 5316/5971 [47:19<05:49,  1.87it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 26.39it/s][A
Epoch 7:  89%|████████▉ | 5319/5971 [47:19<05:47,  1.87it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 26.20it/s][A
Epoch 7:  89%|████████▉ | 5322/5971 [47:19<05:46,  1.87it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 26.85it/s][A
Epoch 7:  89%|████████▉ | 5325/5971 [47:19<05:44,  1.88it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  81%|████████  | 135/167 [00:05<00:01, 27.07it/s][A
Epoch 7:  89%|████████▉ | 5328/5971 [47:19<05:42,  1.88it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 26.88it/s][A
Epoch 7:  89%|████████▉ | 5331/5971 [47:19<05:40,  1.88it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  84%|████████▍ | 141/167 [00:05<00:00, 27.41it/s][A
Epoch 7:  89%|████████▉ | 5334/5971 [47:19<05:39,  1.88it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 27.09it/s][A
Epoch 7:  89%|████████▉ | 5337/5971 [47:20<05:37,  1.88it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 27.43it/s][A
Epoch 7:  89%|████████▉ | 5340/5971 [47:20<05:35,  1.88it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 26.89it/s][A
Epoch 7:  89%|████████▉ | 5343/5971 [47:20<05:33,  1.88it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 25.52it/s][A
Epoch 7:  90%|████████▉ | 5346/5971 [47:20<05:32,  1.88it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 26.49it/s][A
Epoch 7:  90%|████████▉ | 5349/5971 [47:20<05:30,  1.88it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 27.05it/s][A
Epoch 7:  90%|████████▉ | 5352/5971 [47:20<05:28,  1.88it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 27.90it/s][A
Epoch 7:  90%|████████▉ | 5356/5971 [47:20<05:26,  1.89it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 26.87it/s][A
Epoch 7:  90%|████████▉ | 5360/5971 [47:20<05:23,  1.89it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|████████▉ | 5360/5971 [47:21<05:23,  1.89it/s, loss=0.146, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00116, train/loss_step=0.296, global_step=4524.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Epoch 7:  90%|████████▉ | 5361/5971 [47:22<05:23,  1.89it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0678, train/loss_vlb_step=0.000227, train/loss_step=0.0678, global_step=4525.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|████████▉ | 5362/5971 [47:23<05:22,  1.89it/s, loss=0.132, v_num=0, train/loss_simple_step=0.199, train/loss_vlb_step=0.000776, train/loss_step=0.199, global_step=4525.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  90%|████████▉ | 5363/5971 [47:24<05:22,  1.89it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0746, train/loss_vlb_step=0.000246, train/loss_step=0.0746, global_step=4525.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|████████▉ | 5364/5971 [47:26<05:22,  1.88it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0746, train/loss_vlb_step=0.000246, train/loss_step=0.0746, global_step=4525.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|████████▉ | 5364/5971 [47:26<05:22,  1.88it/s, loss=0.145, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00492, train/loss_step=0.480, global_step=4525.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  90%|████████▉ | 5365/5971 [47:27<05:21,  1.88it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0625, train/loss_vlb_step=0.000206, train/loss_step=0.0625, global_step=4526.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|████████▉ | 5366/5971 [47:28<05:21,  1.88it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00179, train/loss_vlb_step=1.03e-5, train/loss_step=0.00179, global_step=4526.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|████████▉ | 5367/5971 [47:28<05:20,  1.88it/s, loss=0.152, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000694, train/loss_step=0.192, global_step=4526.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  90%|████████▉ | 5368/5971 [47:31<05:20,  1.88it/s, loss=0.152, v_num=0, train/loss_simple_step=0.192, train/loss_vlb_step=0.000694, train/loss_step=0.192, global_step=4526.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|████████▉ | 5368/5971 [47:31<05:20,  1.88it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0736, train/loss_vlb_step=0.00025, train/loss_step=0.0736, global_step=4526.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|████████▉ | 5369/5971 [47:32<05:19,  1.88it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00897, train/loss_vlb_step=4.01e-5, train/loss_step=0.00897, global_step=4527.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|████████▉ | 5370/5971 [47:33<05:19,  1.88it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0432, train/loss_vlb_step=0.000155, train/loss_step=0.0432, global_step=4527.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  90%|████████▉ | 5371/5971 [47:33<05:18,  1.88it/s, loss=0.189, v_num=0, train/loss_simple_step=0.799, train/loss_vlb_step=0.0413, train/loss_step=0.799, global_step=4527.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  90%|████████▉ | 5372/5971 [47:36<05:18,  1.88it/s, loss=0.189, v_num=0, train/loss_simple_step=0.799, train/loss_vlb_step=0.0413, train/loss_step=0.799, global_step=4527.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|████████▉ | 5372/5971 [47:36<05:18,  1.88it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0127, train/loss_vlb_step=5.23e-5, train/loss_step=0.0127, global_step=4527.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|████████▉ | 5373/5971 [47:37<05:17,  1.88it/s, loss=0.188, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000869, train/loss_step=0.208, global_step=4528.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  90%|█████████ | 5374/5971 [47:37<05:17,  1.88it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0961, train/loss_vlb_step=0.000317, train/loss_step=0.0961, global_step=4528.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|█████████ | 5375/5971 [47:38<05:16,  1.88it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0694, train/loss_vlb_step=0.000231, train/loss_step=0.0694, global_step=4528.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|█████████ | 5376/5971 [47:41<05:16,  1.88it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0694, train/loss_vlb_step=0.000231, train/loss_step=0.0694, global_step=4528.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|█████████ | 5376/5971 [47:41<05:16,  1.88it/s, loss=0.184, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.00124, train/loss_step=0.319, global_step=4528.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  90%|█████████ | 5377/5971 [47:41<05:16,  1.88it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00306, train/loss_vlb_step=1.64e-5, train/loss_step=0.00306, global_step=4529.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|█████████ | 5378/5971 [47:42<05:15,  1.88it/s, loss=0.172, v_num=0, train/loss_simple_step=0.053, train/loss_vlb_step=0.00019, train/loss_step=0.053, global_step=4529.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  90%|█████████ | 5379/5971 [47:43<05:15,  1.88it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0391, train/loss_vlb_step=0.000137, train/loss_step=0.0391, global_step=4529.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|█████████ | 5380/5971 [47:45<05:14,  1.88it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0391, train/loss_vlb_step=0.000137, train/loss_step=0.0391, global_step=4529.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|█████████ | 5380/5971 [47:45<05:14,  1.88it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0457, train/loss_vlb_step=0.000162, train/loss_step=0.0457, global_step=4529.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|█████████ | 5381/5971 [47:46<05:14,  1.88it/s, loss=0.141, v_num=0, train/loss_simple_step=0.045, train/loss_vlb_step=0.000157, train/loss_step=0.045, global_step=4530.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  90%|█████████ | 5382/5971 [47:47<05:13,  1.88it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00483, train/loss_vlb_step=2.41e-5, train/loss_step=0.00483, global_step=4530.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|█████████ | 5383/5971 [47:48<05:13,  1.88it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=4.86e-5, train/loss_step=0.0113, global_step=4530.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  90%|█████████ | 5384/5971 [47:50<05:12,  1.88it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=4.86e-5, train/loss_step=0.0113, global_step=4530.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|█████████ | 5384/5971 [47:50<05:12,  1.88it/s, loss=0.123, v_num=0, train/loss_simple_step=0.370, train/loss_vlb_step=0.00199, train/loss_step=0.370, global_step=4530.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  90%|█████████ | 5385/5971 [47:51<05:12,  1.88it/s, loss=0.126, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000384, train/loss_step=0.117, global_step=4531.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|█████████ | 5386/5971 [47:52<05:11,  1.88it/s, loss=0.134, v_num=0, train/loss_simple_step=0.169, train/loss_vlb_step=0.000566, train/loss_step=0.169, global_step=4531.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|█████████ | 5387/5971 [47:53<05:11,  1.88it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00638, train/loss_vlb_step=3.26e-5, train/loss_step=0.00638, global_step=4531.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|█████████ | 5388/5971 [47:55<05:11,  1.87it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00638, train/loss_vlb_step=3.26e-5, train/loss_step=0.00638, global_step=4531.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|█████████ | 5388/5971 [47:55<05:11,  1.87it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00473, train/loss_vlb_step=2.48e-5, train/loss_step=0.00473, global_step=4531.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|█████████ | 5389/5971 [47:56<05:10,  1.87it/s, loss=0.131, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000755, train/loss_step=0.202, global_step=4532.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  90%|█████████ | 5390/5971 [47:57<05:10,  1.87it/s, loss=0.131, v_num=0, train/loss_simple_step=0.053, train/loss_vlb_step=0.000178, train/loss_step=0.053, global_step=4532.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|█████████ | 5391/5971 [47:58<05:09,  1.87it/s, loss=0.0916, v_num=0, train/loss_simple_step=0.00294, train/loss_vlb_step=1.63e-5, train/loss_step=0.00294, global_step=4532.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|█████████ | 5392/5971 [48:00<05:09,  1.87it/s, loss=0.0916, v_num=0, train/loss_simple_step=0.00294, train/loss_vlb_step=1.63e-5, train/loss_step=0.00294, global_step=4532.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|█████████ | 5392/5971 [48:00<05:09,  1.87it/s, loss=0.0919, v_num=0, train/loss_simple_step=0.0181, train/loss_vlb_step=7.54e-5, train/loss_step=0.0181, global_step=4532.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  90%|█████████ | 5393/5971 [48:01<05:08,  1.87it/s, loss=0.0827, v_num=0, train/loss_simple_step=0.0242, train/loss_vlb_step=9.33e-5, train/loss_step=0.0242, global_step=4533.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|█████████ | 5394/5971 [48:02<05:08,  1.87it/s, loss=0.106, v_num=0, train/loss_simple_step=0.568, train/loss_vlb_step=0.00841, train/loss_step=0.568, global_step=4533.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  90%|█████████ | 5395/5971 [48:03<05:07,  1.87it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0326, train/loss_vlb_step=0.000118, train/loss_step=0.0326, global_step=4533.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|█████████ | 5396/5971 [48:05<05:07,  1.87it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0326, train/loss_vlb_step=0.000118, train/loss_step=0.0326, global_step=4533.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|█████████ | 5396/5971 [48:05<05:07,  1.87it/s, loss=0.0893, v_num=0, train/loss_simple_step=0.0153, train/loss_vlb_step=6.65e-5, train/loss_step=0.0153, global_step=4533.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|█████████ | 5397/5971 [48:06<05:06,  1.87it/s, loss=0.117, v_num=0, train/loss_simple_step=0.552, train/loss_vlb_step=0.00592, train/loss_step=0.552, global_step=4534.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  90%|█████████ | 5398/5971 [48:07<05:06,  1.87it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0065, train/loss_vlb_step=3.11e-5, train/loss_step=0.0065, global_step=4534.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|█████████ | 5399/5971 [48:07<05:05,  1.87it/s, loss=0.119, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=4534.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  90%|█████████ | 5400/5971 [48:10<05:05,  1.87it/s, loss=0.119, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000407, train/loss_step=0.124, global_step=4534.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|█████████ | 5400/5971 [48:10<05:05,  1.87it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00217, train/loss_vlb_step=1.29e-5, train/loss_step=0.00217, global_step=4534.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|█████████ | 5401/5971 [48:11<05:05,  1.87it/s, loss=0.119, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=4535.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  90%|█████████ | 5402/5971 [48:11<05:04,  1.87it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00665, train/loss_vlb_step=3.33e-5, train/loss_step=0.00665, global_step=4535.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  90%|█████████ | 5403/5971 [48:12<05:04,  1.87it/s, loss=0.153, v_num=0, train/loss_simple_step=0.692, train/loss_vlb_step=0.015, train/loss_step=0.692, global_step=4535.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]      
Epoch 7:  91%|█████████ | 5404/5971 [48:14<05:03,  1.87it/s, loss=0.153, v_num=0, train/loss_simple_step=0.692, train/loss_vlb_step=0.015, train/loss_step=0.692, global_step=4535.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5404/5971 [48:14<05:03,  1.87it/s, loss=0.149, v_num=0, train/loss_simple_step=0.286, train/loss_vlb_step=0.0012, train/loss_step=0.286, global_step=4535.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5405/5971 [48:15<05:03,  1.87it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0224, train/loss_vlb_step=8.59e-5, train/loss_step=0.0224, global_step=4536.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5406/5971 [48:16<05:02,  1.87it/s, loss=0.138, v_num=0, train/loss_simple_step=0.043, train/loss_vlb_step=0.000153, train/loss_step=0.043, global_step=4536.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  91%|█████████ | 5407/5971 [48:17<05:02,  1.87it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0473, train/loss_vlb_step=0.000171, train/loss_step=0.0473, global_step=4536.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5408/5971 [48:19<05:01,  1.87it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0473, train/loss_vlb_step=0.000171, train/loss_step=0.0473, global_step=4536.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5408/5971 [48:19<05:01,  1.87it/s, loss=0.147, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000446, train/loss_step=0.136, global_step=4536.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  91%|█████████ | 5409/5971 [48:20<05:01,  1.86it/s, loss=0.143, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000416, train/loss_step=0.124, global_step=4537.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5410/5971 [48:21<05:00,  1.86it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0585, train/loss_vlb_step=0.000207, train/loss_step=0.0585, global_step=4537.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5411/5971 [48:22<05:00,  1.86it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0709, train/loss_vlb_step=0.000237, train/loss_step=0.0709, global_step=4537.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5412/5971 [48:24<04:59,  1.86it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0709, train/loss_vlb_step=0.000237, train/loss_step=0.0709, global_step=4537.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5412/5971 [48:24<04:59,  1.86it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00403, train/loss_vlb_step=1.96e-5, train/loss_step=0.00403, global_step=4537.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5413/5971 [48:25<04:59,  1.86it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00937, train/loss_vlb_step=4.29e-5, train/loss_step=0.00937, global_step=4538.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5414/5971 [48:26<04:58,  1.86it/s, loss=0.127, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000723, train/loss_step=0.208, global_step=4538.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  91%|█████████ | 5415/5971 [48:27<04:58,  1.86it/s, loss=0.135, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000638, train/loss_step=0.184, global_step=4538.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5416/5971 [48:29<04:58,  1.86it/s, loss=0.135, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000638, train/loss_step=0.184, global_step=4538.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5416/5971 [48:29<04:58,  1.86it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00309, train/loss_vlb_step=1.7e-5, train/loss_step=0.00309, global_step=4538.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5417/5971 [48:30<04:57,  1.86it/s, loss=0.124, v_num=0, train/loss_simple_step=0.350, train/loss_vlb_step=0.00178, train/loss_step=0.350, global_step=4539.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  91%|█████████ | 5418/5971 [48:31<04:57,  1.86it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0632, train/loss_vlb_step=0.000208, train/loss_step=0.0632, global_step=4539.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5419/5971 [48:32<04:56,  1.86it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00308, train/loss_vlb_step=1.69e-5, train/loss_step=0.00308, global_step=4539.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5420/5971 [48:34<04:56,  1.86it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00308, train/loss_vlb_step=1.69e-5, train/loss_step=0.00308, global_step=4539.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5420/5971 [48:34<04:56,  1.86it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0324, train/loss_vlb_step=0.000118, train/loss_step=0.0324, global_step=4539.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  91%|█████████ | 5421/5971 [48:35<04:55,  1.86it/s, loss=0.127, v_num=0, train/loss_simple_step=0.191, train/loss_vlb_step=0.000638, train/loss_step=0.191, global_step=4540.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  91%|█████████ | 5422/5971 [48:36<04:55,  1.86it/s, loss=0.14, v_num=0, train/loss_simple_step=0.281, train/loss_vlb_step=0.00138, train/loss_step=0.281, global_step=4540.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  91%|█████████ | 5423/5971 [48:37<04:54,  1.86it/s, loss=0.144, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4540.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5424/5971 [48:39<04:54,  1.86it/s, loss=0.144, v_num=0, train/loss_simple_step=0.764, train/loss_vlb_step=0.0238, train/loss_step=0.764, global_step=4540.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5424/5971 [48:39<04:54,  1.86it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00173, train/loss_vlb_step=1.05e-5, train/loss_step=0.00173, global_step=4540.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5425/5971 [48:40<04:53,  1.86it/s, loss=0.169, v_num=0, train/loss_simple_step=0.805, train/loss_vlb_step=0.038, train/loss_step=0.805, global_step=4541.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]     
Epoch 7:  91%|█████████ | 5426/5971 [48:41<04:53,  1.86it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0494, train/loss_vlb_step=0.000169, train/loss_step=0.0494, global_step=4541.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5427/5971 [48:42<04:52,  1.86it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.12e-5, train/loss_step=0.00199, global_step=4541.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5428/5971 [48:44<04:52,  1.86it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.12e-5, train/loss_step=0.00199, global_step=4541.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5428/5971 [48:44<04:52,  1.86it/s, loss=0.173, v_num=0, train/loss_simple_step=0.261, train/loss_vlb_step=0.00101, train/loss_step=0.261, global_step=4541.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  91%|█████████ | 5429/5971 [48:45<04:51,  1.86it/s, loss=0.203, v_num=0, train/loss_simple_step=0.712, train/loss_vlb_step=0.0103, train/loss_step=0.712, global_step=4542.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  91%|█████████ | 5430/5971 [48:45<04:51,  1.86it/s, loss=0.208, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000554, train/loss_step=0.161, global_step=4542.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5431/5971 [48:46<04:50,  1.86it/s, loss=0.252, v_num=0, train/loss_simple_step=0.963, train/loss_vlb_step=0.485, train/loss_step=0.963, global_step=4542.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  91%|█████████ | 5432/5971 [48:49<04:50,  1.85it/s, loss=0.252, v_num=0, train/loss_simple_step=0.963, train/loss_vlb_step=0.485, train/loss_step=0.963, global_step=4542.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5432/5971 [48:49<04:50,  1.85it/s, loss=0.269, v_num=0, train/loss_simple_step=0.332, train/loss_vlb_step=0.00127, train/loss_step=0.332, global_step=4542.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5433/5971 [48:49<04:50,  1.85it/s, loss=0.275, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000428, train/loss_step=0.129, global_step=4543.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5434/5971 [48:50<04:49,  1.85it/s, loss=0.264, v_num=0, train/loss_simple_step=0.00238, train/loss_vlb_step=1.45e-5, train/loss_step=0.00238, global_step=4543.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5435/5971 [48:51<04:49,  1.85it/s, loss=0.259, v_num=0, train/loss_simple_step=0.0746, train/loss_vlb_step=0.000245, train/loss_step=0.0746, global_step=4543.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  91%|█████████ | 5436/5971 [48:54<04:48,  1.85it/s, loss=0.259, v_num=0, train/loss_simple_step=0.0746, train/loss_vlb_step=0.000245, train/loss_step=0.0746, global_step=4543.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5436/5971 [48:54<04:48,  1.85it/s, loss=0.26, v_num=0, train/loss_simple_step=0.0263, train/loss_vlb_step=9.97e-5, train/loss_step=0.0263, global_step=4543.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  91%|█████████ | 5437/5971 [48:55<04:48,  1.85it/s, loss=0.243, v_num=0, train/loss_simple_step=0.00586, train/loss_vlb_step=2.91e-5, train/loss_step=0.00586, global_step=4544.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5438/5971 [48:55<04:47,  1.85it/s, loss=0.243, v_num=0, train/loss_simple_step=0.0592, train/loss_vlb_step=0.000197, train/loss_step=0.0592, global_step=4544.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  91%|█████████ | 5439/5971 [48:56<04:47,  1.85it/s, loss=0.246, v_num=0, train/loss_simple_step=0.0743, train/loss_vlb_step=0.000264, train/loss_step=0.0743, global_step=4544.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5440/5971 [48:59<04:46,  1.85it/s, loss=0.246, v_num=0, train/loss_simple_step=0.0743, train/loss_vlb_step=0.000264, train/loss_step=0.0743, global_step=4544.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5440/5971 [48:59<04:46,  1.85it/s, loss=0.251, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000452, train/loss_step=0.132, global_step=4544.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  91%|█████████ | 5441/5971 [48:59<04:46,  1.85it/s, loss=0.262, v_num=0, train/loss_simple_step=0.402, train/loss_vlb_step=0.00271, train/loss_step=0.402, global_step=4545.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  91%|█████████ | 5442/5971 [49:00<04:45,  1.85it/s, loss=0.248, v_num=0, train/loss_simple_step=0.00392, train/loss_vlb_step=2.08e-5, train/loss_step=0.00392, global_step=4545.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5443/5971 [49:01<04:45,  1.85it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0674, train/loss_vlb_step=0.000228, train/loss_step=0.0674, global_step=4545.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  91%|█████████ | 5444/5971 [49:03<04:44,  1.85it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0674, train/loss_vlb_step=0.000228, train/loss_step=0.0674, global_step=4545.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5444/5971 [49:03<04:44,  1.85it/s, loss=0.228, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00191, train/loss_step=0.305, global_step=4545.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  91%|█████████ | 5445/5971 [49:04<04:44,  1.85it/s, loss=0.212, v_num=0, train/loss_simple_step=0.471, train/loss_vlb_step=0.00267, train/loss_step=0.471, global_step=4546.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5446/5971 [49:05<04:43,  1.85it/s, loss=0.214, v_num=0, train/loss_simple_step=0.0982, train/loss_vlb_step=0.000331, train/loss_step=0.0982, global_step=4546.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5447/5971 [49:06<04:43,  1.85it/s, loss=0.214, v_num=0, train/loss_simple_step=0.00141, train/loss_vlb_step=8.28e-6, train/loss_step=0.00141, global_step=4546.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5448/5971 [49:08<04:43,  1.85it/s, loss=0.214, v_num=0, train/loss_simple_step=0.00141, train/loss_vlb_step=8.28e-6, train/loss_step=0.00141, global_step=4546.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████ | 5448/5971 [49:08<04:43,  1.85it/s, loss=0.201, v_num=0, train/loss_simple_step=0.00472, train/loss_vlb_step=2.4e-5, train/loss_step=0.00472, global_step=4546.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  91%|█████████▏| 5449/5971 [49:09<04:42,  1.85it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0263, train/loss_vlb_step=0.000102, train/loss_step=0.0263, global_step=4547.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████▏| 5450/5971 [49:10<04:42,  1.85it/s, loss=0.18, v_num=0, train/loss_simple_step=0.414, train/loss_vlb_step=0.00205, train/loss_step=0.414, global_step=4547.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  91%|█████████▏| 5451/5971 [49:11<04:41,  1.85it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0253, train/loss_vlb_step=9.59e-5, train/loss_step=0.0253, global_step=4547.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████▏| 5452/5971 [49:13<04:41,  1.85it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0253, train/loss_vlb_step=9.59e-5, train/loss_step=0.0253, global_step=4547.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████▏| 5452/5971 [49:13<04:41,  1.85it/s, loss=0.117, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.77e-5, train/loss_step=0.016, global_step=4547.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  91%|█████████▏| 5453/5971 [49:14<04:40,  1.85it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00342, train/loss_vlb_step=1.9e-5, train/loss_step=0.00342, global_step=4548.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████▏| 5454/5971 [49:15<04:40,  1.85it/s, loss=0.132, v_num=0, train/loss_simple_step=0.425, train/loss_vlb_step=0.00231, train/loss_step=0.425, global_step=4548.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  91%|█████████▏| 5455/5971 [49:16<04:39,  1.85it/s, loss=0.141, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00104, train/loss_step=0.253, global_step=4548.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████▏| 5456/5971 [49:18<04:39,  1.84it/s, loss=0.141, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00104, train/loss_step=0.253, global_step=4548.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████▏| 5456/5971 [49:18<04:39,  1.84it/s, loss=0.144, v_num=0, train/loss_simple_step=0.089, train/loss_vlb_step=0.000295, train/loss_step=0.089, global_step=4548.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████▏| 5457/5971 [49:19<04:38,  1.84it/s, loss=0.165, v_num=0, train/loss_simple_step=0.423, train/loss_vlb_step=0.00354, train/loss_step=0.423, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  91%|█████████▏| 5458/5971 [49:20<04:38,  1.84it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00824, train/loss_vlb_step=3.89e-5, train/loss_step=0.00824, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████▏| 5459/5971 [49:21<04:37,  1.84it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0518, train/loss_vlb_step=0.000179, train/loss_step=0.0518, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  91%|█████████▏| 5460/5971 [49:23<04:37,  1.84it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0518, train/loss_vlb_step=0.000179, train/loss_step=0.0518, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  91%|█████████▏| 5460/5971 [49:23<04:37,  1.84it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:09,  2.37it/s]

Validating:   1%|          | 2/167 [00:00<00:46,  3.55it/s]
Epoch 7:  92%|█████████▏| 5464/5971 [49:23<04:34,  1.84it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.37it/s]
Epoch 7:  92%|█████████▏| 5468/5971 [49:24<04:32,  1.85it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.83it/s]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.80it/s]
Epoch 7:  92%|█████████▏| 5472/5971 [49:24<04:30,  1.85it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   8%|▊         | 14/167 [00:01<00:08, 19.04it/s]
Epoch 7:  92%|█████████▏| 5476/5971 [49:24<04:27,  1.85it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  10%|█         | 17/167 [00:01<00:07, 19.89it/s]
Epoch 7:  92%|█████████▏| 5480/5971 [49:24<04:25,  1.85it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.00it/s]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 22.22it/s]
Epoch 7:  92%|█████████▏| 5484/5971 [49:24<04:23,  1.85it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  16%|█▌        | 26/167 [00:01<00:06, 23.49it/s]
Epoch 7:  92%|█████████▏| 5488/5971 [49:24<04:20,  1.85it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 24.43it/s]
Epoch 7:  92%|█████████▏| 5492/5971 [49:25<04:18,  1.85it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 24.90it/s]

Validating:  21%|██        | 35/167 [00:01<00:05, 25.53it/s]
Epoch 7:  92%|█████████▏| 5496/5971 [49:25<04:16,  1.85it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  23%|██▎       | 38/167 [00:02<00:05, 25.12it/s]
Epoch 7:  92%|█████████▏| 5500/5971 [49:25<04:13,  1.86it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 26.31it/s]
Epoch 7:  92%|█████████▏| 5504/5971 [49:25<04:11,  1.86it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 25.98it/s]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 26.08it/s]
Epoch 7:  92%|█████████▏| 5508/5971 [49:25<04:09,  1.86it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 26.43it/s]
Epoch 7:  92%|█████████▏| 5512/5971 [49:25<04:06,  1.86it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 26.95it/s]
Epoch 7:  92%|█████████▏| 5516/5971 [49:26<04:04,  1.86it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 26.90it/s]

Validating:  35%|███▌      | 59/167 [00:02<00:03, 27.24it/s]
Epoch 7:  92%|█████████▏| 5520/5971 [49:26<04:02,  1.86it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  37%|███▋      | 62/167 [00:02<00:03, 27.29it/s]
Epoch 7:  93%|█████████▎| 5524/5971 [49:26<03:59,  1.86it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  39%|███▉      | 65/167 [00:03<00:03, 27.25it/s]
Epoch 7:  93%|█████████▎| 5528/5971 [49:26<03:57,  1.86it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  41%|████      | 68/167 [00:03<00:03, 26.10it/s]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 26.44it/s]
Epoch 7:  93%|█████████▎| 5532/5971 [49:26<03:55,  1.87it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 25.22it/s]
Epoch 7:  93%|█████████▎| 5536/5971 [49:26<03:53,  1.87it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 24.94it/s]
Epoch 7:  93%|█████████▎| 5540/5971 [49:26<03:50,  1.87it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 25.69it/s]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 25.17it/s]
Epoch 7:  93%|█████████▎| 5544/5971 [49:27<03:48,  1.87it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  51%|█████▏    | 86/167 [00:03<00:03, 24.79it/s]
Epoch 7:  93%|█████████▎| 5548/5971 [49:27<03:46,  1.87it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  53%|█████▎    | 89/167 [00:03<00:03, 25.72it/s]
Epoch 7:  93%|█████████▎| 5552/5971 [49:27<03:43,  1.87it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  55%|█████▌    | 92/167 [00:04<00:02, 25.83it/s]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 25.41it/s]
Epoch 7:  93%|█████████▎| 5556/5971 [49:27<03:41,  1.87it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 25.62it/s]
Epoch 7:  93%|█████████▎| 5560/5971 [49:27<03:39,  1.87it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  61%|██████    | 102/167 [00:04<00:02, 25.46it/s]
Epoch 7:  93%|█████████▎| 5564/5971 [49:27<03:37,  1.88it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 25.84it/s]
Epoch 7:  93%|█████████▎| 5568/5971 [49:28<03:34,  1.88it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 26.55it/s]

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 26.28it/s]
Epoch 7:  93%|█████████▎| 5572/5971 [49:28<03:32,  1.88it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  68%|██████▊   | 114/167 [00:04<00:01, 26.69it/s]
Epoch 7:  93%|█████████▎| 5576/5971 [49:28<03:30,  1.88it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  70%|███████   | 117/167 [00:05<00:01, 26.41it/s]
Epoch 7:  93%|█████████▎| 5580/5971 [49:28<03:27,  1.88it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 26.17it/s]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 25.57it/s]
Epoch 7:  94%|█████████▎| 5584/5971 [49:28<03:25,  1.88it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 26.68it/s]
Epoch 7:  94%|█████████▎| 5588/5971 [49:28<03:23,  1.88it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 26.75it/s]
Epoch 7:  94%|█████████▎| 5592/5971 [49:28<03:21,  1.88it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  80%|████████  | 134/167 [00:05<00:01, 27.77it/s]
Epoch 7:  94%|█████████▎| 5596/5971 [49:29<03:18,  1.89it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 28.32it/s]
Epoch 7:  94%|█████████▍| 5600/5971 [49:29<03:16,  1.89it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  84%|████████▍ | 141/167 [00:05<00:00, 28.61it/s]
Epoch 7:  94%|█████████▍| 5604/5971 [49:29<03:14,  1.89it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 28.46it/s]
Epoch 7:  94%|█████████▍| 5608/5971 [49:29<03:12,  1.89it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 28.15it/s]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 27.95it/s]
Epoch 7:  94%|█████████▍| 5612/5971 [49:29<03:09,  1.89it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 26.43it/s]
Epoch 7:  94%|█████████▍| 5616/5971 [49:29<03:07,  1.89it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 26.94it/s]
Epoch 7:  94%|█████████▍| 5620/5971 [49:29<03:05,  1.89it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 26.71it/s]
Epoch 7:  94%|█████████▍| 5624/5971 [49:30<03:03,  1.89it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 26.62it/s]

Validating: 100%|██████████| 167/167 [00:06<00:00, 26.31it/s]
Epoch 7:  94%|█████████▍| 5628/5971 [49:30<03:00,  1.90it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  94%|█████████▍| 5628/5971 [49:30<03:01,  1.89it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0028, train/loss_vlb_step=1.42e-5, train/loss_step=0.0028, global_step=4549.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Epoch 7:  94%|█████████▍| 5629/5971 [49:31<03:00,  1.89it/s, loss=0.139, v_num=0, train/loss_simple_step=0.0899, train/loss_vlb_step=0.000302, train/loss_step=0.0899, global_step=4550.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  94%|█████████▍| 5630/5971 [49:32<03:00,  1.89it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00219, train/loss_vlb_step=1.32e-5, train/loss_step=0.00219, global_step=4550.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  94%|█████████▍| 5631/5971 [49:33<02:59,  1.89it/s, loss=0.142, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000398, train/loss_step=0.121, global_step=4550.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  94%|█████████▍| 5632/5971 [49:35<02:59,  1.89it/s, loss=0.142, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000398, train/loss_step=0.121, global_step=4550.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  94%|█████████▍| 5632/5971 [49:35<02:59,  1.89it/s, loss=0.135, v_num=0, train/loss_simple_step=0.182, train/loss_vlb_step=0.000677, train/loss_step=0.182, global_step=4550.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  94%|█████████▍| 5633/5971 [49:36<02:58,  1.89it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0756, train/loss_vlb_step=0.000252, train/loss_step=0.0756, global_step=4551.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  94%|█████████▍| 5634/5971 [49:37<02:58,  1.89it/s, loss=0.117, v_num=0, train/loss_simple_step=0.124, train/loss_vlb_step=0.000409, train/loss_step=0.124, global_step=4551.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  94%|█████████▍| 5635/5971 [49:38<02:57,  1.89it/s, loss=0.133, v_num=0, train/loss_simple_step=0.324, train/loss_vlb_step=0.00177, train/loss_step=0.324, global_step=4551.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  94%|█████████▍| 5636/5971 [49:40<02:57,  1.89it/s, loss=0.133, v_num=0, train/loss_simple_step=0.324, train/loss_vlb_step=0.00177, train/loss_step=0.324, global_step=4551.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  94%|█████████▍| 5636/5971 [49:40<02:57,  1.89it/s, loss=0.135, v_num=0, train/loss_simple_step=0.037, train/loss_vlb_step=0.000138, train/loss_step=0.037, global_step=4551.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  94%|█████████▍| 5637/5971 [49:41<02:56,  1.89it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0424, train/loss_vlb_step=0.000151, train/loss_step=0.0424, global_step=4552.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  94%|█████████▍| 5638/5971 [49:42<02:56,  1.89it/s, loss=0.122, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000459, train/loss_step=0.134, global_step=4552.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  94%|█████████▍| 5639/5971 [49:43<02:55,  1.89it/s, loss=0.128, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000576, train/loss_step=0.162, global_step=4552.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  94%|█████████▍| 5640/5971 [49:45<02:55,  1.89it/s, loss=0.128, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000576, train/loss_step=0.162, global_step=4552.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  94%|█████████▍| 5640/5971 [49:45<02:55,  1.89it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0801, train/loss_vlb_step=0.000266, train/loss_step=0.0801, global_step=4552.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  94%|█████████▍| 5641/5971 [49:46<02:54,  1.89it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00537, train/loss_vlb_step=2.72e-5, train/loss_step=0.00537, global_step=4553.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  94%|█████████▍| 5642/5971 [49:47<02:54,  1.89it/s, loss=0.121, v_num=0, train/loss_simple_step=0.212, train/loss_vlb_step=0.000725, train/loss_step=0.212, global_step=4553.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  95%|█████████▍| 5643/5971 [49:48<02:53,  1.89it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00343, train/loss_vlb_step=1.89e-5, train/loss_step=0.00343, global_step=4553.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▍| 5644/5971 [49:50<02:53,  1.89it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00343, train/loss_vlb_step=1.89e-5, train/loss_step=0.00343, global_step=4553.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▍| 5644/5971 [49:50<02:53,  1.89it/s, loss=0.139, v_num=0, train/loss_simple_step=0.703, train/loss_vlb_step=0.0232, train/loss_step=0.703, global_step=4553.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]     
Epoch 7:  95%|█████████▍| 5645/5971 [49:51<02:52,  1.89it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00826, train/loss_vlb_step=3.82e-5, train/loss_step=0.00826, global_step=4554.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▍| 5646/5971 [49:52<02:52,  1.89it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0346, train/loss_vlb_step=0.000131, train/loss_step=0.0346, global_step=4554.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  95%|█████████▍| 5647/5971 [49:52<02:51,  1.89it/s, loss=0.139, v_num=0, train/loss_simple_step=0.436, train/loss_vlb_step=0.00306, train/loss_step=0.436, global_step=4554.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  95%|█████████▍| 5648/5971 [49:54<02:51,  1.89it/s, loss=0.139, v_num=0, train/loss_simple_step=0.436, train/loss_vlb_step=0.00306, train/loss_step=0.436, global_step=4554.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▍| 5648/5971 [49:54<02:51,  1.89it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00859, train/loss_vlb_step=3.92e-5, train/loss_step=0.00859, global_step=4554.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▍| 5649/5971 [49:55<02:50,  1.89it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00909, train/loss_vlb_step=4.09e-5, train/loss_step=0.00909, global_step=4555.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▍| 5650/5971 [49:56<02:50,  1.89it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00292, train/loss_vlb_step=1.6e-5, train/loss_step=0.00292, global_step=4555.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  95%|█████████▍| 5651/5971 [49:57<02:49,  1.89it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0172, train/loss_vlb_step=7.52e-5, train/loss_step=0.0172, global_step=4555.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  95%|█████████▍| 5652/5971 [50:00<02:49,  1.88it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0172, train/loss_vlb_step=7.52e-5, train/loss_step=0.0172, global_step=4555.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▍| 5652/5971 [50:00<02:49,  1.88it/s, loss=0.137, v_num=0, train/loss_simple_step=0.316, train/loss_vlb_step=0.0015, train/loss_step=0.316, global_step=4555.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  95%|█████████▍| 5653/5971 [50:01<02:48,  1.88it/s, loss=0.143, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000711, train/loss_step=0.205, global_step=4556.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▍| 5654/5971 [50:02<02:48,  1.88it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0137, train/loss_vlb_step=6.15e-5, train/loss_step=0.0137, global_step=4556.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▍| 5655/5971 [50:02<02:47,  1.88it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00189, train/loss_vlb_step=1.07e-5, train/loss_step=0.00189, global_step=4556.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▍| 5656/5971 [50:05<02:47,  1.88it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00189, train/loss_vlb_step=1.07e-5, train/loss_step=0.00189, global_step=4556.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▍| 5656/5971 [50:05<02:47,  1.88it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0238, train/loss_vlb_step=9.41e-5, train/loss_step=0.0238, global_step=4556.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  95%|█████████▍| 5657/5971 [50:06<02:46,  1.88it/s, loss=0.137, v_num=0, train/loss_simple_step=0.367, train/loss_vlb_step=0.00183, train/loss_step=0.367, global_step=4557.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  95%|█████████▍| 5658/5971 [50:06<02:46,  1.88it/s, loss=0.131, v_num=0, train/loss_simple_step=0.00985, train/loss_vlb_step=4.69e-5, train/loss_step=0.00985, global_step=4557.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▍| 5659/5971 [50:07<02:45,  1.88it/s, loss=0.131, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000541, train/loss_step=0.164, global_step=4557.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  95%|█████████▍| 5660/5971 [50:09<02:45,  1.88it/s, loss=0.131, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000541, train/loss_step=0.164, global_step=4557.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▍| 5660/5971 [50:09<02:45,  1.88it/s, loss=0.137, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000635, train/loss_step=0.189, global_step=4557.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▍| 5661/5971 [50:10<02:44,  1.88it/s, loss=0.146, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.000625, train/loss_step=0.186, global_step=4558.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▍| 5662/5971 [50:11<02:44,  1.88it/s, loss=0.143, v_num=0, train/loss_simple_step=0.169, train/loss_vlb_step=0.000582, train/loss_step=0.169, global_step=4558.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▍| 5663/5971 [50:12<02:43,  1.88it/s, loss=0.153, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.00063, train/loss_step=0.188, global_step=4558.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  95%|█████████▍| 5664/5971 [50:14<02:43,  1.88it/s, loss=0.153, v_num=0, train/loss_simple_step=0.188, train/loss_vlb_step=0.00063, train/loss_step=0.188, global_step=4558.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▍| 5664/5971 [50:14<02:43,  1.88it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=6.72e-5, train/loss_step=0.0166, global_step=4558.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▍| 5665/5971 [50:15<02:42,  1.88it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00181, train/loss_vlb_step=1.05e-5, train/loss_step=0.00181, global_step=4559.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▍| 5666/5971 [50:16<02:42,  1.88it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0402, train/loss_vlb_step=0.000146, train/loss_step=0.0402, global_step=4559.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  95%|█████████▍| 5667/5971 [50:17<02:41,  1.88it/s, loss=0.125, v_num=0, train/loss_simple_step=0.577, train/loss_vlb_step=0.00683, train/loss_step=0.577, global_step=4559.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  95%|█████████▍| 5668/5971 [50:19<02:41,  1.88it/s, loss=0.125, v_num=0, train/loss_simple_step=0.577, train/loss_vlb_step=0.00683, train/loss_step=0.577, global_step=4559.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▍| 5668/5971 [50:19<02:41,  1.88it/s, loss=0.143, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.0023, train/loss_step=0.365, global_step=4559.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  95%|█████████▍| 5669/5971 [50:20<02:40,  1.88it/s, loss=0.151, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000564, train/loss_step=0.166, global_step=4560.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▍| 5670/5971 [50:21<02:40,  1.88it/s, loss=0.172, v_num=0, train/loss_simple_step=0.423, train/loss_vlb_step=0.0028, train/loss_step=0.423, global_step=4560.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  95%|█████████▍| 5671/5971 [50:22<02:39,  1.88it/s, loss=0.205, v_num=0, train/loss_simple_step=0.670, train/loss_vlb_step=0.0113, train/loss_step=0.670, global_step=4560.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▍| 5672/5971 [50:24<02:39,  1.88it/s, loss=0.205, v_num=0, train/loss_simple_step=0.670, train/loss_vlb_step=0.0113, train/loss_step=0.670, global_step=4560.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▍| 5672/5971 [50:24<02:39,  1.88it/s, loss=0.189, v_num=0, train/loss_simple_step=0.012, train/loss_vlb_step=5.35e-5, train/loss_step=0.012, global_step=4560.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▌| 5673/5971 [50:25<02:38,  1.88it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0632, train/loss_vlb_step=0.000213, train/loss_step=0.0632, global_step=4561.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▌| 5674/5971 [50:26<02:38,  1.88it/s, loss=0.191, v_num=0, train/loss_simple_step=0.183, train/loss_vlb_step=0.000613, train/loss_step=0.183, global_step=4561.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  95%|█████████▌| 5675/5971 [50:26<02:37,  1.88it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0974, train/loss_vlb_step=0.000321, train/loss_step=0.0974, global_step=4561.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▌| 5676/5971 [50:29<02:37,  1.87it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0974, train/loss_vlb_step=0.000321, train/loss_step=0.0974, global_step=4561.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▌| 5676/5971 [50:29<02:37,  1.87it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.75e-5, train/loss_step=0.0154, global_step=4561.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  95%|█████████▌| 5677/5971 [50:29<02:36,  1.87it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00166, train/loss_vlb_step=9.36e-6, train/loss_step=0.00166, global_step=4562.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▌| 5678/5971 [50:30<02:36,  1.87it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0328, train/loss_vlb_step=0.000118, train/loss_step=0.0328, global_step=4562.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  95%|█████████▌| 5679/5971 [50:31<02:35,  1.87it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0381, train/loss_vlb_step=0.000137, train/loss_step=0.0381, global_step=4562.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▌| 5680/5971 [50:34<02:35,  1.87it/s, loss=0.172, v_num=0, train/loss_simple_step=0.0381, train/loss_vlb_step=0.000137, train/loss_step=0.0381, global_step=4562.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▌| 5680/5971 [50:34<02:35,  1.87it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00141, train/loss_vlb_step=8.51e-6, train/loss_step=0.00141, global_step=4562.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▌| 5681/5971 [50:35<02:34,  1.87it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0335, train/loss_vlb_step=0.00013, train/loss_step=0.0335, global_step=4563.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  95%|█████████▌| 5682/5971 [50:36<02:34,  1.87it/s, loss=0.161, v_num=0, train/loss_simple_step=0.291, train/loss_vlb_step=0.00132, train/loss_step=0.291, global_step=4563.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  95%|█████████▌| 5683/5971 [50:37<02:33,  1.87it/s, loss=0.157, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000339, train/loss_step=0.103, global_step=4563.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▌| 5684/5971 [50:39<02:33,  1.87it/s, loss=0.157, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000339, train/loss_step=0.103, global_step=4563.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▌| 5684/5971 [50:39<02:33,  1.87it/s, loss=0.162, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000416, train/loss_step=0.126, global_step=4563.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▌| 5685/5971 [50:40<02:32,  1.87it/s, loss=0.169, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000478, train/loss_step=0.145, global_step=4564.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▌| 5686/5971 [50:41<02:32,  1.87it/s, loss=0.172, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000344, train/loss_step=0.105, global_step=4564.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▌| 5687/5971 [50:41<02:31,  1.87it/s, loss=0.152, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000552, train/loss_step=0.168, global_step=4564.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▌| 5688/5971 [50:44<02:31,  1.87it/s, loss=0.152, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000552, train/loss_step=0.168, global_step=4564.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▌| 5688/5971 [50:44<02:31,  1.87it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0166, train/loss_vlb_step=7.26e-5, train/loss_step=0.0166, global_step=4564.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▌| 5689/5971 [50:44<02:30,  1.87it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0263, train/loss_vlb_step=0.000103, train/loss_step=0.0263, global_step=4565.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▌| 5690/5971 [50:45<02:30,  1.87it/s, loss=0.121, v_num=0, train/loss_simple_step=0.292, train/loss_vlb_step=0.00112, train/loss_step=0.292, global_step=4565.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  95%|█████████▌| 5691/5971 [50:46<02:29,  1.87it/s, loss=0.0881, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=4.99e-5, train/loss_step=0.0113, global_step=4565.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▌| 5692/5971 [50:48<02:29,  1.87it/s, loss=0.0881, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=4.99e-5, train/loss_step=0.0113, global_step=4565.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▌| 5692/5971 [50:48<02:29,  1.87it/s, loss=0.0922, v_num=0, train/loss_simple_step=0.0952, train/loss_vlb_step=0.000313, train/loss_step=0.0952, global_step=4565.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▌| 5693/5971 [50:49<02:28,  1.87it/s, loss=0.106, v_num=0, train/loss_simple_step=0.339, train/loss_vlb_step=0.00178, train/loss_step=0.339, global_step=4566.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  95%|█████████▌| 5694/5971 [50:50<02:28,  1.87it/s, loss=0.128, v_num=0, train/loss_simple_step=0.629, train/loss_vlb_step=0.0103, train/loss_step=0.629, global_step=4566.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  95%|█████████▌| 5695/5971 [50:51<02:27,  1.87it/s, loss=0.13, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000459, train/loss_step=0.139, global_step=4566.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▌| 5696/5971 [50:53<02:27,  1.87it/s, loss=0.13, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000459, train/loss_step=0.139, global_step=4566.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▌| 5696/5971 [50:53<02:27,  1.87it/s, loss=0.145, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00153, train/loss_step=0.305, global_step=4566.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▌| 5697/5971 [50:54<02:26,  1.87it/s, loss=0.158, v_num=0, train/loss_simple_step=0.256, train/loss_vlb_step=0.000943, train/loss_step=0.256, global_step=4567.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▌| 5698/5971 [50:55<02:26,  1.87it/s, loss=0.165, v_num=0, train/loss_simple_step=0.183, train/loss_vlb_step=0.000722, train/loss_step=0.183, global_step=4567.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▌| 5699/5971 [50:56<02:25,  1.86it/s, loss=0.175, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000905, train/loss_step=0.233, global_step=4567.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▌| 5700/5971 [50:59<02:25,  1.86it/s, loss=0.175, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000905, train/loss_step=0.233, global_step=4567.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▌| 5700/5971 [50:59<02:25,  1.86it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0339, train/loss_vlb_step=0.00013, train/loss_step=0.0339, global_step=4567.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  95%|█████████▌| 5701/5971 [51:00<02:24,  1.86it/s, loss=0.2, v_num=0, train/loss_simple_step=0.503, train/loss_vlb_step=0.00396, train/loss_step=0.503, global_step=4568.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  95%|█████████▌| 5702/5971 [51:00<02:24,  1.86it/s, loss=0.214, v_num=0, train/loss_simple_step=0.574, train/loss_vlb_step=0.00812, train/loss_step=0.574, global_step=4568.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  96%|█████████▌| 5703/5971 [51:01<02:23,  1.86it/s, loss=0.21, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.3e-5, train/loss_step=0.018, global_step=4568.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  96%|█████████▌| 5704/5971 [51:03<02:23,  1.86it/s, loss=0.21, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.3e-5, train/loss_step=0.018, global_step=4568.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  96%|█████████▌| 5704/5971 [51:03<02:23,  1.86it/s, loss=0.204, v_num=0, train/loss_simple_step=0.00512, train/loss_vlb_step=2.47e-5, train/loss_step=0.00512, global_step=4568.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  96%|█████████▌| 5705/5971 [51:04<02:22,  1.86it/s, loss=0.207, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000721, train/loss_step=0.203, global_step=4569.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  96%|█████████▌| 5706/5971 [51:05<02:22,  1.86it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0177, train/loss_vlb_step=7.36e-5, train/loss_step=0.0177, global_step=4569.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  96%|█████████▌| 5707/5971 [51:06<02:21,  1.86it/s, loss=0.202, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000553, train/loss_step=0.163, global_step=4569.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  96%|█████████▌| 5708/5971 [51:08<02:21,  1.86it/s, loss=0.202, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000553, train/loss_step=0.163, global_step=4569.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  96%|█████████▌| 5708/5971 [51:08<02:21,  1.86it/s, loss=0.202, v_num=0, train/loss_simple_step=0.0056, train/loss_vlb_step=2.84e-5, train/loss_step=0.0056, global_step=4569.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  96%|█████████▌| 5709/5971 [51:09<02:20,  1.86it/s, loss=0.201, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.55e-5, train/loss_step=0.016, global_step=4570.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  96%|█████████▌| 5710/5971 [51:10<02:20,  1.86it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0187, train/loss_vlb_step=7.4e-5, train/loss_step=0.0187, global_step=4570.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  96%|█████████▌| 5711/5971 [51:11<02:19,  1.86it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=5.93e-5, train/loss_step=0.0144, global_step=4570.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  96%|█████████▌| 5712/5971 [51:13<02:19,  1.86it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=5.93e-5, train/loss_step=0.0144, global_step=4570.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  96%|█████████▌| 5712/5971 [51:13<02:19,  1.86it/s, loss=0.198, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.0013, train/loss_step=0.296, global_step=4570.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  96%|█████████▌| 5713/5971 [51:14<02:18,  1.86it/s, loss=0.181, v_num=0, train/loss_simple_step=0.00146, train/loss_vlb_step=8.74e-6, train/loss_step=0.00146, global_step=4571.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  96%|█████████▌| 5714/5971 [51:15<02:18,  1.86it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00204, train/loss_vlb_step=1.22e-5, train/loss_step=0.00204, global_step=4571.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  96%|█████████▌| 5715/5971 [51:16<02:17,  1.86it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00315, train/loss_vlb_step=1.72e-5, train/loss_step=0.00315, global_step=4571.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  96%|█████████▌| 5716/5971 [51:18<02:17,  1.86it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00315, train/loss_vlb_step=1.72e-5, train/loss_step=0.00315, global_step=4571.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  96%|█████████▌| 5716/5971 [51:18<02:17,  1.86it/s, loss=0.127, v_num=0, train/loss_simple_step=0.00223, train/loss_vlb_step=1.28e-5, train/loss_step=0.00223, global_step=4571.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  96%|█████████▌| 5717/5971 [51:19<02:16,  1.86it/s, loss=0.115, v_num=0, train/loss_simple_step=0.00856, train/loss_vlb_step=3.96e-5, train/loss_step=0.00856, global_step=4572.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  96%|█████████▌| 5718/5971 [51:20<02:16,  1.86it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000144, train/loss_step=0.0392, global_step=4572.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  96%|█████████▌| 5719/5971 [51:20<02:15,  1.86it/s, loss=0.128, v_num=0, train/loss_simple_step=0.628, train/loss_vlb_step=0.00731, train/loss_step=0.628, global_step=4572.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  96%|█████████▌| 5720/5971 [51:23<02:15,  1.86it/s, loss=0.128, v_num=0, train/loss_simple_step=0.628, train/loss_vlb_step=0.00731, train/loss_step=0.628, global_step=4572.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  96%|█████████▌| 5720/5971 [51:23<02:15,  1.86it/s, loss=0.131, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000373, train/loss_step=0.110, global_step=4572.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  96%|█████████▌| 5721/5971 [51:24<02:14,  1.86it/s, loss=0.124, v_num=0, train/loss_simple_step=0.345, train/loss_vlb_step=0.00143, train/loss_step=0.345, global_step=4573.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  96%|█████████▌| 5722/5971 [51:25<02:14,  1.86it/s, loss=0.097, v_num=0, train/loss_simple_step=0.0426, train/loss_vlb_step=0.000155, train/loss_step=0.0426, global_step=4573.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  96%|█████████▌| 5723/5971 [51:25<02:13,  1.85it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.00317, train/loss_vlb_step=1.66e-5, train/loss_step=0.00317, global_step=4573.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  96%|█████████▌| 5724/5971 [51:28<02:13,  1.85it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.00317, train/loss_vlb_step=1.66e-5, train/loss_step=0.00317, global_step=4573.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  96%|█████████▌| 5724/5971 [51:28<02:13,  1.85it/s, loss=0.0992, v_num=0, train/loss_simple_step=0.0647, train/loss_vlb_step=0.000214, train/loss_step=0.0647, global_step=4573.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  96%|█████████▌| 5725/5971 [51:28<02:12,  1.85it/s, loss=0.0979, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000633, train/loss_step=0.177, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  96%|█████████▌| 5726/5971 [51:29<02:12,  1.85it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0962, train/loss_vlb_step=0.000318, train/loss_step=0.0962, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  96%|█████████▌| 5727/5971 [51:30<02:11,  1.85it/s, loss=0.12, v_num=0, train/loss_simple_step=0.519, train/loss_vlb_step=0.00556, train/loss_step=0.519, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  96%|█████████▌| 5728/5971 [51:33<02:11,  1.85it/s, loss=0.12, v_num=0, train/loss_simple_step=0.519, train/loss_vlb_step=0.00556, train/loss_step=0.519, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  96%|█████████▌| 5728/5971 [51:33<02:11,  1.85it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:15,  2.19it/s]

Validating:   2%|▏         | 3/167 [00:00<00:26,  6.15it/s]
Epoch 7:  96%|█████████▌| 5732/5971 [51:33<02:08,  1.85it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   4%|▎         | 6/167 [00:00<00:13, 11.75it/s]
Epoch 7:  96%|█████████▌| 5736/5971 [51:33<02:06,  1.85it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.72it/s]

Validating:   7%|▋         | 11/167 [00:00<00:08, 17.99it/s]
Epoch 7:  96%|█████████▌| 5740/5971 [51:34<02:04,  1.86it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.82it/s]
Epoch 7:  96%|█████████▌| 5744/5971 [51:34<02:02,  1.86it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  11%|█         | 18/167 [00:01<00:06, 23.51it/s]
Epoch 7:  96%|█████████▋| 5748/5971 [51:34<02:00,  1.86it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 22.84it/s]
Epoch 7:  96%|█████████▋| 5752/5971 [51:34<01:57,  1.86it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  14%|█▍        | 24/167 [00:01<00:05, 24.05it/s]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 24.01it/s]
Epoch 7:  96%|█████████▋| 5756/5971 [51:34<01:55,  1.86it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 24.37it/s]
Epoch 7:  96%|█████████▋| 5760/5971 [51:34<01:53,  1.86it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 24.98it/s]
Epoch 7:  97%|█████████▋| 5764/5971 [51:34<01:51,  1.86it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  22%|██▏       | 36/167 [00:01<00:05, 24.81it/s]

Validating:  23%|██▎       | 39/167 [00:02<00:05, 24.16it/s]
Epoch 7:  97%|█████████▋| 5768/5971 [51:35<01:48,  1.86it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 25.37it/s]
Epoch 7:  97%|█████████▋| 5772/5971 [51:35<01:46,  1.87it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 25.57it/s]
Epoch 7:  97%|█████████▋| 5776/5971 [51:35<01:44,  1.87it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 25.31it/s]

Validating:  31%|███       | 51/167 [00:02<00:04, 25.07it/s]
Epoch 7:  97%|█████████▋| 5780/5971 [51:35<01:42,  1.87it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 26.05it/s]
Epoch 7:  97%|█████████▋| 5784/5971 [51:35<01:40,  1.87it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 26.91it/s]
Epoch 7:  97%|█████████▋| 5788/5971 [51:35<01:37,  1.87it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  36%|███▌      | 60/167 [00:02<00:03, 27.33it/s]
Epoch 7:  97%|█████████▋| 5792/5971 [51:36<01:35,  1.87it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  38%|███▊      | 64/167 [00:02<00:03, 28.58it/s]

Validating:  40%|████      | 67/167 [00:03<00:03, 28.91it/s]
Epoch 7:  97%|█████████▋| 5796/5971 [51:36<01:33,  1.87it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 28.25it/s]
Epoch 7:  97%|█████████▋| 5800/5971 [51:36<01:31,  1.87it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 28.29it/s]
Epoch 7:  97%|█████████▋| 5804/5971 [51:36<01:29,  1.87it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 28.23it/s]
Epoch 7:  97%|█████████▋| 5808/5971 [51:36<01:26,  1.88it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 28.64it/s]

Validating:  50%|████▉     | 83/167 [00:03<00:02, 28.69it/s]
Epoch 7:  97%|█████████▋| 5812/5971 [51:36<01:24,  1.88it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  51%|█████▏    | 86/167 [00:03<00:02, 27.26it/s]
Epoch 7:  97%|█████████▋| 5816/5971 [51:36<01:22,  1.88it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  53%|█████▎    | 89/167 [00:03<00:02, 26.26it/s]
Epoch 7:  97%|█████████▋| 5820/5971 [51:37<01:20,  1.88it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  55%|█████▌    | 92/167 [00:03<00:02, 26.51it/s]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 27.00it/s]
Epoch 7:  98%|█████████▊| 5824/5971 [51:37<01:18,  1.88it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 27.04it/s]
Epoch 7:  98%|█████████▊| 5828/5971 [51:37<01:15,  1.88it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  60%|██████    | 101/167 [00:04<00:02, 27.09it/s]
Epoch 7:  98%|█████████▊| 5832/5971 [51:37<01:13,  1.88it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 28.46it/s]
Epoch 7:  98%|█████████▊| 5836/5971 [51:37<01:11,  1.88it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 28.56it/s]
Epoch 7:  98%|█████████▊| 5840/5971 [51:37<01:09,  1.89it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  67%|██████▋   | 112/167 [00:04<00:02, 26.55it/s]

Validating:  69%|██████▉   | 115/167 [00:04<00:02, 25.01it/s]
Epoch 7:  98%|█████████▊| 5844/5971 [51:37<01:07,  1.89it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  71%|███████   | 118/167 [00:04<00:01, 26.24it/s]
Epoch 7:  98%|█████████▊| 5848/5971 [51:38<01:05,  1.89it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 27.21it/s]
Epoch 7:  98%|█████████▊| 5852/5971 [51:38<01:02,  1.89it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 28.63it/s]
Epoch 7:  98%|█████████▊| 5856/5971 [51:38<01:00,  1.89it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 29.25it/s]
Epoch 7:  98%|█████████▊| 5860/5971 [51:38<00:58,  1.89it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 28.63it/s]

Validating:  81%|████████  | 135/167 [00:05<00:01, 27.88it/s]
Epoch 7:  98%|█████████▊| 5864/5971 [51:38<00:56,  1.89it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 26.99it/s]
Epoch 7:  98%|█████████▊| 5868/5971 [51:38<00:54,  1.89it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  84%|████████▍ | 141/167 [00:05<00:00, 26.32it/s]
Epoch 7:  98%|█████████▊| 5872/5971 [51:38<00:52,  1.90it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  86%|████████▌ | 144/167 [00:05<00:00, 25.94it/s]

Validating:  88%|████████▊ | 147/167 [00:05<00:00, 25.11it/s]
Epoch 7:  98%|█████████▊| 5876/5971 [51:39<00:50,  1.90it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 24.97it/s]
Epoch 7:  98%|█████████▊| 5880/5971 [51:39<00:47,  1.90it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 24.24it/s]
Epoch 7:  99%|█████████▊| 5884/5971 [51:39<00:45,  1.90it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 24.88it/s]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 25.15it/s]
Epoch 7:  99%|█████████▊| 5888/5971 [51:39<00:43,  1.90it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 25.48it/s]
Epoch 7:  99%|█████████▊| 5892/5971 [51:39<00:41,  1.90it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Validating:  99%|█████████▉| 165/167 [00:06<00:00, 25.67it/s]
Epoch 7:  99%|█████████▊| 5896/5971 [51:39<00:39,  1.90it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▊| 5896/5971 [51:40<00:39,  1.90it/s, loss=0.121, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000139, train/loss_step=0.0386, global_step=4574.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]

Epoch 7:  99%|█████████▉| 5897/5971 [51:41<00:38,  1.90it/s, loss=0.126, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000339, train/loss_step=0.103, global_step=4575.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  99%|█████████▉| 5898/5971 [51:42<00:38,  1.90it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0501, train/loss_vlb_step=0.000176, train/loss_step=0.0501, global_step=4575.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5899/5971 [51:43<00:37,  1.90it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0438, train/loss_vlb_step=0.000152, train/loss_step=0.0438, global_step=4575.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5900/5971 [51:45<00:37,  1.90it/s, loss=0.129, v_num=0, train/loss_simple_step=0.0438, train/loss_vlb_step=0.000152, train/loss_step=0.0438, global_step=4575.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5900/5971 [51:45<00:37,  1.90it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00207, train/loss_vlb_step=1.23e-5, train/loss_step=0.00207, global_step=4575.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5901/5971 [51:46<00:36,  1.90it/s, loss=0.127, v_num=0, train/loss_simple_step=0.255, train/loss_vlb_step=0.00108, train/loss_step=0.255, global_step=4576.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  99%|█████████▉| 5902/5971 [51:46<00:36,  1.90it/s, loss=0.132, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000358, train/loss_step=0.106, global_step=4576.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5903/5971 [51:47<00:35,  1.90it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0711, train/loss_vlb_step=0.000238, train/loss_step=0.0711, global_step=4576.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5904/5971 [51:50<00:35,  1.90it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0711, train/loss_vlb_step=0.000238, train/loss_step=0.0711, global_step=4576.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5904/5971 [51:50<00:35,  1.90it/s, loss=0.153, v_num=0, train/loss_simple_step=0.365, train/loss_vlb_step=0.0019, train/loss_step=0.365, global_step=4576.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  99%|█████████▉| 5905/5971 [51:50<00:34,  1.90it/s, loss=0.178, v_num=0, train/loss_simple_step=0.504, train/loss_vlb_step=0.0047, train/loss_step=0.504, global_step=4577.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5906/5971 [51:51<00:34,  1.90it/s, loss=0.196, v_num=0, train/loss_simple_step=0.404, train/loss_vlb_step=0.00279, train/loss_step=0.404, global_step=4577.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5907/5971 [51:52<00:33,  1.90it/s, loss=0.17, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000344, train/loss_step=0.105, global_step=4577.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5908/5971 [51:54<00:33,  1.90it/s, loss=0.17, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000344, train/loss_step=0.105, global_step=4577.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5908/5971 [51:54<00:33,  1.90it/s, loss=0.173, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000523, train/loss_step=0.158, global_step=4577.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5909/5971 [51:55<00:32,  1.90it/s, loss=0.171, v_num=0, train/loss_simple_step=0.316, train/loss_vlb_step=0.00128, train/loss_step=0.316, global_step=4578.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  99%|█████████▉| 5910/5971 [51:56<00:32,  1.90it/s, loss=0.187, v_num=0, train/loss_simple_step=0.350, train/loss_vlb_step=0.00163, train/loss_step=0.350, global_step=4578.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5911/5971 [51:57<00:31,  1.90it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0698, train/loss_vlb_step=0.000241, train/loss_step=0.0698, global_step=4578.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5912/5971 [51:59<00:31,  1.90it/s, loss=0.19, v_num=0, train/loss_simple_step=0.0698, train/loss_vlb_step=0.000241, train/loss_step=0.0698, global_step=4578.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5912/5971 [51:59<00:31,  1.90it/s, loss=0.194, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000471, train/loss_step=0.138, global_step=4578.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  99%|█████████▉| 5913/5971 [52:00<00:30,  1.90it/s, loss=0.226, v_num=0, train/loss_simple_step=0.834, train/loss_vlb_step=0.0292, train/loss_step=0.834, global_step=4579.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  99%|█████████▉| 5914/5971 [52:01<00:30,  1.89it/s, loss=0.228, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.00042, train/loss_step=0.128, global_step=4579.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5915/5971 [52:02<00:29,  1.89it/s, loss=0.202, v_num=0, train/loss_simple_step=0.00545, train/loss_vlb_step=2.74e-5, train/loss_step=0.00545, global_step=4579.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5916/5971 [52:04<00:29,  1.89it/s, loss=0.202, v_num=0, train/loss_simple_step=0.00545, train/loss_vlb_step=2.74e-5, train/loss_step=0.00545, global_step=4579.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5916/5971 [52:04<00:29,  1.89it/s, loss=0.217, v_num=0, train/loss_simple_step=0.328, train/loss_vlb_step=0.00139, train/loss_step=0.328, global_step=4579.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7:  99%|█████████▉| 5917/5971 [52:05<00:28,  1.89it/s, loss=0.212, v_num=0, train/loss_simple_step=0.00759, train/loss_vlb_step=3.66e-5, train/loss_step=0.00759, global_step=4580.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5918/5971 [52:06<00:27,  1.89it/s, loss=0.212, v_num=0, train/loss_simple_step=0.0601, train/loss_vlb_step=0.000204, train/loss_step=0.0601, global_step=4580.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  99%|█████████▉| 5919/5971 [52:07<00:27,  1.89it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0602, train/loss_vlb_step=0.000198, train/loss_step=0.0602, global_step=4580.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5920/5971 [52:09<00:26,  1.89it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0602, train/loss_vlb_step=0.000198, train/loss_step=0.0602, global_step=4580.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5920/5971 [52:09<00:26,  1.89it/s, loss=0.215, v_num=0, train/loss_simple_step=0.0296, train/loss_vlb_step=0.000109, train/loss_step=0.0296, global_step=4580.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5921/5971 [52:10<00:26,  1.89it/s, loss=0.221, v_num=0, train/loss_simple_step=0.385, train/loss_vlb_step=0.00169, train/loss_step=0.385, global_step=4581.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  99%|█████████▉| 5922/5971 [52:11<00:25,  1.89it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0236, train/loss_vlb_step=9.07e-5, train/loss_step=0.0236, global_step=4581.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5923/5971 [52:12<00:25,  1.89it/s, loss=0.225, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000889, train/loss_step=0.228, global_step=4581.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  99%|█████████▉| 5924/5971 [52:14<00:24,  1.89it/s, loss=0.225, v_num=0, train/loss_simple_step=0.228, train/loss_vlb_step=0.000889, train/loss_step=0.228, global_step=4581.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5924/5971 [52:14<00:24,  1.89it/s, loss=0.221, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00113, train/loss_step=0.282, global_step=4581.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  99%|█████████▉| 5925/5971 [52:15<00:24,  1.89it/s, loss=0.203, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000506, train/loss_step=0.150, global_step=4582.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5926/5971 [52:16<00:23,  1.89it/s, loss=0.203, v_num=0, train/loss_simple_step=0.404, train/loss_vlb_step=0.0025, train/loss_step=0.404, global_step=4582.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7:  99%|█████████▉| 5927/5971 [52:17<00:23,  1.89it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0572, train/loss_vlb_step=0.000211, train/loss_step=0.0572, global_step=4582.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5928/5971 [52:19<00:22,  1.89it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0572, train/loss_vlb_step=0.000211, train/loss_step=0.0572, global_step=4582.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5928/5971 [52:19<00:22,  1.89it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.000146, train/loss_step=0.0406, global_step=4582.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5929/5971 [52:20<00:22,  1.89it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0839, train/loss_vlb_step=0.00028, train/loss_step=0.0839, global_step=4583.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  99%|█████████▉| 5930/5971 [52:21<00:21,  1.89it/s, loss=0.206, v_num=0, train/loss_simple_step=0.801, train/loss_vlb_step=0.0515, train/loss_step=0.801, global_step=4583.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  99%|█████████▉| 5931/5971 [52:21<00:21,  1.89it/s, loss=0.203, v_num=0, train/loss_simple_step=0.00789, train/loss_vlb_step=3.91e-5, train/loss_step=0.00789, global_step=4583.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5932/5971 [52:24<00:20,  1.89it/s, loss=0.203, v_num=0, train/loss_simple_step=0.00789, train/loss_vlb_step=3.91e-5, train/loss_step=0.00789, global_step=4583.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5932/5971 [52:24<00:20,  1.89it/s, loss=0.196, v_num=0, train/loss_simple_step=0.00671, train/loss_vlb_step=3.32e-5, train/loss_step=0.00671, global_step=4583.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5933/5971 [52:25<00:20,  1.89it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0336, train/loss_vlb_step=0.000122, train/loss_step=0.0336, global_step=4584.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7:  99%|█████████▉| 5934/5971 [52:25<00:19,  1.89it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00129, train/loss_vlb_step=7.85e-6, train/loss_step=0.00129, global_step=4584.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5935/5971 [52:26<00:19,  1.89it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00521, train/loss_vlb_step=2.68e-5, train/loss_step=0.00521, global_step=4584.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5936/5971 [52:28<00:18,  1.89it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00521, train/loss_vlb_step=2.68e-5, train/loss_step=0.00521, global_step=4584.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5936/5971 [52:28<00:18,  1.89it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0344, train/loss_vlb_step=0.000122, train/loss_step=0.0344, global_step=4584.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5937/5971 [52:29<00:18,  1.89it/s, loss=0.151, v_num=0, train/loss_simple_step=0.334, train/loss_vlb_step=0.00133, train/loss_step=0.334, global_step=4585.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7:  99%|█████████▉| 5938/5971 [52:30<00:17,  1.89it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00588, train/loss_vlb_step=3.01e-5, train/loss_step=0.00588, global_step=4585.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5939/5971 [52:31<00:16,  1.88it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00779, train/loss_vlb_step=3.61e-5, train/loss_step=0.00779, global_step=4585.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5940/5971 [52:33<00:16,  1.88it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00779, train/loss_vlb_step=3.61e-5, train/loss_step=0.00779, global_step=4585.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7:  99%|█████████▉| 5940/5971 [52:33<00:16,  1.88it/s, loss=0.189, v_num=0, train/loss_simple_step=0.892, train/loss_vlb_step=0.151, train/loss_step=0.892, global_step=4585.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]      
Epoch 7:  99%|█████████▉| 5941/5971 [52:34<00:15,  1.88it/s, loss=0.188, v_num=0, train/loss_simple_step=0.364, train/loss_vlb_step=0.00158, train/loss_step=0.364, global_step=4586.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|█████████▉| 5942/5971 [52:35<00:15,  1.88it/s, loss=0.192, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.000332, train/loss_step=0.100, global_step=4586.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|█████████▉| 5943/5971 [52:36<00:14,  1.88it/s, loss=0.215, v_num=0, train/loss_simple_step=0.692, train/loss_vlb_step=0.0127, train/loss_step=0.692, global_step=4586.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7: 100%|█████████▉| 5944/5971 [52:38<00:14,  1.88it/s, loss=0.215, v_num=0, train/loss_simple_step=0.692, train/loss_vlb_step=0.0127, train/loss_step=0.692, global_step=4586.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|█████████▉| 5944/5971 [52:38<00:14,  1.88it/s, loss=0.208, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000443, train/loss_step=0.133, global_step=4586.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|█████████▉| 5945/5971 [52:39<00:13,  1.88it/s, loss=0.214, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.00106, train/loss_step=0.278, global_step=4587.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7: 100%|█████████▉| 5946/5971 [52:40<00:13,  1.88it/s, loss=0.194, v_num=0, train/loss_simple_step=0.00224, train/loss_vlb_step=1.25e-5, train/loss_step=0.00224, global_step=4587.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|█████████▉| 5947/5971 [52:41<00:12,  1.88it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00105, train/loss_vlb_step=6.34e-6, train/loss_step=0.00105, global_step=4587.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|█████████▉| 5948/5971 [52:43<00:12,  1.88it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00105, train/loss_vlb_step=6.34e-6, train/loss_step=0.00105, global_step=4587.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|█████████▉| 5948/5971 [52:43<00:12,  1.88it/s, loss=0.219, v_num=0, train/loss_simple_step=0.590, train/loss_vlb_step=0.00873, train/loss_step=0.590, global_step=4587.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7: 100%|█████████▉| 5949/5971 [52:44<00:11,  1.88it/s, loss=0.227, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.000895, train/loss_step=0.253, global_step=4588.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|█████████▉| 5950/5971 [52:45<00:11,  1.88it/s, loss=0.196, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000598, train/loss_step=0.173, global_step=4588.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|█████████▉| 5951/5971 [52:46<00:10,  1.88it/s, loss=0.205, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000706, train/loss_step=0.201, global_step=4588.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|█████████▉| 5952/5971 [52:48<00:10,  1.88it/s, loss=0.205, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000706, train/loss_step=0.201, global_step=4588.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|█████████▉| 5952/5971 [52:48<00:10,  1.88it/s, loss=0.214, v_num=0, train/loss_simple_step=0.172, train/loss_vlb_step=0.000573, train/loss_step=0.172, global_step=4588.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|█████████▉| 5953/5971 [52:49<00:09,  1.88it/s, loss=0.232, v_num=0, train/loss_simple_step=0.399, train/loss_vlb_step=0.002, train/loss_step=0.399, global_step=4589.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7: 100%|█████████▉| 5954/5971 [52:50<00:09,  1.88it/s, loss=0.238, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.00042, train/loss_step=0.119, global_step=4589.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|█████████▉| 5955/5971 [52:51<00:08,  1.88it/s, loss=0.242, v_num=0, train/loss_simple_step=0.0928, train/loss_vlb_step=0.000308, train/loss_step=0.0928, global_step=4589.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|█████████▉| 5956/5971 [52:53<00:07,  1.88it/s, loss=0.242, v_num=0, train/loss_simple_step=0.0928, train/loss_vlb_step=0.000308, train/loss_step=0.0928, global_step=4589.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|█████████▉| 5956/5971 [52:53<00:07,  1.88it/s, loss=0.241, v_num=0, train/loss_simple_step=0.019, train/loss_vlb_step=7.41e-5, train/loss_step=0.019, global_step=4589.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7: 100%|█████████▉| 5957/5971 [52:54<00:07,  1.88it/s, loss=0.225, v_num=0, train/loss_simple_step=0.00239, train/loss_vlb_step=1.36e-5, train/loss_step=0.00239, global_step=4590.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|█████████▉| 5958/5971 [52:55<00:06,  1.88it/s, loss=0.225, v_num=0, train/loss_simple_step=0.0016, train/loss_vlb_step=9.33e-6, train/loss_step=0.0016, global_step=4590.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7: 100%|█████████▉| 5959/5971 [52:55<00:06,  1.88it/s, loss=0.225, v_num=0, train/loss_simple_step=0.00584, train/loss_vlb_step=2.88e-5, train/loss_step=0.00584, global_step=4590.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|█████████▉| 5960/5971 [52:58<00:05,  1.88it/s, loss=0.225, v_num=0, train/loss_simple_step=0.00584, train/loss_vlb_step=2.88e-5, train/loss_step=0.00584, global_step=4590.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|█████████▉| 5960/5971 [52:58<00:05,  1.88it/s, loss=0.206, v_num=0, train/loss_simple_step=0.525, train/loss_vlb_step=0.00435, train/loss_step=0.525, global_step=4590.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7: 100%|█████████▉| 5961/5971 [52:58<00:05,  1.88it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00795, train/loss_vlb_step=3.73e-5, train/loss_step=0.00795, global_step=4591.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|█████████▉| 5962/5971 [52:59<00:04,  1.88it/s, loss=0.184, v_num=0, train/loss_simple_step=0.00577, train/loss_vlb_step=2.83e-5, train/loss_step=0.00577, global_step=4591.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|█████████▉| 5963/5971 [53:00<00:04,  1.88it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0675, train/loss_vlb_step=0.000227, train/loss_step=0.0675, global_step=4591.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7: 100%|█████████▉| 5964/5971 [53:03<00:03,  1.87it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0675, train/loss_vlb_step=0.000227, train/loss_step=0.0675, global_step=4591.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|█████████▉| 5964/5971 [53:03<00:03,  1.87it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0254, train/loss_vlb_step=0.000102, train/loss_step=0.0254, global_step=4591.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|█████████▉| 5965/5971 [53:04<00:03,  1.87it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0374, train/loss_vlb_step=0.000135, train/loss_step=0.0374, global_step=4592.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|█████████▉| 5966/5971 [53:04<00:02,  1.87it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=5.91e-5, train/loss_step=0.0136, global_step=4592.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7: 100%|█████████▉| 5967/5971 [53:05<00:02,  1.87it/s, loss=0.162, v_num=0, train/loss_simple_step=0.533, train/loss_vlb_step=0.00599, train/loss_step=0.533, global_step=4592.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7: 100%|█████████▉| 5968/5971 [53:07<00:01,  1.87it/s, loss=0.162, v_num=0, train/loss_simple_step=0.533, train/loss_vlb_step=0.00599, train/loss_step=0.533, global_step=4592.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|█████████▉| 5968/5971 [53:07<00:01,  1.87it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0861, train/loss_vlb_step=0.000284, train/loss_step=0.0861, global_step=4592.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|█████████▉| 5969/5971 [53:08<00:01,  1.87it/s, loss=0.127, v_num=0, train/loss_simple_step=0.043, train/loss_vlb_step=0.000159, train/loss_step=0.043, global_step=4593.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7: 100%|█████████▉| 5970/5971 [53:09<00:00,  1.87it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0129, train/loss_vlb_step=5.14e-5, train/loss_step=0.0129, global_step=4593.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|██████████| 5971/5971 [53:10<00:00,  1.87it/s, loss=0.118, v_num=0, train/loss_simple_step=0.193, train/loss_vlb_step=0.000695, train/loss_step=0.193, global_step=4593.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7: 100%|██████████| 5971/5971 [53:12<00:00,  1.87it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0626, train/loss_vlb_step=0.000218, train/loss_step=0.0626, global_step=4593.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|██████████| 5971/5971 [53:13<00:00,  1.87it/s, loss=0.0939, v_num=0, train/loss_simple_step=0.0241, train/loss_vlb_step=9.26e-5, train/loss_step=0.0241, global_step=4594.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|██████████| 5971/5971 [53:14<00:00,  1.87it/s, loss=0.0899, v_num=0, train/loss_simple_step=0.0403, train/loss_vlb_step=0.000145, train/loss_step=0.0403, global_step=4594.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|██████████| 5971/5971 [53:15<00:00,  1.87it/s, loss=0.0892, v_num=0, train/loss_simple_step=0.078, train/loss_vlb_step=0.000264, train/loss_step=0.078, global_step=4594.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7: 100%|██████████| 5971/5971 [53:17<00:00,  1.87it/s, loss=0.0953, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000471, train/loss_step=0.142, global_step=4594.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|██████████| 5971/5971 [53:18<00:00,  1.87it/s, loss=0.0953, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000471, train/loss_step=0.142, global_step=4594.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|██████████| 5971/5971 [53:18<00:00,  1.87it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0208, train/loss_vlb_step=8.27e-5, train/loss_step=0.0208, global_step=4595.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|██████████| 5971/5971 [53:19<00:00,  1.87it/s, loss=0.115, v_num=0, train/loss_simple_step=0.378, train/loss_vlb_step=0.00186, train/loss_step=0.378, global_step=4595.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7: 100%|██████████| 5971/5971 [53:20<00:00,  1.87it/s, loss=0.129, v_num=0, train/loss_simple_step=0.293, train/loss_vlb_step=0.00127, train/loss_step=0.293, global_step=4595.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|██████████| 5971/5971 [53:22<00:00,  1.86it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0599, train/loss_vlb_step=0.000217, train/loss_step=0.0599, global_step=4595.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|██████████| 5971/5971 [53:23<00:00,  1.86it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.15e-5, train/loss_step=0.0115, global_step=4596.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143] 
Epoch 7: 100%|██████████| 5971/5971 [53:24<00:00,  1.86it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00341, train/loss_vlb_step=1.87e-5, train/loss_step=0.00341, global_step=4596.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|██████████| 5971/5971 [53:25<00:00,  1.86it/s, loss=0.113, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000855, train/loss_step=0.208, global_step=4596.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]   
Epoch 7: 100%|██████████| 5971/5971 [53:27<00:00,  1.86it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0061, train/loss_vlb_step=2.98e-5, train/loss_step=0.0061, global_step=4596.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|██████████| 5971/5971 [53:28<00:00,  1.86it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0206, train/loss_vlb_step=8.21e-5, train/loss_step=0.0206, global_step=4597.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|██████████| 5971/5971 [53:29<00:00,  1.86it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00213, train/loss_vlb_step=1.22e-5, train/loss_step=0.00213, global_step=4597.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|██████████| 5971/5971 [53:29<00:00,  1.86it/s, loss=0.0893, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000336, train/loss_step=0.101, global_step=4597.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]  
Epoch 7: 100%|██████████| 5971/5971 [53:32<00:00,  1.86it/s, loss=0.131, v_num=0, train/loss_simple_step=0.916, train/loss_vlb_step=0.231, train/loss_step=0.916, global_step=4597.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]    
Epoch 7: 100%|██████████| 5971/5971 [53:33<00:00,  1.86it/s, loss=0.134, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000339, train/loss_step=0.102, global_step=4598.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|██████████| 5971/5971 [53:33<00:00,  1.86it/s, loss=0.142, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000633, train/loss_step=0.174, global_step=4598.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|██████████| 5971/5971 [53:34<00:00,  1.86it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0236, train/loss_vlb_step=8.67e-5, train/loss_step=0.0236, global_step=4598.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|██████████| 5971/5971 [53:37<00:00,  1.86it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0675, train/loss_vlb_step=0.000222, train/loss_step=0.0675, global_step=4598.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 7: 100%|██████████| 5971/5971 [53:39<00:00,  1.85it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0695, train/loss_vlb_step=0.000237, train/loss_step=0.0695, global_step=4599.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 8:   0%|          | 0/5971 [00:00<00:02, 2678.36it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0695, train/loss_vlb_step=0.000237, train/loss_step=0.0695, global_step=4599.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s]
Spaced Sampler:   2%|▏         | 1/50 [00:00<00:35,  1.36it/s]
Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.46it/s]
Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.31it/s]
Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.95it/s]
Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.38it/s]
Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.65it/s]
Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.88it/s]
Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.07it/s]
Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.23it/s]
Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.32it/s]
Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.40it/s]
Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.45it/s]
Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.50it/s]
Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.52it/s]
Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.46it/s]
Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.48it/s]
Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.52it/s]
Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.49it/s]
Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.44it/s]
Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.41it/s]
Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.42it/s]
Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.46it/s]
Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.49it/s]
Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.52it/s]
Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.55it/s]
Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.56it/s]
Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.57it/s]
Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.59it/s]
Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.59it/s]
Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.60it/s]
Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.61it/s]
Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.62it/s]
Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.63it/s]
Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.62it/s]
Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.54it/s]
Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.41it/s]
Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.37it/s]
Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.32it/s]
Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.32it/s]
Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.35it/s]
Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.39it/s]
Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.43it/s]
Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.48it/s]
Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.47it/s]
Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.50it/s]
Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.53it/s]
Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.55it/s]
Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.55it/s]
Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.55it/s]
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.55it/s]
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.19it/s]

Epoch 8:   0%|          | 1/5971 [00:13<11:06:12,  6.70s/it, loss=0.136, v_num=0, train/loss_simple_step=0.0695, train/loss_vlb_step=0.000237, train/loss_step=0.0695, global_step=4599.0, train/loss_simple_epoch=0.143, train/loss_vlb_epoch=0.00288, train/loss_epoch=0.143]
Epoch 8:   0%|          | 1/5971 [00:13<11:06:17,  6.70s/it, loss=0.134, v_num=0, train/loss_simple_step=0.00793, train/loss_vlb_step=3.91e-5, train/loss_step=0.00793, global_step=4600.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s]
Spaced Sampler:   2%|▏         | 1/50 [00:00<00:26,  1.86it/s]
Spaced Sampler:   4%|▍         | 2/50 [00:00<00:16,  2.97it/s]
Spaced Sampler:   6%|▌         | 3/50 [00:00<00:12,  3.67it/s]
Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  4.12it/s]
Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.43it/s]
Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.64it/s]
Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.78it/s]
Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.01it/s]
Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.15it/s]
Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.27it/s]
Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.34it/s]
Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.39it/s]
Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.44it/s]
Spaced Sampler:  28%|██▊       | 14/50 [00:02<00:06,  5.45it/s]
Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.48it/s]
Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.51it/s]
Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.53it/s]
Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.54it/s]
Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.55it/s]
Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.54it/s]
Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.54it/s]
Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.55it/s]
Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.56it/s]
Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.58it/s]
Spaced Sampler:  50%|█████     | 25/50 [00:04<00:04,  5.58it/s]
Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.58it/s]
Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.58it/s]
Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.59it/s]
Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.59it/s]
Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.59it/s]
Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.58it/s]
Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.58it/s]
Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.59it/s]
Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.59it/s]
Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.60it/s]
Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.60it/s]
Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.58it/s]
Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.58it/s]
Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.58it/s]
Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.58it/s]
Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.58it/s]
Spaced Sampler:  84%|████████▍ | 42/50 [00:07<00:01,  5.59it/s]
Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.60it/s]
Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.60it/s]
Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.58it/s]
Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.56it/s]
Spaced Sampler:  94%|█████████▍| 47/50 [00:08<00:00,  5.46it/s]
Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.38it/s]
Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.31it/s]
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.25it/s]
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.27it/s]

Epoch 8:   0%|          | 2/5971 [00:25<14:05:11,  8.50s/it, loss=0.134, v_num=0, train/loss_simple_step=0.00793, train/loss_vlb_step=3.91e-5, train/loss_step=0.00793, global_step=4600.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 2/5971 [00:25<14:05:13,  8.50s/it, loss=0.132, v_num=0, train/loss_simple_step=0.0417, train/loss_vlb_step=0.000152, train/loss_step=0.0417, global_step=4600.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s]
Spaced Sampler:   2%|▏         | 1/50 [00:00<00:37,  1.32it/s]
Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.39it/s]
Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.22it/s]
Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.76it/s]
Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.17it/s]
Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.45it/s]
Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:09,  4.67it/s]
Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  4.83it/s]
Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:08,  4.92it/s]
Spaced Sampler:  20%|██        | 10/50 [00:02<00:08,  4.98it/s]
Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.07it/s]
Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.22it/s]
Spaced Sampler:  26%|██▌       | 13/50 [00:03<00:06,  5.33it/s]
Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.42it/s]
Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.48it/s]
Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.51it/s]
Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.42it/s]
Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.37it/s]
Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.41it/s]
Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.46it/s]
Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.45it/s]
Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.48it/s]
Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  5.39it/s]
Spaced Sampler:  48%|████▊     | 24/50 [00:05<00:04,  5.33it/s]
Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.27it/s]
Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.26it/s]
Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.25it/s]
Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.30it/s]
Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.40it/s]
Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.47it/s]
Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.52it/s]
Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.55it/s]
Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.56it/s]
Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.47it/s]
Spaced Sampler:  70%|███████   | 35/50 [00:07<00:02,  5.39it/s]
Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.35it/s]
Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.31it/s]
Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.27it/s]
Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:02,  5.25it/s]
Spaced Sampler:  80%|████████  | 40/50 [00:08<00:01,  5.24it/s]
Spaced Sampler:  82%|████████▏ | 41/50 [00:08<00:01,  5.12it/s]
Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.23it/s]
Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.35it/s]
Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.41it/s]
Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.47it/s]
Spaced Sampler:  92%|█████████▏| 46/50 [00:09<00:00,  5.52it/s]
Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.55it/s]
Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.58it/s]
Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.61it/s]
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.62it/s]
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.07it/s]

Epoch 8:   0%|          | 3/5971 [00:37<15:37:12,  9.42s/it, loss=0.132, v_num=0, train/loss_simple_step=0.0417, train/loss_vlb_step=0.000152, train/loss_step=0.0417, global_step=4600.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 3/5971 [00:37<15:37:14,  9.42s/it, loss=0.127, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000111, train/loss_step=0.0285, global_step=4600.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.22it/s]

Epoch 8:   0%|          | 4/5971 [00:51<16:58:41, 10.24s/it, loss=0.127, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000111, train/loss_step=0.0285, global_step=4600.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 4/5971 [00:51<16:58:43, 10.24s/it, loss=0.142, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.0013, train/loss_step=0.322, global_step=4600.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:   0%|          | 5/5971 [00:52<14:23:53,  8.69s/it, loss=0.142, v_num=0, train/loss_simple_step=0.322, train/loss_vlb_step=0.0013, train/loss_step=0.322, global_step=4600.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 5/5971 [00:52<14:23:54,  8.69s/it, loss=0.124, v_num=0, train/loss_simple_step=0.0159, train/loss_vlb_step=6.96e-5, train/loss_step=0.0159, global_step=4601.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 6/5971 [00:53<12:33:17,  7.58s/it, loss=0.124, v_num=0, train/loss_simple_step=0.0159, train/loss_vlb_step=6.96e-5, train/loss_step=0.0159, global_step=4601.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 6/5971 [00:53<12:33:18,  7.58s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0421, train/loss_vlb_step=0.000151, train/loss_step=0.0421, global_step=4601.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 7/5971 [00:53<11:10:08,  6.74s/it, loss=0.111, v_num=0, train/loss_simple_step=0.0421, train/loss_vlb_step=0.000151, train/loss_step=0.0421, global_step=4601.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 7/5971 [00:53<11:10:09,  6.74s/it, loss=0.123, v_num=0, train/loss_simple_step=0.298, train/loss_vlb_step=0.00138, train/loss_step=0.298, global_step=4601.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:   0%|          | 8/5971 [00:56<10:21:35,  6.25s/it, loss=0.123, v_num=0, train/loss_simple_step=0.298, train/loss_vlb_step=0.00138, train/loss_step=0.298, global_step=4601.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 8/5971 [00:56<10:21:35,  6.25s/it, loss=0.127, v_num=0, train/loss_simple_step=0.0908, train/loss_vlb_step=0.000305, train/loss_step=0.0908, global_step=4601.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 9/5971 [00:57<9:28:05,  5.72s/it, loss=0.127, v_num=0, train/loss_simple_step=0.0908, train/loss_vlb_step=0.000305, train/loss_step=0.0908, global_step=4601.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   0%|          | 9/5971 [00:57<9:28:06,  5.72s/it, loss=0.128, v_num=0, train/loss_simple_step=0.0194, train/loss_vlb_step=7.96e-5, train/loss_step=0.0194, global_step=4602.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   0%|          | 10/5971 [00:58<8:44:15,  5.28s/it, loss=0.128, v_num=0, train/loss_simple_step=0.0194, train/loss_vlb_step=7.96e-5, train/loss_step=0.0194, global_step=4602.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 10/5971 [00:58<8:44:16,  5.28s/it, loss=0.137, v_num=0, train/loss_simple_step=0.393, train/loss_vlb_step=0.00176, train/loss_step=0.393, global_step=4602.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:   0%|          | 11/5971 [00:58<8:07:35,  4.91s/it, loss=0.137, v_num=0, train/loss_simple_step=0.393, train/loss_vlb_step=0.00176, train/loss_step=0.393, global_step=4602.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 11/5971 [00:58<8:07:35,  4.91s/it, loss=0.137, v_num=0, train/loss_simple_step=0.00801, train/loss_vlb_step=3.56e-5, train/loss_step=0.00801, global_step=4602.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 12/5971 [01:01<7:47:59,  4.71s/it, loss=0.137, v_num=0, train/loss_simple_step=0.00801, train/loss_vlb_step=3.56e-5, train/loss_step=0.00801, global_step=4602.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 12/5971 [01:01<7:47:59,  4.71s/it, loss=0.136, v_num=0, train/loss_simple_step=0.00248, train/loss_vlb_step=1.38e-5, train/loss_step=0.00248, global_step=4602.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 13/5971 [01:02<7:20:49,  4.44s/it, loss=0.136, v_num=0, train/loss_simple_step=0.00248, train/loss_vlb_step=1.38e-5, train/loss_step=0.00248, global_step=4602.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 13/5971 [01:02<7:20:49,  4.44s/it, loss=0.136, v_num=0, train/loss_simple_step=0.00623, train/loss_vlb_step=2.99e-5, train/loss_step=0.00623, global_step=4603.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 14/5971 [01:03<6:57:05,  4.20s/it, loss=0.136, v_num=0, train/loss_simple_step=0.00623, train/loss_vlb_step=2.99e-5, train/loss_step=0.00623, global_step=4603.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 14/5971 [01:03<6:57:05,  4.20s/it, loss=0.137, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000336, train/loss_step=0.102, global_step=4603.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:   0%|          | 15/5971 [01:03<6:36:21,  3.99s/it, loss=0.137, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.000336, train/loss_step=0.102, global_step=4603.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 15/5971 [01:03<6:36:21,  3.99s/it, loss=0.104, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00123, train/loss_step=0.269, global_step=4603.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   0%|          | 16/5971 [01:05<6:25:17,  3.88s/it, loss=0.104, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00123, train/loss_step=0.269, global_step=4603.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 16/5971 [01:05<6:25:18,  3.88s/it, loss=0.127, v_num=0, train/loss_simple_step=0.550, train/loss_vlb_step=0.00581, train/loss_step=0.550, global_step=4603.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 17/5971 [01:06<6:08:46,  3.72s/it, loss=0.127, v_num=0, train/loss_simple_step=0.550, train/loss_vlb_step=0.00581, train/loss_step=0.550, global_step=4603.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 17/5971 [01:06<6:08:47,  3.72s/it, loss=0.118, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.39e-5, train/loss_step=0.0104, global_step=4604.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 18/5971 [01:07<5:53:48,  3.57s/it, loss=0.118, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.39e-5, train/loss_step=0.0104, global_step=4604.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 18/5971 [01:07<5:53:49,  3.57s/it, loss=0.118, v_num=0, train/loss_simple_step=0.0248, train/loss_vlb_step=9.58e-5, train/loss_step=0.0248, global_step=4604.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 19/5971 [01:08<5:40:21,  3.43s/it, loss=0.118, v_num=0, train/loss_simple_step=0.0248, train/loss_vlb_step=9.58e-5, train/loss_step=0.0248, global_step=4604.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 19/5971 [01:08<5:40:21,  3.43s/it, loss=0.118, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=4604.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 20/5971 [01:10<5:34:11,  3.37s/it, loss=0.118, v_num=0, train/loss_simple_step=0.0552, train/loss_vlb_step=0.000191, train/loss_step=0.0552, global_step=4604.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 20/5971 [01:10<5:34:11,  3.37s/it, loss=0.117, v_num=0, train/loss_simple_step=0.054, train/loss_vlb_step=0.00018, train/loss_step=0.054, global_step=4604.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:   0%|          | 21/5971 [01:11<5:23:02,  3.26s/it, loss=0.117, v_num=0, train/loss_simple_step=0.054, train/loss_vlb_step=0.00018, train/loss_step=0.054, global_step=4604.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 21/5971 [01:11<5:23:03,  3.26s/it, loss=0.117, v_num=0, train/loss_simple_step=0.0018, train/loss_vlb_step=1.04e-5, train/loss_step=0.0018, global_step=4605.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 22/5971 [01:12<5:12:43,  3.15s/it, loss=0.117, v_num=0, train/loss_simple_step=0.0018, train/loss_vlb_step=1.04e-5, train/loss_step=0.0018, global_step=4605.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 22/5971 [01:12<5:12:43,  3.15s/it, loss=0.119, v_num=0, train/loss_simple_step=0.0833, train/loss_vlb_step=0.000274, train/loss_step=0.0833, global_step=4605.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 23/5971 [01:13<5:03:11,  3.06s/it, loss=0.119, v_num=0, train/loss_simple_step=0.0833, train/loss_vlb_step=0.000274, train/loss_step=0.0833, global_step=4605.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 23/5971 [01:13<5:03:11,  3.06s/it, loss=0.118, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=6.15e-5, train/loss_step=0.0136, global_step=4605.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   0%|          | 24/5971 [01:15<4:59:32,  3.02s/it, loss=0.118, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=6.15e-5, train/loss_step=0.0136, global_step=4605.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 24/5971 [01:15<4:59:32,  3.02s/it, loss=0.107, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000349, train/loss_step=0.106, global_step=4605.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   0%|          | 25/5971 [01:16<4:51:29,  2.94s/it, loss=0.107, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000349, train/loss_step=0.106, global_step=4605.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 25/5971 [01:16<4:51:29,  2.94s/it, loss=0.109, v_num=0, train/loss_simple_step=0.0564, train/loss_vlb_step=0.000194, train/loss_step=0.0564, global_step=4606.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 26/5971 [01:17<4:43:51,  2.86s/it, loss=0.109, v_num=0, train/loss_simple_step=0.0564, train/loss_vlb_step=0.000194, train/loss_step=0.0564, global_step=4606.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 26/5971 [01:17<4:43:51,  2.86s/it, loss=0.113, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.00037, train/loss_step=0.112, global_step=4606.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:   0%|          | 27/5971 [01:18<4:36:46,  2.79s/it, loss=0.113, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.00037, train/loss_step=0.112, global_step=4606.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 27/5971 [01:18<4:36:46,  2.79s/it, loss=0.117, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00259, train/loss_step=0.375, global_step=4606.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 28/5971 [01:20<4:35:40,  2.78s/it, loss=0.117, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00259, train/loss_step=0.375, global_step=4606.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 28/5971 [01:20<4:35:40,  2.78s/it, loss=0.126, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00108, train/loss_step=0.282, global_step=4606.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 29/5971 [01:21<4:29:23,  2.72s/it, loss=0.126, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00108, train/loss_step=0.282, global_step=4606.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   0%|          | 29/5971 [01:21<4:29:23,  2.72s/it, loss=0.13, v_num=0, train/loss_simple_step=0.0926, train/loss_vlb_step=0.000305, train/loss_step=0.0926, global_step=4607.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 30/5971 [01:22<4:23:26,  2.66s/it, loss=0.13, v_num=0, train/loss_simple_step=0.0926, train/loss_vlb_step=0.000305, train/loss_step=0.0926, global_step=4607.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 30/5971 [01:22<4:23:26,  2.66s/it, loss=0.13, v_num=0, train/loss_simple_step=0.387, train/loss_vlb_step=0.00195, train/loss_step=0.387, global_step=4607.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:   1%|          | 31/5971 [01:23<4:17:53,  2.60s/it, loss=0.13, v_num=0, train/loss_simple_step=0.387, train/loss_vlb_step=0.00195, train/loss_step=0.387, global_step=4607.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 31/5971 [01:23<4:17:53,  2.60s/it, loss=0.133, v_num=0, train/loss_simple_step=0.0745, train/loss_vlb_step=0.000249, train/loss_step=0.0745, global_step=4607.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 32/5971 [01:25<4:16:20,  2.59s/it, loss=0.133, v_num=0, train/loss_simple_step=0.0745, train/loss_vlb_step=0.000249, train/loss_step=0.0745, global_step=4607.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 32/5971 [01:25<4:16:20,  2.59s/it, loss=0.133, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.53e-5, train/loss_step=0.0104, global_step=4607.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   1%|          | 33/5971 [01:26<4:11:26,  2.54s/it, loss=0.133, v_num=0, train/loss_simple_step=0.0104, train/loss_vlb_step=4.53e-5, train/loss_step=0.0104, global_step=4607.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 33/5971 [01:26<4:11:26,  2.54s/it, loss=0.134, v_num=0, train/loss_simple_step=0.019, train/loss_vlb_step=7.7e-5, train/loss_step=0.019, global_step=4608.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:   1%|          | 34/5971 [01:27<4:06:45,  2.49s/it, loss=0.134, v_num=0, train/loss_simple_step=0.019, train/loss_vlb_step=7.7e-5, train/loss_step=0.019, global_step=4608.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 34/5971 [01:27<4:06:45,  2.49s/it, loss=0.168, v_num=0, train/loss_simple_step=0.785, train/loss_vlb_step=0.0275, train/loss_step=0.785, global_step=4608.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 35/5971 [01:28<4:02:16,  2.45s/it, loss=0.168, v_num=0, train/loss_simple_step=0.785, train/loss_vlb_step=0.0275, train/loss_step=0.785, global_step=4608.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 35/5971 [01:28<4:02:16,  2.45s/it, loss=0.162, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.000506, train/loss_step=0.146, global_step=4608.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 36/5971 [01:30<4:01:21,  2.44s/it, loss=0.162, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.000506, train/loss_step=0.146, global_step=4608.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 36/5971 [01:30<4:01:22,  2.44s/it, loss=0.135, v_num=0, train/loss_simple_step=0.0015, train/loss_vlb_step=8.82e-6, train/loss_step=0.0015, global_step=4608.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 37/5971 [01:31<3:57:22,  2.40s/it, loss=0.135, v_num=0, train/loss_simple_step=0.0015, train/loss_vlb_step=8.82e-6, train/loss_step=0.0015, global_step=4608.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 37/5971 [01:31<3:57:23,  2.40s/it, loss=0.143, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000594, train/loss_step=0.180, global_step=4609.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   1%|          | 38/5971 [01:32<3:53:30,  2.36s/it, loss=0.143, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.000594, train/loss_step=0.180, global_step=4609.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 38/5971 [01:32<3:53:30,  2.36s/it, loss=0.142, v_num=0, train/loss_simple_step=0.0023, train/loss_vlb_step=1.34e-5, train/loss_step=0.0023, global_step=4609.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 39/5971 [01:32<3:49:49,  2.32s/it, loss=0.142, v_num=0, train/loss_simple_step=0.0023, train/loss_vlb_step=1.34e-5, train/loss_step=0.0023, global_step=4609.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 39/5971 [01:32<3:49:49,  2.32s/it, loss=0.144, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.00034, train/loss_step=0.102, global_step=4609.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:   1%|          | 40/5971 [01:35<3:49:12,  2.32s/it, loss=0.144, v_num=0, train/loss_simple_step=0.102, train/loss_vlb_step=0.00034, train/loss_step=0.102, global_step=4609.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 40/5971 [01:35<3:49:12,  2.32s/it, loss=0.156, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00147, train/loss_step=0.296, global_step=4609.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 41/5971 [01:35<3:45:53,  2.29s/it, loss=0.156, v_num=0, train/loss_simple_step=0.296, train/loss_vlb_step=0.00147, train/loss_step=0.296, global_step=4609.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 41/5971 [01:35<3:45:53,  2.29s/it, loss=0.157, v_num=0, train/loss_simple_step=0.00627, train/loss_vlb_step=3.05e-5, train/loss_step=0.00627, global_step=4610.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 42/5971 [01:36<3:42:37,  2.25s/it, loss=0.157, v_num=0, train/loss_simple_step=0.00627, train/loss_vlb_step=3.05e-5, train/loss_step=0.00627, global_step=4610.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 42/5971 [01:36<3:42:37,  2.25s/it, loss=0.153, v_num=0, train/loss_simple_step=0.00403, train/loss_vlb_step=2.01e-5, train/loss_step=0.00403, global_step=4610.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 43/5971 [01:37<3:39:32,  2.22s/it, loss=0.153, v_num=0, train/loss_simple_step=0.00403, train/loss_vlb_step=2.01e-5, train/loss_step=0.00403, global_step=4610.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 43/5971 [01:37<3:39:32,  2.22s/it, loss=0.162, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000692, train/loss_step=0.205, global_step=4610.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:   1%|          | 44/5971 [01:39<3:39:28,  2.22s/it, loss=0.162, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000692, train/loss_step=0.205, global_step=4610.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 44/5971 [01:39<3:39:29,  2.22s/it, loss=0.167, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000782, train/loss_step=0.208, global_step=4610.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 45/5971 [01:40<3:36:35,  2.19s/it, loss=0.167, v_num=0, train/loss_simple_step=0.208, train/loss_vlb_step=0.000782, train/loss_step=0.208, global_step=4610.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 45/5971 [01:40<3:36:35,  2.19s/it, loss=0.188, v_num=0, train/loss_simple_step=0.462, train/loss_vlb_step=0.00314, train/loss_step=0.462, global_step=4611.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   1%|          | 46/5971 [01:41<3:33:47,  2.16s/it, loss=0.188, v_num=0, train/loss_simple_step=0.462, train/loss_vlb_step=0.00314, train/loss_step=0.462, global_step=4611.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 46/5971 [01:41<3:33:47,  2.16s/it, loss=0.185, v_num=0, train/loss_simple_step=0.068, train/loss_vlb_step=0.000228, train/loss_step=0.068, global_step=4611.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 47/5971 [01:42<3:31:08,  2.14s/it, loss=0.185, v_num=0, train/loss_simple_step=0.068, train/loss_vlb_step=0.000228, train/loss_step=0.068, global_step=4611.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 47/5971 [01:42<3:31:09,  2.14s/it, loss=0.195, v_num=0, train/loss_simple_step=0.569, train/loss_vlb_step=0.00754, train/loss_step=0.569, global_step=4611.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   1%|          | 48/5971 [01:44<3:31:24,  2.14s/it, loss=0.195, v_num=0, train/loss_simple_step=0.569, train/loss_vlb_step=0.00754, train/loss_step=0.569, global_step=4611.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 48/5971 [01:44<3:31:25,  2.14s/it, loss=0.181, v_num=0, train/loss_simple_step=0.00265, train/loss_vlb_step=1.44e-5, train/loss_step=0.00265, global_step=4611.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 49/5971 [01:45<3:28:56,  2.12s/it, loss=0.181, v_num=0, train/loss_simple_step=0.00265, train/loss_vlb_step=1.44e-5, train/loss_step=0.00265, global_step=4611.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 49/5971 [01:45<3:28:56,  2.12s/it, loss=0.177, v_num=0, train/loss_simple_step=0.00494, train/loss_vlb_step=2.51e-5, train/loss_step=0.00494, global_step=4612.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 50/5971 [01:46<3:26:33,  2.09s/it, loss=0.177, v_num=0, train/loss_simple_step=0.00494, train/loss_vlb_step=2.51e-5, train/loss_step=0.00494, global_step=4612.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 50/5971 [01:46<3:26:33,  2.09s/it, loss=0.173, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00167, train/loss_step=0.320, global_step=4612.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:   1%|          | 51/5971 [01:47<3:24:11,  2.07s/it, loss=0.173, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00167, train/loss_step=0.320, global_step=4612.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 51/5971 [01:47<3:24:11,  2.07s/it, loss=0.17, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.17e-5, train/loss_step=0.00199, global_step=4612.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 52/5971 [01:49<3:24:11,  2.07s/it, loss=0.17, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.17e-5, train/loss_step=0.00199, global_step=4612.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 52/5971 [01:49<3:24:11,  2.07s/it, loss=0.189, v_num=0, train/loss_simple_step=0.407, train/loss_vlb_step=0.00223, train/loss_step=0.407, global_step=4612.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:   1%|          | 53/5971 [01:50<3:22:01,  2.05s/it, loss=0.189, v_num=0, train/loss_simple_step=0.407, train/loss_vlb_step=0.00223, train/loss_step=0.407, global_step=4612.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 53/5971 [01:50<3:22:01,  2.05s/it, loss=0.19, v_num=0, train/loss_simple_step=0.0299, train/loss_vlb_step=0.000116, train/loss_step=0.0299, global_step=4613.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 54/5971 [01:51<3:19:51,  2.03s/it, loss=0.19, v_num=0, train/loss_simple_step=0.0299, train/loss_vlb_step=0.000116, train/loss_step=0.0299, global_step=4613.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 54/5971 [01:51<3:19:51,  2.03s/it, loss=0.158, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000446, train/loss_step=0.136, global_step=4613.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   1%|          | 55/5971 [01:52<3:17:47,  2.01s/it, loss=0.158, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000446, train/loss_step=0.136, global_step=4613.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 55/5971 [01:52<3:17:48,  2.01s/it, loss=0.152, v_num=0, train/loss_simple_step=0.0301, train/loss_vlb_step=0.000117, train/loss_step=0.0301, global_step=4613.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 56/5971 [01:54<3:18:21,  2.01s/it, loss=0.152, v_num=0, train/loss_simple_step=0.0301, train/loss_vlb_step=0.000117, train/loss_step=0.0301, global_step=4613.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 56/5971 [01:54<3:18:21,  2.01s/it, loss=0.167, v_num=0, train/loss_simple_step=0.307, train/loss_vlb_step=0.00135, train/loss_step=0.307, global_step=4613.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:   1%|          | 57/5971 [01:55<3:16:25,  1.99s/it, loss=0.167, v_num=0, train/loss_simple_step=0.307, train/loss_vlb_step=0.00135, train/loss_step=0.307, global_step=4613.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 57/5971 [01:55<3:16:25,  1.99s/it, loss=0.158, v_num=0, train/loss_simple_step=0.00162, train/loss_vlb_step=9.72e-6, train/loss_step=0.00162, global_step=4614.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 58/5971 [01:56<3:14:30,  1.97s/it, loss=0.158, v_num=0, train/loss_simple_step=0.00162, train/loss_vlb_step=9.72e-6, train/loss_step=0.00162, global_step=4614.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 58/5971 [01:56<3:14:30,  1.97s/it, loss=0.173, v_num=0, train/loss_simple_step=0.302, train/loss_vlb_step=0.00137, train/loss_step=0.302, global_step=4614.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:   1%|          | 59/5971 [01:57<3:12:40,  1.96s/it, loss=0.173, v_num=0, train/loss_simple_step=0.302, train/loss_vlb_step=0.00137, train/loss_step=0.302, global_step=4614.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 59/5971 [01:57<3:12:40,  1.96s/it, loss=0.19, v_num=0, train/loss_simple_step=0.430, train/loss_vlb_step=0.00277, train/loss_step=0.430, global_step=4614.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   1%|          | 60/5971 [01:59<3:13:08,  1.96s/it, loss=0.19, v_num=0, train/loss_simple_step=0.430, train/loss_vlb_step=0.00277, train/loss_step=0.430, global_step=4614.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 60/5971 [01:59<3:13:08,  1.96s/it, loss=0.175, v_num=0, train/loss_simple_step=0.00685, train/loss_vlb_step=3.35e-5, train/loss_step=0.00685, global_step=4614.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 61/5971 [02:00<3:11:24,  1.94s/it, loss=0.175, v_num=0, train/loss_simple_step=0.00685, train/loss_vlb_step=3.35e-5, train/loss_step=0.00685, global_step=4614.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 61/5971 [02:00<3:11:25,  1.94s/it, loss=0.182, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000475, train/loss_step=0.141, global_step=4615.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:   1%|          | 62/5971 [02:01<3:09:41,  1.93s/it, loss=0.182, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000475, train/loss_step=0.141, global_step=4615.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 62/5971 [02:01<3:09:41,  1.93s/it, loss=0.213, v_num=0, train/loss_simple_step=0.628, train/loss_vlb_step=0.0073, train/loss_step=0.628, global_step=4615.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:   1%|          | 63/5971 [02:02<3:08:02,  1.91s/it, loss=0.213, v_num=0, train/loss_simple_step=0.628, train/loss_vlb_step=0.0073, train/loss_step=0.628, global_step=4615.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 63/5971 [02:02<3:08:03,  1.91s/it, loss=0.203, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.54e-5, train/loss_step=0.0128, global_step=4615.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 64/5971 [02:04<3:08:20,  1.91s/it, loss=0.203, v_num=0, train/loss_simple_step=0.0128, train/loss_vlb_step=5.54e-5, train/loss_step=0.0128, global_step=4615.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 64/5971 [02:04<3:08:20,  1.91s/it, loss=0.217, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00361, train/loss_step=0.476, global_step=4615.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:   1%|          | 65/5971 [02:05<3:06:48,  1.90s/it, loss=0.217, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00361, train/loss_step=0.476, global_step=4615.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 65/5971 [02:05<3:06:48,  1.90s/it, loss=0.197, v_num=0, train/loss_simple_step=0.0635, train/loss_vlb_step=0.000214, train/loss_step=0.0635, global_step=4616.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 66/5971 [02:06<3:05:16,  1.88s/it, loss=0.197, v_num=0, train/loss_simple_step=0.0635, train/loss_vlb_step=0.000214, train/loss_step=0.0635, global_step=4616.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 66/5971 [02:06<3:05:16,  1.88s/it, loss=0.194, v_num=0, train/loss_simple_step=0.00757, train/loss_vlb_step=3.6e-5, train/loss_step=0.00757, global_step=4616.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 67/5971 [02:07<3:03:47,  1.87s/it, loss=0.194, v_num=0, train/loss_simple_step=0.00757, train/loss_vlb_step=3.6e-5, train/loss_step=0.00757, global_step=4616.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 67/5971 [02:07<3:03:47,  1.87s/it, loss=0.166, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.65e-5, train/loss_step=0.0101, global_step=4616.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   1%|          | 68/5971 [02:09<3:04:22,  1.87s/it, loss=0.166, v_num=0, train/loss_simple_step=0.0101, train/loss_vlb_step=4.65e-5, train/loss_step=0.0101, global_step=4616.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 68/5971 [02:09<3:04:22,  1.87s/it, loss=0.178, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000854, train/loss_step=0.240, global_step=4616.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   1%|          | 69/5971 [02:10<3:02:59,  1.86s/it, loss=0.178, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000854, train/loss_step=0.240, global_step=4616.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 69/5971 [02:10<3:02:59,  1.86s/it, loss=0.18, v_num=0, train/loss_simple_step=0.0426, train/loss_vlb_step=0.000153, train/loss_step=0.0426, global_step=4617.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 70/5971 [02:11<3:01:34,  1.85s/it, loss=0.18, v_num=0, train/loss_simple_step=0.0426, train/loss_vlb_step=0.000153, train/loss_step=0.0426, global_step=4617.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 70/5971 [02:11<3:01:34,  1.85s/it, loss=0.174, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.000774, train/loss_step=0.206, global_step=4617.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   1%|          | 71/5971 [02:11<3:00:12,  1.83s/it, loss=0.174, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.000774, train/loss_step=0.206, global_step=4617.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 71/5971 [02:11<3:00:12,  1.83s/it, loss=0.174, v_num=0, train/loss_simple_step=0.00593, train/loss_vlb_step=2.8e-5, train/loss_step=0.00593, global_step=4617.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 72/5971 [02:14<3:00:32,  1.84s/it, loss=0.174, v_num=0, train/loss_simple_step=0.00593, train/loss_vlb_step=2.8e-5, train/loss_step=0.00593, global_step=4617.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 72/5971 [02:14<3:00:32,  1.84s/it, loss=0.156, v_num=0, train/loss_simple_step=0.0338, train/loss_vlb_step=0.000116, train/loss_step=0.0338, global_step=4617.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 73/5971 [02:14<2:59:15,  1.82s/it, loss=0.156, v_num=0, train/loss_simple_step=0.0338, train/loss_vlb_step=0.000116, train/loss_step=0.0338, global_step=4617.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 73/5971 [02:14<2:59:15,  1.82s/it, loss=0.175, v_num=0, train/loss_simple_step=0.412, train/loss_vlb_step=0.00233, train/loss_step=0.412, global_step=4618.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:   1%|          | 74/5971 [02:15<2:57:58,  1.81s/it, loss=0.175, v_num=0, train/loss_simple_step=0.412, train/loss_vlb_step=0.00233, train/loss_step=0.412, global_step=4618.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|          | 74/5971 [02:15<2:57:58,  1.81s/it, loss=0.168, v_num=0, train/loss_simple_step=0.00782, train/loss_vlb_step=3.48e-5, train/loss_step=0.00782, global_step=4618.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|▏         | 75/5971 [02:16<2:56:43,  1.80s/it, loss=0.168, v_num=0, train/loss_simple_step=0.00782, train/loss_vlb_step=3.48e-5, train/loss_step=0.00782, global_step=4618.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|▏         | 75/5971 [02:16<2:56:43,  1.80s/it, loss=0.185, v_num=0, train/loss_simple_step=0.358, train/loss_vlb_step=0.0019, train/loss_step=0.358, global_step=4618.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]     
Epoch 8:   1%|▏         | 76/5971 [02:19<2:57:27,  1.81s/it, loss=0.185, v_num=0, train/loss_simple_step=0.358, train/loss_vlb_step=0.0019, train/loss_step=0.358, global_step=4618.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|▏         | 76/5971 [02:19<2:57:27,  1.81s/it, loss=0.169, v_num=0, train/loss_simple_step=0.00348, train/loss_vlb_step=1.96e-5, train/loss_step=0.00348, global_step=4618.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|▏         | 77/5971 [02:19<2:56:16,  1.79s/it, loss=0.169, v_num=0, train/loss_simple_step=0.00348, train/loss_vlb_step=1.96e-5, train/loss_step=0.00348, global_step=4618.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|▏         | 77/5971 [02:19<2:56:16,  1.79s/it, loss=0.173, v_num=0, train/loss_simple_step=0.0774, train/loss_vlb_step=0.000256, train/loss_step=0.0774, global_step=4619.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   1%|▏         | 78/5971 [02:20<2:55:06,  1.78s/it, loss=0.173, v_num=0, train/loss_simple_step=0.0774, train/loss_vlb_step=0.000256, train/loss_step=0.0774, global_step=4619.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|▏         | 78/5971 [02:20<2:55:06,  1.78s/it, loss=0.16, v_num=0, train/loss_simple_step=0.0456, train/loss_vlb_step=0.000163, train/loss_step=0.0456, global_step=4619.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   1%|▏         | 79/5971 [02:21<2:53:59,  1.77s/it, loss=0.16, v_num=0, train/loss_simple_step=0.0456, train/loss_vlb_step=0.000163, train/loss_step=0.0456, global_step=4619.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|▏         | 79/5971 [02:21<2:53:59,  1.77s/it, loss=0.142, v_num=0, train/loss_simple_step=0.0533, train/loss_vlb_step=0.000183, train/loss_step=0.0533, global_step=4619.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|▏         | 80/5971 [02:23<2:54:19,  1.78s/it, loss=0.142, v_num=0, train/loss_simple_step=0.0533, train/loss_vlb_step=0.000183, train/loss_step=0.0533, global_step=4619.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|▏         | 80/5971 [02:23<2:54:19,  1.78s/it, loss=0.148, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000438, train/loss_step=0.131, global_step=4619.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:   1%|▏         | 81/5971 [02:24<2:53:15,  1.76s/it, loss=0.148, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000438, train/loss_step=0.131, global_step=4619.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|▏         | 81/5971 [02:24<2:53:15,  1.76s/it, loss=0.149, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000551, train/loss_step=0.166, global_step=4620.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|▏         | 82/5971 [02:25<2:52:10,  1.75s/it, loss=0.149, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000551, train/loss_step=0.166, global_step=4620.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|▏         | 82/5971 [02:25<2:52:10,  1.75s/it, loss=0.13, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00111, train/loss_step=0.253, global_step=4620.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:   1%|▏         | 83/5971 [02:26<2:51:06,  1.74s/it, loss=0.13, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00111, train/loss_step=0.253, global_step=4620.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|▏         | 83/5971 [02:26<2:51:07,  1.74s/it, loss=0.13, v_num=0, train/loss_simple_step=0.00286, train/loss_vlb_step=1.44e-5, train/loss_step=0.00286, global_step=4620.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|▏         | 84/5971 [02:28<2:51:46,  1.75s/it, loss=0.13, v_num=0, train/loss_simple_step=0.00286, train/loss_vlb_step=1.44e-5, train/loss_step=0.00286, global_step=4620.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|▏         | 84/5971 [02:28<2:51:47,  1.75s/it, loss=0.106, v_num=0, train/loss_simple_step=0.00721, train/loss_vlb_step=3.46e-5, train/loss_step=0.00721, global_step=4620.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|▏         | 85/5971 [02:29<2:50:46,  1.74s/it, loss=0.106, v_num=0, train/loss_simple_step=0.00721, train/loss_vlb_step=3.46e-5, train/loss_step=0.00721, global_step=4620.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|▏         | 85/5971 [02:29<2:50:46,  1.74s/it, loss=0.111, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000537, train/loss_step=0.163, global_step=4621.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:   1%|▏         | 86/5971 [02:30<2:49:46,  1.73s/it, loss=0.111, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000537, train/loss_step=0.163, global_step=4621.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|▏         | 86/5971 [02:30<2:49:46,  1.73s/it, loss=0.113, v_num=0, train/loss_simple_step=0.0451, train/loss_vlb_step=0.000159, train/loss_step=0.0451, global_step=4621.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|▏         | 87/5971 [02:31<2:48:46,  1.72s/it, loss=0.113, v_num=0, train/loss_simple_step=0.0451, train/loss_vlb_step=0.000159, train/loss_step=0.0451, global_step=4621.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|▏         | 87/5971 [02:31<2:48:46,  1.72s/it, loss=0.128, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00136, train/loss_step=0.308, global_step=4621.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:   1%|▏         | 88/5971 [02:33<2:49:12,  1.73s/it, loss=0.128, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00136, train/loss_step=0.308, global_step=4621.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|▏         | 88/5971 [02:33<2:49:12,  1.73s/it, loss=0.154, v_num=0, train/loss_simple_step=0.769, train/loss_vlb_step=0.0205, train/loss_step=0.769, global_step=4621.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   1%|▏         | 89/5971 [02:34<2:48:17,  1.72s/it, loss=0.154, v_num=0, train/loss_simple_step=0.769, train/loss_vlb_step=0.0205, train/loss_step=0.769, global_step=4621.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   1%|▏         | 89/5971 [02:34<2:48:17,  1.72s/it, loss=0.152, v_num=0, train/loss_simple_step=0.00242, train/loss_vlb_step=1.37e-5, train/loss_step=0.00242, global_step=4622.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   2%|▏         | 90/5971 [02:35<2:47:22,  1.71s/it, loss=0.152, v_num=0, train/loss_simple_step=0.00242, train/loss_vlb_step=1.37e-5, train/loss_step=0.00242, global_step=4622.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   2%|▏         | 90/5971 [02:35<2:47:22,  1.71s/it, loss=0.152, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000662, train/loss_step=0.189, global_step=4622.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:   2%|▏         | 91/5971 [02:36<2:46:27,  1.70s/it, loss=0.152, v_num=0, train/loss_simple_step=0.189, train/loss_vlb_step=0.000662, train/loss_step=0.189, global_step=4622.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   2%|▏         | 91/5971 [02:36<2:46:27,  1.70s/it, loss=0.152, v_num=0, train/loss_simple_step=0.00742, train/loss_vlb_step=3.5e-5, train/loss_step=0.00742, global_step=4622.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   2%|▏         | 92/5971 [02:38<2:46:51,  1.70s/it, loss=0.152, v_num=0, train/loss_simple_step=0.00742, train/loss_vlb_step=3.5e-5, train/loss_step=0.00742, global_step=4622.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   2%|▏         | 92/5971 [02:38<2:46:51,  1.70s/it, loss=0.158, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000523, train/loss_step=0.153, global_step=4622.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:   2%|▏         | 93/5971 [02:39<2:46:00,  1.69s/it, loss=0.158, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000523, train/loss_step=0.153, global_step=4622.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   2%|▏         | 93/5971 [02:39<2:46:00,  1.69s/it, loss=0.138, v_num=0, train/loss_simple_step=0.0229, train/loss_vlb_step=8.49e-5, train/loss_step=0.0229, global_step=4623.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   2%|▏         | 94/5971 [02:40<2:45:08,  1.69s/it, loss=0.138, v_num=0, train/loss_simple_step=0.0229, train/loss_vlb_step=8.49e-5, train/loss_step=0.0229, global_step=4623.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   2%|▏         | 94/5971 [02:40<2:45:08,  1.69s/it, loss=0.152, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.00128, train/loss_step=0.289, global_step=4623.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:   2%|▏         | 95/5971 [02:41<2:44:16,  1.68s/it, loss=0.152, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.00128, train/loss_step=0.289, global_step=4623.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   2%|▏         | 95/5971 [02:41<2:44:16,  1.68s/it, loss=0.163, v_num=0, train/loss_simple_step=0.574, train/loss_vlb_step=0.00646, train/loss_step=0.574, global_step=4623.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   2%|▏         | 96/5971 [02:43<2:45:02,  1.69s/it, loss=0.163, v_num=0, train/loss_simple_step=0.574, train/loss_vlb_step=0.00646, train/loss_step=0.574, global_step=4623.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   2%|▏         | 96/5971 [02:43<2:45:02,  1.69s/it, loss=0.163, v_num=0, train/loss_simple_step=0.00441, train/loss_vlb_step=2.26e-5, train/loss_step=0.00441, global_step=4623.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   2%|▏         | 97/5971 [02:44<2:44:13,  1.68s/it, loss=0.163, v_num=0, train/loss_simple_step=0.00441, train/loss_vlb_step=2.26e-5, train/loss_step=0.00441, global_step=4623.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   2%|▏         | 97/5971 [02:44<2:44:13,  1.68s/it, loss=0.162, v_num=0, train/loss_simple_step=0.0615, train/loss_vlb_step=0.000218, train/loss_step=0.0615, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   2%|▏         | 98/5971 [02:45<2:43:23,  1.67s/it, loss=0.162, v_num=0, train/loss_simple_step=0.0615, train/loss_vlb_step=0.000218, train/loss_step=0.0615, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   2%|▏         | 98/5971 [02:45<2:43:23,  1.67s/it, loss=0.161, v_num=0, train/loss_simple_step=0.0133, train/loss_vlb_step=5.81e-5, train/loss_step=0.0133, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   2%|▏         | 99/5971 [02:46<2:42:34,  1.66s/it, loss=0.161, v_num=0, train/loss_simple_step=0.0133, train/loss_vlb_step=5.81e-5, train/loss_step=0.0133, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   2%|▏         | 99/5971 [02:46<2:42:35,  1.66s/it, loss=0.187, v_num=0, train/loss_simple_step=0.579, train/loss_vlb_step=0.00612, train/loss_step=0.579, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:   2%|▏         | 100/5971 [02:48<2:42:59,  1.67s/it, loss=0.187, v_num=0, train/loss_simple_step=0.579, train/loss_vlb_step=0.00612, train/loss_step=0.579, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   2%|▏         | 100/5971 [02:48<2:42:59,  1.67s/it, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:02,  2.65it/s]
Epoch 8:   2%|▏         | 102/5971 [02:48<2:40:10,  1.64s/it, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   1%|          | 2/167 [00:00<00:48,  3.43it/s]
Epoch 8:   2%|▏         | 104/5971 [02:48<2:37:19,  1.61s/it, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.10it/s]
Epoch 8:   2%|▏         | 107/5971 [02:49<2:32:58,  1.57s/it, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.92it/s]
Epoch 8:   2%|▏         | 110/5971 [02:49<2:28:51,  1.52s/it, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   7%|▋         | 11/167 [00:00<00:08, 17.78it/s]
Epoch 8:   2%|▏         | 113/5971 [02:49<2:24:57,  1.48s/it, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   8%|▊         | 14/167 [00:01<00:07, 20.34it/s]
Epoch 8:   2%|▏         | 116/5971 [02:49<2:21:15,  1.45s/it, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  11%|█         | 18/167 [00:01<00:06, 23.75it/s]
Epoch 8:   2%|▏         | 120/5971 [02:49<2:16:35,  1.40s/it, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  13%|█▎        | 21/167 [00:01<00:05, 24.58it/s]
Epoch 8:   2%|▏         | 124/5971 [02:49<2:12:15,  1.36s/it, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  14%|█▍        | 24/167 [00:01<00:05, 25.32it/s]
Epoch 8:   2%|▏         | 128/5971 [02:49<2:08:09,  1.32s/it, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  17%|█▋        | 28/167 [00:01<00:05, 26.84it/s]

Validating:  19%|█▊        | 31/167 [00:01<00:05, 25.90it/s]
Epoch 8:   2%|▏         | 132/5971 [02:49<2:04:20,  1.28s/it, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  21%|██        | 35/167 [00:01<00:04, 26.56it/s]
Epoch 8:   2%|▏         | 136/5971 [02:50<2:00:43,  1.24s/it, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  23%|██▎       | 38/167 [00:01<00:04, 26.39it/s]
Epoch 8:   2%|▏         | 140/5971 [02:50<1:57:20,  1.21s/it, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  25%|██▍       | 41/167 [00:02<00:05, 24.75it/s]
Epoch 8:   2%|▏         | 144/5971 [02:50<1:54:08,  1.18s/it, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 25.12it/s]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 26.01it/s]
Epoch 8:   2%|▏         | 148/5971 [02:50<1:51:05,  1.14s/it, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 27.04it/s]
Epoch 8:   3%|▎         | 152/5971 [02:50<1:48:12,  1.12s/it, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 25.85it/s]
Epoch 8:   3%|▎         | 156/5971 [02:50<1:45:27,  1.09s/it, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 26.35it/s][A

Validating:  35%|███▌      | 59/167 [00:02<00:03, 27.19it/s][A
Epoch 8:   3%|▎         | 160/5971 [02:50<1:42:51,  1.06s/it, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  37%|███▋      | 62/167 [00:02<00:03, 26.77it/s][A
Epoch 8:   3%|▎         | 164/5971 [02:51<1:40:23,  1.04s/it, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  39%|███▉      | 65/167 [00:02<00:03, 26.74it/s][A
Epoch 8:   3%|▎         | 168/5971 [02:51<1:38:01,  1.01s/it, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  41%|████      | 68/167 [00:03<00:03, 27.58it/s][A

Validating:  43%|████▎     | 71/167 [00:03<00:03, 27.78it/s][A
Epoch 8:   3%|▎         | 172/5971 [02:51<1:35:46,  1.01it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 25.40it/s][A
Epoch 8:   3%|▎         | 176/5971 [02:51<1:33:39,  1.03it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 25.19it/s][A
Epoch 8:   3%|▎         | 180/5971 [02:51<1:31:35,  1.05it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 25.90it/s][A

Validating:  50%|████▉     | 83/167 [00:03<00:03, 25.11it/s][A
Epoch 8:   3%|▎         | 184/5971 [02:51<1:29:38,  1.08it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  52%|█████▏    | 87/167 [00:03<00:03, 26.27it/s][A
Epoch 8:   3%|▎         | 188/5971 [02:52<1:27:45,  1.10it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  54%|█████▍    | 90/167 [00:03<00:02, 26.12it/s][A
Epoch 8:   3%|▎         | 192/5971 [02:52<1:25:57,  1.12it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 25.12it/s][A
Epoch 8:   3%|▎         | 196/5971 [02:52<1:24:13,  1.14it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 24.84it/s][A

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 25.76it/s][A
Epoch 8:   3%|▎         | 200/5971 [02:52<1:22:34,  1.16it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  61%|██████    | 102/167 [00:04<00:02, 25.88it/s][A
Epoch 8:   3%|▎         | 204/5971 [02:52<1:20:58,  1.19it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 26.50it/s][A
Epoch 8:   3%|▎         | 208/5971 [02:52<1:19:26,  1.21it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 26.17it/s][A

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 26.17it/s][A
Epoch 8:   4%|▎         | 212/5971 [02:53<1:17:57,  1.23it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  68%|██████▊   | 114/167 [00:04<00:01, 26.94it/s][A
Epoch 8:   4%|▎         | 216/5971 [02:53<1:16:31,  1.25it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  70%|███████   | 117/167 [00:04<00:01, 26.41it/s][A
Epoch 8:   4%|▎         | 220/5971 [02:53<1:15:09,  1.28it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 26.83it/s][A

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 27.54it/s][A
Epoch 8:   4%|▍         | 224/5971 [02:53<1:13:49,  1.30it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 26.68it/s][A
Epoch 8:   4%|▍         | 228/5971 [02:53<1:12:33,  1.32it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 27.15it/s][A
Epoch 8:   4%|▍         | 232/5971 [02:53<1:11:19,  1.34it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 27.13it/s][A

Validating:  81%|████████  | 135/167 [00:05<00:01, 27.28it/s][A
Epoch 8:   4%|▍         | 236/5971 [02:53<1:10:07,  1.36it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  83%|████████▎ | 139/167 [00:05<00:00, 28.46it/s][A
Epoch 8:   4%|▍         | 240/5971 [02:54<1:08:58,  1.38it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  85%|████████▌ | 142/167 [00:05<00:00, 26.40it/s][A
Epoch 8:   4%|▍         | 244/5971 [02:54<1:07:51,  1.41it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  87%|████████▋ | 145/167 [00:05<00:00, 26.68it/s][A
Epoch 8:   4%|▍         | 248/5971 [02:54<1:06:46,  1.43it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 26.55it/s][A

Validating:  90%|█████████ | 151/167 [00:06<00:00, 25.47it/s][A
Epoch 8:   4%|▍         | 252/5971 [02:54<1:05:44,  1.45it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 25.68it/s][A
Epoch 8:   4%|▍         | 256/5971 [02:54<1:04:43,  1.47it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 25.20it/s][A
Epoch 8:   4%|▍         | 260/5971 [02:54<1:03:45,  1.49it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 25.89it/s][A

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 26.25it/s][A
Epoch 8:   4%|▍         | 264/5971 [02:54<1:02:47,  1.51it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 26.23it/s][A
Epoch 8:   4%|▍         | 268/5971 [02:55<1:01:52,  1.54it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   4%|▍         | 268/5971 [02:55<1:01:56,  1.53it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0549, train/loss_vlb_step=0.000191, train/loss_step=0.0549, global_step=4624.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▍         | 269/5971 [02:56<1:02:02,  1.53it/s, loss=0.175, v_num=0, train/loss_simple_step=0.00201, train/loss_vlb_step=1.18e-5, train/loss_step=0.00201, global_step=4625.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▍         | 270/5971 [02:57<1:02:06,  1.53it/s, loss=0.176, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.00108, train/loss_step=0.264, global_step=4625.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:   5%|▍         | 271/5971 [02:57<1:02:09,  1.53it/s, loss=0.204, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.00704, train/loss_step=0.563, global_step=4625.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▍         | 272/5971 [03:00<1:02:47,  1.51it/s, loss=0.204, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.00704, train/loss_step=0.563, global_step=4625.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▍         | 272/5971 [03:00<1:02:47,  1.51it/s, loss=0.215, v_num=0, train/loss_simple_step=0.244, train/loss_vlb_step=0.000887, train/loss_step=0.244, global_step=4625.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▍         | 273/5971 [03:01<1:02:51,  1.51it/s, loss=0.208, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=5.2e-5, train/loss_step=0.0114, global_step=4626.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▍         | 274/5971 [03:02<1:02:55,  1.51it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=5.78e-5, train/loss_step=0.0136, global_step=4626.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▍         | 275/5971 [03:03<1:02:59,  1.51it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=6.51e-5, train/loss_step=0.0156, global_step=4626.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▍         | 276/5971 [03:05<1:03:27,  1.50it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=6.51e-5, train/loss_step=0.0156, global_step=4626.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▍         | 276/5971 [03:05<1:03:27,  1.50it/s, loss=0.16, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000476, train/loss_step=0.144, global_step=4626.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:   5%|▍         | 277/5971 [03:06<1:03:31,  1.49it/s, loss=0.179, v_num=0, train/loss_simple_step=0.370, train/loss_vlb_step=0.00164, train/loss_step=0.370, global_step=4627.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▍         | 278/5971 [03:06<1:03:34,  1.49it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00194, train/loss_vlb_step=1.1e-5, train/loss_step=0.00194, global_step=4627.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▍         | 279/5971 [03:07<1:03:38,  1.49it/s, loss=0.175, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=4627.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:   5%|▍         | 280/5971 [03:09<1:04:06,  1.48it/s, loss=0.175, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000393, train/loss_step=0.120, global_step=4627.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▍         | 280/5971 [03:09<1:04:06,  1.48it/s, loss=0.173, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000368, train/loss_step=0.112, global_step=4627.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▍         | 281/5971 [03:10<1:04:10,  1.48it/s, loss=0.178, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.00041, train/loss_step=0.125, global_step=4628.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   5%|▍         | 282/5971 [03:11<1:04:14,  1.48it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0501, train/loss_vlb_step=0.000174, train/loss_step=0.0501, global_step=4628.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▍         | 283/5971 [03:12<1:04:17,  1.47it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00862, train/loss_vlb_step=4.14e-5, train/loss_step=0.00862, global_step=4628.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▍         | 284/5971 [03:14<1:04:45,  1.46it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00862, train/loss_vlb_step=4.14e-5, train/loss_step=0.00862, global_step=4628.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▍         | 284/5971 [03:14<1:04:45,  1.46it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0609, train/loss_vlb_step=0.000223, train/loss_step=0.0609, global_step=4628.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   5%|▍         | 285/5971 [03:15<1:04:48,  1.46it/s, loss=0.174, v_num=0, train/loss_simple_step=0.733, train/loss_vlb_step=0.0228, train/loss_step=0.733, global_step=4629.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:   5%|▍         | 286/5971 [03:16<1:04:51,  1.46it/s, loss=0.183, v_num=0, train/loss_simple_step=0.177, train/loss_vlb_step=0.000619, train/loss_step=0.177, global_step=4629.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▍         | 287/5971 [03:17<1:04:55,  1.46it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00569, train/loss_vlb_step=2.81e-5, train/loss_step=0.00569, global_step=4629.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▍         | 288/5971 [03:19<1:05:23,  1.45it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00569, train/loss_vlb_step=2.81e-5, train/loss_step=0.00569, global_step=4629.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▍         | 288/5971 [03:19<1:05:23,  1.45it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00617, train/loss_vlb_step=3.08e-5, train/loss_step=0.00617, global_step=4629.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▍         | 289/5971 [03:20<1:05:26,  1.45it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00144, train/loss_vlb_step=8.73e-6, train/loss_step=0.00144, global_step=4630.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▍         | 290/5971 [03:21<1:05:29,  1.45it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00215, train/loss_vlb_step=1.28e-5, train/loss_step=0.00215, global_step=4630.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▍         | 291/5971 [03:22<1:05:32,  1.44it/s, loss=0.111, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.9e-5, train/loss_step=0.021, global_step=4630.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]     
Epoch 8:   5%|▍         | 292/5971 [03:24<1:06:05,  1.43it/s, loss=0.111, v_num=0, train/loss_simple_step=0.021, train/loss_vlb_step=8.9e-5, train/loss_step=0.021, global_step=4630.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▍         | 292/5971 [03:24<1:06:05,  1.43it/s, loss=0.105, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000403, train/loss_step=0.122, global_step=4630.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▍         | 293/5971 [03:25<1:06:08,  1.43it/s, loss=0.105, v_num=0, train/loss_simple_step=0.00211, train/loss_vlb_step=1.23e-5, train/loss_step=0.00211, global_step=4631.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▍         | 294/5971 [03:26<1:06:11,  1.43it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0467, train/loss_vlb_step=0.000162, train/loss_step=0.0467, global_step=4631.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   5%|▍         | 295/5971 [03:27<1:06:14,  1.43it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0396, train/loss_vlb_step=0.000147, train/loss_step=0.0396, global_step=4631.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▍         | 296/5971 [03:29<1:06:40,  1.42it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0396, train/loss_vlb_step=0.000147, train/loss_step=0.0396, global_step=4631.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▍         | 296/5971 [03:29<1:06:40,  1.42it/s, loss=0.107, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000428, train/loss_step=0.129, global_step=4631.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:   5%|▍         | 297/5971 [03:30<1:06:43,  1.42it/s, loss=0.0969, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000582, train/loss_step=0.174, global_step=4632.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▍         | 298/5971 [03:31<1:06:45,  1.42it/s, loss=0.0982, v_num=0, train/loss_simple_step=0.0271, train/loss_vlb_step=0.000104, train/loss_step=0.0271, global_step=4632.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▌         | 299/5971 [03:31<1:06:47,  1.42it/s, loss=0.0958, v_num=0, train/loss_simple_step=0.0727, train/loss_vlb_step=0.00024, train/loss_step=0.0727, global_step=4632.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   5%|▌         | 300/5971 [03:34<1:07:15,  1.41it/s, loss=0.0958, v_num=0, train/loss_simple_step=0.0727, train/loss_vlb_step=0.00024, train/loss_step=0.0727, global_step=4632.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▌         | 300/5971 [03:34<1:07:15,  1.41it/s, loss=0.109, v_num=0, train/loss_simple_step=0.380, train/loss_vlb_step=0.00174, train/loss_step=0.380, global_step=4632.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:   5%|▌         | 301/5971 [03:35<1:07:17,  1.40it/s, loss=0.114, v_num=0, train/loss_simple_step=0.215, train/loss_vlb_step=0.000832, train/loss_step=0.215, global_step=4633.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▌         | 302/5971 [03:35<1:07:20,  1.40it/s, loss=0.123, v_num=0, train/loss_simple_step=0.226, train/loss_vlb_step=0.000969, train/loss_step=0.226, global_step=4633.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▌         | 303/5971 [03:36<1:07:23,  1.40it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0159, train/loss_vlb_step=7.1e-5, train/loss_step=0.0159, global_step=4633.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▌         | 304/5971 [03:38<1:07:47,  1.39it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0159, train/loss_vlb_step=7.1e-5, train/loss_step=0.0159, global_step=4633.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▌         | 304/5971 [03:38<1:07:47,  1.39it/s, loss=0.13, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000669, train/loss_step=0.200, global_step=4633.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   5%|▌         | 305/5971 [03:39<1:07:50,  1.39it/s, loss=0.0935, v_num=0, train/loss_simple_step=0.00489, train/loss_vlb_step=2.49e-5, train/loss_step=0.00489, global_step=4634.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▌         | 306/5971 [03:40<1:07:52,  1.39it/s, loss=0.101, v_num=0, train/loss_simple_step=0.324, train/loss_vlb_step=0.00158, train/loss_step=0.324, global_step=4634.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]     
Epoch 8:   5%|▌         | 307/5971 [03:41<1:07:54,  1.39it/s, loss=0.119, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00164, train/loss_step=0.375, global_step=4634.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▌         | 308/5971 [03:43<1:08:20,  1.38it/s, loss=0.119, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00164, train/loss_step=0.375, global_step=4634.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▌         | 308/5971 [03:43<1:08:20,  1.38it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0972, train/loss_vlb_step=0.000329, train/loss_step=0.0972, global_step=4634.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▌         | 309/5971 [03:44<1:08:23,  1.38it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0145, train/loss_vlb_step=6.05e-5, train/loss_step=0.0145, global_step=4635.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   5%|▌         | 310/5971 [03:45<1:08:24,  1.38it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0527, train/loss_vlb_step=0.000178, train/loss_step=0.0527, global_step=4635.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▌         | 311/5971 [03:46<1:08:26,  1.38it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0454, train/loss_vlb_step=0.000159, train/loss_step=0.0454, global_step=4635.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▌         | 312/5971 [03:48<1:08:54,  1.37it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0454, train/loss_vlb_step=0.000159, train/loss_step=0.0454, global_step=4635.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▌         | 312/5971 [03:48<1:08:54,  1.37it/s, loss=0.129, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000437, train/loss_step=0.129, global_step=4635.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:   5%|▌         | 313/5971 [03:49<1:08:57,  1.37it/s, loss=0.169, v_num=0, train/loss_simple_step=0.808, train/loss_vlb_step=0.0463, train/loss_step=0.808, global_step=4636.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:   5%|▌         | 314/5971 [03:50<1:08:59,  1.37it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0197, train/loss_vlb_step=8.17e-5, train/loss_step=0.0197, global_step=4636.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▌         | 315/5971 [03:51<1:09:00,  1.37it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00263, train/loss_vlb_step=1.48e-5, train/loss_step=0.00263, global_step=4636.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▌         | 316/5971 [03:53<1:09:24,  1.36it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00263, train/loss_vlb_step=1.48e-5, train/loss_step=0.00263, global_step=4636.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▌         | 316/5971 [03:53<1:09:24,  1.36it/s, loss=0.17, v_num=0, train/loss_simple_step=0.220, train/loss_vlb_step=0.000807, train/loss_step=0.220, global_step=4636.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:   5%|▌         | 317/5971 [03:54<1:09:26,  1.36it/s, loss=0.182, v_num=0, train/loss_simple_step=0.413, train/loss_vlb_step=0.00417, train/loss_step=0.413, global_step=4637.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▌         | 318/5971 [03:55<1:09:28,  1.36it/s, loss=0.2, v_num=0, train/loss_simple_step=0.392, train/loss_vlb_step=0.00192, train/loss_step=0.392, global_step=4637.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:   5%|▌         | 319/5971 [03:56<1:09:30,  1.36it/s, loss=0.199, v_num=0, train/loss_simple_step=0.0495, train/loss_vlb_step=0.000173, train/loss_step=0.0495, global_step=4637.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▌         | 320/5971 [03:58<1:10:00,  1.35it/s, loss=0.199, v_num=0, train/loss_simple_step=0.0495, train/loss_vlb_step=0.000173, train/loss_step=0.0495, global_step=4637.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▌         | 320/5971 [03:58<1:10:00,  1.35it/s, loss=0.181, v_num=0, train/loss_simple_step=0.00913, train/loss_vlb_step=4.38e-5, train/loss_step=0.00913, global_step=4637.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▌         | 321/5971 [03:59<1:10:03,  1.34it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00645, train/loss_vlb_step=3.07e-5, train/loss_step=0.00645, global_step=4638.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   5%|▌         | 322/5971 [04:00<1:10:04,  1.34it/s, loss=0.168, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000619, train/loss_step=0.179, global_step=4638.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:   5%|▌         | 323/5971 [04:01<1:10:07,  1.34it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.33e-5, train/loss_step=0.0124, global_step=4638.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▌         | 324/5971 [04:03<1:10:30,  1.33it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0124, train/loss_vlb_step=5.33e-5, train/loss_step=0.0124, global_step=4638.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▌         | 324/5971 [04:03<1:10:30,  1.33it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0586, train/loss_vlb_step=0.000206, train/loss_step=0.0586, global_step=4638.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▌         | 325/5971 [04:04<1:10:32,  1.33it/s, loss=0.177, v_num=0, train/loss_simple_step=0.339, train/loss_vlb_step=0.00141, train/loss_step=0.339, global_step=4639.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:   5%|▌         | 326/5971 [04:05<1:10:34,  1.33it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=6.81e-5, train/loss_step=0.0156, global_step=4639.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▌         | 327/5971 [04:06<1:10:35,  1.33it/s, loss=0.153, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000685, train/loss_step=0.196, global_step=4639.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   5%|▌         | 328/5971 [04:08<1:10:57,  1.33it/s, loss=0.153, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000685, train/loss_step=0.196, global_step=4639.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   5%|▌         | 328/5971 [04:08<1:10:57,  1.33it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0692, train/loss_vlb_step=0.000236, train/loss_step=0.0692, global_step=4639.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 329/5971 [04:09<1:10:59,  1.32it/s, loss=0.159, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.000534, train/loss_step=0.161, global_step=4640.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:   6%|▌         | 330/5971 [04:10<1:11:01,  1.32it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0103, train/loss_vlb_step=4.67e-5, train/loss_step=0.0103, global_step=4640.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 331/5971 [04:10<1:11:02,  1.32it/s, loss=0.167, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00116, train/loss_step=0.253, global_step=4640.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:   6%|▌         | 332/5971 [04:13<1:11:25,  1.32it/s, loss=0.167, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.00116, train/loss_step=0.253, global_step=4640.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 332/5971 [04:13<1:11:25,  1.32it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0607, train/loss_vlb_step=0.000203, train/loss_step=0.0607, global_step=4640.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 333/5971 [04:13<1:11:27,  1.32it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00467, train/loss_vlb_step=2.44e-5, train/loss_step=0.00467, global_step=4641.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 334/5971 [04:14<1:11:28,  1.31it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00165, train/loss_vlb_step=9.22e-6, train/loss_step=0.00165, global_step=4641.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 335/5971 [04:15<1:11:29,  1.31it/s, loss=0.125, v_num=0, train/loss_simple_step=0.045, train/loss_vlb_step=0.000161, train/loss_step=0.045, global_step=4641.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:   6%|▌         | 336/5971 [04:17<1:11:53,  1.31it/s, loss=0.125, v_num=0, train/loss_simple_step=0.045, train/loss_vlb_step=0.000161, train/loss_step=0.045, global_step=4641.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 336/5971 [04:17<1:11:53,  1.31it/s, loss=0.126, v_num=0, train/loss_simple_step=0.241, train/loss_vlb_step=0.000819, train/loss_step=0.241, global_step=4641.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 337/5971 [04:18<1:11:54,  1.31it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0383, train/loss_vlb_step=0.00014, train/loss_step=0.0383, global_step=4642.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 338/5971 [04:19<1:11:55,  1.31it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.0779, train/loss_vlb_step=0.000257, train/loss_step=0.0779, global_step=4642.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 339/5971 [04:20<1:11:57,  1.30it/s, loss=0.0894, v_num=0, train/loss_simple_step=0.0075, train/loss_vlb_step=3.85e-5, train/loss_step=0.0075, global_step=4642.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   6%|▌         | 340/5971 [04:23<1:12:29,  1.29it/s, loss=0.0894, v_num=0, train/loss_simple_step=0.0075, train/loss_vlb_step=3.85e-5, train/loss_step=0.0075, global_step=4642.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 340/5971 [04:23<1:12:29,  1.29it/s, loss=0.0896, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6e-5, train/loss_step=0.0139, global_step=4642.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:   6%|▌         | 341/5971 [04:24<1:12:30,  1.29it/s, loss=0.103, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.00122, train/loss_step=0.278, global_step=4643.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 342/5971 [04:25<1:12:31,  1.29it/s, loss=0.1, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000385, train/loss_step=0.117, global_step=4643.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   6%|▌         | 343/5971 [04:26<1:12:32,  1.29it/s, loss=0.113, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00112, train/loss_step=0.269, global_step=4643.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 344/5971 [04:28<1:12:58,  1.29it/s, loss=0.113, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00112, train/loss_step=0.269, global_step=4643.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 344/5971 [04:28<1:12:58,  1.29it/s, loss=0.128, v_num=0, train/loss_simple_step=0.359, train/loss_vlb_step=0.00159, train/loss_step=0.359, global_step=4643.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 345/5971 [04:29<1:12:59,  1.28it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0651, train/loss_vlb_step=0.000217, train/loss_step=0.0651, global_step=4644.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 346/5971 [04:30<1:13:00,  1.28it/s, loss=0.138, v_num=0, train/loss_simple_step=0.500, train/loss_vlb_step=0.00311, train/loss_step=0.500, global_step=4644.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:   6%|▌         | 347/5971 [04:31<1:13:01,  1.28it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00563, train/loss_vlb_step=2.94e-5, train/loss_step=0.00563, global_step=4644.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 348/5971 [04:33<1:13:23,  1.28it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00563, train/loss_vlb_step=2.94e-5, train/loss_step=0.00563, global_step=4644.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 348/5971 [04:33<1:13:23,  1.28it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00414, train/loss_vlb_step=2.14e-5, train/loss_step=0.00414, global_step=4644.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 349/5971 [04:34<1:13:24,  1.28it/s, loss=0.124, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000436, train/loss_step=0.132, global_step=4645.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:   6%|▌         | 350/5971 [04:35<1:13:25,  1.28it/s, loss=0.133, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000852, train/loss_step=0.196, global_step=4645.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 351/5971 [04:35<1:13:26,  1.28it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00704, train/loss_vlb_step=3.34e-5, train/loss_step=0.00704, global_step=4645.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 352/5971 [04:38<1:13:46,  1.27it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00704, train/loss_vlb_step=3.34e-5, train/loss_step=0.00704, global_step=4645.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 352/5971 [04:38<1:13:46,  1.27it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00277, train/loss_vlb_step=1.47e-5, train/loss_step=0.00277, global_step=4645.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 353/5971 [04:38<1:13:47,  1.27it/s, loss=0.132, v_num=0, train/loss_simple_step=0.277, train/loss_vlb_step=0.00123, train/loss_step=0.277, global_step=4646.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:   6%|▌         | 354/5971 [04:39<1:13:48,  1.27it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.27e-5, train/loss_step=0.0119, global_step=4646.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 355/5971 [04:40<1:13:48,  1.27it/s, loss=0.132, v_num=0, train/loss_simple_step=0.043, train/loss_vlb_step=0.000158, train/loss_step=0.043, global_step=4646.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   6%|▌         | 356/5971 [04:42<1:14:08,  1.26it/s, loss=0.132, v_num=0, train/loss_simple_step=0.043, train/loss_vlb_step=0.000158, train/loss_step=0.043, global_step=4646.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 356/5971 [04:42<1:14:08,  1.26it/s, loss=0.152, v_num=0, train/loss_simple_step=0.643, train/loss_vlb_step=0.0105, train/loss_step=0.643, global_step=4646.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:   6%|▌         | 357/5971 [04:43<1:14:09,  1.26it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0641, train/loss_vlb_step=0.000216, train/loss_step=0.0641, global_step=4647.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 358/5971 [04:44<1:14:09,  1.26it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0877, train/loss_vlb_step=0.000289, train/loss_step=0.0877, global_step=4647.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 359/5971 [04:45<1:14:10,  1.26it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0526, train/loss_vlb_step=0.000183, train/loss_step=0.0526, global_step=4647.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 360/5971 [04:48<1:14:36,  1.25it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0526, train/loss_vlb_step=0.000183, train/loss_step=0.0526, global_step=4647.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 360/5971 [04:48<1:14:36,  1.25it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00794, train/loss_vlb_step=3.96e-5, train/loss_step=0.00794, global_step=4647.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 361/5971 [04:48<1:14:37,  1.25it/s, loss=0.142, v_num=0, train/loss_simple_step=0.00343, train/loss_vlb_step=1.8e-5, train/loss_step=0.00343, global_step=4648.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   6%|▌         | 362/5971 [04:49<1:14:37,  1.25it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0343, train/loss_vlb_step=0.000133, train/loss_step=0.0343, global_step=4648.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 363/5971 [04:50<1:14:37,  1.25it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0137, train/loss_vlb_step=5.9e-5, train/loss_step=0.0137, global_step=4648.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:   6%|▌         | 364/5971 [04:52<1:14:57,  1.25it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0137, train/loss_vlb_step=5.9e-5, train/loss_step=0.0137, global_step=4648.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 364/5971 [04:52<1:14:57,  1.25it/s, loss=0.131, v_num=0, train/loss_simple_step=0.468, train/loss_vlb_step=0.0046, train/loss_step=0.468, global_step=4648.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:   6%|▌         | 365/5971 [04:53<1:14:58,  1.25it/s, loss=0.135, v_num=0, train/loss_simple_step=0.152, train/loss_vlb_step=0.000515, train/loss_step=0.152, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 366/5971 [04:54<1:14:58,  1.25it/s, loss=0.125, v_num=0, train/loss_simple_step=0.297, train/loss_vlb_step=0.00134, train/loss_step=0.297, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   6%|▌         | 367/5971 [04:55<1:14:59,  1.25it/s, loss=0.147, v_num=0, train/loss_simple_step=0.445, train/loss_vlb_step=0.00369, train/loss_step=0.445, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 368/5971 [04:57<1:15:19,  1.24it/s, loss=0.147, v_num=0, train/loss_simple_step=0.445, train/loss_vlb_step=0.00369, train/loss_step=0.445, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   6%|▌         | 368/5971 [04:57<1:15:19,  1.24it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:13,  2.26it/s]

Validating:   1%|          | 2/167 [00:00<00:52,  3.16it/s]
Epoch 8:   6%|▌         | 372/5971 [04:58<1:14:39,  1.25it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   3%|▎         | 5/167 [00:00<00:19,  8.27it/s]
Epoch 8:   6%|▋         | 376/5971 [04:58<1:13:50,  1.26it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   5%|▍         | 8/167 [00:00<00:12, 12.35it/s]

Validating:   7%|▋         | 11/167 [00:01<00:09, 15.98it/s]
Epoch 8:   6%|▋         | 380/5971 [04:58<1:13:03,  1.28it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   8%|▊         | 14/167 [00:01<00:08, 18.97it/s]
Epoch 8:   6%|▋         | 384/5971 [04:58<1:12:16,  1.29it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  10%|█         | 17/167 [00:01<00:07, 20.12it/s]
Epoch 8:   6%|▋         | 388/5971 [04:59<1:11:31,  1.30it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 21.78it/s]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 23.21it/s]
Epoch 8:   7%|▋         | 392/5971 [04:59<1:10:46,  1.31it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 23.53it/s]
Epoch 8:   7%|▋         | 396/5971 [04:59<1:10:03,  1.33it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  17%|█▋        | 29/167 [00:01<00:06, 22.38it/s]
Epoch 8:   7%|▋         | 400/5971 [04:59<1:09:21,  1.34it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 23.33it/s]
Epoch 8:   7%|▋         | 404/5971 [04:59<1:08:38,  1.35it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  22%|██▏       | 36/167 [00:02<00:05, 25.52it/s]

Validating:  23%|██▎       | 39/167 [00:02<00:05, 25.29it/s]
Epoch 8:   7%|▋         | 408/5971 [04:59<1:07:57,  1.36it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 25.29it/s]
Epoch 8:   7%|▋         | 412/5971 [04:59<1:07:17,  1.38it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 25.36it/s]
Epoch 8:   7%|▋         | 416/5971 [05:00<1:06:37,  1.39it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 25.76it/s]

Validating:  31%|███       | 51/167 [00:02<00:04, 24.98it/s]
Epoch 8:   7%|▋         | 420/5971 [05:00<1:05:59,  1.40it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 25.09it/s]
Epoch 8:   7%|▋         | 424/5971 [05:00<1:05:21,  1.41it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  35%|███▍      | 58/167 [00:02<00:04, 25.83it/s]
Epoch 8:   7%|▋         | 428/5971 [05:00<1:04:43,  1.43it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  37%|███▋      | 61/167 [00:02<00:03, 26.51it/s]
Epoch 8:   7%|▋         | 432/5971 [05:00<1:04:06,  1.44it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  38%|███▊      | 64/167 [00:03<00:03, 26.16it/s]

Validating:  40%|████      | 67/167 [00:03<00:03, 26.72it/s]
Epoch 8:   7%|▋         | 436/5971 [05:00<1:03:30,  1.45it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 26.70it/s]
Epoch 8:   7%|▋         | 440/5971 [05:01<1:02:55,  1.47it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 27.32it/s]
Epoch 8:   7%|▋         | 444/5971 [05:01<1:02:20,  1.48it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 25.51it/s]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 25.67it/s]
Epoch 8:   8%|▊         | 448/5971 [05:01<1:01:46,  1.49it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  49%|████▉     | 82/167 [00:03<00:03, 25.44it/s]
Epoch 8:   8%|▊         | 452/5971 [05:01<1:01:13,  1.50it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  51%|█████     | 85/167 [00:03<00:03, 25.55it/s]
Epoch 8:   8%|▊         | 456/5971 [05:01<1:00:40,  1.52it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  53%|█████▎    | 88/167 [00:04<00:03, 24.87it/s]

Validating:  54%|█████▍    | 91/167 [00:04<00:02, 25.81it/s]
Epoch 8:   8%|▊         | 460/5971 [05:01<1:00:08,  1.53it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 25.37it/s]
Epoch 8:   8%|▊         | 464/5971 [05:01<59:36,  1.54it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 25.69it/s]
Epoch 8:   8%|▊         | 468/5971 [05:02<59:04,  1.55it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 26.70it/s]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 27.24it/s]
Epoch 8:   8%|▊         | 472/5971 [05:02<58:33,  1.56it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 26.17it/s]
Epoch 8:   8%|▊         | 476/5971 [05:02<58:03,  1.58it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 26.57it/s]
Epoch 8:   8%|▊         | 480/5971 [05:02<57:34,  1.59it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  67%|██████▋   | 112/167 [00:04<00:02, 25.82it/s]

Validating:  69%|██████▉   | 115/167 [00:05<00:01, 26.03it/s]
Epoch 8:   8%|▊         | 484/5971 [05:02<57:04,  1.60it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  71%|███████   | 118/167 [00:05<00:02, 22.68it/s]
Epoch 8:   8%|▊         | 488/5971 [05:02<56:36,  1.61it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 23.78it/s]
Epoch 8:   8%|▊         | 492/5971 [05:03<56:08,  1.63it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 23.37it/s]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 23.41it/s]
Epoch 8:   8%|▊         | 496/5971 [05:03<55:40,  1.64it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 24.03it/s]
Epoch 8:   8%|▊         | 500/5971 [05:03<55:13,  1.65it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 24.16it/s]
Epoch 8:   8%|▊         | 504/5971 [05:03<54:46,  1.66it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 24.84it/s]

Validating:  83%|████████▎ | 139/167 [00:06<00:01, 26.11it/s]
Epoch 8:   9%|▊         | 508/5971 [05:03<54:19,  1.68it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  85%|████████▌ | 142/167 [00:06<00:00, 27.08it/s]
Epoch 8:   9%|▊         | 512/5971 [05:03<53:53,  1.69it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 26.61it/s]
Epoch 8:   9%|▊         | 516/5971 [05:04<53:27,  1.70it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 25.92it/s]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 26.16it/s]
Epoch 8:   9%|▊         | 520/5971 [05:04<53:02,  1.71it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 24.56it/s]
Epoch 8:   9%|▉         | 524/5971 [05:04<52:37,  1.72it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 23.53it/s]
Epoch 8:   9%|▉         | 528/5971 [05:04<52:13,  1.74it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 24.69it/s]

Validating:  98%|█████████▊| 163/167 [00:07<00:00, 24.22it/s]
Epoch 8:   9%|▉         | 532/5971 [05:04<51:49,  1.75it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  99%|█████████▉| 166/167 [00:07<00:00, 23.59it/s]
Epoch 8:   9%|▉         | 536/5971 [05:04<51:25,  1.76it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   9%|▉         | 536/5971 [05:05<51:29,  1.76it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.63e-5, train/loss_step=0.0149, global_step=4649.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Epoch 8:   9%|▉         | 537/5971 [05:06<51:33,  1.76it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0561, train/loss_vlb_step=0.000185, train/loss_step=0.0561, global_step=4650.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   9%|▉         | 538/5971 [05:07<51:36,  1.75it/s, loss=0.158, v_num=0, train/loss_simple_step=0.467, train/loss_vlb_step=0.00289, train/loss_step=0.467, global_step=4650.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:   9%|▉         | 539/5971 [05:08<51:38,  1.75it/s, loss=0.171, v_num=0, train/loss_simple_step=0.279, train/loss_vlb_step=0.00116, train/loss_step=0.279, global_step=4650.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   9%|▉         | 540/5971 [05:10<51:53,  1.74it/s, loss=0.171, v_num=0, train/loss_simple_step=0.279, train/loss_vlb_step=0.00116, train/loss_step=0.279, global_step=4650.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   9%|▉         | 540/5971 [05:10<51:53,  1.74it/s, loss=0.171, v_num=0, train/loss_simple_step=0.00135, train/loss_vlb_step=8.07e-6, train/loss_step=0.00135, global_step=4650.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   9%|▉         | 541/5971 [05:11<51:56,  1.74it/s, loss=0.165, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000503, train/loss_step=0.150, global_step=4651.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:   9%|▉         | 542/5971 [05:11<51:58,  1.74it/s, loss=0.185, v_num=0, train/loss_simple_step=0.417, train/loss_vlb_step=0.00207, train/loss_step=0.417, global_step=4651.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   9%|▉         | 543/5971 [05:12<52:00,  1.74it/s, loss=0.209, v_num=0, train/loss_simple_step=0.526, train/loss_vlb_step=0.00657, train/loss_step=0.526, global_step=4651.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   9%|▉         | 544/5971 [05:14<52:15,  1.73it/s, loss=0.209, v_num=0, train/loss_simple_step=0.526, train/loss_vlb_step=0.00657, train/loss_step=0.526, global_step=4651.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   9%|▉         | 544/5971 [05:14<52:15,  1.73it/s, loss=0.177, v_num=0, train/loss_simple_step=0.00218, train/loss_vlb_step=1.25e-5, train/loss_step=0.00218, global_step=4651.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   9%|▉         | 545/5971 [05:15<52:18,  1.73it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.47e-5, train/loss_step=0.0132, global_step=4652.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:   9%|▉         | 546/5971 [05:16<52:20,  1.73it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00602, train/loss_vlb_step=3.06e-5, train/loss_step=0.00602, global_step=4652.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   9%|▉         | 547/5971 [05:17<52:22,  1.73it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0117, train/loss_vlb_step=5.52e-5, train/loss_step=0.0117, global_step=4652.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   9%|▉         | 548/5971 [05:19<52:37,  1.72it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0117, train/loss_vlb_step=5.52e-5, train/loss_step=0.0117, global_step=4652.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   9%|▉         | 548/5971 [05:19<52:37,  1.72it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00168, train/loss_vlb_step=9.83e-6, train/loss_step=0.00168, global_step=4652.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   9%|▉         | 549/5971 [05:20<52:39,  1.72it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.98e-5, train/loss_step=0.0142, global_step=4653.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:   9%|▉         | 550/5971 [05:21<52:41,  1.71it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00717, train/loss_vlb_step=3.31e-5, train/loss_step=0.00717, global_step=4653.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   9%|▉         | 551/5971 [05:22<52:44,  1.71it/s, loss=0.177, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.000714, train/loss_step=0.210, global_step=4653.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:   9%|▉         | 552/5971 [05:24<52:58,  1.70it/s, loss=0.177, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.000714, train/loss_step=0.210, global_step=4653.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   9%|▉         | 552/5971 [05:24<52:58,  1.70it/s, loss=0.165, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.00104, train/loss_step=0.230, global_step=4653.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:   9%|▉         | 553/5971 [05:25<53:00,  1.70it/s, loss=0.162, v_num=0, train/loss_simple_step=0.097, train/loss_vlb_step=0.000319, train/loss_step=0.097, global_step=4654.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   9%|▉         | 554/5971 [05:26<53:02,  1.70it/s, loss=0.185, v_num=0, train/loss_simple_step=0.759, train/loss_vlb_step=0.0266, train/loss_step=0.759, global_step=4654.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:   9%|▉         | 555/5971 [05:26<53:05,  1.70it/s, loss=0.201, v_num=0, train/loss_simple_step=0.751, train/loss_vlb_step=0.0302, train/loss_step=0.751, global_step=4654.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   9%|▉         | 556/5971 [05:29<53:22,  1.69it/s, loss=0.201, v_num=0, train/loss_simple_step=0.751, train/loss_vlb_step=0.0302, train/loss_step=0.751, global_step=4654.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   9%|▉         | 556/5971 [05:29<53:22,  1.69it/s, loss=0.206, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000417, train/loss_step=0.125, global_step=4654.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   9%|▉         | 557/5971 [05:30<53:24,  1.69it/s, loss=0.204, v_num=0, train/loss_simple_step=0.00506, train/loss_vlb_step=2.49e-5, train/loss_step=0.00506, global_step=4655.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   9%|▉         | 558/5971 [05:31<53:26,  1.69it/s, loss=0.217, v_num=0, train/loss_simple_step=0.726, train/loss_vlb_step=0.0214, train/loss_step=0.726, global_step=4655.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]     
Epoch 8:   9%|▉         | 559/5971 [05:32<53:28,  1.69it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0757, train/loss_vlb_step=0.000264, train/loss_step=0.0757, global_step=4655.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   9%|▉         | 560/5971 [05:34<53:43,  1.68it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0757, train/loss_vlb_step=0.000264, train/loss_step=0.0757, global_step=4655.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   9%|▉         | 560/5971 [05:34<53:43,  1.68it/s, loss=0.207, v_num=0, train/loss_simple_step=0.00523, train/loss_vlb_step=2.6e-5, train/loss_step=0.00523, global_step=4655.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   9%|▉         | 561/5971 [05:35<53:45,  1.68it/s, loss=0.205, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.000412, train/loss_step=0.125, global_step=4656.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:   9%|▉         | 562/5971 [05:35<53:47,  1.68it/s, loss=0.186, v_num=0, train/loss_simple_step=0.031, train/loss_vlb_step=0.000119, train/loss_step=0.031, global_step=4656.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   9%|▉         | 563/5971 [05:36<53:50,  1.67it/s, loss=0.195, v_num=0, train/loss_simple_step=0.695, train/loss_vlb_step=0.026, train/loss_step=0.695, global_step=4656.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:   9%|▉         | 564/5971 [05:39<54:04,  1.67it/s, loss=0.195, v_num=0, train/loss_simple_step=0.695, train/loss_vlb_step=0.026, train/loss_step=0.695, global_step=4656.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   9%|▉         | 564/5971 [05:39<54:04,  1.67it/s, loss=0.195, v_num=0, train/loss_simple_step=0.00223, train/loss_vlb_step=1.26e-5, train/loss_step=0.00223, global_step=4656.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   9%|▉         | 565/5971 [05:39<54:07,  1.66it/s, loss=0.242, v_num=0, train/loss_simple_step=0.959, train/loss_vlb_step=0.483, train/loss_step=0.959, global_step=4657.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]      
Epoch 8:   9%|▉         | 566/5971 [05:40<54:09,  1.66it/s, loss=0.242, v_num=0, train/loss_simple_step=0.0134, train/loss_vlb_step=5.6e-5, train/loss_step=0.0134, global_step=4657.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:   9%|▉         | 567/5971 [05:41<54:10,  1.66it/s, loss=0.242, v_num=0, train/loss_simple_step=0.00162, train/loss_vlb_step=9.71e-6, train/loss_step=0.00162, global_step=4657.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|▉         | 568/5971 [05:43<54:24,  1.66it/s, loss=0.242, v_num=0, train/loss_simple_step=0.00162, train/loss_vlb_step=9.71e-6, train/loss_step=0.00162, global_step=4657.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|▉         | 568/5971 [05:43<54:24,  1.66it/s, loss=0.242, v_num=0, train/loss_simple_step=0.00286, train/loss_vlb_step=1.55e-5, train/loss_step=0.00286, global_step=4657.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|▉         | 569/5971 [05:44<54:26,  1.65it/s, loss=0.241, v_num=0, train/loss_simple_step=0.0025, train/loss_vlb_step=1.4e-5, train/loss_step=0.0025, global_step=4658.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  10%|▉         | 570/5971 [05:45<54:28,  1.65it/s, loss=0.251, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.000761, train/loss_step=0.197, global_step=4658.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|▉         | 571/5971 [05:46<54:30,  1.65it/s, loss=0.25, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.000625, train/loss_step=0.186, global_step=4658.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  10%|▉         | 572/5971 [05:48<54:44,  1.64it/s, loss=0.25, v_num=0, train/loss_simple_step=0.186, train/loss_vlb_step=0.000625, train/loss_step=0.186, global_step=4658.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|▉         | 572/5971 [05:48<54:44,  1.64it/s, loss=0.239, v_num=0, train/loss_simple_step=0.0279, train/loss_vlb_step=0.000103, train/loss_step=0.0279, global_step=4658.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|▉         | 573/5971 [05:49<54:45,  1.64it/s, loss=0.235, v_num=0, train/loss_simple_step=0.00405, train/loss_vlb_step=2.11e-5, train/loss_step=0.00405, global_step=4659.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|▉         | 574/5971 [05:50<54:47,  1.64it/s, loss=0.204, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000475, train/loss_step=0.141, global_step=4659.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  10%|▉         | 575/5971 [05:51<54:49,  1.64it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0841, train/loss_vlb_step=0.000276, train/loss_step=0.0841, global_step=4659.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|▉         | 576/5971 [05:53<55:03,  1.63it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0841, train/loss_vlb_step=0.000276, train/loss_step=0.0841, global_step=4659.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|▉         | 576/5971 [05:53<55:03,  1.63it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0614, train/loss_vlb_step=0.000206, train/loss_step=0.0614, global_step=4659.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|▉         | 577/5971 [05:54<55:05,  1.63it/s, loss=0.193, v_num=0, train/loss_simple_step=0.523, train/loss_vlb_step=0.00494, train/loss_step=0.523, global_step=4660.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  10%|▉         | 578/5971 [05:55<55:06,  1.63it/s, loss=0.188, v_num=0, train/loss_simple_step=0.616, train/loss_vlb_step=0.0133, train/loss_step=0.616, global_step=4660.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  10%|▉         | 579/5971 [05:55<55:08,  1.63it/s, loss=0.19, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.00041, train/loss_step=0.123, global_step=4660.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|▉         | 580/5971 [05:58<55:22,  1.62it/s, loss=0.19, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.00041, train/loss_step=0.123, global_step=4660.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|▉         | 580/5971 [05:58<55:22,  1.62it/s, loss=0.203, v_num=0, train/loss_simple_step=0.254, train/loss_vlb_step=0.000953, train/loss_step=0.254, global_step=4660.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|▉         | 581/5971 [05:58<55:24,  1.62it/s, loss=0.216, v_num=0, train/loss_simple_step=0.404, train/loss_vlb_step=0.0023, train/loss_step=0.404, global_step=4661.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  10%|▉         | 582/5971 [05:59<55:25,  1.62it/s, loss=0.256, v_num=0, train/loss_simple_step=0.819, train/loss_vlb_step=0.0424, train/loss_step=0.819, global_step=4661.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|▉         | 583/5971 [06:00<55:27,  1.62it/s, loss=0.226, v_num=0, train/loss_simple_step=0.0948, train/loss_vlb_step=0.000316, train/loss_step=0.0948, global_step=4661.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|▉         | 584/5971 [06:03<55:44,  1.61it/s, loss=0.226, v_num=0, train/loss_simple_step=0.0948, train/loss_vlb_step=0.000316, train/loss_step=0.0948, global_step=4661.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|▉         | 584/5971 [06:03<55:44,  1.61it/s, loss=0.228, v_num=0, train/loss_simple_step=0.0402, train/loss_vlb_step=0.000152, train/loss_step=0.0402, global_step=4661.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|▉         | 585/5971 [06:04<55:46,  1.61it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=5.08e-5, train/loss_step=0.0115, global_step=4662.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  10%|▉         | 586/5971 [06:04<55:48,  1.61it/s, loss=0.189, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.00061, train/loss_step=0.180, global_step=4662.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  10%|▉         | 587/5971 [06:05<55:49,  1.61it/s, loss=0.189, v_num=0, train/loss_simple_step=0.00372, train/loss_vlb_step=1.98e-5, train/loss_step=0.00372, global_step=4662.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|▉         | 588/5971 [06:07<56:02,  1.60it/s, loss=0.189, v_num=0, train/loss_simple_step=0.00372, train/loss_vlb_step=1.98e-5, train/loss_step=0.00372, global_step=4662.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|▉         | 588/5971 [06:07<56:02,  1.60it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0459, train/loss_vlb_step=0.000153, train/loss_step=0.0459, global_step=4662.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  10%|▉         | 589/5971 [06:08<56:04,  1.60it/s, loss=0.192, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=6.75e-5, train/loss_step=0.016, global_step=4663.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  10%|▉         | 590/5971 [06:09<56:05,  1.60it/s, loss=0.182, v_num=0, train/loss_simple_step=0.00396, train/loss_vlb_step=2.07e-5, train/loss_step=0.00396, global_step=4663.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|▉         | 591/5971 [06:10<56:07,  1.60it/s, loss=0.173, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.17e-5, train/loss_step=0.004, global_step=4663.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  10%|▉         | 592/5971 [06:12<56:21,  1.59it/s, loss=0.173, v_num=0, train/loss_simple_step=0.004, train/loss_vlb_step=2.17e-5, train/loss_step=0.004, global_step=4663.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|▉         | 592/5971 [06:12<56:21,  1.59it/s, loss=0.188, v_num=0, train/loss_simple_step=0.331, train/loss_vlb_step=0.0015, train/loss_step=0.331, global_step=4663.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  10%|▉         | 593/5971 [06:13<56:22,  1.59it/s, loss=0.2, v_num=0, train/loss_simple_step=0.238, train/loss_vlb_step=0.000876, train/loss_step=0.238, global_step=4664.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|▉         | 594/5971 [06:14<56:24,  1.59it/s, loss=0.22, v_num=0, train/loss_simple_step=0.548, train/loss_vlb_step=0.00617, train/loss_step=0.548, global_step=4664.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|▉         | 595/5971 [06:15<56:25,  1.59it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0205, train/loss_vlb_step=8.5e-5, train/loss_step=0.0205, global_step=4664.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|▉         | 596/5971 [06:17<56:38,  1.58it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0205, train/loss_vlb_step=8.5e-5, train/loss_step=0.0205, global_step=4664.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|▉         | 596/5971 [06:17<56:38,  1.58it/s, loss=0.216, v_num=0, train/loss_simple_step=0.044, train/loss_vlb_step=0.000146, train/loss_step=0.044, global_step=4664.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|▉         | 597/5971 [06:18<56:41,  1.58it/s, loss=0.216, v_num=0, train/loss_simple_step=0.531, train/loss_vlb_step=0.00651, train/loss_step=0.531, global_step=4665.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  10%|█         | 598/5971 [06:19<56:43,  1.58it/s, loss=0.227, v_num=0, train/loss_simple_step=0.819, train/loss_vlb_step=0.0527, train/loss_step=0.819, global_step=4665.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  10%|█         | 599/5971 [06:20<56:44,  1.58it/s, loss=0.225, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000334, train/loss_step=0.101, global_step=4665.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|█         | 600/5971 [06:22<56:56,  1.57it/s, loss=0.225, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000334, train/loss_step=0.101, global_step=4665.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|█         | 600/5971 [06:22<56:56,  1.57it/s, loss=0.244, v_num=0, train/loss_simple_step=0.622, train/loss_vlb_step=0.00763, train/loss_step=0.622, global_step=4665.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  10%|█         | 601/5971 [06:23<56:59,  1.57it/s, loss=0.226, v_num=0, train/loss_simple_step=0.0545, train/loss_vlb_step=0.000194, train/loss_step=0.0545, global_step=4666.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|█         | 602/5971 [06:24<57:00,  1.57it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00639, train/loss_vlb_step=2.86e-5, train/loss_step=0.00639, global_step=4666.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|█         | 603/5971 [06:25<57:02,  1.57it/s, loss=0.187, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.00037, train/loss_step=0.113, global_step=4666.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  10%|█         | 604/5971 [06:27<57:16,  1.56it/s, loss=0.187, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.00037, train/loss_step=0.113, global_step=4666.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|█         | 604/5971 [06:27<57:16,  1.56it/s, loss=0.197, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000938, train/loss_step=0.240, global_step=4666.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|█         | 605/5971 [06:28<57:18,  1.56it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0214, train/loss_vlb_step=8.51e-5, train/loss_step=0.0214, global_step=4667.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|█         | 606/5971 [06:29<57:19,  1.56it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0702, train/loss_vlb_step=0.000242, train/loss_step=0.0702, global_step=4667.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|█         | 607/5971 [06:30<57:20,  1.56it/s, loss=0.201, v_num=0, train/loss_simple_step=0.195, train/loss_vlb_step=0.000693, train/loss_step=0.195, global_step=4667.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  10%|█         | 608/5971 [06:32<57:33,  1.55it/s, loss=0.201, v_num=0, train/loss_simple_step=0.195, train/loss_vlb_step=0.000693, train/loss_step=0.195, global_step=4667.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|█         | 608/5971 [06:32<57:33,  1.55it/s, loss=0.217, v_num=0, train/loss_simple_step=0.369, train/loss_vlb_step=0.00168, train/loss_step=0.369, global_step=4667.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  10%|█         | 609/5971 [06:33<57:35,  1.55it/s, loss=0.22, v_num=0, train/loss_simple_step=0.0683, train/loss_vlb_step=0.000228, train/loss_step=0.0683, global_step=4668.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|█         | 610/5971 [06:33<57:36,  1.55it/s, loss=0.229, v_num=0, train/loss_simple_step=0.182, train/loss_vlb_step=0.000618, train/loss_step=0.182, global_step=4668.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  10%|█         | 611/5971 [06:34<57:37,  1.55it/s, loss=0.229, v_num=0, train/loss_simple_step=0.00329, train/loss_vlb_step=1.8e-5, train/loss_step=0.00329, global_step=4668.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|█         | 612/5971 [06:37<57:52,  1.54it/s, loss=0.229, v_num=0, train/loss_simple_step=0.00329, train/loss_vlb_step=1.8e-5, train/loss_step=0.00329, global_step=4668.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|█         | 612/5971 [06:37<57:52,  1.54it/s, loss=0.212, v_num=0, train/loss_simple_step=0.00224, train/loss_vlb_step=1.25e-5, train/loss_step=0.00224, global_step=4668.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|█         | 613/5971 [06:38<57:54,  1.54it/s, loss=0.222, v_num=0, train/loss_simple_step=0.424, train/loss_vlb_step=0.00213, train/loss_step=0.424, global_step=4669.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  10%|█         | 614/5971 [06:38<57:55,  1.54it/s, loss=0.219, v_num=0, train/loss_simple_step=0.493, train/loss_vlb_step=0.00342, train/loss_step=0.493, global_step=4669.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|█         | 615/5971 [06:39<57:56,  1.54it/s, loss=0.218, v_num=0, train/loss_simple_step=0.00187, train/loss_vlb_step=1.11e-5, train/loss_step=0.00187, global_step=4669.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|█         | 616/5971 [06:42<58:09,  1.53it/s, loss=0.218, v_num=0, train/loss_simple_step=0.00187, train/loss_vlb_step=1.11e-5, train/loss_step=0.00187, global_step=4669.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|█         | 616/5971 [06:42<58:09,  1.53it/s, loss=0.217, v_num=0, train/loss_simple_step=0.0189, train/loss_vlb_step=7.55e-5, train/loss_step=0.0189, global_step=4669.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  10%|█         | 617/5971 [06:42<58:11,  1.53it/s, loss=0.197, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000469, train/loss_step=0.142, global_step=4670.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  10%|█         | 618/5971 [06:43<58:12,  1.53it/s, loss=0.174, v_num=0, train/loss_simple_step=0.344, train/loss_vlb_step=0.00173, train/loss_step=0.344, global_step=4670.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  10%|█         | 619/5971 [06:44<58:13,  1.53it/s, loss=0.207, v_num=0, train/loss_simple_step=0.779, train/loss_vlb_step=0.0273, train/loss_step=0.779, global_step=4670.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  10%|█         | 620/5971 [06:46<58:25,  1.53it/s, loss=0.207, v_num=0, train/loss_simple_step=0.779, train/loss_vlb_step=0.0273, train/loss_step=0.779, global_step=4670.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|█         | 620/5971 [06:46<58:25,  1.53it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0246, train/loss_vlb_step=0.000102, train/loss_step=0.0246, global_step=4670.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|█         | 621/5971 [06:47<58:27,  1.53it/s, loss=0.175, v_num=0, train/loss_simple_step=0.00192, train/loss_vlb_step=1.14e-5, train/loss_step=0.00192, global_step=4671.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|█         | 622/5971 [06:48<58:29,  1.52it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0566, train/loss_vlb_step=0.00019, train/loss_step=0.0566, global_step=4671.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  10%|█         | 623/5971 [06:49<58:30,  1.52it/s, loss=0.179, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000485, train/loss_step=0.148, global_step=4671.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  10%|█         | 624/5971 [06:51<58:44,  1.52it/s, loss=0.179, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000485, train/loss_step=0.148, global_step=4671.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|█         | 624/5971 [06:51<58:44,  1.52it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00104, train/loss_vlb_step=6.34e-6, train/loss_step=0.00104, global_step=4671.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  10%|█         | 625/5971 [06:52<58:45,  1.52it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0349, train/loss_vlb_step=0.000129, train/loss_step=0.0349, global_step=4672.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  10%|█         | 626/5971 [06:53<58:46,  1.52it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0214, train/loss_vlb_step=8.56e-5, train/loss_step=0.0214, global_step=4672.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  11%|█         | 627/5971 [06:54<58:48,  1.51it/s, loss=0.163, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000504, train/loss_step=0.153, global_step=4672.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  11%|█         | 628/5971 [06:56<59:00,  1.51it/s, loss=0.163, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000504, train/loss_step=0.153, global_step=4672.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  11%|█         | 628/5971 [06:56<59:00,  1.51it/s, loss=0.186, v_num=0, train/loss_simple_step=0.813, train/loss_vlb_step=0.0421, train/loss_step=0.813, global_step=4672.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  11%|█         | 629/5971 [06:57<59:01,  1.51it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00881, train/loss_vlb_step=3.97e-5, train/loss_step=0.00881, global_step=4673.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  11%|█         | 630/5971 [06:58<59:02,  1.51it/s, loss=0.174, v_num=0, train/loss_simple_step=0.00287, train/loss_vlb_step=1.65e-5, train/loss_step=0.00287, global_step=4673.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  11%|█         | 631/5971 [06:59<59:03,  1.51it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0474, train/loss_vlb_step=0.00016, train/loss_step=0.0474, global_step=4673.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  11%|█         | 632/5971 [07:01<59:17,  1.50it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0474, train/loss_vlb_step=0.00016, train/loss_step=0.0474, global_step=4673.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  11%|█         | 632/5971 [07:01<59:17,  1.50it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0491, train/loss_vlb_step=0.000176, train/loss_step=0.0491, global_step=4673.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  11%|█         | 633/5971 [07:02<59:18,  1.50it/s, loss=0.177, v_num=0, train/loss_simple_step=0.406, train/loss_vlb_step=0.00308, train/loss_step=0.406, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  11%|█         | 634/5971 [07:03<59:19,  1.50it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0239, train/loss_vlb_step=8.45e-5, train/loss_step=0.0239, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  11%|█         | 635/5971 [07:04<59:20,  1.50it/s, loss=0.156, v_num=0, train/loss_simple_step=0.037, train/loss_vlb_step=0.000136, train/loss_step=0.037, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  11%|█         | 636/5971 [07:06<59:32,  1.49it/s, loss=0.156, v_num=0, train/loss_simple_step=0.037, train/loss_vlb_step=0.000136, train/loss_step=0.037, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  11%|█         | 636/5971 [07:06<59:32,  1.49it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:05,  2.52it/s]

Validating:   1%|          | 2/167 [00:00<00:41,  3.94it/s]
Epoch 8:  11%|█         | 640/5971 [07:07<59:13,  1.50it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   3%|▎         | 5/167 [00:00<00:16, 10.12it/s]
Epoch 8:  11%|█         | 644/5971 [07:07<58:49,  1.51it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   5%|▍         | 8/167 [00:00<00:10, 14.70it/s]

Validating:   7%|▋         | 11/167 [00:00<00:08, 18.28it/s]
Epoch 8:  11%|█         | 648/5971 [07:07<58:26,  1.52it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   8%|▊         | 14/167 [00:00<00:07, 20.82it/s]
Epoch 8:  11%|█         | 652/5971 [07:07<58:03,  1.53it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  11%|█         | 18/167 [00:01<00:06, 22.64it/s]
Epoch 8:  11%|█         | 656/5971 [07:07<57:41,  1.54it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 23.58it/s]
Epoch 8:  11%|█         | 660/5971 [07:07<57:18,  1.54it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  14%|█▍        | 24/167 [00:01<00:06, 23.33it/s]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 23.75it/s]
Epoch 8:  11%|█         | 664/5971 [07:08<56:56,  1.55it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 24.37it/s]
Epoch 8:  11%|█         | 668/5971 [07:08<56:35,  1.56it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 25.37it/s]
Epoch 8:  11%|█▏        | 672/5971 [07:08<56:13,  1.57it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  22%|██▏       | 36/167 [00:01<00:05, 24.43it/s]
Epoch 8:  11%|█▏        | 676/5971 [07:08<55:52,  1.58it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  24%|██▍       | 40/167 [00:02<00:04, 26.29it/s]

Validating:  26%|██▌       | 43/167 [00:02<00:04, 26.19it/s]
Epoch 8:  11%|█▏        | 680/5971 [07:08<55:31,  1.59it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  28%|██▊       | 46/167 [00:02<00:04, 25.60it/s]
Epoch 8:  11%|█▏        | 684/5971 [07:08<55:10,  1.60it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  29%|██▉       | 49/167 [00:02<00:04, 25.62it/s]
Epoch 8:  12%|█▏        | 688/5971 [07:09<54:49,  1.61it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  31%|███       | 52/167 [00:02<00:04, 26.19it/s]

Validating:  33%|███▎      | 55/167 [00:02<00:04, 26.64it/s]
Epoch 8:  12%|█▏        | 692/5971 [07:09<54:29,  1.61it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  35%|███▍      | 58/167 [00:02<00:04, 25.99it/s]
Epoch 8:  12%|█▏        | 696/5971 [07:09<54:09,  1.62it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  37%|███▋      | 61/167 [00:02<00:04, 26.43it/s]
Epoch 8:  12%|█▏        | 700/5971 [07:09<53:49,  1.63it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  38%|███▊      | 64/167 [00:02<00:04, 25.67it/s]

Validating:  40%|████      | 67/167 [00:03<00:03, 26.64it/s]
Epoch 8:  12%|█▏        | 704/5971 [07:09<53:30,  1.64it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 27.46it/s]
Epoch 8:  12%|█▏        | 708/5971 [07:09<53:10,  1.65it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 27.90it/s]
Epoch 8:  12%|█▏        | 712/5971 [07:09<52:51,  1.66it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 27.46it/s]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 26.65it/s]
Epoch 8:  12%|█▏        | 716/5971 [07:10<52:32,  1.67it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  49%|████▉     | 82/167 [00:03<00:03, 27.02it/s]
Epoch 8:  12%|█▏        | 720/5971 [07:10<52:13,  1.68it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  51%|█████     | 85/167 [00:03<00:03, 26.34it/s]
Epoch 8:  12%|█▏        | 724/5971 [07:10<51:55,  1.68it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  53%|█████▎    | 88/167 [00:03<00:03, 25.77it/s]

Validating:  54%|█████▍    | 91/167 [00:03<00:03, 24.86it/s]
Epoch 8:  12%|█▏        | 728/5971 [07:10<51:36,  1.69it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 25.17it/s]
Epoch 8:  12%|█▏        | 732/5971 [07:10<51:18,  1.70it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 24.60it/s]
Epoch 8:  12%|█▏        | 736/5971 [07:10<51:00,  1.71it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 25.04it/s]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 25.69it/s]
Epoch 8:  12%|█▏        | 740/5971 [07:11<50:43,  1.72it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 24.98it/s]
Epoch 8:  12%|█▏        | 744/5971 [07:11<50:25,  1.73it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 25.47it/s]
Epoch 8:  13%|█▎        | 748/5971 [07:11<50:08,  1.74it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  67%|██████▋   | 112/167 [00:04<00:02, 23.91it/s]

Validating:  69%|██████▉   | 115/167 [00:04<00:02, 23.68it/s]
Epoch 8:  13%|█▎        | 752/5971 [07:11<49:51,  1.74it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  71%|███████   | 118/167 [00:05<00:02, 24.24it/s]
Epoch 8:  13%|█▎        | 756/5971 [07:11<49:34,  1.75it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 24.96it/s]
Epoch 8:  13%|█▎        | 760/5971 [07:11<49:17,  1.76it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 26.09it/s]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 26.03it/s]
Epoch 8:  13%|█▎        | 764/5971 [07:12<49:00,  1.77it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 26.35it/s]
Epoch 8:  13%|█▎        | 768/5971 [07:12<48:44,  1.78it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 26.08it/s]
Epoch 8:  13%|█▎        | 772/5971 [07:12<48:27,  1.79it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 26.49it/s]

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 26.04it/s]
Epoch 8:  13%|█▎        | 776/5971 [07:12<48:11,  1.80it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  85%|████████▌ | 142/167 [00:05<00:00, 27.11it/s]
Epoch 8:  13%|█▎        | 780/5971 [07:12<47:55,  1.81it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 26.88it/s]
Epoch 8:  13%|█▎        | 784/5971 [07:12<47:39,  1.81it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 27.47it/s]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 26.61it/s]
Epoch 8:  13%|█▎        | 788/5971 [07:12<47:23,  1.82it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 25.15it/s]
Epoch 8:  13%|█▎        | 792/5971 [07:13<47:08,  1.83it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 24.50it/s]
Epoch 8:  13%|█▎        | 796/5971 [07:13<46:53,  1.84it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 25.37it/s]
Epoch 8:  13%|█▎        | 800/5971 [07:13<46:37,  1.85it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 27.04it/s]

Validating: 100%|██████████| 167/167 [00:06<00:00, 24.46it/s]
Epoch 8:  13%|█▎        | 804/5971 [07:13<46:23,  1.86it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  13%|█▎        | 804/5971 [07:13<46:25,  1.86it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0392, train/loss_vlb_step=0.000139, train/loss_step=0.0392, global_step=4674.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Epoch 8:  13%|█▎        | 805/5971 [07:14<46:27,  1.85it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00211, train/loss_vlb_step=1.23e-5, train/loss_step=0.00211, global_step=4675.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  13%|█▎        | 806/5971 [07:15<46:29,  1.85it/s, loss=0.143, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000847, train/loss_step=0.205, global_step=4675.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  14%|█▎        | 807/5971 [07:16<46:30,  1.85it/s, loss=0.111, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000498, train/loss_step=0.134, global_step=4675.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▎        | 808/5971 [07:18<46:40,  1.84it/s, loss=0.111, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000498, train/loss_step=0.134, global_step=4675.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▎        | 808/5971 [07:18<46:40,  1.84it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00632, train/loss_vlb_step=3.24e-5, train/loss_step=0.00632, global_step=4675.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▎        | 809/5971 [07:19<46:42,  1.84it/s, loss=0.12, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000711, train/loss_step=0.213, global_step=4676.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  14%|█▎        | 810/5971 [07:20<46:43,  1.84it/s, loss=0.139, v_num=0, train/loss_simple_step=0.428, train/loss_vlb_step=0.00414, train/loss_step=0.428, global_step=4676.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▎        | 811/5971 [07:21<46:45,  1.84it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0256, train/loss_vlb_step=9.8e-5, train/loss_step=0.0256, global_step=4676.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▎        | 812/5971 [07:23<46:55,  1.83it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0256, train/loss_vlb_step=9.8e-5, train/loss_step=0.0256, global_step=4676.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▎        | 812/5971 [07:23<46:55,  1.83it/s, loss=0.133, v_num=0, train/loss_simple_step=0.00477, train/loss_vlb_step=2.32e-5, train/loss_step=0.00477, global_step=4676.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▎        | 813/5971 [07:24<46:57,  1.83it/s, loss=0.149, v_num=0, train/loss_simple_step=0.362, train/loss_vlb_step=0.0031, train/loss_step=0.362, global_step=4677.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]     
Epoch 8:  14%|█▎        | 814/5971 [07:25<46:59,  1.83it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0136, train/loss_vlb_step=5.78e-5, train/loss_step=0.0136, global_step=4677.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▎        | 815/5971 [07:26<47:00,  1.83it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00141, train/loss_vlb_step=8.58e-6, train/loss_step=0.00141, global_step=4677.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▎        | 816/5971 [07:28<47:11,  1.82it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00141, train/loss_vlb_step=8.58e-6, train/loss_step=0.00141, global_step=4677.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▎        | 816/5971 [07:28<47:11,  1.82it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00105, train/loss_vlb_step=6.37e-6, train/loss_step=0.00105, global_step=4677.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▎        | 817/5971 [07:29<47:13,  1.82it/s, loss=0.129, v_num=0, train/loss_simple_step=0.572, train/loss_vlb_step=0.00866, train/loss_step=0.572, global_step=4678.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  14%|█▎        | 818/5971 [07:30<47:14,  1.82it/s, loss=0.154, v_num=0, train/loss_simple_step=0.501, train/loss_vlb_step=0.00479, train/loss_step=0.501, global_step=4678.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▎        | 819/5971 [07:31<47:16,  1.82it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0317, train/loss_vlb_step=0.000118, train/loss_step=0.0317, global_step=4678.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▎        | 820/5971 [07:33<47:27,  1.81it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0317, train/loss_vlb_step=0.000118, train/loss_step=0.0317, global_step=4678.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▎        | 820/5971 [07:33<47:27,  1.81it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00217, train/loss_vlb_step=1.29e-5, train/loss_step=0.00217, global_step=4678.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▎        | 821/5971 [07:34<47:29,  1.81it/s, loss=0.14, v_num=0, train/loss_simple_step=0.199, train/loss_vlb_step=0.000707, train/loss_step=0.199, global_step=4679.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  14%|█▍        | 822/5971 [07:35<47:30,  1.81it/s, loss=0.144, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.00037, train/loss_step=0.112, global_step=4679.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 823/5971 [07:36<47:32,  1.80it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0462, train/loss_vlb_step=0.000165, train/loss_step=0.0462, global_step=4679.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 824/5971 [07:38<47:41,  1.80it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0462, train/loss_vlb_step=0.000165, train/loss_step=0.0462, global_step=4679.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 824/5971 [07:38<47:41,  1.80it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00876, train/loss_vlb_step=4.04e-5, train/loss_step=0.00876, global_step=4679.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 825/5971 [07:39<47:42,  1.80it/s, loss=0.171, v_num=0, train/loss_simple_step=0.552, train/loss_vlb_step=0.00535, train/loss_step=0.552, global_step=4680.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  14%|█▍        | 826/5971 [07:40<47:44,  1.80it/s, loss=0.165, v_num=0, train/loss_simple_step=0.086, train/loss_vlb_step=0.000286, train/loss_step=0.086, global_step=4680.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 827/5971 [07:41<47:45,  1.80it/s, loss=0.161, v_num=0, train/loss_simple_step=0.053, train/loss_vlb_step=0.000181, train/loss_step=0.053, global_step=4680.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 828/5971 [07:43<47:57,  1.79it/s, loss=0.161, v_num=0, train/loss_simple_step=0.053, train/loss_vlb_step=0.000181, train/loss_step=0.053, global_step=4680.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 828/5971 [07:43<47:57,  1.79it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0531, train/loss_vlb_step=0.000183, train/loss_step=0.0531, global_step=4680.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 829/5971 [07:44<47:58,  1.79it/s, loss=0.201, v_num=0, train/loss_simple_step=0.971, train/loss_vlb_step=0.489, train/loss_step=0.971, global_step=4681.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]     
Epoch 8:  14%|█▍        | 830/5971 [07:45<48:00,  1.78it/s, loss=0.192, v_num=0, train/loss_simple_step=0.238, train/loss_vlb_step=0.000897, train/loss_step=0.238, global_step=4681.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 831/5971 [07:46<48:01,  1.78it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0335, train/loss_vlb_step=0.000121, train/loss_step=0.0335, global_step=4681.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 832/5971 [07:48<48:10,  1.78it/s, loss=0.192, v_num=0, train/loss_simple_step=0.0335, train/loss_vlb_step=0.000121, train/loss_step=0.0335, global_step=4681.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 832/5971 [07:48<48:10,  1.78it/s, loss=0.202, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.000712, train/loss_step=0.197, global_step=4681.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  14%|█▍        | 833/5971 [07:49<48:12,  1.78it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0106, train/loss_vlb_step=5.01e-5, train/loss_step=0.0106, global_step=4682.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 834/5971 [07:50<48:13,  1.78it/s, loss=0.184, v_num=0, train/loss_simple_step=0.00272, train/loss_vlb_step=1.53e-5, train/loss_step=0.00272, global_step=4682.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 835/5971 [07:51<48:15,  1.77it/s, loss=0.189, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000368, train/loss_step=0.112, global_step=4682.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  14%|█▍        | 836/5971 [07:53<48:23,  1.77it/s, loss=0.189, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.000368, train/loss_step=0.112, global_step=4682.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 836/5971 [07:53<48:23,  1.77it/s, loss=0.196, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000482, train/loss_step=0.145, global_step=4682.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 837/5971 [07:54<48:25,  1.77it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0495, train/loss_vlb_step=0.000183, train/loss_step=0.0495, global_step=4683.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 838/5971 [07:55<48:26,  1.77it/s, loss=0.16, v_num=0, train/loss_simple_step=0.306, train/loss_vlb_step=0.00143, train/loss_step=0.306, global_step=4683.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  14%|█▍        | 839/5971 [07:55<48:27,  1.76it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00361, train/loss_vlb_step=1.95e-5, train/loss_step=0.00361, global_step=4683.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 840/5971 [07:58<48:36,  1.76it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00361, train/loss_vlb_step=1.95e-5, train/loss_step=0.00361, global_step=4683.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 840/5971 [07:58<48:36,  1.76it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00175, train/loss_vlb_step=1.05e-5, train/loss_step=0.00175, global_step=4683.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 841/5971 [07:59<48:38,  1.76it/s, loss=0.158, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000629, train/loss_step=0.185, global_step=4684.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  14%|█▍        | 842/5971 [07:59<48:39,  1.76it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00972, train/loss_vlb_step=4.64e-5, train/loss_step=0.00972, global_step=4684.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 843/5971 [08:00<48:40,  1.76it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00903, train/loss_vlb_step=4.44e-5, train/loss_step=0.00903, global_step=4684.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 844/5971 [08:02<48:50,  1.75it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00903, train/loss_vlb_step=4.44e-5, train/loss_step=0.00903, global_step=4684.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 844/5971 [08:02<48:50,  1.75it/s, loss=0.169, v_num=0, train/loss_simple_step=0.356, train/loss_vlb_step=0.00182, train/loss_step=0.356, global_step=4684.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  14%|█▍        | 845/5971 [08:03<48:52,  1.75it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00145, train/loss_vlb_step=8.4e-6, train/loss_step=0.00145, global_step=4685.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 846/5971 [08:04<48:53,  1.75it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0234, train/loss_vlb_step=9.07e-5, train/loss_step=0.0234, global_step=4685.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  14%|█▍        | 847/5971 [08:05<48:54,  1.75it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0915, train/loss_vlb_step=0.000304, train/loss_step=0.0915, global_step=4685.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 848/5971 [08:08<49:06,  1.74it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0915, train/loss_vlb_step=0.000304, train/loss_step=0.0915, global_step=4685.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 848/5971 [08:08<49:06,  1.74it/s, loss=0.139, v_num=0, train/loss_simple_step=0.026, train/loss_vlb_step=0.000104, train/loss_step=0.026, global_step=4685.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  14%|█▍        | 849/5971 [08:09<49:07,  1.74it/s, loss=0.0931, v_num=0, train/loss_simple_step=0.060, train/loss_vlb_step=0.000213, train/loss_step=0.060, global_step=4686.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 850/5971 [08:09<49:08,  1.74it/s, loss=0.082, v_num=0, train/loss_simple_step=0.0168, train/loss_vlb_step=7.15e-5, train/loss_step=0.0168, global_step=4686.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 851/5971 [08:10<49:09,  1.74it/s, loss=0.0805, v_num=0, train/loss_simple_step=0.00218, train/loss_vlb_step=1.24e-5, train/loss_step=0.00218, global_step=4686.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 852/5971 [08:13<49:18,  1.73it/s, loss=0.0805, v_num=0, train/loss_simple_step=0.00218, train/loss_vlb_step=1.24e-5, train/loss_step=0.00218, global_step=4686.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 852/5971 [08:13<49:18,  1.73it/s, loss=0.0722, v_num=0, train/loss_simple_step=0.0323, train/loss_vlb_step=0.000121, train/loss_step=0.0323, global_step=4686.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  14%|█▍        | 853/5971 [08:13<49:19,  1.73it/s, loss=0.0863, v_num=0, train/loss_simple_step=0.291, train/loss_vlb_step=0.00129, train/loss_step=0.291, global_step=4687.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  14%|█▍        | 854/5971 [08:14<49:21,  1.73it/s, loss=0.107, v_num=0, train/loss_simple_step=0.417, train/loss_vlb_step=0.00217, train/loss_step=0.417, global_step=4687.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  14%|█▍        | 855/5971 [08:15<49:22,  1.73it/s, loss=0.127, v_num=0, train/loss_simple_step=0.508, train/loss_vlb_step=0.00514, train/loss_step=0.508, global_step=4687.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 856/5971 [08:17<49:31,  1.72it/s, loss=0.127, v_num=0, train/loss_simple_step=0.508, train/loss_vlb_step=0.00514, train/loss_step=0.508, global_step=4687.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 856/5971 [08:17<49:31,  1.72it/s, loss=0.13, v_num=0, train/loss_simple_step=0.200, train/loss_vlb_step=0.000852, train/loss_step=0.200, global_step=4687.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 857/5971 [08:18<49:32,  1.72it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0677, train/loss_vlb_step=0.000226, train/loss_step=0.0677, global_step=4688.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 858/5971 [08:19<49:33,  1.72it/s, loss=0.13, v_num=0, train/loss_simple_step=0.290, train/loss_vlb_step=0.0014, train/loss_step=0.290, global_step=4688.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  14%|█▍        | 859/5971 [08:20<49:34,  1.72it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0656, train/loss_vlb_step=0.000224, train/loss_step=0.0656, global_step=4688.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 860/5971 [08:22<49:43,  1.71it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0656, train/loss_vlb_step=0.000224, train/loss_step=0.0656, global_step=4688.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 860/5971 [08:22<49:43,  1.71it/s, loss=0.154, v_num=0, train/loss_simple_step=0.428, train/loss_vlb_step=0.0031, train/loss_step=0.428, global_step=4688.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  14%|█▍        | 861/5971 [08:23<49:44,  1.71it/s, loss=0.15, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000351, train/loss_step=0.107, global_step=4689.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 862/5971 [08:24<49:45,  1.71it/s, loss=0.155, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000333, train/loss_step=0.101, global_step=4689.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 863/5971 [08:25<49:46,  1.71it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0201, train/loss_vlb_step=8.38e-5, train/loss_step=0.0201, global_step=4689.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 864/5971 [08:27<49:54,  1.71it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0201, train/loss_vlb_step=8.38e-5, train/loss_step=0.0201, global_step=4689.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 864/5971 [08:27<49:54,  1.71it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0726, train/loss_vlb_step=0.000242, train/loss_step=0.0726, global_step=4689.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  14%|█▍        | 865/5971 [08:28<49:56,  1.70it/s, loss=0.15, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000573, train/loss_step=0.173, global_step=4690.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  15%|█▍        | 866/5971 [08:29<49:57,  1.70it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0485, train/loss_vlb_step=0.000172, train/loss_step=0.0485, global_step=4690.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  15%|█▍        | 867/5971 [08:29<49:58,  1.70it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000109, train/loss_step=0.0285, global_step=4690.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  15%|█▍        | 868/5971 [08:32<50:08,  1.70it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0285, train/loss_vlb_step=0.000109, train/loss_step=0.0285, global_step=4690.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  15%|█▍        | 868/5971 [08:32<50:08,  1.70it/s, loss=0.154, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000472, train/loss_step=0.143, global_step=4690.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  15%|█▍        | 869/5971 [08:33<50:09,  1.70it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00928, train/loss_vlb_step=4.27e-5, train/loss_step=0.00928, global_step=4691.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  15%|█▍        | 870/5971 [08:34<50:10,  1.69it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00387, train/loss_vlb_step=2.12e-5, train/loss_step=0.00387, global_step=4691.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  15%|█▍        | 871/5971 [08:35<50:12,  1.69it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0231, train/loss_vlb_step=9.12e-5, train/loss_step=0.0231, global_step=4691.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  15%|█▍        | 872/5971 [08:37<50:21,  1.69it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0231, train/loss_vlb_step=9.12e-5, train/loss_step=0.0231, global_step=4691.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  15%|█▍        | 872/5971 [08:37<50:21,  1.69it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0311, train/loss_vlb_step=0.000118, train/loss_step=0.0311, global_step=4691.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  15%|█▍        | 873/5971 [08:38<50:22,  1.69it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0129, train/loss_vlb_step=5.88e-5, train/loss_step=0.0129, global_step=4692.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  15%|█▍        | 874/5971 [08:39<50:23,  1.69it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00285, train/loss_vlb_step=1.59e-5, train/loss_step=0.00285, global_step=4692.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  15%|█▍        | 875/5971 [08:39<50:24,  1.68it/s, loss=0.103, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000897, train/loss_step=0.227, global_step=4692.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  15%|█▍        | 876/5971 [08:42<50:34,  1.68it/s, loss=0.103, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000897, train/loss_step=0.227, global_step=4692.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  15%|█▍        | 876/5971 [08:42<50:34,  1.68it/s, loss=0.0955, v_num=0, train/loss_simple_step=0.0543, train/loss_vlb_step=0.000184, train/loss_step=0.0543, global_step=4692.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  15%|█▍        | 877/5971 [08:43<50:35,  1.68it/s, loss=0.0998, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000507, train/loss_step=0.153, global_step=4693.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  15%|█▍        | 878/5971 [08:44<50:36,  1.68it/s, loss=0.0861, v_num=0, train/loss_simple_step=0.0172, train/loss_vlb_step=7.44e-5, train/loss_step=0.0172, global_step=4693.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  15%|█▍        | 879/5971 [08:44<50:37,  1.68it/s, loss=0.089, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000417, train/loss_step=0.123, global_step=4693.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  15%|█▍        | 880/5971 [08:47<50:46,  1.67it/s, loss=0.089, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000417, train/loss_step=0.123, global_step=4693.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  15%|█▍        | 880/5971 [08:47<50:46,  1.67it/s, loss=0.068, v_num=0, train/loss_simple_step=0.00829, train/loss_vlb_step=4.01e-5, train/loss_step=0.00829, global_step=4693.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  15%|█▍        | 881/5971 [08:48<50:47,  1.67it/s, loss=0.0632, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=4.97e-5, train/loss_step=0.0113, global_step=4694.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  15%|█▍        | 882/5971 [08:48<50:48,  1.67it/s, loss=0.0712, v_num=0, train/loss_simple_step=0.261, train/loss_vlb_step=0.000922, train/loss_step=0.261, global_step=4694.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  15%|█▍        | 883/5971 [08:49<50:49,  1.67it/s, loss=0.0789, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000573, train/loss_step=0.174, global_step=4694.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  15%|█▍        | 884/5971 [08:51<50:57,  1.66it/s, loss=0.0789, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000573, train/loss_step=0.174, global_step=4694.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  15%|█▍        | 884/5971 [08:51<50:57,  1.66it/s, loss=0.0774, v_num=0, train/loss_simple_step=0.0421, train/loss_vlb_step=0.000156, train/loss_step=0.0421, global_step=4694.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  15%|█▍        | 885/5971 [08:52<50:58,  1.66it/s, loss=0.0756, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000455, train/loss_step=0.137, global_step=4695.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  15%|█▍        | 886/5971 [08:53<50:59,  1.66it/s, loss=0.0791, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000396, train/loss_step=0.120, global_step=4695.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  15%|█▍        | 887/5971 [08:54<51:00,  1.66it/s, loss=0.0778, v_num=0, train/loss_simple_step=0.00196, train/loss_vlb_step=1.19e-5, train/loss_step=0.00196, global_step=4695.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  15%|█▍        | 888/5971 [08:56<51:08,  1.66it/s, loss=0.0778, v_num=0, train/loss_simple_step=0.00196, train/loss_vlb_step=1.19e-5, train/loss_step=0.00196, global_step=4695.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  15%|█▍        | 888/5971 [08:56<51:08,  1.66it/s, loss=0.0762, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000366, train/loss_step=0.111, global_step=4695.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  15%|█▍        | 889/5971 [08:57<51:09,  1.66it/s, loss=0.077, v_num=0, train/loss_simple_step=0.0248, train/loss_vlb_step=0.0001, train/loss_step=0.0248, global_step=4696.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  15%|█▍        | 890/5971 [08:58<51:10,  1.65it/s, loss=0.0791, v_num=0, train/loss_simple_step=0.0462, train/loss_vlb_step=0.000172, train/loss_step=0.0462, global_step=4696.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  15%|█▍        | 891/5971 [08:59<51:11,  1.65it/s, loss=0.0791, v_num=0, train/loss_simple_step=0.0225, train/loss_vlb_step=9.3e-5, train/loss_step=0.0225, global_step=4696.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  15%|█▍        | 892/5971 [09:01<51:18,  1.65it/s, loss=0.0791, v_num=0, train/loss_simple_step=0.0225, train/loss_vlb_step=9.3e-5, train/loss_step=0.0225, global_step=4696.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  15%|█▍        | 892/5971 [09:01<51:18,  1.65it/s, loss=0.084, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000437, train/loss_step=0.131, global_step=4696.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  15%|█▍        | 893/5971 [09:02<51:20,  1.65it/s, loss=0.0835, v_num=0, train/loss_simple_step=0.00124, train/loss_vlb_step=7.48e-6, train/loss_step=0.00124, global_step=4697.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  15%|█▍        | 894/5971 [09:03<51:20,  1.65it/s, loss=0.0848, v_num=0, train/loss_simple_step=0.0292, train/loss_vlb_step=0.000116, train/loss_step=0.0292, global_step=4697.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  15%|█▍        | 895/5971 [09:03<51:21,  1.65it/s, loss=0.0937, v_num=0, train/loss_simple_step=0.405, train/loss_vlb_step=0.00301, train/loss_step=0.405, global_step=4697.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  15%|█▌        | 896/5971 [09:06<51:31,  1.64it/s, loss=0.0937, v_num=0, train/loss_simple_step=0.405, train/loss_vlb_step=0.00301, train/loss_step=0.405, global_step=4697.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  15%|█▌        | 896/5971 [09:06<51:31,  1.64it/s, loss=0.0911, v_num=0, train/loss_simple_step=0.00312, train/loss_vlb_step=1.72e-5, train/loss_step=0.00312, global_step=4697.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  15%|█▌        | 897/5971 [09:07<51:32,  1.64it/s, loss=0.0893, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000405, train/loss_step=0.116, global_step=4698.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  15%|█▌        | 898/5971 [09:08<51:33,  1.64it/s, loss=0.0939, v_num=0, train/loss_simple_step=0.110, train/loss_vlb_step=0.000362, train/loss_step=0.110, global_step=4698.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  15%|█▌        | 899/5971 [09:09<51:34,  1.64it/s, loss=0.0927, v_num=0, train/loss_simple_step=0.0992, train/loss_vlb_step=0.000331, train/loss_step=0.0992, global_step=4698.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  15%|█▌        | 900/5971 [09:11<51:42,  1.63it/s, loss=0.0927, v_num=0, train/loss_simple_step=0.0992, train/loss_vlb_step=0.000331, train/loss_step=0.0992, global_step=4698.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  15%|█▌        | 900/5971 [09:11<51:42,  1.63it/s, loss=0.0962, v_num=0, train/loss_simple_step=0.0788, train/loss_vlb_step=0.000261, train/loss_step=0.0788, global_step=4698.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  15%|█▌        | 901/5971 [09:12<51:43,  1.63it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0949, train/loss_vlb_step=0.000315, train/loss_step=0.0949, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  15%|█▌        | 902/5971 [09:13<51:44,  1.63it/s, loss=0.0875, v_num=0, train/loss_simple_step=0.00265, train/loss_vlb_step=1.53e-5, train/loss_step=0.00265, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  15%|█▌        | 903/5971 [09:13<51:45,  1.63it/s, loss=0.0823, v_num=0, train/loss_simple_step=0.0698, train/loss_vlb_step=0.000243, train/loss_step=0.0698, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  15%|█▌        | 904/5971 [09:16<51:53,  1.63it/s, loss=0.0823, v_num=0, train/loss_simple_step=0.0698, train/loss_vlb_step=0.000243, train/loss_step=0.0698, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  15%|█▌        | 904/5971 [09:16<51:53,  1.63it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:14,  2.23it/s]

Validating:   1%|          | 2/167 [00:00<00:44,  3.68it/s]
Epoch 8:  15%|█▌        | 908/5971 [09:16<51:40,  1.63it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.28it/s]
Epoch 8:  15%|█▌        | 912/5971 [09:16<51:25,  1.64it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.70it/s]

Validating:   7%|▋         | 11/167 [00:00<00:08, 17.46it/s]
Epoch 8:  15%|█▌        | 916/5971 [09:17<51:10,  1.65it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   8%|▊         | 14/167 [00:01<00:07, 20.65it/s]
Epoch 8:  15%|█▌        | 920/5971 [09:17<50:55,  1.65it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  10%|█         | 17/167 [00:01<00:06, 21.54it/s]
Epoch 8:  15%|█▌        | 924/5971 [09:17<50:41,  1.66it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.78it/s]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 23.31it/s]
Epoch 8:  16%|█▌        | 928/5971 [09:17<50:26,  1.67it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.53it/s]
Epoch 8:  16%|█▌        | 932/5971 [09:17<50:11,  1.67it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 25.19it/s]
Epoch 8:  16%|█▌        | 936/5971 [09:17<49:57,  1.68it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 25.15it/s]
Epoch 8:  16%|█▌        | 940/5971 [09:17<49:43,  1.69it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  22%|██▏       | 36/167 [00:01<00:05, 25.96it/s]

Validating:  23%|██▎       | 39/167 [00:02<00:04, 26.34it/s]
Epoch 8:  16%|█▌        | 944/5971 [09:18<49:28,  1.69it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 26.04it/s]
Epoch 8:  16%|█▌        | 948/5971 [09:18<49:14,  1.70it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 25.81it/s]
Epoch 8:  16%|█▌        | 952/5971 [09:18<49:00,  1.71it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 25.83it/s]

Validating:  31%|███       | 51/167 [00:02<00:04, 25.30it/s]
Epoch 8:  16%|█▌        | 956/5971 [09:18<48:47,  1.71it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 25.20it/s]
Epoch 8:  16%|█▌        | 960/5971 [09:18<48:33,  1.72it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 25.67it/s]
Epoch 8:  16%|█▌        | 964/5971 [09:18<48:19,  1.73it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  37%|███▋      | 61/167 [00:02<00:03, 27.51it/s]
Epoch 8:  16%|█▌        | 968/5971 [09:19<48:06,  1.73it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  38%|███▊      | 64/167 [00:02<00:03, 27.42it/s]

Validating:  40%|████      | 67/167 [00:03<00:03, 27.08it/s]
Epoch 8:  16%|█▋        | 972/5971 [09:19<47:52,  1.74it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 27.29it/s]
Epoch 8:  16%|█▋        | 976/5971 [09:19<47:39,  1.75it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 27.34it/s]
Epoch 8:  16%|█▋        | 980/5971 [09:19<47:26,  1.75it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 26.62it/s]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 26.85it/s]
Epoch 8:  16%|█▋        | 984/5971 [09:19<47:13,  1.76it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  49%|████▉     | 82/167 [00:03<00:03, 27.49it/s]
Epoch 8:  17%|█▋        | 988/5971 [09:19<47:00,  1.77it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  51%|█████     | 85/167 [00:03<00:02, 28.13it/s]
Epoch 8:  17%|█▋        | 992/5971 [09:19<46:47,  1.77it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  53%|█████▎    | 88/167 [00:03<00:02, 27.29it/s]
Epoch 8:  17%|█▋        | 996/5971 [09:20<46:34,  1.78it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  55%|█████▌    | 92/167 [00:03<00:02, 28.50it/s]
Epoch 8:  17%|█▋        | 1000/5971 [09:20<46:21,  1.79it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 28.65it/s]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 28.93it/s]
Epoch 8:  17%|█▋        | 1004/5971 [09:20<46:09,  1.79it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 29.52it/s]
Epoch 8:  17%|█▋        | 1008/5971 [09:20<45:56,  1.80it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  63%|██████▎   | 106/167 [00:04<00:02, 28.30it/s]
Epoch 8:  17%|█▋        | 1012/5971 [09:20<45:44,  1.81it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 28.45it/s]
Epoch 8:  17%|█▋        | 1016/5971 [09:20<45:31,  1.81it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  68%|██████▊   | 113/167 [00:04<00:02, 26.87it/s]
Epoch 8:  17%|█▋        | 1020/5971 [09:20<45:19,  1.82it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  69%|██████▉   | 116/167 [00:04<00:02, 25.38it/s]
Epoch 8:  17%|█▋        | 1024/5971 [09:21<45:07,  1.83it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  72%|███████▏  | 120/167 [00:04<00:01, 26.68it/s]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 27.04it/s]
Epoch 8:  17%|█▋        | 1028/5971 [09:21<44:55,  1.83it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 27.03it/s]
Epoch 8:  17%|█▋        | 1032/5971 [09:21<44:43,  1.84it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 26.31it/s]
Epoch 8:  17%|█▋        | 1036/5971 [09:21<44:32,  1.85it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 25.85it/s]

Validating:  81%|████████  | 135/167 [00:05<00:01, 25.62it/s]
Epoch 8:  17%|█▋        | 1040/5971 [09:21<44:20,  1.85it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 26.77it/s]
Epoch 8:  17%|█▋        | 1044/5971 [09:21<44:08,  1.86it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  85%|████████▌ | 142/167 [00:05<00:00, 27.50it/s]
Epoch 8:  18%|█▊        | 1048/5971 [09:21<43:57,  1.87it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  87%|████████▋ | 145/167 [00:05<00:00, 27.89it/s]
Epoch 8:  18%|█▊        | 1052/5971 [09:22<43:45,  1.87it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 27.48it/s]
Epoch 8:  18%|█▊        | 1056/5971 [09:22<43:34,  1.88it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 27.28it/s]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 27.14it/s]
Epoch 8:  18%|█▊        | 1060/5971 [09:22<43:23,  1.89it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 26.15it/s]
Epoch 8:  18%|█▊        | 1064/5971 [09:22<43:11,  1.89it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 26.84it/s]
Epoch 8:  18%|█▊        | 1068/5971 [09:22<43:00,  1.90it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 26.72it/s]

Validating: 100%|██████████| 167/167 [00:06<00:00, 27.09it/s]
Epoch 8:  18%|█▊        | 1072/5971 [09:22<42:49,  1.91it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1072/5971 [09:23<42:51,  1.91it/s, loss=0.0809, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=5.94e-5, train/loss_step=0.0142, global_step=4699.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

timesteps used in spaced sampler:
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s]

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:39,  1.23it/s]

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:21,  2.26it/s]

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:15,  3.12it/s]

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.80it/s]

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.33it/s]

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.72it/s]

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.99it/s]

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.20it/s]

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.33it/s]

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.44it/s]

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.52it/s]

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.58it/s]

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.62it/s]

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.66it/s]

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.68it/s]

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:05,  5.68it/s]

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.70it/s]

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.71it/s]

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.70it/s]

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.70it/s]

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.70it/s]

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.70it/s]

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.68it/s]

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.67it/s]

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.68it/s]

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.68it/s]

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.68it/s]

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.67it/s]

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.66it/s]

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.58it/s]

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.57it/s]

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.57it/s]

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.55it/s]

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.53it/s]

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.57it/s]

Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.60it/s]

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.62it/s]

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.64it/s]

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.66it/s]

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.66it/s]

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.68it/s]

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.69it/s]

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.69it/s]

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.69it/s]

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.69it/s]

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.70it/s]

Spaced Sampler:  94%|█████████▍| 47/50 [00:08<00:00,  5.71it/s]

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.71it/s]

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.71it/s]

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.71it/s]
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.29it/s]

Epoch 8:  18%|█▊        | 1073/5971 [09:35<43:42,  1.87it/s, loss=0.0844, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000827, train/loss_step=0.207, global_step=4700.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Epoch 8:  18%|█▊        | 1073/5971 [09:38<43:56,  1.86it/s, loss=0.0844, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000827, train/loss_step=0.207, global_step=4700.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.28it/s]

Epoch 8:  18%|█▊        | 1074/5971 [09:46<44:33,  1.83it/s, loss=0.0844, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000827, train/loss_step=0.207, global_step=4700.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1074/5971 [09:46<44:33,  1.83it/s, loss=0.0791, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.52e-5, train/loss_step=0.0132, global_step=4700.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.21it/s]

Epoch 8:  18%|█▊        | 1075/5971 [09:58<45:24,  1.80it/s, loss=0.0791, v_num=0, train/loss_simple_step=0.0132, train/loss_vlb_step=5.52e-5, train/loss_step=0.0132, global_step=4700.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1075/5971 [09:58<45:24,  1.80it/s, loss=0.0823, v_num=0, train/loss_simple_step=0.0665, train/loss_vlb_step=0.000229, train/loss_step=0.0665, global_step=4700.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.24it/s]

Epoch 8:  18%|█▊        | 1076/5971 [10:11<46:21,  1.76it/s, loss=0.0823, v_num=0, train/loss_simple_step=0.0665, train/loss_vlb_step=0.000229, train/loss_step=0.0665, global_step=4700.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1076/5971 [10:11<46:21,  1.76it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.295, train/loss_vlb_step=0.00117, train/loss_step=0.295, global_step=4700.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  18%|█▊        | 1077/5971 [10:12<46:22,  1.76it/s, loss=0.0915, v_num=0, train/loss_simple_step=0.295, train/loss_vlb_step=0.00117, train/loss_step=0.295, global_step=4700.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1077/5971 [10:12<46:22,  1.76it/s, loss=0.0907, v_num=0, train/loss_simple_step=0.00851, train/loss_vlb_step=4.1e-5, train/loss_step=0.00851, global_step=4701.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1078/5971 [10:13<46:23,  1.76it/s, loss=0.0907, v_num=0, train/loss_simple_step=0.00851, train/loss_vlb_step=4.1e-5, train/loss_step=0.00851, global_step=4701.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1078/5971 [10:13<46:23,  1.76it/s, loss=0.0886, v_num=0, train/loss_simple_step=0.00344, train/loss_vlb_step=1.74e-5, train/loss_step=0.00344, global_step=4701.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1079/5971 [10:14<46:23,  1.76it/s, loss=0.0886, v_num=0, train/loss_simple_step=0.00344, train/loss_vlb_step=1.74e-5, train/loss_step=0.00344, global_step=4701.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1079/5971 [10:14<46:23,  1.76it/s, loss=0.109, v_num=0, train/loss_simple_step=0.436, train/loss_vlb_step=0.00229, train/loss_step=0.436, global_step=4701.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]     
Epoch 8:  18%|█▊        | 1080/5971 [10:16<46:30,  1.75it/s, loss=0.109, v_num=0, train/loss_simple_step=0.436, train/loss_vlb_step=0.00229, train/loss_step=0.436, global_step=4701.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1080/5971 [10:16<46:30,  1.75it/s, loss=0.108, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000347, train/loss_step=0.104, global_step=4701.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1081/5971 [10:17<46:31,  1.75it/s, loss=0.108, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000347, train/loss_step=0.104, global_step=4701.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1081/5971 [10:17<46:31,  1.75it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0178, train/loss_vlb_step=7.74e-5, train/loss_step=0.0178, global_step=4702.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1082/5971 [10:18<46:31,  1.75it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0178, train/loss_vlb_step=7.74e-5, train/loss_step=0.0178, global_step=4702.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1082/5971 [10:18<46:31,  1.75it/s, loss=0.114, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000442, train/loss_step=0.134, global_step=4702.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  18%|█▊        | 1083/5971 [10:19<46:32,  1.75it/s, loss=0.114, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000442, train/loss_step=0.134, global_step=4702.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1083/5971 [10:19<46:32,  1.75it/s, loss=0.0945, v_num=0, train/loss_simple_step=0.0152, train/loss_vlb_step=6.11e-5, train/loss_step=0.0152, global_step=4702.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1084/5971 [10:21<46:40,  1.74it/s, loss=0.0945, v_num=0, train/loss_simple_step=0.0152, train/loss_vlb_step=6.11e-5, train/loss_step=0.0152, global_step=4702.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1084/5971 [10:21<46:40,  1.74it/s, loss=0.0946, v_num=0, train/loss_simple_step=0.00479, train/loss_vlb_step=2.45e-5, train/loss_step=0.00479, global_step=4702.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1085/5971 [10:22<46:41,  1.74it/s, loss=0.0946, v_num=0, train/loss_simple_step=0.00479, train/loss_vlb_step=2.45e-5, train/loss_step=0.00479, global_step=4702.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1085/5971 [10:22<46:41,  1.74it/s, loss=0.109, v_num=0, train/loss_simple_step=0.412, train/loss_vlb_step=0.00275, train/loss_step=0.412, global_step=4703.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]     
Epoch 8:  18%|█▊        | 1086/5971 [10:23<46:42,  1.74it/s, loss=0.109, v_num=0, train/loss_simple_step=0.412, train/loss_vlb_step=0.00275, train/loss_step=0.412, global_step=4703.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1086/5971 [10:23<46:42,  1.74it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00273, train/loss_vlb_step=1.54e-5, train/loss_step=0.00273, global_step=4703.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1087/5971 [10:24<46:42,  1.74it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00273, train/loss_vlb_step=1.54e-5, train/loss_step=0.00273, global_step=4703.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1087/5971 [10:24<46:42,  1.74it/s, loss=0.0999, v_num=0, train/loss_simple_step=0.0174, train/loss_vlb_step=6.93e-5, train/loss_step=0.0174, global_step=4703.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  18%|█▊        | 1088/5971 [10:26<46:49,  1.74it/s, loss=0.0999, v_num=0, train/loss_simple_step=0.0174, train/loss_vlb_step=6.93e-5, train/loss_step=0.0174, global_step=4703.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1088/5971 [10:26<46:49,  1.74it/s, loss=0.102, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=4703.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  18%|█▊        | 1089/5971 [10:27<46:50,  1.74it/s, loss=0.102, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.00039, train/loss_step=0.118, global_step=4703.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1089/5971 [10:27<46:50,  1.74it/s, loss=0.103, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000355, train/loss_step=0.108, global_step=4704.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1090/5971 [10:28<46:50,  1.74it/s, loss=0.103, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000355, train/loss_step=0.108, global_step=4704.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1090/5971 [10:28<46:50,  1.74it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0313, train/loss_vlb_step=0.000117, train/loss_step=0.0313, global_step=4704.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1091/5971 [10:29<46:51,  1.74it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0313, train/loss_vlb_step=0.000117, train/loss_step=0.0313, global_step=4704.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1091/5971 [10:29<46:51,  1.74it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.63e-5, train/loss_step=0.0154, global_step=4704.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  18%|█▊        | 1092/5971 [10:31<46:57,  1.73it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0154, train/loss_vlb_step=6.63e-5, train/loss_step=0.0154, global_step=4704.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1092/5971 [10:31<46:57,  1.73it/s, loss=0.106, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.00039, train/loss_step=0.119, global_step=4704.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  18%|█▊        | 1093/5971 [10:32<46:58,  1.73it/s, loss=0.106, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.00039, train/loss_step=0.119, global_step=4704.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1093/5971 [10:32<46:58,  1.73it/s, loss=0.097, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=7.11e-5, train/loss_step=0.0173, global_step=4705.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1094/5971 [10:33<46:59,  1.73it/s, loss=0.097, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=7.11e-5, train/loss_step=0.0173, global_step=4705.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1094/5971 [10:33<46:59,  1.73it/s, loss=0.0982, v_num=0, train/loss_simple_step=0.0364, train/loss_vlb_step=0.000135, train/loss_step=0.0364, global_step=4705.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1095/5971 [10:33<47:00,  1.73it/s, loss=0.0982, v_num=0, train/loss_simple_step=0.0364, train/loss_vlb_step=0.000135, train/loss_step=0.0364, global_step=4705.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1095/5971 [10:33<47:00,  1.73it/s, loss=0.095, v_num=0, train/loss_simple_step=0.00244, train/loss_vlb_step=1.42e-5, train/loss_step=0.00244, global_step=4705.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1096/5971 [10:36<47:06,  1.72it/s, loss=0.095, v_num=0, train/loss_simple_step=0.00244, train/loss_vlb_step=1.42e-5, train/loss_step=0.00244, global_step=4705.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1096/5971 [10:36<47:06,  1.72it/s, loss=0.083, v_num=0, train/loss_simple_step=0.0569, train/loss_vlb_step=0.000203, train/loss_step=0.0569, global_step=4705.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  18%|█▊        | 1097/5971 [10:36<47:07,  1.72it/s, loss=0.083, v_num=0, train/loss_simple_step=0.0569, train/loss_vlb_step=0.000203, train/loss_step=0.0569, global_step=4705.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1097/5971 [10:36<47:07,  1.72it/s, loss=0.113, v_num=0, train/loss_simple_step=0.609, train/loss_vlb_step=0.00777, train/loss_step=0.609, global_step=4706.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  18%|█▊        | 1098/5971 [10:37<47:07,  1.72it/s, loss=0.113, v_num=0, train/loss_simple_step=0.609, train/loss_vlb_step=0.00777, train/loss_step=0.609, global_step=4706.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1098/5971 [10:37<47:07,  1.72it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0922, train/loss_vlb_step=0.000304, train/loss_step=0.0922, global_step=4706.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1099/5971 [10:38<47:08,  1.72it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0922, train/loss_vlb_step=0.000304, train/loss_step=0.0922, global_step=4706.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1099/5971 [10:38<47:08,  1.72it/s, loss=0.0989, v_num=0, train/loss_simple_step=0.064, train/loss_vlb_step=0.00022, train/loss_step=0.064, global_step=4706.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  18%|█▊        | 1100/5971 [10:41<47:17,  1.72it/s, loss=0.0989, v_num=0, train/loss_simple_step=0.064, train/loss_vlb_step=0.00022, train/loss_step=0.064, global_step=4706.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1100/5971 [10:41<47:17,  1.72it/s, loss=0.0939, v_num=0, train/loss_simple_step=0.00282, train/loss_vlb_step=1.55e-5, train/loss_step=0.00282, global_step=4706.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1101/5971 [10:42<47:17,  1.72it/s, loss=0.0939, v_num=0, train/loss_simple_step=0.00282, train/loss_vlb_step=1.55e-5, train/loss_step=0.00282, global_step=4706.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1101/5971 [10:42<47:17,  1.72it/s, loss=0.117, v_num=0, train/loss_simple_step=0.473, train/loss_vlb_step=0.00532, train/loss_step=0.473, global_step=4707.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]     
Epoch 8:  18%|█▊        | 1102/5971 [10:43<47:18,  1.72it/s, loss=0.117, v_num=0, train/loss_simple_step=0.473, train/loss_vlb_step=0.00532, train/loss_step=0.473, global_step=4707.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1102/5971 [10:43<47:18,  1.72it/s, loss=0.118, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.00052, train/loss_step=0.158, global_step=4707.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1103/5971 [10:43<47:19,  1.71it/s, loss=0.118, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.00052, train/loss_step=0.158, global_step=4707.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1103/5971 [10:43<47:19,  1.71it/s, loss=0.131, v_num=0, train/loss_simple_step=0.286, train/loss_vlb_step=0.00144, train/loss_step=0.286, global_step=4707.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1104/5971 [10:46<47:27,  1.71it/s, loss=0.131, v_num=0, train/loss_simple_step=0.286, train/loss_vlb_step=0.00144, train/loss_step=0.286, global_step=4707.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  18%|█▊        | 1104/5971 [10:46<47:27,  1.71it/s, loss=0.168, v_num=0, train/loss_simple_step=0.743, train/loss_vlb_step=0.0219, train/loss_step=0.743, global_step=4707.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  19%|█▊        | 1105/5971 [10:47<47:28,  1.71it/s, loss=0.168, v_num=0, train/loss_simple_step=0.743, train/loss_vlb_step=0.0219, train/loss_step=0.743, global_step=4707.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▊        | 1105/5971 [10:47<47:28,  1.71it/s, loss=0.155, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000529, train/loss_step=0.147, global_step=4708.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▊        | 1106/5971 [10:48<47:28,  1.71it/s, loss=0.155, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000529, train/loss_step=0.147, global_step=4708.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▊        | 1106/5971 [10:48<47:28,  1.71it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00345, train/loss_vlb_step=1.85e-5, train/loss_step=0.00345, global_step=4708.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▊        | 1107/5971 [10:49<47:29,  1.71it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00345, train/loss_vlb_step=1.85e-5, train/loss_step=0.00345, global_step=4708.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▊        | 1107/5971 [10:49<47:29,  1.71it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=5.59e-5, train/loss_step=0.0157, global_step=4708.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  19%|█▊        | 1108/5971 [10:51<47:35,  1.70it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0157, train/loss_vlb_step=5.59e-5, train/loss_step=0.0157, global_step=4708.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▊        | 1108/5971 [10:51<47:35,  1.70it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0449, train/loss_vlb_step=0.000163, train/loss_step=0.0449, global_step=4708.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▊        | 1109/5971 [10:52<47:36,  1.70it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0449, train/loss_vlb_step=0.000163, train/loss_step=0.0449, global_step=4708.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▊        | 1109/5971 [10:52<47:36,  1.70it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0391, train/loss_vlb_step=0.000133, train/loss_step=0.0391, global_step=4709.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▊        | 1110/5971 [10:52<47:36,  1.70it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0391, train/loss_vlb_step=0.000133, train/loss_step=0.0391, global_step=4709.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▊        | 1110/5971 [10:52<47:36,  1.70it/s, loss=0.157, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.000738, train/loss_step=0.206, global_step=4709.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  19%|█▊        | 1111/5971 [10:53<47:37,  1.70it/s, loss=0.157, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.000738, train/loss_step=0.206, global_step=4709.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▊        | 1111/5971 [10:53<47:37,  1.70it/s, loss=0.168, v_num=0, train/loss_simple_step=0.246, train/loss_vlb_step=0.000913, train/loss_step=0.246, global_step=4709.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▊        | 1112/5971 [10:56<47:44,  1.70it/s, loss=0.168, v_num=0, train/loss_simple_step=0.246, train/loss_vlb_step=0.000913, train/loss_step=0.246, global_step=4709.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▊        | 1112/5971 [10:56<47:44,  1.70it/s, loss=0.176, v_num=0, train/loss_simple_step=0.285, train/loss_vlb_step=0.00115, train/loss_step=0.285, global_step=4709.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  19%|█▊        | 1113/5971 [10:56<47:45,  1.70it/s, loss=0.176, v_num=0, train/loss_simple_step=0.285, train/loss_vlb_step=0.00115, train/loss_step=0.285, global_step=4709.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▊        | 1113/5971 [10:56<47:45,  1.70it/s, loss=0.193, v_num=0, train/loss_simple_step=0.352, train/loss_vlb_step=0.0018, train/loss_step=0.352, global_step=4710.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  19%|█▊        | 1114/5971 [10:57<47:45,  1.69it/s, loss=0.193, v_num=0, train/loss_simple_step=0.352, train/loss_vlb_step=0.0018, train/loss_step=0.352, global_step=4710.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▊        | 1114/5971 [10:57<47:45,  1.69it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00794, train/loss_vlb_step=3.8e-5, train/loss_step=0.00794, global_step=4710.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▊        | 1115/5971 [10:58<47:46,  1.69it/s, loss=0.192, v_num=0, train/loss_simple_step=0.00794, train/loss_vlb_step=3.8e-5, train/loss_step=0.00794, global_step=4710.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▊        | 1115/5971 [10:58<47:46,  1.69it/s, loss=0.198, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000431, train/loss_step=0.131, global_step=4710.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  19%|█▊        | 1116/5971 [11:00<47:52,  1.69it/s, loss=0.198, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000431, train/loss_step=0.131, global_step=4710.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▊        | 1116/5971 [11:00<47:52,  1.69it/s, loss=0.232, v_num=0, train/loss_simple_step=0.735, train/loss_vlb_step=0.0538, train/loss_step=0.735, global_step=4710.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  19%|█▊        | 1117/5971 [11:01<47:53,  1.69it/s, loss=0.232, v_num=0, train/loss_simple_step=0.735, train/loss_vlb_step=0.0538, train/loss_step=0.735, global_step=4710.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▊        | 1117/5971 [11:01<47:53,  1.69it/s, loss=0.208, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000425, train/loss_step=0.129, global_step=4711.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▊        | 1118/5971 [11:02<47:53,  1.69it/s, loss=0.208, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000425, train/loss_step=0.129, global_step=4711.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▊        | 1118/5971 [11:02<47:53,  1.69it/s, loss=0.22, v_num=0, train/loss_simple_step=0.326, train/loss_vlb_step=0.00158, train/loss_step=0.326, global_step=4711.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  19%|█▊        | 1119/5971 [11:03<47:54,  1.69it/s, loss=0.22, v_num=0, train/loss_simple_step=0.326, train/loss_vlb_step=0.00158, train/loss_step=0.326, global_step=4711.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▊        | 1119/5971 [11:03<47:54,  1.69it/s, loss=0.23, v_num=0, train/loss_simple_step=0.277, train/loss_vlb_step=0.00122, train/loss_step=0.277, global_step=4711.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1120/5971 [11:05<48:00,  1.68it/s, loss=0.23, v_num=0, train/loss_simple_step=0.277, train/loss_vlb_step=0.00122, train/loss_step=0.277, global_step=4711.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1120/5971 [11:05<48:00,  1.68it/s, loss=0.237, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000431, train/loss_step=0.131, global_step=4711.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1121/5971 [11:06<48:01,  1.68it/s, loss=0.237, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000431, train/loss_step=0.131, global_step=4711.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1121/5971 [11:06<48:01,  1.68it/s, loss=0.213, v_num=0, train/loss_simple_step=0.00392, train/loss_vlb_step=2.05e-5, train/loss_step=0.00392, global_step=4712.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1122/5971 [11:07<48:01,  1.68it/s, loss=0.213, v_num=0, train/loss_simple_step=0.00392, train/loss_vlb_step=2.05e-5, train/loss_step=0.00392, global_step=4712.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1122/5971 [11:07<48:01,  1.68it/s, loss=0.21, v_num=0, train/loss_simple_step=0.0907, train/loss_vlb_step=0.000298, train/loss_step=0.0907, global_step=4712.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  19%|█▉        | 1123/5971 [11:08<48:02,  1.68it/s, loss=0.21, v_num=0, train/loss_simple_step=0.0907, train/loss_vlb_step=0.000298, train/loss_step=0.0907, global_step=4712.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1123/5971 [11:08<48:02,  1.68it/s, loss=0.228, v_num=0, train/loss_simple_step=0.641, train/loss_vlb_step=0.018, train/loss_step=0.641, global_step=4712.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  19%|█▉        | 1124/5971 [11:10<48:09,  1.68it/s, loss=0.228, v_num=0, train/loss_simple_step=0.641, train/loss_vlb_step=0.018, train/loss_step=0.641, global_step=4712.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1124/5971 [11:10<48:09,  1.68it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0416, train/loss_vlb_step=0.000155, train/loss_step=0.0416, global_step=4712.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1125/5971 [11:11<48:10,  1.68it/s, loss=0.193, v_num=0, train/loss_simple_step=0.0416, train/loss_vlb_step=0.000155, train/loss_step=0.0416, global_step=4712.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1125/5971 [11:11<48:10,  1.68it/s, loss=0.199, v_num=0, train/loss_simple_step=0.281, train/loss_vlb_step=0.00112, train/loss_step=0.281, global_step=4713.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  19%|█▉        | 1126/5971 [11:12<48:10,  1.68it/s, loss=0.199, v_num=0, train/loss_simple_step=0.281, train/loss_vlb_step=0.00112, train/loss_step=0.281, global_step=4713.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1126/5971 [11:12<48:10,  1.68it/s, loss=0.226, v_num=0, train/loss_simple_step=0.537, train/loss_vlb_step=0.00636, train/loss_step=0.537, global_step=4713.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1127/5971 [11:13<48:11,  1.68it/s, loss=0.226, v_num=0, train/loss_simple_step=0.537, train/loss_vlb_step=0.00636, train/loss_step=0.537, global_step=4713.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1127/5971 [11:13<48:11,  1.68it/s, loss=0.232, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.00046, train/loss_step=0.129, global_step=4713.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1128/5971 [11:15<48:17,  1.67it/s, loss=0.232, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.00046, train/loss_step=0.129, global_step=4713.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1128/5971 [11:15<48:17,  1.67it/s, loss=0.232, v_num=0, train/loss_simple_step=0.0413, train/loss_vlb_step=0.00015, train/loss_step=0.0413, global_step=4713.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1129/5971 [11:16<48:17,  1.67it/s, loss=0.232, v_num=0, train/loss_simple_step=0.0413, train/loss_vlb_step=0.00015, train/loss_step=0.0413, global_step=4713.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1129/5971 [11:16<48:17,  1.67it/s, loss=0.232, v_num=0, train/loss_simple_step=0.0579, train/loss_vlb_step=0.000198, train/loss_step=0.0579, global_step=4714.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1130/5971 [11:17<48:18,  1.67it/s, loss=0.232, v_num=0, train/loss_simple_step=0.0579, train/loss_vlb_step=0.000198, train/loss_step=0.0579, global_step=4714.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1130/5971 [11:17<48:18,  1.67it/s, loss=0.259, v_num=0, train/loss_simple_step=0.743, train/loss_vlb_step=0.0385, train/loss_step=0.743, global_step=4714.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  19%|█▉        | 1131/5971 [11:18<48:18,  1.67it/s, loss=0.259, v_num=0, train/loss_simple_step=0.743, train/loss_vlb_step=0.0385, train/loss_step=0.743, global_step=4714.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1131/5971 [11:18<48:18,  1.67it/s, loss=0.253, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000424, train/loss_step=0.129, global_step=4714.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1132/5971 [11:20<48:26,  1.67it/s, loss=0.253, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000424, train/loss_step=0.129, global_step=4714.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1132/5971 [11:20<48:26,  1.67it/s, loss=0.243, v_num=0, train/loss_simple_step=0.072, train/loss_vlb_step=0.000244, train/loss_step=0.072, global_step=4714.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1133/5971 [11:21<48:26,  1.66it/s, loss=0.243, v_num=0, train/loss_simple_step=0.072, train/loss_vlb_step=0.000244, train/loss_step=0.072, global_step=4714.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1133/5971 [11:21<48:26,  1.66it/s, loss=0.256, v_num=0, train/loss_simple_step=0.623, train/loss_vlb_step=0.00657, train/loss_step=0.623, global_step=4715.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  19%|█▉        | 1134/5971 [11:22<48:27,  1.66it/s, loss=0.256, v_num=0, train/loss_simple_step=0.623, train/loss_vlb_step=0.00657, train/loss_step=0.623, global_step=4715.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1134/5971 [11:22<48:27,  1.66it/s, loss=0.256, v_num=0, train/loss_simple_step=0.00529, train/loss_vlb_step=2.46e-5, train/loss_step=0.00529, global_step=4715.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1135/5971 [11:23<48:28,  1.66it/s, loss=0.256, v_num=0, train/loss_simple_step=0.00529, train/loss_vlb_step=2.46e-5, train/loss_step=0.00529, global_step=4715.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1135/5971 [11:23<48:28,  1.66it/s, loss=0.265, v_num=0, train/loss_simple_step=0.304, train/loss_vlb_step=0.00135, train/loss_step=0.304, global_step=4715.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  19%|█▉        | 1136/5971 [11:25<48:33,  1.66it/s, loss=0.265, v_num=0, train/loss_simple_step=0.304, train/loss_vlb_step=0.00135, train/loss_step=0.304, global_step=4715.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1136/5971 [11:25<48:33,  1.66it/s, loss=0.235, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000484, train/loss_step=0.143, global_step=4715.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1137/5971 [11:26<48:34,  1.66it/s, loss=0.235, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000484, train/loss_step=0.143, global_step=4715.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1137/5971 [11:26<48:34,  1.66it/s, loss=0.247, v_num=0, train/loss_simple_step=0.367, train/loss_vlb_step=0.00176, train/loss_step=0.367, global_step=4716.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  19%|█▉        | 1138/5971 [11:26<48:34,  1.66it/s, loss=0.247, v_num=0, train/loss_simple_step=0.367, train/loss_vlb_step=0.00176, train/loss_step=0.367, global_step=4716.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1138/5971 [11:26<48:34,  1.66it/s, loss=0.233, v_num=0, train/loss_simple_step=0.0355, train/loss_vlb_step=0.000131, train/loss_step=0.0355, global_step=4716.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1139/5971 [11:27<48:35,  1.66it/s, loss=0.233, v_num=0, train/loss_simple_step=0.0355, train/loss_vlb_step=0.000131, train/loss_step=0.0355, global_step=4716.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1139/5971 [11:27<48:35,  1.66it/s, loss=0.221, v_num=0, train/loss_simple_step=0.0486, train/loss_vlb_step=0.000168, train/loss_step=0.0486, global_step=4716.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1140/5971 [11:29<48:41,  1.65it/s, loss=0.221, v_num=0, train/loss_simple_step=0.0486, train/loss_vlb_step=0.000168, train/loss_step=0.0486, global_step=4716.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1140/5971 [11:29<48:41,  1.65it/s, loss=0.216, v_num=0, train/loss_simple_step=0.0286, train/loss_vlb_step=0.000111, train/loss_step=0.0286, global_step=4716.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1141/5971 [11:30<48:41,  1.65it/s, loss=0.216, v_num=0, train/loss_simple_step=0.0286, train/loss_vlb_step=0.000111, train/loss_step=0.0286, global_step=4716.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1141/5971 [11:30<48:41,  1.65it/s, loss=0.223, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=4717.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  19%|█▉        | 1142/5971 [11:31<48:42,  1.65it/s, loss=0.223, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000452, train/loss_step=0.137, global_step=4717.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1142/5971 [11:31<48:42,  1.65it/s, loss=0.218, v_num=0, train/loss_simple_step=0.00233, train/loss_vlb_step=1.33e-5, train/loss_step=0.00233, global_step=4717.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1143/5971 [11:32<48:42,  1.65it/s, loss=0.218, v_num=0, train/loss_simple_step=0.00233, train/loss_vlb_step=1.33e-5, train/loss_step=0.00233, global_step=4717.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1143/5971 [11:32<48:42,  1.65it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00358, train/loss_vlb_step=1.88e-5, train/loss_step=0.00358, global_step=4717.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1144/5971 [11:34<48:49,  1.65it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00358, train/loss_vlb_step=1.88e-5, train/loss_step=0.00358, global_step=4717.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1144/5971 [11:34<48:49,  1.65it/s, loss=0.193, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000587, train/loss_step=0.168, global_step=4717.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  19%|█▉        | 1145/5971 [11:35<48:50,  1.65it/s, loss=0.193, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000587, train/loss_step=0.168, global_step=4717.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1145/5971 [11:35<48:50,  1.65it/s, loss=0.179, v_num=0, train/loss_simple_step=0.00992, train/loss_vlb_step=4.69e-5, train/loss_step=0.00992, global_step=4718.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1146/5971 [11:36<48:51,  1.65it/s, loss=0.179, v_num=0, train/loss_simple_step=0.00992, train/loss_vlb_step=4.69e-5, train/loss_step=0.00992, global_step=4718.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1146/5971 [11:36<48:51,  1.65it/s, loss=0.191, v_num=0, train/loss_simple_step=0.768, train/loss_vlb_step=0.0494, train/loss_step=0.768, global_step=4718.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]     
Epoch 8:  19%|█▉        | 1147/5971 [11:37<48:51,  1.65it/s, loss=0.191, v_num=0, train/loss_simple_step=0.768, train/loss_vlb_step=0.0494, train/loss_step=0.768, global_step=4718.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1147/5971 [11:37<48:51,  1.65it/s, loss=0.193, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.00059, train/loss_step=0.168, global_step=4718.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1148/5971 [11:39<48:57,  1.64it/s, loss=0.193, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.00059, train/loss_step=0.168, global_step=4718.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1148/5971 [11:39<48:57,  1.64it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00355, train/loss_vlb_step=1.85e-5, train/loss_step=0.00355, global_step=4718.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1149/5971 [11:40<48:57,  1.64it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00355, train/loss_vlb_step=1.85e-5, train/loss_step=0.00355, global_step=4718.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1149/5971 [11:40<48:57,  1.64it/s, loss=0.194, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=4719.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  19%|█▉        | 1150/5971 [11:41<48:58,  1.64it/s, loss=0.194, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000396, train/loss_step=0.121, global_step=4719.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1150/5971 [11:41<48:58,  1.64it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00749, train/loss_vlb_step=3.46e-5, train/loss_step=0.00749, global_step=4719.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1151/5971 [11:42<48:58,  1.64it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00749, train/loss_vlb_step=3.46e-5, train/loss_step=0.00749, global_step=4719.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1151/5971 [11:42<48:58,  1.64it/s, loss=0.162, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.000837, train/loss_step=0.219, global_step=4719.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  19%|█▉        | 1152/5971 [11:44<49:05,  1.64it/s, loss=0.162, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.000837, train/loss_step=0.219, global_step=4719.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1152/5971 [11:44<49:05,  1.64it/s, loss=0.174, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00144, train/loss_step=0.320, global_step=4719.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  19%|█▉        | 1153/5971 [11:45<49:06,  1.64it/s, loss=0.174, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.00144, train/loss_step=0.320, global_step=4719.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1153/5971 [11:45<49:06,  1.64it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=5.44e-5, train/loss_step=0.0116, global_step=4720.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1154/5971 [11:46<49:07,  1.63it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=5.44e-5, train/loss_step=0.0116, global_step=4720.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1154/5971 [11:46<49:07,  1.63it/s, loss=0.145, v_num=0, train/loss_simple_step=0.032, train/loss_vlb_step=0.000122, train/loss_step=0.032, global_step=4720.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  19%|█▉        | 1155/5971 [11:47<49:07,  1.63it/s, loss=0.145, v_num=0, train/loss_simple_step=0.032, train/loss_vlb_step=0.000122, train/loss_step=0.032, global_step=4720.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1155/5971 [11:47<49:07,  1.63it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0622, train/loss_vlb_step=0.000215, train/loss_step=0.0622, global_step=4720.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1156/5971 [11:49<49:14,  1.63it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0622, train/loss_vlb_step=0.000215, train/loss_step=0.0622, global_step=4720.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1156/5971 [11:49<49:14,  1.63it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0957, train/loss_vlb_step=0.000317, train/loss_step=0.0957, global_step=4720.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  19%|█▉        | 1157/5971 [11:50<49:14,  1.63it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0957, train/loss_vlb_step=0.000317, train/loss_step=0.0957, global_step=4720.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1157/5971 [11:50<49:14,  1.63it/s, loss=0.138, v_num=0, train/loss_simple_step=0.521, train/loss_vlb_step=0.00457, train/loss_step=0.521, global_step=4721.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  19%|█▉        | 1158/5971 [11:51<49:15,  1.63it/s, loss=0.138, v_num=0, train/loss_simple_step=0.521, train/loss_vlb_step=0.00457, train/loss_step=0.521, global_step=4721.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1158/5971 [11:51<49:15,  1.63it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00873, train/loss_vlb_step=4.02e-5, train/loss_step=0.00873, global_step=4721.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1159/5971 [11:52<49:15,  1.63it/s, loss=0.137, v_num=0, train/loss_simple_step=0.00873, train/loss_vlb_step=4.02e-5, train/loss_step=0.00873, global_step=4721.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1159/5971 [11:52<49:15,  1.63it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00222, train/loss_vlb_step=1.29e-5, train/loss_step=0.00222, global_step=4721.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1160/5971 [11:54<49:21,  1.62it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00222, train/loss_vlb_step=1.29e-5, train/loss_step=0.00222, global_step=4721.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1160/5971 [11:54<49:21,  1.62it/s, loss=0.143, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.000785, train/loss_step=0.206, global_step=4721.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  19%|█▉        | 1161/5971 [11:55<49:21,  1.62it/s, loss=0.143, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.000785, train/loss_step=0.206, global_step=4721.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1161/5971 [11:55<49:21,  1.62it/s, loss=0.155, v_num=0, train/loss_simple_step=0.378, train/loss_vlb_step=0.00274, train/loss_step=0.378, global_step=4722.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  19%|█▉        | 1162/5971 [11:56<49:22,  1.62it/s, loss=0.155, v_num=0, train/loss_simple_step=0.378, train/loss_vlb_step=0.00274, train/loss_step=0.378, global_step=4722.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1162/5971 [11:56<49:22,  1.62it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00137, train/loss_vlb_step=8.09e-6, train/loss_step=0.00137, global_step=4722.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1163/5971 [11:57<49:22,  1.62it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00137, train/loss_vlb_step=8.09e-6, train/loss_step=0.00137, global_step=4722.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1163/5971 [11:57<49:22,  1.62it/s, loss=0.191, v_num=0, train/loss_simple_step=0.718, train/loss_vlb_step=0.0212, train/loss_step=0.718, global_step=4722.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]     
Epoch 8:  19%|█▉        | 1164/5971 [11:59<49:27,  1.62it/s, loss=0.191, v_num=0, train/loss_simple_step=0.718, train/loss_vlb_step=0.0212, train/loss_step=0.718, global_step=4722.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  19%|█▉        | 1164/5971 [11:59<49:27,  1.62it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0538, train/loss_vlb_step=0.000196, train/loss_step=0.0538, global_step=4722.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  20%|█▉        | 1165/5971 [12:00<49:28,  1.62it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0538, train/loss_vlb_step=0.000196, train/loss_step=0.0538, global_step=4722.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  20%|█▉        | 1165/5971 [12:00<49:28,  1.62it/s, loss=0.196, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000767, train/loss_step=0.216, global_step=4723.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  20%|█▉        | 1166/5971 [12:01<49:28,  1.62it/s, loss=0.196, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000767, train/loss_step=0.216, global_step=4723.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  20%|█▉        | 1166/5971 [12:01<49:28,  1.62it/s, loss=0.183, v_num=0, train/loss_simple_step=0.518, train/loss_vlb_step=0.00369, train/loss_step=0.518, global_step=4723.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  20%|█▉        | 1167/5971 [12:01<49:29,  1.62it/s, loss=0.183, v_num=0, train/loss_simple_step=0.518, train/loss_vlb_step=0.00369, train/loss_step=0.518, global_step=4723.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  20%|█▉        | 1167/5971 [12:01<49:29,  1.62it/s, loss=0.222, v_num=0, train/loss_simple_step=0.951, train/loss_vlb_step=0.479, train/loss_step=0.951, global_step=4723.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  20%|█▉        | 1168/5971 [12:04<49:34,  1.61it/s, loss=0.222, v_num=0, train/loss_simple_step=0.951, train/loss_vlb_step=0.479, train/loss_step=0.951, global_step=4723.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  20%|█▉        | 1168/5971 [12:04<49:34,  1.61it/s, loss=0.243, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00223, train/loss_step=0.426, global_step=4723.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  20%|█▉        | 1169/5971 [12:04<49:35,  1.61it/s, loss=0.243, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00223, train/loss_step=0.426, global_step=4723.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  20%|█▉        | 1169/5971 [12:04<49:35,  1.61it/s, loss=0.239, v_num=0, train/loss_simple_step=0.0371, train/loss_vlb_step=0.000133, train/loss_step=0.0371, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  20%|█▉        | 1170/5971 [12:05<49:35,  1.61it/s, loss=0.239, v_num=0, train/loss_simple_step=0.0371, train/loss_vlb_step=0.000133, train/loss_step=0.0371, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  20%|█▉        | 1170/5971 [12:05<49:35,  1.61it/s, loss=0.239, v_num=0, train/loss_simple_step=0.00242, train/loss_vlb_step=1.33e-5, train/loss_step=0.00242, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  20%|█▉        | 1171/5971 [12:06<49:36,  1.61it/s, loss=0.239, v_num=0, train/loss_simple_step=0.00242, train/loss_vlb_step=1.33e-5, train/loss_step=0.00242, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  20%|█▉        | 1171/5971 [12:06<49:36,  1.61it/s, loss=0.234, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000387, train/loss_step=0.117, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  20%|█▉        | 1172/5971 [12:08<49:42,  1.61it/s, loss=0.234, v_num=0, train/loss_simple_step=0.117, train/loss_vlb_step=0.000387, train/loss_step=0.117, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  20%|█▉        | 1172/5971 [12:09<49:42,  1.61it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:13,  2.26it/s][A
Epoch 8:  20%|█▉        | 1174/5971 [12:09<49:38,  1.61it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   2%|▏         | 3/167 [00:00<00:27,  6.05it/s][A
Epoch 8:  20%|█▉        | 1176/5971 [12:09<49:32,  1.61it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   4%|▎         | 6/167 [00:00<00:13, 11.74it/s][A
Epoch 8:  20%|█▉        | 1179/5971 [12:09<49:23,  1.62it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   5%|▌         | 9/167 [00:00<00:10, 14.90it/s][A
Epoch 8:  20%|█▉        | 1182/5971 [12:09<49:14,  1.62it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   7%|▋         | 12/167 [00:00<00:08, 18.31it/s][A
Epoch 8:  20%|█▉        | 1185/5971 [12:09<49:05,  1.62it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   9%|▉         | 15/167 [00:01<00:07, 21.12it/s][A
Epoch 8:  20%|█▉        | 1188/5971 [12:10<48:56,  1.63it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  11%|█         | 18/167 [00:01<00:06, 23.22it/s][A
Epoch 8:  20%|█▉        | 1191/5971 [12:10<48:48,  1.63it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 24.11it/s][A
Epoch 8:  20%|█▉        | 1194/5971 [12:10<48:39,  1.64it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  15%|█▍        | 25/167 [00:01<00:05, 24.56it/s][A
Epoch 8:  20%|██        | 1198/5971 [12:10<48:27,  1.64it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  17%|█▋        | 28/167 [00:01<00:05, 24.52it/s][A
Epoch 8:  20%|██        | 1202/5971 [12:10<48:16,  1.65it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  19%|█▊        | 31/167 [00:01<00:05, 23.69it/s][A
Epoch 8:  20%|██        | 1206/5971 [12:10<48:05,  1.65it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  20%|██        | 34/167 [00:01<00:05, 24.15it/s][A

Validating:  22%|██▏       | 37/167 [00:01<00:05, 24.31it/s][A
Epoch 8:  20%|██        | 1210/5971 [12:10<47:53,  1.66it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  24%|██▍       | 40/167 [00:02<00:05, 24.75it/s][A
Epoch 8:  20%|██        | 1214/5971 [12:11<47:42,  1.66it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  26%|██▌       | 43/167 [00:02<00:04, 25.74it/s][A
Epoch 8:  20%|██        | 1218/5971 [12:11<47:31,  1.67it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  28%|██▊       | 46/167 [00:02<00:04, 26.45it/s][A

Validating:  29%|██▉       | 49/167 [00:02<00:04, 25.88it/s][A
Epoch 8:  20%|██        | 1222/5971 [12:11<47:20,  1.67it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 27.00it/s][A
Epoch 8:  21%|██        | 1226/5971 [12:11<47:08,  1.68it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 23.68it/s][A
Epoch 8:  21%|██        | 1230/5971 [12:11<46:58,  1.68it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  35%|███▌      | 59/167 [00:02<00:04, 22.05it/s][A
Epoch 8:  21%|██        | 1234/5971 [12:11<46:47,  1.69it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  38%|███▊      | 63/167 [00:02<00:04, 23.97it/s][A
Epoch 8:  21%|██        | 1238/5971 [12:12<46:36,  1.69it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  40%|███▉      | 66/167 [00:03<00:04, 24.35it/s][A

Validating:  41%|████▏     | 69/167 [00:03<00:03, 25.13it/s][A
Epoch 8:  21%|██        | 1242/5971 [12:12<46:25,  1.70it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 23.90it/s][A
Epoch 8:  21%|██        | 1246/5971 [12:12<46:15,  1.70it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 24.76it/s][A
Epoch 8:  21%|██        | 1250/5971 [12:12<46:04,  1.71it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 26.05it/s][A

Validating:  49%|████▊     | 81/167 [00:03<00:03, 26.85it/s][A
Epoch 8:  21%|██        | 1254/5971 [12:12<45:53,  1.71it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  50%|█████     | 84/167 [00:03<00:03, 26.49it/s][A
Epoch 8:  21%|██        | 1258/5971 [12:12<45:43,  1.72it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  52%|█████▏    | 87/167 [00:03<00:03, 26.26it/s][A
Epoch 8:  21%|██        | 1262/5971 [12:12<45:32,  1.72it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  54%|█████▍    | 90/167 [00:04<00:03, 25.18it/s][A

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 25.43it/s][A
Epoch 8:  21%|██        | 1266/5971 [12:13<45:22,  1.73it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 25.28it/s][A
Epoch 8:  21%|██▏       | 1270/5971 [12:13<45:12,  1.73it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  59%|█████▉    | 99/167 [00:04<00:03, 22.18it/s][A
Epoch 8:  21%|██▏       | 1274/5971 [12:13<45:02,  1.74it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  61%|██████    | 102/167 [00:04<00:02, 22.47it/s][A

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 23.34it/s][A
Epoch 8:  21%|██▏       | 1278/5971 [12:13<44:52,  1.74it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 24.77it/s][A
Epoch 8:  21%|██▏       | 1282/5971 [12:13<44:41,  1.75it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 25.71it/s][A
Epoch 8:  22%|██▏       | 1286/5971 [12:13<44:31,  1.75it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  69%|██████▉   | 115/167 [00:05<00:01, 27.10it/s][A
Epoch 8:  22%|██▏       | 1290/5971 [12:14<44:21,  1.76it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 28.46it/s][A
Epoch 8:  22%|██▏       | 1294/5971 [12:14<44:11,  1.76it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 27.44it/s][A

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 27.45it/s][A
Epoch 8:  22%|██▏       | 1298/5971 [12:14<44:01,  1.77it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 27.85it/s][A
Epoch 8:  22%|██▏       | 1302/5971 [12:14<43:52,  1.77it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 28.19it/s][A
Epoch 8:  22%|██▏       | 1306/5971 [12:14<43:42,  1.78it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  80%|████████  | 134/167 [00:05<00:01, 28.58it/s][A

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 28.37it/s][A
Epoch 8:  22%|██▏       | 1310/5971 [12:14<43:32,  1.78it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  84%|████████▍ | 141/167 [00:05<00:00, 28.93it/s][A
Epoch 8:  22%|██▏       | 1314/5971 [12:14<43:22,  1.79it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 27.30it/s][A
Epoch 8:  22%|██▏       | 1318/5971 [12:15<43:13,  1.79it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 26.39it/s][A
Epoch 8:  22%|██▏       | 1322/5971 [12:15<43:03,  1.80it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 27.17it/s][A

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 27.21it/s][A
Epoch 8:  22%|██▏       | 1326/5971 [12:15<42:54,  1.80it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 27.27it/s][A
Epoch 8:  22%|██▏       | 1330/5971 [12:15<42:44,  1.81it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 25.39it/s][A
Epoch 8:  22%|██▏       | 1334/5971 [12:15<42:35,  1.81it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 25.43it/s][A

Validating:  99%|█████████▉| 165/167 [00:06<00:00, 26.37it/s][A
Epoch 8:  22%|██▏       | 1338/5971 [12:15<42:26,  1.82it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  22%|██▏       | 1340/5971 [12:16<42:22,  1.82it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.74e-5, train/loss_step=0.0118, global_step=4724.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Epoch 8:  22%|██▏       | 1341/5971 [12:17<42:23,  1.82it/s, loss=0.22, v_num=0, train/loss_simple_step=0.0449, train/loss_vlb_step=0.000157, train/loss_step=0.0449, global_step=4725.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  22%|██▏       | 1342/5971 [12:18<42:24,  1.82it/s, loss=0.22, v_num=0, train/loss_simple_step=0.0449, train/loss_vlb_step=0.000157, train/loss_step=0.0449, global_step=4725.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  22%|██▏       | 1342/5971 [12:18<42:24,  1.82it/s, loss=0.235, v_num=0, train/loss_simple_step=0.325, train/loss_vlb_step=0.00138, train/loss_step=0.325, global_step=4725.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  22%|██▏       | 1343/5971 [12:18<42:24,  1.82it/s, loss=0.236, v_num=0, train/loss_simple_step=0.0926, train/loss_vlb_step=0.000304, train/loss_step=0.0926, global_step=4725.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1344/5971 [12:21<42:30,  1.81it/s, loss=0.233, v_num=0, train/loss_simple_step=0.027, train/loss_vlb_step=0.000106, train/loss_step=0.027, global_step=4725.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  23%|██▎       | 1345/5971 [12:22<42:31,  1.81it/s, loss=0.207, v_num=0, train/loss_simple_step=0.00268, train/loss_vlb_step=1.45e-5, train/loss_step=0.00268, global_step=4726.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1346/5971 [12:23<42:32,  1.81it/s, loss=0.207, v_num=0, train/loss_simple_step=0.00268, train/loss_vlb_step=1.45e-5, train/loss_step=0.00268, global_step=4726.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1346/5971 [12:23<42:32,  1.81it/s, loss=0.209, v_num=0, train/loss_simple_step=0.0551, train/loss_vlb_step=0.000188, train/loss_step=0.0551, global_step=4726.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  23%|██▎       | 1347/5971 [12:24<42:32,  1.81it/s, loss=0.209, v_num=0, train/loss_simple_step=0.00144, train/loss_vlb_step=8.14e-6, train/loss_step=0.00144, global_step=4726.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1348/5971 [12:26<42:38,  1.81it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0194, train/loss_vlb_step=7.58e-5, train/loss_step=0.0194, global_step=4726.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  23%|██▎       | 1349/5971 [12:27<42:39,  1.81it/s, loss=0.181, v_num=0, train/loss_simple_step=0.00634, train/loss_vlb_step=3.15e-5, train/loss_step=0.00634, global_step=4727.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1350/5971 [12:28<42:39,  1.81it/s, loss=0.181, v_num=0, train/loss_simple_step=0.00634, train/loss_vlb_step=3.15e-5, train/loss_step=0.00634, global_step=4727.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1350/5971 [12:28<42:39,  1.81it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0401, train/loss_vlb_step=0.000145, train/loss_step=0.0401, global_step=4727.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  23%|██▎       | 1351/5971 [12:29<42:40,  1.80it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0022, train/loss_vlb_step=1.27e-5, train/loss_step=0.0022, global_step=4727.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  23%|██▎       | 1352/5971 [12:31<42:45,  1.80it/s, loss=0.152, v_num=0, train/loss_simple_step=0.146, train/loss_vlb_step=0.000493, train/loss_step=0.146, global_step=4727.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  23%|██▎       | 1353/5971 [12:32<42:45,  1.80it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0426, train/loss_vlb_step=0.000154, train/loss_step=0.0426, global_step=4728.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1354/5971 [12:33<42:46,  1.80it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0426, train/loss_vlb_step=0.000154, train/loss_step=0.0426, global_step=4728.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1354/5971 [12:33<42:46,  1.80it/s, loss=0.138, v_num=0, train/loss_simple_step=0.415, train/loss_vlb_step=0.00356, train/loss_step=0.415, global_step=4728.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  23%|██▎       | 1355/5971 [12:34<42:46,  1.80it/s, loss=0.0909, v_num=0, train/loss_simple_step=0.00351, train/loss_vlb_step=1.85e-5, train/loss_step=0.00351, global_step=4728.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1356/5971 [12:36<42:52,  1.79it/s, loss=0.0709, v_num=0, train/loss_simple_step=0.0246, train/loss_vlb_step=9.27e-5, train/loss_step=0.0246, global_step=4728.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  23%|██▎       | 1357/5971 [12:37<42:52,  1.79it/s, loss=0.0697, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=5.93e-5, train/loss_step=0.0139, global_step=4729.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1358/5971 [12:38<42:53,  1.79it/s, loss=0.0697, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=5.93e-5, train/loss_step=0.0139, global_step=4729.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1358/5971 [12:38<42:53,  1.79it/s, loss=0.0703, v_num=0, train/loss_simple_step=0.0147, train/loss_vlb_step=6.2e-5, train/loss_step=0.0147, global_step=4729.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  23%|██▎       | 1359/5971 [12:38<42:53,  1.79it/s, loss=0.0652, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.44e-5, train/loss_step=0.0149, global_step=4729.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1360/5971 [12:41<42:58,  1.79it/s, loss=0.0721, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000495, train/loss_step=0.150, global_step=4729.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  23%|██▎       | 1361/5971 [12:41<42:59,  1.79it/s, loss=0.076, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000408, train/loss_step=0.122, global_step=4730.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  23%|██▎       | 1362/5971 [12:42<42:59,  1.79it/s, loss=0.076, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000408, train/loss_step=0.122, global_step=4730.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1362/5971 [12:42<42:59,  1.79it/s, loss=0.0685, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000639, train/loss_step=0.175, global_step=4730.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1363/5971 [12:43<43:00,  1.79it/s, loss=0.064, v_num=0, train/loss_simple_step=0.00216, train/loss_vlb_step=1.19e-5, train/loss_step=0.00216, global_step=4730.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1364/5971 [12:45<43:04,  1.78it/s, loss=0.0781, v_num=0, train/loss_simple_step=0.309, train/loss_vlb_step=0.00126, train/loss_step=0.309, global_step=4730.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  23%|██▎       | 1365/5971 [12:46<43:05,  1.78it/s, loss=0.0785, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=4.95e-5, train/loss_step=0.011, global_step=4731.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1366/5971 [12:47<43:05,  1.78it/s, loss=0.0785, v_num=0, train/loss_simple_step=0.011, train/loss_vlb_step=4.95e-5, train/loss_step=0.011, global_step=4731.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1366/5971 [12:47<43:05,  1.78it/s, loss=0.0758, v_num=0, train/loss_simple_step=0.0023, train/loss_vlb_step=1.29e-5, train/loss_step=0.0023, global_step=4731.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1367/5971 [12:48<43:06,  1.78it/s, loss=0.0789, v_num=0, train/loss_simple_step=0.0622, train/loss_vlb_step=0.000214, train/loss_step=0.0622, global_step=4731.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1368/5971 [12:50<43:11,  1.78it/s, loss=0.0826, v_num=0, train/loss_simple_step=0.0937, train/loss_vlb_step=0.000308, train/loss_step=0.0937, global_step=4731.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1369/5971 [12:51<43:11,  1.78it/s, loss=0.101, v_num=0, train/loss_simple_step=0.381, train/loss_vlb_step=0.00209, train/loss_step=0.381, global_step=4732.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  23%|██▎       | 1370/5971 [12:52<43:12,  1.77it/s, loss=0.101, v_num=0, train/loss_simple_step=0.381, train/loss_vlb_step=0.00209, train/loss_step=0.381, global_step=4732.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1370/5971 [12:52<43:12,  1.77it/s, loss=0.107, v_num=0, train/loss_simple_step=0.156, train/loss_vlb_step=0.000521, train/loss_step=0.156, global_step=4732.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1371/5971 [12:53<43:12,  1.77it/s, loss=0.118, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.000736, train/loss_step=0.210, global_step=4732.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1372/5971 [12:55<43:17,  1.77it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00444, train/loss_vlb_step=2.19e-5, train/loss_step=0.00444, global_step=4732.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1373/5971 [12:56<43:18,  1.77it/s, loss=0.129, v_num=0, train/loss_simple_step=0.407, train/loss_vlb_step=0.00195, train/loss_step=0.407, global_step=4733.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  23%|██▎       | 1374/5971 [12:57<43:18,  1.77it/s, loss=0.129, v_num=0, train/loss_simple_step=0.407, train/loss_vlb_step=0.00195, train/loss_step=0.407, global_step=4733.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1374/5971 [12:57<43:18,  1.77it/s, loss=0.119, v_num=0, train/loss_simple_step=0.219, train/loss_vlb_step=0.000801, train/loss_step=0.219, global_step=4733.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1375/5971 [12:58<43:19,  1.77it/s, loss=0.119, v_num=0, train/loss_simple_step=0.00588, train/loss_vlb_step=2.96e-5, train/loss_step=0.00588, global_step=4733.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1376/5971 [13:00<43:24,  1.76it/s, loss=0.118, v_num=0, train/loss_simple_step=0.012, train/loss_vlb_step=5.54e-5, train/loss_step=0.012, global_step=4733.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  23%|██▎       | 1377/5971 [13:01<43:25,  1.76it/s, loss=0.136, v_num=0, train/loss_simple_step=0.374, train/loss_vlb_step=0.00174, train/loss_step=0.374, global_step=4734.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1378/5971 [13:02<43:25,  1.76it/s, loss=0.136, v_num=0, train/loss_simple_step=0.374, train/loss_vlb_step=0.00174, train/loss_step=0.374, global_step=4734.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1378/5971 [13:02<43:25,  1.76it/s, loss=0.155, v_num=0, train/loss_simple_step=0.387, train/loss_vlb_step=0.00213, train/loss_step=0.387, global_step=4734.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1379/5971 [13:03<43:26,  1.76it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00337, train/loss_vlb_step=1.8e-5, train/loss_step=0.00337, global_step=4734.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1380/5971 [13:05<43:30,  1.76it/s, loss=0.172, v_num=0, train/loss_simple_step=0.493, train/loss_vlb_step=0.00418, train/loss_step=0.493, global_step=4734.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  23%|██▎       | 1381/5971 [13:06<43:31,  1.76it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0208, train/loss_vlb_step=8.23e-5, train/loss_step=0.0208, global_step=4735.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1382/5971 [13:07<43:31,  1.76it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0208, train/loss_vlb_step=8.23e-5, train/loss_step=0.0208, global_step=4735.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1382/5971 [13:07<43:31,  1.76it/s, loss=0.161, v_num=0, train/loss_simple_step=0.057, train/loss_vlb_step=0.000198, train/loss_step=0.057, global_step=4735.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  23%|██▎       | 1383/5971 [13:08<43:32,  1.76it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00579, train/loss_vlb_step=2.88e-5, train/loss_step=0.00579, global_step=4735.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1384/5971 [13:10<43:37,  1.75it/s, loss=0.152, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000419, train/loss_step=0.127, global_step=4735.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  23%|██▎       | 1385/5971 [13:11<43:37,  1.75it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=4.95e-5, train/loss_step=0.0109, global_step=4736.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1386/5971 [13:12<43:38,  1.75it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=4.95e-5, train/loss_step=0.0109, global_step=4736.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1386/5971 [13:12<43:38,  1.75it/s, loss=0.154, v_num=0, train/loss_simple_step=0.057, train/loss_vlb_step=0.000201, train/loss_step=0.057, global_step=4736.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  23%|██▎       | 1387/5971 [13:12<43:38,  1.75it/s, loss=0.158, v_num=0, train/loss_simple_step=0.137, train/loss_vlb_step=0.000451, train/loss_step=0.137, global_step=4736.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1388/5971 [13:14<43:43,  1.75it/s, loss=0.17, v_num=0, train/loss_simple_step=0.325, train/loss_vlb_step=0.00166, train/loss_step=0.325, global_step=4736.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  23%|██▎       | 1389/5971 [13:15<43:43,  1.75it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0214, train/loss_vlb_step=8.21e-5, train/loss_step=0.0214, global_step=4737.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1390/5971 [13:16<43:44,  1.75it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0214, train/loss_vlb_step=8.21e-5, train/loss_step=0.0214, global_step=4737.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1390/5971 [13:16<43:44,  1.75it/s, loss=0.163, v_num=0, train/loss_simple_step=0.375, train/loss_vlb_step=0.00231, train/loss_step=0.375, global_step=4737.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  23%|██▎       | 1391/5971 [13:17<43:44,  1.75it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0151, train/loss_vlb_step=6.34e-5, train/loss_step=0.0151, global_step=4737.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1392/5971 [13:19<43:48,  1.74it/s, loss=0.171, v_num=0, train/loss_simple_step=0.371, train/loss_vlb_step=0.0017, train/loss_step=0.371, global_step=4737.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  23%|██▎       | 1393/5971 [13:20<43:49,  1.74it/s, loss=0.157, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000426, train/loss_step=0.128, global_step=4738.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1394/5971 [13:21<43:49,  1.74it/s, loss=0.157, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000426, train/loss_step=0.128, global_step=4738.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1394/5971 [13:21<43:49,  1.74it/s, loss=0.152, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000345, train/loss_step=0.104, global_step=4738.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1395/5971 [13:22<43:50,  1.74it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00941, train/loss_vlb_step=4.39e-5, train/loss_step=0.00941, global_step=4738.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1396/5971 [13:24<43:55,  1.74it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0962, train/loss_vlb_step=0.000316, train/loss_step=0.0962, global_step=4738.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  23%|██▎       | 1397/5971 [13:25<43:56,  1.74it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0503, train/loss_vlb_step=0.000172, train/loss_step=0.0503, global_step=4739.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  23%|██▎       | 1398/5971 [13:26<43:56,  1.73it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0503, train/loss_vlb_step=0.000172, train/loss_step=0.0503, global_step=4739.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1398/5971 [13:26<43:56,  1.73it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00406, train/loss_vlb_step=2.15e-5, train/loss_step=0.00406, global_step=4739.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1399/5971 [13:27<43:57,  1.73it/s, loss=0.129, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000611, train/loss_step=0.173, global_step=4739.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  23%|██▎       | 1400/5971 [13:29<44:01,  1.73it/s, loss=0.112, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000557, train/loss_step=0.157, global_step=4739.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1401/5971 [13:30<44:01,  1.73it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0163, train/loss_vlb_step=7.16e-5, train/loss_step=0.0163, global_step=4740.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1402/5971 [13:31<44:02,  1.73it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0163, train/loss_vlb_step=7.16e-5, train/loss_step=0.0163, global_step=4740.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1402/5971 [13:31<44:02,  1.73it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0806, train/loss_vlb_step=0.000276, train/loss_step=0.0806, global_step=4740.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  23%|██▎       | 1403/5971 [13:32<44:02,  1.73it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0807, train/loss_vlb_step=0.000273, train/loss_step=0.0807, global_step=4740.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▎       | 1404/5971 [13:34<44:07,  1.72it/s, loss=0.131, v_num=0, train/loss_simple_step=0.413, train/loss_vlb_step=0.00254, train/loss_step=0.413, global_step=4740.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  24%|██▎       | 1405/5971 [13:35<44:08,  1.72it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=5.02e-5, train/loss_step=0.0114, global_step=4741.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▎       | 1406/5971 [13:36<44:08,  1.72it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=5.02e-5, train/loss_step=0.0114, global_step=4741.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▎       | 1406/5971 [13:36<44:08,  1.72it/s, loss=0.133, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000331, train/loss_step=0.101, global_step=4741.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  24%|██▎       | 1407/5971 [13:37<44:09,  1.72it/s, loss=0.133, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000395, train/loss_step=0.120, global_step=4741.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▎       | 1408/5971 [13:39<44:13,  1.72it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00135, train/loss_vlb_step=8.13e-6, train/loss_step=0.00135, global_step=4741.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▎       | 1409/5971 [13:40<44:13,  1.72it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0769, train/loss_vlb_step=0.000257, train/loss_step=0.0769, global_step=4742.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  24%|██▎       | 1410/5971 [13:41<44:14,  1.72it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0769, train/loss_vlb_step=0.000257, train/loss_step=0.0769, global_step=4742.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▎       | 1410/5971 [13:41<44:14,  1.72it/s, loss=0.106, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.00038, train/loss_step=0.114, global_step=4742.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  24%|██▎       | 1411/5971 [13:42<44:14,  1.72it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00846, train/loss_vlb_step=4.06e-5, train/loss_step=0.00846, global_step=4742.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▎       | 1412/5971 [13:44<44:19,  1.71it/s, loss=0.0919, v_num=0, train/loss_simple_step=0.0908, train/loss_vlb_step=0.000306, train/loss_step=0.0908, global_step=4742.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▎       | 1413/5971 [13:45<44:20,  1.71it/s, loss=0.091, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.00037, train/loss_step=0.112, global_step=4743.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  24%|██▎       | 1414/5971 [13:46<44:20,  1.71it/s, loss=0.091, v_num=0, train/loss_simple_step=0.112, train/loss_vlb_step=0.00037, train/loss_step=0.112, global_step=4743.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▎       | 1414/5971 [13:46<44:20,  1.71it/s, loss=0.0913, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.00036, train/loss_step=0.108, global_step=4743.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▎       | 1415/5971 [13:46<44:20,  1.71it/s, loss=0.0958, v_num=0, train/loss_simple_step=0.0994, train/loss_vlb_step=0.000334, train/loss_step=0.0994, global_step=4743.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▎       | 1416/5971 [13:49<44:25,  1.71it/s, loss=0.0974, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000424, train/loss_step=0.129, global_step=4743.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  24%|██▎       | 1417/5971 [13:49<44:25,  1.71it/s, loss=0.102, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000479, train/loss_step=0.143, global_step=4744.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  24%|██▎       | 1418/5971 [13:50<44:26,  1.71it/s, loss=0.102, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000479, train/loss_step=0.143, global_step=4744.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▎       | 1418/5971 [13:50<44:26,  1.71it/s, loss=0.116, v_num=0, train/loss_simple_step=0.282, train/loss_vlb_step=0.00113, train/loss_step=0.282, global_step=4744.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  24%|██▍       | 1419/5971 [13:51<44:26,  1.71it/s, loss=0.116, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000628, train/loss_step=0.184, global_step=4744.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▍       | 1420/5971 [13:53<44:30,  1.70it/s, loss=0.124, v_num=0, train/loss_simple_step=0.314, train/loss_vlb_step=0.00158, train/loss_step=0.314, global_step=4744.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  24%|██▍       | 1421/5971 [13:54<44:30,  1.70it/s, loss=0.135, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000779, train/loss_step=0.225, global_step=4745.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▍       | 1422/5971 [13:55<44:31,  1.70it/s, loss=0.135, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000779, train/loss_step=0.225, global_step=4745.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▍       | 1422/5971 [13:55<44:31,  1.70it/s, loss=0.132, v_num=0, train/loss_simple_step=0.033, train/loss_vlb_step=0.000126, train/loss_step=0.033, global_step=4745.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▍       | 1423/5971 [13:56<44:31,  1.70it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0672, train/loss_vlb_step=0.000231, train/loss_step=0.0672, global_step=4745.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▍       | 1424/5971 [13:58<44:36,  1.70it/s, loss=0.13, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00198, train/loss_step=0.382, global_step=4745.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  24%|██▍       | 1425/5971 [13:59<44:36,  1.70it/s, loss=0.144, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.00114, train/loss_step=0.289, global_step=4746.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▍       | 1426/5971 [14:00<44:37,  1.70it/s, loss=0.144, v_num=0, train/loss_simple_step=0.289, train/loss_vlb_step=0.00114, train/loss_step=0.289, global_step=4746.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▍       | 1426/5971 [14:00<44:37,  1.70it/s, loss=0.146, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000483, train/loss_step=0.147, global_step=4746.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▍       | 1427/5971 [14:01<44:37,  1.70it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00481, train/loss_vlb_step=2.43e-5, train/loss_step=0.00481, global_step=4746.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▍       | 1428/5971 [14:03<44:41,  1.69it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0163, train/loss_vlb_step=6.99e-5, train/loss_step=0.0163, global_step=4746.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  24%|██▍       | 1429/5971 [14:04<44:42,  1.69it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00537, train/loss_vlb_step=2.63e-5, train/loss_step=0.00537, global_step=4747.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▍       | 1430/5971 [14:05<44:42,  1.69it/s, loss=0.138, v_num=0, train/loss_simple_step=0.00537, train/loss_vlb_step=2.63e-5, train/loss_step=0.00537, global_step=4747.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▍       | 1430/5971 [14:05<44:42,  1.69it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0505, train/loss_vlb_step=0.000178, train/loss_step=0.0505, global_step=4747.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  24%|██▍       | 1431/5971 [14:06<44:42,  1.69it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0623, train/loss_vlb_step=0.000209, train/loss_step=0.0623, global_step=4747.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▍       | 1432/5971 [14:08<44:47,  1.69it/s, loss=0.155, v_num=0, train/loss_simple_step=0.449, train/loss_vlb_step=0.00441, train/loss_step=0.449, global_step=4747.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  24%|██▍       | 1433/5971 [14:09<44:48,  1.69it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00924, train/loss_vlb_step=4.27e-5, train/loss_step=0.00924, global_step=4748.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▍       | 1434/5971 [14:10<44:48,  1.69it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00924, train/loss_vlb_step=4.27e-5, train/loss_step=0.00924, global_step=4748.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▍       | 1434/5971 [14:10<44:48,  1.69it/s, loss=0.171, v_num=0, train/loss_simple_step=0.529, train/loss_vlb_step=0.00822, train/loss_step=0.529, global_step=4748.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  24%|██▍       | 1435/5971 [14:11<44:48,  1.69it/s, loss=0.172, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000372, train/loss_step=0.113, global_step=4748.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▍       | 1436/5971 [14:13<44:52,  1.68it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00182, train/loss_vlb_step=1.04e-5, train/loss_step=0.00182, global_step=4748.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▍       | 1437/5971 [14:14<44:53,  1.68it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0246, train/loss_vlb_step=9.55e-5, train/loss_step=0.0246, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  24%|██▍       | 1438/5971 [14:15<44:53,  1.68it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0246, train/loss_vlb_step=9.55e-5, train/loss_step=0.0246, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▍       | 1438/5971 [14:15<44:53,  1.68it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0126, train/loss_vlb_step=5.51e-5, train/loss_step=0.0126, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▍       | 1439/5971 [14:15<44:53,  1.68it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0298, train/loss_vlb_step=0.000118, train/loss_step=0.0298, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  24%|██▍       | 1440/5971 [14:18<44:58,  1.68it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:15,  2.21it/s]
Epoch 8:  24%|██▍       | 1442/5971 [14:18<44:54,  1.68it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   2%|▏         | 3/167 [00:00<00:26,  6.16it/s]
Epoch 8:  24%|██▍       | 1446/5971 [14:18<44:45,  1.69it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   4%|▍         | 7/167 [00:00<00:11, 13.40it/s]
Epoch 8:  24%|██▍       | 1450/5971 [14:18<44:36,  1.69it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   6%|▌         | 10/167 [00:00<00:09, 16.45it/s]
Epoch 8:  24%|██▍       | 1454/5971 [14:19<44:26,  1.69it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   8%|▊         | 14/167 [00:00<00:07, 20.47it/s]
Epoch 8:  24%|██▍       | 1458/5971 [14:19<44:17,  1.70it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  11%|█         | 18/167 [00:01<00:06, 22.64it/s]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 23.54it/s]
Epoch 8:  24%|██▍       | 1462/5971 [14:19<44:08,  1.70it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  15%|█▍        | 25/167 [00:01<00:05, 25.65it/s]
Epoch 8:  25%|██▍       | 1466/5971 [14:19<43:59,  1.71it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  17%|█▋        | 28/167 [00:01<00:05, 26.23it/s]
Epoch 8:  25%|██▍       | 1470/5971 [14:19<43:50,  1.71it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  19%|█▊        | 31/167 [00:01<00:05, 26.07it/s]
Epoch 8:  25%|██▍       | 1474/5971 [14:19<43:41,  1.72it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  20%|██        | 34/167 [00:01<00:05, 26.27it/s]

Validating:  22%|██▏       | 37/167 [00:01<00:05, 25.39it/s]
Epoch 8:  25%|██▍       | 1478/5971 [14:19<43:32,  1.72it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  24%|██▍       | 40/167 [00:01<00:04, 26.01it/s]
Epoch 8:  25%|██▍       | 1482/5971 [14:20<43:23,  1.72it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  26%|██▌       | 43/167 [00:02<00:04, 26.79it/s]
Epoch 8:  25%|██▍       | 1486/5971 [14:20<43:14,  1.73it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  28%|██▊       | 46/167 [00:02<00:04, 26.82it/s]

Validating:  29%|██▉       | 49/167 [00:02<00:04, 26.31it/s]
Epoch 8:  25%|██▍       | 1490/5971 [14:20<43:05,  1.73it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  31%|███       | 52/167 [00:02<00:04, 26.81it/s]
Epoch 8:  25%|██▌       | 1494/5971 [14:20<42:56,  1.74it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 27.60it/s]
Epoch 8:  25%|██▌       | 1498/5971 [14:20<42:48,  1.74it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  35%|███▌      | 59/167 [00:02<00:04, 26.96it/s]
Epoch 8:  25%|██▌       | 1502/5971 [14:20<42:39,  1.75it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  37%|███▋      | 62/167 [00:02<00:03, 27.01it/s]

Validating:  39%|███▉      | 65/167 [00:02<00:03, 27.42it/s]
Epoch 8:  25%|██▌       | 1506/5971 [14:20<42:30,  1.75it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  41%|████▏     | 69/167 [00:02<00:03, 28.83it/s]
Epoch 8:  25%|██▌       | 1510/5971 [14:21<42:22,  1.75it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 27.77it/s]
Epoch 8:  25%|██▌       | 1514/5971 [14:21<42:13,  1.76it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 28.21it/s]
Epoch 8:  25%|██▌       | 1518/5971 [14:21<42:05,  1.76it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 28.16it/s]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 27.12it/s]
Epoch 8:  25%|██▌       | 1522/5971 [14:21<41:56,  1.77it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  51%|█████     | 85/167 [00:03<00:02, 28.20it/s]
Epoch 8:  26%|██▌       | 1526/5971 [14:21<41:48,  1.77it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  53%|█████▎    | 88/167 [00:03<00:02, 28.67it/s]
Epoch 8:  26%|██▌       | 1530/5971 [14:21<41:39,  1.78it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  54%|█████▍    | 91/167 [00:03<00:02, 27.60it/s]
Epoch 8:  26%|██▌       | 1534/5971 [14:21<41:31,  1.78it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  56%|█████▋    | 94/167 [00:03<00:02, 25.58it/s]

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 25.48it/s]
Epoch 8:  26%|██▌       | 1538/5971 [14:22<41:23,  1.79it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 25.84it/s]
Epoch 8:  26%|██▌       | 1542/5971 [14:22<41:15,  1.79it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 26.96it/s]
Epoch 8:  26%|██▌       | 1546/5971 [14:22<41:06,  1.79it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 27.70it/s]
Epoch 8:  26%|██▌       | 1550/5971 [14:22<40:58,  1.80it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 26.65it/s]

Validating:  68%|██████▊   | 113/167 [00:04<00:01, 27.13it/s]
Epoch 8:  26%|██▌       | 1554/5971 [14:22<40:50,  1.80it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  69%|██████▉   | 116/167 [00:04<00:02, 25.49it/s]
Epoch 8:  26%|██▌       | 1558/5971 [14:22<40:42,  1.81it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  71%|███████▏  | 119/167 [00:04<00:01, 25.40it/s]
Epoch 8:  26%|██▌       | 1562/5971 [14:23<40:34,  1.81it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  73%|███████▎  | 122/167 [00:04<00:01, 25.68it/s]

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 24.93it/s]
Epoch 8:  26%|██▌       | 1566/5971 [14:23<40:26,  1.82it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 26.36it/s][A
Epoch 8:  26%|██▋       | 1570/5971 [14:23<40:18,  1.82it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 26.78it/s][A
Epoch 8:  26%|██▋       | 1574/5971 [14:23<40:10,  1.82it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  81%|████████  | 135/167 [00:05<00:01, 26.22it/s][A
Epoch 8:  26%|██▋       | 1578/5971 [14:23<40:02,  1.83it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 26.27it/s][A

Validating:  84%|████████▍ | 141/167 [00:05<00:00, 26.47it/s][A
Epoch 8:  26%|██▋       | 1582/5971 [14:23<39:54,  1.83it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  86%|████████▌ | 144/167 [00:05<00:00, 26.07it/s][A
Epoch 8:  27%|██▋       | 1586/5971 [14:23<39:47,  1.84it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  88%|████████▊ | 147/167 [00:05<00:00, 26.54it/s][A
Epoch 8:  27%|██▋       | 1590/5971 [14:24<39:39,  1.84it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 27.29it/s][A

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 27.35it/s][A
Epoch 8:  27%|██▋       | 1594/5971 [14:24<39:31,  1.85it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 26.16it/s][A
Epoch 8:  27%|██▋       | 1598/5971 [14:24<39:23,  1.85it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 26.46it/s][A
Epoch 8:  27%|██▋       | 1602/5971 [14:24<39:16,  1.85it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 26.02it/s][A

Validating:  99%|█████████▉| 165/167 [00:06<00:00, 27.00it/s][A
Epoch 8:  27%|██▋       | 1606/5971 [14:24<39:08,  1.86it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  27%|██▋       | 1608/5971 [14:25<39:05,  1.86it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.51e-6, train/loss_step=0.00156, global_step=4749.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Epoch 8:  27%|██▋       | 1609/5971 [14:26<39:06,  1.86it/s, loss=0.12, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000632, train/loss_step=0.181, global_step=4750.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  27%|██▋       | 1610/5971 [14:26<39:06,  1.86it/s, loss=0.12, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000632, train/loss_step=0.181, global_step=4750.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  27%|██▋       | 1610/5971 [14:26<39:06,  1.86it/s, loss=0.119, v_num=0, train/loss_simple_step=0.002, train/loss_vlb_step=1.2e-5, train/loss_step=0.002, global_step=4750.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  27%|██▋       | 1611/5971 [14:27<39:07,  1.86it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0245, train/loss_vlb_step=9.7e-5, train/loss_step=0.0245, global_step=4750.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  27%|██▋       | 1612/5971 [14:29<39:10,  1.85it/s, loss=0.104, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.00045, train/loss_step=0.134, global_step=4750.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  27%|██▋       | 1613/5971 [14:30<39:11,  1.85it/s, loss=0.102, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.001, train/loss_step=0.243, global_step=4751.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  27%|██▋       | 1614/5971 [14:31<39:11,  1.85it/s, loss=0.102, v_num=0, train/loss_simple_step=0.243, train/loss_vlb_step=0.001, train/loss_step=0.243, global_step=4751.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  27%|██▋       | 1614/5971 [14:31<39:11,  1.85it/s, loss=0.0968, v_num=0, train/loss_simple_step=0.0412, train/loss_vlb_step=0.000151, train/loss_step=0.0412, global_step=4751.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  27%|██▋       | 1615/5971 [14:32<39:12,  1.85it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0818, train/loss_vlb_step=0.000269, train/loss_step=0.0818, global_step=4751.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  27%|██▋       | 1616/5971 [14:34<39:15,  1.85it/s, loss=0.1, v_num=0, train/loss_simple_step=0.00588, train/loss_vlb_step=2.93e-5, train/loss_step=0.00588, global_step=4751.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  27%|██▋       | 1617/5971 [14:35<39:16,  1.85it/s, loss=0.105, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000373, train/loss_step=0.113, global_step=4752.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  27%|██▋       | 1618/5971 [14:36<39:17,  1.85it/s, loss=0.105, v_num=0, train/loss_simple_step=0.113, train/loss_vlb_step=0.000373, train/loss_step=0.113, global_step=4752.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  27%|██▋       | 1618/5971 [14:36<39:17,  1.85it/s, loss=0.108, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000345, train/loss_step=0.105, global_step=4752.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  27%|██▋       | 1619/5971 [14:37<39:17,  1.85it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0181, train/loss_vlb_step=7.59e-5, train/loss_step=0.0181, global_step=4752.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  27%|██▋       | 1620/5971 [14:39<39:20,  1.84it/s, loss=0.084, v_num=0, train/loss_simple_step=0.010, train/loss_vlb_step=4.58e-5, train/loss_step=0.010, global_step=4752.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  27%|██▋       | 1621/5971 [14:40<39:21,  1.84it/s, loss=0.0837, v_num=0, train/loss_simple_step=0.00251, train/loss_vlb_step=1.41e-5, train/loss_step=0.00251, global_step=4753.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  27%|██▋       | 1622/5971 [14:41<39:21,  1.84it/s, loss=0.0837, v_num=0, train/loss_simple_step=0.00251, train/loss_vlb_step=1.41e-5, train/loss_step=0.00251, global_step=4753.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  27%|██▋       | 1622/5971 [14:41<39:21,  1.84it/s, loss=0.0575, v_num=0, train/loss_simple_step=0.0047, train/loss_vlb_step=2.37e-5, train/loss_step=0.0047, global_step=4753.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  27%|██▋       | 1623/5971 [14:42<39:22,  1.84it/s, loss=0.0652, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.00117, train/loss_step=0.268, global_step=4753.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  27%|██▋       | 1624/5971 [14:44<39:25,  1.84it/s, loss=0.0652, v_num=0, train/loss_simple_step=0.00157, train/loss_vlb_step=9.21e-6, train/loss_step=0.00157, global_step=4753.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  27%|██▋       | 1625/5971 [14:45<39:26,  1.84it/s, loss=0.0743, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000838, train/loss_step=0.207, global_step=4754.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  27%|██▋       | 1626/5971 [14:46<39:26,  1.84it/s, loss=0.0743, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000838, train/loss_step=0.207, global_step=4754.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  27%|██▋       | 1626/5971 [14:46<39:26,  1.84it/s, loss=0.0832, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000661, train/loss_step=0.190, global_step=4754.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  27%|██▋       | 1627/5971 [14:47<39:26,  1.84it/s, loss=0.104, v_num=0, train/loss_simple_step=0.438, train/loss_vlb_step=0.0029, train/loss_step=0.438, global_step=4754.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  27%|██▋       | 1628/5971 [14:49<39:31,  1.83it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00212, train/loss_vlb_step=1.24e-5, train/loss_step=0.00212, global_step=4754.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  27%|██▋       | 1629/5971 [14:50<39:31,  1.83it/s, loss=0.112, v_num=0, train/loss_simple_step=0.338, train/loss_vlb_step=0.00129, train/loss_step=0.338, global_step=4755.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  27%|██▋       | 1630/5971 [14:51<39:31,  1.83it/s, loss=0.112, v_num=0, train/loss_simple_step=0.338, train/loss_vlb_step=0.00129, train/loss_step=0.338, global_step=4755.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  27%|██▋       | 1630/5971 [14:51<39:31,  1.83it/s, loss=0.113, v_num=0, train/loss_simple_step=0.034, train/loss_vlb_step=0.000129, train/loss_step=0.034, global_step=4755.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  27%|██▋       | 1631/5971 [14:52<39:32,  1.83it/s, loss=0.113, v_num=0, train/loss_simple_step=0.014, train/loss_vlb_step=6.17e-5, train/loss_step=0.014, global_step=4755.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  27%|██▋       | 1632/5971 [14:54<39:35,  1.83it/s, loss=0.111, v_num=0, train/loss_simple_step=0.095, train/loss_vlb_step=0.000314, train/loss_step=0.095, global_step=4755.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  27%|██▋       | 1633/5971 [14:55<39:36,  1.83it/s, loss=0.117, v_num=0, train/loss_simple_step=0.366, train/loss_vlb_step=0.00268, train/loss_step=0.366, global_step=4756.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  27%|██▋       | 1634/5971 [14:55<39:36,  1.83it/s, loss=0.117, v_num=0, train/loss_simple_step=0.366, train/loss_vlb_step=0.00268, train/loss_step=0.366, global_step=4756.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  27%|██▋       | 1634/5971 [14:55<39:36,  1.83it/s, loss=0.116, v_num=0, train/loss_simple_step=0.0293, train/loss_vlb_step=0.000109, train/loss_step=0.0293, global_step=4756.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  27%|██▋       | 1635/5971 [14:56<39:36,  1.82it/s, loss=0.14, v_num=0, train/loss_simple_step=0.564, train/loss_vlb_step=0.00513, train/loss_step=0.564, global_step=4756.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  27%|██▋       | 1636/5971 [14:58<39:40,  1.82it/s, loss=0.151, v_num=0, train/loss_simple_step=0.225, train/loss_vlb_step=0.000904, train/loss_step=0.225, global_step=4756.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  27%|██▋       | 1637/5971 [14:59<39:41,  1.82it/s, loss=0.153, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000502, train/loss_step=0.150, global_step=4757.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  27%|██▋       | 1638/5971 [15:00<39:41,  1.82it/s, loss=0.153, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.000502, train/loss_step=0.150, global_step=4757.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  27%|██▋       | 1638/5971 [15:00<39:41,  1.82it/s, loss=0.159, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000723, train/loss_step=0.214, global_step=4757.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  27%|██▋       | 1639/5971 [15:01<39:41,  1.82it/s, loss=0.169, v_num=0, train/loss_simple_step=0.218, train/loss_vlb_step=0.000772, train/loss_step=0.218, global_step=4757.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  27%|██▋       | 1640/5971 [15:03<39:45,  1.82it/s, loss=0.183, v_num=0, train/loss_simple_step=0.306, train/loss_vlb_step=0.00226, train/loss_step=0.306, global_step=4757.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  27%|██▋       | 1641/5971 [15:04<39:45,  1.82it/s, loss=0.183, v_num=0, train/loss_simple_step=0.002, train/loss_vlb_step=1.13e-5, train/loss_step=0.002, global_step=4758.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  27%|██▋       | 1642/5971 [15:05<39:45,  1.81it/s, loss=0.183, v_num=0, train/loss_simple_step=0.002, train/loss_vlb_step=1.13e-5, train/loss_step=0.002, global_step=4758.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  27%|██▋       | 1642/5971 [15:05<39:45,  1.81it/s, loss=0.194, v_num=0, train/loss_simple_step=0.221, train/loss_vlb_step=0.000798, train/loss_step=0.221, global_step=4758.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1643/5971 [15:06<39:46,  1.81it/s, loss=0.2, v_num=0, train/loss_simple_step=0.380, train/loss_vlb_step=0.00226, train/loss_step=0.380, global_step=4758.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  28%|██▊       | 1644/5971 [15:08<39:49,  1.81it/s, loss=0.2, v_num=0, train/loss_simple_step=0.00393, train/loss_vlb_step=2.12e-5, train/loss_step=0.00393, global_step=4758.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1645/5971 [15:09<39:50,  1.81it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0998, train/loss_vlb_step=0.000329, train/loss_step=0.0998, global_step=4759.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1646/5971 [15:10<39:50,  1.81it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0998, train/loss_vlb_step=0.000329, train/loss_step=0.0998, global_step=4759.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1646/5971 [15:10<39:50,  1.81it/s, loss=0.185, v_num=0, train/loss_simple_step=0.00253, train/loss_vlb_step=1.37e-5, train/loss_step=0.00253, global_step=4759.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1647/5971 [15:11<39:50,  1.81it/s, loss=0.169, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000354, train/loss_step=0.107, global_step=4759.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  28%|██▊       | 1648/5971 [15:13<39:54,  1.81it/s, loss=0.17, v_num=0, train/loss_simple_step=0.022, train/loss_vlb_step=8.97e-5, train/loss_step=0.022, global_step=4759.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  28%|██▊       | 1649/5971 [15:14<39:55,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0737, train/loss_vlb_step=0.000249, train/loss_step=0.0737, global_step=4760.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1650/5971 [15:15<39:55,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0737, train/loss_vlb_step=0.000249, train/loss_step=0.0737, global_step=4760.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1650/5971 [15:15<39:55,  1.80it/s, loss=0.174, v_num=0, train/loss_simple_step=0.382, train/loss_vlb_step=0.00275, train/loss_step=0.382, global_step=4760.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  28%|██▊       | 1651/5971 [15:16<39:55,  1.80it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0197, train/loss_vlb_step=7.92e-5, train/loss_step=0.0197, global_step=4760.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1652/5971 [15:18<39:59,  1.80it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00172, train/loss_vlb_step=1e-5, train/loss_step=0.00172, global_step=4760.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  28%|██▊       | 1653/5971 [15:19<39:59,  1.80it/s, loss=0.171, v_num=0, train/loss_simple_step=0.403, train/loss_vlb_step=0.00187, train/loss_step=0.403, global_step=4761.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  28%|██▊       | 1654/5971 [15:20<39:59,  1.80it/s, loss=0.171, v_num=0, train/loss_simple_step=0.403, train/loss_vlb_step=0.00187, train/loss_step=0.403, global_step=4761.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1654/5971 [15:20<39:59,  1.80it/s, loss=0.189, v_num=0, train/loss_simple_step=0.390, train/loss_vlb_step=0.00217, train/loss_step=0.390, global_step=4761.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1655/5971 [15:20<40:00,  1.80it/s, loss=0.19, v_num=0, train/loss_simple_step=0.569, train/loss_vlb_step=0.00907, train/loss_step=0.569, global_step=4761.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  28%|██▊       | 1656/5971 [15:23<40:04,  1.79it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0427, train/loss_vlb_step=0.00015, train/loss_step=0.0427, global_step=4761.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1657/5971 [15:24<40:05,  1.79it/s, loss=0.183, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000713, train/loss_step=0.207, global_step=4762.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1658/5971 [15:25<40:05,  1.79it/s, loss=0.183, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000713, train/loss_step=0.207, global_step=4762.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1658/5971 [15:25<40:05,  1.79it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0477, train/loss_vlb_step=0.000168, train/loss_step=0.0477, global_step=4762.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1659/5971 [15:26<40:05,  1.79it/s, loss=0.171, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000477, train/loss_step=0.144, global_step=4762.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  28%|██▊       | 1660/5971 [15:28<40:09,  1.79it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00959, train/loss_vlb_step=4.51e-5, train/loss_step=0.00959, global_step=4762.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1661/5971 [15:29<40:09,  1.79it/s, loss=0.178, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00239, train/loss_step=0.427, global_step=4763.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  28%|██▊       | 1662/5971 [15:30<40:09,  1.79it/s, loss=0.178, v_num=0, train/loss_simple_step=0.427, train/loss_vlb_step=0.00239, train/loss_step=0.427, global_step=4763.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1662/5971 [15:30<40:09,  1.79it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0195, train/loss_vlb_step=8.07e-5, train/loss_step=0.0195, global_step=4763.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1663/5971 [15:30<40:10,  1.79it/s, loss=0.151, v_num=0, train/loss_simple_step=0.050, train/loss_vlb_step=0.000176, train/loss_step=0.050, global_step=4763.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  28%|██▊       | 1664/5971 [15:33<40:13,  1.78it/s, loss=0.188, v_num=0, train/loss_simple_step=0.747, train/loss_vlb_step=0.0232, train/loss_step=0.747, global_step=4763.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  28%|██▊       | 1665/5971 [15:34<40:14,  1.78it/s, loss=0.194, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.00075, train/loss_step=0.209, global_step=4764.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1666/5971 [15:34<40:14,  1.78it/s, loss=0.194, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.00075, train/loss_step=0.209, global_step=4764.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1666/5971 [15:34<40:14,  1.78it/s, loss=0.202, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000634, train/loss_step=0.174, global_step=4764.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1667/5971 [15:35<40:14,  1.78it/s, loss=0.21, v_num=0, train/loss_simple_step=0.269, train/loss_vlb_step=0.00109, train/loss_step=0.269, global_step=4764.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  28%|██▊       | 1668/5971 [15:37<40:18,  1.78it/s, loss=0.21, v_num=0, train/loss_simple_step=0.0225, train/loss_vlb_step=8.22e-5, train/loss_step=0.0225, global_step=4764.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1669/5971 [15:38<40:18,  1.78it/s, loss=0.234, v_num=0, train/loss_simple_step=0.542, train/loss_vlb_step=0.00519, train/loss_step=0.542, global_step=4765.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  28%|██▊       | 1670/5971 [15:39<40:19,  1.78it/s, loss=0.234, v_num=0, train/loss_simple_step=0.542, train/loss_vlb_step=0.00519, train/loss_step=0.542, global_step=4765.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1670/5971 [15:39<40:19,  1.78it/s, loss=0.215, v_num=0, train/loss_simple_step=0.00378, train/loss_vlb_step=1.98e-5, train/loss_step=0.00378, global_step=4765.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1671/5971 [15:40<40:19,  1.78it/s, loss=0.214, v_num=0, train/loss_simple_step=0.0041, train/loss_vlb_step=2.19e-5, train/loss_step=0.0041, global_step=4765.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  28%|██▊       | 1672/5971 [15:42<40:22,  1.77it/s, loss=0.221, v_num=0, train/loss_simple_step=0.134, train/loss_vlb_step=0.000442, train/loss_step=0.134, global_step=4765.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  28%|██▊       | 1673/5971 [15:43<40:23,  1.77it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0928, train/loss_vlb_step=0.000305, train/loss_step=0.0928, global_step=4766.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1674/5971 [15:44<40:23,  1.77it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0928, train/loss_vlb_step=0.000305, train/loss_step=0.0928, global_step=4766.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1674/5971 [15:44<40:23,  1.77it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0389, train/loss_vlb_step=0.00015, train/loss_step=0.0389, global_step=4766.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  28%|██▊       | 1675/5971 [15:45<40:24,  1.77it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0115, train/loss_vlb_step=4.95e-5, train/loss_step=0.0115, global_step=4766.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  28%|██▊       | 1676/5971 [15:48<40:28,  1.77it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00724, train/loss_vlb_step=3.26e-5, train/loss_step=0.00724, global_step=4766.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1677/5971 [15:49<40:29,  1.77it/s, loss=0.17, v_num=0, train/loss_simple_step=0.445, train/loss_vlb_step=0.00285, train/loss_step=0.445, global_step=4767.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]     
Epoch 8:  28%|██▊       | 1678/5971 [15:50<40:29,  1.77it/s, loss=0.17, v_num=0, train/loss_simple_step=0.445, train/loss_vlb_step=0.00285, train/loss_step=0.445, global_step=4767.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1678/5971 [15:50<40:29,  1.77it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0174, train/loss_vlb_step=7.29e-5, train/loss_step=0.0174, global_step=4767.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1679/5971 [15:50<40:29,  1.77it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0524, train/loss_vlb_step=0.00018, train/loss_step=0.0524, global_step=4767.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1680/5971 [15:53<40:33,  1.76it/s, loss=0.17, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000424, train/loss_step=0.129, global_step=4767.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  28%|██▊       | 1681/5971 [15:54<40:33,  1.76it/s, loss=0.155, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000451, train/loss_step=0.131, global_step=4768.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1682/5971 [15:54<40:33,  1.76it/s, loss=0.155, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000451, train/loss_step=0.131, global_step=4768.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1682/5971 [15:54<40:33,  1.76it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0193, train/loss_vlb_step=7.96e-5, train/loss_step=0.0193, global_step=4768.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1683/5971 [15:55<40:33,  1.76it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00321, train/loss_vlb_step=1.73e-5, train/loss_step=0.00321, global_step=4768.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1684/5971 [15:58<40:38,  1.76it/s, loss=0.127, v_num=0, train/loss_simple_step=0.231, train/loss_vlb_step=0.000867, train/loss_step=0.231, global_step=4768.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  28%|██▊       | 1685/5971 [15:59<40:38,  1.76it/s, loss=0.125, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000617, train/loss_step=0.179, global_step=4769.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1686/5971 [16:00<40:38,  1.76it/s, loss=0.125, v_num=0, train/loss_simple_step=0.179, train/loss_vlb_step=0.000617, train/loss_step=0.179, global_step=4769.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1686/5971 [16:00<40:38,  1.76it/s, loss=0.128, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.00104, train/loss_step=0.230, global_step=4769.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  28%|██▊       | 1687/5971 [16:01<40:39,  1.76it/s, loss=0.13, v_num=0, train/loss_simple_step=0.310, train/loss_vlb_step=0.0013, train/loss_step=0.310, global_step=4769.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  28%|██▊       | 1688/5971 [16:03<40:42,  1.75it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00356, train/loss_vlb_step=1.92e-5, train/loss_step=0.00356, global_step=4769.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1689/5971 [16:04<40:42,  1.75it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0343, train/loss_vlb_step=0.00013, train/loss_step=0.0343, global_step=4770.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  28%|██▊       | 1690/5971 [16:04<40:42,  1.75it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0343, train/loss_vlb_step=0.00013, train/loss_step=0.0343, global_step=4770.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1690/5971 [16:04<40:42,  1.75it/s, loss=0.113, v_num=0, train/loss_simple_step=0.195, train/loss_vlb_step=0.00071, train/loss_step=0.195, global_step=4770.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  28%|██▊       | 1691/5971 [16:05<40:43,  1.75it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=7.29e-5, train/loss_step=0.0173, global_step=4770.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1692/5971 [16:07<40:46,  1.75it/s, loss=0.124, v_num=0, train/loss_simple_step=0.337, train/loss_vlb_step=0.00215, train/loss_step=0.337, global_step=4770.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  28%|██▊       | 1693/5971 [16:08<40:46,  1.75it/s, loss=0.159, v_num=0, train/loss_simple_step=0.789, train/loss_vlb_step=0.0578, train/loss_step=0.789, global_step=4771.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  28%|██▊       | 1694/5971 [16:09<40:46,  1.75it/s, loss=0.159, v_num=0, train/loss_simple_step=0.789, train/loss_vlb_step=0.0578, train/loss_step=0.789, global_step=4771.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1694/5971 [16:09<40:46,  1.75it/s, loss=0.185, v_num=0, train/loss_simple_step=0.550, train/loss_vlb_step=0.00556, train/loss_step=0.550, global_step=4771.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1695/5971 [16:10<40:46,  1.75it/s, loss=0.208, v_num=0, train/loss_simple_step=0.479, train/loss_vlb_step=0.00491, train/loss_step=0.479, global_step=4771.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1696/5971 [16:12<40:50,  1.74it/s, loss=0.21, v_num=0, train/loss_simple_step=0.0406, train/loss_vlb_step=0.00014, train/loss_step=0.0406, global_step=4771.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1697/5971 [16:13<40:50,  1.74it/s, loss=0.195, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000509, train/loss_step=0.142, global_step=4772.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1698/5971 [16:14<40:51,  1.74it/s, loss=0.195, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000509, train/loss_step=0.142, global_step=4772.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1698/5971 [16:14<40:51,  1.74it/s, loss=0.21, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00152, train/loss_step=0.329, global_step=4772.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  28%|██▊       | 1699/5971 [16:15<40:51,  1.74it/s, loss=0.214, v_num=0, train/loss_simple_step=0.138, train/loss_vlb_step=0.000581, train/loss_step=0.138, global_step=4772.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1700/5971 [16:17<40:54,  1.74it/s, loss=0.208, v_num=0, train/loss_simple_step=0.00196, train/loss_vlb_step=1.16e-5, train/loss_step=0.00196, global_step=4772.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  28%|██▊       | 1701/5971 [16:18<40:54,  1.74it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0859, train/loss_vlb_step=0.000289, train/loss_step=0.0859, global_step=4773.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  29%|██▊       | 1702/5971 [16:19<40:54,  1.74it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0859, train/loss_vlb_step=0.000289, train/loss_step=0.0859, global_step=4773.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  29%|██▊       | 1702/5971 [16:19<40:54,  1.74it/s, loss=0.207, v_num=0, train/loss_simple_step=0.0362, train/loss_vlb_step=0.000128, train/loss_step=0.0362, global_step=4773.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  29%|██▊       | 1703/5971 [16:20<40:55,  1.74it/s, loss=0.217, v_num=0, train/loss_simple_step=0.210, train/loss_vlb_step=0.00079, train/loss_step=0.210, global_step=4773.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  29%|██▊       | 1704/5971 [16:22<40:58,  1.74it/s, loss=0.218, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.000968, train/loss_step=0.250, global_step=4773.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  29%|██▊       | 1705/5971 [16:23<40:59,  1.73it/s, loss=0.218, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000588, train/loss_step=0.173, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  29%|██▊       | 1706/5971 [16:24<40:59,  1.73it/s, loss=0.218, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000588, train/loss_step=0.173, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  29%|██▊       | 1706/5971 [16:24<40:59,  1.73it/s, loss=0.213, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000445, train/loss_step=0.135, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  29%|██▊       | 1707/5971 [16:25<40:59,  1.73it/s, loss=0.198, v_num=0, train/loss_simple_step=0.00291, train/loss_vlb_step=1.62e-5, train/loss_step=0.00291, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  29%|██▊       | 1708/5971 [16:27<41:02,  1.73it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:03,  2.61it/s][A
Epoch 8:  29%|██▊       | 1710/5971 [16:27<40:59,  1.73it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   1%|          | 2/167 [00:01<02:00,  1.37it/s][A

Validating:   3%|▎         | 5/167 [00:01<00:37,  4.32it/s][A
Epoch 8:  29%|██▊       | 1714/5971 [16:28<40:54,  1.73it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   5%|▍         | 8/167 [00:01<00:21,  7.49it/s][A
Epoch 8:  29%|██▉       | 1718/5971 [16:28<40:46,  1.74it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   7%|▋         | 12/167 [00:01<00:12, 11.99it/s][A
Epoch 8:  29%|██▉       | 1722/5971 [16:29<40:39,  1.74it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   9%|▉         | 15/167 [00:01<00:10, 14.93it/s][A
Epoch 8:  29%|██▉       | 1726/5971 [16:29<40:31,  1.75it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  11%|█         | 18/167 [00:01<00:08, 17.26it/s][A

Validating:  13%|█▎        | 21/167 [00:02<00:07, 18.95it/s][A
Epoch 8:  29%|██▉       | 1730/5971 [16:29<40:24,  1.75it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  14%|█▍        | 24/167 [00:02<00:06, 21.41it/s][A
Epoch 8:  29%|██▉       | 1734/5971 [16:29<40:16,  1.75it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  16%|█▌        | 27/167 [00:02<00:06, 23.13it/s][A
Epoch 8:  29%|██▉       | 1738/5971 [16:29<40:09,  1.76it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  18%|█▊        | 30/167 [00:02<00:05, 23.86it/s][A

Validating:  20%|█▉        | 33/167 [00:02<00:05, 24.37it/s][A
Epoch 8:  29%|██▉       | 1742/5971 [16:29<40:01,  1.76it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  22%|██▏       | 36/167 [00:02<00:05, 24.98it/s][A
Epoch 8:  29%|██▉       | 1746/5971 [16:29<39:54,  1.76it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  23%|██▎       | 39/167 [00:02<00:05, 25.07it/s][A
Epoch 8:  29%|██▉       | 1750/5971 [16:30<39:46,  1.77it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  25%|██▌       | 42/167 [00:02<00:05, 24.23it/s][A

Validating:  27%|██▋       | 45/167 [00:02<00:04, 24.65it/s][A
Epoch 8:  29%|██▉       | 1754/5971 [16:30<39:39,  1.77it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  29%|██▊       | 48/167 [00:03<00:04, 25.85it/s][A
Epoch 8:  29%|██▉       | 1758/5971 [16:30<39:32,  1.78it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  31%|███       | 51/167 [00:03<00:04, 25.71it/s][A
Epoch 8:  30%|██▉       | 1762/5971 [16:30<39:24,  1.78it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  32%|███▏      | 54/167 [00:03<00:04, 25.73it/s][A

Validating:  34%|███▍      | 57/167 [00:03<00:04, 26.34it/s][A
Epoch 8:  30%|██▉       | 1766/5971 [16:30<39:17,  1.78it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  36%|███▌      | 60/167 [00:03<00:04, 26.03it/s][A
Epoch 8:  30%|██▉       | 1770/5971 [16:30<39:10,  1.79it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  38%|███▊      | 64/167 [00:03<00:03, 28.17it/s][A
Epoch 8:  30%|██▉       | 1774/5971 [16:31<39:03,  1.79it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  40%|████      | 67/167 [00:03<00:03, 27.39it/s][A
Epoch 8:  30%|██▉       | 1778/5971 [16:31<38:56,  1.79it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 27.06it/s][A

Validating:  44%|████▎     | 73/167 [00:03<00:03, 27.19it/s][A
Epoch 8:  30%|██▉       | 1782/5971 [16:31<38:49,  1.80it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  46%|████▌     | 76/167 [00:04<00:03, 26.89it/s][A
Epoch 8:  30%|██▉       | 1786/5971 [16:31<38:42,  1.80it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  47%|████▋     | 79/167 [00:04<00:03, 26.22it/s][A
Epoch 8:  30%|██▉       | 1790/5971 [16:31<38:35,  1.81it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  49%|████▉     | 82/167 [00:04<00:03, 25.09it/s][A

Validating:  51%|█████     | 85/167 [00:04<00:03, 25.40it/s][A
Epoch 8:  30%|███       | 1794/5971 [16:31<38:27,  1.81it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  53%|█████▎    | 88/167 [00:04<00:03, 25.98it/s][A
Epoch 8:  30%|███       | 1798/5971 [16:31<38:20,  1.81it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  54%|█████▍    | 91/167 [00:04<00:03, 24.54it/s][A
Epoch 8:  30%|███       | 1802/5971 [16:32<38:14,  1.82it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  56%|█████▋    | 94/167 [00:04<00:03, 24.25it/s][A

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 25.62it/s][A
Epoch 8:  30%|███       | 1806/5971 [16:32<38:07,  1.82it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  60%|█████▉    | 100/167 [00:05<00:02, 25.29it/s][A
Epoch 8:  30%|███       | 1810/5971 [16:32<38:00,  1.82it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  62%|██████▏   | 103/167 [00:05<00:02, 25.42it/s][A
Epoch 8:  30%|███       | 1814/5971 [16:32<37:53,  1.83it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  64%|██████▍   | 107/167 [00:05<00:02, 26.81it/s][A
Epoch 8:  30%|███       | 1818/5971 [16:32<37:46,  1.83it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  66%|██████▌   | 110/167 [00:05<00:02, 24.57it/s][A

Validating:  68%|██████▊   | 113/167 [00:05<00:02, 23.17it/s][A
Epoch 8:  31%|███       | 1822/5971 [16:32<37:39,  1.84it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  69%|██████▉   | 116/167 [00:05<00:02, 23.85it/s][A
Epoch 8:  31%|███       | 1826/5971 [16:33<37:33,  1.84it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 24.09it/s][A
Epoch 8:  31%|███       | 1830/5971 [16:33<37:26,  1.84it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 25.08it/s][A

Validating:  75%|███████▍  | 125/167 [00:06<00:01, 25.64it/s][A
Epoch 8:  31%|███       | 1834/5971 [16:33<37:19,  1.85it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  77%|███████▋  | 128/167 [00:06<00:01, 25.37it/s][A
Epoch 8:  31%|███       | 1838/5971 [16:33<37:13,  1.85it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  78%|███████▊  | 131/167 [00:06<00:01, 25.78it/s][A
Epoch 8:  31%|███       | 1842/5971 [16:33<37:06,  1.85it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  80%|████████  | 134/167 [00:06<00:01, 25.25it/s][A

Validating:  82%|████████▏ | 137/167 [00:06<00:01, 25.51it/s][A
Epoch 8:  31%|███       | 1846/5971 [16:33<36:59,  1.86it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  84%|████████▍ | 140/167 [00:06<00:01, 25.51it/s][A
Epoch 8:  31%|███       | 1850/5971 [16:34<36:53,  1.86it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  86%|████████▌ | 143/167 [00:06<00:00, 26.39it/s][A
Epoch 8:  31%|███       | 1854/5971 [16:34<36:46,  1.87it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  87%|████████▋ | 146/167 [00:06<00:00, 27.00it/s][A

Validating:  89%|████████▉ | 149/167 [00:07<00:00, 25.28it/s][A
Epoch 8:  31%|███       | 1858/5971 [16:34<36:40,  1.87it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  91%|█████████ | 152/167 [00:07<00:00, 26.39it/s][A
Epoch 8:  31%|███       | 1862/5971 [16:34<36:33,  1.87it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  93%|█████████▎| 155/167 [00:07<00:00, 26.47it/s][A
Epoch 8:  31%|███▏      | 1866/5971 [16:34<36:26,  1.88it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  95%|█████████▍| 158/167 [00:07<00:00, 26.89it/s][A

Validating:  96%|█████████▋| 161/167 [00:07<00:00, 26.62it/s][A
Epoch 8:  31%|███▏      | 1870/5971 [16:34<36:20,  1.88it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  98%|█████████▊| 164/167 [00:07<00:00, 25.45it/s][A
Epoch 8:  31%|███▏      | 1874/5971 [16:34<36:14,  1.88it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating: 100%|██████████| 167/167 [00:07<00:00, 26.34it/s][A
Epoch 8:  31%|███▏      | 1876/5971 [16:35<36:11,  1.89it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0596, train/loss_vlb_step=0.000208, train/loss_step=0.0596, global_step=4774.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  31%|███▏      | 1877/5971 [16:36<36:12,  1.88it/s, loss=0.209, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000771, train/loss_step=0.211, global_step=4775.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  31%|███▏      | 1878/5971 [16:37<36:12,  1.88it/s, loss=0.209, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000771, train/loss_step=0.211, global_step=4775.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  31%|███▏      | 1878/5971 [16:37<36:12,  1.88it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0161, train/loss_vlb_step=6.75e-5, train/loss_step=0.0161, global_step=4775.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  31%|███▏      | 1879/5971 [16:38<36:12,  1.88it/s, loss=0.201, v_num=0, train/loss_simple_step=0.039, train/loss_vlb_step=0.000152, train/loss_step=0.039, global_step=4775.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  31%|███▏      | 1880/5971 [16:40<36:15,  1.88it/s, loss=0.195, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000716, train/loss_step=0.209, global_step=4775.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1881/5971 [16:41<36:15,  1.88it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0575, train/loss_vlb_step=0.000201, train/loss_step=0.0575, global_step=4776.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1882/5971 [16:42<36:16,  1.88it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0575, train/loss_vlb_step=0.000201, train/loss_step=0.0575, global_step=4776.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1882/5971 [16:42<36:16,  1.88it/s, loss=0.165, v_num=0, train/loss_simple_step=0.683, train/loss_vlb_step=0.0167, train/loss_step=0.683, global_step=4776.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  32%|███▏      | 1883/5971 [16:42<36:16,  1.88it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0311, train/loss_vlb_step=0.000115, train/loss_step=0.0311, global_step=4776.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1884/5971 [16:45<36:19,  1.88it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0467, train/loss_vlb_step=0.000169, train/loss_step=0.0467, global_step=4776.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1885/5971 [16:46<36:19,  1.87it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0219, train/loss_vlb_step=8.04e-5, train/loss_step=0.0219, global_step=4777.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  32%|███▏      | 1886/5971 [16:47<36:20,  1.87it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0219, train/loss_vlb_step=8.04e-5, train/loss_step=0.0219, global_step=4777.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1886/5971 [16:47<36:20,  1.87it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00951, train/loss_vlb_step=4.32e-5, train/loss_step=0.00951, global_step=4777.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1887/5971 [16:48<36:20,  1.87it/s, loss=0.12, v_num=0, train/loss_simple_step=0.125, train/loss_vlb_step=0.00041, train/loss_step=0.125, global_step=4777.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]     
Epoch 8:  32%|███▏      | 1888/5971 [16:50<36:23,  1.87it/s, loss=0.137, v_num=0, train/loss_simple_step=0.334, train/loss_vlb_step=0.00159, train/loss_step=0.334, global_step=4777.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1889/5971 [16:51<36:24,  1.87it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0162, train/loss_vlb_step=7.16e-5, train/loss_step=0.0162, global_step=4778.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1890/5971 [16:52<36:24,  1.87it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0162, train/loss_vlb_step=7.16e-5, train/loss_step=0.0162, global_step=4778.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1890/5971 [16:52<36:24,  1.87it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0621, train/loss_vlb_step=0.000213, train/loss_step=0.0621, global_step=4778.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1891/5971 [16:53<36:24,  1.87it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=5.98e-5, train/loss_step=0.0139, global_step=4778.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  32%|███▏      | 1892/5971 [16:55<36:28,  1.86it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0933, train/loss_vlb_step=0.00031, train/loss_step=0.0933, global_step=4778.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1893/5971 [16:56<36:28,  1.86it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0464, train/loss_vlb_step=0.00016, train/loss_step=0.0464, global_step=4779.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1894/5971 [16:57<36:28,  1.86it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0464, train/loss_vlb_step=0.00016, train/loss_step=0.0464, global_step=4779.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1894/5971 [16:57<36:28,  1.86it/s, loss=0.123, v_num=0, train/loss_simple_step=0.387, train/loss_vlb_step=0.00348, train/loss_step=0.387, global_step=4779.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  32%|███▏      | 1895/5971 [16:58<36:28,  1.86it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00395, train/loss_vlb_step=2.05e-5, train/loss_step=0.00395, global_step=4779.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1896/5971 [17:00<36:31,  1.86it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00731, train/loss_vlb_step=3.51e-5, train/loss_step=0.00731, global_step=4779.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1897/5971 [17:01<36:32,  1.86it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0712, train/loss_vlb_step=0.000248, train/loss_step=0.0712, global_step=4780.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  32%|███▏      | 1898/5971 [17:02<36:32,  1.86it/s, loss=0.114, v_num=0, train/loss_simple_step=0.0712, train/loss_vlb_step=0.000248, train/loss_step=0.0712, global_step=4780.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1898/5971 [17:02<36:32,  1.86it/s, loss=0.123, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000711, train/loss_step=0.201, global_step=4780.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  32%|███▏      | 1899/5971 [17:03<36:32,  1.86it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0758, train/loss_vlb_step=0.000261, train/loss_step=0.0758, global_step=4780.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1900/5971 [17:05<36:35,  1.85it/s, loss=0.121, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000416, train/loss_step=0.127, global_step=4780.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  32%|███▏      | 1901/5971 [17:06<36:35,  1.85it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0303, train/loss_vlb_step=0.000113, train/loss_step=0.0303, global_step=4781.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1902/5971 [17:06<36:35,  1.85it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0303, train/loss_vlb_step=0.000113, train/loss_step=0.0303, global_step=4781.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1902/5971 [17:06<36:35,  1.85it/s, loss=0.101, v_num=0, train/loss_simple_step=0.311, train/loss_vlb_step=0.0015, train/loss_step=0.311, global_step=4781.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  32%|███▏      | 1903/5971 [17:07<36:36,  1.85it/s, loss=0.107, v_num=0, train/loss_simple_step=0.147, train/loss_vlb_step=0.000494, train/loss_step=0.147, global_step=4781.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1904/5971 [17:10<36:38,  1.85it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0263, train/loss_vlb_step=0.0001, train/loss_step=0.0263, global_step=4781.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1905/5971 [17:10<36:39,  1.85it/s, loss=0.11, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000418, train/loss_step=0.122, global_step=4782.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  32%|███▏      | 1906/5971 [17:11<36:39,  1.85it/s, loss=0.11, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000418, train/loss_step=0.122, global_step=4782.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1906/5971 [17:11<36:39,  1.85it/s, loss=0.12, v_num=0, train/loss_simple_step=0.203, train/loss_vlb_step=0.000785, train/loss_step=0.203, global_step=4782.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1907/5971 [17:12<36:39,  1.85it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0192, train/loss_vlb_step=7.9e-5, train/loss_step=0.0192, global_step=4782.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1908/5971 [17:14<36:42,  1.84it/s, loss=0.121, v_num=0, train/loss_simple_step=0.447, train/loss_vlb_step=0.00321, train/loss_step=0.447, global_step=4782.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  32%|███▏      | 1909/5971 [17:15<36:42,  1.84it/s, loss=0.142, v_num=0, train/loss_simple_step=0.446, train/loss_vlb_step=0.00326, train/loss_step=0.446, global_step=4783.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1910/5971 [17:16<36:42,  1.84it/s, loss=0.142, v_num=0, train/loss_simple_step=0.446, train/loss_vlb_step=0.00326, train/loss_step=0.446, global_step=4783.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1910/5971 [17:16<36:42,  1.84it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0164, train/loss_vlb_step=6.63e-5, train/loss_step=0.0164, global_step=4783.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1911/5971 [17:17<36:42,  1.84it/s, loss=0.154, v_num=0, train/loss_simple_step=0.295, train/loss_vlb_step=0.00141, train/loss_step=0.295, global_step=4783.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  32%|███▏      | 1912/5971 [17:19<36:46,  1.84it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0968, train/loss_vlb_step=0.000318, train/loss_step=0.0968, global_step=4783.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1913/5971 [17:20<36:46,  1.84it/s, loss=0.171, v_num=0, train/loss_simple_step=0.388, train/loss_vlb_step=0.00213, train/loss_step=0.388, global_step=4784.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  32%|███▏      | 1914/5971 [17:21<36:46,  1.84it/s, loss=0.171, v_num=0, train/loss_simple_step=0.388, train/loss_vlb_step=0.00213, train/loss_step=0.388, global_step=4784.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1914/5971 [17:21<36:46,  1.84it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=5.01e-5, train/loss_step=0.0108, global_step=4784.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1915/5971 [17:22<36:46,  1.84it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00191, train/loss_vlb_step=1.14e-5, train/loss_step=0.00191, global_step=4784.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1916/5971 [17:24<36:49,  1.84it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0588, train/loss_vlb_step=0.000204, train/loss_step=0.0588, global_step=4784.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  32%|███▏      | 1917/5971 [17:25<36:49,  1.83it/s, loss=0.158, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000458, train/loss_step=0.136, global_step=4785.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  32%|███▏      | 1918/5971 [17:26<36:49,  1.83it/s, loss=0.158, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000458, train/loss_step=0.136, global_step=4785.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1918/5971 [17:26<36:49,  1.83it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0822, train/loss_vlb_step=0.000273, train/loss_step=0.0822, global_step=4785.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1919/5971 [17:27<36:49,  1.83it/s, loss=0.155, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.00042, train/loss_step=0.128, global_step=4785.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  32%|███▏      | 1920/5971 [17:29<36:53,  1.83it/s, loss=0.159, v_num=0, train/loss_simple_step=0.209, train/loss_vlb_step=0.000726, train/loss_step=0.209, global_step=4785.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1921/5971 [17:30<36:54,  1.83it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00218, train/loss_vlb_step=1.26e-5, train/loss_step=0.00218, global_step=4786.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1922/5971 [17:31<36:54,  1.83it/s, loss=0.157, v_num=0, train/loss_simple_step=0.00218, train/loss_vlb_step=1.26e-5, train/loss_step=0.00218, global_step=4786.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1922/5971 [17:31<36:54,  1.83it/s, loss=0.15, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000606, train/loss_step=0.175, global_step=4786.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  32%|███▏      | 1923/5971 [17:32<36:54,  1.83it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00244, train/loss_vlb_step=1.32e-5, train/loss_step=0.00244, global_step=4786.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1924/5971 [17:34<36:57,  1.83it/s, loss=0.17, v_num=0, train/loss_simple_step=0.568, train/loss_vlb_step=0.00883, train/loss_step=0.568, global_step=4786.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]     
Epoch 8:  32%|███▏      | 1925/5971 [17:35<36:57,  1.82it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00624, train/loss_vlb_step=3.05e-5, train/loss_step=0.00624, global_step=4787.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1926/5971 [17:36<36:57,  1.82it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00624, train/loss_vlb_step=3.05e-5, train/loss_step=0.00624, global_step=4787.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1926/5971 [17:36<36:57,  1.82it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00684, train/loss_vlb_step=3.36e-5, train/loss_step=0.00684, global_step=4787.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1927/5971 [17:37<36:57,  1.82it/s, loss=0.154, v_num=0, train/loss_simple_step=0.00646, train/loss_vlb_step=3.15e-5, train/loss_step=0.00646, global_step=4787.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1928/5971 [17:39<37:01,  1.82it/s, loss=0.132, v_num=0, train/loss_simple_step=0.00754, train/loss_vlb_step=3.47e-5, train/loss_step=0.00754, global_step=4787.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1929/5971 [17:40<37:01,  1.82it/s, loss=0.152, v_num=0, train/loss_simple_step=0.850, train/loss_vlb_step=0.0298, train/loss_step=0.850, global_step=4788.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]     
Epoch 8:  32%|███▏      | 1930/5971 [17:41<37:01,  1.82it/s, loss=0.152, v_num=0, train/loss_simple_step=0.850, train/loss_vlb_step=0.0298, train/loss_step=0.850, global_step=4788.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1930/5971 [17:41<37:01,  1.82it/s, loss=0.17, v_num=0, train/loss_simple_step=0.374, train/loss_vlb_step=0.00201, train/loss_step=0.374, global_step=4788.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1931/5971 [17:42<37:01,  1.82it/s, loss=0.191, v_num=0, train/loss_simple_step=0.710, train/loss_vlb_step=0.0249, train/loss_step=0.710, global_step=4788.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1932/5971 [17:45<37:05,  1.81it/s, loss=0.213, v_num=0, train/loss_simple_step=0.540, train/loss_vlb_step=0.00716, train/loss_step=0.540, global_step=4788.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1933/5971 [17:45<37:05,  1.81it/s, loss=0.194, v_num=0, train/loss_simple_step=0.00947, train/loss_vlb_step=4e-5, train/loss_step=0.00947, global_step=4789.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1934/5971 [17:46<37:05,  1.81it/s, loss=0.194, v_num=0, train/loss_simple_step=0.00947, train/loss_vlb_step=4e-5, train/loss_step=0.00947, global_step=4789.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1934/5971 [17:46<37:05,  1.81it/s, loss=0.203, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.000627, train/loss_step=0.181, global_step=4789.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1935/5971 [17:47<37:05,  1.81it/s, loss=0.215, v_num=0, train/loss_simple_step=0.244, train/loss_vlb_step=0.000889, train/loss_step=0.244, global_step=4789.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1936/5971 [17:49<37:08,  1.81it/s, loss=0.212, v_num=0, train/loss_simple_step=0.00214, train/loss_vlb_step=1.28e-5, train/loss_step=0.00214, global_step=4789.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1937/5971 [17:50<37:08,  1.81it/s, loss=0.224, v_num=0, train/loss_simple_step=0.373, train/loss_vlb_step=0.00291, train/loss_step=0.373, global_step=4790.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  32%|███▏      | 1938/5971 [17:51<37:08,  1.81it/s, loss=0.224, v_num=0, train/loss_simple_step=0.373, train/loss_vlb_step=0.00291, train/loss_step=0.373, global_step=4790.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1938/5971 [17:51<37:09,  1.81it/s, loss=0.22, v_num=0, train/loss_simple_step=0.00407, train/loss_vlb_step=2.15e-5, train/loss_step=0.00407, global_step=4790.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  32%|███▏      | 1939/5971 [17:52<37:09,  1.81it/s, loss=0.214, v_num=0, train/loss_simple_step=0.0014, train/loss_vlb_step=7.94e-6, train/loss_step=0.0014, global_step=4790.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  32%|███▏      | 1940/5971 [17:54<37:11,  1.81it/s, loss=0.236, v_num=0, train/loss_simple_step=0.655, train/loss_vlb_step=0.0229, train/loss_step=0.655, global_step=4790.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  33%|███▎      | 1941/5971 [17:55<37:12,  1.81it/s, loss=0.236, v_num=0, train/loss_simple_step=0.00387, train/loss_vlb_step=2.08e-5, train/loss_step=0.00387, global_step=4791.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  33%|███▎      | 1942/5971 [17:56<37:12,  1.80it/s, loss=0.236, v_num=0, train/loss_simple_step=0.00387, train/loss_vlb_step=2.08e-5, train/loss_step=0.00387, global_step=4791.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  33%|███▎      | 1942/5971 [17:56<37:12,  1.80it/s, loss=0.234, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000472, train/loss_step=0.143, global_step=4791.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  33%|███▎      | 1943/5971 [17:57<37:12,  1.80it/s, loss=0.235, v_num=0, train/loss_simple_step=0.00524, train/loss_vlb_step=2.71e-5, train/loss_step=0.00524, global_step=4791.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  33%|███▎      | 1944/5971 [17:59<37:15,  1.80it/s, loss=0.212, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000368, train/loss_step=0.108, global_step=4791.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  33%|███▎      | 1945/5971 [18:00<37:15,  1.80it/s, loss=0.212, v_num=0, train/loss_simple_step=0.0181, train/loss_vlb_step=7.39e-5, train/loss_step=0.0181, global_step=4792.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  33%|███▎      | 1946/5971 [18:01<37:15,  1.80it/s, loss=0.212, v_num=0, train/loss_simple_step=0.0181, train/loss_vlb_step=7.39e-5, train/loss_step=0.0181, global_step=4792.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  33%|███▎      | 1946/5971 [18:01<37:15,  1.80it/s, loss=0.212, v_num=0, train/loss_simple_step=0.00715, train/loss_vlb_step=3.56e-5, train/loss_step=0.00715, global_step=4792.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  33%|███▎      | 1947/5971 [18:02<37:15,  1.80it/s, loss=0.226, v_num=0, train/loss_simple_step=0.278, train/loss_vlb_step=0.00129, train/loss_step=0.278, global_step=4792.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  33%|███▎      | 1948/5971 [18:04<37:18,  1.80it/s, loss=0.261, v_num=0, train/loss_simple_step=0.713, train/loss_vlb_step=0.019, train/loss_step=0.713, global_step=4792.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  33%|███▎      | 1949/5971 [18:05<37:18,  1.80it/s, loss=0.22, v_num=0, train/loss_simple_step=0.0248, train/loss_vlb_step=9.95e-5, train/loss_step=0.0248, global_step=4793.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  33%|███▎      | 1950/5971 [18:06<37:18,  1.80it/s, loss=0.22, v_num=0, train/loss_simple_step=0.0248, train/loss_vlb_step=9.95e-5, train/loss_step=0.0248, global_step=4793.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  33%|███▎      | 1950/5971 [18:06<37:18,  1.80it/s, loss=0.217, v_num=0, train/loss_simple_step=0.319, train/loss_vlb_step=0.0016, train/loss_step=0.319, global_step=4793.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  33%|███▎      | 1951/5971 [18:07<37:18,  1.80it/s, loss=0.182, v_num=0, train/loss_simple_step=0.00686, train/loss_vlb_step=3.29e-5, train/loss_step=0.00686, global_step=4793.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  33%|███▎      | 1952/5971 [18:09<37:21,  1.79it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0542, train/loss_vlb_step=0.000196, train/loss_step=0.0542, global_step=4793.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  33%|███▎      | 1953/5971 [18:10<37:22,  1.79it/s, loss=0.199, v_num=0, train/loss_simple_step=0.842, train/loss_vlb_step=0.0436, train/loss_step=0.842, global_step=4794.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  33%|███▎      | 1954/5971 [18:11<37:22,  1.79it/s, loss=0.199, v_num=0, train/loss_simple_step=0.842, train/loss_vlb_step=0.0436, train/loss_step=0.842, global_step=4794.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  33%|███▎      | 1954/5971 [18:11<37:22,  1.79it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=5.79e-5, train/loss_step=0.0138, global_step=4794.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  33%|███▎      | 1955/5971 [18:12<37:22,  1.79it/s, loss=0.211, v_num=0, train/loss_simple_step=0.651, train/loss_vlb_step=0.0131, train/loss_step=0.651, global_step=4794.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  33%|███▎      | 1956/5971 [18:14<37:24,  1.79it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0466, train/loss_vlb_step=0.000161, train/loss_step=0.0466, global_step=4794.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  33%|███▎      | 1957/5971 [18:15<37:25,  1.79it/s, loss=0.195, v_num=0, train/loss_simple_step=0.00976, train/loss_vlb_step=4.52e-5, train/loss_step=0.00976, global_step=4795.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  33%|███▎      | 1958/5971 [18:15<37:25,  1.79it/s, loss=0.195, v_num=0, train/loss_simple_step=0.00976, train/loss_vlb_step=4.52e-5, train/loss_step=0.00976, global_step=4795.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  33%|███▎      | 1958/5971 [18:15<37:25,  1.79it/s, loss=0.197, v_num=0, train/loss_simple_step=0.0475, train/loss_vlb_step=0.00016, train/loss_step=0.0475, global_step=4795.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  33%|███▎      | 1959/5971 [18:16<37:25,  1.79it/s, loss=0.225, v_num=0, train/loss_simple_step=0.545, train/loss_vlb_step=0.00806, train/loss_step=0.545, global_step=4795.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  33%|███▎      | 1960/5971 [18:19<37:28,  1.78it/s, loss=0.199, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000517, train/loss_step=0.143, global_step=4795.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  33%|███▎      | 1961/5971 [18:20<37:28,  1.78it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0371, train/loss_vlb_step=0.000132, train/loss_step=0.0371, global_step=4796.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  33%|███▎      | 1962/5971 [18:20<37:28,  1.78it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0371, train/loss_vlb_step=0.000132, train/loss_step=0.0371, global_step=4796.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  33%|███▎      | 1962/5971 [18:20<37:28,  1.78it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0461, train/loss_vlb_step=0.000162, train/loss_step=0.0461, global_step=4796.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  33%|███▎      | 1963/5971 [18:21<37:28,  1.78it/s, loss=0.203, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000585, train/loss_step=0.160, global_step=4796.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  33%|███▎      | 1964/5971 [18:23<37:31,  1.78it/s, loss=0.198, v_num=0, train/loss_simple_step=0.00181, train/loss_vlb_step=1.08e-5, train/loss_step=0.00181, global_step=4796.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  33%|███▎      | 1965/5971 [18:24<37:31,  1.78it/s, loss=0.197, v_num=0, train/loss_simple_step=0.00337, train/loss_vlb_step=1.76e-5, train/loss_step=0.00337, global_step=4797.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  33%|███▎      | 1966/5971 [18:25<37:31,  1.78it/s, loss=0.197, v_num=0, train/loss_simple_step=0.00337, train/loss_vlb_step=1.76e-5, train/loss_step=0.00337, global_step=4797.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  33%|███▎      | 1966/5971 [18:25<37:31,  1.78it/s, loss=0.218, v_num=0, train/loss_simple_step=0.415, train/loss_vlb_step=0.00303, train/loss_step=0.415, global_step=4797.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  33%|███▎      | 1967/5971 [18:26<37:31,  1.78it/s, loss=0.215, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.000826, train/loss_step=0.230, global_step=4797.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  33%|███▎      | 1968/5971 [18:28<37:33,  1.78it/s, loss=0.19, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.000729, train/loss_step=0.206, global_step=4797.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  33%|███▎      | 1969/5971 [18:29<37:33,  1.78it/s, loss=0.189, v_num=0, train/loss_simple_step=0.00406, train/loss_vlb_step=2.05e-5, train/loss_step=0.00406, global_step=4798.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  33%|███▎      | 1970/5971 [18:30<37:33,  1.78it/s, loss=0.189, v_num=0, train/loss_simple_step=0.00406, train/loss_vlb_step=2.05e-5, train/loss_step=0.00406, global_step=4798.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  33%|███▎      | 1970/5971 [18:30<37:33,  1.78it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0573, train/loss_vlb_step=0.000208, train/loss_step=0.0573, global_step=4798.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  33%|███▎      | 1971/5971 [18:31<37:33,  1.77it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0502, train/loss_vlb_step=0.000176, train/loss_step=0.0502, global_step=4798.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  33%|███▎      | 1972/5971 [18:33<37:36,  1.77it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00434, train/loss_vlb_step=2.32e-5, train/loss_step=0.00434, global_step=4798.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  33%|███▎      | 1973/5971 [18:34<37:36,  1.77it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0212, train/loss_vlb_step=8.05e-5, train/loss_step=0.0212, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  33%|███▎      | 1974/5971 [18:35<37:36,  1.77it/s, loss=0.135, v_num=0, train/loss_simple_step=0.0212, train/loss_vlb_step=8.05e-5, train/loss_step=0.0212, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  33%|███▎      | 1974/5971 [18:35<37:36,  1.77it/s, loss=0.152, v_num=0, train/loss_simple_step=0.356, train/loss_vlb_step=0.00171, train/loss_step=0.356, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  33%|███▎      | 1975/5971 [18:36<37:36,  1.77it/s, loss=0.13, v_num=0, train/loss_simple_step=0.221, train/loss_vlb_step=0.000774, train/loss_step=0.221, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  33%|███▎      | 1976/5971 [18:38<37:39,  1.77it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:24,  1.97it/s]
Epoch 8:  33%|███▎      | 1978/5971 [18:38<37:37,  1.77it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   1%|          | 2/167 [00:00<00:46,  3.54it/s]

Validating:   2%|▏         | 4/167 [00:00<00:22,  7.18it/s]
Epoch 8:  33%|███▎      | 1982/5971 [18:39<37:30,  1.77it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   4%|▍         | 7/167 [00:00<00:13, 11.94it/s]
Epoch 8:  33%|███▎      | 1986/5971 [18:39<37:24,  1.78it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   6%|▌         | 10/167 [00:00<00:09, 15.92it/s]

Validating:   8%|▊         | 13/167 [00:01<00:08, 18.66it/s]
Epoch 8:  33%|███▎      | 1990/5971 [18:39<37:18,  1.78it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  10%|▉         | 16/167 [00:01<00:07, 20.57it/s]
Epoch 8:  33%|███▎      | 1994/5971 [18:39<37:11,  1.78it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  11%|█▏        | 19/167 [00:01<00:06, 22.62it/s]
Epoch 8:  33%|███▎      | 1998/5971 [18:39<37:05,  1.79it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  13%|█▎        | 22/167 [00:01<00:05, 24.19it/s]

Validating:  15%|█▍        | 25/167 [00:01<00:05, 25.09it/s]
Epoch 8:  34%|███▎      | 2002/5971 [18:39<36:58,  1.79it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 27.39it/s]
Epoch 8:  34%|███▎      | 2006/5971 [18:39<36:52,  1.79it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  19%|█▉        | 32/167 [00:01<00:04, 27.89it/s]
Epoch 8:  34%|███▎      | 2010/5971 [18:40<36:46,  1.80it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  21%|██        | 35/167 [00:01<00:04, 27.08it/s]
Epoch 8:  34%|███▎      | 2014/5971 [18:40<36:39,  1.80it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  23%|██▎       | 38/167 [00:01<00:04, 27.12it/s]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 27.52it/s]
Epoch 8:  34%|███▍      | 2018/5971 [18:40<36:33,  1.80it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 27.94it/s]
Epoch 8:  34%|███▍      | 2022/5971 [18:40<36:27,  1.81it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 28.14it/s]
Epoch 8:  34%|███▍      | 2026/5971 [18:40<36:20,  1.81it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  31%|███       | 51/167 [00:02<00:04, 28.56it/s]
Epoch 8:  34%|███▍      | 2030/5971 [18:40<36:14,  1.81it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 25.84it/s]
Epoch 8:  34%|███▍      | 2034/5971 [18:40<36:08,  1.82it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  35%|███▍      | 58/167 [00:02<00:04, 26.40it/s]

Validating:  37%|███▋      | 61/167 [00:02<00:03, 27.05it/s]
Epoch 8:  34%|███▍      | 2038/5971 [18:41<36:02,  1.82it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  38%|███▊      | 64/167 [00:02<00:03, 27.43it/s]
Epoch 8:  34%|███▍      | 2042/5971 [18:41<35:56,  1.82it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  41%|████      | 68/167 [00:03<00:03, 28.37it/s]
Epoch 8:  34%|███▍      | 2046/5971 [18:41<35:50,  1.83it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 28.47it/s]
Epoch 8:  34%|███▍      | 2050/5971 [18:41<35:43,  1.83it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 28.11it/s]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 27.26it/s]
Epoch 8:  34%|███▍      | 2054/5971 [18:41<35:37,  1.83it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 26.62it/s]
Epoch 8:  34%|███▍      | 2058/5971 [18:41<35:31,  1.84it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 27.28it/s]
Epoch 8:  35%|███▍      | 2062/5971 [18:41<35:25,  1.84it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  51%|█████▏    | 86/167 [00:03<00:03, 26.83it/s]

Validating:  53%|█████▎    | 89/167 [00:03<00:02, 27.46it/s]
Epoch 8:  35%|███▍      | 2066/5971 [18:42<35:19,  1.84it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  55%|█████▌    | 92/167 [00:03<00:02, 26.49it/s]
Epoch 8:  35%|███▍      | 2070/5971 [18:42<35:13,  1.85it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 27.33it/s]
Epoch 8:  35%|███▍      | 2074/5971 [18:42<35:07,  1.85it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 26.79it/s]

Validating:  60%|██████    | 101/167 [00:04<00:02, 26.86it/s]
Epoch 8:  35%|███▍      | 2078/5971 [18:42<35:01,  1.85it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 28.00it/s][A
Epoch 8:  35%|███▍      | 2082/5971 [18:42<34:56,  1.86it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 28.07it/s][A
Epoch 8:  35%|███▍      | 2086/5971 [18:42<34:50,  1.86it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  66%|██████▋   | 111/167 [00:04<00:01, 28.41it/s][A
Epoch 8:  35%|███▌      | 2090/5971 [18:42<34:44,  1.86it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  68%|██████▊   | 114/167 [00:04<00:01, 28.22it/s][A

Validating:  70%|███████   | 117/167 [00:04<00:01, 27.65it/s][A
Epoch 8:  35%|███▌      | 2094/5971 [18:43<34:38,  1.87it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  72%|███████▏  | 120/167 [00:04<00:01, 26.64it/s][A
Epoch 8:  35%|███▌      | 2098/5971 [18:43<34:32,  1.87it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 27.09it/s][A
Epoch 8:  35%|███▌      | 2102/5971 [18:43<34:26,  1.87it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 26.67it/s][A

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 26.34it/s][A
Epoch 8:  35%|███▌      | 2106/5971 [18:43<34:20,  1.88it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 26.70it/s][A
Epoch 8:  35%|███▌      | 2110/5971 [18:43<34:15,  1.88it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  81%|████████  | 135/167 [00:05<00:01, 27.49it/s][A
Epoch 8:  35%|███▌      | 2114/5971 [18:43<34:09,  1.88it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 27.68it/s][A

Validating:  84%|████████▍ | 141/167 [00:05<00:00, 26.40it/s][A
Epoch 8:  35%|███▌      | 2118/5971 [18:43<34:03,  1.89it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  86%|████████▌ | 144/167 [00:05<00:00, 26.65it/s][A
Epoch 8:  36%|███▌      | 2122/5971 [18:44<33:58,  1.89it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  88%|████████▊ | 147/167 [00:05<00:00, 27.12it/s][A
Epoch 8:  36%|███▌      | 2126/5971 [18:44<33:52,  1.89it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 27.16it/s][A

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 26.05it/s][A
Epoch 8:  36%|███▌      | 2130/5971 [18:44<33:46,  1.90it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 27.36it/s][A
Epoch 8:  36%|███▌      | 2134/5971 [18:44<33:41,  1.90it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 25.96it/s][A
Epoch 8:  36%|███▌      | 2138/5971 [18:44<33:35,  1.90it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 26.50it/s][A
Epoch 8:  36%|███▌      | 2142/5971 [18:44<33:29,  1.91it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 26.98it/s][A
Epoch 8:  36%|███▌      | 2144/5971 [18:45<33:27,  1.91it/s, loss=0.152, v_num=0, train/loss_simple_step=0.476, train/loss_vlb_step=0.00394, train/loss_step=0.476, global_step=4799.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.35it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.45it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.31it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.95it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.44it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.80it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.06it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.23it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.36it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.46it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.54it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.60it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.64it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.66it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.67it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.59it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:06,  5.48it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.46it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.54it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.59it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.62it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.63it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.65it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.67it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:04<00:04,  5.69it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.70it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.70it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.70it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.70it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.70it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.71it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.70it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:02,  5.70it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.71it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.69it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.69it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.69it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.69it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.69it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.67it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.65it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:07<00:01,  5.67it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.68it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.69it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.66it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.67it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:08<00:00,  5.68it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.69it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.69it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.61it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.32it/s]

Epoch 8:  36%|███▌      | 2145/5971 [18:57<33:47,  1.89it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00388, train/loss_vlb_step=2.1e-5, train/loss_step=0.00388, global_step=4800.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A
Epoch 8:  36%|███▌      | 2145/5971 [18:58<33:49,  1.89it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00388, train/loss_vlb_step=2.1e-5, train/loss_step=0.00388, global_step=4800.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.44it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.30it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.89it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.35it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.69it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.93it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.12it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.29it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.41it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.50it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.56it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.58it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.62it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.65it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:05,  5.67it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.67it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.68it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.70it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.71it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.72it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.65it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.62it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.61it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.61it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.63it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.66it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.69it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.71it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.71it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.70it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.67it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:02,  5.69it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.71it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.72it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.73it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.73it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.71it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.70it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.72it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.71it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:07<00:01,  5.73it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.73it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.72it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.72it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.71it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:08<00:00,  5.71it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.71it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.73it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.73it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.33it/s]

Epoch 8:  36%|███▌      | 2146/5971 [19:08<34:06,  1.87it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00388, train/loss_vlb_step=2.1e-5, train/loss_step=0.00388, global_step=4800.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▌      | 2146/5971 [19:08<34:06,  1.87it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00531, train/loss_vlb_step=2.69e-5, train/loss_step=0.00531, global_step=4800.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.43it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.29it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.95it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.44it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.80it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.06it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.25it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.39it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.48it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.56it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.61it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.64it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.61it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.64it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.65it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.67it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.67it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.67it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.69it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.69it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.70it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.71it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.71it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:04<00:04,  5.72it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.72it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.71it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.72it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.71it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.68it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.68it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.69it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:02,  5.68it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.68it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.68it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.69it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.70it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.60it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.59it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.60it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.63it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:07<00:01,  5.65it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.68it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.69it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.69it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.66it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:08<00:00,  5.60it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.63it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.66it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.67it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.33it/s]

Epoch 8:  36%|███▌      | 2147/5971 [19:20<34:25,  1.85it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00531, train/loss_vlb_step=2.69e-5, train/loss_step=0.00531, global_step=4800.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▌      | 2147/5971 [19:20<34:25,  1.85it/s, loss=0.128, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000376, train/loss_step=0.114, global_step=4800.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.43it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.30it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.97it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.46it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.78it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.04it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.24it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.38it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.48it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.55it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.60it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.64it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.54it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.57it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.61it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.65it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.68it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.70it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.71it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.72it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.71it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.69it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.70it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:04<00:04,  5.71it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.72it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.73it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.74it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.74it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.74it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.72it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.71it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:02,  5.71it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.71it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.71it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.71it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.69it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.70it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.62it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.65it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.67it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:07<00:01,  5.69it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.70it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.71it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.72it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.73it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:08<00:00,  5.72it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:08<00:00,  5.71it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.70it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.69it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.35it/s]

Epoch 8:  36%|███▌      | 2148/5971 [19:33<34:47,  1.83it/s, loss=0.128, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000376, train/loss_step=0.114, global_step=4800.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▌      | 2148/5971 [19:33<34:47,  1.83it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0193, train/loss_vlb_step=7.62e-5, train/loss_step=0.0193, global_step=4800.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▌      | 2149/5971 [19:34<34:47,  1.83it/s, loss=0.122, v_num=0, train/loss_simple_step=0.0193, train/loss_vlb_step=7.62e-5, train/loss_step=0.0193, global_step=4800.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▌      | 2149/5971 [19:34<34:47,  1.83it/s, loss=0.138, v_num=0, train/loss_simple_step=0.373, train/loss_vlb_step=0.00194, train/loss_step=0.373, global_step=4801.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  36%|███▌      | 2150/5971 [19:35<34:47,  1.83it/s, loss=0.138, v_num=0, train/loss_simple_step=0.373, train/loss_vlb_step=0.00194, train/loss_step=0.373, global_step=4801.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▌      | 2150/5971 [19:35<34:47,  1.83it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0228, train/loss_vlb_step=8.87e-5, train/loss_step=0.0228, global_step=4801.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▌      | 2151/5971 [19:35<34:47,  1.83it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0228, train/loss_vlb_step=8.87e-5, train/loss_step=0.0228, global_step=4801.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▌      | 2151/5971 [19:35<34:47,  1.83it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00223, train/loss_vlb_step=1.27e-5, train/loss_step=0.00223, global_step=4801.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▌      | 2152/5971 [19:38<34:49,  1.83it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00223, train/loss_vlb_step=1.27e-5, train/loss_step=0.00223, global_step=4801.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▌      | 2152/5971 [19:38<34:49,  1.83it/s, loss=0.163, v_num=0, train/loss_simple_step=0.670, train/loss_vlb_step=0.113, train/loss_step=0.670, global_step=4801.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]      
Epoch 8:  36%|███▌      | 2153/5971 [19:38<34:49,  1.83it/s, loss=0.163, v_num=0, train/loss_simple_step=0.670, train/loss_vlb_step=0.113, train/loss_step=0.670, global_step=4801.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▌      | 2153/5971 [19:38<34:49,  1.83it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00265, train/loss_vlb_step=1.52e-5, train/loss_step=0.00265, global_step=4802.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▌      | 2154/5971 [19:39<34:49,  1.83it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00265, train/loss_vlb_step=1.52e-5, train/loss_step=0.00265, global_step=4802.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▌      | 2154/5971 [19:39<34:49,  1.83it/s, loss=0.149, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000474, train/loss_step=0.144, global_step=4802.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  36%|███▌      | 2155/5971 [19:40<34:49,  1.83it/s, loss=0.149, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000474, train/loss_step=0.144, global_step=4802.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▌      | 2155/5971 [19:40<34:49,  1.83it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0538, train/loss_vlb_step=0.000195, train/loss_step=0.0538, global_step=4802.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▌      | 2156/5971 [19:42<34:52,  1.82it/s, loss=0.14, v_num=0, train/loss_simple_step=0.0538, train/loss_vlb_step=0.000195, train/loss_step=0.0538, global_step=4802.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▌      | 2156/5971 [19:42<34:52,  1.82it/s, loss=0.141, v_num=0, train/loss_simple_step=0.221, train/loss_vlb_step=0.000776, train/loss_step=0.221, global_step=4802.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  36%|███▌      | 2157/5971 [19:43<34:52,  1.82it/s, loss=0.141, v_num=0, train/loss_simple_step=0.221, train/loss_vlb_step=0.000776, train/loss_step=0.221, global_step=4802.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▌      | 2157/5971 [19:43<34:52,  1.82it/s, loss=0.163, v_num=0, train/loss_simple_step=0.433, train/loss_vlb_step=0.00283, train/loss_step=0.433, global_step=4803.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  36%|███▌      | 2158/5971 [19:44<34:52,  1.82it/s, loss=0.163, v_num=0, train/loss_simple_step=0.433, train/loss_vlb_step=0.00283, train/loss_step=0.433, global_step=4803.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▌      | 2158/5971 [19:44<34:52,  1.82it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00158, train/loss_vlb_step=9.67e-6, train/loss_step=0.00158, global_step=4803.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▌      | 2159/5971 [19:45<34:52,  1.82it/s, loss=0.16, v_num=0, train/loss_simple_step=0.00158, train/loss_vlb_step=9.67e-6, train/loss_step=0.00158, global_step=4803.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▌      | 2159/5971 [19:45<34:52,  1.82it/s, loss=0.183, v_num=0, train/loss_simple_step=0.503, train/loss_vlb_step=0.00463, train/loss_step=0.503, global_step=4803.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  36%|███▌      | 2160/5971 [19:47<34:54,  1.82it/s, loss=0.183, v_num=0, train/loss_simple_step=0.503, train/loss_vlb_step=0.00463, train/loss_step=0.503, global_step=4803.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▌      | 2160/5971 [19:47<34:54,  1.82it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.05e-5, train/loss_step=0.0198, global_step=4803.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▌      | 2161/5971 [19:48<34:54,  1.82it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.05e-5, train/loss_step=0.0198, global_step=4803.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▌      | 2161/5971 [19:48<34:54,  1.82it/s, loss=0.205, v_num=0, train/loss_simple_step=0.460, train/loss_vlb_step=0.00408, train/loss_step=0.460, global_step=4804.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  36%|███▌      | 2162/5971 [19:49<34:54,  1.82it/s, loss=0.205, v_num=0, train/loss_simple_step=0.460, train/loss_vlb_step=0.00408, train/loss_step=0.460, global_step=4804.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▌      | 2162/5971 [19:49<34:54,  1.82it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00491, train/loss_vlb_step=2.42e-5, train/loss_step=0.00491, global_step=4804.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▌      | 2163/5971 [19:50<34:54,  1.82it/s, loss=0.188, v_num=0, train/loss_simple_step=0.00491, train/loss_vlb_step=2.42e-5, train/loss_step=0.00491, global_step=4804.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▌      | 2163/5971 [19:50<34:54,  1.82it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0764, train/loss_vlb_step=0.00026, train/loss_step=0.0764, global_step=4804.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  36%|███▌      | 2164/5971 [19:52<34:57,  1.82it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0764, train/loss_vlb_step=0.00026, train/loss_step=0.0764, global_step=4804.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▌      | 2164/5971 [19:52<34:57,  1.82it/s, loss=0.159, v_num=0, train/loss_simple_step=0.056, train/loss_vlb_step=0.000193, train/loss_step=0.056, global_step=4804.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▋      | 2165/5971 [19:53<34:57,  1.81it/s, loss=0.159, v_num=0, train/loss_simple_step=0.056, train/loss_vlb_step=0.000193, train/loss_step=0.056, global_step=4804.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▋      | 2165/5971 [19:53<34:57,  1.81it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0317, train/loss_vlb_step=0.00012, train/loss_step=0.0317, global_step=4805.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▋      | 2166/5971 [19:54<34:57,  1.81it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0317, train/loss_vlb_step=0.00012, train/loss_step=0.0317, global_step=4805.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▋      | 2166/5971 [19:54<34:57,  1.81it/s, loss=0.175, v_num=0, train/loss_simple_step=0.294, train/loss_vlb_step=0.00138, train/loss_step=0.294, global_step=4805.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  36%|███▋      | 2167/5971 [19:55<34:57,  1.81it/s, loss=0.175, v_num=0, train/loss_simple_step=0.294, train/loss_vlb_step=0.00138, train/loss_step=0.294, global_step=4805.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▋      | 2167/5971 [19:55<34:57,  1.81it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00316, train/loss_vlb_step=1.81e-5, train/loss_step=0.00316, global_step=4805.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▋      | 2168/5971 [19:57<35:00,  1.81it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00316, train/loss_vlb_step=1.81e-5, train/loss_step=0.00316, global_step=4805.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▋      | 2168/5971 [19:57<35:00,  1.81it/s, loss=0.195, v_num=0, train/loss_simple_step=0.535, train/loss_vlb_step=0.00374, train/loss_step=0.535, global_step=4805.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  36%|███▋      | 2169/5971 [19:58<35:00,  1.81it/s, loss=0.195, v_num=0, train/loss_simple_step=0.535, train/loss_vlb_step=0.00374, train/loss_step=0.535, global_step=4805.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▋      | 2169/5971 [19:58<35:00,  1.81it/s, loss=0.194, v_num=0, train/loss_simple_step=0.337, train/loss_vlb_step=0.00227, train/loss_step=0.337, global_step=4806.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▋      | 2170/5971 [19:59<35:00,  1.81it/s, loss=0.194, v_num=0, train/loss_simple_step=0.337, train/loss_vlb_step=0.00227, train/loss_step=0.337, global_step=4806.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▋      | 2170/5971 [19:59<35:00,  1.81it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0511, train/loss_vlb_step=0.000183, train/loss_step=0.0511, global_step=4806.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▋      | 2171/5971 [20:00<35:00,  1.81it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0511, train/loss_vlb_step=0.000183, train/loss_step=0.0511, global_step=4806.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▋      | 2171/5971 [20:00<35:00,  1.81it/s, loss=0.207, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.000899, train/loss_step=0.249, global_step=4806.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  36%|███▋      | 2172/5971 [20:02<35:02,  1.81it/s, loss=0.207, v_num=0, train/loss_simple_step=0.249, train/loss_vlb_step=0.000899, train/loss_step=0.249, global_step=4806.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▋      | 2172/5971 [20:02<35:02,  1.81it/s, loss=0.174, v_num=0, train/loss_simple_step=0.00596, train/loss_vlb_step=3.06e-5, train/loss_step=0.00596, global_step=4806.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▋      | 2173/5971 [20:03<35:02,  1.81it/s, loss=0.174, v_num=0, train/loss_simple_step=0.00596, train/loss_vlb_step=3.06e-5, train/loss_step=0.00596, global_step=4806.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▋      | 2173/5971 [20:03<35:02,  1.81it/s, loss=0.174, v_num=0, train/loss_simple_step=0.00873, train/loss_vlb_step=3.99e-5, train/loss_step=0.00873, global_step=4807.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▋      | 2174/5971 [20:04<35:02,  1.81it/s, loss=0.174, v_num=0, train/loss_simple_step=0.00873, train/loss_vlb_step=3.99e-5, train/loss_step=0.00873, global_step=4807.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▋      | 2174/5971 [20:04<35:02,  1.81it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00323, train/loss_vlb_step=1.79e-5, train/loss_step=0.00323, global_step=4807.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▋      | 2175/5971 [20:05<35:02,  1.81it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00323, train/loss_vlb_step=1.79e-5, train/loss_step=0.00323, global_step=4807.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▋      | 2175/5971 [20:05<35:02,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00189, train/loss_vlb_step=1.07e-5, train/loss_step=0.00189, global_step=4807.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▋      | 2176/5971 [20:07<35:04,  1.80it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00189, train/loss_vlb_step=1.07e-5, train/loss_step=0.00189, global_step=4807.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▋      | 2176/5971 [20:07<35:04,  1.80it/s, loss=0.186, v_num=0, train/loss_simple_step=0.638, train/loss_vlb_step=0.0061, train/loss_step=0.638, global_step=4807.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]     
Epoch 8:  36%|███▋      | 2177/5971 [20:08<35:04,  1.80it/s, loss=0.186, v_num=0, train/loss_simple_step=0.638, train/loss_vlb_step=0.0061, train/loss_step=0.638, global_step=4807.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▋      | 2177/5971 [20:08<35:04,  1.80it/s, loss=0.181, v_num=0, train/loss_simple_step=0.334, train/loss_vlb_step=0.00147, train/loss_step=0.334, global_step=4808.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▋      | 2178/5971 [20:09<35:04,  1.80it/s, loss=0.181, v_num=0, train/loss_simple_step=0.334, train/loss_vlb_step=0.00147, train/loss_step=0.334, global_step=4808.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▋      | 2178/5971 [20:09<35:04,  1.80it/s, loss=0.181, v_num=0, train/loss_simple_step=0.00859, train/loss_vlb_step=4.06e-5, train/loss_step=0.00859, global_step=4808.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▋      | 2179/5971 [20:10<35:04,  1.80it/s, loss=0.181, v_num=0, train/loss_simple_step=0.00859, train/loss_vlb_step=4.06e-5, train/loss_step=0.00859, global_step=4808.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  36%|███▋      | 2179/5971 [20:10<35:04,  1.80it/s, loss=0.157, v_num=0, train/loss_simple_step=0.029, train/loss_vlb_step=0.000108, train/loss_step=0.029, global_step=4808.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  37%|███▋      | 2180/5971 [20:12<35:07,  1.80it/s, loss=0.157, v_num=0, train/loss_simple_step=0.029, train/loss_vlb_step=0.000108, train/loss_step=0.029, global_step=4808.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2180/5971 [20:12<35:07,  1.80it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0434, train/loss_vlb_step=0.00015, train/loss_step=0.0434, global_step=4808.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2181/5971 [20:13<35:07,  1.80it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0434, train/loss_vlb_step=0.00015, train/loss_step=0.0434, global_step=4808.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2181/5971 [20:13<35:07,  1.80it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00376, train/loss_vlb_step=2.01e-5, train/loss_step=0.00376, global_step=4809.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2182/5971 [20:13<35:07,  1.80it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00376, train/loss_vlb_step=2.01e-5, train/loss_step=0.00376, global_step=4809.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2182/5971 [20:13<35:07,  1.80it/s, loss=0.138, v_num=0, train/loss_simple_step=0.043, train/loss_vlb_step=0.000149, train/loss_step=0.043, global_step=4809.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  37%|███▋      | 2183/5971 [20:14<35:07,  1.80it/s, loss=0.138, v_num=0, train/loss_simple_step=0.043, train/loss_vlb_step=0.000149, train/loss_step=0.043, global_step=4809.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2183/5971 [20:14<35:07,  1.80it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0457, train/loss_vlb_step=0.000164, train/loss_step=0.0457, global_step=4809.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2184/5971 [20:16<35:09,  1.80it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0457, train/loss_vlb_step=0.000164, train/loss_step=0.0457, global_step=4809.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2184/5971 [20:16<35:09,  1.80it/s, loss=0.151, v_num=0, train/loss_simple_step=0.360, train/loss_vlb_step=0.00211, train/loss_step=0.360, global_step=4809.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  37%|███▋      | 2185/5971 [20:17<35:09,  1.79it/s, loss=0.151, v_num=0, train/loss_simple_step=0.360, train/loss_vlb_step=0.00211, train/loss_step=0.360, global_step=4809.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2185/5971 [20:17<35:09,  1.79it/s, loss=0.153, v_num=0, train/loss_simple_step=0.074, train/loss_vlb_step=0.000244, train/loss_step=0.074, global_step=4810.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2186/5971 [20:18<35:09,  1.79it/s, loss=0.153, v_num=0, train/loss_simple_step=0.074, train/loss_vlb_step=0.000244, train/loss_step=0.074, global_step=4810.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2186/5971 [20:18<35:09,  1.79it/s, loss=0.14, v_num=0, train/loss_simple_step=0.019, train/loss_vlb_step=7.99e-5, train/loss_step=0.019, global_step=4810.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  37%|███▋      | 2187/5971 [20:19<35:09,  1.79it/s, loss=0.14, v_num=0, train/loss_simple_step=0.019, train/loss_vlb_step=7.99e-5, train/loss_step=0.019, global_step=4810.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2187/5971 [20:19<35:09,  1.79it/s, loss=0.145, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000342, train/loss_step=0.104, global_step=4810.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2188/5971 [20:22<35:11,  1.79it/s, loss=0.145, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000342, train/loss_step=0.104, global_step=4810.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2188/5971 [20:22<35:11,  1.79it/s, loss=0.142, v_num=0, train/loss_simple_step=0.487, train/loss_vlb_step=0.00531, train/loss_step=0.487, global_step=4810.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  37%|███▋      | 2189/5971 [20:22<35:11,  1.79it/s, loss=0.142, v_num=0, train/loss_simple_step=0.487, train/loss_vlb_step=0.00531, train/loss_step=0.487, global_step=4810.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2189/5971 [20:22<35:11,  1.79it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00317, train/loss_vlb_step=1.64e-5, train/loss_step=0.00317, global_step=4811.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2190/5971 [20:23<35:11,  1.79it/s, loss=0.126, v_num=0, train/loss_simple_step=0.00317, train/loss_vlb_step=1.64e-5, train/loss_step=0.00317, global_step=4811.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2190/5971 [20:23<35:11,  1.79it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00149, train/loss_vlb_step=8.89e-6, train/loss_step=0.00149, global_step=4811.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2191/5971 [20:24<35:11,  1.79it/s, loss=0.123, v_num=0, train/loss_simple_step=0.00149, train/loss_vlb_step=8.89e-6, train/loss_step=0.00149, global_step=4811.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2191/5971 [20:24<35:11,  1.79it/s, loss=0.125, v_num=0, train/loss_simple_step=0.280, train/loss_vlb_step=0.0011, train/loss_step=0.280, global_step=4811.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]     
Epoch 8:  37%|███▋      | 2192/5971 [20:26<35:14,  1.79it/s, loss=0.125, v_num=0, train/loss_simple_step=0.280, train/loss_vlb_step=0.0011, train/loss_step=0.280, global_step=4811.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2192/5971 [20:26<35:14,  1.79it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0613, train/loss_vlb_step=0.00021, train/loss_step=0.0613, global_step=4811.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2193/5971 [20:27<35:14,  1.79it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0613, train/loss_vlb_step=0.00021, train/loss_step=0.0613, global_step=4811.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2193/5971 [20:27<35:14,  1.79it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0727, train/loss_vlb_step=0.000245, train/loss_step=0.0727, global_step=4812.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2194/5971 [20:28<35:13,  1.79it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0727, train/loss_vlb_step=0.000245, train/loss_step=0.0727, global_step=4812.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2194/5971 [20:28<35:13,  1.79it/s, loss=0.141, v_num=0, train/loss_simple_step=0.212, train/loss_vlb_step=0.000837, train/loss_step=0.212, global_step=4812.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  37%|███▋      | 2195/5971 [20:29<35:13,  1.79it/s, loss=0.141, v_num=0, train/loss_simple_step=0.212, train/loss_vlb_step=0.000837, train/loss_step=0.212, global_step=4812.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2195/5971 [20:29<35:13,  1.79it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0014, train/loss_vlb_step=8.55e-6, train/loss_step=0.0014, global_step=4812.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2196/5971 [20:31<35:16,  1.78it/s, loss=0.141, v_num=0, train/loss_simple_step=0.0014, train/loss_vlb_step=8.55e-6, train/loss_step=0.0014, global_step=4812.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2196/5971 [20:31<35:16,  1.78it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0467, train/loss_vlb_step=0.000159, train/loss_step=0.0467, global_step=4812.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2197/5971 [20:32<35:16,  1.78it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0467, train/loss_vlb_step=0.000159, train/loss_step=0.0467, global_step=4812.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2197/5971 [20:32<35:16,  1.78it/s, loss=0.098, v_num=0, train/loss_simple_step=0.0644, train/loss_vlb_step=0.000218, train/loss_step=0.0644, global_step=4813.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2198/5971 [20:33<35:16,  1.78it/s, loss=0.098, v_num=0, train/loss_simple_step=0.0644, train/loss_vlb_step=0.000218, train/loss_step=0.0644, global_step=4813.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2198/5971 [20:33<35:16,  1.78it/s, loss=0.0977, v_num=0, train/loss_simple_step=0.00295, train/loss_vlb_step=1.65e-5, train/loss_step=0.00295, global_step=4813.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2199/5971 [20:34<35:16,  1.78it/s, loss=0.0977, v_num=0, train/loss_simple_step=0.00295, train/loss_vlb_step=1.65e-5, train/loss_step=0.00295, global_step=4813.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2199/5971 [20:34<35:16,  1.78it/s, loss=0.124, v_num=0, train/loss_simple_step=0.564, train/loss_vlb_step=0.00478, train/loss_step=0.564, global_step=4813.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]     
Epoch 8:  37%|███▋      | 2200/5971 [20:36<35:18,  1.78it/s, loss=0.124, v_num=0, train/loss_simple_step=0.564, train/loss_vlb_step=0.00478, train/loss_step=0.564, global_step=4813.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2200/5971 [20:36<35:18,  1.78it/s, loss=0.133, v_num=0, train/loss_simple_step=0.212, train/loss_vlb_step=0.000711, train/loss_step=0.212, global_step=4813.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2201/5971 [20:37<35:18,  1.78it/s, loss=0.133, v_num=0, train/loss_simple_step=0.212, train/loss_vlb_step=0.000711, train/loss_step=0.212, global_step=4813.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2201/5971 [20:37<35:18,  1.78it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0935, train/loss_vlb_step=0.000309, train/loss_step=0.0935, global_step=4814.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2202/5971 [20:38<35:18,  1.78it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0935, train/loss_vlb_step=0.000309, train/loss_step=0.0935, global_step=4814.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2202/5971 [20:38<35:18,  1.78it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00308, train/loss_vlb_step=1.7e-5, train/loss_step=0.00308, global_step=4814.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2203/5971 [20:39<35:18,  1.78it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00308, train/loss_vlb_step=1.7e-5, train/loss_step=0.00308, global_step=4814.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2203/5971 [20:39<35:18,  1.78it/s, loss=0.148, v_num=0, train/loss_simple_step=0.291, train/loss_vlb_step=0.00127, train/loss_step=0.291, global_step=4814.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  37%|███▋      | 2204/5971 [20:41<35:20,  1.78it/s, loss=0.148, v_num=0, train/loss_simple_step=0.291, train/loss_vlb_step=0.00127, train/loss_step=0.291, global_step=4814.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2204/5971 [20:41<35:20,  1.78it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00372, train/loss_vlb_step=2.05e-5, train/loss_step=0.00372, global_step=4814.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2205/5971 [20:42<35:20,  1.78it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00372, train/loss_vlb_step=2.05e-5, train/loss_step=0.00372, global_step=4814.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2205/5971 [20:42<35:20,  1.78it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0947, train/loss_vlb_step=0.000317, train/loss_step=0.0947, global_step=4815.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2206/5971 [20:43<35:20,  1.78it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0947, train/loss_vlb_step=0.000317, train/loss_step=0.0947, global_step=4815.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2206/5971 [20:43<35:20,  1.78it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00292, train/loss_vlb_step=1.6e-5, train/loss_step=0.00292, global_step=4815.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  37%|███▋      | 2207/5971 [20:43<35:20,  1.77it/s, loss=0.13, v_num=0, train/loss_simple_step=0.00292, train/loss_vlb_step=1.6e-5, train/loss_step=0.00292, global_step=4815.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2207/5971 [20:43<35:20,  1.77it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0103, train/loss_vlb_step=4.71e-5, train/loss_step=0.0103, global_step=4815.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2208/5971 [20:46<35:22,  1.77it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0103, train/loss_vlb_step=4.71e-5, train/loss_step=0.0103, global_step=4815.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2208/5971 [20:46<35:22,  1.77it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00671, train/loss_vlb_step=3.24e-5, train/loss_step=0.00671, global_step=4815.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2209/5971 [20:46<35:22,  1.77it/s, loss=0.101, v_num=0, train/loss_simple_step=0.00671, train/loss_vlb_step=3.24e-5, train/loss_step=0.00671, global_step=4815.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2209/5971 [20:46<35:22,  1.77it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0941, train/loss_vlb_step=0.000313, train/loss_step=0.0941, global_step=4816.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  37%|███▋      | 2210/5971 [20:48<35:23,  1.77it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0941, train/loss_vlb_step=0.000313, train/loss_step=0.0941, global_step=4816.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2210/5971 [20:48<35:23,  1.77it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00209, train/loss_vlb_step=1.23e-5, train/loss_step=0.00209, global_step=4816.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2211/5971 [20:49<35:23,  1.77it/s, loss=0.106, v_num=0, train/loss_simple_step=0.00209, train/loss_vlb_step=1.23e-5, train/loss_step=0.00209, global_step=4816.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2211/5971 [20:49<35:23,  1.77it/s, loss=0.093, v_num=0, train/loss_simple_step=0.0205, train/loss_vlb_step=8.06e-5, train/loss_step=0.0205, global_step=4816.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  37%|███▋      | 2212/5971 [20:51<35:25,  1.77it/s, loss=0.093, v_num=0, train/loss_simple_step=0.0205, train/loss_vlb_step=8.06e-5, train/loss_step=0.0205, global_step=4816.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2212/5971 [20:51<35:25,  1.77it/s, loss=0.0909, v_num=0, train/loss_simple_step=0.019, train/loss_vlb_step=7.48e-5, train/loss_step=0.019, global_step=4816.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  37%|███▋      | 2213/5971 [20:52<35:25,  1.77it/s, loss=0.0909, v_num=0, train/loss_simple_step=0.019, train/loss_vlb_step=7.48e-5, train/loss_step=0.019, global_step=4816.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2213/5971 [20:52<35:25,  1.77it/s, loss=0.0922, v_num=0, train/loss_simple_step=0.0985, train/loss_vlb_step=0.000328, train/loss_step=0.0985, global_step=4817.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2214/5971 [20:52<35:25,  1.77it/s, loss=0.0922, v_num=0, train/loss_simple_step=0.0985, train/loss_vlb_step=0.000328, train/loss_step=0.0985, global_step=4817.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2214/5971 [20:52<35:25,  1.77it/s, loss=0.084, v_num=0, train/loss_simple_step=0.0489, train/loss_vlb_step=0.000172, train/loss_step=0.0489, global_step=4817.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  37%|███▋      | 2215/5971 [20:53<35:25,  1.77it/s, loss=0.084, v_num=0, train/loss_simple_step=0.0489, train/loss_vlb_step=0.000172, train/loss_step=0.0489, global_step=4817.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2215/5971 [20:53<35:25,  1.77it/s, loss=0.0863, v_num=0, train/loss_simple_step=0.047, train/loss_vlb_step=0.000168, train/loss_step=0.047, global_step=4817.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  37%|███▋      | 2216/5971 [20:56<35:27,  1.76it/s, loss=0.0863, v_num=0, train/loss_simple_step=0.047, train/loss_vlb_step=0.000168, train/loss_step=0.047, global_step=4817.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2216/5971 [20:56<35:27,  1.76it/s, loss=0.096, v_num=0, train/loss_simple_step=0.239, train/loss_vlb_step=0.000958, train/loss_step=0.239, global_step=4817.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  37%|███▋      | 2217/5971 [20:57<35:27,  1.76it/s, loss=0.096, v_num=0, train/loss_simple_step=0.239, train/loss_vlb_step=0.000958, train/loss_step=0.239, global_step=4817.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2217/5971 [20:57<35:27,  1.76it/s, loss=0.0968, v_num=0, train/loss_simple_step=0.0804, train/loss_vlb_step=0.000264, train/loss_step=0.0804, global_step=4818.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2218/5971 [20:58<35:27,  1.76it/s, loss=0.0968, v_num=0, train/loss_simple_step=0.0804, train/loss_vlb_step=0.000264, train/loss_step=0.0804, global_step=4818.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2218/5971 [20:58<35:27,  1.76it/s, loss=0.111, v_num=0, train/loss_simple_step=0.297, train/loss_vlb_step=0.00129, train/loss_step=0.297, global_step=4818.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  37%|███▋      | 2219/5971 [20:59<35:27,  1.76it/s, loss=0.111, v_num=0, train/loss_simple_step=0.297, train/loss_vlb_step=0.00129, train/loss_step=0.297, global_step=4818.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2219/5971 [20:59<35:27,  1.76it/s, loss=0.0854, v_num=0, train/loss_simple_step=0.0433, train/loss_vlb_step=0.000154, train/loss_step=0.0433, global_step=4818.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2220/5971 [21:01<35:29,  1.76it/s, loss=0.0854, v_num=0, train/loss_simple_step=0.0433, train/loss_vlb_step=0.000154, train/loss_step=0.0433, global_step=4818.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2220/5971 [21:01<35:29,  1.76it/s, loss=0.0892, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.00104, train/loss_step=0.288, global_step=4818.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  37%|███▋      | 2221/5971 [21:01<35:29,  1.76it/s, loss=0.0892, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.00104, train/loss_step=0.288, global_step=4818.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2221/5971 [21:01<35:29,  1.76it/s, loss=0.0847, v_num=0, train/loss_simple_step=0.00346, train/loss_vlb_step=1.75e-5, train/loss_step=0.00346, global_step=4819.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2222/5971 [21:02<35:29,  1.76it/s, loss=0.0847, v_num=0, train/loss_simple_step=0.00346, train/loss_vlb_step=1.75e-5, train/loss_step=0.00346, global_step=4819.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2222/5971 [21:02<35:29,  1.76it/s, loss=0.0866, v_num=0, train/loss_simple_step=0.0419, train/loss_vlb_step=0.000156, train/loss_step=0.0419, global_step=4819.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  37%|███▋      | 2223/5971 [21:03<35:29,  1.76it/s, loss=0.0866, v_num=0, train/loss_simple_step=0.0419, train/loss_vlb_step=0.000156, train/loss_step=0.0419, global_step=4819.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2223/5971 [21:03<35:29,  1.76it/s, loss=0.0789, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000457, train/loss_step=0.136, global_step=4819.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  37%|███▋      | 2224/5971 [21:05<35:31,  1.76it/s, loss=0.0789, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000457, train/loss_step=0.136, global_step=4819.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2224/5971 [21:05<35:31,  1.76it/s, loss=0.0869, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000565, train/loss_step=0.165, global_step=4819.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2225/5971 [21:06<35:31,  1.76it/s, loss=0.0869, v_num=0, train/loss_simple_step=0.165, train/loss_vlb_step=0.000565, train/loss_step=0.165, global_step=4819.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2225/5971 [21:06<35:31,  1.76it/s, loss=0.0911, v_num=0, train/loss_simple_step=0.178, train/loss_vlb_step=0.000665, train/loss_step=0.178, global_step=4820.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2226/5971 [21:07<35:31,  1.76it/s, loss=0.0911, v_num=0, train/loss_simple_step=0.178, train/loss_vlb_step=0.000665, train/loss_step=0.178, global_step=4820.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2226/5971 [21:07<35:31,  1.76it/s, loss=0.106, v_num=0, train/loss_simple_step=0.299, train/loss_vlb_step=0.00121, train/loss_step=0.299, global_step=4820.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  37%|███▋      | 2227/5971 [21:08<35:31,  1.76it/s, loss=0.106, v_num=0, train/loss_simple_step=0.299, train/loss_vlb_step=0.00121, train/loss_step=0.299, global_step=4820.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2227/5971 [21:08<35:31,  1.76it/s, loss=0.122, v_num=0, train/loss_simple_step=0.332, train/loss_vlb_step=0.00153, train/loss_step=0.332, global_step=4820.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2228/5971 [21:10<35:33,  1.75it/s, loss=0.122, v_num=0, train/loss_simple_step=0.332, train/loss_vlb_step=0.00153, train/loss_step=0.332, global_step=4820.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2228/5971 [21:10<35:33,  1.75it/s, loss=0.144, v_num=0, train/loss_simple_step=0.445, train/loss_vlb_step=0.00279, train/loss_step=0.445, global_step=4820.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2229/5971 [21:11<35:33,  1.75it/s, loss=0.144, v_num=0, train/loss_simple_step=0.445, train/loss_vlb_step=0.00279, train/loss_step=0.445, global_step=4820.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2229/5971 [21:11<35:33,  1.75it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0466, train/loss_vlb_step=0.000167, train/loss_step=0.0466, global_step=4821.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2230/5971 [21:12<35:33,  1.75it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0466, train/loss_vlb_step=0.000167, train/loss_step=0.0466, global_step=4821.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2230/5971 [21:12<35:33,  1.75it/s, loss=0.158, v_num=0, train/loss_simple_step=0.331, train/loss_vlb_step=0.00145, train/loss_step=0.331, global_step=4821.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  37%|███▋      | 2231/5971 [21:13<35:33,  1.75it/s, loss=0.158, v_num=0, train/loss_simple_step=0.331, train/loss_vlb_step=0.00145, train/loss_step=0.331, global_step=4821.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2231/5971 [21:13<35:33,  1.75it/s, loss=0.173, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.00161, train/loss_step=0.313, global_step=4821.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2232/5971 [21:15<35:35,  1.75it/s, loss=0.173, v_num=0, train/loss_simple_step=0.313, train/loss_vlb_step=0.00161, train/loss_step=0.313, global_step=4821.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2232/5971 [21:15<35:35,  1.75it/s, loss=0.189, v_num=0, train/loss_simple_step=0.347, train/loss_vlb_step=0.00176, train/loss_step=0.347, global_step=4821.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2233/5971 [21:16<35:35,  1.75it/s, loss=0.189, v_num=0, train/loss_simple_step=0.347, train/loss_vlb_step=0.00176, train/loss_step=0.347, global_step=4821.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2233/5971 [21:16<35:35,  1.75it/s, loss=0.206, v_num=0, train/loss_simple_step=0.435, train/loss_vlb_step=0.00323, train/loss_step=0.435, global_step=4822.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2234/5971 [21:17<35:35,  1.75it/s, loss=0.206, v_num=0, train/loss_simple_step=0.435, train/loss_vlb_step=0.00323, train/loss_step=0.435, global_step=4822.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2234/5971 [21:17<35:35,  1.75it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0442, train/loss_vlb_step=0.000169, train/loss_step=0.0442, global_step=4822.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2235/5971 [21:17<35:35,  1.75it/s, loss=0.206, v_num=0, train/loss_simple_step=0.0442, train/loss_vlb_step=0.000169, train/loss_step=0.0442, global_step=4822.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2235/5971 [21:17<35:35,  1.75it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0167, train/loss_vlb_step=6.9e-5, train/loss_step=0.0167, global_step=4822.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  37%|███▋      | 2236/5971 [21:20<35:37,  1.75it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0167, train/loss_vlb_step=6.9e-5, train/loss_step=0.0167, global_step=4822.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2236/5971 [21:20<35:37,  1.75it/s, loss=0.207, v_num=0, train/loss_simple_step=0.304, train/loss_vlb_step=0.00145, train/loss_step=0.304, global_step=4822.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  37%|███▋      | 2237/5971 [21:21<35:37,  1.75it/s, loss=0.207, v_num=0, train/loss_simple_step=0.304, train/loss_vlb_step=0.00145, train/loss_step=0.304, global_step=4822.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2237/5971 [21:21<35:37,  1.75it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0342, train/loss_vlb_step=0.000125, train/loss_step=0.0342, global_step=4823.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2238/5971 [21:22<35:37,  1.75it/s, loss=0.205, v_num=0, train/loss_simple_step=0.0342, train/loss_vlb_step=0.000125, train/loss_step=0.0342, global_step=4823.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2238/5971 [21:22<35:37,  1.75it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0259, train/loss_vlb_step=0.000101, train/loss_step=0.0259, global_step=4823.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2239/5971 [21:22<35:37,  1.75it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0259, train/loss_vlb_step=0.000101, train/loss_step=0.0259, global_step=4823.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  37%|███▋      | 2239/5971 [21:22<35:37,  1.75it/s, loss=0.189, v_num=0, train/loss_simple_step=0.00135, train/loss_vlb_step=8.19e-6, train/loss_step=0.00135, global_step=4823.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  38%|███▊      | 2240/5971 [21:25<35:39,  1.74it/s, loss=0.189, v_num=0, train/loss_simple_step=0.00135, train/loss_vlb_step=8.19e-6, train/loss_step=0.00135, global_step=4823.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  38%|███▊      | 2240/5971 [21:25<35:39,  1.74it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0188, train/loss_vlb_step=7.5e-5, train/loss_step=0.0188, global_step=4823.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  38%|███▊      | 2241/5971 [21:25<35:39,  1.74it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0188, train/loss_vlb_step=7.5e-5, train/loss_step=0.0188, global_step=4823.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  38%|███▊      | 2241/5971 [21:25<35:39,  1.74it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  38%|███▊      | 2242/5971 [21:26<35:39,  1.74it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00347, train/loss_vlb_step=1.91e-5, train/loss_step=0.00347, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  38%|███▊      | 2242/5971 [21:26<35:39,  1.74it/s, loss=0.174, v_num=0, train/loss_simple_step=0.00928, train/loss_vlb_step=4.25e-5, train/loss_step=0.00928, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  38%|███▊      | 2243/5971 [21:27<35:39,  1.74it/s, loss=0.174, v_num=0, train/loss_simple_step=0.00928, train/loss_vlb_step=4.25e-5, train/loss_step=0.00928, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  38%|███▊      | 2243/5971 [21:27<35:39,  1.74it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00644, train/loss_vlb_step=3.31e-5, train/loss_step=0.00644, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  38%|███▊      | 2244/5971 [21:30<35:41,  1.74it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00644, train/loss_vlb_step=3.31e-5, train/loss_step=0.00644, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  38%|███▊      | 2244/5971 [21:30<35:41,  1.74it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:07,  2.44it/s]
Epoch 8:  38%|███▊      | 2246/5971 [21:30<35:39,  1.74it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   1%|          | 2/167 [00:00<00:48,  3.44it/s]
Epoch 8:  38%|███▊      | 2248/5971 [21:30<35:36,  1.74it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.20it/s]
Epoch 8:  38%|███▊      | 2251/5971 [21:30<35:32,  1.74it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   5%|▌         | 9/167 [00:00<00:10, 15.45it/s]
Epoch 8:  38%|███▊      | 2255/5971 [21:30<35:26,  1.75it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   7%|▋         | 12/167 [00:00<00:08, 17.86it/s]
Epoch 8:  38%|███▊      | 2259/5971 [21:31<35:20,  1.75it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   9%|▉         | 15/167 [00:01<00:07, 20.49it/s]

Validating:  11%|█         | 18/167 [00:01<00:06, 22.56it/s]
Epoch 8:  38%|███▊      | 2263/5971 [21:31<35:14,  1.75it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 23.12it/s]
Epoch 8:  38%|███▊      | 2267/5971 [21:31<35:09,  1.76it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  14%|█▍        | 24/167 [00:01<00:06, 23.44it/s]
Epoch 8:  38%|███▊      | 2271/5971 [21:31<35:03,  1.76it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 23.57it/s]
Epoch 8:  38%|███▊      | 2275/5971 [21:31<34:57,  1.76it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  19%|█▊        | 31/167 [00:01<00:05, 26.18it/s]
Epoch 8:  38%|███▊      | 2279/5971 [21:31<34:51,  1.76it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  21%|██        | 35/167 [00:01<00:04, 27.41it/s]

Validating:  23%|██▎       | 38/167 [00:01<00:04, 26.75it/s]
Epoch 8:  38%|███▊      | 2283/5971 [21:32<34:46,  1.77it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 26.82it/s]
Epoch 8:  38%|███▊      | 2287/5971 [21:32<34:40,  1.77it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 26.07it/s]
Epoch 8:  38%|███▊      | 2291/5971 [21:32<34:34,  1.77it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 26.10it/s]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 26.47it/s]
Epoch 8:  38%|███▊      | 2295/5971 [21:32<34:29,  1.78it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  32%|███▏      | 53/167 [00:02<00:04, 26.66it/s]
Epoch 8:  39%|███▊      | 2299/5971 [21:32<34:23,  1.78it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  34%|███▎      | 56/167 [00:02<00:04, 26.34it/s]
Epoch 8:  39%|███▊      | 2303/5971 [21:32<34:18,  1.78it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  35%|███▌      | 59/167 [00:02<00:04, 26.70it/s]

Validating:  37%|███▋      | 62/167 [00:02<00:03, 27.52it/s]
Epoch 8:  39%|███▊      | 2307/5971 [21:32<34:12,  1.79it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  39%|███▉      | 65/167 [00:02<00:03, 26.70it/s]
Epoch 8:  39%|███▊      | 2311/5971 [21:33<34:06,  1.79it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  41%|████      | 68/167 [00:03<00:03, 26.63it/s]
Epoch 8:  39%|███▉      | 2315/5971 [21:33<34:01,  1.79it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 27.02it/s]
Epoch 8:  39%|███▉      | 2319/5971 [21:33<33:55,  1.79it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 26.34it/s]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 26.50it/s]
Epoch 8:  39%|███▉      | 2323/5971 [21:33<33:50,  1.80it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 25.72it/s]
Epoch 8:  39%|███▉      | 2327/5971 [21:33<33:44,  1.80it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  50%|█████     | 84/167 [00:03<00:03, 26.40it/s]
Epoch 8:  39%|███▉      | 2331/5971 [21:33<33:39,  1.80it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  52%|█████▏    | 87/167 [00:03<00:03, 25.33it/s]

Validating:  54%|█████▍    | 90/167 [00:03<00:02, 25.93it/s]
Epoch 8:  39%|███▉      | 2335/5971 [21:33<33:34,  1.81it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  56%|█████▌    | 93/167 [00:04<00:03, 24.34it/s]
Epoch 8:  39%|███▉      | 2339/5971 [21:34<33:28,  1.81it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 25.58it/s]
Epoch 8:  39%|███▉      | 2343/5971 [21:34<33:23,  1.81it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 26.18it/s]

Validating:  61%|██████    | 102/167 [00:04<00:02, 25.66it/s]
Epoch 8:  39%|███▉      | 2347/5971 [21:34<33:17,  1.81it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 25.49it/s]
Epoch 8:  39%|███▉      | 2351/5971 [21:34<33:12,  1.82it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 26.61it/s]
Epoch 8:  39%|███▉      | 2355/5971 [21:34<33:07,  1.82it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 27.15it/s]

Validating:  68%|██████▊   | 114/167 [00:04<00:02, 26.24it/s]
Epoch 8:  40%|███▉      | 2359/5971 [21:34<33:01,  1.82it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  70%|███████   | 117/167 [00:04<00:01, 26.77it/s]
Epoch 8:  40%|███▉      | 2363/5971 [21:35<32:56,  1.83it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 26.99it/s]
Epoch 8:  40%|███▉      | 2367/5971 [21:35<32:51,  1.83it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 28.02it/s]
Epoch 8:  40%|███▉      | 2371/5971 [21:35<32:45,  1.83it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 26.13it/s]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 26.63it/s]
Epoch 8:  40%|███▉      | 2375/5971 [21:35<32:40,  1.83it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  80%|████████  | 134/167 [00:05<00:01, 28.09it/s]
Epoch 8:  40%|███▉      | 2379/5971 [21:35<32:35,  1.84it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 27.94it/s]
Epoch 8:  40%|███▉      | 2383/5971 [21:35<32:30,  1.84it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  84%|████████▍ | 140/167 [00:05<00:00, 27.22it/s]
Epoch 8:  40%|███▉      | 2387/5971 [21:35<32:24,  1.84it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  86%|████████▌ | 144/167 [00:05<00:00, 28.57it/s]
Epoch 8:  40%|████      | 2391/5971 [21:36<32:19,  1.85it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 27.34it/s]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 27.06it/s]
Epoch 8:  40%|████      | 2395/5971 [21:36<32:14,  1.85it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 28.17it/s]
Epoch 8:  40%|████      | 2399/5971 [21:36<32:09,  1.85it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 29.03it/s]
Epoch 8:  40%|████      | 2403/5971 [21:36<32:04,  1.85it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 28.83it/s]
Epoch 8:  40%|████      | 2407/5971 [21:36<31:59,  1.86it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 27.98it/s]
Epoch 8:  40%|████      | 2411/5971 [21:36<31:53,  1.86it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating: 100%|██████████| 167/167 [00:06<00:00, 28.38it/s]
Epoch 8:  40%|████      | 2412/5971 [21:37<31:53,  1.86it/s, loss=0.178, v_num=0, train/loss_simple_step=0.363, train/loss_vlb_step=0.00183, train/loss_step=0.363, global_step=4824.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Epoch 8:  40%|████      | 2413/5971 [21:38<31:54,  1.86it/s, loss=0.189, v_num=0, train/loss_simple_step=0.397, train/loss_vlb_step=0.00261, train/loss_step=0.397, global_step=4825.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  40%|████      | 2414/5971 [21:39<31:54,  1.86it/s, loss=0.198, v_num=0, train/loss_simple_step=0.491, train/loss_vlb_step=0.00552, train/loss_step=0.491, global_step=4825.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  40%|████      | 2415/5971 [21:40<31:53,  1.86it/s, loss=0.198, v_num=0, train/loss_simple_step=0.491, train/loss_vlb_step=0.00552, train/loss_step=0.491, global_step=4825.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  40%|████      | 2415/5971 [21:40<31:53,  1.86it/s, loss=0.2, v_num=0, train/loss_simple_step=0.373, train/loss_vlb_step=0.00217, train/loss_step=0.373, global_step=4825.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  40%|████      | 2416/5971 [21:42<31:55,  1.86it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0899, train/loss_vlb_step=0.000299, train/loss_step=0.0899, global_step=4825.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  40%|████      | 2417/5971 [21:43<31:55,  1.86it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0327, train/loss_vlb_step=0.000116, train/loss_step=0.0327, global_step=4826.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  40%|████      | 2418/5971 [21:44<31:55,  1.85it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0107, train/loss_vlb_step=5.04e-5, train/loss_step=0.0107, global_step=4826.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  41%|████      | 2419/5971 [21:45<31:55,  1.85it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0107, train/loss_vlb_step=5.04e-5, train/loss_step=0.0107, global_step=4826.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2419/5971 [21:45<31:55,  1.85it/s, loss=0.172, v_num=0, train/loss_simple_step=0.435, train/loss_vlb_step=0.004, train/loss_step=0.435, global_step=4826.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  41%|████      | 2420/5971 [21:47<31:57,  1.85it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0327, train/loss_vlb_step=0.000116, train/loss_step=0.0327, global_step=4826.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2421/5971 [21:48<31:57,  1.85it/s, loss=0.135, v_num=0, train/loss_simple_step=0.00257, train/loss_vlb_step=1.43e-5, train/loss_step=0.00257, global_step=4827.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2422/5971 [21:48<31:57,  1.85it/s, loss=0.151, v_num=0, train/loss_simple_step=0.377, train/loss_vlb_step=0.00241, train/loss_step=0.377, global_step=4827.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  41%|████      | 2423/5971 [21:49<31:57,  1.85it/s, loss=0.151, v_num=0, train/loss_simple_step=0.377, train/loss_vlb_step=0.00241, train/loss_step=0.377, global_step=4827.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2423/5971 [21:49<31:57,  1.85it/s, loss=0.168, v_num=0, train/loss_simple_step=0.349, train/loss_vlb_step=0.00205, train/loss_step=0.349, global_step=4827.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2424/5971 [21:51<31:58,  1.85it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0179, train/loss_vlb_step=6.99e-5, train/loss_step=0.0179, global_step=4827.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2425/5971 [21:52<31:58,  1.85it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.83e-5, train/loss_step=0.0105, global_step=4828.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2426/5971 [21:53<31:58,  1.85it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0716, train/loss_vlb_step=0.000239, train/loss_step=0.0716, global_step=4828.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2427/5971 [21:54<31:58,  1.85it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0716, train/loss_vlb_step=0.000239, train/loss_step=0.0716, global_step=4828.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2427/5971 [21:54<31:58,  1.85it/s, loss=0.167, v_num=0, train/loss_simple_step=0.250, train/loss_vlb_step=0.00102, train/loss_step=0.250, global_step=4828.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  41%|████      | 2428/5971 [21:56<32:00,  1.84it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0282, train/loss_vlb_step=0.000105, train/loss_step=0.0282, global_step=4828.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2429/5971 [21:57<32:00,  1.84it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00187, train/loss_vlb_step=1.09e-5, train/loss_step=0.00187, global_step=4829.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2430/5971 [21:58<32:00,  1.84it/s, loss=0.189, v_num=0, train/loss_simple_step=0.439, train/loss_vlb_step=0.00343, train/loss_step=0.439, global_step=4829.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  41%|████      | 2431/5971 [21:59<32:00,  1.84it/s, loss=0.189, v_num=0, train/loss_simple_step=0.439, train/loss_vlb_step=0.00343, train/loss_step=0.439, global_step=4829.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2431/5971 [21:59<32:00,  1.84it/s, loss=0.197, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000626, train/loss_step=0.174, global_step=4829.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2432/5971 [22:01<32:02,  1.84it/s, loss=0.18, v_num=0, train/loss_simple_step=0.00764, train/loss_vlb_step=3.73e-5, train/loss_step=0.00764, global_step=4829.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2433/5971 [22:02<32:02,  1.84it/s, loss=0.198, v_num=0, train/loss_simple_step=0.759, train/loss_vlb_step=0.0358, train/loss_step=0.759, global_step=4830.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  41%|████      | 2434/5971 [22:03<32:02,  1.84it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00395, train/loss_vlb_step=2.12e-5, train/loss_step=0.00395, global_step=4830.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2435/5971 [22:04<32:02,  1.84it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00395, train/loss_vlb_step=2.12e-5, train/loss_step=0.00395, global_step=4830.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2435/5971 [22:04<32:02,  1.84it/s, loss=0.161, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000441, train/loss_step=0.132, global_step=4830.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  41%|████      | 2436/5971 [22:06<32:04,  1.84it/s, loss=0.164, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000509, train/loss_step=0.148, global_step=4830.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2437/5971 [22:07<32:04,  1.84it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00542, train/loss_vlb_step=2.78e-5, train/loss_step=0.00542, global_step=4831.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2438/5971 [22:08<32:04,  1.84it/s, loss=0.175, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.000999, train/loss_step=0.262, global_step=4831.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  41%|████      | 2439/5971 [22:09<32:04,  1.84it/s, loss=0.175, v_num=0, train/loss_simple_step=0.262, train/loss_vlb_step=0.000999, train/loss_step=0.262, global_step=4831.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2439/5971 [22:09<32:04,  1.84it/s, loss=0.162, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.00058, train/loss_step=0.175, global_step=4831.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  41%|████      | 2440/5971 [22:11<32:06,  1.83it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0165, train/loss_vlb_step=6.82e-5, train/loss_step=0.0165, global_step=4831.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2441/5971 [22:12<32:06,  1.83it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0123, train/loss_vlb_step=5.73e-5, train/loss_step=0.0123, global_step=4832.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2442/5971 [22:13<32:06,  1.83it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.39e-6, train/loss_step=0.00156, global_step=4832.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2443/5971 [22:14<32:05,  1.83it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00156, train/loss_vlb_step=9.39e-6, train/loss_step=0.00156, global_step=4832.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2443/5971 [22:14<32:05,  1.83it/s, loss=0.132, v_num=0, train/loss_simple_step=0.127, train/loss_vlb_step=0.000416, train/loss_step=0.127, global_step=4832.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  41%|████      | 2444/5971 [22:16<32:07,  1.83it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0234, train/loss_vlb_step=8.82e-5, train/loss_step=0.0234, global_step=4832.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2445/5971 [22:17<32:07,  1.83it/s, loss=0.139, v_num=0, train/loss_simple_step=0.132, train/loss_vlb_step=0.000436, train/loss_step=0.132, global_step=4833.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  41%|████      | 2446/5971 [22:18<32:07,  1.83it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0706, train/loss_vlb_step=0.000245, train/loss_step=0.0706, global_step=4833.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2447/5971 [22:18<32:07,  1.83it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0706, train/loss_vlb_step=0.000245, train/loss_step=0.0706, global_step=4833.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2447/5971 [22:18<32:07,  1.83it/s, loss=0.127, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.32e-5, train/loss_step=0.0121, global_step=4833.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  41%|████      | 2448/5971 [22:20<32:09,  1.83it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00687, train/loss_vlb_step=3.37e-5, train/loss_step=0.00687, global_step=4833.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2449/5971 [22:21<32:09,  1.83it/s, loss=0.132, v_num=0, train/loss_simple_step=0.135, train/loss_vlb_step=0.000449, train/loss_step=0.135, global_step=4834.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  41%|████      | 2450/5971 [22:22<32:08,  1.83it/s, loss=0.125, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00133, train/loss_step=0.287, global_step=4834.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  41%|████      | 2451/5971 [22:23<32:08,  1.82it/s, loss=0.125, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.00133, train/loss_step=0.287, global_step=4834.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2451/5971 [22:23<32:08,  1.82it/s, loss=0.116, v_num=0, train/loss_simple_step=0.00174, train/loss_vlb_step=1.03e-5, train/loss_step=0.00174, global_step=4834.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2452/5971 [22:25<32:10,  1.82it/s, loss=0.15, v_num=0, train/loss_simple_step=0.681, train/loss_vlb_step=0.0148, train/loss_step=0.681, global_step=4834.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]      
Epoch 8:  41%|████      | 2453/5971 [22:26<32:10,  1.82it/s, loss=0.112, v_num=0, train/loss_simple_step=0.00196, train/loss_vlb_step=1.14e-5, train/loss_step=0.00196, global_step=4835.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2454/5971 [22:27<32:10,  1.82it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0631, train/loss_vlb_step=0.000214, train/loss_step=0.0631, global_step=4835.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  41%|████      | 2455/5971 [22:28<32:10,  1.82it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0631, train/loss_vlb_step=0.000214, train/loss_step=0.0631, global_step=4835.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2455/5971 [22:28<32:10,  1.82it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0119, train/loss_vlb_step=5.18e-5, train/loss_step=0.0119, global_step=4835.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  41%|████      | 2456/5971 [22:30<32:12,  1.82it/s, loss=0.107, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000392, train/loss_step=0.119, global_step=4835.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  41%|████      | 2457/5971 [22:31<32:12,  1.82it/s, loss=0.11, v_num=0, train/loss_simple_step=0.0594, train/loss_vlb_step=0.000203, train/loss_step=0.0594, global_step=4836.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2458/5971 [22:32<32:12,  1.82it/s, loss=0.109, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000881, train/loss_step=0.240, global_step=4836.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  41%|████      | 2459/5971 [22:33<32:11,  1.82it/s, loss=0.109, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.000881, train/loss_step=0.240, global_step=4836.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2459/5971 [22:33<32:11,  1.82it/s, loss=0.111, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000752, train/loss_step=0.214, global_step=4836.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2460/5971 [22:35<32:14,  1.82it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00393, train/loss_vlb_step=1.98e-5, train/loss_step=0.00393, global_step=4836.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2461/5971 [22:36<32:14,  1.81it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00678, train/loss_vlb_step=3.3e-5, train/loss_step=0.00678, global_step=4837.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  41%|████      | 2462/5971 [22:37<32:14,  1.81it/s, loss=0.155, v_num=0, train/loss_simple_step=0.908, train/loss_vlb_step=0.229, train/loss_step=0.908, global_step=4837.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  41%|████      | 2463/5971 [22:38<32:13,  1.81it/s, loss=0.155, v_num=0, train/loss_simple_step=0.908, train/loss_vlb_step=0.229, train/loss_step=0.908, global_step=4837.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████      | 2463/5971 [22:38<32:13,  1.81it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0887, train/loss_vlb_step=0.000291, train/loss_step=0.0887, global_step=4837.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████▏     | 2464/5971 [22:40<32:15,  1.81it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0116, train/loss_vlb_step=5.22e-5, train/loss_step=0.0116, global_step=4837.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  41%|████▏     | 2465/5971 [22:41<32:15,  1.81it/s, loss=0.153, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000483, train/loss_step=0.142, global_step=4838.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  41%|████▏     | 2466/5971 [22:42<32:15,  1.81it/s, loss=0.158, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000571, train/loss_step=0.170, global_step=4838.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████▏     | 2467/5971 [22:43<32:15,  1.81it/s, loss=0.158, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000571, train/loss_step=0.170, global_step=4838.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████▏     | 2467/5971 [22:43<32:15,  1.81it/s, loss=0.203, v_num=0, train/loss_simple_step=0.908, train/loss_vlb_step=0.229, train/loss_step=0.908, global_step=4838.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  41%|████▏     | 2468/5971 [22:45<32:17,  1.81it/s, loss=0.204, v_num=0, train/loss_simple_step=0.0309, train/loss_vlb_step=0.000109, train/loss_step=0.0309, global_step=4838.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████▏     | 2469/5971 [22:46<32:17,  1.81it/s, loss=0.199, v_num=0, train/loss_simple_step=0.0343, train/loss_vlb_step=0.000129, train/loss_step=0.0343, global_step=4839.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████▏     | 2470/5971 [22:47<32:16,  1.81it/s, loss=0.221, v_num=0, train/loss_simple_step=0.732, train/loss_vlb_step=0.0165, train/loss_step=0.732, global_step=4839.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  41%|████▏     | 2471/5971 [22:47<32:16,  1.81it/s, loss=0.221, v_num=0, train/loss_simple_step=0.732, train/loss_vlb_step=0.0165, train/loss_step=0.732, global_step=4839.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████▏     | 2471/5971 [22:47<32:16,  1.81it/s, loss=0.223, v_num=0, train/loss_simple_step=0.0396, train/loss_vlb_step=0.000138, train/loss_step=0.0396, global_step=4839.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████▏     | 2472/5971 [22:50<32:18,  1.80it/s, loss=0.205, v_num=0, train/loss_simple_step=0.321, train/loss_vlb_step=0.00135, train/loss_step=0.321, global_step=4839.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  41%|████▏     | 2473/5971 [22:51<32:18,  1.80it/s, loss=0.206, v_num=0, train/loss_simple_step=0.019, train/loss_vlb_step=7.92e-5, train/loss_step=0.019, global_step=4840.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████▏     | 2474/5971 [22:51<32:18,  1.80it/s, loss=0.218, v_num=0, train/loss_simple_step=0.293, train/loss_vlb_step=0.00116, train/loss_step=0.293, global_step=4840.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████▏     | 2475/5971 [22:52<32:18,  1.80it/s, loss=0.218, v_num=0, train/loss_simple_step=0.293, train/loss_vlb_step=0.00116, train/loss_step=0.293, global_step=4840.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████▏     | 2475/5971 [22:52<32:18,  1.80it/s, loss=0.225, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000914, train/loss_step=0.168, global_step=4840.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████▏     | 2476/5971 [22:54<32:20,  1.80it/s, loss=0.22, v_num=0, train/loss_simple_step=0.00623, train/loss_vlb_step=3.05e-5, train/loss_step=0.00623, global_step=4840.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  41%|████▏     | 2477/5971 [22:55<32:19,  1.80it/s, loss=0.218, v_num=0, train/loss_simple_step=0.0173, train/loss_vlb_step=7.18e-5, train/loss_step=0.0173, global_step=4841.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  42%|████▏     | 2478/5971 [22:56<32:19,  1.80it/s, loss=0.207, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=6.7e-5, train/loss_step=0.0156, global_step=4841.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  42%|████▏     | 2479/5971 [22:57<32:19,  1.80it/s, loss=0.207, v_num=0, train/loss_simple_step=0.0156, train/loss_vlb_step=6.7e-5, train/loss_step=0.0156, global_step=4841.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  42%|████▏     | 2479/5971 [22:57<32:19,  1.80it/s, loss=0.209, v_num=0, train/loss_simple_step=0.259, train/loss_vlb_step=0.00125, train/loss_step=0.259, global_step=4841.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  42%|████▏     | 2480/5971 [22:59<32:21,  1.80it/s, loss=0.209, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.17e-5, train/loss_step=0.00199, global_step=4841.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  42%|████▏     | 2481/5971 [23:00<32:21,  1.80it/s, loss=0.214, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000371, train/loss_step=0.111, global_step=4842.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  42%|████▏     | 2482/5971 [23:01<32:21,  1.80it/s, loss=0.18, v_num=0, train/loss_simple_step=0.232, train/loss_vlb_step=0.000934, train/loss_step=0.232, global_step=4842.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  42%|████▏     | 2483/5971 [23:02<32:21,  1.80it/s, loss=0.18, v_num=0, train/loss_simple_step=0.232, train/loss_vlb_step=0.000934, train/loss_step=0.232, global_step=4842.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  42%|████▏     | 2483/5971 [23:02<32:21,  1.80it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0893, train/loss_vlb_step=0.000293, train/loss_step=0.0893, global_step=4842.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  42%|████▏     | 2484/5971 [23:04<32:23,  1.79it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0217, train/loss_vlb_step=8.91e-5, train/loss_step=0.0217, global_step=4842.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  42%|████▏     | 2485/5971 [23:05<32:23,  1.79it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0264, train/loss_vlb_step=0.0001, train/loss_step=0.0264, global_step=4843.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  42%|████▏     | 2486/5971 [23:06<32:22,  1.79it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00266, train/loss_vlb_step=1.47e-5, train/loss_step=0.00266, global_step=4843.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  42%|████▏     | 2487/5971 [23:07<32:22,  1.79it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00266, train/loss_vlb_step=1.47e-5, train/loss_step=0.00266, global_step=4843.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  42%|████▏     | 2487/5971 [23:07<32:22,  1.79it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00136, train/loss_vlb_step=8.29e-6, train/loss_step=0.00136, global_step=4843.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  42%|████▏     | 2488/5971 [23:09<32:24,  1.79it/s, loss=0.153, v_num=0, train/loss_simple_step=0.669, train/loss_vlb_step=0.049, train/loss_step=0.669, global_step=4843.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]      
Epoch 8:  42%|████▏     | 2489/5971 [23:10<32:24,  1.79it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00285, train/loss_vlb_step=1.59e-5, train/loss_step=0.00285, global_step=4844.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  42%|████▏     | 2490/5971 [23:11<32:24,  1.79it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0733, train/loss_vlb_step=0.000243, train/loss_step=0.0733, global_step=4844.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  42%|████▏     | 2491/5971 [23:12<32:24,  1.79it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0733, train/loss_vlb_step=0.000243, train/loss_step=0.0733, global_step=4844.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  42%|████▏     | 2491/5971 [23:12<32:24,  1.79it/s, loss=0.124, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000463, train/loss_step=0.140, global_step=4844.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  42%|████▏     | 2492/5971 [23:14<32:26,  1.79it/s, loss=0.129, v_num=0, train/loss_simple_step=0.437, train/loss_vlb_step=0.0037, train/loss_step=0.437, global_step=4844.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  42%|████▏     | 2493/5971 [23:15<32:25,  1.79it/s, loss=0.135, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000481, train/loss_step=0.140, global_step=4845.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  42%|████▏     | 2494/5971 [23:16<32:25,  1.79it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0947, train/loss_vlb_step=0.000318, train/loss_step=0.0947, global_step=4845.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  42%|████▏     | 2495/5971 [23:17<32:25,  1.79it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0947, train/loss_vlb_step=0.000318, train/loss_step=0.0947, global_step=4845.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  42%|████▏     | 2495/5971 [23:17<32:25,  1.79it/s, loss=0.117, v_num=0, train/loss_simple_step=0.00219, train/loss_vlb_step=1.27e-5, train/loss_step=0.00219, global_step=4845.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  42%|████▏     | 2496/5971 [23:19<32:27,  1.78it/s, loss=0.133, v_num=0, train/loss_simple_step=0.329, train/loss_vlb_step=0.00168, train/loss_step=0.329, global_step=4845.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  42%|████▏     | 2497/5971 [23:20<32:27,  1.78it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0853, train/loss_vlb_step=0.000286, train/loss_step=0.0853, global_step=4846.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  42%|████▏     | 2498/5971 [23:21<32:27,  1.78it/s, loss=0.174, v_num=0, train/loss_simple_step=0.761, train/loss_vlb_step=0.0237, train/loss_step=0.761, global_step=4846.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  42%|████▏     | 2499/5971 [23:21<32:26,  1.78it/s, loss=0.174, v_num=0, train/loss_simple_step=0.761, train/loss_vlb_step=0.0237, train/loss_step=0.761, global_step=4846.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  42%|████▏     | 2499/5971 [23:21<32:26,  1.78it/s, loss=0.185, v_num=0, train/loss_simple_step=0.480, train/loss_vlb_step=0.00389, train/loss_step=0.480, global_step=4846.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  42%|████▏     | 2500/5971 [23:24<32:28,  1.78it/s, loss=0.191, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.00043, train/loss_step=0.131, global_step=4846.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  42%|████▏     | 2501/5971 [23:25<32:28,  1.78it/s, loss=0.186, v_num=0, train/loss_simple_step=0.00351, train/loss_vlb_step=1.9e-5, train/loss_step=0.00351, global_step=4847.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  42%|████▏     | 2502/5971 [23:25<32:28,  1.78it/s, loss=0.185, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.000789, train/loss_step=0.206, global_step=4847.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  42%|████▏     | 2503/5971 [23:26<32:28,  1.78it/s, loss=0.185, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.000789, train/loss_step=0.206, global_step=4847.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  42%|████▏     | 2503/5971 [23:26<32:28,  1.78it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0764, train/loss_vlb_step=0.000264, train/loss_step=0.0764, global_step=4847.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  42%|████▏     | 2504/5971 [23:28<32:29,  1.78it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00533, train/loss_vlb_step=2.76e-5, train/loss_step=0.00533, global_step=4847.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  42%|████▏     | 2505/5971 [23:29<32:29,  1.78it/s, loss=0.182, v_num=0, train/loss_simple_step=0.00208, train/loss_vlb_step=1.21e-5, train/loss_step=0.00208, global_step=4848.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  42%|████▏     | 2506/5971 [23:30<32:29,  1.78it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0191, train/loss_vlb_step=8e-5, train/loss_step=0.0191, global_step=4848.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]     
Epoch 8:  42%|████▏     | 2507/5971 [23:31<32:29,  1.78it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0191, train/loss_vlb_step=8e-5, train/loss_step=0.0191, global_step=4848.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  42%|████▏     | 2507/5971 [23:31<32:29,  1.78it/s, loss=0.204, v_num=0, train/loss_simple_step=0.425, train/loss_vlb_step=0.00365, train/loss_step=0.425, global_step=4848.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  42%|████▏     | 2508/5971 [23:33<32:31,  1.77it/s, loss=0.171, v_num=0, train/loss_simple_step=0.008, train/loss_vlb_step=3.88e-5, train/loss_step=0.008, global_step=4848.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  42%|████▏     | 2509/5971 [23:34<32:31,  1.77it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0055, train/loss_vlb_step=2.69e-5, train/loss_step=0.0055, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  42%|████▏     | 2510/5971 [23:35<32:31,  1.77it/s, loss=0.173, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000381, train/loss_step=0.115, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  42%|████▏     | 2511/5971 [23:36<32:31,  1.77it/s, loss=0.173, v_num=0, train/loss_simple_step=0.115, train/loss_vlb_step=0.000381, train/loss_step=0.115, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  42%|████▏     | 2511/5971 [23:36<32:31,  1.77it/s, loss=0.185, v_num=0, train/loss_simple_step=0.368, train/loss_vlb_step=0.00155, train/loss_step=0.368, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  42%|████▏     | 2512/5971 [23:38<32:32,  1.77it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<01:06,  2.51it/s]

Validating:   1%|          | 2/167 [00:00<00:46,  3.52it/s]
Epoch 8:  42%|████▏     | 2515/5971 [23:39<32:29,  1.77it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.35it/s]
Epoch 8:  42%|████▏     | 2519/5971 [23:39<32:24,  1.78it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.60it/s]
Epoch 8:  42%|████▏     | 2523/5971 [23:39<32:19,  1.78it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.30it/s]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.39it/s]
Epoch 8:  42%|████▏     | 2527/5971 [23:39<32:14,  1.78it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  10%|█         | 17/167 [00:01<00:07, 21.29it/s]
Epoch 8:  42%|████▏     | 2531/5971 [23:39<32:09,  1.78it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 23.33it/s]
Epoch 8:  42%|████▏     | 2535/5971 [23:40<32:04,  1.79it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  15%|█▍        | 25/167 [00:01<00:05, 25.58it/s]
Epoch 8:  43%|████▎     | 2539/5971 [23:40<31:58,  1.79it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  17%|█▋        | 28/167 [00:01<00:05, 25.74it/s]
Epoch 8:  43%|████▎     | 2543/5971 [23:40<31:53,  1.79it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  19%|█▊        | 31/167 [00:01<00:05, 26.55it/s]
Epoch 8:  43%|████▎     | 2547/5971 [23:40<31:48,  1.79it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  21%|██        | 35/167 [00:01<00:04, 27.76it/s][A

Validating:  23%|██▎       | 38/167 [00:01<00:04, 28.07it/s][A
Epoch 8:  43%|████▎     | 2551/5971 [23:40<31:43,  1.80it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 28.08it/s][A
Epoch 8:  43%|████▎     | 2555/5971 [23:40<31:38,  1.80it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 27.81it/s][A
Epoch 8:  43%|████▎     | 2559/5971 [23:40<31:33,  1.80it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 25.39it/s][A
Epoch 8:  43%|████▎     | 2563/5971 [23:41<31:28,  1.80it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  31%|███       | 51/167 [00:02<00:04, 27.14it/s][A

Validating:  32%|███▏      | 54/167 [00:02<00:04, 27.31it/s][A
Epoch 8:  43%|████▎     | 2567/5971 [23:41<31:23,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  35%|███▍      | 58/167 [00:02<00:03, 28.15it/s][A
Epoch 8:  43%|████▎     | 2571/5971 [23:41<31:18,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  37%|███▋      | 61/167 [00:02<00:03, 28.51it/s][A
Epoch 8:  43%|████▎     | 2575/5971 [23:41<31:13,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  38%|███▊      | 64/167 [00:02<00:03, 27.28it/s][A
Epoch 8:  43%|████▎     | 2579/5971 [23:41<31:09,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  40%|████      | 67/167 [00:03<00:03, 26.25it/s][A
Epoch 8:  43%|████▎     | 2583/5971 [23:41<31:04,  1.82it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 27.68it/s][A

Validating:  44%|████▍     | 74/167 [00:03<00:03, 27.80it/s][A
Epoch 8:  43%|████▎     | 2587/5971 [23:41<30:59,  1.82it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  46%|████▌     | 77/167 [00:03<00:03, 27.97it/s][A
Epoch 8:  43%|████▎     | 2591/5971 [23:42<30:54,  1.82it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  48%|████▊     | 80/167 [00:03<00:03, 27.43it/s][A
Epoch 8:  43%|████▎     | 2595/5971 [23:42<30:49,  1.83it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  50%|████▉     | 83/167 [00:03<00:03, 25.79it/s][A

Validating:  51%|█████▏    | 86/167 [00:03<00:03, 25.86it/s][A
Epoch 8:  44%|████▎     | 2599/5971 [23:42<30:44,  1.83it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  53%|█████▎    | 89/167 [00:03<00:03, 24.75it/s][A
Epoch 8:  44%|████▎     | 2603/5971 [23:42<30:39,  1.83it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  55%|█████▌    | 92/167 [00:03<00:02, 25.28it/s][A
Epoch 8:  44%|████▎     | 2607/5971 [23:42<30:35,  1.83it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  57%|█████▋    | 95/167 [00:04<00:02, 25.86it/s][A

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 25.69it/s][A
Epoch 8:  44%|████▎     | 2611/5971 [23:42<30:30,  1.84it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  60%|██████    | 101/167 [00:04<00:02, 26.35it/s][A
Epoch 8:  44%|████▍     | 2615/5971 [23:43<30:25,  1.84it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 26.74it/s][A
Epoch 8:  44%|████▍     | 2619/5971 [23:43<30:20,  1.84it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 27.45it/s][A

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 27.66it/s][A
Epoch 8:  44%|████▍     | 2623/5971 [23:43<30:16,  1.84it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  68%|██████▊   | 113/167 [00:04<00:01, 27.17it/s][A
Epoch 8:  44%|████▍     | 2627/5971 [23:43<30:11,  1.85it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  69%|██████▉   | 116/167 [00:04<00:01, 27.84it/s][A
Epoch 8:  44%|████▍     | 2631/5971 [23:43<30:06,  1.85it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  71%|███████▏  | 119/167 [00:04<00:01, 27.06it/s][A

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 26.73it/s][A
Epoch 8:  44%|████▍     | 2635/5971 [23:43<30:01,  1.85it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 27.07it/s][A
Epoch 8:  44%|████▍     | 2639/5971 [23:43<29:57,  1.85it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 27.54it/s][A
Epoch 8:  44%|████▍     | 2643/5971 [23:44<29:52,  1.86it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 27.68it/s][A
Epoch 8:  44%|████▍     | 2647/5971 [23:44<29:47,  1.86it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  81%|████████  | 135/167 [00:05<00:01, 26.92it/s][A

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 26.93it/s][A
Epoch 8:  44%|████▍     | 2651/5971 [23:44<29:43,  1.86it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  84%|████████▍ | 141/167 [00:05<00:00, 27.32it/s][A
Epoch 8:  44%|████▍     | 2655/5971 [23:44<29:38,  1.86it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  86%|████████▌ | 144/167 [00:05<00:00, 26.78it/s][A
Epoch 8:  45%|████▍     | 2659/5971 [23:44<29:33,  1.87it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  88%|████████▊ | 147/167 [00:05<00:00, 26.99it/s][A

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 26.58it/s][A
Epoch 8:  45%|████▍     | 2663/5971 [23:44<29:29,  1.87it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 27.07it/s][A
Epoch 8:  45%|████▍     | 2667/5971 [23:44<29:24,  1.87it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 27.73it/s][A
Epoch 8:  45%|████▍     | 2671/5971 [23:45<29:20,  1.87it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 26.95it/s][A

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 26.66it/s][A
Epoch 8:  45%|████▍     | 2675/5971 [23:45<29:15,  1.88it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  99%|█████████▉| 165/167 [00:06<00:00, 26.78it/s][A
Epoch 8:  45%|████▍     | 2679/5971 [23:45<29:10,  1.88it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  45%|████▍     | 2680/5971 [23:45<29:10,  1.88it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0423, train/loss_vlb_step=0.000159, train/loss_step=0.0423, global_step=4849.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Epoch 8:  45%|████▍     | 2681/5971 [23:46<29:09,  1.88it/s, loss=0.16, v_num=0, train/loss_simple_step=0.044, train/loss_vlb_step=0.000157, train/loss_step=0.044, global_step=4850.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  45%|████▍     | 2682/5971 [23:47<29:09,  1.88it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0735, train/loss_vlb_step=0.000244, train/loss_step=0.0735, global_step=4850.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  45%|████▍     | 2683/5971 [23:48<29:09,  1.88it/s, loss=0.159, v_num=0, train/loss_simple_step=0.0735, train/loss_vlb_step=0.000244, train/loss_step=0.0735, global_step=4850.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  45%|████▍     | 2683/5971 [23:48<29:09,  1.88it/s, loss=0.187, v_num=0, train/loss_simple_step=0.563, train/loss_vlb_step=0.00833, train/loss_step=0.563, global_step=4850.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  45%|████▍     | 2684/5971 [23:50<29:11,  1.88it/s, loss=0.171, v_num=0, train/loss_simple_step=0.00218, train/loss_vlb_step=1.21e-5, train/loss_step=0.00218, global_step=4850.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  45%|████▍     | 2685/5971 [23:51<29:11,  1.88it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00854, train/loss_vlb_step=4.07e-5, train/loss_step=0.00854, global_step=4851.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  45%|████▍     | 2686/5971 [23:52<29:11,  1.88it/s, loss=0.135, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000413, train/loss_step=0.126, global_step=4851.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  45%|████▌     | 2687/5971 [23:53<29:11,  1.88it/s, loss=0.135, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000413, train/loss_step=0.126, global_step=4851.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  45%|████▌     | 2687/5971 [23:53<29:11,  1.88it/s, loss=0.125, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.00116, train/loss_step=0.268, global_step=4851.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  45%|████▌     | 2688/5971 [23:55<29:12,  1.87it/s, loss=0.139, v_num=0, train/loss_simple_step=0.426, train/loss_vlb_step=0.00209, train/loss_step=0.426, global_step=4851.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  45%|████▌     | 2689/5971 [23:56<29:12,  1.87it/s, loss=0.153, v_num=0, train/loss_simple_step=0.267, train/loss_vlb_step=0.00117, train/loss_step=0.267, global_step=4852.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  45%|████▌     | 2690/5971 [23:57<29:12,  1.87it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0428, train/loss_vlb_step=0.000152, train/loss_step=0.0428, global_step=4852.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  45%|████▌     | 2691/5971 [23:58<29:12,  1.87it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0428, train/loss_vlb_step=0.000152, train/loss_step=0.0428, global_step=4852.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  45%|████▌     | 2691/5971 [23:58<29:12,  1.87it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0216, train/loss_vlb_step=8.23e-5, train/loss_step=0.0216, global_step=4852.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  45%|████▌     | 2692/5971 [24:00<29:13,  1.87it/s, loss=0.156, v_num=0, train/loss_simple_step=0.286, train/loss_vlb_step=0.0015, train/loss_step=0.286, global_step=4852.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  45%|████▌     | 2693/5971 [24:01<29:13,  1.87it/s, loss=0.162, v_num=0, train/loss_simple_step=0.120, train/loss_vlb_step=0.000395, train/loss_step=0.120, global_step=4853.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  45%|████▌     | 2694/5971 [24:02<29:13,  1.87it/s, loss=0.171, v_num=0, train/loss_simple_step=0.215, train/loss_vlb_step=0.000746, train/loss_step=0.215, global_step=4853.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  45%|████▌     | 2695/5971 [24:03<29:13,  1.87it/s, loss=0.171, v_num=0, train/loss_simple_step=0.215, train/loss_vlb_step=0.000746, train/loss_step=0.215, global_step=4853.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  45%|████▌     | 2695/5971 [24:03<29:13,  1.87it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0426, train/loss_vlb_step=0.000155, train/loss_step=0.0426, global_step=4853.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  45%|████▌     | 2696/5971 [24:05<29:14,  1.87it/s, loss=0.157, v_num=0, train/loss_simple_step=0.105, train/loss_vlb_step=0.000344, train/loss_step=0.105, global_step=4853.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  45%|████▌     | 2697/5971 [24:06<29:14,  1.87it/s, loss=0.173, v_num=0, train/loss_simple_step=0.331, train/loss_vlb_step=0.00194, train/loss_step=0.331, global_step=4854.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  45%|████▌     | 2698/5971 [24:07<29:14,  1.87it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00255, train/loss_vlb_step=1.42e-5, train/loss_step=0.00255, global_step=4854.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  45%|████▌     | 2699/5971 [24:07<29:14,  1.86it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00255, train/loss_vlb_step=1.42e-5, train/loss_step=0.00255, global_step=4854.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  45%|████▌     | 2699/5971 [24:07<29:14,  1.86it/s, loss=0.172, v_num=0, train/loss_simple_step=0.458, train/loss_vlb_step=0.00273, train/loss_step=0.458, global_step=4854.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  45%|████▌     | 2700/5971 [24:10<29:16,  1.86it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0238, train/loss_vlb_step=9.43e-5, train/loss_step=0.0238, global_step=4854.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  45%|████▌     | 2701/5971 [24:11<29:16,  1.86it/s, loss=0.191, v_num=0, train/loss_simple_step=0.445, train/loss_vlb_step=0.00426, train/loss_step=0.445, global_step=4855.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  45%|████▌     | 2702/5971 [24:11<29:15,  1.86it/s, loss=0.198, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000664, train/loss_step=0.202, global_step=4855.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  45%|████▌     | 2703/5971 [24:12<29:15,  1.86it/s, loss=0.198, v_num=0, train/loss_simple_step=0.202, train/loss_vlb_step=0.000664, train/loss_step=0.202, global_step=4855.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  45%|████▌     | 2703/5971 [24:12<29:15,  1.86it/s, loss=0.17, v_num=0, train/loss_simple_step=0.00442, train/loss_vlb_step=2.37e-5, train/loss_step=0.00442, global_step=4855.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  45%|████▌     | 2704/5971 [24:15<29:17,  1.86it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0259, train/loss_vlb_step=9.97e-5, train/loss_step=0.0259, global_step=4855.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  45%|████▌     | 2705/5971 [24:16<29:17,  1.86it/s, loss=0.171, v_num=0, train/loss_simple_step=0.003, train/loss_vlb_step=1.59e-5, train/loss_step=0.003, global_step=4856.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  45%|████▌     | 2706/5971 [24:17<29:17,  1.86it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0927, train/loss_vlb_step=0.000307, train/loss_step=0.0927, global_step=4856.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  45%|████▌     | 2707/5971 [24:17<29:17,  1.86it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0927, train/loss_vlb_step=0.000307, train/loss_step=0.0927, global_step=4856.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  45%|████▌     | 2707/5971 [24:17<29:17,  1.86it/s, loss=0.179, v_num=0, train/loss_simple_step=0.464, train/loss_vlb_step=0.00325, train/loss_step=0.464, global_step=4856.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  45%|████▌     | 2708/5971 [24:20<29:18,  1.86it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00421, train/loss_vlb_step=2.15e-5, train/loss_step=0.00421, global_step=4856.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  45%|████▌     | 2709/5971 [24:21<29:18,  1.85it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00583, train/loss_vlb_step=2.96e-5, train/loss_step=0.00583, global_step=4857.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  45%|████▌     | 2710/5971 [24:21<29:18,  1.85it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00319, train/loss_vlb_step=1.73e-5, train/loss_step=0.00319, global_step=4857.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  45%|████▌     | 2711/5971 [24:22<29:18,  1.85it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00319, train/loss_vlb_step=1.73e-5, train/loss_step=0.00319, global_step=4857.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  45%|████▌     | 2711/5971 [24:22<29:18,  1.85it/s, loss=0.143, v_num=0, train/loss_simple_step=0.0184, train/loss_vlb_step=7.97e-5, train/loss_step=0.0184, global_step=4857.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  45%|████▌     | 2712/5971 [24:25<29:20,  1.85it/s, loss=0.132, v_num=0, train/loss_simple_step=0.0824, train/loss_vlb_step=0.000271, train/loss_step=0.0824, global_step=4857.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  45%|████▌     | 2713/5971 [24:26<29:20,  1.85it/s, loss=0.14, v_num=0, train/loss_simple_step=0.268, train/loss_vlb_step=0.00116, train/loss_step=0.268, global_step=4858.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  45%|████▌     | 2714/5971 [24:27<29:19,  1.85it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00549, train/loss_vlb_step=2.67e-5, train/loss_step=0.00549, global_step=4858.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  45%|████▌     | 2715/5971 [24:27<29:19,  1.85it/s, loss=0.129, v_num=0, train/loss_simple_step=0.00549, train/loss_vlb_step=2.67e-5, train/loss_step=0.00549, global_step=4858.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  45%|████▌     | 2715/5971 [24:27<29:19,  1.85it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0704, train/loss_vlb_step=0.000235, train/loss_step=0.0704, global_step=4858.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  45%|████▌     | 2716/5971 [24:30<29:21,  1.85it/s, loss=0.139, v_num=0, train/loss_simple_step=0.260, train/loss_vlb_step=0.00109, train/loss_step=0.260, global_step=4858.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  46%|████▌     | 2717/5971 [24:30<29:20,  1.85it/s, loss=0.122, v_num=0, train/loss_simple_step=0.00367, train/loss_vlb_step=1.89e-5, train/loss_step=0.00367, global_step=4859.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2718/5971 [24:31<29:20,  1.85it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0856, train/loss_vlb_step=0.000281, train/loss_step=0.0856, global_step=4859.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  46%|████▌     | 2719/5971 [24:32<29:20,  1.85it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0856, train/loss_vlb_step=0.000281, train/loss_step=0.0856, global_step=4859.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2719/5971 [24:32<29:20,  1.85it/s, loss=0.116, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.000956, train/loss_step=0.253, global_step=4859.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  46%|████▌     | 2720/5971 [24:34<29:22,  1.85it/s, loss=0.139, v_num=0, train/loss_simple_step=0.483, train/loss_vlb_step=0.00429, train/loss_step=0.483, global_step=4859.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  46%|████▌     | 2721/5971 [24:35<29:21,  1.84it/s, loss=0.146, v_num=0, train/loss_simple_step=0.575, train/loss_vlb_step=0.00706, train/loss_step=0.575, global_step=4860.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2722/5971 [24:36<29:21,  1.84it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00237, train/loss_vlb_step=1.3e-5, train/loss_step=0.00237, global_step=4860.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2723/5971 [24:37<29:21,  1.84it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00237, train/loss_vlb_step=1.3e-5, train/loss_step=0.00237, global_step=4860.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2723/5971 [24:37<29:21,  1.84it/s, loss=0.137, v_num=0, train/loss_simple_step=0.0424, train/loss_vlb_step=0.000153, train/loss_step=0.0424, global_step=4860.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2724/5971 [24:39<29:23,  1.84it/s, loss=0.144, v_num=0, train/loss_simple_step=0.150, train/loss_vlb_step=0.0005, train/loss_step=0.150, global_step=4860.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  46%|████▌     | 2725/5971 [24:40<29:23,  1.84it/s, loss=0.182, v_num=0, train/loss_simple_step=0.775, train/loss_vlb_step=0.0366, train/loss_step=0.775, global_step=4861.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2726/5971 [24:41<29:23,  1.84it/s, loss=0.2, v_num=0, train/loss_simple_step=0.445, train/loss_vlb_step=0.0022, train/loss_step=0.445, global_step=4861.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  46%|████▌     | 2727/5971 [24:42<29:23,  1.84it/s, loss=0.2, v_num=0, train/loss_simple_step=0.445, train/loss_vlb_step=0.0022, train/loss_step=0.445, global_step=4861.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2727/5971 [24:42<29:23,  1.84it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0375, train/loss_vlb_step=0.000135, train/loss_step=0.0375, global_step=4861.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2728/5971 [24:44<29:24,  1.84it/s, loss=0.186, v_num=0, train/loss_simple_step=0.162, train/loss_vlb_step=0.000541, train/loss_step=0.162, global_step=4861.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  46%|████▌     | 2729/5971 [24:45<29:24,  1.84it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0359, train/loss_vlb_step=0.000131, train/loss_step=0.0359, global_step=4862.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2730/5971 [24:46<29:24,  1.84it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0168, train/loss_vlb_step=6.87e-5, train/loss_step=0.0168, global_step=4862.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  46%|████▌     | 2731/5971 [24:47<29:23,  1.84it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0168, train/loss_vlb_step=6.87e-5, train/loss_step=0.0168, global_step=4862.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2731/5971 [24:47<29:23,  1.84it/s, loss=0.204, v_num=0, train/loss_simple_step=0.328, train/loss_vlb_step=0.00214, train/loss_step=0.328, global_step=4862.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  46%|████▌     | 2732/5971 [24:49<29:25,  1.83it/s, loss=0.221, v_num=0, train/loss_simple_step=0.414, train/loss_vlb_step=0.00272, train/loss_step=0.414, global_step=4862.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2733/5971 [24:50<29:25,  1.83it/s, loss=0.207, v_num=0, train/loss_simple_step=0.00139, train/loss_vlb_step=8.36e-6, train/loss_step=0.00139, global_step=4863.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2734/5971 [24:51<29:25,  1.83it/s, loss=0.207, v_num=0, train/loss_simple_step=0.00523, train/loss_vlb_step=2.6e-5, train/loss_step=0.00523, global_step=4863.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  46%|████▌     | 2735/5971 [24:52<29:24,  1.83it/s, loss=0.207, v_num=0, train/loss_simple_step=0.00523, train/loss_vlb_step=2.6e-5, train/loss_step=0.00523, global_step=4863.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2735/5971 [24:52<29:24,  1.83it/s, loss=0.204, v_num=0, train/loss_simple_step=0.00583, train/loss_vlb_step=2.9e-5, train/loss_step=0.00583, global_step=4863.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2736/5971 [24:54<29:26,  1.83it/s, loss=0.199, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000549, train/loss_step=0.160, global_step=4863.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  46%|████▌     | 2737/5971 [24:55<29:25,  1.83it/s, loss=0.204, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.000342, train/loss_step=0.104, global_step=4864.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2738/5971 [24:56<29:25,  1.83it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0039, train/loss_vlb_step=2.08e-5, train/loss_step=0.0039, global_step=4864.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  46%|████▌     | 2739/5971 [24:56<29:25,  1.83it/s, loss=0.2, v_num=0, train/loss_simple_step=0.0039, train/loss_vlb_step=2.08e-5, train/loss_step=0.0039, global_step=4864.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2739/5971 [24:56<29:25,  1.83it/s, loss=0.194, v_num=0, train/loss_simple_step=0.126, train/loss_vlb_step=0.000414, train/loss_step=0.126, global_step=4864.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2740/5971 [24:58<29:26,  1.83it/s, loss=0.19, v_num=0, train/loss_simple_step=0.415, train/loss_vlb_step=0.00199, train/loss_step=0.415, global_step=4864.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  46%|████▌     | 2741/5971 [24:59<29:26,  1.83it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0109, train/loss_vlb_step=4.7e-5, train/loss_step=0.0109, global_step=4865.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2742/5971 [25:00<29:26,  1.83it/s, loss=0.183, v_num=0, train/loss_simple_step=0.415, train/loss_vlb_step=0.00197, train/loss_step=0.415, global_step=4865.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  46%|████▌     | 2743/5971 [25:01<29:26,  1.83it/s, loss=0.183, v_num=0, train/loss_simple_step=0.415, train/loss_vlb_step=0.00197, train/loss_step=0.415, global_step=4865.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2743/5971 [25:01<29:26,  1.83it/s, loss=0.185, v_num=0, train/loss_simple_step=0.090, train/loss_vlb_step=0.000301, train/loss_step=0.090, global_step=4865.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2744/5971 [25:03<29:28,  1.83it/s, loss=0.178, v_num=0, train/loss_simple_step=0.00179, train/loss_vlb_step=1.05e-5, train/loss_step=0.00179, global_step=4865.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2745/5971 [25:04<29:27,  1.82it/s, loss=0.166, v_num=0, train/loss_simple_step=0.537, train/loss_vlb_step=0.00604, train/loss_step=0.537, global_step=4866.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  46%|████▌     | 2746/5971 [25:05<29:27,  1.82it/s, loss=0.156, v_num=0, train/loss_simple_step=0.259, train/loss_vlb_step=0.00105, train/loss_step=0.259, global_step=4866.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2747/5971 [25:06<29:27,  1.82it/s, loss=0.156, v_num=0, train/loss_simple_step=0.259, train/loss_vlb_step=0.00105, train/loss_step=0.259, global_step=4866.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2747/5971 [25:06<29:27,  1.82it/s, loss=0.165, v_num=0, train/loss_simple_step=0.214, train/loss_vlb_step=0.000812, train/loss_step=0.214, global_step=4866.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2748/5971 [25:08<29:28,  1.82it/s, loss=0.165, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000504, train/loss_step=0.153, global_step=4866.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2749/5971 [25:09<29:28,  1.82it/s, loss=0.18, v_num=0, train/loss_simple_step=0.343, train/loss_vlb_step=0.00166, train/loss_step=0.343, global_step=4867.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  46%|████▌     | 2750/5971 [25:10<29:28,  1.82it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0165, train/loss_vlb_step=6.21e-5, train/loss_step=0.0165, global_step=4867.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2751/5971 [25:11<29:28,  1.82it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0165, train/loss_vlb_step=6.21e-5, train/loss_step=0.0165, global_step=4867.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2751/5971 [25:11<29:28,  1.82it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.97e-5, train/loss_step=0.0037, global_step=4867.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2752/5971 [25:13<29:29,  1.82it/s, loss=0.145, v_num=0, train/loss_simple_step=0.029, train/loss_vlb_step=0.000116, train/loss_step=0.029, global_step=4867.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  46%|████▌     | 2753/5971 [25:14<29:29,  1.82it/s, loss=0.152, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000532, train/loss_step=0.154, global_step=4868.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2754/5971 [25:15<29:29,  1.82it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00462, train/loss_vlb_step=2.47e-5, train/loss_step=0.00462, global_step=4868.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2755/5971 [25:16<29:29,  1.82it/s, loss=0.152, v_num=0, train/loss_simple_step=0.00462, train/loss_vlb_step=2.47e-5, train/loss_step=0.00462, global_step=4868.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2755/5971 [25:16<29:29,  1.82it/s, loss=0.17, v_num=0, train/loss_simple_step=0.366, train/loss_vlb_step=0.00158, train/loss_step=0.366, global_step=4868.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]     
Epoch 8:  46%|████▌     | 2756/5971 [25:18<29:30,  1.82it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0106, train/loss_vlb_step=4.73e-5, train/loss_step=0.0106, global_step=4868.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2757/5971 [25:19<29:30,  1.82it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00229, train/loss_vlb_step=1.32e-5, train/loss_step=0.00229, global_step=4869.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2758/5971 [25:20<29:30,  1.81it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0505, train/loss_vlb_step=0.000169, train/loss_step=0.0505, global_step=4869.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  46%|████▌     | 2759/5971 [25:21<29:30,  1.81it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0505, train/loss_vlb_step=0.000169, train/loss_step=0.0505, global_step=4869.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2759/5971 [25:21<29:30,  1.81it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0317, train/loss_vlb_step=0.000122, train/loss_step=0.0317, global_step=4869.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▌     | 2760/5971 [25:23<29:31,  1.81it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0179, train/loss_vlb_step=7.27e-5, train/loss_step=0.0179, global_step=4869.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  46%|████▌     | 2761/5971 [25:24<29:31,  1.81it/s, loss=0.141, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000392, train/loss_step=0.119, global_step=4870.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  46%|████▋     | 2762/5971 [25:25<29:31,  1.81it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00435, train/loss_vlb_step=2.22e-5, train/loss_step=0.00435, global_step=4870.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▋     | 2763/5971 [25:25<29:31,  1.81it/s, loss=0.12, v_num=0, train/loss_simple_step=0.00435, train/loss_vlb_step=2.22e-5, train/loss_step=0.00435, global_step=4870.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▋     | 2763/5971 [25:25<29:31,  1.81it/s, loss=0.136, v_num=0, train/loss_simple_step=0.393, train/loss_vlb_step=0.0029, train/loss_step=0.393, global_step=4870.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  46%|████▋     | 2764/5971 [25:28<29:32,  1.81it/s, loss=0.142, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000421, train/loss_step=0.128, global_step=4870.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▋     | 2765/5971 [25:28<29:32,  1.81it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0944, train/loss_vlb_step=0.00031, train/loss_step=0.0944, global_step=4871.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▋     | 2766/5971 [25:29<29:31,  1.81it/s, loss=0.12, v_num=0, train/loss_simple_step=0.258, train/loss_vlb_step=0.00122, train/loss_step=0.258, global_step=4871.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  46%|████▋     | 2767/5971 [25:30<29:31,  1.81it/s, loss=0.12, v_num=0, train/loss_simple_step=0.258, train/loss_vlb_step=0.00122, train/loss_step=0.258, global_step=4871.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▋     | 2767/5971 [25:30<29:31,  1.81it/s, loss=0.123, v_num=0, train/loss_simple_step=0.285, train/loss_vlb_step=0.00106, train/loss_step=0.285, global_step=4871.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▋     | 2768/5971 [25:32<29:33,  1.81it/s, loss=0.125, v_num=0, train/loss_simple_step=0.199, train/loss_vlb_step=0.00068, train/loss_step=0.199, global_step=4871.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▋     | 2769/5971 [25:33<29:32,  1.81it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00862, train/loss_vlb_step=3.96e-5, train/loss_step=0.00862, global_step=4872.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▋     | 2770/5971 [25:34<29:32,  1.81it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0813, train/loss_vlb_step=0.00028, train/loss_step=0.0813, global_step=4872.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  46%|████▋     | 2771/5971 [25:35<29:32,  1.81it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0813, train/loss_vlb_step=0.00028, train/loss_step=0.0813, global_step=4872.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▋     | 2771/5971 [25:35<29:32,  1.81it/s, loss=0.121, v_num=0, train/loss_simple_step=0.181, train/loss_vlb_step=0.00063, train/loss_step=0.181, global_step=4872.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  46%|████▋     | 2772/5971 [25:38<29:34,  1.80it/s, loss=0.147, v_num=0, train/loss_simple_step=0.549, train/loss_vlb_step=0.00687, train/loss_step=0.549, global_step=4872.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▋     | 2773/5971 [25:38<29:34,  1.80it/s, loss=0.151, v_num=0, train/loss_simple_step=0.232, train/loss_vlb_step=0.000827, train/loss_step=0.232, global_step=4873.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▋     | 2774/5971 [25:39<29:34,  1.80it/s, loss=0.177, v_num=0, train/loss_simple_step=0.524, train/loss_vlb_step=0.00631, train/loss_step=0.524, global_step=4873.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  46%|████▋     | 2775/5971 [25:40<29:33,  1.80it/s, loss=0.177, v_num=0, train/loss_simple_step=0.524, train/loss_vlb_step=0.00631, train/loss_step=0.524, global_step=4873.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▋     | 2775/5971 [25:40<29:33,  1.80it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00326, train/loss_vlb_step=1.76e-5, train/loss_step=0.00326, global_step=4873.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  46%|████▋     | 2776/5971 [25:42<29:35,  1.80it/s, loss=0.163, v_num=0, train/loss_simple_step=0.108, train/loss_vlb_step=0.000357, train/loss_step=0.108, global_step=4873.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  47%|████▋     | 2777/5971 [25:43<29:35,  1.80it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0384, train/loss_vlb_step=0.000136, train/loss_step=0.0384, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  47%|████▋     | 2778/5971 [25:44<29:34,  1.80it/s, loss=0.171, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000598, train/loss_step=0.170, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  47%|████▋     | 2779/5971 [25:45<29:34,  1.80it/s, loss=0.171, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000598, train/loss_step=0.170, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  47%|████▋     | 2779/5971 [25:45<29:34,  1.80it/s, loss=0.175, v_num=0, train/loss_simple_step=0.104, train/loss_vlb_step=0.00034, train/loss_step=0.104, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  47%|████▋     | 2780/5971 [25:47<29:36,  1.80it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating: 0it [00:00, ?it/s]
Validating:   0%|          | 0/167 [00:00<?, ?it/s]
Validating:   1%|          | 1/167 [00:00<01:07,  2.48it/s]
Validating:   1%|          | 2/167 [00:00<00:52,  3.15it/s]
Epoch 8:  47%|████▋     | 2783/5971 [25:48<29:33,  1.80it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   3%|▎         | 5/167 [00:00<00:19,  8.35it/s]
Epoch 8:  47%|████▋     | 2787/5971 [25:48<29:28,  1.80it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   5%|▍         | 8/167 [00:00<00:12, 12.71it/s]
Epoch 8:  47%|████▋     | 2791/5971 [25:48<29:24,  1.80it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   7%|▋         | 11/167 [00:01<00:09, 15.83it/s]
Validating:   8%|▊         | 14/167 [00:01<00:08, 18.51it/s]
Epoch 8:  47%|████▋     | 2795/5971 [25:49<29:19,  1.80it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  10%|█         | 17/167 [00:01<00:07, 21.00it/s]
Epoch 8:  47%|████▋     | 2799/5971 [25:49<29:15,  1.81it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.83it/s]
Epoch 8:  47%|████▋     | 2803/5971 [25:49<29:10,  1.81it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  14%|█▍        | 23/167 [00:01<00:05, 24.44it/s]
Validating:  16%|█▌        | 26/167 [00:01<00:05, 25.02it/s]
Epoch 8:  47%|████▋     | 2807/5971 [25:49<29:05,  1.81it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 26.34it/s]
Epoch 8:  47%|████▋     | 2811/5971 [25:49<29:01,  1.81it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 26.59it/s]
Epoch 8:  47%|████▋     | 2815/5971 [25:49<28:56,  1.82it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  21%|██        | 35/167 [00:01<00:04, 26.70it/s]
Validating:  23%|██▎       | 38/167 [00:02<00:04, 26.93it/s]
Epoch 8:  47%|████▋     | 2819/5971 [25:49<28:52,  1.82it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 27.01it/s]
Epoch 8:  47%|████▋     | 2823/5971 [25:50<28:47,  1.82it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 26.47it/s]
Epoch 8:  47%|████▋     | 2827/5971 [25:50<28:43,  1.82it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 27.63it/s]
Epoch 8:  47%|████▋     | 2831/5971 [25:50<28:38,  1.83it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  31%|███       | 51/167 [00:02<00:04, 25.23it/s]
Validating:  32%|███▏      | 54/167 [00:02<00:04, 24.61it/s]
Epoch 8:  47%|████▋     | 2835/5971 [25:50<28:34,  1.83it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  34%|███▍      | 57/167 [00:02<00:04, 25.46it/s]
Epoch 8:  48%|████▊     | 2839/5971 [25:50<28:30,  1.83it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  36%|███▌      | 60/167 [00:02<00:04, 25.52it/s]
Epoch 8:  48%|████▊     | 2843/5971 [25:50<28:25,  1.83it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  38%|███▊      | 63/167 [00:02<00:03, 26.52it/s]
Validating:  40%|███▉      | 66/167 [00:03<00:03, 25.33it/s]
Epoch 8:  48%|████▊     | 2847/5971 [25:51<28:21,  1.84it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  41%|████▏     | 69/167 [00:03<00:04, 23.92it/s]
Epoch 8:  48%|████▊     | 2851/5971 [25:51<28:16,  1.84it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  43%|████▎     | 72/167 [00:03<00:04, 23.73it/s]
Epoch 8:  48%|████▊     | 2855/5971 [25:51<28:12,  1.84it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 24.23it/s]
Validating:  47%|████▋     | 78/167 [00:03<00:03, 24.15it/s]
Epoch 8:  48%|████▊     | 2859/5971 [25:51<28:08,  1.84it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  49%|████▊     | 81/167 [00:03<00:03, 25.30it/s]
Epoch 8:  48%|████▊     | 2863/5971 [25:51<28:03,  1.85it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  50%|█████     | 84/167 [00:03<00:03, 24.15it/s]
Epoch 8:  48%|████▊     | 2867/5971 [25:51<27:59,  1.85it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  52%|█████▏    | 87/167 [00:03<00:03, 24.76it/s]
Validating:  54%|█████▍    | 90/167 [00:04<00:03, 24.80it/s]
Epoch 8:  48%|████▊     | 2871/5971 [25:52<27:55,  1.85it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 25.47it/s]
Epoch 8:  48%|████▊     | 2875/5971 [25:52<27:50,  1.85it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 24.81it/s]
Epoch 8:  48%|████▊     | 2879/5971 [25:52<27:46,  1.86it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 25.71it/s]
Validating:  61%|██████    | 102/167 [00:04<00:02, 25.56it/s]
Epoch 8:  48%|████▊     | 2883/5971 [25:52<27:42,  1.86it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 24.98it/s]
Epoch 8:  48%|████▊     | 2887/5971 [25:52<27:37,  1.86it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 26.18it/s]
Epoch 8:  48%|████▊     | 2891/5971 [25:52<27:33,  1.86it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  66%|██████▋   | 111/167 [00:04<00:02, 26.85it/s]
Validating:  68%|██████▊   | 114/167 [00:05<00:02, 26.32it/s]
Epoch 8:  48%|████▊     | 2895/5971 [25:52<27:29,  1.86it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  70%|███████   | 117/167 [00:05<00:01, 26.61it/s]
Epoch 8:  49%|████▊     | 2899/5971 [25:53<27:25,  1.87it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 27.19it/s]
Epoch 8:  49%|████▊     | 2903/5971 [25:53<27:20,  1.87it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 27.01it/s]
Validating:  75%|███████▌  | 126/167 [00:05<00:01, 26.77it/s]
Epoch 8:  49%|████▊     | 2907/5971 [25:53<27:16,  1.87it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 27.04it/s]
Epoch 8:  49%|████▉     | 2911/5971 [25:53<27:12,  1.87it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 26.81it/s]
Epoch 8:  49%|████▉     | 2915/5971 [25:53<27:08,  1.88it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  81%|████████  | 135/167 [00:05<00:01, 26.00it/s]
Epoch 8:  49%|████▉     | 2919/5971 [25:53<27:04,  1.88it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 27.02it/s]
Validating:  85%|████████▌ | 142/167 [00:06<00:00, 26.14it/s]
Epoch 8:  49%|████▉     | 2923/5971 [25:53<26:59,  1.88it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 27.12it/s]
Epoch 8:  49%|████▉     | 2927/5971 [25:54<26:55,  1.88it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 27.38it/s]
Epoch 8:  49%|████▉     | 2931/5971 [25:54<26:51,  1.89it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 27.99it/s]
Validating:  92%|█████████▏| 154/167 [00:06<00:00, 26.92it/s]
Epoch 8:  49%|████▉     | 2935/5971 [25:54<26:47,  1.89it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 26.37it/s]
Epoch 8:  49%|████▉     | 2939/5971 [25:54<26:43,  1.89it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  96%|█████████▌| 160/167 [00:06<00:00, 26.31it/s]
Epoch 8:  49%|████▉     | 2943/5971 [25:54<26:39,  1.89it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  98%|█████████▊| 163/167 [00:06<00:00, 25.57it/s]
Validating:  99%|█████████▉| 166/167 [00:06<00:00, 25.58it/s]
Epoch 8:  49%|████▉     | 2947/5971 [25:54<26:34,  1.90it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  49%|████▉     | 2948/5971 [25:55<26:34,  1.90it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000284, train/loss_step=0.0844, global_step=4874.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Epoch 8:  49%|████▉     | 2949/5971 [25:56<26:34,  1.90it/s, loss=0.183, v_num=0, train/loss_simple_step=0.215, train/loss_vlb_step=0.000796, train/loss_step=0.215, global_step=4875.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  49%|████▉     | 2950/5971 [25:57<26:34,  1.90it/s, loss=0.193, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.000709, train/loss_step=0.197, global_step=4875.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  49%|████▉     | 2951/5971 [25:58<26:33,  1.89it/s, loss=0.193, v_num=0, train/loss_simple_step=0.197, train/loss_vlb_step=0.000709, train/loss_step=0.197, global_step=4875.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  49%|████▉     | 2951/5971 [25:58<26:33,  1.89it/s, loss=0.179, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000456, train/loss_step=0.130, global_step=4875.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  49%|████▉     | 2952/5971 [26:00<26:35,  1.89it/s, loss=0.175, v_num=0, train/loss_simple_step=0.0448, train/loss_vlb_step=0.000159, train/loss_step=0.0448, global_step=4875.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  49%|████▉     | 2953/5971 [26:01<26:35,  1.89it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0561, train/loss_vlb_step=0.000191, train/loss_step=0.0561, global_step=4876.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  49%|████▉     | 2954/5971 [26:02<26:34,  1.89it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0148, train/loss_vlb_step=6.21e-5, train/loss_step=0.0148, global_step=4876.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  49%|████▉     | 2955/5971 [26:03<26:34,  1.89it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0148, train/loss_vlb_step=6.21e-5, train/loss_step=0.0148, global_step=4876.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  49%|████▉     | 2955/5971 [26:03<26:34,  1.89it/s, loss=0.159, v_num=0, train/loss_simple_step=0.232, train/loss_vlb_step=0.000787, train/loss_step=0.232, global_step=4876.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  50%|████▉     | 2956/5971 [26:05<26:35,  1.89it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0796, train/loss_vlb_step=0.000263, train/loss_step=0.0796, global_step=4876.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|████▉     | 2957/5971 [26:06<26:35,  1.89it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0673, train/loss_vlb_step=0.000227, train/loss_step=0.0673, global_step=4877.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|████▉     | 2958/5971 [26:06<26:35,  1.89it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0304, train/loss_vlb_step=0.000108, train/loss_step=0.0304, global_step=4877.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|████▉     | 2959/5971 [26:07<26:35,  1.89it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0304, train/loss_vlb_step=0.000108, train/loss_step=0.0304, global_step=4877.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|████▉     | 2959/5971 [26:07<26:35,  1.89it/s, loss=0.158, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.00128, train/loss_step=0.288, global_step=4877.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  50%|████▉     | 2960/5971 [26:10<26:36,  1.89it/s, loss=0.137, v_num=0, train/loss_simple_step=0.114, train/loss_vlb_step=0.000383, train/loss_step=0.114, global_step=4877.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|████▉     | 2961/5971 [26:10<26:36,  1.89it/s, loss=0.145, v_num=0, train/loss_simple_step=0.397, train/loss_vlb_step=0.00227, train/loss_step=0.397, global_step=4878.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  50%|████▉     | 2962/5971 [26:11<26:36,  1.89it/s, loss=0.127, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000598, train/loss_step=0.174, global_step=4878.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|████▉     | 2963/5971 [26:12<26:36,  1.88it/s, loss=0.127, v_num=0, train/loss_simple_step=0.174, train/loss_vlb_step=0.000598, train/loss_step=0.174, global_step=4878.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|████▉     | 2963/5971 [26:12<26:36,  1.88it/s, loss=0.128, v_num=0, train/loss_simple_step=0.0129, train/loss_vlb_step=5.48e-5, train/loss_step=0.0129, global_step=4878.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|████▉     | 2964/5971 [26:14<26:37,  1.88it/s, loss=0.124, v_num=0, train/loss_simple_step=0.0234, train/loss_vlb_step=9.6e-5, train/loss_step=0.0234, global_step=4878.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  50%|████▉     | 2965/5971 [26:15<26:36,  1.88it/s, loss=0.128, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000382, train/loss_step=0.116, global_step=4879.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|████▉     | 2966/5971 [26:16<26:36,  1.88it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000279, train/loss_step=0.0844, global_step=4879.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|████▉     | 2967/5971 [26:17<26:36,  1.88it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0844, train/loss_vlb_step=0.000279, train/loss_step=0.0844, global_step=4879.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|████▉     | 2967/5971 [26:17<26:36,  1.88it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00322, train/loss_vlb_step=1.77e-5, train/loss_step=0.00322, global_step=4879.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|████▉     | 2968/5971 [26:19<26:37,  1.88it/s, loss=0.118, v_num=0, train/loss_simple_step=0.0783, train/loss_vlb_step=0.000267, train/loss_step=0.0783, global_step=4879.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  50%|████▉     | 2969/5971 [26:20<26:37,  1.88it/s, loss=0.108, v_num=0, train/loss_simple_step=0.0263, train/loss_vlb_step=0.000103, train/loss_step=0.0263, global_step=4880.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|████▉     | 2970/5971 [26:21<26:37,  1.88it/s, loss=0.0987, v_num=0, train/loss_simple_step=0.00196, train/loss_vlb_step=1.15e-5, train/loss_step=0.00196, global_step=4880.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|████▉     | 2971/5971 [26:22<26:37,  1.88it/s, loss=0.0987, v_num=0, train/loss_simple_step=0.00196, train/loss_vlb_step=1.15e-5, train/loss_step=0.00196, global_step=4880.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|████▉     | 2971/5971 [26:22<26:37,  1.88it/s, loss=0.108, v_num=0, train/loss_simple_step=0.320, train/loss_vlb_step=0.0021, train/loss_step=0.320, global_step=4880.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]      
Epoch 8:  50%|████▉     | 2972/5971 [26:24<26:38,  1.88it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0072, train/loss_vlb_step=3.54e-5, train/loss_step=0.0072, global_step=4880.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|████▉     | 2973/5971 [26:25<26:38,  1.88it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0217, train/loss_vlb_step=8.87e-5, train/loss_step=0.0217, global_step=4881.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|████▉     | 2974/5971 [26:26<26:38,  1.88it/s, loss=0.109, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=4881.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  50%|████▉     | 2975/5971 [26:27<26:37,  1.87it/s, loss=0.109, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000332, train/loss_step=0.101, global_step=4881.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|████▉     | 2975/5971 [26:27<26:37,  1.87it/s, loss=0.109, v_num=0, train/loss_simple_step=0.240, train/loss_vlb_step=0.00104, train/loss_step=0.240, global_step=4881.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  50%|████▉     | 2976/5971 [26:29<26:38,  1.87it/s, loss=0.106, v_num=0, train/loss_simple_step=0.0205, train/loss_vlb_step=8.4e-5, train/loss_step=0.0205, global_step=4881.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|████▉     | 2977/5971 [26:30<26:38,  1.87it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0393, train/loss_vlb_step=0.000142, train/loss_step=0.0393, global_step=4882.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|████▉     | 2978/5971 [26:31<26:38,  1.87it/s, loss=0.118, v_num=0, train/loss_simple_step=0.290, train/loss_vlb_step=0.00119, train/loss_step=0.290, global_step=4882.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  50%|████▉     | 2979/5971 [26:32<26:38,  1.87it/s, loss=0.118, v_num=0, train/loss_simple_step=0.290, train/loss_vlb_step=0.00119, train/loss_step=0.290, global_step=4882.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|████▉     | 2979/5971 [26:32<26:38,  1.87it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0331, train/loss_vlb_step=0.000122, train/loss_step=0.0331, global_step=4882.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|████▉     | 2980/5971 [26:34<26:39,  1.87it/s, loss=0.1, v_num=0, train/loss_simple_step=0.0131, train/loss_vlb_step=5.19e-5, train/loss_step=0.0131, global_step=4882.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  50%|████▉     | 2981/5971 [26:35<26:39,  1.87it/s, loss=0.11, v_num=0, train/loss_simple_step=0.599, train/loss_vlb_step=0.00581, train/loss_step=0.599, global_step=4883.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  50%|████▉     | 2982/5971 [26:35<26:39,  1.87it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=5.93e-5, train/loss_step=0.0138, global_step=4883.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|████▉     | 2983/5971 [26:36<26:39,  1.87it/s, loss=0.102, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=5.93e-5, train/loss_step=0.0138, global_step=4883.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|████▉     | 2983/5971 [26:36<26:39,  1.87it/s, loss=0.102, v_num=0, train/loss_simple_step=0.00309, train/loss_vlb_step=1.74e-5, train/loss_step=0.00309, global_step=4883.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|████▉     | 2984/5971 [26:38<26:40,  1.87it/s, loss=0.109, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000551, train/loss_step=0.166, global_step=4883.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  50%|████▉     | 2985/5971 [26:39<26:39,  1.87it/s, loss=0.121, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00149, train/loss_step=0.353, global_step=4884.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  50%|█████     | 2986/5971 [26:40<26:39,  1.87it/s, loss=0.117, v_num=0, train/loss_simple_step=0.013, train/loss_vlb_step=5.89e-5, train/loss_step=0.013, global_step=4884.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|█████     | 2987/5971 [26:41<26:39,  1.87it/s, loss=0.117, v_num=0, train/loss_simple_step=0.013, train/loss_vlb_step=5.89e-5, train/loss_step=0.013, global_step=4884.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|█████     | 2987/5971 [26:41<26:39,  1.87it/s, loss=0.142, v_num=0, train/loss_simple_step=0.504, train/loss_vlb_step=0.00464, train/loss_step=0.504, global_step=4884.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|█████     | 2988/5971 [26:44<26:40,  1.86it/s, loss=0.176, v_num=0, train/loss_simple_step=0.754, train/loss_vlb_step=0.0192, train/loss_step=0.754, global_step=4884.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  50%|█████     | 2989/5971 [26:44<26:40,  1.86it/s, loss=0.175, v_num=0, train/loss_simple_step=0.00931, train/loss_vlb_step=4.28e-5, train/loss_step=0.00931, global_step=4885.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|█████     | 2990/5971 [26:45<26:40,  1.86it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0251, train/loss_vlb_step=0.000101, train/loss_step=0.0251, global_step=4885.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  50%|█████     | 2991/5971 [26:46<26:40,  1.86it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0251, train/loss_vlb_step=0.000101, train/loss_step=0.0251, global_step=4885.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|█████     | 2991/5971 [26:46<26:40,  1.86it/s, loss=0.162, v_num=0, train/loss_simple_step=0.0256, train/loss_vlb_step=9.87e-5, train/loss_step=0.0256, global_step=4885.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  50%|█████     | 2992/5971 [26:48<26:41,  1.86it/s, loss=0.167, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000399, train/loss_step=0.119, global_step=4885.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  50%|█████     | 2993/5971 [26:49<26:41,  1.86it/s, loss=0.174, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000551, train/loss_step=0.157, global_step=4886.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|█████     | 2994/5971 [26:50<26:40,  1.86it/s, loss=0.17, v_num=0, train/loss_simple_step=0.014, train/loss_vlb_step=5.85e-5, train/loss_step=0.014, global_step=4886.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  50%|█████     | 2995/5971 [26:51<26:40,  1.86it/s, loss=0.17, v_num=0, train/loss_simple_step=0.014, train/loss_vlb_step=5.85e-5, train/loss_step=0.014, global_step=4886.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|█████     | 2995/5971 [26:51<26:40,  1.86it/s, loss=0.174, v_num=0, train/loss_simple_step=0.328, train/loss_vlb_step=0.00154, train/loss_step=0.328, global_step=4886.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|█████     | 2996/5971 [26:53<26:41,  1.86it/s, loss=0.173, v_num=0, train/loss_simple_step=0.00231, train/loss_vlb_step=1.31e-5, train/loss_step=0.00231, global_step=4886.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|█████     | 2997/5971 [26:54<26:41,  1.86it/s, loss=0.202, v_num=0, train/loss_simple_step=0.617, train/loss_vlb_step=0.0231, train/loss_step=0.617, global_step=4887.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]     
Epoch 8:  50%|█████     | 2998/5971 [26:55<26:41,  1.86it/s, loss=0.193, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000394, train/loss_step=0.118, global_step=4887.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|█████     | 2999/5971 [26:56<26:41,  1.86it/s, loss=0.193, v_num=0, train/loss_simple_step=0.118, train/loss_vlb_step=0.000394, train/loss_step=0.118, global_step=4887.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|█████     | 2999/5971 [26:56<26:41,  1.86it/s, loss=0.206, v_num=0, train/loss_simple_step=0.281, train/loss_vlb_step=0.0013, train/loss_step=0.281, global_step=4887.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  50%|█████     | 3000/5971 [26:58<26:42,  1.85it/s, loss=0.215, v_num=0, train/loss_simple_step=0.195, train/loss_vlb_step=0.000693, train/loss_step=0.195, global_step=4887.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|█████     | 3001/5971 [26:59<26:42,  1.85it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0479, train/loss_vlb_step=0.000169, train/loss_step=0.0479, global_step=4888.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|█████     | 3002/5971 [27:00<26:41,  1.85it/s, loss=0.189, v_num=0, train/loss_simple_step=0.052, train/loss_vlb_step=0.000179, train/loss_step=0.052, global_step=4888.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  50%|█████     | 3003/5971 [27:01<26:41,  1.85it/s, loss=0.189, v_num=0, train/loss_simple_step=0.052, train/loss_vlb_step=0.000179, train/loss_step=0.052, global_step=4888.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|█████     | 3003/5971 [27:01<26:41,  1.85it/s, loss=0.191, v_num=0, train/loss_simple_step=0.0339, train/loss_vlb_step=0.000128, train/loss_step=0.0339, global_step=4888.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|█████     | 3004/5971 [27:03<26:42,  1.85it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0867, train/loss_vlb_step=0.000287, train/loss_step=0.0867, global_step=4888.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|█████     | 3005/5971 [27:04<26:42,  1.85it/s, loss=0.195, v_num=0, train/loss_simple_step=0.511, train/loss_vlb_step=0.00509, train/loss_step=0.511, global_step=4889.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  50%|█████     | 3006/5971 [27:05<26:42,  1.85it/s, loss=0.204, v_num=0, train/loss_simple_step=0.191, train/loss_vlb_step=0.000634, train/loss_step=0.191, global_step=4889.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|█████     | 3007/5971 [27:06<26:42,  1.85it/s, loss=0.204, v_num=0, train/loss_simple_step=0.191, train/loss_vlb_step=0.000634, train/loss_step=0.191, global_step=4889.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|█████     | 3007/5971 [27:06<26:42,  1.85it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=5.04e-5, train/loss_step=0.0118, global_step=4889.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|█████     | 3008/5971 [27:08<26:43,  1.85it/s, loss=0.147, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.000391, train/loss_step=0.119, global_step=4889.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  50%|█████     | 3009/5971 [27:09<26:43,  1.85it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00296, train/loss_vlb_step=1.57e-5, train/loss_step=0.00296, global_step=4890.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|█████     | 3010/5971 [27:10<26:43,  1.85it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00257, train/loss_vlb_step=1.5e-5, train/loss_step=0.00257, global_step=4890.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  50%|█████     | 3011/5971 [27:11<26:43,  1.85it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00257, train/loss_vlb_step=1.5e-5, train/loss_step=0.00257, global_step=4890.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|█████     | 3011/5971 [27:11<26:43,  1.85it/s, loss=0.151, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000442, train/loss_step=0.133, global_step=4890.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  50%|█████     | 3012/5971 [27:13<26:44,  1.84it/s, loss=0.15, v_num=0, train/loss_simple_step=0.088, train/loss_vlb_step=0.000304, train/loss_step=0.088, global_step=4890.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  50%|█████     | 3013/5971 [27:14<26:43,  1.84it/s, loss=0.154, v_num=0, train/loss_simple_step=0.246, train/loss_vlb_step=0.00105, train/loss_step=0.246, global_step=4891.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|█████     | 3014/5971 [27:15<26:43,  1.84it/s, loss=0.16, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000417, train/loss_step=0.123, global_step=4891.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|█████     | 3015/5971 [27:16<26:43,  1.84it/s, loss=0.16, v_num=0, train/loss_simple_step=0.123, train/loss_vlb_step=0.000417, train/loss_step=0.123, global_step=4891.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  50%|█████     | 3015/5971 [27:16<26:43,  1.84it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0288, train/loss_vlb_step=0.000113, train/loss_step=0.0288, global_step=4891.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  51%|█████     | 3016/5971 [27:18<26:44,  1.84it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00357, train/loss_vlb_step=1.9e-5, train/loss_step=0.00357, global_step=4891.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  51%|█████     | 3017/5971 [27:19<26:44,  1.84it/s, loss=0.12, v_num=0, train/loss_simple_step=0.122, train/loss_vlb_step=0.000403, train/loss_step=0.122, global_step=4892.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  51%|█████     | 3018/5971 [27:20<26:44,  1.84it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00203, train/loss_vlb_step=1.19e-5, train/loss_step=0.00203, global_step=4892.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  51%|█████     | 3019/5971 [27:21<26:44,  1.84it/s, loss=0.114, v_num=0, train/loss_simple_step=0.00203, train/loss_vlb_step=1.19e-5, train/loss_step=0.00203, global_step=4892.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  51%|█████     | 3019/5971 [27:21<26:44,  1.84it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0841, train/loss_vlb_step=0.000289, train/loss_step=0.0841, global_step=4892.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  51%|█████     | 3020/5971 [27:23<26:45,  1.84it/s, loss=0.117, v_num=0, train/loss_simple_step=0.457, train/loss_vlb_step=0.00298, train/loss_step=0.457, global_step=4892.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  51%|█████     | 3021/5971 [27:24<26:44,  1.84it/s, loss=0.132, v_num=0, train/loss_simple_step=0.335, train/loss_vlb_step=0.00157, train/loss_step=0.335, global_step=4893.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  51%|█████     | 3022/5971 [27:24<26:44,  1.84it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0859, train/loss_vlb_step=0.000285, train/loss_step=0.0859, global_step=4893.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  51%|█████     | 3023/5971 [27:25<26:44,  1.84it/s, loss=0.133, v_num=0, train/loss_simple_step=0.0859, train/loss_vlb_step=0.000285, train/loss_step=0.0859, global_step=4893.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  51%|█████     | 3023/5971 [27:25<26:44,  1.84it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0429, train/loss_vlb_step=0.000147, train/loss_step=0.0429, global_step=4893.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  51%|█████     | 3024/5971 [27:27<26:45,  1.84it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0114, train/loss_vlb_step=5.17e-5, train/loss_step=0.0114, global_step=4893.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  51%|█████     | 3025/5971 [27:28<26:45,  1.84it/s, loss=0.12, v_num=0, train/loss_simple_step=0.305, train/loss_vlb_step=0.00124, train/loss_step=0.305, global_step=4894.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  51%|█████     | 3026/5971 [27:29<26:45,  1.83it/s, loss=0.117, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000477, train/loss_step=0.143, global_step=4894.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  51%|█████     | 3027/5971 [27:30<26:44,  1.83it/s, loss=0.117, v_num=0, train/loss_simple_step=0.143, train/loss_vlb_step=0.000477, train/loss_step=0.143, global_step=4894.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  51%|█████     | 3027/5971 [27:30<26:44,  1.83it/s, loss=0.139, v_num=0, train/loss_simple_step=0.452, train/loss_vlb_step=0.00371, train/loss_step=0.452, global_step=4894.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  51%|█████     | 3028/5971 [27:32<26:45,  1.83it/s, loss=0.134, v_num=0, train/loss_simple_step=0.00192, train/loss_vlb_step=1.16e-5, train/loss_step=0.00192, global_step=4894.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  51%|█████     | 3029/5971 [27:33<26:45,  1.83it/s, loss=0.14, v_num=0, train/loss_simple_step=0.130, train/loss_vlb_step=0.000432, train/loss_step=0.130, global_step=4895.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  51%|█████     | 3030/5971 [27:34<26:45,  1.83it/s, loss=0.152, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.000834, train/loss_step=0.234, global_step=4895.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  51%|█████     | 3031/5971 [27:35<26:45,  1.83it/s, loss=0.152, v_num=0, train/loss_simple_step=0.234, train/loss_vlb_step=0.000834, train/loss_step=0.234, global_step=4895.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  51%|█████     | 3031/5971 [27:35<26:45,  1.83it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00222, train/loss_vlb_step=1.3e-5, train/loss_step=0.00222, global_step=4895.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  51%|█████     | 3032/5971 [27:37<26:46,  1.83it/s, loss=0.159, v_num=0, train/loss_simple_step=0.369, train/loss_vlb_step=0.00198, train/loss_step=0.369, global_step=4895.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  51%|█████     | 3033/5971 [27:38<26:45,  1.83it/s, loss=0.154, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000506, train/loss_step=0.144, global_step=4896.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  51%|█████     | 3034/5971 [27:39<26:45,  1.83it/s, loss=0.163, v_num=0, train/loss_simple_step=0.301, train/loss_vlb_step=0.00143, train/loss_step=0.301, global_step=4896.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  51%|█████     | 3035/5971 [27:40<26:45,  1.83it/s, loss=0.163, v_num=0, train/loss_simple_step=0.301, train/loss_vlb_step=0.00143, train/loss_step=0.301, global_step=4896.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  51%|█████     | 3035/5971 [27:40<26:45,  1.83it/s, loss=0.185, v_num=0, train/loss_simple_step=0.469, train/loss_vlb_step=0.00402, train/loss_step=0.469, global_step=4896.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  51%|█████     | 3036/5971 [27:42<26:46,  1.83it/s, loss=0.192, v_num=0, train/loss_simple_step=0.142, train/loss_vlb_step=0.000468, train/loss_step=0.142, global_step=4896.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  51%|█████     | 3037/5971 [27:43<26:46,  1.83it/s, loss=0.197, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000876, train/loss_step=0.233, global_step=4897.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  51%|█████     | 3038/5971 [27:44<26:46,  1.83it/s, loss=0.22, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.00235, train/loss_step=0.454, global_step=4897.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  51%|█████     | 3039/5971 [27:45<26:46,  1.83it/s, loss=0.22, v_num=0, train/loss_simple_step=0.454, train/loss_vlb_step=0.00235, train/loss_step=0.454, global_step=4897.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  51%|█████     | 3039/5971 [27:45<26:46,  1.83it/s, loss=0.218, v_num=0, train/loss_simple_step=0.0415, train/loss_vlb_step=0.000149, train/loss_step=0.0415, global_step=4897.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  51%|█████     | 3040/5971 [27:47<26:47,  1.82it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0619, train/loss_vlb_step=0.000216, train/loss_step=0.0619, global_step=4897.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  51%|█████     | 3041/5971 [27:48<26:46,  1.82it/s, loss=0.182, v_num=0, train/loss_simple_step=0.0209, train/loss_vlb_step=8.92e-5, train/loss_step=0.0209, global_step=4898.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  51%|█████     | 3042/5971 [27:49<26:46,  1.82it/s, loss=0.183, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000345, train/loss_step=0.103, global_step=4898.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  51%|█████     | 3043/5971 [27:50<26:46,  1.82it/s, loss=0.183, v_num=0, train/loss_simple_step=0.103, train/loss_vlb_step=0.000345, train/loss_step=0.103, global_step=4898.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  51%|█████     | 3043/5971 [27:50<26:46,  1.82it/s, loss=0.181, v_num=0, train/loss_simple_step=0.00161, train/loss_vlb_step=9.59e-6, train/loss_step=0.00161, global_step=4898.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  51%|█████     | 3044/5971 [27:52<26:47,  1.82it/s, loss=0.19, v_num=0, train/loss_simple_step=0.196, train/loss_vlb_step=0.000671, train/loss_step=0.196, global_step=4898.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  51%|█████     | 3045/5971 [27:53<26:47,  1.82it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0808, train/loss_vlb_step=0.000265, train/loss_step=0.0808, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  51%|█████     | 3046/5971 [27:53<26:46,  1.82it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.17e-5, train/loss_step=0.0139, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  51%|█████     | 3047/5971 [27:54<26:46,  1.82it/s, loss=0.173, v_num=0, train/loss_simple_step=0.0139, train/loss_vlb_step=6.17e-5, train/loss_step=0.0139, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  51%|█████     | 3047/5971 [27:54<26:46,  1.82it/s, loss=0.165, v_num=0, train/loss_simple_step=0.310, train/loss_vlb_step=0.00133, train/loss_step=0.310, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  51%|█████     | 3048/5971 [27:56<26:47,  1.82it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Validating: 0it [00:00, ?it/s]
Validating:   0%|          | 0/167 [00:00<?, ?it/s]
Validating:   1%|          | 1/167 [00:00<01:14,  2.22it/s]
Validating:   1%|          | 2/167 [00:00<00:48,  3.40it/s]
Epoch 8:  51%|█████     | 3051/5971 [27:57<26:45,  1.82it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Validating:   3%|▎         | 5/167 [00:00<00:18,  8.94it/s]
Epoch 8:  51%|█████     | 3055/5971 [27:57<26:40,  1.82it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Validating:   5%|▍         | 8/167 [00:00<00:12, 13.24it/s]
Epoch 8:  51%|█████     | 3059/5971 [27:57<26:36,  1.82it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.60it/s][A

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.20it/s][A
Epoch 8:  51%|█████▏    | 3063/5971 [27:58<26:32,  1.83it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  10%|█         | 17/167 [00:01<00:07, 21.20it/s][A
Epoch 8:  51%|█████▏    | 3067/5971 [27:58<26:28,  1.83it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 22.63it/s][A
Epoch 8:  51%|█████▏    | 3071/5971 [27:58<26:24,  1.83it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  14%|█▍        | 23/167 [00:01<00:05, 24.50it/s][A

Validating:  16%|█▌        | 26/167 [00:01<00:05, 24.62it/s][A
Epoch 8:  51%|█████▏    | 3075/5971 [27:58<26:20,  1.83it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 25.65it/s][A
Epoch 8:  52%|█████▏    | 3079/5971 [27:58<26:16,  1.83it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 25.23it/s][A
Epoch 8:  52%|█████▏    | 3083/5971 [27:58<26:12,  1.84it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  21%|██        | 35/167 [00:01<00:05, 26.24it/s][A

Validating:  23%|██▎       | 38/167 [00:02<00:04, 26.29it/s][A
Epoch 8:  52%|█████▏    | 3087/5971 [27:59<26:08,  1.84it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 26.27it/s][A
Epoch 8:  52%|█████▏    | 3091/5971 [27:59<26:04,  1.84it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 25.83it/s][A
Epoch 8:  52%|█████▏    | 3095/5971 [27:59<25:59,  1.84it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 26.23it/s][A

Validating:  30%|██▉       | 50/167 [00:02<00:04, 25.88it/s][A
Epoch 8:  52%|█████▏    | 3099/5971 [27:59<25:55,  1.85it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 27.59it/s][A
Epoch 8:  52%|█████▏    | 3103/5971 [27:59<25:51,  1.85it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  35%|███▍      | 58/167 [00:02<00:03, 28.34it/s][A
Epoch 8:  52%|█████▏    | 3107/5971 [27:59<25:47,  1.85it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  37%|███▋      | 61/167 [00:02<00:03, 27.91it/s][A
Epoch 8:  52%|█████▏    | 3111/5971 [27:59<25:43,  1.85it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  38%|███▊      | 64/167 [00:02<00:03, 27.44it/s][A
Epoch 8:  52%|█████▏    | 3115/5971 [28:00<25:39,  1.85it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  40%|████      | 67/167 [00:03<00:03, 27.56it/s][A

Validating:  42%|████▏     | 70/167 [00:03<00:03, 27.27it/s][A
Epoch 8:  52%|█████▏    | 3119/5971 [28:00<25:35,  1.86it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  44%|████▎     | 73/167 [00:03<00:03, 27.87it/s][A
Epoch 8:  52%|█████▏    | 3123/5971 [28:00<25:31,  1.86it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 26.78it/s][A
Epoch 8:  52%|█████▏    | 3127/5971 [28:00<25:27,  1.86it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 27.54it/s][A
Epoch 8:  52%|█████▏    | 3131/5971 [28:00<25:23,  1.86it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  50%|████▉     | 83/167 [00:03<00:02, 28.31it/s][A

Validating:  51%|█████▏    | 86/167 [00:03<00:03, 26.53it/s][A
Epoch 8:  53%|█████▎    | 3135/5971 [28:00<25:19,  1.87it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  53%|█████▎    | 89/167 [00:03<00:02, 26.82it/s][A
Epoch 8:  53%|█████▎    | 3139/5971 [28:00<25:16,  1.87it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  55%|█████▌    | 92/167 [00:03<00:02, 26.77it/s][A
Epoch 8:  53%|█████▎    | 3143/5971 [28:01<25:12,  1.87it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 27.66it/s][A
Epoch 8:  53%|█████▎    | 3147/5971 [28:01<25:08,  1.87it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 27.51it/s][A

Validating:  61%|██████    | 102/167 [00:04<00:02, 27.76it/s][A
Epoch 8:  53%|█████▎    | 3151/5971 [28:01<25:04,  1.87it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 26.79it/s][A
Epoch 8:  53%|█████▎    | 3155/5971 [28:01<25:00,  1.88it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  65%|██████▌   | 109/167 [00:04<00:02, 28.03it/s][A
Epoch 8:  53%|█████▎    | 3159/5971 [28:01<24:56,  1.88it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  67%|██████▋   | 112/167 [00:04<00:01, 27.90it/s][A
Epoch 8:  53%|█████▎    | 3163/5971 [28:01<24:52,  1.88it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  69%|██████▉   | 115/167 [00:04<00:01, 27.09it/s][A

Validating:  71%|███████   | 118/167 [00:04<00:01, 27.11it/s][A
Epoch 8:  53%|█████▎    | 3167/5971 [28:01<24:48,  1.88it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  72%|███████▏  | 121/167 [00:05<00:01, 27.23it/s][A
Epoch 8:  53%|█████▎    | 3171/5971 [28:02<24:44,  1.89it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 27.84it/s][A
Epoch 8:  53%|█████▎    | 3175/5971 [28:02<24:40,  1.89it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 27.41it/s][A

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 27.81it/s][A
Epoch 8:  53%|█████▎    | 3179/5971 [28:02<24:37,  1.89it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 28.21it/s][A
Epoch 8:  53%|█████▎    | 3183/5971 [28:02<24:33,  1.89it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 26.40it/s][A
Epoch 8:  53%|█████▎    | 3187/5971 [28:02<24:29,  1.89it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  83%|████████▎ | 139/167 [00:05<00:01, 25.16it/s][A

Validating:  85%|████████▌ | 142/167 [00:05<00:00, 25.58it/s][A
Epoch 8:  53%|█████▎    | 3191/5971 [28:02<24:25,  1.90it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  87%|████████▋ | 145/167 [00:05<00:00, 25.60it/s][A
Epoch 8:  54%|█████▎    | 3195/5971 [28:03<24:21,  1.90it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 26.47it/s][A
Epoch 8:  54%|█████▎    | 3199/5971 [28:03<24:18,  1.90it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 25.86it/s][A

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 26.04it/s][A
Epoch 8:  54%|█████▎    | 3203/5971 [28:03<24:14,  1.90it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 26.33it/s][A
Epoch 8:  54%|█████▎    | 3207/5971 [28:03<24:10,  1.91it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 27.39it/s][A
Epoch 8:  54%|█████▍    | 3211/5971 [28:03<24:06,  1.91it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  98%|█████████▊| 164/167 [00:06<00:00, 26.59it/s][A
Epoch 8:  54%|█████▍    | 3215/5971 [28:03<24:02,  1.91it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3216/5971 [28:04<24:02,  1.91it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00382, train/loss_vlb_step=2.03e-5, train/loss_step=0.00382, global_step=4899.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.36it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.46it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.30it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.94it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.43it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.80it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.06it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:07,  5.26it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.38it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.48it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.49it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.53it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.57it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.61it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.64it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.66it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.67it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.67it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.67it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.68it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.68it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.67it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.67it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.67it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:04<00:04,  5.67it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.68it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.69it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.69it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.68it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.69it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.70it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.70it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:02,  5.71it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.71it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.70it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.71it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.71it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.72it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.72it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.72it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.72it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:07<00:01,  5.72it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.73it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.72it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.72it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.72it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:08<00:00,  5.72it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:08<00:00,  5.71it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.72it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.70it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.35it/s]

Epoch 8:  54%|█████▍    | 3217/5971 [28:15<24:11,  1.90it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.53e-5, train/loss_step=0.0149, global_step=4900.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.35it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.45it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.32it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.98it/s][A
Epoch 8:  54%|█████▍    | 3217/5971 [28:18<24:13,  1.89it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.53e-5, train/loss_step=0.0149, global_step=4900.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.47it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.84it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  5.06it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:07,  5.26it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.40it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.49it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.56it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.62it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.63it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.66it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.68it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:05,  5.70it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.72it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.72it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.73it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.73it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.73it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.72it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:05,  4.59it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:05,  4.83it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.05it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.24it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.37it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:04,  5.47it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.53it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.56it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.60it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.64it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.67it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.69it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.59it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.61it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.63it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.65it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.66it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.67it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.68it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.69it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.70it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.71it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.71it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.70it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:08<00:00,  5.71it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.70it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.70it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.70it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.27it/s]

Epoch 8:  54%|█████▍    | 3218/5971 [28:27<24:20,  1.89it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.53e-5, train/loss_step=0.0149, global_step=4900.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3218/5971 [28:27<24:20,  1.89it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0271, train/loss_vlb_step=0.000111, train/loss_step=0.0271, global_step=4900.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.33it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:19,  2.41it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.27it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:11,  3.92it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.36it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.68it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.98it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:01<00:08,  5.20it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.36it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.47it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.54it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:06,  5.59it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.61it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.65it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.68it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:05,  5.70it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.69it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.69it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:03<00:05,  5.69it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.69it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.70it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:04,  5.71it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.71it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.72it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:04<00:04,  5.72it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.72it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.71it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.72it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.73it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:05<00:03,  5.73it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.72it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.72it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:02,  5.72it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.72it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.72it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:06<00:02,  5.72it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.72it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.72it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.72it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.62it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.54it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:07<00:01,  5.41it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.41it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.45it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.49it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.51it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:08<00:00,  5.52it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.53it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.53it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.54it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.30it/s]

Epoch 8:  54%|█████▍    | 3219/5971 [28:39<24:29,  1.87it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0271, train/loss_vlb_step=0.000111, train/loss_step=0.0271, global_step=4900.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3219/5971 [28:39<24:29,  1.87it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0521, train/loss_vlb_step=0.000174, train/loss_step=0.0521, global_step=4900.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
timesteps used in spaced sampler: 
	[0, 20, 41, 61, 82, 102, 122, 143, 163, 183, 204, 224, 245, 265, 285, 306, 326, 347, 367, 387, 408, 428, 449, 469, 489, 510, 530, 550, 571, 591, 612, 632, 652, 673, 693, 714, 734, 754, 775, 795, 816, 836, 856, 877, 897, 917, 938, 958, 979, 999]


Spaced Sampler:   0%|          | 0/50 [00:00<?, ?it/s][A

Spaced Sampler:   2%|▏         | 1/50 [00:00<00:36,  1.34it/s][A

Spaced Sampler:   4%|▍         | 2/50 [00:00<00:20,  2.38it/s][A

Spaced Sampler:   6%|▌         | 3/50 [00:01<00:14,  3.18it/s][A

Spaced Sampler:   8%|▊         | 4/50 [00:01<00:12,  3.79it/s][A

Spaced Sampler:  10%|█         | 5/50 [00:01<00:10,  4.28it/s][A

Spaced Sampler:  12%|█▏        | 6/50 [00:01<00:09,  4.64it/s][A

Spaced Sampler:  14%|█▍        | 7/50 [00:01<00:08,  4.90it/s][A

Spaced Sampler:  16%|█▌        | 8/50 [00:02<00:08,  5.10it/s][A

Spaced Sampler:  18%|█▊        | 9/50 [00:02<00:07,  5.24it/s][A

Spaced Sampler:  20%|██        | 10/50 [00:02<00:07,  5.34it/s][A

Spaced Sampler:  22%|██▏       | 11/50 [00:02<00:07,  5.42it/s][A

Spaced Sampler:  24%|██▍       | 12/50 [00:02<00:07,  5.41it/s][A

Spaced Sampler:  26%|██▌       | 13/50 [00:02<00:06,  5.43it/s][A

Spaced Sampler:  28%|██▊       | 14/50 [00:03<00:06,  5.45it/s][A

Spaced Sampler:  30%|███       | 15/50 [00:03<00:06,  5.48it/s][A

Spaced Sampler:  32%|███▏      | 16/50 [00:03<00:06,  5.51it/s][A

Spaced Sampler:  34%|███▍      | 17/50 [00:03<00:05,  5.52it/s][A

Spaced Sampler:  36%|███▌      | 18/50 [00:03<00:05,  5.51it/s][A

Spaced Sampler:  38%|███▊      | 19/50 [00:04<00:05,  5.51it/s][A

Spaced Sampler:  40%|████      | 20/50 [00:04<00:05,  5.49it/s][A

Spaced Sampler:  42%|████▏     | 21/50 [00:04<00:05,  5.49it/s][A

Spaced Sampler:  44%|████▍     | 22/50 [00:04<00:05,  5.51it/s][A

Spaced Sampler:  46%|████▌     | 23/50 [00:04<00:04,  5.52it/s][A

Spaced Sampler:  48%|████▊     | 24/50 [00:04<00:04,  5.52it/s][A

Spaced Sampler:  50%|█████     | 25/50 [00:05<00:04,  5.53it/s][A

Spaced Sampler:  52%|█████▏    | 26/50 [00:05<00:04,  5.53it/s][A

Spaced Sampler:  54%|█████▍    | 27/50 [00:05<00:04,  5.53it/s][A

Spaced Sampler:  56%|█████▌    | 28/50 [00:05<00:03,  5.54it/s][A

Spaced Sampler:  58%|█████▊    | 29/50 [00:05<00:03,  5.55it/s][A

Spaced Sampler:  60%|██████    | 30/50 [00:06<00:03,  5.53it/s][A

Spaced Sampler:  62%|██████▏   | 31/50 [00:06<00:03,  5.49it/s][A

Spaced Sampler:  64%|██████▍   | 32/50 [00:06<00:03,  5.53it/s][A

Spaced Sampler:  66%|██████▌   | 33/50 [00:06<00:03,  5.59it/s][A

Spaced Sampler:  68%|██████▊   | 34/50 [00:06<00:02,  5.62it/s][A

Spaced Sampler:  70%|███████   | 35/50 [00:06<00:02,  5.65it/s][A

Spaced Sampler:  72%|███████▏  | 36/50 [00:07<00:02,  5.66it/s][A

Spaced Sampler:  74%|███████▍  | 37/50 [00:07<00:02,  5.67it/s][A

Spaced Sampler:  76%|███████▌  | 38/50 [00:07<00:02,  5.68it/s][A

Spaced Sampler:  78%|███████▊  | 39/50 [00:07<00:01,  5.69it/s][A

Spaced Sampler:  80%|████████  | 40/50 [00:07<00:01,  5.69it/s][A

Spaced Sampler:  82%|████████▏ | 41/50 [00:07<00:01,  5.70it/s][A

Spaced Sampler:  84%|████████▍ | 42/50 [00:08<00:01,  5.70it/s][A

Spaced Sampler:  86%|████████▌ | 43/50 [00:08<00:01,  5.71it/s][A

Spaced Sampler:  88%|████████▊ | 44/50 [00:08<00:01,  5.72it/s][A

Spaced Sampler:  90%|█████████ | 45/50 [00:08<00:00,  5.72it/s][A

Spaced Sampler:  92%|█████████▏| 46/50 [00:08<00:00,  5.72it/s][A

Spaced Sampler:  94%|█████████▍| 47/50 [00:09<00:00,  5.72it/s][A

Spaced Sampler:  96%|█████████▌| 48/50 [00:09<00:00,  5.71it/s][A

Spaced Sampler:  98%|█████████▊| 49/50 [00:09<00:00,  5.72it/s][A

Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.73it/s][A
Spaced Sampler: 100%|██████████| 50/50 [00:09<00:00,  5.25it/s]

Epoch 8:  54%|█████▍    | 3220/5971 [28:52<24:39,  1.86it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0521, train/loss_vlb_step=0.000174, train/loss_step=0.0521, global_step=4900.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3220/5971 [28:52<24:39,  1.86it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0903, train/loss_vlb_step=0.000297, train/loss_step=0.0903, global_step=4900.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3221/5971 [28:53<24:39,  1.86it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0903, train/loss_vlb_step=0.000297, train/loss_step=0.0903, global_step=4900.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3221/5971 [28:53<24:39,  1.86it/s, loss=0.133, v_num=0, train/loss_simple_step=0.054, train/loss_vlb_step=0.000189, train/loss_step=0.054, global_step=4901.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  54%|█████▍    | 3222/5971 [28:54<24:39,  1.86it/s, loss=0.133, v_num=0, train/loss_simple_step=0.054, train/loss_vlb_step=0.000189, train/loss_step=0.054, global_step=4901.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3222/5971 [28:54<24:39,  1.86it/s, loss=0.126, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000481, train/loss_step=0.141, global_step=4901.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3223/5971 [28:55<24:39,  1.86it/s, loss=0.126, v_num=0, train/loss_simple_step=0.141, train/loss_vlb_step=0.000481, train/loss_step=0.141, global_step=4901.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3223/5971 [28:55<24:39,  1.86it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0424, train/loss_vlb_step=0.000153, train/loss_step=0.0424, global_step=4901.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3224/5971 [28:57<24:40,  1.86it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0424, train/loss_vlb_step=0.000153, train/loss_step=0.0424, global_step=4901.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3224/5971 [28:57<24:40,  1.86it/s, loss=0.0973, v_num=0, train/loss_simple_step=0.00444, train/loss_vlb_step=2.22e-5, train/loss_step=0.00444, global_step=4901.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3225/5971 [28:58<24:40,  1.86it/s, loss=0.0973, v_num=0, train/loss_simple_step=0.00444, train/loss_vlb_step=2.22e-5, train/loss_step=0.00444, global_step=4901.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3225/5971 [28:58<24:40,  1.86it/s, loss=0.0858, v_num=0, train/loss_simple_step=0.00215, train/loss_vlb_step=1.23e-5, train/loss_step=0.00215, global_step=4902.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3226/5971 [28:59<24:39,  1.86it/s, loss=0.0858, v_num=0, train/loss_simple_step=0.00215, train/loss_vlb_step=1.23e-5, train/loss_step=0.00215, global_step=4902.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3226/5971 [28:59<24:39,  1.86it/s, loss=0.0734, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.000737, train/loss_step=0.206, global_step=4902.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  54%|█████▍    | 3227/5971 [29:00<24:39,  1.85it/s, loss=0.0734, v_num=0, train/loss_simple_step=0.206, train/loss_vlb_step=0.000737, train/loss_step=0.206, global_step=4902.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3227/5971 [29:00<24:39,  1.85it/s, loss=0.0771, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000383, train/loss_step=0.116, global_step=4902.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3228/5971 [29:02<24:40,  1.85it/s, loss=0.0771, v_num=0, train/loss_simple_step=0.116, train/loss_vlb_step=0.000383, train/loss_step=0.116, global_step=4902.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3228/5971 [29:02<24:40,  1.85it/s, loss=0.0918, v_num=0, train/loss_simple_step=0.356, train/loss_vlb_step=0.00168, train/loss_step=0.356, global_step=4902.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  54%|█████▍    | 3229/5971 [29:03<24:40,  1.85it/s, loss=0.0918, v_num=0, train/loss_simple_step=0.356, train/loss_vlb_step=0.00168, train/loss_step=0.356, global_step=4902.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3229/5971 [29:03<24:40,  1.85it/s, loss=0.102, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000845, train/loss_step=0.224, global_step=4903.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3230/5971 [29:04<24:39,  1.85it/s, loss=0.102, v_num=0, train/loss_simple_step=0.224, train/loss_vlb_step=0.000845, train/loss_step=0.224, global_step=4903.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3230/5971 [29:04<24:39,  1.85it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0725, train/loss_vlb_step=0.000247, train/loss_step=0.0725, global_step=4903.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3231/5971 [29:05<24:39,  1.85it/s, loss=0.101, v_num=0, train/loss_simple_step=0.0725, train/loss_vlb_step=0.000247, train/loss_step=0.0725, global_step=4903.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3231/5971 [29:05<24:39,  1.85it/s, loss=0.113, v_num=0, train/loss_simple_step=0.257, train/loss_vlb_step=0.000973, train/loss_step=0.257, global_step=4903.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  54%|█████▍    | 3232/5971 [29:07<24:40,  1.85it/s, loss=0.113, v_num=0, train/loss_simple_step=0.257, train/loss_vlb_step=0.000973, train/loss_step=0.257, global_step=4903.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3232/5971 [29:07<24:40,  1.85it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00914, train/loss_vlb_step=4.28e-5, train/loss_step=0.00914, global_step=4903.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3233/5971 [29:08<24:40,  1.85it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00914, train/loss_vlb_step=4.28e-5, train/loss_step=0.00914, global_step=4903.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3233/5971 [29:08<24:40,  1.85it/s, loss=0.109, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.0007, train/loss_step=0.180, global_step=4904.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]     
Epoch 8:  54%|█████▍    | 3234/5971 [29:09<24:40,  1.85it/s, loss=0.109, v_num=0, train/loss_simple_step=0.180, train/loss_vlb_step=0.0007, train/loss_step=0.180, global_step=4904.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3234/5971 [29:09<24:40,  1.85it/s, loss=0.119, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000774, train/loss_step=0.207, global_step=4904.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3235/5971 [29:10<24:39,  1.85it/s, loss=0.119, v_num=0, train/loss_simple_step=0.207, train/loss_vlb_step=0.000774, train/loss_step=0.207, global_step=4904.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3235/5971 [29:10<24:39,  1.85it/s, loss=0.125, v_num=0, train/loss_simple_step=0.430, train/loss_vlb_step=0.00287, train/loss_step=0.430, global_step=4904.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  54%|█████▍    | 3236/5971 [29:12<24:40,  1.85it/s, loss=0.125, v_num=0, train/loss_simple_step=0.430, train/loss_vlb_step=0.00287, train/loss_step=0.430, global_step=4904.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3236/5971 [29:12<24:40,  1.85it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00405, train/loss_vlb_step=2.07e-5, train/loss_step=0.00405, global_step=4904.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3237/5971 [29:13<24:40,  1.85it/s, loss=0.125, v_num=0, train/loss_simple_step=0.00405, train/loss_vlb_step=2.07e-5, train/loss_step=0.00405, global_step=4904.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3237/5971 [29:13<24:40,  1.85it/s, loss=0.131, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000534, train/loss_step=0.148, global_step=4905.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  54%|█████▍    | 3238/5971 [29:14<24:40,  1.85it/s, loss=0.131, v_num=0, train/loss_simple_step=0.148, train/loss_vlb_step=0.000534, train/loss_step=0.148, global_step=4905.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3238/5971 [29:14<24:40,  1.85it/s, loss=0.162, v_num=0, train/loss_simple_step=0.652, train/loss_vlb_step=0.00988, train/loss_step=0.652, global_step=4905.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  54%|█████▍    | 3239/5971 [29:14<24:39,  1.85it/s, loss=0.162, v_num=0, train/loss_simple_step=0.652, train/loss_vlb_step=0.00988, train/loss_step=0.652, global_step=4905.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3239/5971 [29:14<24:39,  1.85it/s, loss=0.169, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.00069, train/loss_step=0.187, global_step=4905.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3240/5971 [29:16<24:40,  1.84it/s, loss=0.169, v_num=0, train/loss_simple_step=0.187, train/loss_vlb_step=0.00069, train/loss_step=0.187, global_step=4905.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3240/5971 [29:16<24:40,  1.84it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00287, train/loss_vlb_step=1.45e-5, train/loss_step=0.00287, global_step=4905.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3241/5971 [29:17<24:40,  1.84it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00287, train/loss_vlb_step=1.45e-5, train/loss_step=0.00287, global_step=4905.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3241/5971 [29:17<24:40,  1.84it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000138, train/loss_step=0.0386, global_step=4906.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  54%|█████▍    | 3242/5971 [29:18<24:39,  1.84it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0386, train/loss_vlb_step=0.000138, train/loss_step=0.0386, global_step=4906.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3242/5971 [29:18<24:39,  1.84it/s, loss=0.176, v_num=0, train/loss_simple_step=0.385, train/loss_vlb_step=0.00198, train/loss_step=0.385, global_step=4906.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  54%|█████▍    | 3243/5971 [29:19<24:39,  1.84it/s, loss=0.176, v_num=0, train/loss_simple_step=0.385, train/loss_vlb_step=0.00198, train/loss_step=0.385, global_step=4906.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3243/5971 [29:19<24:39,  1.84it/s, loss=0.201, v_num=0, train/loss_simple_step=0.530, train/loss_vlb_step=0.00354, train/loss_step=0.530, global_step=4906.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3244/5971 [29:21<24:40,  1.84it/s, loss=0.201, v_num=0, train/loss_simple_step=0.530, train/loss_vlb_step=0.00354, train/loss_step=0.530, global_step=4906.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3244/5971 [29:21<24:40,  1.84it/s, loss=0.206, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000364, train/loss_step=0.111, global_step=4906.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3245/5971 [29:22<24:40,  1.84it/s, loss=0.206, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000364, train/loss_step=0.111, global_step=4906.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3245/5971 [29:22<24:40,  1.84it/s, loss=0.206, v_num=0, train/loss_simple_step=0.00854, train/loss_vlb_step=3.86e-5, train/loss_step=0.00854, global_step=4907.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3246/5971 [29:23<24:40,  1.84it/s, loss=0.206, v_num=0, train/loss_simple_step=0.00854, train/loss_vlb_step=3.86e-5, train/loss_step=0.00854, global_step=4907.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3246/5971 [29:23<24:40,  1.84it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0352, train/loss_vlb_step=0.000133, train/loss_step=0.0352, global_step=4907.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  54%|█████▍    | 3247/5971 [29:24<24:39,  1.84it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0352, train/loss_vlb_step=0.000133, train/loss_step=0.0352, global_step=4907.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3247/5971 [29:24<24:39,  1.84it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0544, train/loss_vlb_step=0.000199, train/loss_step=0.0544, global_step=4907.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3248/5971 [29:26<24:40,  1.84it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0544, train/loss_vlb_step=0.000199, train/loss_step=0.0544, global_step=4907.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3248/5971 [29:26<24:40,  1.84it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0533, train/loss_vlb_step=0.000191, train/loss_step=0.0533, global_step=4907.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3249/5971 [29:27<24:40,  1.84it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0533, train/loss_vlb_step=0.000191, train/loss_step=0.0533, global_step=4907.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3249/5971 [29:27<24:40,  1.84it/s, loss=0.176, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000518, train/loss_step=0.157, global_step=4908.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  54%|█████▍    | 3250/5971 [29:28<24:40,  1.84it/s, loss=0.176, v_num=0, train/loss_simple_step=0.157, train/loss_vlb_step=0.000518, train/loss_step=0.157, global_step=4908.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3250/5971 [29:28<24:40,  1.84it/s, loss=0.206, v_num=0, train/loss_simple_step=0.673, train/loss_vlb_step=0.0131, train/loss_step=0.673, global_step=4908.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  54%|█████▍    | 3251/5971 [29:29<24:39,  1.84it/s, loss=0.206, v_num=0, train/loss_simple_step=0.673, train/loss_vlb_step=0.0131, train/loss_step=0.673, global_step=4908.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3251/5971 [29:29<24:39,  1.84it/s, loss=0.211, v_num=0, train/loss_simple_step=0.355, train/loss_vlb_step=0.0021, train/loss_step=0.355, global_step=4908.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3252/5971 [29:31<24:40,  1.84it/s, loss=0.211, v_num=0, train/loss_simple_step=0.355, train/loss_vlb_step=0.0021, train/loss_step=0.355, global_step=4908.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3252/5971 [29:31<24:40,  1.84it/s, loss=0.222, v_num=0, train/loss_simple_step=0.226, train/loss_vlb_step=0.00079, train/loss_step=0.226, global_step=4908.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3253/5971 [29:32<24:40,  1.84it/s, loss=0.222, v_num=0, train/loss_simple_step=0.226, train/loss_vlb_step=0.00079, train/loss_step=0.226, global_step=4908.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3253/5971 [29:32<24:40,  1.84it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0017, train/loss_vlb_step=1.01e-5, train/loss_step=0.0017, global_step=4909.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3254/5971 [29:33<24:40,  1.84it/s, loss=0.213, v_num=0, train/loss_simple_step=0.0017, train/loss_vlb_step=1.01e-5, train/loss_step=0.0017, global_step=4909.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  54%|█████▍    | 3254/5971 [29:33<24:40,  1.84it/s, loss=0.208, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000331, train/loss_step=0.101, global_step=4909.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  55%|█████▍    | 3255/5971 [29:34<24:39,  1.84it/s, loss=0.208, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000331, train/loss_step=0.101, global_step=4909.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3255/5971 [29:34<24:39,  1.84it/s, loss=0.204, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00152, train/loss_step=0.353, global_step=4909.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  55%|█████▍    | 3256/5971 [29:36<24:40,  1.83it/s, loss=0.204, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00152, train/loss_step=0.353, global_step=4909.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3256/5971 [29:36<24:40,  1.83it/s, loss=0.225, v_num=0, train/loss_simple_step=0.428, train/loss_vlb_step=0.00362, train/loss_step=0.428, global_step=4909.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3257/5971 [29:37<24:40,  1.83it/s, loss=0.225, v_num=0, train/loss_simple_step=0.428, train/loss_vlb_step=0.00362, train/loss_step=0.428, global_step=4909.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3257/5971 [29:37<24:40,  1.83it/s, loss=0.218, v_num=0, train/loss_simple_step=0.0163, train/loss_vlb_step=6.81e-5, train/loss_step=0.0163, global_step=4910.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3258/5971 [29:38<24:40,  1.83it/s, loss=0.218, v_num=0, train/loss_simple_step=0.0163, train/loss_vlb_step=6.81e-5, train/loss_step=0.0163, global_step=4910.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3258/5971 [29:38<24:40,  1.83it/s, loss=0.227, v_num=0, train/loss_simple_step=0.826, train/loss_vlb_step=0.0703, train/loss_step=0.826, global_step=4910.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  55%|█████▍    | 3259/5971 [29:39<24:40,  1.83it/s, loss=0.227, v_num=0, train/loss_simple_step=0.826, train/loss_vlb_step=0.0703, train/loss_step=0.826, global_step=4910.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3259/5971 [29:39<24:40,  1.83it/s, loss=0.263, v_num=0, train/loss_simple_step=0.908, train/loss_vlb_step=0.229, train/loss_step=0.908, global_step=4910.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  55%|█████▍    | 3260/5971 [29:41<24:40,  1.83it/s, loss=0.263, v_num=0, train/loss_simple_step=0.908, train/loss_vlb_step=0.229, train/loss_step=0.908, global_step=4910.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3260/5971 [29:41<24:40,  1.83it/s, loss=0.268, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000322, train/loss_step=0.0971, global_step=4910.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3261/5971 [29:42<24:40,  1.83it/s, loss=0.268, v_num=0, train/loss_simple_step=0.0971, train/loss_vlb_step=0.000322, train/loss_step=0.0971, global_step=4910.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3261/5971 [29:42<24:40,  1.83it/s, loss=0.266, v_num=0, train/loss_simple_step=0.00319, train/loss_vlb_step=1.77e-5, train/loss_step=0.00319, global_step=4911.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3262/5971 [29:43<24:40,  1.83it/s, loss=0.266, v_num=0, train/loss_simple_step=0.00319, train/loss_vlb_step=1.77e-5, train/loss_step=0.00319, global_step=4911.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3262/5971 [29:43<24:40,  1.83it/s, loss=0.268, v_num=0, train/loss_simple_step=0.421, train/loss_vlb_step=0.00311, train/loss_step=0.421, global_step=4911.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  55%|█████▍    | 3263/5971 [29:43<24:40,  1.83it/s, loss=0.268, v_num=0, train/loss_simple_step=0.421, train/loss_vlb_step=0.00311, train/loss_step=0.421, global_step=4911.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3263/5971 [29:43<24:40,  1.83it/s, loss=0.242, v_num=0, train/loss_simple_step=0.00721, train/loss_vlb_step=3.39e-5, train/loss_step=0.00721, global_step=4911.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3264/5971 [29:46<24:40,  1.83it/s, loss=0.242, v_num=0, train/loss_simple_step=0.00721, train/loss_vlb_step=3.39e-5, train/loss_step=0.00721, global_step=4911.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3264/5971 [29:46<24:40,  1.83it/s, loss=0.243, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000471, train/loss_step=0.140, global_step=4911.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  55%|█████▍    | 3265/5971 [29:46<24:40,  1.83it/s, loss=0.243, v_num=0, train/loss_simple_step=0.140, train/loss_vlb_step=0.000471, train/loss_step=0.140, global_step=4911.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3265/5971 [29:46<24:40,  1.83it/s, loss=0.251, v_num=0, train/loss_simple_step=0.159, train/loss_vlb_step=0.000526, train/loss_step=0.159, global_step=4912.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3266/5971 [29:47<24:40,  1.83it/s, loss=0.251, v_num=0, train/loss_simple_step=0.159, train/loss_vlb_step=0.000526, train/loss_step=0.159, global_step=4912.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3266/5971 [29:47<24:40,  1.83it/s, loss=0.257, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000526, train/loss_step=0.158, global_step=4912.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3267/5971 [29:48<24:39,  1.83it/s, loss=0.257, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000526, train/loss_step=0.158, global_step=4912.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3267/5971 [29:48<24:39,  1.83it/s, loss=0.269, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.0011, train/loss_step=0.287, global_step=4912.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  55%|█████▍    | 3268/5971 [29:50<24:40,  1.83it/s, loss=0.269, v_num=0, train/loss_simple_step=0.287, train/loss_vlb_step=0.0011, train/loss_step=0.287, global_step=4912.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3268/5971 [29:50<24:40,  1.83it/s, loss=0.269, v_num=0, train/loss_simple_step=0.0566, train/loss_vlb_step=0.000195, train/loss_step=0.0566, global_step=4912.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3269/5971 [29:51<24:40,  1.83it/s, loss=0.269, v_num=0, train/loss_simple_step=0.0566, train/loss_vlb_step=0.000195, train/loss_step=0.0566, global_step=4912.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3269/5971 [29:51<24:40,  1.83it/s, loss=0.27, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000617, train/loss_step=0.185, global_step=4913.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  55%|█████▍    | 3270/5971 [29:52<24:40,  1.82it/s, loss=0.27, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000617, train/loss_step=0.185, global_step=4913.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3270/5971 [29:52<24:40,  1.82it/s, loss=0.245, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.0006, train/loss_step=0.171, global_step=4913.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  55%|█████▍    | 3271/5971 [29:53<24:39,  1.82it/s, loss=0.245, v_num=0, train/loss_simple_step=0.171, train/loss_vlb_step=0.0006, train/loss_step=0.171, global_step=4913.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3271/5971 [29:53<24:39,  1.82it/s, loss=0.229, v_num=0, train/loss_simple_step=0.045, train/loss_vlb_step=0.00016, train/loss_step=0.045, global_step=4913.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3272/5971 [29:55<24:40,  1.82it/s, loss=0.229, v_num=0, train/loss_simple_step=0.045, train/loss_vlb_step=0.00016, train/loss_step=0.045, global_step=4913.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3272/5971 [29:55<24:40,  1.82it/s, loss=0.23, v_num=0, train/loss_simple_step=0.239, train/loss_vlb_step=0.000924, train/loss_step=0.239, global_step=4913.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3273/5971 [29:56<24:40,  1.82it/s, loss=0.23, v_num=0, train/loss_simple_step=0.239, train/loss_vlb_step=0.000924, train/loss_step=0.239, global_step=4913.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3273/5971 [29:56<24:40,  1.82it/s, loss=0.231, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=5.79e-5, train/loss_step=0.0138, global_step=4914.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3274/5971 [29:57<24:40,  1.82it/s, loss=0.231, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=5.79e-5, train/loss_step=0.0138, global_step=4914.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3274/5971 [29:57<24:40,  1.82it/s, loss=0.227, v_num=0, train/loss_simple_step=0.0171, train/loss_vlb_step=7.54e-5, train/loss_step=0.0171, global_step=4914.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3275/5971 [29:58<24:39,  1.82it/s, loss=0.227, v_num=0, train/loss_simple_step=0.0171, train/loss_vlb_step=7.54e-5, train/loss_step=0.0171, global_step=4914.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3275/5971 [29:58<24:39,  1.82it/s, loss=0.248, v_num=0, train/loss_simple_step=0.778, train/loss_vlb_step=0.0218, train/loss_step=0.778, global_step=4914.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  55%|█████▍    | 3276/5971 [30:00<24:40,  1.82it/s, loss=0.248, v_num=0, train/loss_simple_step=0.778, train/loss_vlb_step=0.0218, train/loss_step=0.778, global_step=4914.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3276/5971 [30:00<24:40,  1.82it/s, loss=0.228, v_num=0, train/loss_simple_step=0.0315, train/loss_vlb_step=0.000117, train/loss_step=0.0315, global_step=4914.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3277/5971 [30:01<24:40,  1.82it/s, loss=0.228, v_num=0, train/loss_simple_step=0.0315, train/loss_vlb_step=0.000117, train/loss_step=0.0315, global_step=4914.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3277/5971 [30:01<24:40,  1.82it/s, loss=0.236, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000704, train/loss_step=0.185, global_step=4915.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  55%|█████▍    | 3278/5971 [30:02<24:40,  1.82it/s, loss=0.236, v_num=0, train/loss_simple_step=0.185, train/loss_vlb_step=0.000704, train/loss_step=0.185, global_step=4915.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3278/5971 [30:02<24:40,  1.82it/s, loss=0.201, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000397, train/loss_step=0.121, global_step=4915.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3279/5971 [30:03<24:39,  1.82it/s, loss=0.201, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000397, train/loss_step=0.121, global_step=4915.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3279/5971 [30:03<24:39,  1.82it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00482, train/loss_vlb_step=2.51e-5, train/loss_step=0.00482, global_step=4915.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3280/5971 [30:05<24:40,  1.82it/s, loss=0.156, v_num=0, train/loss_simple_step=0.00482, train/loss_vlb_step=2.51e-5, train/loss_step=0.00482, global_step=4915.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3280/5971 [30:05<24:40,  1.82it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00374, train/loss_vlb_step=2.05e-5, train/loss_step=0.00374, global_step=4915.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3281/5971 [30:06<24:40,  1.82it/s, loss=0.151, v_num=0, train/loss_simple_step=0.00374, train/loss_vlb_step=2.05e-5, train/loss_step=0.00374, global_step=4915.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3281/5971 [30:06<24:40,  1.82it/s, loss=0.157, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.00037, train/loss_step=0.109, global_step=4916.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  55%|█████▍    | 3282/5971 [30:07<24:40,  1.82it/s, loss=0.157, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.00037, train/loss_step=0.109, global_step=4916.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3282/5971 [30:07<24:40,  1.82it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00316, train/loss_vlb_step=1.74e-5, train/loss_step=0.00316, global_step=4916.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3283/5971 [30:08<24:39,  1.82it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00316, train/loss_vlb_step=1.74e-5, train/loss_step=0.00316, global_step=4916.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3283/5971 [30:08<24:39,  1.82it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00295, train/loss_vlb_step=1.6e-5, train/loss_step=0.00295, global_step=4916.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  55%|█████▍    | 3284/5971 [30:10<24:40,  1.81it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00295, train/loss_vlb_step=1.6e-5, train/loss_step=0.00295, global_step=4916.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▍    | 3284/5971 [30:10<24:40,  1.81it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0508, train/loss_vlb_step=0.000181, train/loss_step=0.0508, global_step=4916.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3285/5971 [30:11<24:40,  1.81it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0508, train/loss_vlb_step=0.000181, train/loss_step=0.0508, global_step=4916.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3285/5971 [30:11<24:40,  1.81it/s, loss=0.136, v_num=0, train/loss_simple_step=0.256, train/loss_vlb_step=0.00128, train/loss_step=0.256, global_step=4917.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  55%|█████▌    | 3286/5971 [30:11<24:40,  1.81it/s, loss=0.136, v_num=0, train/loss_simple_step=0.256, train/loss_vlb_step=0.00128, train/loss_step=0.256, global_step=4917.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3286/5971 [30:11<24:40,  1.81it/s, loss=0.136, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000523, train/loss_step=0.153, global_step=4917.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3287/5971 [30:12<24:39,  1.81it/s, loss=0.136, v_num=0, train/loss_simple_step=0.153, train/loss_vlb_step=0.000523, train/loss_step=0.153, global_step=4917.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3287/5971 [30:12<24:39,  1.81it/s, loss=0.141, v_num=0, train/loss_simple_step=0.403, train/loss_vlb_step=0.00194, train/loss_step=0.403, global_step=4917.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  55%|█████▌    | 3288/5971 [30:15<24:40,  1.81it/s, loss=0.141, v_num=0, train/loss_simple_step=0.403, train/loss_vlb_step=0.00194, train/loss_step=0.403, global_step=4917.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3288/5971 [30:15<24:40,  1.81it/s, loss=0.144, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000333, train/loss_step=0.101, global_step=4917.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3289/5971 [30:15<24:40,  1.81it/s, loss=0.144, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000333, train/loss_step=0.101, global_step=4917.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3289/5971 [30:15<24:40,  1.81it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0797, train/loss_vlb_step=0.000263, train/loss_step=0.0797, global_step=4918.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3290/5971 [30:16<24:40,  1.81it/s, loss=0.138, v_num=0, train/loss_simple_step=0.0797, train/loss_vlb_step=0.000263, train/loss_step=0.0797, global_step=4918.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3290/5971 [30:16<24:40,  1.81it/s, loss=0.17, v_num=0, train/loss_simple_step=0.802, train/loss_vlb_step=0.0348, train/loss_step=0.802, global_step=4918.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]     
Epoch 8:  55%|█████▌    | 3291/5971 [30:17<24:39,  1.81it/s, loss=0.17, v_num=0, train/loss_simple_step=0.802, train/loss_vlb_step=0.0348, train/loss_step=0.802, global_step=4918.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3291/5971 [30:17<24:39,  1.81it/s, loss=0.175, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000482, train/loss_step=0.144, global_step=4918.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3292/5971 [30:19<24:40,  1.81it/s, loss=0.175, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000482, train/loss_step=0.144, global_step=4918.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3292/5971 [30:19<24:40,  1.81it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00123, train/loss_vlb_step=7.38e-6, train/loss_step=0.00123, global_step=4918.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3293/5971 [30:20<24:40,  1.81it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00123, train/loss_vlb_step=7.38e-6, train/loss_step=0.00123, global_step=4918.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3293/5971 [30:20<24:40,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.062, train/loss_vlb_step=0.000207, train/loss_step=0.062, global_step=4919.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  55%|█████▌    | 3294/5971 [30:21<24:40,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.062, train/loss_vlb_step=0.000207, train/loss_step=0.062, global_step=4919.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3294/5971 [30:21<24:40,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00355, train/loss_vlb_step=1.93e-5, train/loss_step=0.00355, global_step=4919.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3295/5971 [30:22<24:39,  1.81it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00355, train/loss_vlb_step=1.93e-5, train/loss_step=0.00355, global_step=4919.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3295/5971 [30:22<24:39,  1.81it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0888, train/loss_vlb_step=0.000292, train/loss_step=0.0888, global_step=4919.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  55%|█████▌    | 3296/5971 [30:24<24:40,  1.81it/s, loss=0.13, v_num=0, train/loss_simple_step=0.0888, train/loss_vlb_step=0.000292, train/loss_step=0.0888, global_step=4919.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3296/5971 [30:24<24:40,  1.81it/s, loss=0.156, v_num=0, train/loss_simple_step=0.545, train/loss_vlb_step=0.00483, train/loss_step=0.545, global_step=4919.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  55%|█████▌    | 3297/5971 [30:25<24:40,  1.81it/s, loss=0.156, v_num=0, train/loss_simple_step=0.545, train/loss_vlb_step=0.00483, train/loss_step=0.545, global_step=4919.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3297/5971 [30:25<24:40,  1.81it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0317, train/loss_vlb_step=0.00012, train/loss_step=0.0317, global_step=4920.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3298/5971 [30:26<24:39,  1.81it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0317, train/loss_vlb_step=0.00012, train/loss_step=0.0317, global_step=4920.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3298/5971 [30:26<24:39,  1.81it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0804, train/loss_vlb_step=0.000267, train/loss_step=0.0804, global_step=4920.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3299/5971 [30:27<24:39,  1.81it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0804, train/loss_vlb_step=0.000267, train/loss_step=0.0804, global_step=4920.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3299/5971 [30:27<24:39,  1.81it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0311, train/loss_vlb_step=0.000115, train/loss_step=0.0311, global_step=4920.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3300/5971 [30:30<24:40,  1.80it/s, loss=0.148, v_num=0, train/loss_simple_step=0.0311, train/loss_vlb_step=0.000115, train/loss_step=0.0311, global_step=4920.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3300/5971 [30:30<24:40,  1.80it/s, loss=0.169, v_num=0, train/loss_simple_step=0.430, train/loss_vlb_step=0.00243, train/loss_step=0.430, global_step=4920.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  55%|█████▌    | 3301/5971 [30:31<24:40,  1.80it/s, loss=0.169, v_num=0, train/loss_simple_step=0.430, train/loss_vlb_step=0.00243, train/loss_step=0.430, global_step=4920.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3301/5971 [30:31<24:40,  1.80it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00467, train/loss_vlb_step=2.48e-5, train/loss_step=0.00467, global_step=4921.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3302/5971 [30:31<24:40,  1.80it/s, loss=0.164, v_num=0, train/loss_simple_step=0.00467, train/loss_vlb_step=2.48e-5, train/loss_step=0.00467, global_step=4921.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3302/5971 [30:31<24:40,  1.80it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0712, train/loss_vlb_step=0.000242, train/loss_step=0.0712, global_step=4921.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  55%|█████▌    | 3303/5971 [30:32<24:40,  1.80it/s, loss=0.167, v_num=0, train/loss_simple_step=0.0712, train/loss_vlb_step=0.000242, train/loss_step=0.0712, global_step=4921.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3303/5971 [30:32<24:40,  1.80it/s, loss=0.176, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000702, train/loss_step=0.190, global_step=4921.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  55%|█████▌    | 3304/5971 [30:34<24:40,  1.80it/s, loss=0.176, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000702, train/loss_step=0.190, global_step=4921.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3304/5971 [30:34<24:40,  1.80it/s, loss=0.184, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000768, train/loss_step=0.211, global_step=4921.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3305/5971 [30:35<24:40,  1.80it/s, loss=0.184, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000768, train/loss_step=0.211, global_step=4921.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3305/5971 [30:35<24:40,  1.80it/s, loss=0.183, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.000933, train/loss_step=0.237, global_step=4922.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3306/5971 [30:36<24:40,  1.80it/s, loss=0.183, v_num=0, train/loss_simple_step=0.237, train/loss_vlb_step=0.000933, train/loss_step=0.237, global_step=4922.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3306/5971 [30:36<24:40,  1.80it/s, loss=0.21, v_num=0, train/loss_simple_step=0.691, train/loss_vlb_step=0.0228, train/loss_step=0.691, global_step=4922.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  55%|█████▌    | 3307/5971 [30:37<24:39,  1.80it/s, loss=0.21, v_num=0, train/loss_simple_step=0.691, train/loss_vlb_step=0.0228, train/loss_step=0.691, global_step=4922.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3307/5971 [30:37<24:39,  1.80it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00885, train/loss_vlb_step=4.1e-5, train/loss_step=0.00885, global_step=4922.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3308/5971 [30:39<24:40,  1.80it/s, loss=0.191, v_num=0, train/loss_simple_step=0.00885, train/loss_vlb_step=4.1e-5, train/loss_step=0.00885, global_step=4922.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3308/5971 [30:39<24:40,  1.80it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0162, train/loss_vlb_step=6.3e-5, train/loss_step=0.0162, global_step=4922.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  55%|█████▌    | 3309/5971 [30:40<24:40,  1.80it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0162, train/loss_vlb_step=6.3e-5, train/loss_step=0.0162, global_step=4922.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3309/5971 [30:40<24:40,  1.80it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00517, train/loss_vlb_step=2.55e-5, train/loss_step=0.00517, global_step=4923.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3310/5971 [30:41<24:39,  1.80it/s, loss=0.183, v_num=0, train/loss_simple_step=0.00517, train/loss_vlb_step=2.55e-5, train/loss_step=0.00517, global_step=4923.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3310/5971 [30:41<24:39,  1.80it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00564, train/loss_vlb_step=2.83e-5, train/loss_step=0.00564, global_step=4923.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3311/5971 [30:42<24:39,  1.80it/s, loss=0.143, v_num=0, train/loss_simple_step=0.00564, train/loss_vlb_step=2.83e-5, train/loss_step=0.00564, global_step=4923.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3311/5971 [30:42<24:39,  1.80it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=5.59e-5, train/loss_step=0.0138, global_step=4923.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  55%|█████▌    | 3312/5971 [30:44<24:40,  1.80it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0138, train/loss_vlb_step=5.59e-5, train/loss_step=0.0138, global_step=4923.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3312/5971 [30:44<24:40,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.386, train/loss_vlb_step=0.00222, train/loss_step=0.386, global_step=4923.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  55%|█████▌    | 3313/5971 [30:45<24:40,  1.80it/s, loss=0.156, v_num=0, train/loss_simple_step=0.386, train/loss_vlb_step=0.00222, train/loss_step=0.386, global_step=4923.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  55%|█████▌    | 3313/5971 [30:45<24:40,  1.80it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00705, train/loss_vlb_step=3.38e-5, train/loss_step=0.00705, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  56%|█████▌    | 3314/5971 [30:46<24:39,  1.80it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00705, train/loss_vlb_step=3.38e-5, train/loss_step=0.00705, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  56%|█████▌    | 3314/5971 [30:46<24:39,  1.80it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0221, train/loss_vlb_step=9.13e-5, train/loss_step=0.0221, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  56%|█████▌    | 3315/5971 [30:47<24:39,  1.80it/s, loss=0.154, v_num=0, train/loss_simple_step=0.0221, train/loss_vlb_step=9.13e-5, train/loss_step=0.0221, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  56%|█████▌    | 3315/5971 [30:47<24:39,  1.80it/s, loss=0.184, v_num=0, train/loss_simple_step=0.688, train/loss_vlb_step=0.0214, train/loss_step=0.688, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  56%|█████▌    | 3316/5971 [30:49<24:40,  1.79it/s, loss=0.184, v_num=0, train/loss_simple_step=0.688, train/loss_vlb_step=0.0214, train/loss_step=0.688, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  56%|█████▌    | 3316/5971 [30:49<24:40,  1.79it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:10,  2.36it/s][A
Epoch 8:  56%|█████▌    | 3318/5971 [30:49<24:38,  1.79it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   1%|          | 2/167 [00:00<00:46,  3.53it/s][A
Epoch 8:  56%|█████▌    | 3320/5971 [30:49<24:36,  1.80it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   3%|▎         | 5/167 [00:00<00:17,  9.17it/s][A
Epoch 8:  56%|█████▌    | 3323/5971 [30:50<24:33,  1.80it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   5%|▍         | 8/167 [00:00<00:11, 13.94it/s][A
Epoch 8:  56%|█████▌    | 3326/5971 [30:50<24:30,  1.80it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   7%|▋         | 11/167 [00:00<00:08, 17.36it/s][A
Epoch 8:  56%|█████▌    | 3329/5971 [30:50<24:28,  1.80it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   8%|▊         | 14/167 [00:01<00:07, 19.58it/s][A
Epoch 8:  56%|█████▌    | 3332/5971 [30:50<24:25,  1.80it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  11%|█         | 18/167 [00:01<00:06, 22.59it/s][A
Epoch 8:  56%|█████▌    | 3335/5971 [30:50<24:22,  1.80it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  13%|█▎        | 21/167 [00:01<00:06, 23.59it/s][A
Epoch 8:  56%|█████▌    | 3338/5971 [30:50<24:19,  1.80it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  14%|█▍        | 24/167 [00:01<00:05, 24.54it/s][A
Epoch 8:  56%|█████▌    | 3341/5971 [30:50<24:16,  1.81it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  16%|█▌        | 27/167 [00:01<00:05, 25.42it/s][A
Epoch 8:  56%|█████▌    | 3344/5971 [30:50<24:13,  1.81it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  18%|█▊        | 30/167 [00:01<00:05, 25.13it/s][A
Epoch 8:  56%|█████▌    | 3347/5971 [30:50<24:10,  1.81it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  20%|█▉        | 33/167 [00:01<00:05, 25.76it/s][A
Epoch 8:  56%|█████▌    | 3350/5971 [30:51<24:07,  1.81it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  22%|██▏       | 36/167 [00:01<00:05, 25.20it/s][A
Epoch 8:  56%|█████▌    | 3353/5971 [30:51<24:04,  1.81it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  23%|██▎       | 39/167 [00:01<00:04, 26.46it/s][A
Epoch 8:  56%|█████▌    | 3356/5971 [30:51<24:02,  1.81it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  25%|██▌       | 42/167 [00:02<00:04, 27.21it/s][A
Epoch 8:  56%|█████▋    | 3359/5971 [30:51<23:59,  1.81it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  27%|██▋       | 45/167 [00:02<00:04, 26.79it/s][A
Epoch 8:  56%|█████▋    | 3362/5971 [30:51<23:56,  1.82it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  29%|██▉       | 49/167 [00:02<00:04, 27.81it/s][A
Epoch 8:  56%|█████▋    | 3366/5971 [30:51<23:52,  1.82it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  31%|███       | 52/167 [00:02<00:04, 27.51it/s][A
Epoch 8:  56%|█████▋    | 3370/5971 [30:51<23:48,  1.82it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  33%|███▎      | 55/167 [00:02<00:04, 27.54it/s][A
Epoch 8:  57%|█████▋    | 3374/5971 [30:51<23:45,  1.82it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  35%|███▍      | 58/167 [00:02<00:03, 27.48it/s][A

Validating:  37%|███▋      | 61/167 [00:02<00:03, 28.10it/s][A
Epoch 8:  57%|█████▋    | 3378/5971 [30:52<23:41,  1.82it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  38%|███▊      | 64/167 [00:02<00:03, 28.63it/s][A
Epoch 8:  57%|█████▋    | 3382/5971 [30:52<23:37,  1.83it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  40%|████      | 67/167 [00:03<00:03, 27.19it/s][A
Epoch 8:  57%|█████▋    | 3386/5971 [30:52<23:33,  1.83it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  42%|████▏     | 70/167 [00:03<00:03, 27.48it/s][A

Validating:  44%|████▎     | 73/167 [00:03<00:03, 27.89it/s][A
Epoch 8:  57%|█████▋    | 3390/5971 [30:52<23:30,  1.83it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  46%|████▌     | 76/167 [00:03<00:03, 26.88it/s][A
Epoch 8:  57%|█████▋    | 3394/5971 [30:52<23:26,  1.83it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  47%|████▋     | 79/167 [00:03<00:03, 27.21it/s][A
Epoch 8:  57%|█████▋    | 3398/5971 [30:52<23:22,  1.83it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  49%|████▉     | 82/167 [00:03<00:03, 27.87it/s][A

Validating:  51%|█████     | 85/167 [00:03<00:02, 28.01it/s][A
Epoch 8:  57%|█████▋    | 3402/5971 [30:52<23:18,  1.84it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  53%|█████▎    | 88/167 [00:03<00:02, 28.42it/s][A
Epoch 8:  57%|█████▋    | 3406/5971 [30:53<23:15,  1.84it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  54%|█████▍    | 91/167 [00:03<00:02, 26.20it/s][A
Epoch 8:  57%|█████▋    | 3410/5971 [30:53<23:11,  1.84it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 25.91it/s][A
Epoch 8:  57%|█████▋    | 3414/5971 [30:53<23:07,  1.84it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 27.34it/s][A

Validating:  60%|██████    | 101/167 [00:04<00:02, 27.27it/s][A
Epoch 8:  57%|█████▋    | 3418/5971 [30:53<23:04,  1.84it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  62%|██████▏   | 104/167 [00:04<00:02, 26.37it/s][A
Epoch 8:  57%|█████▋    | 3422/5971 [30:53<23:00,  1.85it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 26.56it/s][A
Epoch 8:  57%|█████▋    | 3426/5971 [30:53<22:56,  1.85it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 26.93it/s][A

Validating:  68%|██████▊   | 113/167 [00:04<00:01, 27.14it/s][A
Epoch 8:  57%|█████▋    | 3430/5971 [30:54<22:53,  1.85it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  69%|██████▉   | 116/167 [00:04<00:01, 26.94it/s][A
Epoch 8:  58%|█████▊    | 3434/5971 [30:54<22:49,  1.85it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  71%|███████▏  | 119/167 [00:04<00:01, 26.38it/s][A
Epoch 8:  58%|█████▊    | 3438/5971 [30:54<22:45,  1.85it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  73%|███████▎  | 122/167 [00:05<00:01, 26.71it/s][A

Validating:  75%|███████▍  | 125/167 [00:05<00:01, 26.45it/s][A
Epoch 8:  58%|█████▊    | 3442/5971 [30:54<22:42,  1.86it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  77%|███████▋  | 128/167 [00:05<00:01, 26.64it/s]
Epoch 8:  58%|█████▊    | 3446/5971 [30:54<22:38,  1.86it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  78%|███████▊  | 131/167 [00:05<00:01, 26.79it/s]
Epoch 8:  58%|█████▊    | 3450/5971 [30:54<22:34,  1.86it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  80%|████████  | 134/167 [00:05<00:01, 27.61it/s]

Validating:  82%|████████▏ | 137/167 [00:05<00:01, 26.80it/s]
Epoch 8:  58%|█████▊    | 3454/5971 [30:54<22:31,  1.86it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  84%|████████▍ | 140/167 [00:05<00:00, 27.07it/s]
Epoch 8:  58%|█████▊    | 3458/5971 [30:55<22:27,  1.86it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  86%|████████▌ | 143/167 [00:05<00:00, 27.69it/s]
Epoch 8:  58%|█████▊    | 3462/5971 [30:55<22:24,  1.87it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  87%|████████▋ | 146/167 [00:05<00:00, 26.42it/s]

Validating:  89%|████████▉ | 149/167 [00:06<00:00, 26.94it/s]
Epoch 8:  58%|█████▊    | 3466/5971 [30:55<22:20,  1.87it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 26.58it/s]
Epoch 8:  58%|█████▊    | 3470/5971 [30:55<22:16,  1.87it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 25.90it/s]
Epoch 8:  58%|█████▊    | 3474/5971 [30:55<22:13,  1.87it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 25.91it/s]
Epoch 8:  58%|█████▊    | 3478/5971 [30:55<22:09,  1.87it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 27.36it/s]
Epoch 8:  58%|█████▊    | 3482/5971 [30:55<22:06,  1.88it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  99%|█████████▉| 166/167 [00:06<00:00, 28.49it/s]
Epoch 8:  58%|█████▊    | 3484/5971 [30:56<22:04,  1.88it/s, loss=0.197, v_num=0, train/loss_simple_step=0.812, train/loss_vlb_step=0.0466, train/loss_step=0.812, global_step=4924.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Epoch 8:  58%|█████▊    | 3485/5971 [30:57<22:04,  1.88it/s, loss=0.196, v_num=0, train/loss_simple_step=0.00259, train/loss_vlb_step=1.49e-5, train/loss_step=0.00259, global_step=4925.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  58%|█████▊    | 3486/5971 [30:58<22:04,  1.88it/s, loss=0.196, v_num=0, train/loss_simple_step=0.00259, train/loss_vlb_step=1.49e-5, train/loss_step=0.00259, global_step=4925.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  58%|█████▊    | 3486/5971 [30:58<22:04,  1.88it/s, loss=0.2, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000567, train/loss_step=0.168, global_step=4925.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]     
Epoch 8:  58%|█████▊    | 3487/5971 [30:59<22:03,  1.88it/s, loss=0.21, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.00088, train/loss_step=0.235, global_step=4925.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  58%|█████▊    | 3488/5971 [31:01<22:04,  1.87it/s, loss=0.194, v_num=0, train/loss_simple_step=0.111, train/loss_vlb_step=0.000366, train/loss_step=0.111, global_step=4925.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  58%|█████▊    | 3489/5971 [31:02<22:04,  1.87it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.87e-5, train/loss_step=0.0105, global_step=4926.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  58%|█████▊    | 3490/5971 [31:03<22:04,  1.87it/s, loss=0.195, v_num=0, train/loss_simple_step=0.0105, train/loss_vlb_step=4.87e-5, train/loss_step=0.0105, global_step=4926.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  58%|█████▊    | 3490/5971 [31:03<22:04,  1.87it/s, loss=0.196, v_num=0, train/loss_simple_step=0.0936, train/loss_vlb_step=0.00031, train/loss_step=0.0936, global_step=4926.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  58%|█████▊    | 3491/5971 [31:04<22:03,  1.87it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0252, train/loss_vlb_step=9.51e-5, train/loss_step=0.0252, global_step=4926.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  58%|█████▊    | 3492/5971 [31:06<22:04,  1.87it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0366, train/loss_vlb_step=0.000138, train/loss_step=0.0366, global_step=4926.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  58%|█████▊    | 3493/5971 [31:07<22:04,  1.87it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00356, train/loss_vlb_step=1.91e-5, train/loss_step=0.00356, global_step=4927.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▊    | 3494/5971 [31:08<22:03,  1.87it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00356, train/loss_vlb_step=1.91e-5, train/loss_step=0.00356, global_step=4927.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▊    | 3494/5971 [31:08<22:03,  1.87it/s, loss=0.15, v_num=0, train/loss_simple_step=0.353, train/loss_vlb_step=0.00206, train/loss_step=0.353, global_step=4927.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]     
Epoch 8:  59%|█████▊    | 3495/5971 [31:08<22:03,  1.87it/s, loss=0.151, v_num=0, train/loss_simple_step=0.0273, train/loss_vlb_step=0.000105, train/loss_step=0.0273, global_step=4927.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▊    | 3496/5971 [31:11<22:04,  1.87it/s, loss=0.158, v_num=0, train/loss_simple_step=0.161, train/loss_vlb_step=0.00057, train/loss_step=0.161, global_step=4927.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  59%|█████▊    | 3497/5971 [31:12<22:04,  1.87it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0569, train/loss_vlb_step=0.000192, train/loss_step=0.0569, global_step=4928.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▊    | 3498/5971 [31:13<22:03,  1.87it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0569, train/loss_vlb_step=0.000192, train/loss_step=0.0569, global_step=4928.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▊    | 3498/5971 [31:13<22:03,  1.87it/s, loss=0.169, v_num=0, train/loss_simple_step=0.170, train/loss_vlb_step=0.000563, train/loss_step=0.170, global_step=4928.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  59%|█████▊    | 3499/5971 [31:14<22:03,  1.87it/s, loss=0.171, v_num=0, train/loss_simple_step=0.0561, train/loss_vlb_step=0.000188, train/loss_step=0.0561, global_step=4928.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▊    | 3500/5971 [31:16<22:04,  1.87it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0176, train/loss_vlb_step=7.61e-5, train/loss_step=0.0176, global_step=4928.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  59%|█████▊    | 3501/5971 [31:17<22:03,  1.87it/s, loss=0.17, v_num=0, train/loss_simple_step=0.341, train/loss_vlb_step=0.00208, train/loss_step=0.341, global_step=4929.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  59%|█████▊    | 3502/5971 [31:18<22:03,  1.87it/s, loss=0.17, v_num=0, train/loss_simple_step=0.341, train/loss_vlb_step=0.00208, train/loss_step=0.341, global_step=4929.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▊    | 3502/5971 [31:18<22:03,  1.87it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0038, train/loss_vlb_step=1.89e-5, train/loss_step=0.0038, global_step=4929.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▊    | 3503/5971 [31:18<22:03,  1.86it/s, loss=0.139, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000331, train/loss_step=0.101, global_step=4929.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  59%|█████▊    | 3504/5971 [31:21<22:04,  1.86it/s, loss=0.104, v_num=0, train/loss_simple_step=0.100, train/loss_vlb_step=0.00033, train/loss_step=0.100, global_step=4929.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  59%|█████▊    | 3505/5971 [31:22<22:03,  1.86it/s, loss=0.12, v_num=0, train/loss_simple_step=0.335, train/loss_vlb_step=0.00159, train/loss_step=0.335, global_step=4930.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  59%|█████▊    | 3506/5971 [31:22<22:03,  1.86it/s, loss=0.12, v_num=0, train/loss_simple_step=0.335, train/loss_vlb_step=0.00159, train/loss_step=0.335, global_step=4930.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▊    | 3506/5971 [31:22<22:03,  1.86it/s, loss=0.126, v_num=0, train/loss_simple_step=0.275, train/loss_vlb_step=0.00109, train/loss_step=0.275, global_step=4930.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▊    | 3507/5971 [31:23<22:03,  1.86it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0659, train/loss_vlb_step=0.000221, train/loss_step=0.0659, global_step=4930.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3508/5971 [31:25<22:03,  1.86it/s, loss=0.131, v_num=0, train/loss_simple_step=0.388, train/loss_vlb_step=0.00231, train/loss_step=0.388, global_step=4930.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  59%|█████▉    | 3509/5971 [31:26<22:03,  1.86it/s, loss=0.151, v_num=0, train/loss_simple_step=0.400, train/loss_vlb_step=0.00209, train/loss_step=0.400, global_step=4931.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3510/5971 [31:27<22:03,  1.86it/s, loss=0.151, v_num=0, train/loss_simple_step=0.400, train/loss_vlb_step=0.00209, train/loss_step=0.400, global_step=4931.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3510/5971 [31:27<22:03,  1.86it/s, loss=0.146, v_num=0, train/loss_simple_step=0.00542, train/loss_vlb_step=2.62e-5, train/loss_step=0.00542, global_step=4931.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3511/5971 [31:28<22:02,  1.86it/s, loss=0.152, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000436, train/loss_step=0.133, global_step=4931.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  59%|█████▉    | 3512/5971 [31:30<22:03,  1.86it/s, loss=0.161, v_num=0, train/loss_simple_step=0.229, train/loss_vlb_step=0.000831, train/loss_step=0.229, global_step=4931.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3513/5971 [31:31<22:03,  1.86it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00483, train/loss_vlb_step=2.34e-5, train/loss_step=0.00483, global_step=4932.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3514/5971 [31:32<22:02,  1.86it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00483, train/loss_vlb_step=2.34e-5, train/loss_step=0.00483, global_step=4932.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3514/5971 [31:32<22:02,  1.86it/s, loss=0.147, v_num=0, train/loss_simple_step=0.0729, train/loss_vlb_step=0.000248, train/loss_step=0.0729, global_step=4932.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  59%|█████▉    | 3515/5971 [31:33<22:02,  1.86it/s, loss=0.147, v_num=0, train/loss_simple_step=0.015, train/loss_vlb_step=6.29e-5, train/loss_step=0.015, global_step=4932.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  59%|█████▉    | 3516/5971 [31:35<22:03,  1.86it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00949, train/loss_vlb_step=4.27e-5, train/loss_step=0.00949, global_step=4932.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3517/5971 [31:36<22:03,  1.85it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00457, train/loss_vlb_step=2.44e-5, train/loss_step=0.00457, global_step=4933.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3518/5971 [31:37<22:02,  1.85it/s, loss=0.136, v_num=0, train/loss_simple_step=0.00457, train/loss_vlb_step=2.44e-5, train/loss_step=0.00457, global_step=4933.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3518/5971 [31:37<22:02,  1.85it/s, loss=0.15, v_num=0, train/loss_simple_step=0.439, train/loss_vlb_step=0.00258, train/loss_step=0.439, global_step=4933.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]     
Epoch 8:  59%|█████▉    | 3519/5971 [31:38<22:02,  1.85it/s, loss=0.147, v_num=0, train/loss_simple_step=0.00189, train/loss_vlb_step=1.1e-5, train/loss_step=0.00189, global_step=4933.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3520/5971 [31:40<22:02,  1.85it/s, loss=0.166, v_num=0, train/loss_simple_step=0.387, train/loss_vlb_step=0.00408, train/loss_step=0.387, global_step=4933.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  59%|█████▉    | 3521/5971 [31:41<22:02,  1.85it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0187, train/loss_vlb_step=7.53e-5, train/loss_step=0.0187, global_step=4934.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3522/5971 [31:42<22:02,  1.85it/s, loss=0.149, v_num=0, train/loss_simple_step=0.0187, train/loss_vlb_step=7.53e-5, train/loss_step=0.0187, global_step=4934.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3522/5971 [31:42<22:02,  1.85it/s, loss=0.15, v_num=0, train/loss_simple_step=0.00935, train/loss_vlb_step=4.51e-5, train/loss_step=0.00935, global_step=4934.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3523/5971 [31:43<22:02,  1.85it/s, loss=0.159, v_num=0, train/loss_simple_step=0.277, train/loss_vlb_step=0.0011, train/loss_step=0.277, global_step=4934.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  59%|█████▉    | 3524/5971 [31:45<22:02,  1.85it/s, loss=0.173, v_num=0, train/loss_simple_step=0.385, train/loss_vlb_step=0.00235, train/loss_step=0.385, global_step=4934.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3525/5971 [31:46<22:02,  1.85it/s, loss=0.172, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00124, train/loss_step=0.308, global_step=4935.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3526/5971 [31:47<22:02,  1.85it/s, loss=0.172, v_num=0, train/loss_simple_step=0.308, train/loss_vlb_step=0.00124, train/loss_step=0.308, global_step=4935.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3526/5971 [31:47<22:02,  1.85it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0975, train/loss_vlb_step=0.000327, train/loss_step=0.0975, global_step=4935.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3527/5971 [31:48<22:01,  1.85it/s, loss=0.161, v_num=0, train/loss_simple_step=0.0332, train/loss_vlb_step=0.000129, train/loss_step=0.0332, global_step=4935.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3528/5971 [31:50<22:02,  1.85it/s, loss=0.184, v_num=0, train/loss_simple_step=0.843, train/loss_vlb_step=0.107, train/loss_step=0.843, global_step=4935.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]     
Epoch 8:  59%|█████▉    | 3529/5971 [31:51<22:02,  1.85it/s, loss=0.165, v_num=0, train/loss_simple_step=0.030, train/loss_vlb_step=0.000109, train/loss_step=0.030, global_step=4936.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3530/5971 [31:52<22:01,  1.85it/s, loss=0.165, v_num=0, train/loss_simple_step=0.030, train/loss_vlb_step=0.000109, train/loss_step=0.030, global_step=4936.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3530/5971 [31:52<22:01,  1.85it/s, loss=0.179, v_num=0, train/loss_simple_step=0.279, train/loss_vlb_step=0.00121, train/loss_step=0.279, global_step=4936.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  59%|█████▉    | 3531/5971 [31:52<22:01,  1.85it/s, loss=0.178, v_num=0, train/loss_simple_step=0.109, train/loss_vlb_step=0.000362, train/loss_step=0.109, global_step=4936.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3532/5971 [31:55<22:02,  1.84it/s, loss=0.166, v_num=0, train/loss_simple_step=0.00429, train/loss_vlb_step=2.19e-5, train/loss_step=0.00429, global_step=4936.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3533/5971 [31:55<22:01,  1.84it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0825, train/loss_vlb_step=0.000282, train/loss_step=0.0825, global_step=4937.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  59%|█████▉    | 3534/5971 [31:56<22:01,  1.84it/s, loss=0.17, v_num=0, train/loss_simple_step=0.0825, train/loss_vlb_step=0.000282, train/loss_step=0.0825, global_step=4937.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3534/5971 [31:56<22:01,  1.84it/s, loss=0.206, v_num=0, train/loss_simple_step=0.790, train/loss_vlb_step=0.0579, train/loss_step=0.790, global_step=4937.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  59%|█████▉    | 3535/5971 [31:57<22:01,  1.84it/s, loss=0.208, v_num=0, train/loss_simple_step=0.0604, train/loss_vlb_step=0.000206, train/loss_step=0.0604, global_step=4937.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3536/5971 [31:59<22:01,  1.84it/s, loss=0.21, v_num=0, train/loss_simple_step=0.0468, train/loss_vlb_step=0.000164, train/loss_step=0.0468, global_step=4937.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  59%|█████▉    | 3537/5971 [32:00<22:01,  1.84it/s, loss=0.231, v_num=0, train/loss_simple_step=0.420, train/loss_vlb_step=0.00309, train/loss_step=0.420, global_step=4938.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  59%|█████▉    | 3538/5971 [32:01<22:01,  1.84it/s, loss=0.231, v_num=0, train/loss_simple_step=0.420, train/loss_vlb_step=0.00309, train/loss_step=0.420, global_step=4938.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3538/5971 [32:01<22:01,  1.84it/s, loss=0.217, v_num=0, train/loss_simple_step=0.166, train/loss_vlb_step=0.000562, train/loss_step=0.166, global_step=4938.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3539/5971 [32:02<22:00,  1.84it/s, loss=0.219, v_num=0, train/loss_simple_step=0.0243, train/loss_vlb_step=9.85e-5, train/loss_step=0.0243, global_step=4938.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3540/5971 [32:04<22:01,  1.84it/s, loss=0.199, v_num=0, train/loss_simple_step=0.00405, train/loss_vlb_step=2.07e-5, train/loss_step=0.00405, global_step=4938.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3541/5971 [32:05<22:01,  1.84it/s, loss=0.218, v_num=0, train/loss_simple_step=0.395, train/loss_vlb_step=0.00246, train/loss_step=0.395, global_step=4939.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  59%|█████▉    | 3542/5971 [32:06<22:00,  1.84it/s, loss=0.218, v_num=0, train/loss_simple_step=0.395, train/loss_vlb_step=0.00246, train/loss_step=0.395, global_step=4939.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3542/5971 [32:06<22:00,  1.84it/s, loss=0.22, v_num=0, train/loss_simple_step=0.0492, train/loss_vlb_step=0.00017, train/loss_step=0.0492, global_step=4939.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3543/5971 [32:07<22:00,  1.84it/s, loss=0.217, v_num=0, train/loss_simple_step=0.222, train/loss_vlb_step=0.000806, train/loss_step=0.222, global_step=4939.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3544/5971 [32:09<22:01,  1.84it/s, loss=0.199, v_num=0, train/loss_simple_step=0.00668, train/loss_vlb_step=3.23e-5, train/loss_step=0.00668, global_step=4939.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3545/5971 [32:10<22:00,  1.84it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0421, train/loss_vlb_step=0.000159, train/loss_step=0.0421, global_step=4940.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  59%|█████▉    | 3546/5971 [32:11<22:00,  1.84it/s, loss=0.185, v_num=0, train/loss_simple_step=0.0421, train/loss_vlb_step=0.000159, train/loss_step=0.0421, global_step=4940.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3546/5971 [32:11<22:00,  1.84it/s, loss=0.181, v_num=0, train/loss_simple_step=0.0183, train/loss_vlb_step=7.64e-5, train/loss_step=0.0183, global_step=4940.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  59%|█████▉    | 3547/5971 [32:12<22:00,  1.84it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0698, train/loss_vlb_step=0.000235, train/loss_step=0.0698, global_step=4940.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3548/5971 [32:14<22:00,  1.83it/s, loss=0.141, v_num=0, train/loss_simple_step=0.00186, train/loss_vlb_step=1.11e-5, train/loss_step=0.00186, global_step=4940.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3549/5971 [32:15<22:00,  1.83it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00943, train/loss_vlb_step=4.41e-5, train/loss_step=0.00943, global_step=4941.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  59%|█████▉    | 3550/5971 [32:16<22:00,  1.83it/s, loss=0.14, v_num=0, train/loss_simple_step=0.00943, train/loss_vlb_step=4.41e-5, train/loss_step=0.00943, global_step=4941.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3550/5971 [32:16<22:00,  1.83it/s, loss=0.131, v_num=0, train/loss_simple_step=0.0941, train/loss_vlb_step=0.00031, train/loss_step=0.0941, global_step=4941.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  59%|█████▉    | 3551/5971 [32:17<21:59,  1.83it/s, loss=0.126, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=4.93e-5, train/loss_step=0.0113, global_step=4941.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  59%|█████▉    | 3552/5971 [32:19<22:00,  1.83it/s, loss=0.151, v_num=0, train/loss_simple_step=0.503, train/loss_vlb_step=0.00358, train/loss_step=0.503, global_step=4941.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  60%|█████▉    | 3553/5971 [32:20<22:00,  1.83it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0968, train/loss_vlb_step=0.000324, train/loss_step=0.0968, global_step=4942.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  60%|█████▉    | 3554/5971 [32:21<21:59,  1.83it/s, loss=0.152, v_num=0, train/loss_simple_step=0.0968, train/loss_vlb_step=0.000324, train/loss_step=0.0968, global_step=4942.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  60%|█████▉    | 3554/5971 [32:21<21:59,  1.83it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0225, train/loss_vlb_step=8.89e-5, train/loss_step=0.0225, global_step=4942.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  60%|█████▉    | 3555/5971 [32:22<21:59,  1.83it/s, loss=0.111, v_num=0, train/loss_simple_step=0.00778, train/loss_vlb_step=3.72e-5, train/loss_step=0.00778, global_step=4942.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  60%|█████▉    | 3556/5971 [32:24<22:00,  1.83it/s, loss=0.123, v_num=0, train/loss_simple_step=0.303, train/loss_vlb_step=0.00133, train/loss_step=0.303, global_step=4942.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  60%|█████▉    | 3557/5971 [32:25<21:59,  1.83it/s, loss=0.111, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000615, train/loss_step=0.175, global_step=4943.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  60%|█████▉    | 3558/5971 [32:26<21:59,  1.83it/s, loss=0.111, v_num=0, train/loss_simple_step=0.175, train/loss_vlb_step=0.000615, train/loss_step=0.175, global_step=4943.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  60%|█████▉    | 3558/5971 [32:26<21:59,  1.83it/s, loss=0.103, v_num=0, train/loss_simple_step=0.00291, train/loss_vlb_step=1.56e-5, train/loss_step=0.00291, global_step=4943.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  60%|█████▉    | 3559/5971 [32:27<21:59,  1.83it/s, loss=0.105, v_num=0, train/loss_simple_step=0.0542, train/loss_vlb_step=0.000184, train/loss_step=0.0542, global_step=4943.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  60%|█████▉    | 3560/5971 [32:29<21:59,  1.83it/s, loss=0.105, v_num=0, train/loss_simple_step=0.00767, train/loss_vlb_step=3.59e-5, train/loss_step=0.00767, global_step=4943.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  60%|█████▉    | 3561/5971 [32:30<21:59,  1.83it/s, loss=0.0866, v_num=0, train/loss_simple_step=0.0337, train/loss_vlb_step=0.000123, train/loss_step=0.0337, global_step=4944.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  60%|█████▉    | 3562/5971 [32:31<21:59,  1.83it/s, loss=0.0866, v_num=0, train/loss_simple_step=0.0337, train/loss_vlb_step=0.000123, train/loss_step=0.0337, global_step=4944.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  60%|█████▉    | 3562/5971 [32:31<21:59,  1.83it/s, loss=0.103, v_num=0, train/loss_simple_step=0.377, train/loss_vlb_step=0.00199, train/loss_step=0.377, global_step=4944.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  60%|█████▉    | 3563/5971 [32:31<21:58,  1.83it/s, loss=0.11, v_num=0, train/loss_simple_step=0.367, train/loss_vlb_step=0.00208, train/loss_step=0.367, global_step=4944.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  60%|█████▉    | 3564/5971 [32:34<21:59,  1.82it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0395, train/loss_vlb_step=0.000151, train/loss_step=0.0395, global_step=4944.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  60%|█████▉    | 3565/5971 [32:35<21:59,  1.82it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0301, train/loss_vlb_step=0.000119, train/loss_step=0.0301, global_step=4945.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  60%|█████▉    | 3566/5971 [32:35<21:58,  1.82it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0301, train/loss_vlb_step=0.000119, train/loss_step=0.0301, global_step=4945.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  60%|█████▉    | 3566/5971 [32:35<21:58,  1.82it/s, loss=0.11, v_num=0, train/loss_simple_step=0.00196, train/loss_vlb_step=1.15e-5, train/loss_step=0.00196, global_step=4945.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  60%|█████▉    | 3567/5971 [32:36<21:58,  1.82it/s, loss=0.113, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.0004, train/loss_step=0.121, global_step=4945.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  60%|█████▉    | 3568/5971 [32:38<21:58,  1.82it/s, loss=0.139, v_num=0, train/loss_simple_step=0.523, train/loss_vlb_step=0.00543, train/loss_step=0.523, global_step=4945.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  60%|█████▉    | 3569/5971 [32:39<21:58,  1.82it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00256, train/loss_vlb_step=1.35e-5, train/loss_step=0.00256, global_step=4946.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  60%|█████▉    | 3570/5971 [32:40<21:58,  1.82it/s, loss=0.139, v_num=0, train/loss_simple_step=0.00256, train/loss_vlb_step=1.35e-5, train/loss_step=0.00256, global_step=4946.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  60%|█████▉    | 3570/5971 [32:40<21:58,  1.82it/s, loss=0.134, v_num=0, train/loss_simple_step=0.0047, train/loss_vlb_step=2.3e-5, train/loss_step=0.0047, global_step=4946.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  60%|█████▉    | 3571/5971 [32:41<21:57,  1.82it/s, loss=0.173, v_num=0, train/loss_simple_step=0.796, train/loss_vlb_step=0.0345, train/loss_step=0.796, global_step=4946.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  60%|█████▉    | 3572/5971 [32:43<21:58,  1.82it/s, loss=0.165, v_num=0, train/loss_simple_step=0.334, train/loss_vlb_step=0.00155, train/loss_step=0.334, global_step=4946.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  60%|█████▉    | 3573/5971 [32:44<21:58,  1.82it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0585, train/loss_vlb_step=0.000197, train/loss_step=0.0585, global_step=4947.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  60%|█████▉    | 3574/5971 [32:45<21:57,  1.82it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0585, train/loss_vlb_step=0.000197, train/loss_step=0.0585, global_step=4947.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  60%|█████▉    | 3574/5971 [32:45<21:57,  1.82it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0244, train/loss_vlb_step=9.88e-5, train/loss_step=0.0244, global_step=4947.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  60%|█████▉    | 3575/5971 [32:46<21:57,  1.82it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0409, train/loss_vlb_step=0.000146, train/loss_step=0.0409, global_step=4947.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  60%|█████▉    | 3576/5971 [32:48<21:58,  1.82it/s, loss=0.153, v_num=0, train/loss_simple_step=0.0645, train/loss_vlb_step=0.000218, train/loss_step=0.0645, global_step=4947.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  60%|█████▉    | 3577/5971 [32:49<21:57,  1.82it/s, loss=0.156, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.000927, train/loss_step=0.247, global_step=4948.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  60%|█████▉    | 3578/5971 [32:50<21:57,  1.82it/s, loss=0.156, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.000927, train/loss_step=0.247, global_step=4948.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  60%|█████▉    | 3578/5971 [32:50<21:57,  1.82it/s, loss=0.168, v_num=0, train/loss_simple_step=0.242, train/loss_vlb_step=0.00108, train/loss_step=0.242, global_step=4948.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  60%|█████▉    | 3579/5971 [32:51<21:57,  1.82it/s, loss=0.168, v_num=0, train/loss_simple_step=0.042, train/loss_vlb_step=0.000151, train/loss_step=0.042, global_step=4948.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  60%|█████▉    | 3580/5971 [32:53<21:57,  1.81it/s, loss=0.187, v_num=0, train/loss_simple_step=0.388, train/loss_vlb_step=0.00227, train/loss_step=0.388, global_step=4948.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  60%|█████▉    | 3581/5971 [32:54<21:57,  1.81it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0259, train/loss_vlb_step=9.8e-5, train/loss_step=0.0259, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  60%|█████▉    | 3582/5971 [32:55<21:56,  1.81it/s, loss=0.186, v_num=0, train/loss_simple_step=0.0259, train/loss_vlb_step=9.8e-5, train/loss_step=0.0259, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  60%|█████▉    | 3582/5971 [32:55<21:56,  1.81it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00379, train/loss_vlb_step=2.03e-5, train/loss_step=0.00379, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  60%|██████    | 3583/5971 [32:55<21:56,  1.81it/s, loss=0.16, v_num=0, train/loss_simple_step=0.201, train/loss_vlb_step=0.000833, train/loss_step=0.201, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  60%|██████    | 3584/5971 [32:58<21:57,  1.81it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/167 [00:00<?, ?it/s]

Validating:   1%|          | 1/167 [00:00<02:30,  1.10it/s]
Epoch 8:  60%|██████    | 3586/5971 [32:59<21:55,  1.81it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   1%|          | 2/167 [00:01<01:12,  2.27it/s]

Validating:   3%|▎         | 5/167 [00:01<00:24,  6.56it/s]
Epoch 8:  60%|██████    | 3590/5971 [32:59<21:52,  1.81it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   5%|▍         | 8/167 [00:01<00:15, 10.18it/s]
Epoch 8:  60%|██████    | 3594/5971 [32:59<21:48,  1.82it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   7%|▋         | 11/167 [00:01<00:11, 13.12it/s]
Epoch 8:  60%|██████    | 3598/5971 [32:59<21:45,  1.82it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   8%|▊         | 14/167 [00:01<00:09, 15.52it/s]

Validating:  10%|█         | 17/167 [00:01<00:08, 18.38it/s]
Epoch 8:  60%|██████    | 3602/5971 [32:59<21:41,  1.82it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  12%|█▏        | 20/167 [00:01<00:07, 20.97it/s]
Epoch 8:  60%|██████    | 3606/5971 [33:00<21:38,  1.82it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 22.60it/s]
Epoch 8:  60%|██████    | 3610/5971 [33:00<21:34,  1.82it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  16%|█▌        | 26/167 [00:01<00:05, 23.60it/s]

Validating:  17%|█▋        | 29/167 [00:02<00:05, 24.16it/s]
Epoch 8:  61%|██████    | 3614/5971 [33:00<21:31,  1.83it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  19%|█▉        | 32/167 [00:02<00:05, 25.34it/s]
Epoch 8:  61%|██████    | 3618/5971 [33:00<21:27,  1.83it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  21%|██        | 35/167 [00:02<00:05, 24.66it/s]
Epoch 8:  61%|██████    | 3622/5971 [33:00<21:24,  1.83it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  23%|██▎       | 38/167 [00:02<00:05, 24.89it/s]

Validating:  25%|██▍       | 41/167 [00:02<00:04, 25.55it/s]
Epoch 8:  61%|██████    | 3626/5971 [33:00<21:20,  1.83it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  26%|██▋       | 44/167 [00:02<00:04, 26.28it/s]
Epoch 8:  61%|██████    | 3630/5971 [33:01<21:17,  1.83it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  28%|██▊       | 47/167 [00:02<00:04, 25.74it/s]
Epoch 8:  61%|██████    | 3634/5971 [33:01<21:13,  1.83it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 25.95it/s]

Validating:  32%|███▏      | 53/167 [00:03<00:04, 25.53it/s]
Epoch 8:  61%|██████    | 3638/5971 [33:01<21:10,  1.84it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  34%|███▎      | 56/167 [00:03<00:04, 26.48it/s]
Epoch 8:  61%|██████    | 3642/5971 [33:01<21:06,  1.84it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  35%|███▌      | 59/167 [00:03<00:04, 26.69it/s]
Epoch 8:  61%|██████    | 3646/5971 [33:01<21:03,  1.84it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  37%|███▋      | 62/167 [00:03<00:04, 25.24it/s]

Validating:  39%|███▉      | 65/167 [00:03<00:04, 24.28it/s]
Epoch 8:  61%|██████    | 3650/5971 [33:01<20:59,  1.84it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  41%|████      | 68/167 [00:03<00:04, 22.77it/s]
Epoch 8:  61%|██████    | 3654/5971 [33:01<20:56,  1.84it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  43%|████▎     | 71/167 [00:03<00:04, 22.58it/s]
Epoch 8:  61%|██████▏   | 3658/5971 [33:02<20:52,  1.85it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 23.37it/s]
Epoch 8:  61%|██████▏   | 3662/5971 [33:02<20:49,  1.85it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  47%|████▋     | 78/167 [00:04<00:03, 25.24it/s]

Validating:  49%|████▊     | 81/167 [00:04<00:03, 25.05it/s]
Epoch 8:  61%|██████▏   | 3666/5971 [33:02<20:46,  1.85it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  50%|█████     | 84/167 [00:04<00:03, 25.90it/s]
Epoch 8:  61%|██████▏   | 3670/5971 [33:02<20:42,  1.85it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  52%|█████▏    | 87/167 [00:04<00:03, 26.53it/s]
Epoch 8:  62%|██████▏   | 3674/5971 [33:02<20:39,  1.85it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  54%|█████▍    | 90/167 [00:04<00:02, 26.28it/s]

Validating:  56%|█████▌    | 93/167 [00:04<00:02, 26.63it/s]
Epoch 8:  62%|██████▏   | 3678/5971 [33:02<20:35,  1.86it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  57%|█████▋    | 96/167 [00:04<00:02, 27.06it/s]
Epoch 8:  62%|██████▏   | 3682/5971 [33:03<20:32,  1.86it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  59%|█████▉    | 99/167 [00:04<00:02, 27.47it/s]
Epoch 8:  62%|██████▏   | 3686/5971 [33:03<20:29,  1.86it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  61%|██████    | 102/167 [00:04<00:02, 26.73it/s]

Validating:  63%|██████▎   | 105/167 [00:05<00:02, 26.62it/s]
Epoch 8:  62%|██████▏   | 3690/5971 [33:03<20:25,  1.86it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  65%|██████▍   | 108/167 [00:05<00:02, 27.48it/s]
Epoch 8:  62%|██████▏   | 3694/5971 [33:03<20:22,  1.86it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  66%|██████▋   | 111/167 [00:05<00:02, 28.00it/s]
Epoch 8:  62%|██████▏   | 3698/5971 [33:03<20:18,  1.86it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  68%|██████▊   | 114/167 [00:05<00:01, 27.95it/s]

Validating:  70%|███████   | 117/167 [00:05<00:01, 28.40it/s]
Epoch 8:  62%|██████▏   | 3702/5971 [33:03<20:15,  1.87it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  72%|███████▏  | 120/167 [00:05<00:01, 27.82it/s]
Epoch 8:  62%|██████▏   | 3706/5971 [33:03<20:12,  1.87it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 27.94it/s]
Epoch 8:  62%|██████▏   | 3710/5971 [33:04<20:08,  1.87it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 27.42it/s]

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 27.74it/s]
Epoch 8:  62%|██████▏   | 3714/5971 [33:04<20:05,  1.87it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  79%|███████▉  | 132/167 [00:06<00:01, 28.20it/s]
Epoch 8:  62%|██████▏   | 3718/5971 [33:04<20:02,  1.87it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  81%|████████  | 135/167 [00:06<00:01, 27.47it/s]
Epoch 8:  62%|██████▏   | 3722/5971 [33:04<19:58,  1.88it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  83%|████████▎ | 138/167 [00:06<00:01, 27.39it/s]

Validating:  84%|████████▍ | 141/167 [00:06<00:00, 27.58it/s]
Epoch 8:  62%|██████▏   | 3726/5971 [33:04<19:55,  1.88it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  87%|████████▋ | 145/167 [00:06<00:00, 28.91it/s]
Epoch 8:  62%|██████▏   | 3730/5971 [33:04<19:52,  1.88it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  89%|████████▊ | 148/167 [00:06<00:00, 29.05it/s]
Epoch 8:  63%|██████▎   | 3734/5971 [33:04<19:48,  1.88it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  90%|█████████ | 151/167 [00:06<00:00, 28.50it/s]
Epoch 8:  63%|██████▎   | 3738/5971 [33:05<19:45,  1.88it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  92%|█████████▏| 154/167 [00:06<00:00, 28.03it/s]

Validating:  94%|█████████▍| 157/167 [00:06<00:00, 27.60it/s]
Epoch 8:  63%|██████▎   | 3742/5971 [33:05<19:42,  1.89it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  96%|█████████▌| 160/167 [00:07<00:00, 26.40it/s]
Epoch 8:  63%|██████▎   | 3746/5971 [33:05<19:38,  1.89it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  98%|█████████▊| 163/167 [00:07<00:00, 27.05it/s]
Epoch 8:  63%|██████▎   | 3750/5971 [33:05<19:35,  1.89it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  99%|█████████▉| 166/167 [00:07<00:00, 27.01it/s]
Epoch 8:  63%|██████▎   | 3752/5971 [33:05<19:34,  1.89it/s, loss=0.195, v_num=0, train/loss_simple_step=0.750, train/loss_vlb_step=0.0281, train/loss_step=0.750, global_step=4949.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Epoch 8:  63%|██████▎   | 3753/5971 [33:06<19:33,  1.89it/s, loss=0.194, v_num=0, train/loss_simple_step=0.00788, train/loss_vlb_step=3.65e-5, train/loss_step=0.00788, global_step=4950.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3754/5971 [33:07<19:33,  1.89it/s, loss=0.194, v_num=0, train/loss_simple_step=0.00788, train/loss_vlb_step=3.65e-5, train/loss_step=0.00788, global_step=4950.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3754/5971 [33:07<19:33,  1.89it/s, loss=0.221, v_num=0, train/loss_simple_step=0.536, train/loss_vlb_step=0.0044, train/loss_step=0.536, global_step=4950.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]     
Epoch 8:  63%|██████▎   | 3755/5971 [33:08<19:33,  1.89it/s, loss=0.215, v_num=0, train/loss_simple_step=0.00852, train/loss_vlb_step=4.02e-5, train/loss_step=0.00852, global_step=4950.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3756/5971 [33:10<19:33,  1.89it/s, loss=0.217, v_num=0, train/loss_simple_step=0.568, train/loss_vlb_step=0.00504, train/loss_step=0.568, global_step=4950.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  63%|██████▎   | 3757/5971 [33:11<19:33,  1.89it/s, loss=0.217, v_num=0, train/loss_simple_step=0.00307, train/loss_vlb_step=1.68e-5, train/loss_step=0.00307, global_step=4951.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3758/5971 [33:12<19:32,  1.89it/s, loss=0.217, v_num=0, train/loss_simple_step=0.00307, train/loss_vlb_step=1.68e-5, train/loss_step=0.00307, global_step=4951.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3758/5971 [33:12<19:32,  1.89it/s, loss=0.218, v_num=0, train/loss_simple_step=0.0121, train/loss_vlb_step=5.23e-5, train/loss_step=0.0121, global_step=4951.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  63%|██████▎   | 3759/5971 [33:13<19:32,  1.89it/s, loss=0.179, v_num=0, train/loss_simple_step=0.0238, train/loss_vlb_step=9.2e-5, train/loss_step=0.0238, global_step=4951.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  63%|██████▎   | 3760/5971 [33:15<19:33,  1.88it/s, loss=0.162, v_num=0, train/loss_simple_step=0.00132, train/loss_vlb_step=8.01e-6, train/loss_step=0.00132, global_step=4951.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3761/5971 [33:16<19:32,  1.88it/s, loss=0.164, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000276, train/loss_step=0.083, global_step=4952.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  63%|██████▎   | 3762/5971 [33:17<19:32,  1.88it/s, loss=0.164, v_num=0, train/loss_simple_step=0.083, train/loss_vlb_step=0.000276, train/loss_step=0.083, global_step=4952.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3762/5971 [33:17<19:32,  1.88it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00451, train/loss_vlb_step=2.18e-5, train/loss_step=0.00451, global_step=4952.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3763/5971 [33:18<19:32,  1.88it/s, loss=0.169, v_num=0, train/loss_simple_step=0.164, train/loss_vlb_step=0.000545, train/loss_step=0.164, global_step=4952.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  63%|██████▎   | 3764/5971 [33:20<19:32,  1.88it/s, loss=0.179, v_num=0, train/loss_simple_step=0.279, train/loss_vlb_step=0.00103, train/loss_step=0.279, global_step=4952.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  63%|██████▎   | 3765/5971 [33:21<19:32,  1.88it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00165, train/loss_vlb_step=9.78e-6, train/loss_step=0.00165, global_step=4953.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3766/5971 [33:22<19:31,  1.88it/s, loss=0.167, v_num=0, train/loss_simple_step=0.00165, train/loss_vlb_step=9.78e-6, train/loss_step=0.00165, global_step=4953.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3766/5971 [33:22<19:31,  1.88it/s, loss=0.171, v_num=0, train/loss_simple_step=0.325, train/loss_vlb_step=0.00187, train/loss_step=0.325, global_step=4953.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  63%|██████▎   | 3767/5971 [33:22<19:31,  1.88it/s, loss=0.192, v_num=0, train/loss_simple_step=0.459, train/loss_vlb_step=0.00319, train/loss_step=0.459, global_step=4953.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3768/5971 [33:25<19:32,  1.88it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0647, train/loss_vlb_step=0.000231, train/loss_step=0.0647, global_step=4953.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3769/5971 [33:26<19:31,  1.88it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0314, train/loss_vlb_step=0.000118, train/loss_step=0.0314, global_step=4954.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3770/5971 [33:27<19:31,  1.88it/s, loss=0.176, v_num=0, train/loss_simple_step=0.0314, train/loss_vlb_step=0.000118, train/loss_step=0.0314, global_step=4954.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3770/5971 [33:27<19:31,  1.88it/s, loss=0.176, v_num=0, train/loss_simple_step=0.00147, train/loss_vlb_step=8.8e-6, train/loss_step=0.00147, global_step=4954.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3771/5971 [33:27<19:31,  1.88it/s, loss=0.169, v_num=0, train/loss_simple_step=0.0641, train/loss_vlb_step=0.000213, train/loss_step=0.0641, global_step=4954.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3772/5971 [33:30<19:31,  1.88it/s, loss=0.145, v_num=0, train/loss_simple_step=0.265, train/loss_vlb_step=0.00124, train/loss_step=0.265, global_step=4954.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  63%|██████▎   | 3773/5971 [33:30<19:31,  1.88it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00639, train/loss_vlb_step=3.21e-5, train/loss_step=0.00639, global_step=4955.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3774/5971 [33:31<19:30,  1.88it/s, loss=0.145, v_num=0, train/loss_simple_step=0.00639, train/loss_vlb_step=3.21e-5, train/loss_step=0.00639, global_step=4955.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3774/5971 [33:31<19:30,  1.88it/s, loss=0.118, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.11e-5, train/loss_step=0.00199, global_step=4955.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3775/5971 [33:32<19:30,  1.88it/s, loss=0.127, v_num=0, train/loss_simple_step=0.184, train/loss_vlb_step=0.000621, train/loss_step=0.184, global_step=4955.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  63%|██████▎   | 3776/5971 [33:34<19:30,  1.87it/s, loss=0.0994, v_num=0, train/loss_simple_step=0.0142, train/loss_vlb_step=6.04e-5, train/loss_step=0.0142, global_step=4955.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3777/5971 [33:35<19:30,  1.87it/s, loss=0.116, v_num=0, train/loss_simple_step=0.328, train/loss_vlb_step=0.00155, train/loss_step=0.328, global_step=4956.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  63%|██████▎   | 3778/5971 [33:36<19:30,  1.87it/s, loss=0.116, v_num=0, train/loss_simple_step=0.328, train/loss_vlb_step=0.00155, train/loss_step=0.328, global_step=4956.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3778/5971 [33:36<19:30,  1.87it/s, loss=0.115, v_num=0, train/loss_simple_step=0.0016, train/loss_vlb_step=9.7e-6, train/loss_step=0.0016, global_step=4956.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3779/5971 [33:37<19:29,  1.87it/s, loss=0.156, v_num=0, train/loss_simple_step=0.851, train/loss_vlb_step=0.0867, train/loss_step=0.851, global_step=4956.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  63%|██████▎   | 3780/5971 [33:39<19:30,  1.87it/s, loss=0.158, v_num=0, train/loss_simple_step=0.0291, train/loss_vlb_step=0.000114, train/loss_step=0.0291, global_step=4956.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3781/5971 [33:40<19:30,  1.87it/s, loss=0.164, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000997, train/loss_step=0.211, global_step=4957.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  63%|██████▎   | 3782/5971 [33:41<19:29,  1.87it/s, loss=0.164, v_num=0, train/loss_simple_step=0.211, train/loss_vlb_step=0.000997, train/loss_step=0.211, global_step=4957.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3782/5971 [33:41<19:29,  1.87it/s, loss=0.172, v_num=0, train/loss_simple_step=0.163, train/loss_vlb_step=0.000555, train/loss_step=0.163, global_step=4957.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3783/5971 [33:42<19:29,  1.87it/s, loss=0.182, v_num=0, train/loss_simple_step=0.362, train/loss_vlb_step=0.00188, train/loss_step=0.362, global_step=4957.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  63%|██████▎   | 3784/5971 [33:44<19:29,  1.87it/s, loss=0.181, v_num=0, train/loss_simple_step=0.264, train/loss_vlb_step=0.000918, train/loss_step=0.264, global_step=4957.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3785/5971 [33:45<19:29,  1.87it/s, loss=0.196, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.00114, train/loss_step=0.288, global_step=4958.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  63%|██████▎   | 3786/5971 [33:46<19:29,  1.87it/s, loss=0.196, v_num=0, train/loss_simple_step=0.288, train/loss_vlb_step=0.00114, train/loss_step=0.288, global_step=4958.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3786/5971 [33:46<19:29,  1.87it/s, loss=0.19, v_num=0, train/loss_simple_step=0.216, train/loss_vlb_step=0.000883, train/loss_step=0.216, global_step=4958.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3787/5971 [33:47<19:28,  1.87it/s, loss=0.168, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=5.96e-5, train/loss_step=0.0149, global_step=4958.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3788/5971 [33:49<19:29,  1.87it/s, loss=0.165, v_num=0, train/loss_simple_step=0.00222, train/loss_vlb_step=1.32e-5, train/loss_step=0.00222, global_step=4958.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3789/5971 [33:50<19:29,  1.87it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.94e-5, train/loss_step=0.0118, global_step=4959.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  63%|██████▎   | 3790/5971 [33:51<19:28,  1.87it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0118, train/loss_vlb_step=4.94e-5, train/loss_step=0.0118, global_step=4959.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3790/5971 [33:51<19:28,  1.87it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0232, train/loss_vlb_step=9.51e-5, train/loss_step=0.0232, global_step=4959.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  63%|██████▎   | 3791/5971 [33:52<19:28,  1.87it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0329, train/loss_vlb_step=0.000122, train/loss_step=0.0329, global_step=4959.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▎   | 3792/5971 [33:54<19:28,  1.86it/s, loss=0.155, v_num=0, train/loss_simple_step=0.106, train/loss_vlb_step=0.000349, train/loss_step=0.106, global_step=4959.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  64%|██████▎   | 3793/5971 [33:55<19:28,  1.86it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0275, train/loss_vlb_step=0.000106, train/loss_step=0.0275, global_step=4960.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▎   | 3794/5971 [33:56<19:28,  1.86it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0275, train/loss_vlb_step=0.000106, train/loss_step=0.0275, global_step=4960.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▎   | 3794/5971 [33:56<19:28,  1.86it/s, loss=0.164, v_num=0, train/loss_simple_step=0.154, train/loss_vlb_step=0.000505, train/loss_step=0.154, global_step=4960.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  64%|██████▎   | 3795/5971 [33:57<19:27,  1.86it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00136, train/loss_vlb_step=7.64e-6, train/loss_step=0.00136, global_step=4960.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▎   | 3796/5971 [33:59<19:28,  1.86it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0316, train/loss_vlb_step=0.000113, train/loss_step=0.0316, global_step=4960.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  64%|██████▎   | 3797/5971 [34:00<19:27,  1.86it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0803, train/loss_vlb_step=0.000264, train/loss_step=0.0803, global_step=4961.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▎   | 3798/5971 [34:00<19:27,  1.86it/s, loss=0.144, v_num=0, train/loss_simple_step=0.0803, train/loss_vlb_step=0.000264, train/loss_step=0.0803, global_step=4961.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▎   | 3798/5971 [34:00<19:27,  1.86it/s, loss=0.146, v_num=0, train/loss_simple_step=0.046, train/loss_vlb_step=0.000164, train/loss_step=0.046, global_step=4961.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  64%|██████▎   | 3799/5971 [34:01<19:27,  1.86it/s, loss=0.104, v_num=0, train/loss_simple_step=0.0194, train/loss_vlb_step=7.9e-5, train/loss_step=0.0194, global_step=4961.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▎   | 3800/5971 [34:03<19:27,  1.86it/s, loss=0.12, v_num=0, train/loss_simple_step=0.337, train/loss_vlb_step=0.00156, train/loss_step=0.337, global_step=4961.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  64%|██████▎   | 3801/5971 [34:04<19:27,  1.86it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00344, train/loss_vlb_step=1.87e-5, train/loss_step=0.00344, global_step=4962.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▎   | 3802/5971 [34:05<19:26,  1.86it/s, loss=0.109, v_num=0, train/loss_simple_step=0.00344, train/loss_vlb_step=1.87e-5, train/loss_step=0.00344, global_step=4962.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▎   | 3802/5971 [34:05<19:26,  1.86it/s, loss=0.115, v_num=0, train/loss_simple_step=0.283, train/loss_vlb_step=0.00144, train/loss_step=0.283, global_step=4962.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  64%|██████▎   | 3803/5971 [34:06<19:26,  1.86it/s, loss=0.0997, v_num=0, train/loss_simple_step=0.0532, train/loss_vlb_step=0.000188, train/loss_step=0.0532, global_step=4962.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▎   | 3804/5971 [34:08<19:26,  1.86it/s, loss=0.0968, v_num=0, train/loss_simple_step=0.205, train/loss_vlb_step=0.000718, train/loss_step=0.205, global_step=4962.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  64%|██████▎   | 3805/5971 [34:09<19:26,  1.86it/s, loss=0.083, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.81e-5, train/loss_step=0.0108, global_step=4963.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▎   | 3806/5971 [34:10<19:26,  1.86it/s, loss=0.083, v_num=0, train/loss_simple_step=0.0108, train/loss_vlb_step=4.81e-5, train/loss_step=0.0108, global_step=4963.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▎   | 3806/5971 [34:10<19:26,  1.86it/s, loss=0.114, v_num=0, train/loss_simple_step=0.837, train/loss_vlb_step=0.0433, train/loss_step=0.837, global_step=4963.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  64%|██████▍   | 3807/5971 [34:11<19:25,  1.86it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00429, train/loss_vlb_step=2.33e-5, train/loss_step=0.00429, global_step=4963.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3808/5971 [34:13<19:26,  1.85it/s, loss=0.124, v_num=0, train/loss_simple_step=0.221, train/loss_vlb_step=0.000768, train/loss_step=0.221, global_step=4963.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  64%|██████▍   | 3809/5971 [34:14<19:25,  1.85it/s, loss=0.135, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000758, train/loss_step=0.217, global_step=4964.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3810/5971 [34:15<19:25,  1.85it/s, loss=0.135, v_num=0, train/loss_simple_step=0.217, train/loss_vlb_step=0.000758, train/loss_step=0.217, global_step=4964.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3810/5971 [34:15<19:25,  1.85it/s, loss=0.162, v_num=0, train/loss_simple_step=0.567, train/loss_vlb_step=0.00671, train/loss_step=0.567, global_step=4964.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  64%|██████▍   | 3811/5971 [34:16<19:25,  1.85it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0023, train/loss_vlb_step=1.34e-5, train/loss_step=0.0023, global_step=4964.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3812/5971 [34:18<19:25,  1.85it/s, loss=0.17, v_num=0, train/loss_simple_step=0.299, train/loss_vlb_step=0.00152, train/loss_step=0.299, global_step=4964.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  64%|██████▍   | 3813/5971 [34:19<19:25,  1.85it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00168, train/loss_vlb_step=9.94e-6, train/loss_step=0.00168, global_step=4965.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3814/5971 [34:20<19:24,  1.85it/s, loss=0.169, v_num=0, train/loss_simple_step=0.00168, train/loss_vlb_step=9.94e-6, train/loss_step=0.00168, global_step=4965.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3814/5971 [34:20<19:24,  1.85it/s, loss=0.161, v_num=0, train/loss_simple_step=0.00668, train/loss_vlb_step=3.19e-5, train/loss_step=0.00668, global_step=4965.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3815/5971 [34:21<19:24,  1.85it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0725, train/loss_vlb_step=0.000242, train/loss_step=0.0725, global_step=4965.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  64%|██████▍   | 3816/5971 [34:23<19:25,  1.85it/s, loss=0.166, v_num=0, train/loss_simple_step=0.045, train/loss_vlb_step=0.00015, train/loss_step=0.045, global_step=4965.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  64%|██████▍   | 3817/5971 [34:24<19:24,  1.85it/s, loss=0.17, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000555, train/loss_step=0.168, global_step=4966.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3818/5971 [34:25<19:24,  1.85it/s, loss=0.17, v_num=0, train/loss_simple_step=0.168, train/loss_vlb_step=0.000555, train/loss_step=0.168, global_step=4966.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3818/5971 [34:25<19:24,  1.85it/s, loss=0.168, v_num=0, train/loss_simple_step=0.00519, train/loss_vlb_step=2.49e-5, train/loss_step=0.00519, global_step=4966.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3819/5971 [34:26<19:23,  1.85it/s, loss=0.18, v_num=0, train/loss_simple_step=0.253, train/loss_vlb_step=0.000962, train/loss_step=0.253, global_step=4966.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  64%|██████▍   | 3820/5971 [34:28<19:24,  1.85it/s, loss=0.163, v_num=0, train/loss_simple_step=0.0019, train/loss_vlb_step=1.15e-5, train/loss_step=0.0019, global_step=4966.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3821/5971 [34:29<19:23,  1.85it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.17e-5, train/loss_step=0.00199, global_step=4967.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3822/5971 [34:30<19:23,  1.85it/s, loss=0.163, v_num=0, train/loss_simple_step=0.00199, train/loss_vlb_step=1.17e-5, train/loss_step=0.00199, global_step=4967.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3822/5971 [34:30<19:23,  1.85it/s, loss=0.157, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000588, train/loss_step=0.173, global_step=4967.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  64%|██████▍   | 3823/5971 [34:30<19:23,  1.85it/s, loss=0.181, v_num=0, train/loss_simple_step=0.531, train/loss_vlb_step=0.0045, train/loss_step=0.531, global_step=4967.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  64%|██████▍   | 3824/5971 [34:32<19:23,  1.85it/s, loss=0.177, v_num=0, train/loss_simple_step=0.119, train/loss_vlb_step=0.00039, train/loss_step=0.119, global_step=4967.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3825/5971 [34:33<19:23,  1.84it/s, loss=0.198, v_num=0, train/loss_simple_step=0.445, train/loss_vlb_step=0.00344, train/loss_step=0.445, global_step=4968.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3826/5971 [34:34<19:22,  1.84it/s, loss=0.198, v_num=0, train/loss_simple_step=0.445, train/loss_vlb_step=0.00344, train/loss_step=0.445, global_step=4968.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3826/5971 [34:34<19:22,  1.84it/s, loss=0.157, v_num=0, train/loss_simple_step=0.0113, train/loss_vlb_step=4.92e-5, train/loss_step=0.0113, global_step=4968.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3827/5971 [34:35<19:22,  1.84it/s, loss=0.16, v_num=0, train/loss_simple_step=0.0586, train/loss_vlb_step=0.000198, train/loss_step=0.0586, global_step=4968.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3828/5971 [34:38<19:23,  1.84it/s, loss=0.15, v_num=0, train/loss_simple_step=0.019, train/loss_vlb_step=8.15e-5, train/loss_step=0.019, global_step=4968.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  64%|██████▍   | 3829/5971 [34:38<19:22,  1.84it/s, loss=0.153, v_num=0, train/loss_simple_step=0.271, train/loss_vlb_step=0.00107, train/loss_step=0.271, global_step=4969.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3830/5971 [34:39<19:22,  1.84it/s, loss=0.153, v_num=0, train/loss_simple_step=0.271, train/loss_vlb_step=0.00107, train/loss_step=0.271, global_step=4969.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3830/5971 [34:39<19:22,  1.84it/s, loss=0.124, v_num=0, train/loss_simple_step=0.00395, train/loss_vlb_step=2.1e-5, train/loss_step=0.00395, global_step=4969.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3831/5971 [34:40<19:21,  1.84it/s, loss=0.131, v_num=0, train/loss_simple_step=0.144, train/loss_vlb_step=0.000485, train/loss_step=0.144, global_step=4969.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  64%|██████▍   | 3832/5971 [34:42<19:22,  1.84it/s, loss=0.119, v_num=0, train/loss_simple_step=0.0547, train/loss_vlb_step=0.000187, train/loss_step=0.0547, global_step=4969.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3833/5971 [34:43<19:21,  1.84it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0257, train/loss_vlb_step=9.83e-5, train/loss_step=0.0257, global_step=4970.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  64%|██████▍   | 3834/5971 [34:44<19:21,  1.84it/s, loss=0.12, v_num=0, train/loss_simple_step=0.0257, train/loss_vlb_step=9.83e-5, train/loss_step=0.0257, global_step=4970.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3834/5971 [34:44<19:21,  1.84it/s, loss=0.134, v_num=0, train/loss_simple_step=0.276, train/loss_vlb_step=0.00105, train/loss_step=0.276, global_step=4970.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  64%|██████▍   | 3835/5971 [34:45<19:21,  1.84it/s, loss=0.136, v_num=0, train/loss_simple_step=0.121, train/loss_vlb_step=0.000397, train/loss_step=0.121, global_step=4970.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3836/5971 [34:47<19:21,  1.84it/s, loss=0.148, v_num=0, train/loss_simple_step=0.277, train/loss_vlb_step=0.00132, train/loss_step=0.277, global_step=4970.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  64%|██████▍   | 3837/5971 [34:48<19:21,  1.84it/s, loss=0.155, v_num=0, train/loss_simple_step=0.315, train/loss_vlb_step=0.00151, train/loss_step=0.315, global_step=4971.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3838/5971 [34:49<19:21,  1.84it/s, loss=0.155, v_num=0, train/loss_simple_step=0.315, train/loss_vlb_step=0.00151, train/loss_step=0.315, global_step=4971.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3838/5971 [34:49<19:21,  1.84it/s, loss=0.155, v_num=0, train/loss_simple_step=0.00689, train/loss_vlb_step=3.27e-5, train/loss_step=0.00689, global_step=4971.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3839/5971 [34:50<19:20,  1.84it/s, loss=0.164, v_num=0, train/loss_simple_step=0.433, train/loss_vlb_step=0.00294, train/loss_step=0.433, global_step=4971.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  64%|██████▍   | 3840/5971 [34:52<19:21,  1.84it/s, loss=0.175, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000771, train/loss_step=0.213, global_step=4971.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3841/5971 [34:53<19:20,  1.83it/s, loss=0.181, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000421, train/loss_step=0.128, global_step=4972.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3842/5971 [34:54<19:20,  1.83it/s, loss=0.181, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000421, train/loss_step=0.128, global_step=4972.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3842/5971 [34:54<19:20,  1.83it/s, loss=0.179, v_num=0, train/loss_simple_step=0.136, train/loss_vlb_step=0.000449, train/loss_step=0.136, global_step=4972.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3843/5971 [34:55<19:20,  1.83it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00348, train/loss_vlb_step=1.74e-5, train/loss_step=0.00348, global_step=4972.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3844/5971 [34:57<19:20,  1.83it/s, loss=0.167, v_num=0, train/loss_simple_step=0.396, train/loss_vlb_step=0.00265, train/loss_step=0.396, global_step=4972.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  64%|██████▍   | 3845/5971 [34:58<19:20,  1.83it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0241, train/loss_vlb_step=9.49e-5, train/loss_step=0.0241, global_step=4973.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3846/5971 [34:59<19:19,  1.83it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0241, train/loss_vlb_step=9.49e-5, train/loss_step=0.0241, global_step=4973.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3846/5971 [34:59<19:19,  1.83it/s, loss=0.146, v_num=0, train/loss_simple_step=0.0201, train/loss_vlb_step=8.28e-5, train/loss_step=0.0201, global_step=4973.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3847/5971 [35:00<19:19,  1.83it/s, loss=0.145, v_num=0, train/loss_simple_step=0.0288, train/loss_vlb_step=0.00011, train/loss_step=0.0288, global_step=4973.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3848/5971 [35:02<19:19,  1.83it/s, loss=0.144, v_num=0, train/loss_simple_step=0.00369, train/loss_vlb_step=2e-5, train/loss_step=0.00369, global_step=4973.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  64%|██████▍   | 3849/5971 [35:03<19:19,  1.83it/s, loss=0.14, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000629, train/loss_step=0.190, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  64%|██████▍   | 3850/5971 [35:04<19:18,  1.83it/s, loss=0.14, v_num=0, train/loss_simple_step=0.190, train/loss_vlb_step=0.000629, train/loss_step=0.190, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3850/5971 [35:04<19:18,  1.83it/s, loss=0.151, v_num=0, train/loss_simple_step=0.230, train/loss_vlb_step=0.000887, train/loss_step=0.230, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  64%|██████▍   | 3851/5971 [35:05<19:18,  1.83it/s, loss=0.185, v_num=0, train/loss_simple_step=0.824, train/loss_vlb_step=0.105, train/loss_step=0.824, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  65%|██████▍   | 3852/5971 [35:07<19:18,  1.83it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:20,  2.07it/s][A
Epoch 8:  65%|██████▍   | 3854/5971 [35:07<19:17,  1.83it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   2%|▏         | 3/167 [00:00<00:28,  5.68it/s][A
Epoch 8:  65%|██████▍   | 3858/5971 [35:07<19:14,  1.83it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   4%|▎         | 6/167 [00:00<00:14, 10.89it/s][A

Validating:   5%|▌         | 9/167 [00:00<00:10, 15.08it/s][A
Epoch 8:  65%|██████▍   | 3862/5971 [35:08<19:10,  1.83it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   7%|▋         | 12/167 [00:00<00:08, 17.82it/s][A
Epoch 8:  65%|██████▍   | 3866/5971 [35:08<19:07,  1.83it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   9%|▉         | 15/167 [00:01<00:07, 19.98it/s][A
Epoch 8:  65%|██████▍   | 3870/5971 [35:08<19:04,  1.84it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  11%|█         | 18/167 [00:01<00:07, 21.11it/s][A

Validating:  13%|█▎        | 21/167 [00:01<00:06, 23.18it/s][A
Epoch 8:  65%|██████▍   | 3874/5971 [35:08<19:01,  1.84it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  15%|█▍        | 25/167 [00:01<00:05, 25.71it/s][A
Epoch 8:  65%|██████▍   | 3878/5971 [35:08<18:57,  1.84it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  17%|█▋        | 28/167 [00:01<00:05, 25.42it/s][A
Epoch 8:  65%|██████▌   | 3882/5971 [35:08<18:54,  1.84it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  19%|█▊        | 31/167 [00:01<00:05, 25.32it/s][A
Epoch 8:  65%|██████▌   | 3886/5971 [35:09<18:51,  1.84it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  20%|██        | 34/167 [00:01<00:05, 25.76it/s][A

Validating:  22%|██▏       | 37/167 [00:01<00:05, 25.99it/s][A
Epoch 8:  65%|██████▌   | 3890/5971 [35:09<18:48,  1.84it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  24%|██▍       | 40/167 [00:02<00:04, 26.18it/s][A
Epoch 8:  65%|██████▌   | 3894/5971 [35:09<18:44,  1.85it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  26%|██▌       | 43/167 [00:02<00:04, 25.88it/s][A
Epoch 8:  65%|██████▌   | 3898/5971 [35:09<18:41,  1.85it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  28%|██▊       | 46/167 [00:02<00:04, 26.93it/s][A
Epoch 8:  65%|██████▌   | 3902/5971 [35:09<18:38,  1.85it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  30%|██▉       | 50/167 [00:02<00:04, 27.65it/s][A

Validating:  32%|███▏      | 53/167 [00:02<00:04, 27.50it/s][A
Epoch 8:  65%|██████▌   | 3906/5971 [35:09<18:35,  1.85it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  34%|███▎      | 56/167 [00:02<00:03, 27.98it/s][A
Epoch 8:  65%|██████▌   | 3910/5971 [35:09<18:31,  1.85it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  35%|███▌      | 59/167 [00:02<00:03, 28.48it/s][A
Epoch 8:  66%|██████▌   | 3914/5971 [35:10<18:28,  1.86it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  37%|███▋      | 62/167 [00:02<00:04, 26.17it/s][A

Validating:  39%|███▉      | 65/167 [00:02<00:03, 26.62it/s][A
Epoch 8:  66%|██████▌   | 3918/5971 [35:10<18:25,  1.86it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  41%|████      | 68/167 [00:03<00:03, 26.88it/s][A
Epoch 8:  66%|██████▌   | 3922/5971 [35:10<18:22,  1.86it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  43%|████▎     | 71/167 [00:03<00:03, 26.91it/s][A
Epoch 8:  66%|██████▌   | 3926/5971 [35:10<18:19,  1.86it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  44%|████▍     | 74/167 [00:03<00:03, 27.62it/s][A

Validating:  46%|████▌     | 77/167 [00:03<00:03, 27.85it/s][A
Epoch 8:  66%|██████▌   | 3930/5971 [35:10<18:15,  1.86it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  49%|████▊     | 81/167 [00:03<00:02, 29.37it/s][A
Epoch 8:  66%|██████▌   | 3934/5971 [35:10<18:12,  1.86it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  51%|█████     | 85/167 [00:03<00:02, 29.06it/s][A
Epoch 8:  66%|██████▌   | 3938/5971 [35:10<18:09,  1.87it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  53%|█████▎    | 88/167 [00:03<00:02, 28.88it/s][A
Epoch 8:  66%|██████▌   | 3942/5971 [35:11<18:06,  1.87it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  54%|█████▍    | 91/167 [00:03<00:02, 27.19it/s][A
Epoch 8:  66%|██████▌   | 3946/5971 [35:11<18:03,  1.87it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  56%|█████▋    | 94/167 [00:03<00:02, 27.27it/s][A
Epoch 8:  66%|██████▌   | 3950/5971 [35:11<17:59,  1.87it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  59%|█████▊    | 98/167 [00:04<00:02, 28.75it/s][A
Epoch 8:  66%|██████▌   | 3954/5971 [35:11<17:56,  1.87it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  61%|██████    | 102/167 [00:04<00:02, 28.33it/s][A

Validating:  63%|██████▎   | 105/167 [00:04<00:02, 28.65it/s][A
Epoch 8:  66%|██████▋   | 3958/5971 [35:11<17:53,  1.87it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  65%|██████▍   | 108/167 [00:04<00:02, 28.54it/s][A
Epoch 8:  66%|██████▋   | 3962/5971 [35:11<17:50,  1.88it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  67%|██████▋   | 112/167 [00:04<00:01, 29.66it/s][A
Epoch 8:  66%|██████▋   | 3966/5971 [35:11<17:47,  1.88it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  69%|██████▉   | 115/167 [00:04<00:01, 29.57it/s][A
Epoch 8:  66%|██████▋   | 3970/5971 [35:11<17:44,  1.88it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  71%|███████   | 118/167 [00:04<00:01, 28.86it/s][A

Validating:  72%|███████▏  | 121/167 [00:04<00:01, 28.34it/s][A
Epoch 8:  67%|██████▋   | 3974/5971 [35:12<17:41,  1.88it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  74%|███████▍  | 124/167 [00:05<00:01, 27.07it/s][A
Epoch 8:  67%|██████▋   | 3978/5971 [35:12<17:38,  1.88it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  76%|███████▌  | 127/167 [00:05<00:01, 27.15it/s][A
Epoch 8:  67%|██████▋   | 3982/5971 [35:12<17:34,  1.89it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  78%|███████▊  | 130/167 [00:05<00:01, 26.76it/s][A

Validating:  80%|███████▉  | 133/167 [00:05<00:01, 27.07it/s][A
Epoch 8:  67%|██████▋   | 3986/5971 [35:12<17:31,  1.89it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  81%|████████▏ | 136/167 [00:05<00:01, 27.24it/s][A
Epoch 8:  67%|██████▋   | 3990/5971 [35:12<17:28,  1.89it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  84%|████████▍ | 140/167 [00:05<00:00, 27.91it/s][A
Epoch 8:  67%|██████▋   | 3994/5971 [35:12<17:25,  1.89it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  86%|████████▌ | 143/167 [00:05<00:00, 27.85it/s][A
Epoch 8:  67%|██████▋   | 3998/5971 [35:13<17:22,  1.89it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  87%|████████▋ | 146/167 [00:05<00:00, 27.72it/s][A

Validating:  89%|████████▉ | 149/167 [00:05<00:00, 27.41it/s][A
Epoch 8:  67%|██████▋   | 4002/5971 [35:13<17:19,  1.89it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  91%|█████████ | 152/167 [00:06<00:00, 26.55it/s][A
Epoch 8:  67%|██████▋   | 4006/5971 [35:13<17:16,  1.90it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  93%|█████████▎| 155/167 [00:06<00:00, 27.08it/s][A
Epoch 8:  67%|██████▋   | 4010/5971 [35:13<17:13,  1.90it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  95%|█████████▍| 158/167 [00:06<00:00, 27.36it/s][A

Validating:  96%|█████████▋| 161/167 [00:06<00:00, 27.68it/s][A
Epoch 8:  67%|██████▋   | 4014/5971 [35:13<17:10,  1.90it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  99%|█████████▉| 165/167 [00:06<00:00, 28.15it/s][A
Epoch 8:  67%|██████▋   | 4018/5971 [35:13<17:07,  1.90it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  67%|██████▋   | 4020/5971 [35:14<17:05,  1.90it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0198, train/loss_vlb_step=8.1e-5, train/loss_step=0.0198, global_step=4974.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  67%|██████▋   | 4021/5971 [35:15<17:05,  1.90it/s, loss=0.202, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.00243, train/loss_step=0.391, global_step=4975.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  67%|██████▋   | 4022/5971 [35:16<17:05,  1.90it/s, loss=0.202, v_num=0, train/loss_simple_step=0.391, train/loss_vlb_step=0.00243, train/loss_step=0.391, global_step=4975.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  67%|██████▋   | 4022/5971 [35:16<17:05,  1.90it/s, loss=0.195, v_num=0, train/loss_simple_step=0.133, train/loss_vlb_step=0.000439, train/loss_step=0.133, global_step=4975.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  67%|██████▋   | 4023/5971 [35:16<17:04,  1.90it/s, loss=0.206, v_num=0, train/loss_simple_step=0.347, train/loss_vlb_step=0.0016, train/loss_step=0.347, global_step=4975.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  67%|██████▋   | 4024/5971 [35:19<17:05,  1.90it/s, loss=0.197, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000355, train/loss_step=0.107, global_step=4975.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  67%|██████▋   | 4025/5971 [35:20<17:04,  1.90it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0268, train/loss_vlb_step=0.0001, train/loss_step=0.0268, global_step=4976.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  67%|██████▋   | 4026/5971 [35:21<17:04,  1.90it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0268, train/loss_vlb_step=0.0001, train/loss_step=0.0268, global_step=4976.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  67%|██████▋   | 4026/5971 [35:21<17:04,  1.90it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0185, train/loss_vlb_step=7.79e-5, train/loss_step=0.0185, global_step=4976.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  67%|██████▋   | 4027/5971 [35:22<17:04,  1.90it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0701, train/loss_vlb_step=0.000233, train/loss_step=0.0701, global_step=4976.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  67%|██████▋   | 4028/5971 [35:24<17:04,  1.90it/s, loss=0.166, v_num=0, train/loss_simple_step=0.213, train/loss_vlb_step=0.000778, train/loss_step=0.213, global_step=4976.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  67%|██████▋   | 4029/5971 [35:25<17:04,  1.90it/s, loss=0.171, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.000963, train/loss_step=0.247, global_step=4977.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  67%|██████▋   | 4030/5971 [35:26<17:03,  1.90it/s, loss=0.171, v_num=0, train/loss_simple_step=0.247, train/loss_vlb_step=0.000963, train/loss_step=0.247, global_step=4977.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  67%|██████▋   | 4030/5971 [35:26<17:03,  1.90it/s, loss=0.166, v_num=0, train/loss_simple_step=0.0357, train/loss_vlb_step=0.000132, train/loss_step=0.0357, global_step=4977.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4031/5971 [35:26<17:03,  1.90it/s, loss=0.171, v_num=0, train/loss_simple_step=0.101, train/loss_vlb_step=0.000337, train/loss_step=0.101, global_step=4977.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  68%|██████▊   | 4032/5971 [35:29<17:03,  1.89it/s, loss=0.191, v_num=0, train/loss_simple_step=0.789, train/loss_vlb_step=0.026, train/loss_step=0.789, global_step=4977.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  68%|██████▊   | 4033/5971 [35:30<17:03,  1.89it/s, loss=0.206, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00246, train/loss_step=0.318, global_step=4978.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4034/5971 [35:31<17:03,  1.89it/s, loss=0.206, v_num=0, train/loss_simple_step=0.318, train/loss_vlb_step=0.00246, train/loss_step=0.318, global_step=4978.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4034/5971 [35:31<17:03,  1.89it/s, loss=0.205, v_num=0, train/loss_simple_step=0.00605, train/loss_vlb_step=3.12e-5, train/loss_step=0.00605, global_step=4978.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4035/5971 [35:31<17:02,  1.89it/s, loss=0.224, v_num=0, train/loss_simple_step=0.405, train/loss_vlb_step=0.00199, train/loss_step=0.405, global_step=4978.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  68%|██████▊   | 4036/5971 [35:34<17:02,  1.89it/s, loss=0.251, v_num=0, train/loss_simple_step=0.549, train/loss_vlb_step=0.0045, train/loss_step=0.549, global_step=4978.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  68%|██████▊   | 4037/5971 [35:34<17:02,  1.89it/s, loss=0.245, v_num=0, train/loss_simple_step=0.0612, train/loss_vlb_step=0.000222, train/loss_step=0.0612, global_step=4979.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4038/5971 [35:35<17:02,  1.89it/s, loss=0.245, v_num=0, train/loss_simple_step=0.0612, train/loss_vlb_step=0.000222, train/loss_step=0.0612, global_step=4979.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4038/5971 [35:35<17:02,  1.89it/s, loss=0.233, v_num=0, train/loss_simple_step=0.00139, train/loss_vlb_step=8.39e-6, train/loss_step=0.00139, global_step=4979.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4039/5971 [35:36<17:01,  1.89it/s, loss=0.204, v_num=0, train/loss_simple_step=0.235, train/loss_vlb_step=0.000948, train/loss_step=0.235, global_step=4979.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  68%|██████▊   | 4040/5971 [35:38<17:02,  1.89it/s, loss=0.216, v_num=0, train/loss_simple_step=0.260, train/loss_vlb_step=0.001, train/loss_step=0.260, global_step=4979.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  68%|██████▊   | 4041/5971 [35:39<17:01,  1.89it/s, loss=0.204, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000523, train/loss_step=0.158, global_step=4980.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4042/5971 [35:40<17:01,  1.89it/s, loss=0.204, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000523, train/loss_step=0.158, global_step=4980.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4042/5971 [35:40<17:01,  1.89it/s, loss=0.198, v_num=0, train/loss_simple_step=0.00414, train/loss_vlb_step=2.15e-5, train/loss_step=0.00414, global_step=4980.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4043/5971 [35:41<17:00,  1.89it/s, loss=0.183, v_num=0, train/loss_simple_step=0.0491, train/loss_vlb_step=0.000171, train/loss_step=0.0491, global_step=4980.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  68%|██████▊   | 4044/5971 [35:43<17:01,  1.89it/s, loss=0.189, v_num=0, train/loss_simple_step=0.227, train/loss_vlb_step=0.000797, train/loss_step=0.227, global_step=4980.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  68%|██████▊   | 4045/5971 [35:44<17:00,  1.89it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0017, train/loss_vlb_step=9.93e-6, train/loss_step=0.0017, global_step=4981.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4046/5971 [35:45<17:00,  1.89it/s, loss=0.187, v_num=0, train/loss_simple_step=0.0017, train/loss_vlb_step=9.93e-6, train/loss_step=0.0017, global_step=4981.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4046/5971 [35:45<17:00,  1.89it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0523, train/loss_vlb_step=0.00018, train/loss_step=0.0523, global_step=4981.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4047/5971 [35:46<17:00,  1.89it/s, loss=0.194, v_num=0, train/loss_simple_step=0.173, train/loss_vlb_step=0.000616, train/loss_step=0.173, global_step=4981.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  68%|██████▊   | 4048/5971 [35:48<17:00,  1.88it/s, loss=0.184, v_num=0, train/loss_simple_step=0.00677, train/loss_vlb_step=3.21e-5, train/loss_step=0.00677, global_step=4981.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4049/5971 [35:49<16:59,  1.88it/s, loss=0.191, v_num=0, train/loss_simple_step=0.388, train/loss_vlb_step=0.0021, train/loss_step=0.388, global_step=4982.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]     
Epoch 8:  68%|██████▊   | 4050/5971 [35:50<16:59,  1.88it/s, loss=0.191, v_num=0, train/loss_simple_step=0.388, train/loss_vlb_step=0.0021, train/loss_step=0.388, global_step=4982.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4050/5971 [35:50<16:59,  1.88it/s, loss=0.19, v_num=0, train/loss_simple_step=0.016, train/loss_vlb_step=7.07e-5, train/loss_step=0.016, global_step=4982.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4051/5971 [35:51<16:59,  1.88it/s, loss=0.188, v_num=0, train/loss_simple_step=0.0537, train/loss_vlb_step=0.000191, train/loss_step=0.0537, global_step=4982.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4052/5971 [35:53<16:59,  1.88it/s, loss=0.149, v_num=0, train/loss_simple_step=0.00714, train/loss_vlb_step=3.44e-5, train/loss_step=0.00714, global_step=4982.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4053/5971 [35:54<16:59,  1.88it/s, loss=0.135, v_num=0, train/loss_simple_step=0.043, train/loss_vlb_step=0.000155, train/loss_step=0.043, global_step=4983.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  68%|██████▊   | 4054/5971 [35:55<16:58,  1.88it/s, loss=0.135, v_num=0, train/loss_simple_step=0.043, train/loss_vlb_step=0.000155, train/loss_step=0.043, global_step=4983.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4054/5971 [35:55<16:58,  1.88it/s, loss=0.153, v_num=0, train/loss_simple_step=0.377, train/loss_vlb_step=0.00171, train/loss_step=0.377, global_step=4983.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  68%|██████▊   | 4055/5971 [35:56<16:58,  1.88it/s, loss=0.136, v_num=0, train/loss_simple_step=0.0573, train/loss_vlb_step=0.000194, train/loss_step=0.0573, global_step=4983.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4056/5971 [35:58<16:58,  1.88it/s, loss=0.109, v_num=0, train/loss_simple_step=0.0032, train/loss_vlb_step=1.68e-5, train/loss_step=0.0032, global_step=4983.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  68%|██████▊   | 4057/5971 [35:59<16:58,  1.88it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0978, train/loss_vlb_step=0.000328, train/loss_step=0.0978, global_step=4984.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4058/5971 [35:59<16:57,  1.88it/s, loss=0.111, v_num=0, train/loss_simple_step=0.0978, train/loss_vlb_step=0.000328, train/loss_step=0.0978, global_step=4984.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4058/5971 [35:59<16:57,  1.88it/s, loss=0.136, v_num=0, train/loss_simple_step=0.501, train/loss_vlb_step=0.00486, train/loss_step=0.501, global_step=4984.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  68%|██████▊   | 4059/5971 [36:00<16:57,  1.88it/s, loss=0.135, v_num=0, train/loss_simple_step=0.233, train/loss_vlb_step=0.000948, train/loss_step=0.233, global_step=4984.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4060/5971 [36:03<16:57,  1.88it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0183, train/loss_vlb_step=7.51e-5, train/loss_step=0.0183, global_step=4984.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4061/5971 [36:04<16:57,  1.88it/s, loss=0.122, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000425, train/loss_step=0.128, global_step=4985.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  68%|██████▊   | 4062/5971 [36:05<16:57,  1.88it/s, loss=0.122, v_num=0, train/loss_simple_step=0.128, train/loss_vlb_step=0.000425, train/loss_step=0.128, global_step=4985.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4062/5971 [36:05<16:57,  1.88it/s, loss=0.123, v_num=0, train/loss_simple_step=0.0229, train/loss_vlb_step=9.38e-5, train/loss_step=0.0229, global_step=4985.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4063/5971 [36:05<16:56,  1.88it/s, loss=0.121, v_num=0, train/loss_simple_step=0.00383, train/loss_vlb_step=2.13e-5, train/loss_step=0.00383, global_step=4985.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4064/5971 [36:08<16:57,  1.87it/s, loss=0.113, v_num=0, train/loss_simple_step=0.0823, train/loss_vlb_step=0.000273, train/loss_step=0.0823, global_step=4985.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  68%|██████▊   | 4065/5971 [36:08<16:56,  1.87it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00182, train/loss_vlb_step=1.1e-5, train/loss_step=0.00182, global_step=4986.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4066/5971 [36:09<16:56,  1.87it/s, loss=0.113, v_num=0, train/loss_simple_step=0.00182, train/loss_vlb_step=1.1e-5, train/loss_step=0.00182, global_step=4986.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4066/5971 [36:09<16:56,  1.87it/s, loss=0.112, v_num=0, train/loss_simple_step=0.0251, train/loss_vlb_step=9.35e-5, train/loss_step=0.0251, global_step=4986.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  68%|██████▊   | 4067/5971 [36:10<16:56,  1.87it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00655, train/loss_vlb_step=3.2e-5, train/loss_step=0.00655, global_step=4986.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4068/5971 [36:12<16:56,  1.87it/s, loss=0.107, v_num=0, train/loss_simple_step=0.0771, train/loss_vlb_step=0.00026, train/loss_step=0.0771, global_step=4986.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  68%|██████▊   | 4069/5971 [36:13<16:55,  1.87it/s, loss=0.105, v_num=0, train/loss_simple_step=0.344, train/loss_vlb_step=0.00175, train/loss_step=0.344, global_step=4987.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  68%|██████▊   | 4070/5971 [36:14<16:55,  1.87it/s, loss=0.105, v_num=0, train/loss_simple_step=0.344, train/loss_vlb_step=0.00175, train/loss_step=0.344, global_step=4987.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4070/5971 [36:14<16:55,  1.87it/s, loss=0.104, v_num=0, train/loss_simple_step=0.00448, train/loss_vlb_step=2.39e-5, train/loss_step=0.00448, global_step=4987.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4071/5971 [36:15<16:55,  1.87it/s, loss=0.114, v_num=0, train/loss_simple_step=0.238, train/loss_vlb_step=0.000829, train/loss_step=0.238, global_step=4987.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  68%|██████▊   | 4072/5971 [36:18<16:55,  1.87it/s, loss=0.13, v_num=0, train/loss_simple_step=0.328, train/loss_vlb_step=0.00153, train/loss_step=0.328, global_step=4987.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  68%|██████▊   | 4073/5971 [36:18<16:55,  1.87it/s, loss=0.135, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000555, train/loss_step=0.160, global_step=4988.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4074/5971 [36:19<16:54,  1.87it/s, loss=0.135, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000555, train/loss_step=0.160, global_step=4988.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4074/5971 [36:19<16:54,  1.87it/s, loss=0.117, v_num=0, train/loss_simple_step=0.0067, train/loss_vlb_step=3.35e-5, train/loss_step=0.0067, global_step=4988.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4075/5971 [36:20<16:54,  1.87it/s, loss=0.122, v_num=0, train/loss_simple_step=0.160, train/loss_vlb_step=0.000547, train/loss_step=0.160, global_step=4988.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  68%|██████▊   | 4076/5971 [36:22<16:54,  1.87it/s, loss=0.13, v_num=0, train/loss_simple_step=0.158, train/loss_vlb_step=0.000521, train/loss_step=0.158, global_step=4988.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  68%|██████▊   | 4077/5971 [36:23<16:54,  1.87it/s, loss=0.143, v_num=0, train/loss_simple_step=0.352, train/loss_vlb_step=0.00212, train/loss_step=0.352, global_step=4989.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4078/5971 [36:24<16:53,  1.87it/s, loss=0.143, v_num=0, train/loss_simple_step=0.352, train/loss_vlb_step=0.00212, train/loss_step=0.352, global_step=4989.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4078/5971 [36:24<16:53,  1.87it/s, loss=0.142, v_num=0, train/loss_simple_step=0.490, train/loss_vlb_step=0.00332, train/loss_step=0.490, global_step=4989.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4079/5971 [36:25<16:53,  1.87it/s, loss=0.16, v_num=0, train/loss_simple_step=0.596, train/loss_vlb_step=0.00732, train/loss_step=0.596, global_step=4989.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  68%|██████▊   | 4080/5971 [36:28<16:53,  1.87it/s, loss=0.159, v_num=0, train/loss_simple_step=0.00196, train/loss_vlb_step=1.14e-5, train/loss_step=0.00196, global_step=4989.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4081/5971 [36:28<16:53,  1.86it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00363, train/loss_vlb_step=1.89e-5, train/loss_step=0.00363, global_step=4990.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4082/5971 [36:29<16:53,  1.86it/s, loss=0.153, v_num=0, train/loss_simple_step=0.00363, train/loss_vlb_step=1.89e-5, train/loss_step=0.00363, global_step=4990.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4082/5971 [36:29<16:53,  1.86it/s, loss=0.153, v_num=0, train/loss_simple_step=0.018, train/loss_vlb_step=7.47e-5, train/loss_step=0.018, global_step=4990.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  68%|██████▊   | 4083/5971 [36:30<16:52,  1.86it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0721, train/loss_vlb_step=0.000238, train/loss_step=0.0721, global_step=4990.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4084/5971 [36:32<16:52,  1.86it/s, loss=0.155, v_num=0, train/loss_simple_step=0.0558, train/loss_vlb_step=0.000188, train/loss_step=0.0558, global_step=4990.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4085/5971 [36:33<16:52,  1.86it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0309, train/loss_vlb_step=0.000115, train/loss_step=0.0309, global_step=4991.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4086/5971 [36:34<16:52,  1.86it/s, loss=0.156, v_num=0, train/loss_simple_step=0.0309, train/loss_vlb_step=0.000115, train/loss_step=0.0309, global_step=4991.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4086/5971 [36:34<16:52,  1.86it/s, loss=0.186, v_num=0, train/loss_simple_step=0.611, train/loss_vlb_step=0.00711, train/loss_step=0.611, global_step=4991.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  68%|██████▊   | 4087/5971 [36:35<16:51,  1.86it/s, loss=0.189, v_num=0, train/loss_simple_step=0.0682, train/loss_vlb_step=0.00023, train/loss_step=0.0682, global_step=4991.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4088/5971 [36:37<16:52,  1.86it/s, loss=0.192, v_num=0, train/loss_simple_step=0.139, train/loss_vlb_step=0.000459, train/loss_step=0.139, global_step=4991.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  68%|██████▊   | 4089/5971 [36:38<16:51,  1.86it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0508, train/loss_vlb_step=0.000179, train/loss_step=0.0508, global_step=4992.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4090/5971 [36:39<16:51,  1.86it/s, loss=0.177, v_num=0, train/loss_simple_step=0.0508, train/loss_vlb_step=0.000179, train/loss_step=0.0508, global_step=4992.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  68%|██████▊   | 4090/5971 [36:39<16:51,  1.86it/s, loss=0.18, v_num=0, train/loss_simple_step=0.0532, train/loss_vlb_step=0.000182, train/loss_step=0.0532, global_step=4992.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  69%|██████▊   | 4091/5971 [36:40<16:50,  1.86it/s, loss=0.212, v_num=0, train/loss_simple_step=0.876, train/loss_vlb_step=0.0563, train/loss_step=0.876, global_step=4992.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  69%|██████▊   | 4092/5971 [36:42<16:51,  1.86it/s, loss=0.218, v_num=0, train/loss_simple_step=0.455, train/loss_vlb_step=0.00269, train/loss_step=0.455, global_step=4992.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  69%|██████▊   | 4093/5971 [36:43<16:50,  1.86it/s, loss=0.227, v_num=0, train/loss_simple_step=0.351, train/loss_vlb_step=0.00142, train/loss_step=0.351, global_step=4993.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  69%|██████▊   | 4094/5971 [36:44<16:50,  1.86it/s, loss=0.227, v_num=0, train/loss_simple_step=0.351, train/loss_vlb_step=0.00142, train/loss_step=0.351, global_step=4993.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  69%|██████▊   | 4094/5971 [36:44<16:50,  1.86it/s, loss=0.227, v_num=0, train/loss_simple_step=0.00245, train/loss_vlb_step=1.38e-5, train/loss_step=0.00245, global_step=4993.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  69%|██████▊   | 4095/5971 [36:45<16:50,  1.86it/s, loss=0.221, v_num=0, train/loss_simple_step=0.0262, train/loss_vlb_step=0.000105, train/loss_step=0.0262, global_step=4993.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  69%|██████▊   | 4096/5971 [36:47<16:50,  1.86it/s, loss=0.218, v_num=0, train/loss_simple_step=0.0969, train/loss_vlb_step=0.000319, train/loss_step=0.0969, global_step=4993.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  69%|██████▊   | 4097/5971 [36:48<16:50,  1.86it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.45e-5, train/loss_step=0.0149, global_step=4994.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  69%|██████▊   | 4098/5971 [36:49<16:49,  1.86it/s, loss=0.201, v_num=0, train/loss_simple_step=0.0149, train/loss_vlb_step=6.45e-5, train/loss_step=0.0149, global_step=4994.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  69%|██████▊   | 4098/5971 [36:49<16:49,  1.86it/s, loss=0.178, v_num=0, train/loss_simple_step=0.0417, train/loss_vlb_step=0.000151, train/loss_step=0.0417, global_step=4994.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  69%|██████▊   | 4099/5971 [36:50<16:49,  1.85it/s, loss=0.157, v_num=0, train/loss_simple_step=0.169, train/loss_vlb_step=0.000562, train/loss_step=0.169, global_step=4994.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  69%|██████▊   | 4100/5971 [36:52<16:49,  1.85it/s, loss=0.163, v_num=0, train/loss_simple_step=0.129, train/loss_vlb_step=0.000423, train/loss_step=0.129, global_step=4994.0, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  69%|██████▊   | 4101/5971 [36:53<16:49,  1.85it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=6.36e-5, train/loss_step=0.0144, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  69%|██████▊   | 4102/5971 [36:54<16:48,  1.85it/s, loss=0.164, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=6.36e-5, train/loss_step=0.0144, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  69%|██████▊   | 4102/5971 [36:54<16:48,  1.85it/s, loss=0.165, v_num=0, train/loss_simple_step=0.0438, train/loss_vlb_step=0.000148, train/loss_step=0.0438, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  69%|██████▊   | 4103/5971 [36:55<16:48,  1.85it/s, loss=0.173, v_num=0, train/loss_simple_step=0.239, train/loss_vlb_step=0.000868, train/loss_step=0.239, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  69%|██████▊   | 4104/5971 [36:57<16:48,  1.85it/s, loss=0.174, v_num=0, train/loss_simple_step=0.0604, train/loss_vlb_step=0.000207, train/loss_step=0.0604, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  69%|██████▊   | 4105/5971 [36:58<16:48,  1.85it/s, loss=0.179, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000475, train/loss_step=0.145, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]  
Epoch 8:  69%|██████▉   | 4106/5971 [36:59<16:47,  1.85it/s, loss=0.179, v_num=0, train/loss_simple_step=0.145, train/loss_vlb_step=0.000475, train/loss_step=0.145, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  69%|██████▉   | 4106/5971 [36:59<16:47,  1.85it/s, loss=0.161, v_num=0, train/loss_simple_step=0.241, train/loss_vlb_step=0.000843, train/loss_step=0.241, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  69%|██████▉   | 4107/5971 [37:00<16:47,  1.85it/s, loss=0.158, v_num=0, train/loss_simple_step=0.00243, train/loss_vlb_step=1.34e-5, train/loss_step=0.00243, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  69%|██████▉   | 4108/5971 [37:02<16:47,  1.85it/s, loss=0.172, v_num=0, train/loss_simple_step=0.425, train/loss_vlb_step=0.00406, train/loss_step=0.425, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  69%|██████▉   | 4109/5971 [37:03<16:47,  1.85it/s, loss=0.175, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000352, train/loss_step=0.107, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  69%|██████▉   | 4110/5971 [37:04<16:46,  1.85it/s, loss=0.175, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000352, train/loss_step=0.107, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  69%|██████▉   | 4110/5971 [37:04<16:46,  1.85it/s, loss=0.187, v_num=0, train/loss_simple_step=0.307, train/loss_vlb_step=0.00153, train/loss_step=0.307, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 
Epoch 8:  69%|██████▉   | 4111/5971 [37:05<16:46,  1.85it/s, loss=0.161, v_num=0, train/loss_simple_step=0.351, train/loss_vlb_step=0.00167, train/loss_step=0.351, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  69%|██████▉   | 4112/5971 [37:07<16:46,  1.85it/s, loss=0.142, v_num=0, train/loss_simple_step=0.0748, train/loss_vlb_step=0.00025, train/loss_step=0.0748, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  69%|██████▉   | 4113/5971 [37:08<16:46,  1.85it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.72e-5, train/loss_step=0.0111, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  69%|██████▉   | 4114/5971 [37:09<16:45,  1.85it/s, loss=0.125, v_num=0, train/loss_simple_step=0.0111, train/loss_vlb_step=4.72e-5, train/loss_step=0.0111, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  69%|██████▉   | 4114/5971 [37:09<16:45,  1.85it/s, loss=0.15, v_num=0, train/loss_simple_step=0.495, train/loss_vlb_step=0.00966, train/loss_step=0.495, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]   
Epoch 8:  69%|██████▉   | 4115/5971 [37:10<16:45,  1.85it/s, loss=0.15, v_num=0, train/loss_simple_step=0.0311, train/loss_vlb_step=0.000116, train/loss_step=0.0311, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  69%|██████▉   | 4116/5971 [37:12<16:45,  1.84it/s, loss=0.191, v_num=0, train/loss_simple_step=0.919, train/loss_vlb_step=0.155, train/loss_step=0.919, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]    
Epoch 8:  69%|██████▉   | 4117/5971 [37:13<16:45,  1.84it/s, loss=0.196, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000354, train/loss_step=0.107, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  69%|██████▉   | 4117/5971 [37:19<16:48,  1.84it/s, loss=0.196, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000354, train/loss_step=0.107, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  69%|██████▉   | 4118/5971 [37:46<16:59,  1.82it/s, loss=0.196, v_num=0, train/loss_simple_step=0.107, train/loss_vlb_step=0.000354, train/loss_step=0.107, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  69%|██████▉   | 4118/5971 [37:46<16:59,  1.82it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0941, train/loss_vlb_step=0.000317, train/loss_step=0.0941, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  69%|██████▉   | 4119/5971 [37:47<16:59,  1.82it/s, loss=0.198, v_num=0, train/loss_simple_step=0.0941, train/loss_vlb_step=0.000317, train/loss_step=0.0941, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  69%|██████▉   | 4119/5971 [37:47<16:59,  1.82it/s, loss=0.19, v_num=0, train/loss_simple_step=0.00275, train/loss_vlb_step=1.52e-5, train/loss_step=0.00275, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  69%|██████▉   | 4120/5971 [37:49<16:59,  1.82it/s, loss=0.19, v_num=0, train/loss_simple_step=0.00275, train/loss_vlb_step=1.52e-5, train/loss_step=0.00275, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  69%|██████▉   | 4120/5971 [37:49<16:59,  1.82it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156] 

Validating: 0it [00:00, ?it/s][A

Validating:   0%|          | 0/167 [00:00<?, ?it/s][A

Validating:   1%|          | 1/167 [00:00<01:01,  2.70it/s][A
Epoch 8:  69%|██████▉   | 4122/5971 [37:49<16:57,  1.82it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   1%|          | 2/167 [00:00<00:49,  3.34it/s][A
Epoch 8:  69%|██████▉   | 4124/5971 [37:50<16:56,  1.82it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   3%|▎         | 5/167 [00:00<00:18,  8.88it/s][A
Epoch 8:  69%|██████▉   | 4127/5971 [37:50<16:54,  1.82it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   5%|▍         | 8/167 [00:00<00:12, 12.88it/s][A
Epoch 8:  69%|██████▉   | 4130/5971 [37:50<16:51,  1.82it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   7%|▋         | 11/167 [00:00<00:09, 16.57it/s][A
Epoch 8:  69%|██████▉   | 4133/5971 [37:50<16:49,  1.82it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:   8%|▊         | 14/167 [00:01<00:08, 17.90it/s][A
Epoch 8:  69%|██████▉   | 4136/5971 [37:50<16:47,  1.82it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  10%|█         | 17/167 [00:01<00:07, 20.20it/s][A
Epoch 8:  69%|██████▉   | 4139/5971 [37:50<16:44,  1.82it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  12%|█▏        | 20/167 [00:01<00:06, 21.74it/s][A
Epoch 8:  69%|██████▉   | 4142/5971 [37:50<16:42,  1.82it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  14%|█▍        | 23/167 [00:01<00:06, 22.40it/s][A
Epoch 8:  69%|██████▉   | 4145/5971 [37:51<16:40,  1.83it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  16%|█▌        | 26/167 [00:01<00:06, 23.34it/s][A
Epoch 8:  69%|██████▉   | 4148/5971 [37:51<16:37,  1.83it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  17%|█▋        | 29/167 [00:01<00:05, 23.42it/s][A
Epoch 8:  70%|██████▉   | 4151/5971 [37:51<16:35,  1.83it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  19%|█▉        | 32/167 [00:01<00:05, 24.76it/s][A
Epoch 8:  70%|██████▉   | 4154/5971 [37:51<16:33,  1.83it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  21%|██        | 35/167 [00:01<00:05, 25.10it/s][A
Epoch 8:  70%|██████▉   | 4157/5971 [37:51<16:30,  1.83it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  23%|██▎       | 38/167 [00:02<00:05, 25.77it/s][A
Epoch 8:  70%|██████▉   | 4160/5971 [37:51<16:28,  1.83it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  25%|██▍       | 41/167 [00:02<00:05, 24.31it/s][A
Epoch 8:  70%|██████▉   | 4163/5971 [37:51<16:26,  1.83it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  26%|██▋       | 44/167 [00:02<00:05, 24.54it/s][A
Epoch 8:  70%|██████▉   | 4166/5971 [37:51<16:24,  1.83it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  29%|██▊       | 48/167 [00:02<00:04, 26.64it/s][A
Epoch 8:  70%|██████▉   | 4170/5971 [37:52<16:21,  1.84it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  31%|███       | 51/167 [00:02<00:04, 26.02it/s][A
Epoch 8:  70%|██████▉   | 4174/5971 [37:52<16:17,  1.84it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  32%|███▏      | 54/167 [00:02<00:04, 25.23it/s][A

Validating:  34%|███▍      | 57/167 [00:02<00:04, 24.76it/s][A
Epoch 8:  70%|██████▉   | 4178/5971 [37:52<16:14,  1.84it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  36%|███▌      | 60/167 [00:02<00:04, 25.16it/s][A
Epoch 8:  70%|███████   | 4182/5971 [37:52<16:11,  1.84it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  38%|███▊      | 63/167 [00:03<00:04, 24.88it/s][A
Epoch 8:  70%|███████   | 4186/5971 [37:52<16:08,  1.84it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  40%|███▉      | 66/167 [00:03<00:04, 23.97it/s][A

Validating:  41%|████▏     | 69/167 [00:03<00:04, 24.18it/s][A
Epoch 8:  70%|███████   | 4190/5971 [37:52<16:05,  1.84it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  43%|████▎     | 72/167 [00:03<00:03, 24.40it/s][A
Epoch 8:  70%|███████   | 4194/5971 [37:53<16:02,  1.85it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  45%|████▍     | 75/167 [00:03<00:03, 25.18it/s][A
Epoch 8:  70%|███████   | 4198/5971 [37:53<15:59,  1.85it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  47%|████▋     | 78/167 [00:03<00:03, 24.79it/s][A

Validating:  49%|████▊     | 81/167 [00:03<00:03, 26.14it/s][A
Epoch 8:  70%|███████   | 4202/5971 [37:53<15:56,  1.85it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  50%|█████     | 84/167 [00:03<00:03, 26.49it/s][A
Epoch 8:  70%|███████   | 4206/5971 [37:53<15:53,  1.85it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  52%|█████▏    | 87/167 [00:03<00:02, 26.90it/s][A
Epoch 8:  71%|███████   | 4210/5971 [37:53<15:50,  1.85it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  54%|█████▍    | 90/167 [00:04<00:02, 25.81it/s][A
Epoch 8:  71%|███████   | 4214/5971 [37:53<15:47,  1.85it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  56%|█████▋    | 94/167 [00:04<00:02, 27.06it/s][A

Validating:  58%|█████▊    | 97/167 [00:04<00:02, 27.56it/s][A
Epoch 8:  71%|███████   | 4218/5971 [37:53<15:44,  1.86it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  60%|█████▉    | 100/167 [00:04<00:02, 26.63it/s][A
Epoch 8:  71%|███████   | 4222/5971 [37:54<15:41,  1.86it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  62%|██████▏   | 103/167 [00:04<00:02, 25.39it/s][A
Epoch 8:  71%|███████   | 4226/5971 [37:54<15:38,  1.86it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  64%|██████▍   | 107/167 [00:04<00:02, 26.29it/s][A
Epoch 8:  71%|███████   | 4230/5971 [37:54<15:35,  1.86it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  66%|██████▌   | 110/167 [00:04<00:02, 26.72it/s][A

Validating:  68%|██████▊   | 113/167 [00:04<00:02, 26.18it/s][A
Epoch 8:  71%|███████   | 4234/5971 [37:54<15:32,  1.86it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  69%|██████▉   | 116/167 [00:05<00:01, 27.02it/s][A
Epoch 8:  71%|███████   | 4238/5971 [37:54<15:29,  1.86it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  71%|███████▏  | 119/167 [00:05<00:01, 25.37it/s][A
Epoch 8:  71%|███████   | 4242/5971 [37:54<15:26,  1.87it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  74%|███████▎  | 123/167 [00:05<00:01, 27.00it/s][A
Epoch 8:  71%|███████   | 4246/5971 [37:54<15:24,  1.87it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  75%|███████▌  | 126/167 [00:05<00:01, 27.60it/s][A

Validating:  77%|███████▋  | 129/167 [00:05<00:01, 26.30it/s][A
Epoch 8:  71%|███████   | 4250/5971 [37:55<15:21,  1.87it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  79%|███████▉  | 132/167 [00:05<00:01, 25.65it/s][A
Epoch 8:  71%|███████   | 4254/5971 [37:55<15:18,  1.87it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  81%|████████  | 135/167 [00:05<00:01, 26.76it/s][A
Epoch 8:  71%|███████▏  | 4258/5971 [37:55<15:15,  1.87it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  83%|████████▎ | 138/167 [00:05<00:01, 26.28it/s][A

Validating:  84%|████████▍ | 141/167 [00:06<00:00, 26.97it/s][A
Epoch 8:  71%|███████▏  | 4262/5971 [37:55<15:12,  1.87it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  86%|████████▌ | 144/167 [00:06<00:00, 26.98it/s][A
Epoch 8:  71%|███████▏  | 4266/5971 [37:55<15:09,  1.87it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  88%|████████▊ | 147/167 [00:06<00:00, 25.72it/s][A
Epoch 8:  72%|███████▏  | 4270/5971 [37:55<15:06,  1.88it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  90%|████████▉ | 150/167 [00:06<00:00, 24.86it/s][A

Validating:  92%|█████████▏| 153/167 [00:06<00:00, 24.48it/s][A
Epoch 8:  72%|███████▏  | 4274/5971 [37:56<15:03,  1.88it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  93%|█████████▎| 156/167 [00:06<00:00, 24.81it/s][A
Epoch 8:  72%|███████▏  | 4278/5971 [37:56<15:00,  1.88it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  95%|█████████▌| 159/167 [00:06<00:00, 24.74it/s][A
Epoch 8:  72%|███████▏  | 4282/5971 [37:56<14:57,  1.88it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Validating:  97%|█████████▋| 162/167 [00:06<00:00, 24.54it/s][A

Validating:  99%|█████████▉| 165/167 [00:06<00:00, 25.69it/s][A
Epoch 8:  72%|███████▏  | 4286/5971 [37:56<14:54,  1.88it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
Epoch 8:  72%|███████▏  | 4288/5971 [37:56<14:53,  1.88it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]

Epoch 8:  72%|███████▏  | 4288/5971 [37:57<14:53,  1.88it/s, loss=0.184, v_num=0, train/loss_simple_step=0.0037, train/loss_vlb_step=1.82e-5, train/loss_step=0.0037, global_step=5e+3, train/loss_simple_epoch=0.156, train/loss_vlb_epoch=0.00306, train/loss_epoch=0.156]
